What is a Feature Group Template?

A Feature Group Template is a SQL template that lets a single SQL query be reused through one or more variables. Each variable is given a feature group table name (a data table) as its value, and the resulting query defines a new feature group. Data scientists can therefore pass different feature groups as variable values to create different feature groups without rewriting the same SQL query.

Example SQL query format: SELECT * FROM {table_name} GROUP BY {table_name}.column_name

Here, table_name is a variable that can be replaced with the table names of different feature groups.
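As a rough illustration of the substitution step, the sketch below fills the example template using Python string formatting. It is not the product's implementation: the helper render_template and the table names sales_fg and returns_fg are hypothetical, and the identifier check is an added safeguard assumed for the sketch.

```python
import re

# Hypothetical helper for illustration only; not part of any real feature-store API.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_template(template_sql: str, variables: dict) -> str:
    """Substitute feature group table names into a {variable}-style SQL template."""
    for name, table in variables.items():
        # Accept only plain identifiers so a value cannot inject extra SQL.
        if not _IDENTIFIER.fullmatch(table):
            raise ValueError(f"invalid table name for {name!r}: {table!r}")
    return template_sql.format(**variables)


TEMPLATE = "SELECT * FROM {table_name} GROUP BY {table_name}.column_name"

# The same template yields one query per feature group table (assumed names).
sales_sql = render_template(TEMPLATE, {"table_name": "sales_fg"})
returns_sql = render_template(TEMPLATE, {"table_name": "returns_fg"})
print(sales_sql)
print(returns_sql)
```

Only the variable value changes between the two calls; the query text is written once.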

Why are Feature Group Templates useful?

Feature Group Templates make it efficient and convenient to create multiple feature groups that are all derived in the same way.

Example scenario: suppose we have several large datasets, each holding thousands of movies of one genre, and we want to sample movies from the two decades after 2000 (2001 through 2020) across all genres. We can write a simple SQL template to accomplish this:

SELECT * FROM {movies_genre_catalog} WHERE 2000 < movie_year AND movie_year < 2021

Here, the variable movies_genre_catalog can be replaced with the feature group corresponding to each genre dataset.
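The movie scenario can be tried locally with a small sketch like the one below, which uses Python's built-in sqlite3 module as a stand-in for the feature store. The table names movies_comedy and movies_drama, the title column, and the sample rows are all assumptions made for this illustration.

```python
import sqlite3

TEMPLATE = (
    "SELECT * FROM {movies_genre_catalog} "
    "WHERE 2000 < movie_year AND movie_year < 2021"
)

# In-memory tables standing in for per-genre feature groups (assumed names and rows).
conn = sqlite3.connect(":memory:")
genre_rows = {
    "movies_comedy": [("Comedy A", 1999), ("Comedy B", 2005)],
    "movies_drama": [("Drama A", 2000), ("Drama B", 2020), ("Drama C", 2021)],
}
for table, rows in genre_rows.items():
    conn.execute(f"CREATE TABLE {table} (title TEXT, movie_year INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

# One template, rendered once per genre table.
samples = {
    table: conn.execute(TEMPLATE.format(movies_genre_catalog=table)).fetchall()
    for table in genre_rows
}
for table, rows in samples.items():
    print(table, rows)
```

Because both comparisons in the template are strict, rows with movie_year 2000 or 2021 are excluded and only 2001 through 2020 are kept.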

In this way, a single SQL template with one variable can produce many different feature groups. Feature Group Templates also help organize and manage templated feature groups within a machine learning project or across a group of related projects. In the running example, the end user can create several feature groups simply by providing the feature group table name for each genre. The user can also see the details of every feature group, such as the SQL template the feature group is linked to, the variable name and the feature group assigned to it, and the feature group version history, which makes it easy to create more feature groups and manage existing ones.