Required Feature Group Types

To train a model under this use case, you will need to create feature groups of the following type(s):

Feature Group Type | API Configuration Name | Required | Description
User-Item Interactions | USER_ITEM_INTERACTIONS | True | This dataset corresponds to all the user-item interactions on your website or application. For example, all the actions (e.g., click, purchase, view) taken by a particular user on a particular item (e.g., product, video, article), recorded as a time-based log.
Catalog Attributes | ITEM_ATTRIBUTES | | This dataset corresponds to all the information you have in your catalog. If you want to recommend actions instead of items to users, you are welcome to upload an action catalog.
User Attributes | USER_ATTRIBUTES | | This dataset corresponds to all the attributes or metadata that you have about your user base. Any user profile information will be relevant here.

Note: Once you have uploaded datasets that comply with the required schema for each Feature Group Type, you will need to create the machine learning (ML) features used to train your ML model(s). We use the term "Feature Group" for a group of ML features (dataset columns) under a specific Feature Group Type. Our system supports extensible schemas, which let you provide any number of additional columns/features that you think are relevant to that Feature Group Type.


Feature Group: User-Item Interactions

This dataset corresponds to all the user-item interactions on your website or application. For example, all the actions (e.g., click, purchase, view) taken by a particular user on a particular item (e.g., product, video, article), recorded as a time-based log.

Feature Mapping | Feature Type | Required | Description
ITEM_ID | categorical | Y | The unique identifier of each item in your catalog. This is typically your product ID, article ID, or video ID.
USER_ID | categorical | Y | The unique identifier of each user in your user base.
ACTION_TYPE | categorical | N | An optional column that specifies the type of action the user took. This can be any action specific to your application (e.g., view, click, purchase, rating, comment, like). You can omit the ACTION_TYPE column if all the actions in the dataset are the same (e.g., a dataset of only purchases or clicks).
TIMESTAMP | timestamp | N | The timestamp when a particular action occurred.
ACTION_WEIGHT | numerical | N | An optional column that specifies the weight of the action (e.g., video watch time, price of item purchased). This is used to optimize the model to maximize actions with this value.
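
As an illustration of the schema above, here is a minimal sketch of a pre-upload check for an interaction log. It is not part of the product API. The function name `check_interactions`, the sample data, the choice to name CSV columns after the feature mappings, and the ISO 8601 timestamp format are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime

# Required and optional feature mappings, taken from the table above.
REQUIRED = {"ITEM_ID", "USER_ID"}
OPTIONAL = {"ACTION_TYPE", "TIMESTAMP", "ACTION_WEIGHT"}

# Hypothetical sample data; ISO 8601 timestamps are an assumption.
SAMPLE_CSV = """USER_ID,ITEM_ID,ACTION_TYPE,TIMESTAMP,ACTION_WEIGHT
u1,sku-100,view,2024-01-05T10:00:00,
u1,sku-100,purchase,2024-01-05T10:04:00,19.99
u2,sku-200,click,2024-01-06T08:30:00,
"""


def check_interactions(text):
    """Return a list of problems found in an interaction log (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = set(reader.fieldnames or [])
    problems = [f"missing required column {c}" for c in sorted(REQUIRED - columns)]
    if problems:
        return problems
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for col in sorted(REQUIRED):
            if not row[col]:
                problems.append(f"line {line_no}: empty {col}")
        if row.get("TIMESTAMP"):
            try:
                datetime.fromisoformat(row["TIMESTAMP"])
            except ValueError:
                problems.append(f"line {line_no}: unparseable TIMESTAMP")
        if row.get("ACTION_WEIGHT"):
            try:
                float(row["ACTION_WEIGHT"])
            except ValueError:
                problems.append(f"line {line_no}: non-numeric ACTION_WEIGHT")
    return problems


if __name__ == "__main__":
    print(check_interactions(SAMPLE_CSV) or "no problems found")
```

Optional columns may be left empty or dropped entirely; only ITEM_ID and USER_ID are checked for presence on every row.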

Feature Group: Catalog Attributes

This dataset corresponds to all the information you have in your catalog. If you want to recommend actions instead of items to users, you are welcome to upload an action catalog.

Feature Mapping | Feature Type | Required | Description
ITEM_ID | categorical | Y | The unique identifier of each item in your catalog. This is typically your product ID, article ID, or video ID.
[ITEM ATTRIBUTE] | | Y | Any relevant attributes about the item. This would include brand or category for products, author or length for articles, or tags for videos. We suggest providing at least 5-6 attributes per item, up to a maximum of 1000 attributes.
PREDICTION_RESTRICT | categorical | N | An optional column used to restrict predictions to items matching a specific value of this column. If this is set, the prediction API call must include an includeFilter specifying a value for this column.
ACTION_WEIGHT | numerical | N | An optional column that specifies the weight of the item (e.g., average video watch time, average price of item purchased). This is used to apply optimization weights at the item level or as a fallback score for unknown items.
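
The attribute guidance above can be checked before upload with a few lines of code. The following is a minimal sketch and not part of the product API. The helper names are hypothetical, and treating PREDICTION_RESTRICT and ACTION_WEIGHT as reserved columns that do not count toward the attribute total is an assumption.

```python
# Columns with a dedicated feature mapping in the catalog schema.
# Assumption: these do not count as [ITEM ATTRIBUTE] columns.
RESERVED = {"ITEM_ID", "PREDICTION_RESTRICT", "ACTION_WEIGHT"}
MIN_SUGGESTED = 5    # lower end of the suggested 5-6 attributes
MAX_ALLOWED = 1000   # stated maximum number of attributes


def attribute_columns(columns):
    """Return the columns that would be mapped as item attributes."""
    return [c for c in columns if c not in RESERVED]


def check_catalog_columns(columns):
    """Return 'ok' or a short description of the first problem found."""
    if "ITEM_ID" not in columns:
        return "missing required column ITEM_ID"
    count = len(attribute_columns(columns))
    if count > MAX_ALLOWED:
        return f"{count} attributes exceeds the maximum of {MAX_ALLOWED}"
    if count < MIN_SUGGESTED:
        return f"only {count} attributes; at least {MIN_SUGGESTED} are suggested"
    return "ok"


if __name__ == "__main__":
    # Hypothetical product catalog columns.
    cols = ["ITEM_ID", "brand", "category", "color", "price", "size", "ACTION_WEIGHT"]
    print(check_catalog_columns(cols))
```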

Feature Group: User Attributes

This dataset corresponds to all the attributes or meta-data that you have about your user base. Any user profile information will be relevant here.

Feature Mapping | Feature Type | Required | Description
USER_ID | categorical | Y | The unique identifier for the user.
SNAPSHOT_TIME | timestamp | N | An optional column that indicates when the record was updated. This lets you provide multiple rows for a single user ID; during training, the row with the most recent value for this column compared to the interaction timestamp is used.
[USER ATTRIBUTE] | | N | Any relevant attributes or variables that can influence your target variable. Providing more relevant attributes generally improves the model.
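
To make the SNAPSHOT_TIME behavior concrete, here is a minimal sketch of one plausible reading of the row selection: for each interaction, use the latest user row whose SNAPSHOT_TIME is at or before the interaction timestamp. This interpretation, the function name `pick_snapshot`, and the sample data are assumptions made for illustration; the actual training logic is not documented here and may differ.

```python
from datetime import datetime


def pick_snapshot(rows, interaction_time):
    """Pick the user row to pair with an interaction.

    Assumed rule: the row with the latest SNAPSHOT_TIME that is not
    later than the interaction. Returns None if no row qualifies.
    """
    eligible = [r for r in rows if r["SNAPSHOT_TIME"] <= interaction_time]
    return max(eligible, key=lambda r: r["SNAPSHOT_TIME"], default=None)


if __name__ == "__main__":
    # Hypothetical snapshots for a single user; "plan" is an example attribute.
    snapshots = [
        {"USER_ID": "u1", "SNAPSHOT_TIME": datetime(2024, 1, 1), "plan": "free"},
        {"USER_ID": "u1", "SNAPSHOT_TIME": datetime(2024, 3, 1), "plan": "pro"},
    ]
    chosen = pick_snapshot(snapshots, datetime(2024, 2, 10))
    print(chosen["plan"])
```

Under this reading, an interaction on 2024-02-10 is paired with the January snapshot, so the model sees the attributes the user had at the time of the action rather than values recorded afterwards.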