To train a model under this use case, you will need to create feature groups of the following type(s):
Feature Group Type | API Configuration Name | Required | Description |
---|---|---|---|
User-Item Interactions | USER_ITEM_INTERACTIONS | True | This dataset corresponds to all the user-item interactions on your website or application. For example, all the actions (e.g., click, purchase, view) taken by a particular user on a particular item (e.g., product, video, article), recorded as a time-based log. |
Catalog Attributes | ITEM_ATTRIBUTES | | This dataset corresponds to all the information you have in your catalog. If you want to recommend actions instead of items to users, you can upload an action catalog. |
User Attributes | USER_ATTRIBUTES | | This dataset corresponds to all the attributes or metadata that you have about your user. Any user profile information will be relevant here. |
Note: Once you upload datasets that comply with the required schema of each Feature Group Type, you will need to create the machine learning (ML) features that will be used to train your ML model(s). We use the term "Feature Group" for a group of ML features (dataset columns) under a specific Feature Group Type. Our system supports extensible schemas that enable you to provide any number of additional columns/features that you think are relevant to that Feature Group Type.
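As a minimal sketch of the requirement above, the following checks a list of feature group types against the ones this use case lists. The helper name `missing_required_types` is hypothetical and not part of any API; treating `ITEM_ATTRIBUTES` and `USER_ATTRIBUTES` as optional is an assumption, since the table does not mark them as required.

```python
# Feature group types for this use case, taken from the table above.
# True marks a type the table lists as required. Treating the other two as
# optional is an assumption: the table does not mark them as required.
FEATURE_GROUP_TYPES = {
    "USER_ITEM_INTERACTIONS": True,
    "ITEM_ATTRIBUTES": False,
    "USER_ATTRIBUTES": False,
}


def missing_required_types(attached_types):
    """Return the required feature group types absent from attached_types."""
    attached = set(attached_types)
    unknown = attached - set(FEATURE_GROUP_TYPES)
    if unknown:
        raise ValueError(f"unknown feature group types: {sorted(unknown)}")
    return sorted(
        name
        for name, required in FEATURE_GROUP_TYPES.items()
        if required and name not in attached
    )


# A project with only catalog and user data is still missing the interactions log.
print(missing_required_types(["ITEM_ATTRIBUTES", "USER_ATTRIBUTES"]))
```

An empty list means every required type is covered; the check says nothing about whether each dataset matches its schema.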
User-Item Interactions (USER_ITEM_INTERACTIONS): This dataset corresponds to all the user-item interactions on your website or application. For example, all the actions (e.g., click, purchase, view) taken by a particular user on a particular item (e.g., product, video, article), recorded as a time-based log.
Feature Mapping | Feature Type | Required | Description |
---|---|---|---|
ITEM_ID | categorical | Y | This is the unique identifier of each item in your catalog. This is typically your product ID, article ID, or video ID. |
USER_ID | categorical | Y | This is a unique identifier of each user in your user base. |
ACTION_TYPE | categorical | N | This optional column specifies the type of action the user took. It can include any action that is specific to your application (e.g., view, click, purchase, rating, comment, like). You can upload a dataset without an ACTION_TYPE column if all the actions in the dataset are the same (e.g., a dataset of only purchases or only clicks). |
TIMESTAMP | timestamp | N | The timestamp when a particular action occurred. |
ACTION_WEIGHT | numerical | N | This optional column specifies the weight of the action (e.g., video watch time, price of item purchased). This is used to optimize the model to favor actions that maximize this value. |
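To make the schema above concrete, here is a small sketch that parses an interaction log and checks it against the required and optional columns. The lowercase column names, the sample rows, and the `load_interactions` helper are illustrative assumptions; your own column names can differ as long as they are mapped to ITEM_ID, USER_ID, and the optional feature mappings.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names standing in for the ITEM_ID and USER_ID mappings.
REQUIRED_COLUMNS = ("user_id", "item_id")

SAMPLE_LOG = """user_id,item_id,action_type,timestamp,action_weight
u1,i9,purchase,2023-01-05T10:15:00,19.99
u2,i3,view,2023-01-05T10:16:30,
"""


def load_interactions(csv_text):
    """Parse an interaction log, enforcing required columns and typing optional ones."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    for raw in reader:
        row = {"user_id": raw["user_id"], "item_id": raw["item_id"]}
        # Optional columns are kept only when a value is present.
        if raw.get("action_type"):
            row["action_type"] = raw["action_type"]
        if raw.get("timestamp"):
            row["timestamp"] = datetime.fromisoformat(raw["timestamp"])
        if raw.get("action_weight"):
            row["action_weight"] = float(raw["action_weight"])
        rows.append(row)
    return rows


interactions = load_interactions(SAMPLE_LOG)
```

The second sample row has no weight, so it is loaded without an `action_weight` key rather than with a default value.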
Catalog Attributes (ITEM_ATTRIBUTES): This dataset corresponds to all the information you have in your catalog. If you want to recommend actions instead of items to users, you can upload an action catalog.
Feature Mapping | Feature Type | Required | Description |
---|---|---|---|
ITEM_ID | categorical | Y | The unique identifier for the catalog item. For example, this could be product id, article id or video id. |
[ITEM ATTRIBUTE] | | Y | Any relevant attribute about the item. This would include brand or category for products, author or length for articles, or tags for a video. We suggest providing at least 5-6 attributes per item and up to a maximum of 1000 attributes. |
PREDICTION_RESTRICT | categorical | N | This optional column is used to restrict predictions to items matching a specific value of this column. If it is set, the prediction API call will require an includeFilter that specifies a value for this column. |
ACTION_WEIGHT | numerical | N | This optional column specifies the weight of the item (e.g., average video watch time, average price of item purchased). It is used to apply optimization weights at the item level or as a fallback score for unknown items. |
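The effect of PREDICTION_RESTRICT can be pictured with a short local sketch: only catalog items whose restrict column equals the requested value remain eligible. This illustrates the filtering idea only and is not the prediction API; the catalog rows, the `prediction_restrict` column name, and the `restrict_to` helper are all hypothetical.

```python
# Hypothetical catalog rows: an ITEM_ID, one item attribute, and a column
# mapped to PREDICTION_RESTRICT (here, the region an item can be sold in).
CATALOG = [
    {"item_id": "i1", "category": "shoes", "prediction_restrict": "US"},
    {"item_id": "i2", "category": "shoes", "prediction_restrict": "EU"},
    {"item_id": "i3", "category": "hats", "prediction_restrict": "US"},
]


def restrict_to(catalog_rows, include_value):
    """Return the IDs of items whose restrict column equals include_value."""
    return [
        row["item_id"]
        for row in catalog_rows
        if row["prediction_restrict"] == include_value
    ]


us_items = restrict_to(CATALOG, "US")
```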
User Attributes (USER_ATTRIBUTES): This dataset corresponds to all the attributes or metadata that you have about your user. Any user profile information will be relevant here.
Feature Mapping | Feature Type | Required | Description |
---|---|---|---|
USER_ID | categorical | Y | The unique identifier for the user. |
SNAPSHOT_TIME | timestamp | N | This optional column indicates when the record was updated. It lets you provide multiple rows for a single user ID; during training, we pick the row with the most recent value for this column relative to the interaction timestamp. |
[USER ATTRIBUTE] | | Y | Any relevant attribute/variable that can influence the target variable. The more data you have, the better the AI model. |
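The SNAPSHOT_TIME behavior can be sketched as follows. This assumes that "most recent compared to the interaction timestamp" means the latest snapshot taken at or before the interaction; that reading, the `pick_snapshot` helper, and the sample attribute `tier` are assumptions for illustration, not a description of the actual training code.

```python
from datetime import datetime

# Hypothetical snapshots for one user; "tier" stands in for any user attribute.
SNAPSHOTS = [
    {"user_id": "u1", "snapshot_time": datetime(2023, 1, 1), "tier": "free"},
    {"user_id": "u1", "snapshot_time": datetime(2023, 3, 1), "tier": "pro"},
    {"user_id": "u1", "snapshot_time": datetime(2023, 6, 1), "tier": "team"},
]


def pick_snapshot(user_rows, interaction_time):
    """Return the latest row with snapshot_time <= interaction_time, or None."""
    eligible = [r for r in user_rows if r["snapshot_time"] <= interaction_time]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r["snapshot_time"])


# An interaction in April is paired with the March snapshot, not the June one.
chosen = pick_snapshot(SNAPSHOTS, datetime(2023, 4, 15))
```

Under this reading, an interaction is never paired with attributes recorded after it happened, and an interaction older than every snapshot gets no match.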