PersonalizationTrainingConfig

Training config for the PERSONALIZATION problem type

KEY	TYPE	DESCRIPTION
DISABLE_TIMESTAMP_SCALAR_FEATURES	bool	Exclude timestamp scalar features.
TEST_ROW_INDICATOR	str	Column indicating which rows to use for training (TRAIN), validation (VAL) and testing (TEST).
INCLUDE_ITEM_ID_FEATURE	bool	Add Item-Id to the input features of the model. Applicable for Embedding distance and CTR models.
TEST_SPLIT_ON_LAST_K_ITEMS	bool	Use the last k items instead of global timestamp splits when validating and testing the model.
TRAINING_START_DATE	str	Only consider training interaction data after this date. Specified in the timezone of the dataset.
TEST_ON_USER_SPLIT	bool	Use user splits instead of time splits when validating and testing the model.
MAX_USER_HISTORY_LEN_PERCENTILE	int	Filter out users with history length above this percentile.
OPTIMIZED_EVENT_TYPE	str	The final event type to optimize for and compute metrics on.
COMPUTE_RERANK_METRICS	bool	Compute metrics based on rerank results.
DROPOUT_RATE	int	Dropout rate for neural network.
FILTER_HISTORY	bool	Do not recommend items the user has already interacted with.
SESSION_EVENT_TYPES	List[str]	List of event types to treat as occurrences of sessions.
SORT_OBJECTIVE	PersonalizationObjective	Ranking scheme used to sort models on the metrics page.
DOWNSAMPLE_ITEM_POPULARITY_PERCENTILE	float	Downsample items more popular than this percentile.
DISABLE_GPU	bool	Disable training on GPU.
ADD_TIME_FEATURES	bool	Include interaction time as a feature.
EXPLICIT_TIME_SPLIT	bool	Sets an explicit time-based test boundary.
QUERY_COLUMN	str	Name of column in the interactions table that represents a natural language query, e.g. 'blue t-shirt'.
ITEM_QUERY_COLUMN	str	Name of column in the item catalog that will be matched to the query column in the interactions table.
ACTION_TYPES_EXCLUSION_DAYS	Dict[str, float]	Mapping from action type to the number of days for which previously interacted items are excluded from prediction.
TARGET_ACTION_WEIGHTS	Dict[str, float]	Dictionary of action types to weights for training.
BATCH_SIZE	BatchSize	Batch size for neural network.
TEST_WINDOW_LENGTH_HOURS	int	Duration (in hours) of most recent time window to use when validating and testing the model.
USE_USER_ID_FEATURE	bool	Use user id as a feature in CTR models.
SEQUENTIAL_TRAINING	bool	Train a model sequentially through time.
SESSION_DEDUPE_MINS	float	Minimum number of minutes between two sessions for a user.
TEST_LAST_ITEMS_LENGTH	int	Number of items to leave out for each user when using leave-k-out folds.
OBJECTIVE	PersonalizationObjective	Ranking scheme used to select final best model.
DATA_SPLIT_FEATURE_GROUP_TABLE_NAME	str	Specify the table name of the feature group to export training data with the fold column.
FULL_DATA_RETRAINING	bool	Train models separately with all the data.
DISABLE_TRANSFORMER	bool	Disable training the transformer algorithm.
TEST_SPLIT	int	Percent of the dataset to use as test data. The supported range is 6% to 20%.
MIN_ITEM_HISTORY	int	Minimum number of interactions an item must have to be included in training.
COMPUTE_SESSION_METRICS	bool	Evaluate models based on how well they are able to predict the next session of interactions.
RECENT_DAYS_FOR_TRAINING	int	Limit training data to the given number of most recent days.
TARGET_ACTION_TYPES	List[str]	List of action types to use as targets for training.
TRAINING_MODE	PersonalizationTrainingMode	Whether to train in production or experimental mode. Defaults to EXP.
MAX_HISTORY_LENGTH	int	Maximum length of user-item history to include user in training examples.
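
As a rough illustration of how the options above fit together, the sketch below represents a config as a plain Python dict keyed by the names in the table and checks a few values before submission. It does not use any SDK: the helper `check_config`, the `EXPECTED_TYPES` subset, and the sample values (such as the `purchase` action type) are hypothetical and chosen only for this example. The only constraints it enforces are the types and the 6% to 20% `TEST_SPLIT` range stated in the table.

```python
from typing import Any, Dict, List

# Hypothetical subset of the table above: option name -> expected Python type.
EXPECTED_TYPES = {
    "TEST_SPLIT": int,
    "MIN_ITEM_HISTORY": int,
    "FILTER_HISTORY": bool,
    "DISABLE_GPU": bool,
    "TRAINING_START_DATE": str,
    "TARGET_ACTION_TYPES": list,
    "TARGET_ACTION_WEIGHTS": dict,
}


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []
    for key, value in config.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            continue  # option not covered by this sketch
        # bool is a subclass of int in Python, so treat it separately.
        is_bool = isinstance(value, bool)
        if expected is bool:
            ok = is_bool
        else:
            ok = isinstance(value, expected) and not is_bool
        if not ok:
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )

    # The table states that TEST_SPLIT is a percentage between 6 and 20.
    test_split = config.get("TEST_SPLIT")
    if isinstance(test_split, int) and not isinstance(test_split, bool):
        if not 6 <= test_split <= 20:
            problems.append("TEST_SPLIT: must be between 6 and 20 (percent)")
    return problems


# Sample values are illustrative only.
example_config = {
    "TEST_SPLIT": 10,
    "MIN_ITEM_HISTORY": 5,
    "FILTER_HISTORY": True,
    "TARGET_ACTION_TYPES": ["purchase"],
    "TARGET_ACTION_WEIGHTS": {"purchase": 1.0},
}

for problem in check_config(example_config):
    print(problem)
```

Options that are not listed in `EXPECTED_TYPES` are skipped rather than rejected, so the check can be extended one key at a time as needed.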