Training Parameters And Accuracy Measures

Our platform provides the flexibility to adjust a set of training parameters. There are general training parameters and advanced training options that can influence the model's predictions. The predictions are evaluated against a set of accuracy measures, or metrics, which are also discussed in this section.


Training Options

Once you have fulfilled all the feature group requirements for the use case, you can set the following general and advanced training configuration options to train your ML model:

Training Option Name | Description | Possible Values
Name | The name you would like to give to the model that is going to be trained. The system generates a default name based on the name of the project the model is part of. | Any alphanumeric characters; length anywhere from 5 to 60 characters.
Set Refresh Schedule (UTC) | The refresh schedule is the schedule on which your dataset is replaced by an updated copy of that dataset from your storage bucket location. The value to be entered is a cron time string that describes the schedule in the UTC time zone. | A string in cron format. If you're unfamiliar with cron syntax, Crontab Guru can help translate the syntax back into natural language.
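
For example, a cron string has five fields (minute, hour, day of month, month, day of week) and is interpreted in the UTC time zone. The schedules below are purely illustrative:

    0 6 * * *      refresh every day at 06:00 UTC
    30 2 * * 1     refresh every Monday at 02:30 UTC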

Advanced Training Options

For Advanced Options, our AI engine will automatically set the optimum values. We recommend overriding these options only if you are familiar with deep learning. Overview of Advanced Options:

Training Option Name | API Configuration Name | Description | Possible Values
Test Split | TEST_SPLIT | Percentage of the dataset to use as test data. The test data is used to estimate the accuracy and generalizing capability of the model on unseen (new) data. | A range from 5% to 20% of the dataset is recommended.
Sampling Unit Keys | TEST_SPLIT_COLUMNS | Generally, each record (data row) in the training set corresponds to an independent observation, so training and testing sets can be created by random sampling. In cases where there are related observations, for example multiple readings taken from a sensor, it might become necessary to treat those related observations as a single unit for sampling purposes. For such cases, one or more categorical fields can be selected to define a composite key for sampling the records, such that all records sharing a composite key appear either in the training set or in the testing set, never in both. | One or more categorical columns
Test Row Indicator | TEST_ROW_INDICATOR | Select the column that contains the keyword 'TEST' as a value to indicate which rows to use as test data. The remaining rows are split between train/val randomly. | A column containing 'TEST' values

The semantics of the train/val/test split are determined by the combination of values of the two advanced training options Test Row Indicator and Sampling Unit Keys (an illustrative sketch in Python follows this table):

1. Default split - random split by row

2. TrainVsTest split - The column provided under the Test Row Indicator advanced option is used to select the test rows. If the column value is 'TEST', the row is a test row; otherwise it is a train/val row. The split between train and val is random by row.

3. Sampling Unit Keys - Rows are grouped by the composite key defined by the values in the set of key columns. The unique composite keys are randomly split between train/val/test with the goal of matching the target fraction of rows (not keys) in each split. The implication is that a given composite key appears in exactly one of the three sets: train, val, or test.

4. Sampling Unit Keys + TrainVsTest - The test rows are selected using the TrainVsTest column (the column specified under the Test Row Indicator advanced option). The remaining rows are split between train and val following the Sampling Unit Keys logic. In effect, TrainVsTest overrides the sampling-key behavior for the test split.
Timestamp Based Splitting Column | TIMESTAMP_BASED_SPLITTING_COLUMN | In predictive modeling, a timestamp column is useful for presenting data sequentially, so that the model is trained on one sequence of data and then asked to predict on the next sequence. This way, you can train on past data and check how the model behaves on future data; in some sense, the training process is made to respect causality. If you specify a particular column as the TIMESTAMP BASED SPLITTING COLUMN and set the train/test split to 80:20, then the rows corresponding to the first 80% of timestamps go into the train set and the rest into the test set. This kind of split helps you understand how well the model performs on future data when it has only been exposed to past data. | Select the column that contains the timestamps
Test Splitting Timestamp | TEST_SPLITTING_TIMESTAMP | When the TIMESTAMP BASED SPLITTING COLUMN is selected, you can also set an extra parameter, TEST SPLITTING TIMESTAMP, to specify a timestamp such that all rows at or after that timestamp go into the test set and the remaining rows become part of the train set. | Select the date from the calendar
Use Multihot Encoding | USE_MULTIHOT_ENCODING | Use multihot encoding for a multi-valued categorical column. | A multi-valued categorical column
Dropout | DROPOUT | Dropout percentage in deep neural networks. It is a regularization method used for better generalization over new data. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much and enhances their isolated learning. | 0 to 90
Batch Size | BATCH_SIZE | The number of data points provided to the model at once (in one batch) during training. The batch size impacts how quickly a model learns and the stability of the learning process. It is an important hyperparameter that should be well-tuned. For more details, please visit our Blog. | 16 / 32 / 64 / 128
Min Categorical Count | MIN_CATEGORICAL_COUNT | Minimum threshold to consider a value different from the unknown placeholder. | Integers from 1 to 100 with a step of 5
Sample Weight | SAMPLE_WEIGHT_COLUMN | Sets the weight of each training sample in the objective that is optimized. Metrics are also reported based on this weight column. | Any numeric column. Samples with zero weight are discarded.
Rebalance Classes | REBALANCE_CLASSES | Applies weights to each sample in inverse proportion to the frequency of the target class (see the weighting sketch after this table). Leads to increased minority class recall at the expense of precision. | Yes / No
Rare Class Augmentation Threshold | RARE_CLASS_AUGMENTATION_THRESHOLD | Augments any rare class whose relative frequency with respect to the most frequent class is less than this threshold. | 0.01 to 0.5
Augmentation Strategy | AUGMENTATION_STRATEGY | Strategy to deal with class imbalance and data augmentation. | resample / smote
Training Rows Downsample Ratio | TRAINING_ROWS_DOWNSAMPLE_RATIO | Uses this ratio to train on a sample of the dataset provided. | 0.01 to 0.9
Ignore Datetime Features | IGNORE_DATETIME_FEATURES | Remove all datetime features from the model. Useful while generalizing to different time periods. | true or false
Use Pretrained Embeddings | USE_PRETRAINED_EMBEDDINGS | Whether to use pretrained embeddings or not. | true or false
Max Text Words | MAX_TEXT_WORDS | Maximum number of words to use from text fields. | 100 to 1000
AutoML Ensemble Size | AUTOML_ENSEMBLE_SIZE | Number of neural network architectures found by AutoML for the provided dataset(s) to combine via ensemble averaging. | 2 to 10
AutoML Initial Learning Rate | AUTOML_INITIAL_LEARNING_RATE | Initial learning rate for seed architectures generated by AutoML. | decimal (0.0001, 0.01)
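
To make the four split cases above concrete, here is a minimal Python sketch of the splitting logic using pandas and NumPy. The function name, arguments, and default fractions are hypothetical; it illustrates the semantics described above, not the platform's actual implementation.

    import numpy as np
    import pandas as pd

    def split_rows(df, test_frac=0.2, val_frac=0.1,
                   test_row_indicator=None, sampling_unit_keys=None, seed=0):
        """Illustrative train/val/test split following the rules above.
        test_row_indicator: name of a column whose value 'TEST' marks test rows.
        sampling_unit_keys: list of categorical columns forming a composite key.
        (Hypothetical helper for illustration, not the platform's API.)
        """
        rng = np.random.default_rng(seed)

        if test_row_indicator is not None:
            # TrainVsTest: rows explicitly marked 'TEST' form the test set.
            is_test = df[test_row_indicator] == "TEST"
        elif sampling_unit_keys is not None:
            # Sampling Unit Keys: assign whole composite keys to the test split,
            # aiming for the requested fraction of rows (not keys).
            key = df[sampling_unit_keys].astype(str).agg("|".join, axis=1)
            shuffled = key.drop_duplicates().sample(frac=1.0, random_state=seed)
            cum_rows = key.value_counts().loc[shuffled].cumsum() / len(df)
            test_keys = set(shuffled[cum_rows.values <= test_frac])
            is_test = key.isin(test_keys)
        else:
            # Default split: random by row.
            is_test = pd.Series(rng.random(len(df)) < test_frac, index=df.index)

        # Remaining rows are split between train and val (randomly by row here;
        # with Sampling Unit Keys the same key-level grouping would also apply).
        rest = df[~is_test]
        is_val = pd.Series(rng.random(len(rest)) < val_frac, index=rest.index)
        return rest[~is_val], rest[is_val], df[is_test]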
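
Similarly, as a rough illustration of the Rebalance Classes idea (and of how a Sample Weight column enters the objective), the snippet below computes per-sample weights in inverse proportion to class frequency. The column names are made up; this is only a sketch of the concept, not the engine's implementation.

    import pandas as pd

    # Hypothetical training frame with a categorical target column named "label".
    df = pd.DataFrame({"label": ["a", "a", "a", "a", "b"]})

    # Weight each sample in inverse proportion to the frequency of its target class,
    # so the minority class contributes as much to the optimized objective overall
    # as the majority class does.
    class_freq = df["label"].value_counts(normalize=True)    # a: 0.8, b: 0.2
    df["sample_weight"] = df["label"].map(1.0 / class_freq)  # a: 1.25, b: 5.0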

Metrics

Our AI engine will calculate the following metrics for this use case:

Metric Name | Description
Accuracy | This metric calculates the percentage of predictions that were correct out of the total number of predictions made by the model. 100% means that the model completed the task with no errors.
Area Under ROC Curve (AUC Curve) | AUC, or Area Under the Curve, describes a model's ability to distinguish between two or more classes, with a higher AUC indicating superior performance in correctly predicting positive instances as positive and negative instances as negative. Conversely, an AUC close to 0 suggests the model is incorrectly classifying negative instances as positive and vice versa. A value between 0.6 and 1 signifies that the model has learned meaningful patterns rather than making random guesses. AUC serves as a performance metric for classification problems across various threshold settings, offering an aggregate measure of performance. Its desirability stems from being scale-invariant, assessing the ranking quality of predictions, and classification-threshold-invariant, evaluating prediction quality regardless of the chosen classification threshold. For more details, please visit this link.
Class Label | This is the name of the class for which we are computing the metrics.
Support | This is the number of occurrences of a class label in the dataset. For example, if the 'car' class appears 1,000 times in a dataset with 10,000 data points, then the support for the 'car' class is 1,000.
Precision | Precision is the percentage of your results that are relevant. In other words, it is the fraction of relevant results among the retrieved results. It ranges from 0 to 1. The closer it gets to 1, the better. For further details, please visit this link.
Recall | Recall is the percentage of total relevant results correctly classified by the model. In other words, it is the fraction of the total amount of relevant instances that were actually retrieved. It ranges from 0 to 1. The closer it gets to 1, the better. For further details, please visit this link.
F1 Score | The harmonic mean of precision and recall. It is generally used when precision and recall are equally important for the classification. For further details, please visit this link.
Loss | Loss is an estimate of model error that indicates how well the model has been trained. The goal of the machine learning model is to minimize the loss value. It is sometimes hard to make sense of the loss value in isolation, so other accuracy measures like precision, recall, AUC, etc., are used in conjunction with the loss to evaluate the model. For further details, please visit this link.
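
For reference, the sketch below shows how these metrics can be computed with scikit-learn from a model's predicted labels and probabilities. The values and the 0.5 threshold are made up for illustration.

    from sklearn.metrics import (accuracy_score, roc_auc_score, precision_score,
                                 recall_score, f1_score, log_loss)

    # Made-up ground-truth labels and model outputs for a binary classification task.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]   # predicted probability of class 1
    y_pred = [1 if p >= 0.5 else 0 for p in y_prob]      # labels at a 0.5 threshold

    print("Accuracy :", accuracy_score(y_true, y_pred))   # fraction of correct predictions
    print("AUC      :", roc_auc_score(y_true, y_prob))    # threshold-independent ranking quality
    print("Precision:", precision_score(y_true, y_pred))  # relevant fraction of retrieved positives
    print("Recall   :", recall_score(y_true, y_pred))     # retrieved fraction of relevant positives
    print("F1 Score :", f1_score(y_true, y_pred))         # harmonic mean of precision and recall
    print("Loss     :", log_loss(y_true, y_prob))         # cross-entropy loss; lower is better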

Note: In addition to the above metrics, our engine will train a baseline model and generate metrics for it. Typically, the metrics for your custom deep learning model should be better than those of the baseline model.