Our platform provides the flexibility to adjust a set of training parameters. There are general training parameters and advanced training options that can influence the model's predictions. Prediction quality is evaluated using a set of accuracy measures, or metrics, which are also discussed in this section.
Once you have fulfilled all the feature group requirements for the use case, you can set the following general and advanced training configuration options to train your ML model:
Training Option Name | Description | Possible Values |
---|---|---|
Name | The name you would like to give to the model that is going to be trained. The system generates a default name depending upon the name of the project the model is a part of. | The name can consist of any alphanumeric characters and its length can be anywhere from 5 to 60 characters. |
Set Refresh Schedule (UTC) | The refresh schedule determines when your dataset is replaced by an updated copy of that dataset from your storage bucket location. The value to be entered is a CRON time string that describes the schedule in the UTC time zone. | A string in CRON format. If you're unfamiliar with cron syntax, Crontab Guru can help translate the syntax into natural language. |
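As a quick illustration of the CRON format, here are a few example refresh schedules (interpreted in UTC). The schedules shown are arbitrary examples, not platform defaults:

```python
# Illustrative CRON strings for a refresh schedule.
# Fields: minute hour day-of-month month day-of-week.
# These values are made-up examples, not recommended or default schedules.
daily_refresh   = "0 4 * * *"   # every day at 04:00 UTC
weekly_refresh  = "0 6 * * 1"   # every Monday at 06:00 UTC
monthly_refresh = "30 2 1 * *"  # at 02:30 UTC on the 1st of each month
```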
For Advanced Options, our AI engine will automatically set the optimum values. We recommend overriding these options only if you are familiar with deep learning. Overview of Advanced Options:
Training Option Name | API Configuration Name | Description | Possible Values |
---|---|---|---|
Type of split | TYPE_OF_SPLIT | Defines the underlying method used to split the data into train & test sets. We support the following methods: (i) Random Sampling: Randomly split the data into train and test, based on the 'TEST SPLIT' percentage. (ii) Timestamp Based: A timestamp column can be set so that the model is trained on one sequence of data and then made to predict on the next sequence. This way, you train on past data and check how the model behaves on future data; in some sense, the training process is made to respect causality. This kind of split helps show how well the model performs on future data when exposed only to past data. (iii) Row Indicator Based: Split the data based on the value indicated in each row of a selected column (TEST ROW INDICATOR). A minimal splitting sketch is provided after this table. | Random Sampling / Timestamp Based / Row Indicator Based |
Test Split | TEST_SPLIT | Percentage of the dataset to use as test data. A range from 5% to 20% of the dataset is recommended. The test data is used to estimate the accuracy and generalization capability of the model on unseen (new) data. | A percentage of the dataset (5% to 20% recommended) |
Sampling Unit Keys | TEST_SPLIT_COLUMNS | Generally, each record (data row) in the training set corresponds to an independent observation, so training and testing sets can be created using random sampling. In cases where there are related observations, for example, multiple readings taken from the same sensor, it may be necessary to treat those related observations as a single unit for sampling purposes. For such cases, one or more categorical fields can be selected to define a composite key for sampling the records, such that all records sharing the same composite key appear either in the training set or in the testing set, but never in both. | One or more categorical columns |
Test Row Indicator | TEST_ROW_INDICATOR | Select the column, from the list of columns, that contains the keyword 'TEST' as a value to indicate which rows to use as test data. The remaining rows are split between train/val randomly. The semantics of the train/val/test split are determined by the combination of values of the two advanced training options, Test Row Indicator and Sampling Unit Keys: 1. Default split - random split by row. 2. TrainVsTest split - The column provided under the Test Row Indicator advanced option is used to select the test rows. If the column value == 'TEST' then it is a test row; otherwise it is a train/val row. The split between train/val is random by row. 3. Sampling Unit Keys - Rows are grouped by the composite key defined by the values in the set of key columns. The unique composite keys are randomly split between train/val/test with the goal of achieving the target fraction of rows (not keys) in each split. The implication is that a given composite key appears in exactly one of the three: train, val, or test. 4. Sampling Unit Keys + TrainVsTest - The test rows are selected using the TrainVsTest column (the column specified under the Test Row Indicator advanced option). The remaining rows are split between train and val following the logic for Sampling Unit Keys. In other words, TrainVsTest overrides the sampling key behavior for the test split. | A column containing 'TEST' values |
Timestamp Based Splitting Column | TIMESTAMP_BASED_SPLITTING_COLUMN | In predictive modeling, a timestamp column is used to order the data sequentially so that the model can be trained on one sequence of data and then made to predict on the next sequence. This way, you train on past data and check how the model behaves on future data; in some sense, the training process is made to respect causality. If you specify a particular column as the TIMESTAMP BASED SPLITTING COLUMN and set the train/test split to 80:20, then the rows corresponding to the first 80% of timestamps go into the train set and the rest into the test set. This kind of split helps show how well the model performs on future data when exposed only to past data. | Select the suitable column that contains the timestamps |
Timestamp Based Splitting Method | TIMESTAMP_BASED_SPLITTING_METHOD | Select the method used to split data based on a timestamp: (i) Test Split Percentage Based: If you set the train/test split to 80:20, then the rows corresponding to the first 80% of timestamps go into the train set and the rest into the test set. (ii) Test Start Timestamp Based: Specify a timestamp such that all rows with that timestamp or beyond go into the test set and the remaining rows become part of the train set. | Test Split Percentage Based / Test Start Timestamp Based |
Test Splitting Timestamp | TEST_SPLITTING_TIMESTAMP | When the TIMESTAMP BASED SPLITTING COLUMN is selected, you can also set an extra parameter, TEST SPLITTING TIMESTAMP, to specify a timestamp such that all rows with that timestamp or beyond go into the test set and the remaining rows become part of the train set. | Select the date from the calendar |
Use Multihot Encoding | USE_MULTIHOT_ENCODING | Use multihot encoding for a multi-valued categorical column. | multi-valued categorical column |
Dropout | DROPOUT | Dropout percentage in deep neural networks. It is a regularization method used for better generalization over new data. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much and enhances their isolated learning. | 0 to 90 |
Batch Size | BATCH_SIZE | The number of data points provided to the model at once (in one batch) during training. The batch size impacts how quickly a model learns and the stability of the learning process. It is an important hyperparameter that should be well-tuned. For more details, please visit our Blog. | 16 / 32 / 64 / 128 |
Sample Weight | SAMPLE_WEIGHT_COLUMN | Sets the weight of each training sample in the objective that is optimized. Metrics are also reported based on this weight column. | Any numeric column. Samples with zero weight are discarded. |
Rebalance Classes | REBALANCE_CLASSES | Applies weights to each sample in inverse proportion to the frequency of the target class. This tends to increase minority class recall at the expense of precision. | Yes / No |
Min Categorical Count | MIN_CATEGORICAL_COUNT | Minimum threshold to consider a value different from the unknown placeholder. | Integers from 1 to 100 with a step of 5 |
Training Rows Downsample Ratio | TRAINING_ROWS_DOWNSAMPLE_RATIO | Uses this ratio to train on a sample of the dataset provided. | 0.01 to 0.9 |
Ignore Datetime Features | IGNORE_DATETIME_FEATURES | Remove all datetime features from the model. Useful while generalizing to different time periods. | true or false |
Use Pretrained Embeddings | USE_PRETRAINED_EMBEDDINGS | Whether to use pretrained embeddings or not. | true or false |
Max Text Words | MAX_TEXT_WORDS | Maximum number of words to use from text fields. | 100 to 1000 |
AutoML Ensemble Size | AUTOML_ENSEMBLE_SIZE | Number of suitable neural network architectures found by AutoML for the provided dataset(s) that are combined via ensemble averaging. | 2 to 10 |
AutoML Initial Learning Rate | AUTOML_INITIAL_LEARNING_RATE | Initial learning rate for seed architectures generated by AutoML. | decimal (0.0001, 0.01) |
Target Transform | TARGET_TRANSFORM | Specify a transform (e.g. log, quantile) to apply to the target variable. | log / quantile / yeo-johnson / box-cox |
Loss Function | LOSS_FUNCTION_REG | This option provides a choice of the objective (loss) function that is optimized during training. The choice of loss function can be important in achieving good performance on the desired metric. If set to 'Automatic', every eligible algorithm gets an appropriate loss function that is generally known to work well with that algorithm. Specific parameters of a loss function can be overridden using the 'LOSS PARAMETERS' field. | Automatic / Custom / Huber / Mean Squared Error / Mean Absolute Error / Mean Absolute Percentage Error / Mean Squared Logarithmic Error / Cross Entropy / Focal Cross Entropy |
Loss Parameters | LOSS_PARAMETERS_REG | This field accepts keyword parameters that override the defaults of a loss function. For example, the exponent weight gamma for 'Focal Cross Entropy' loss can be provided; if a value is not provided, a default value of the parameter is used that is known to work well with that loss function. Format: key1=value1;key2=value2;... (a short parsing illustration follows this table). | Cross Entropy: label_smoothing (float, [0, 1]); Focal Cross Entropy: gamma (float, [0, 1]); Huber: delta (float, [0, inf)) |
Custom Loss Functions | CUSTOM_LOSS_FUNCTIONS_REG | Select all/any registered custom loss functions which you want to use as the objective function during training. When a loss function of a certain type is selected, it is applied to all algorithms which support that particular loss type. At most one loss function of a given type can be added. Unused selections are ignored during training, and algorithms which don't support any of the selected loss functions are not trained. The following lists the loss types and the algorithms compatible with each: 1. Regression - Deep Learning (TensorFlow): Abacus Deep Learning - Best Fit Neural Network, Abacus Deep Learning - Best Fit Neural Network with Text Embeddings, Abacus Deep Learning - HPO DNN, Abacus Deep Learning - AutoML. 2. Classification - Deep Learning (TensorFlow): Abacus Deep Learning - Best Fit Neural Network, Abacus Deep Learning - Best Fit Neural Network with Text Embeddings, Abacus Deep Learning - HPO DNN, Abacus Deep Learning - AutoML. | Registered custom loss functions eligible for training |
Custom Metrics | CUSTOM_METRICS | Metrics are used to evaluate the performance of the trained model. The platform already calculates a number of metrics for each problem, which are shown on the metrics page. Use this option to select and evaluate any additional custom metrics that are registered. | Registered custom metrics eligible for the model |
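To make the splitting options above concrete, here is a minimal pandas sketch of the four strategies (random sampling, timestamp based, row indicator based, and sampling unit keys). It is a conceptual approximation rather than the platform's implementation; the column names (`event_time`, `split_flag`, `sensor_id`) and the 20% test fraction are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def random_split(df, test_frac=0.2, seed=0):
    """Random Sampling: hold out roughly `test_frac` of the rows as test data."""
    rng = np.random.default_rng(seed)
    test_mask = rng.random(len(df)) < test_frac
    return df[~test_mask], df[test_mask]

def timestamp_split(df, ts_col, test_frac=0.2):
    """Timestamp Based (approximation): sort by the timestamp column and hold out
    the most recent `test_frac` of rows, so the model trains on the past and is
    evaluated on the future."""
    df_sorted = df.sort_values(ts_col)
    n_train = int(round(len(df_sorted) * (1 - test_frac)))
    return df_sorted.iloc[:n_train], df_sorted.iloc[n_train:]

def row_indicator_split(df, indicator_col):
    """Row Indicator Based: rows whose indicator column equals 'TEST' become test data;
    the remaining rows would then be split between train/val randomly."""
    test_mask = df[indicator_col].eq("TEST")
    return df[~test_mask], df[test_mask]

def sampling_unit_split(df, key_cols, test_frac=0.2, seed=0):
    """Sampling Unit Keys (simplified): every row sharing a composite key lands in the
    same split. This version samples a fraction of unique keys; the platform instead
    targets the fraction of rows."""
    keys = df[key_cols].drop_duplicates()
    test_keys = keys.sample(frac=test_frac, random_state=seed)
    marked = df.merge(test_keys.assign(_is_test=1), on=key_cols, how="left")
    test_mask = marked["_is_test"].notna().to_numpy()
    return df[~test_mask], df[test_mask]

# Example usage with assumed column names:
# train_df, test_df = timestamp_split(df, ts_col="event_time", test_frac=0.2)
# train_df, test_df = row_indicator_split(df, indicator_col="split_flag")
# train_df, test_df = sampling_unit_split(df, key_cols=["sensor_id"])
```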
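The key1=value1;key2=value2 format accepted by LOSS PARAMETERS can be pictured with the tiny parsing sketch below; the parameter string is a made-up example (gamma for Focal Cross Entropy, label_smoothing for Cross Entropy), not a recommended setting.

```python
# Hypothetical LOSS PARAMETERS string in the key1=value1;key2=value2 format.
loss_params = "gamma=0.5;label_smoothing=0.1"

# Parse it into a dictionary of parameter overrides.
overrides = {}
for pair in loss_params.split(";"):
    key, value = pair.split("=")
    overrides[key.strip()] = float(value)

print(overrides)  # {'gamma': 0.5, 'label_smoothing': 0.1}
```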
Our AI engine will calculate the following metrics for this use case:
Metric Name | Description |
---|---|
Accuracy | This metric calculates the percentage of correct predictions out of the total number of predictions made by the model. 100% means that the model completed the task with no errors. |
Area Under ROC Curve (AUC Curve) | AUC, or Area Under the Curve, describes a model's ability to distinguish between two or more classes; a higher AUC indicates better performance in correctly predicting positive instances as positive and negative instances as negative. Conversely, an AUC close to 0 suggests the model is incorrectly classifying negative instances as positive and vice versa. An AUC of 0.5 corresponds to random guessing, so a value well above 0.5 (for example, between 0.6 and 1) signifies that the model has learned meaningful patterns rather than making random guesses. AUC serves as a performance metric for classification problems across various threshold settings, offering an aggregate measure of performance. It is desirable because it is scale-invariant, assessing the ranking quality of predictions, and classification-threshold-invariant, evaluating prediction quality regardless of the chosen classification threshold. For more details, please visit this link. |
Class Label | This is the name of the class for which we are computing the metrics. |
Support | This is the number of occurrences of a class label in the dataset. For example, if the 'car' class appears 1,000 times in a dataset with 10,000 data points, then the support for the 'car' class is 1,000. |
Precision | Precision is the percentage of your results which are relevant. In other words, it is the fraction of relevant results among the retrieved results. It ranges from 0 to 1. The closer it gets to 1, the better. For further details, please visit this link. |
Recall | Recall is the percentage of total relevant results correctly classified by the model. In other words, it is the fraction of the total amount of relevant instances that were actually retrieved. It has a range from 0 to 1. The closer it gets to 1, the better. For further details, please visit this link. |
F1 Score | It is the harmonic mean of precision and recall. It is generally used when precision and recall are equally important for the classification. For further details, please visit this link. |
Loss | Loss is a measure of the model's error on the training objective; it indicates how well the model has been trained. The goal of the machine learning model is to minimize the loss value. It is sometimes hard to make sense of the loss value in isolation, so other accuracy measures like precision, recall, AUC, etc., are used in conjunction with loss to evaluate the model. For further details, please visit this link. |
Mean Absolute Error (MAE) | MAE is the average difference between the predicted and actual values. The lower the value of this metric the better. A score of 0 means that the model has perfect results. MAE uses the same scale as the data being measured. So, it is a scale-dependent accuracy measure and therefore cannot be used to make comparisons between the models that use different scales. It measures the average magnitude of the errors in a set of predictions, without considering their direction. It is a common measure of forecast error in time series analysis and is relatively easier to interpret (as compared to Root Mean Square Error). For further details, please visit this link. |
Weighted Mean Percentage Error | WAPE stands for weighted absolute percentage error. WAPE can be construed as the average absolute error divided by the average actual quantity. It is a very simple calculation: 1) You have a set of actual values (often called actuals). 2) You have a set of forecast values for which you want to calculate the WAPE. 3) For each actual value, calculate the absolute difference between the actual value and the forecasted value, and sum all the absolute differences. 4) Divide the sum by the sum of all the actual values. A worked example appears in the sketch at the end of this section. For further details, please visit the wiki link. |
Root Mean Square Error (RMSE) | It is the square root of the average of the squared differences between the predicted and actual values. In other words, the difference between the predicted and actual values is calculated and squared; next, the average of all the squared differences is calculated; finally, the square root of that average is taken as the RMSE. This makes RMSE a non-negative value, and a score of 0 (almost never achieved in practice) would indicate a perfect fit to the data. In general, a lower RMSE score is better than a higher one. However, comparisons across different types of data would be invalid because the metric is dependent on the scale of the numbers used. The errors are squared before they are averaged, so RMSE gives a relatively high weight to large errors. This means that it is more useful when large errors are particularly undesirable, for example, camera calibration where being off by 5 degrees is more than twice as bad as being off by 2.5 degrees. Further, this makes RMSE sensitive to outliers. RMSE is harder to interpret than the Mean Absolute Error (MAE): each error influences MAE in direct proportion to its absolute value, which is not the case for RMSE. For further details, please visit this link. |
Coefficient of Determination (R2) | The coefficient of determination (denoted by R2) is a key output of regression analysis. To understand R2, we first need to understand dependent and independent variables. The dependent variable is the target column in the dataset (the value being predicted), while the independent variables are the features (columns) that contribute toward the change in the target. The coefficient of determination (R2) is the percentage of variance in the dependent variable that the independent variables explain collectively. The value of R2 ranges from 0 to 1, where 0 means none of the variation in the target variable is explained by the independent variables and 1 means all of the variation is explained. Thus, in general, the closer the value of R2 is to 1, the better. For further details, please visit this page. |
Note: In addition to the above metrics, our engine will train a baseline model and generate metrics for the baseline model. Typically the metrics for your custom deep learning model should be better than the baseline model.
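As a companion to the metric definitions above, the sketch below computes several of them (accuracy, precision, recall, F1, AUC, MAE, WAPE, RMSE, R2) with NumPy and scikit-learn. The arrays are made-up examples purely for illustration; the platform computes these metrics internally.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             roc_auc_score, mean_absolute_error, mean_squared_error,
                             r2_score)

# Classification example: true labels, predicted probabilities, and thresholded predictions.
y_true_cls = np.array([1, 0, 1, 1, 0, 1])
y_prob     = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.8])
y_pred_cls = (y_prob >= 0.5).astype(int)

print("Accuracy :", accuracy_score(y_true_cls, y_pred_cls))
print("Precision:", precision_score(y_true_cls, y_pred_cls))
print("Recall   :", recall_score(y_true_cls, y_pred_cls))
print("F1 Score :", f1_score(y_true_cls, y_pred_cls))
print("AUC      :", roc_auc_score(y_true_cls, y_prob))

# Regression example: actuals vs. forecasts.
y_true = np.array([100.0, 250.0, 80.0, 310.0])
y_pred = np.array([110.0, 240.0, 95.0, 290.0])

mae  = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))           # square root of the mean squared error
wape = np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()  # sum of |errors| / sum of actuals
r2   = r2_score(y_true, y_pred)

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  WAPE={wape:.3f}  R2={r2:.3f}")
```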