Training Parameters And Accuracy Measures

Our platform lets you adjust a set of training parameters. These include general training parameters and advanced training options, both of which can influence the model's predictions. The quality of those predictions is evaluated with a set of accuracy measures, or metrics, which are also described in this section.


Training Options

Once you have fulfilled all the feature group requirements for the use case, you can set the following general and advanced training configuration options to train your ML model:

| Training Option Name | Description | Possible Values |
| --- | --- | --- |
| Name | The name to give to the model being trained. The system generates a default name based on the name of the project the model belongs to. | Any alphanumeric characters, with a length of 5 to 60 characters. |
| Set Refresh Schedule (UTC) | The schedule on which your dataset is replaced by an updated copy from your storage bucket location. The value is a CRON time string that describes the schedule in the UTC time zone. | A string in CRON format. If you are unfamiliar with CRON syntax, Crontab Guru can translate an expression into natural language. |
| Number of Clusters to Use | The number of clusters to use. If set to Automatic, we choose the number of clusters that best reduces intra-cluster distance and increases inter-cluster distance. | An integer from 2 to 30 (in steps of 1), or Automatic. |
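
The constraints listed above can be checked before you submit a training configuration. The following is a minimal sketch, not the platform's own validation logic: the helper names are hypothetical, it assumes "alphanumeric" means letters and digits only (no spaces), and it assumes the standard five-field CRON layout (minute, hour, day of month, month, day of week).

```python
import re

# Hypothetical helpers for illustration only; the platform's real
# validation rules may differ from these assumptions.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]{5,60}")
CRON_FIELD_PATTERN = re.compile(r"[0-9A-Za-z*/,\-?]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if the name is 5 to 60 letters or digits."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_plausible_cron(expr: str) -> bool:
    """Rough shape check: five whitespace-separated CRON fields.

    This does not verify that each field is in range (for example,
    that the hour is 0-23); it only catches obviously malformed strings.
    """
    fields = expr.split()
    return len(fields) == 5 and all(
        CRON_FIELD_PATTERN.fullmatch(field) for field in fields
    )


def is_valid_cluster_count(value) -> bool:
    """Accept the string 'Automatic' or an integer from 2 to 30."""
    if value == "Automatic":
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 2 <= value <= 30


# Example: a refresh every Monday at 04:00 UTC in standard CRON syntax.
example_schedule = "0 4 * * 1"
```

A shape check like `is_plausible_cron` is deliberately loose; a tool such as Crontab Guru remains the easier way to confirm what a given expression actually means.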

Metrics

Our AI engine will calculate the following metrics for this use case:

| Metric Name | Description |
| --- | --- |
| Number of Clusters | The number of clusters created by the model. |
| Cluster Fractions | The fraction of points in each cluster. |
| Silhouette Coefficient | The silhouette score of the model. The best value is 1 and the worst value is -1; values near 0 indicate overlapping clusters. Higher values mean the clusters are further apart and more distinct. |
| Davies Bouldin Score | The Davies-Bouldin score of the model. The best value is 0. Lower scores indicate better clusters that are farther apart and less dispersed. |
| T-SNE Chart | A chart that represents the data points in 2 dimensions. Each cluster is shown in a different colour. |
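
To make the numeric metrics above concrete, here is a minimal pure-Python sketch of how cluster fractions, the silhouette coefficient, and the Davies-Bouldin score are conventionally defined. It is an illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, distances are Euclidean, and the input must contain at least two clusters with distinct centroids. Under the standard definition the silhouette coefficient ranges from -1 to 1.

```python
from collections import Counter, defaultdict
from math import dist


def _group(points, labels):
    """Group points by cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def cluster_fractions(labels):
    """Fraction of all points that fall in each cluster."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (Euclidean distance)."""
    clusters = _group(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def davies_bouldin_score(points, labels):
    """Mean, over clusters, of the worst (spread_i + spread_j) / centroid distance."""
    clusters = _group(points, labels)
    centroids = {
        label: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for label, pts in clusters.items()
    }
    # Spread: mean distance from each member to its cluster centroid.
    spread = {
        label: sum(dist(p, centroids[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }
    worst_ratios = [
        max(
            (spread[i] + spread[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)


# Two tight, well-separated groups of two points each (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]
bad_labels = [0, 1, 0, 1]  # deliberately mixes the two groups
```

With `good_labels`, each cluster holds half of the points, the silhouette coefficient is close to 1, and the Davies-Bouldin score is close to 0. With `bad_labels`, the silhouette coefficient becomes negative, which is why values below 0 signal points that sit closer to another cluster than to their own.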