Clustering

Overview

Choose this use case to build an unsupervised learning model that groups unlabelled data into meaningful clusters based on patterns and similarities in the data.

Before you start this tutorial, complete these steps:

  1. Log in to the Abacus developer platform
  2. Create a new project of type "Clustering"
  3. Provide a name, and click on Skip to Project Dashboard.

If you have trouble creating a new project, follow this guide.

Example use cases would be:

  • Customer segmentation for targeted marketing campaigns
  • Anomaly detection in fraud prevention or system monitoring
  • Market basket analysis for product recommendations
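
To make the idea of grouping unlabelled data concrete, here is a minimal sketch of k-means, one common clustering algorithm, in plain Python. It is only an illustration: this guide does not state which algorithm the platform uses, and the customer data, feature meanings, and starting centroids below are invented for the example. The sketch handles numeric features only.

```python
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    moved = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            moved.append(old)  # an empty cluster keeps its previous centroid
            continue
        moved.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return moved


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical customers: (orders per year, average order value)
customers = [(2, 20.0), (3, 25.0), (2, 22.0), (40, 210.0), (42, 190.0), (38, 205.0)]
labels, centroids = kmeans(customers, initial_centroids=[customers[0], customers[3]])
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

In this toy customer segmentation, the occasional buyers land in cluster 0 and the frequent, high-spend buyers in cluster 1, without any labels being provided.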

Steps to Train a Model

Step 1: Ingest data into the platform

Once you are ready to upload data, follow these steps:

  1. Click on Datasets in the left side panel of the project page.
  2. Click on Create Dataset in the top right corner.
  3. IMPORTANT: Provide a name and choose the CUSTOM_TABLE Feature Group type.

Step 2: Configuring Feature Groups

Once the platform has finished inspecting the dataset, navigate to the Feature Group area from the left panel. You should see a Feature Group with the same name as the dataset you created.

  1. Navigate to Feature Group → Features
  2. Select the appropriate Feature Group type (see Required Feature Group Types)

To learn more about how feature group mapping works, visit our Feature Group mapping guide for detailed configuration instructions.

Step 3: Model Training

You are now ready to train your model. Create a new model by navigating to the Models page.

Select your training Feature Group, and start the training process.

After you click on train:

  • Select your training Feature Group(s): Choose the Feature Groups you want to use for training.
  • Configure training options: We recommend keeping the default values, but you can customize settings based on your needs. Click the (?) icon next to any option to see what it does.
  • Monitor training progress: Your model will begin training automatically. You can track its status in real time.

For use cases with ground truth data, you'll be able to evaluate model performance through the Metrics page, accessible directly from your model's dashboard.

Step 4: Deploying your model

Once the model is ready, click on the Deploy button within the model's page.

  • Offline Deployment Mode: Select this option when you want the model to be available for batch processing.
  • Online Deployment Mode: Select this option when you want the model to be available for use via the API.

The Deployment page of a model includes these important options:

  • Prediction API: Clicking on this button will provide you with a sample API request to call this model externally.
  • Switch Version/Algorithm: This option allows you to change the model version or algorithm used for predictions.
  • Auto Deployment: Controls whether the newest version of the model is deployed automatically each time the model is retrained.

To learn more about Deployments, visit our Deployments Guide.

Your model is now ready to use!

Batch Predictions

Batch prediction is the process of using a deployed model to make predictions on a batch of input data. It works the same way for every machine learning model on the platform.

The process is as follows, and is applicable for all machine learning project types:

  1. Upload data, ensuring that it is in the same format as the model training data
  2. Navigate to Batch Predictions → Create New Batch Prediction from the left side panel
  3. Follow the wizard
    • Select the deployment (this is the model that will be used to make predictions)
    • Select the input Feature Group for the batch prediction
    • Select the output Feature Group for the batch prediction

Abacus will now use the deployed model you selected to create predictions. The output Feature Group will have columns with the predicted values. If you re-run the same batch prediction, a new version of the output Feature Group will be created with the new output.
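
Before uploading batch input, it can help to confirm locally that it matches the training data. The sketch below is one way to do that, assuming both files are CSV and that "same format" means the same column names; the guide does not define the exact requirement, and the `compare_columns` helper and the column names are hypothetical.

```python
import csv
import io


def header_of(csv_text):
    """Return the column names from the first row of a CSV string."""
    return next(csv.reader(io.StringIO(csv_text)))


def compare_columns(training_csv, batch_csv):
    """Report columns that differ between the training and batch inputs."""
    train_cols = header_of(training_csv)
    batch_cols = header_of(batch_csv)
    return {
        "missing": [c for c in train_cols if c not in batch_cols],
        "unexpected": [c for c in batch_cols if c not in train_cols],
    }


# Hypothetical files: the batch input has a misnamed column.
training_csv = "item_id,orders_per_year,avg_order_value\nc1,2,20.0\n"
batch_csv = "item_id,orders_per_year,order_value\nc9,5,31.5\n"

result = compare_columns(training_csv, batch_csv)
print(result)  # → {'missing': ['avg_order_value'], 'unexpected': ['order_value']}
```

Empty `missing` and `unexpected` lists indicate that the batch input has the same columns as the training data.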

To learn more about Batch Predictions, visit our Batch Predictions Guide.

Required Feature Group Types

To train a model under this use case, you will need to create feature groups of the following type(s):

| Feature Group Type | API Configuration Name | Required | Description |
| --- | --- | --- | --- |
| Custom Table | TABLE | True | This dataset contains the features to be used for clustering. The model will analyze these features to identify natural groupings and patterns in the data. |

Feature Group: Custom Table

This dataset contains the features to cluster. The model will automatically identify patterns and group similar data points together.

| Feature Mapping | Feature Type | Required | Description |
| --- | --- | --- | --- |
| [FEATURE_1] | Numeric/Categorical | Y | Any relevant feature that describes the data points to be clustered |
| [FEATURE_2] | Numeric/Categorical | Y | Additional features that capture different aspects of the data |
| [FEATURE_N] | Numeric/Categorical | Y | Include as many features as needed to capture the characteristics of your data. The more relevant features, the better the clustering results. |
| ITEM_ID | Text | N | Optional identifier for each data point to track cluster assignments |
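
As an illustration of the layout described in the table, the sketch below builds a small CSV with one optional identifier column and several feature columns. It assumes the data is uploaded as CSV, which this guide does not state. The column names and the `build_csv` helper are hypothetical. Mapping the identifier column to ITEM_ID still happens in the platform's Feature Group configuration.

```python
import csv
import io

ID_COLUMN = "item_id"  # hypothetical name; would be mapped to ITEM_ID
FEATURE_COLUMNS = ["orders_per_year", "avg_order_value", "region"]  # hypothetical


def build_csv(rows):
    """Serialize rows to CSV text with the identifier first, then the features."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[ID_COLUMN] + FEATURE_COLUMNS)
    writer.writeheader()
    for row in rows:
        # DictWriter raises ValueError if a row has a key outside fieldnames.
        writer.writerow(row)
    return buffer.getvalue()


rows = [
    {"item_id": "c1", "orders_per_year": 2, "avg_order_value": 20.0, "region": "north"},
    {"item_id": "c2", "orders_per_year": 40, "avg_order_value": 210.0, "region": "south"},
]
csv_text = build_csv(rows)
print(csv_text)
```

The example mixes numeric columns (`orders_per_year`, `avg_order_value`) with a categorical one (`region`), matching the Numeric/Categorical feature types listed above.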

Predictions

You can call any deployed model on the Abacus platform externally through an API. The steps to do this are:

  1. Deploy the Model
  2. Navigate to the Deployments Page
  3. Click on the Predictions API button on the left side

That will give you the exact API endpoints and tokens you need to use the deployment.

The Relevant API References for this use case are: