
Required Feature Group Types

To train a model under this use case, you will need to create feature groups of the following type(s):

Feature Group Type	API Configuration Name	Required	Description
Labeled document data	DOCUMENTS	True	For detailed guidelines on the format of documents and annotations, please refer to the "Named Entity Recognition Guidelines" use case documentation.

Note: Once you have uploaded datasets that comply with the required schema of each Feature Group Type, you will need to create the machine learning (ML) features used to train your ML model(s). We use the term "Feature Group" for a group of ML features (dataset columns) under a specific Feature Group Type. Our system supports extensible schemas, which let you provide any number of additional columns/features that you think are relevant to that Feature Group Type.

Feature Group: Labeled document data

For detailed guidelines on the format of documents and annotations, please refer to the "Named Entity Recognition Guidelines" use case documentation.

Feature Mapping	Required	Description
DOCUMENT	Y	Document text. Represents either the full content or tokenized segments of the document. For example: {content: "sample text 1"} or {tokens: [{content: "sample", start_offset: "00:00", end_offset: "00:07"}]}
ANNOTATIONS	N	Lists of labels for document text. Includes text-based annotations with start and end offsets, and supports bounding box annotations for precise labeling. For example: [{text_extraction: {text_segment: {end_offset: 6, start_offset: 0}}, display_name: "Sample annotation label 1"}]
DOCUMENT_ID	N	The unique identifier of the document.
STATUS	N	The status of the review document.
COMMENTS	N	Comments on the review document.
METADATA	N	Metadata of the annotation.
ROW_ID	N	The unique identifier of the feature group row.
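
As a rough illustration of the mappings above, the following Python sketch assembles one labeled row and sanity-checks it before upload. It is not taken from the platform's SDK. The helper names (make_annotation, make_row, missing_required) are hypothetical, and two things are assumed for illustration: that a row can be keyed directly by feature mapping name, and that end_offset is exclusive (so offsets 0 to 6 cover "sample"). Check the "Named Entity Recognition Guidelines" for the authoritative format.

```python
import json

# Feature mappings from the table above; only DOCUMENT is marked as required.
REQUIRED_MAPPINGS = {"DOCUMENT"}


def make_annotation(text, start_offset, end_offset, label):
    """Build one text-segment annotation in the shape shown in the ANNOTATIONS row.

    Assumption: offsets are character positions and end_offset is exclusive.
    """
    if not 0 <= start_offset < end_offset <= len(text):
        raise ValueError(
            f"offsets {start_offset}-{end_offset} fall outside the document text"
        )
    return {
        "text_extraction": {
            "text_segment": {"start_offset": start_offset, "end_offset": end_offset}
        },
        "display_name": label,
    }


def make_row(document_id, text, spans):
    """Assemble one row (hypothetical layout); spans is a list of (start, end, label)."""
    return {
        "DOCUMENT_ID": document_id,
        "DOCUMENT": {"content": text},
        "ANNOTATIONS": [
            make_annotation(text, start, end, label) for start, end, label in spans
        ],
    }


def missing_required(row):
    """Return the required feature mappings that the row does not provide."""
    return sorted(REQUIRED_MAPPINGS - set(row))


text = "sample text 1"
row = make_row("doc-001", text, [(0, 6, "Sample annotation label 1")])
labeled = text[0:6]  # the text segment covered by the annotation

print(json.dumps(row, indent=2))
print("missing required mappings:", missing_required(row))
```

Validating offsets against the document text before upload catches annotations that point past the end of the content, which would otherwise only surface later as a training or schema error.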