Required Feature Group Types

To train a model under this use case, you will need to create feature groups of the following type(s):

Feature Group Type | API Configuration Name | Required | Description
Labeled document data | DOCUMENTS | True | For detailed guidelines on the format of documents and annotations, please refer to the "Named Entity Recognition Guidelines" use case documentation.

Note: Once you upload datasets that comply with the required schema of each Feature Group Type, you will need to create the machine learning (ML) features that will be used to train your ML model(s). We use the term "Feature Group" for a group of ML features (dataset columns) under a specific Feature Group Type. Our system supports extensible schemas, which enable you to provide any number of additional columns/features that you think are relevant to that Feature Group Type.


Feature Group: Labeled document data

For detailed guidelines on the format of documents and annotations, please refer to the "Named Entity Recognition Guidelines" use case documentation.

Feature Mapping | Feature Type | Required | Description
DOCUMENT | | Y | Document text. Represents either the full content or tokenized segments of the document. For example: {content: "sample text 1"} or {tokens: [{content: "sample", start_offset: "00:00", end_offset: "00:07"}]}
ANNOTATIONS | | N | Lists of labels for document text. Includes text-based annotations with start and end offsets, and supports bounding box annotations for precise labeling. For example: [{text_extraction: {text_segment: {end_offset: 6, start_offset: 0}}, display_name: "Sample annotation label 1"}]
DOCUMENT_ID | | N | The unique identifier of the document.
STATUS | | N | The status of the review document.
COMMENTS | | N | Comments on the review document.
METADATA | | N | Metadata of the annotation.
ROW_ID | | N | The unique identifier of the feature group row.
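
As an illustration of the DOCUMENT and ANNOTATIONS examples above, the following minimal Python sketch assembles one row with a text-based annotation and checks that each span falls inside the document text. The helper name make_row is hypothetical and not part of any platform API. The sketch also assumes that offsets are character indices with an exclusive end offset, and that each column holds a nested structure; confirm both against the "Named Entity Recognition Guidelines" before relying on them.

```python
import json


def make_row(document_id, content, spans):
    """Build one labeled-document row (hypothetical helper).

    spans is a list of (start_offset, end_offset, label) tuples.
    Assumption: offsets are character indices and end_offset is exclusive.
    """
    annotations = []
    for start, end, label in spans:
        # Reject spans that do not lie inside the document text.
        if not (0 <= start < end <= len(content)):
            raise ValueError(f"span ({start}, {end}) is outside the document text")
        annotations.append({
            "text_extraction": {
                "text_segment": {"start_offset": start, "end_offset": end}
            },
            "display_name": label,
        })
    return {
        "DOCUMENT_ID": document_id,
        "DOCUMENT": {"content": content},
        "ANNOTATIONS": annotations,
    }


row = make_row("doc-1", "sample text 1", [(0, 6, "Sample annotation label 1")])
print(json.dumps(row, indent=2))
```

Validating offsets before upload is a convenience check only; it does not replace the schema validation performed when the feature group is created.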