To train a model under this use case, you will need to create feature groups of the following type(s):
Feature Group Type | API Configuration Name | Required | Description |
---|---|---|---|
List of documents | DOCUMENTS | True | This dataset contains the documents that you want to use to fine-tune your LLM. |
Evaluation | EVALUATION | False | The Evaluation dataset is used to evaluate the model's performance. It contains a list of questions and their expected answers. |
Note: After you upload datasets that comply with the required schema for each Feature Group Type, you will need to create the machine learning (ML) features used to train your ML model(s). We use the term "Feature Group" for a group of ML features (dataset columns) under a specific Feature Group Type. Our system supports extensible schemas, which let you provide any number of additional columns/features that you think are relevant to that Feature Group Type.
The List of documents dataset (DOCUMENTS) contains the documents that you want to use to fine-tune your LLM.
Feature Mapping | Feature Type | Required | Description |
---|---|---|---|
DOCUMENT | | Y | The document text |
DOCUMENT_ID | | Y | The unique document identifier |
DOCUMENT_SOURCE | | N | The source URL of the document |
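As a rough illustration of the DOCUMENTS schema, the sketch below checks a list of rows before upload. The column names (`document_id`, `document`, `document_source`) and the helper `validate_documents` are hypothetical choices made for this example; only the feature mappings and their required/optional status come from the table above.

```python
# Hypothetical column names; map them to DOCUMENT_ID, DOCUMENT and
# DOCUMENT_SOURCE when you configure the feature group.
REQUIRED_COLUMNS = ("document_id", "document")


def validate_documents(rows):
    """Return a list of problems found in the rows (empty if none)."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Every required column must hold a non-empty string.
        for col in REQUIRED_COLUMNS:
            if not (row.get(col) or "").strip():
                problems.append(f"row {i}: missing required column '{col}'")
        # DOCUMENT_ID is described as unique, so flag repeated values.
        doc_id = (row.get("document_id") or "").strip()
        if doc_id and doc_id in seen_ids:
            problems.append(f"row {i}: duplicate document_id '{doc_id}'")
        seen_ids.add(doc_id)
    return problems


rows = [
    {"document_id": "doc-1", "document": "First document text."},
    # document_source is optional and may be omitted or included.
    {"document_id": "doc-2", "document": "Second document text.",
     "document_source": "https://example.com/doc-2"},
]
print(validate_documents(rows))
```

A check like this is optional; it simply catches missing text or repeated identifiers before the dataset is uploaded.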
The Evaluation dataset is used to evaluate the model's performance. It contains a list of questions and their expected answers.
Feature Mapping | Feature Type | Required | Description |
---|---|---|---|
QUESTION | | Y | Question used to evaluate the model |
ANSWER | | N | The question's expected answer |
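An EVALUATION dataset can be prepared as a simple two-column file. The sketch below serializes question/answer pairs to CSV text using Python's standard `csv` module; the column names (`question`, `answer`) and the helper `write_evaluation_csv` are assumptions made for illustration, not part of the platform's API. Because ANSWER is optional, a missing answer is written as an empty cell.

```python
import csv
import io


def write_evaluation_csv(pairs):
    """Serialize (question, answer) pairs to CSV text; answer may be None."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Hypothetical column names, to be mapped to QUESTION and ANSWER.
    writer.writerow(["question", "answer"])
    for question, answer in pairs:
        # QUESTION is required, so reject empty or whitespace-only values.
        if not question or not question.strip():
            raise ValueError("QUESTION is required for every evaluation row")
        writer.writerow([question.strip(), answer or ""])
    return buf.getvalue()


csv_text = write_evaluation_csv([
    ("What is the refund window?", "30 days"),
    ("Who approves travel requests?", None),  # no expected answer provided
])
print(csv_text)
```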