Dataset

A dataset reference

KEY TYPE Description
datasetId str The unique identifier of the dataset.
sourceType str The source of the dataset: EXTERNAL_SERVICE, UPLOAD, or STREAMING.
dataSource str Location of the data. It may be a URI, such as an S3 bucket, or a database table.
createdAt str The timestamp at which this dataset was created.
ignoreBefore str The timestamp before which all events are ignored when training.
ephemeral bool The dataset is ephemeral and not used for training.
lookbackDays int Specific to streaming datasets, this specifies how many days' worth of data to include when generating a snapshot. A value of 0 leaves this selection to the system.
databaseConnectorId str The Database Connector used.
databaseConnectorConfig dict The database connector query used to retrieve data.
connectorType str The type of connector used to get this dataset: FILE or DATABASE.
featureGroupTableName str The table name of the dataset's feature group.
applicationConnectorId str The Application Connector used.
applicationConnectorConfig dict The application connector query used to retrieve data.
incremental bool Whether the dataset is an incremental dataset.
isDocumentset bool Whether the dataset is a documentset.
extractBoundingBoxes bool Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.
mergeFileSchemas bool Whether the merge file schemas policy is enabled.
referenceOnlyDocumentset bool Signifies whether to save the data reference only. Only valid if is_documentset is True.
versionLimit int Version limit for the dataset.
latestDatasetVersion DatasetVersion The latest version of this dataset.
schema DatasetColumn List of resolved columns.
refreshSchedules RefreshSchedule List of schedules that determine when the next version of the dataset will be created.
parsingConfig ParsingConfig The parsing config used for the dataset.
documentProcessingConfig DocumentProcessingConfig The document processing config used for the dataset (when is_documentset is True).
attachmentParsingConfig AttachmentParsingConfig The attachment parsing config used for the dataset (e.g., for Salesforce attachment parsing).
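
Several fields in the table constrain one another: sourceType and connectorType take a fixed set of values, and extractBoundingBoxes and referenceOnlyDocumentset are only valid for documentsets. The sketch below shows one way to check those rules on a plain dict keyed by the names in the table. The helper check_dataset_fields and the sample values are hypothetical and are not part of any client library. It assumes the dataset is available as a dict with the camelCase keys listed above.

```python
from typing import Any, Dict, List

# Allowed values taken from the field descriptions in the table above.
SOURCE_TYPES = {"EXTERNAL_SERVICE", "UPLOAD", "STREAMING"}
CONNECTOR_TYPES = {"FILE", "DATABASE"}

# Fields the table describes as valid only for documentsets.
DOCUMENTSET_ONLY_FLAGS = ("extractBoundingBoxes", "referenceOnlyDocumentset")


def check_dataset_fields(dataset: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the problems found in a Dataset-shaped dict."""
    problems: List[str] = []

    source_type = dataset.get("sourceType")
    if source_type is not None and source_type not in SOURCE_TYPES:
        problems.append(f"unknown sourceType: {source_type!r}")

    connector_type = dataset.get("connectorType")
    if connector_type is not None and connector_type not in CONNECTOR_TYPES:
        problems.append(f"unknown connectorType: {connector_type!r}")

    # Documentset-only flags must not be set on a non-documentset dataset.
    if not dataset.get("isDocumentset"):
        for key in DOCUMENTSET_ONLY_FLAGS:
            if dataset.get(key):
                problems.append(f"{key} is only valid when isDocumentset is True")

    return problems
```

Called on a dict such as {"sourceType": "UPLOAD", "connectorType": "FILE", "isDocumentset": True, "extractBoundingBoxes": True}, the helper returns an empty list. Setting extractBoundingBoxes on a dataset whose isDocumentset is False adds one entry to the returned list.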