A dataset reference
Key | Type | Description |
---|---|---|
datasetId | str | The unique identifier of the dataset. |
sourceType | str | The source of the dataset: EXTERNAL_SERVICE, UPLOAD, or STREAMING. |
dataSource | str | Location of the data. It may be a URI, such as an S3 bucket, or a database table. |
createdAt | str | The timestamp at which this dataset was created. |
ignoreBefore | str | The cutoff timestamp; all events before it are ignored when training. |
ephemeral | bool | The dataset is ephemeral and not used for training. |
lookbackDays | int | Specific to streaming datasets, this specifies how many days' worth of data to include when generating a snapshot. A value of 0 leaves this selection to the system. |
databaseConnectorId | str | The Database Connector used. |
databaseConnectorConfig | dict | The database connector query used to retrieve data. |
connectorType | str | The type of connector used to get this dataset: FILE or DATABASE. |
featureGroupTableName | str | The table name of the dataset's feature group. |
applicationConnectorId | str | The Application Connector used. |
applicationConnectorConfig | dict | The application connector query used to retrieve data. |
incremental | bool | Whether the dataset is an incremental dataset. |
isDocumentset | bool | Whether the dataset is a documentset. |
extractBoundingBoxes | bool | Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True. |
mergeFileSchemas | bool | Whether the merge file schemas policy is enabled. |
referenceOnlyDocumentset | bool | Signifies whether to save the data reference only. Only valid if is_documentset is True. |
versionLimit | int | Version limit for the dataset. |
latestDatasetVersion | DatasetVersion | The latest version of this dataset. |
schema | DatasetColumn | List of resolved columns. |
refreshSchedules | RefreshSchedule | List of schedules that determine when the next version of the dataset will be created. |
parsingConfig | ParsingConfig | The parsing config used for the dataset. |
documentProcessingConfig | DocumentProcessingConfig | The document processing config used for the dataset (when is_documentset is True). |
attachmentParsingConfig | AttachmentParsingConfig | The attachment parsing config used for the dataset (e.g., for Salesforce attachment parsing). |
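
As a rough illustration of how the fields above relate to one another, the following sketch reads a dataset reference that has already been decoded into a plain Python dict and checks the constraints the table states. The `DatasetSummary` class, the `parse_dataset` and `check_dataset` helpers, and the sample values are all hypothetical and not part of any client library; only the key names and the allowed values come from the table.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Allowed values as listed in the table above.
SOURCE_TYPES = {"EXTERNAL_SERVICE", "UPLOAD", "STREAMING"}
CONNECTOR_TYPES = {"FILE", "DATABASE"}


@dataclass
class DatasetSummary:
    """Hypothetical container for a subset of the documented fields."""
    dataset_id: str
    source_type: str
    connector_type: Optional[str] = None
    is_documentset: bool = False
    extract_bounding_boxes: bool = False
    reference_only_documentset: bool = False
    lookback_days: int = 0


def parse_dataset(payload: Dict[str, Any]) -> DatasetSummary:
    """Pick the fields of interest out of a decoded dataset reference."""
    return DatasetSummary(
        dataset_id=payload["datasetId"],
        source_type=payload["sourceType"],
        connector_type=payload.get("connectorType"),
        is_documentset=bool(payload.get("isDocumentset", False)),
        extract_bounding_boxes=bool(payload.get("extractBoundingBoxes", False)),
        reference_only_documentset=bool(
            payload.get("referenceOnlyDocumentset", False)
        ),
        # Treat a missing or null value like 0 (selection left to the system).
        lookback_days=int(payload.get("lookbackDays") or 0),
    )


def check_dataset(ds: DatasetSummary) -> List[str]:
    """Return a list of inconsistencies with the documented constraints."""
    problems: List[str] = []
    if ds.source_type not in SOURCE_TYPES:
        problems.append(f"unknown sourceType: {ds.source_type}")
    if ds.connector_type is not None and ds.connector_type not in CONNECTOR_TYPES:
        problems.append(f"unknown connectorType: {ds.connector_type}")
    if not ds.is_documentset:
        # Both flags are documented as valid only for documentsets.
        if ds.extract_bounding_boxes:
            problems.append("extractBoundingBoxes requires isDocumentset")
        if ds.reference_only_documentset:
            problems.append("referenceOnlyDocumentset requires isDocumentset")
    if ds.lookback_days != 0 and ds.source_type != "STREAMING":
        problems.append("lookbackDays is specific to streaming datasets")
    return problems


# Made-up sample payload, for illustration only.
sample = {
    "datasetId": "abc123",
    "sourceType": "STREAMING",
    "connectorType": "FILE",
    "lookbackDays": 7,
}
print(check_dataset(parse_dataset(sample)))
```

Returning a list of messages rather than raising on the first problem is a design choice for this sketch: it lets a caller report every inconsistency in one pass.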