Arguments:

REQUIRED	KEY	TYPE	DESCRIPTION
Yes	tableName	str	Organization-unique table name for this dataset.
No	fileFormat	str	The file format of the dataset.
No	csvDelimiter	str	The delimiter to use when the file format is CSV.
No	isDocumentset	bool	Whether the dataset is a docstore dataset. A docstore dataset contains documents such as images, PDFs, and audio files, or tabular data with links to such files.
No	extractBoundingBoxes	bool	Whether to extract bounding boxes from the documents. Only valid if isDocumentset is True.
No	parsingConfig	ParsingConfig	Custom config for dataset parsing.
No	mergeFileSchemas	bool	Whether to merge the schemas of all files in the dataset. If isDocumentset is True, this also defaults to True.
No	documentProcessingConfig	DatasetDocumentProcessingConfig	The document processing configuration. Only valid if isDocumentset is True.
No	versionLimit	int	The number of recent versions to preserve for the dataset (minimum 30).

parsingConfig (ParsingConfig):

KEY	TYPE	DESCRIPTION
filePathWithSchema	str	Path to the file with schema. Defaults to None.
escape	str	Escape character for CSV files. Defaults to '"'.
csvDelimiter	str	Delimiter for CSV files. Defaults to None.

documentProcessingConfig (DatasetDocumentProcessingConfig):

KEY	TYPE	DESCRIPTION
pageTextColumn	str	Name of the output column that contains the extracted text for each page. If not provided, no column is created.
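The constraints stated in the descriptions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_arguments` and `MIN_VERSION_LIMIT` are hypothetical names, and the `"CSV"` value in the usage lines is an assumed spelling of the file format.

```python
from typing import Any, Dict, Optional

MIN_VERSION_LIMIT = 30  # from the versionLimit description: "minimum 30"


def build_arguments(
    table_name: str,
    file_format: Optional[str] = None,
    csv_delimiter: Optional[str] = None,
    is_documentset: Optional[bool] = None,
    extract_bounding_boxes: Optional[bool] = None,
    document_processing_config: Optional[Dict[str, Any]] = None,
    version_limit: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: check documented constraints, return camelCase arguments."""
    if not table_name:
        raise ValueError("tableName is required")
    if not is_documentset:
        # Both options are documented as valid only for document sets.
        if extract_bounding_boxes:
            raise ValueError("extractBoundingBoxes requires isDocumentset to be True")
        if document_processing_config is not None:
            raise ValueError("documentProcessingConfig requires isDocumentset to be True")
    if version_limit is not None and version_limit < MIN_VERSION_LIMIT:
        raise ValueError(f"versionLimit must be at least {MIN_VERSION_LIMIT}")

    optional = {
        "fileFormat": file_format,
        "csvDelimiter": csv_delimiter,
        "isDocumentset": is_documentset,
        "extractBoundingBoxes": extract_bounding_boxes,
        "documentProcessingConfig": document_processing_config,
        "versionLimit": version_limit,
    }
    args: Dict[str, Any] = {"tableName": table_name}
    # Leave unset options out so that server-side defaults apply.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


# Usage: the values here are illustrative only.
example = build_arguments("my_table", file_format="CSV", csv_delimiter=",", version_limit=30)
```

Omitting unset keys, rather than sending explicit nulls, leaves defaults such as the one for mergeFileSchemas to the service.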

Note: Arguments for the API methods use camelCase; the Python SDK uses the underscore_case equivalents (for example, isDocumentset becomes is_documentset).
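The naming rule in this note is mechanical, so the two spellings can be derived from each other. A small sketch follows; `camel_to_snake` and `snake_to_camel` are hypothetical helper names, not SDK functions.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert an API-style camelCase key to the underscore_case form."""
    # Insert "_" before every uppercase letter except a leading one, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_to_camel(name: str) -> str:
    """Convert an underscore_case name back to the camelCase API key."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


print(camel_to_snake("isDocumentset"))         # is_documentset
print(camel_to_snake("extractBoundingBoxes"))  # extract_bounding_boxes
print(snake_to_camel("upload_id"))             # uploadId
```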

Response:

KEY	TYPE	DESCRIPTION
success	Boolean	true if the call succeeded, false if there was an error.
result	Upload	The upload, described by the fields below.

result (Upload):

KEY	TYPE	DESCRIPTION
uploadId	str	The unique ID generated when the upload of a large file in smaller parts is initiated.
datasetUploadId	str	Same as uploadId. Kept for backwards compatibility.
status	str	The current status of the upload.
datasetId	str	A reference to the dataset this upload is adding data to.
datasetVersion	str	A reference to the dataset version the upload is adding data to.
modelId	str	A reference to the model the upload is creating a version for.
modelVersion	str	A reference to the model version the upload is creating.
batchPredictionId	str	A reference to the batch prediction the upload is creating.
parts	List[dict]	A list containing the order of the file parts that have been uploaded.
createdAt	str	The timestamp at which the upload was created.
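A caller would typically check `success` before reading `result`. The sketch below shows one way to do that for an already decoded JSON body. The `Upload` dataclass and `parse_response` are hypothetical local names, every value in `sample` is a made-up placeholder, and the shape of an error response is not documented here, so a generic exception is raised.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Upload:
    """Local stand-in for a subset of the documented Upload fields."""
    upload_id: Optional[str] = None
    status: Optional[str] = None
    dataset_id: Optional[str] = None
    dataset_version: Optional[str] = None
    parts: List[dict] = field(default_factory=list)
    created_at: Optional[str] = None


def parse_response(payload: Dict[str, Any]) -> Upload:
    """Return the Upload from a decoded response, or raise if success is false."""
    if not payload.get("success"):
        raise RuntimeError("API call reported success = false")
    result = payload.get("result") or {}
    return Upload(
        # datasetUploadId is documented as a backwards-compatible alias of uploadId.
        upload_id=result.get("uploadId") or result.get("datasetUploadId"),
        status=result.get("status"),
        dataset_id=result.get("datasetId"),
        dataset_version=result.get("datasetVersion"),
        parts=result.get("parts") or [],
        created_at=result.get("createdAt"),
    )


# Placeholder payload for illustration; none of these values come from a real call.
sample = {
    "success": True,
    "result": {
        "uploadId": "upload-123",
        "datasetUploadId": "upload-123",
        "status": "SOME_STATUS",
        "datasetId": "dataset-456",
        "parts": [],
    },
}
upload = parse_response(sample)
```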

Exceptions:

TYPE	WHEN
InvalidEnumParameterError	An invalid value is passed for `fileFormat`.
