Method
createDatasetFromUpload POST

Creates a dataset and returns an upload ID that can be used to upload a file.

Arguments:

REQUIRED KEY TYPE DESCRIPTION
Yes tableName str Organization-unique table name for this dataset.
No fileFormat str The file format of the dataset.
No csvDelimiter str If the file format is CSV, use a specific CSV delimiter.
No isDocumentset bool Signifies whether the dataset is a docstore dataset. A docstore dataset contains documents such as images, PDFs, and audio files, or tabular data with links to such files.
No extractBoundingBoxes bool Signifies whether to extract bounding boxes from the documents. Only valid if is_documentset is True.
No parsingConfig ParsingConfig Custom config for dataset parsing.
KEY TYPE Description
csvDelimiter str Delimiter for CSV files. Defaults to None.
escape str Escape character for CSV files. Defaults to '"'.
filePathWithSchema str Path to the file with schema. Defaults to None.
No mergeFileSchemas bool Signifies whether to merge the schemas of all files in the dataset. If is_documentset is True, this is also set to True by default.
No documentProcessingConfig DatasetDocumentProcessingConfig The document processing configuration. Only valid if is_documentset is True.
KEY TYPE Description
pageTextColumn str Name of the output column which contains the extracted text for each page. If not provided, no column will be created.
No versionLimit int The number of recent versions to preserve for the dataset (minimum 30).
Note: API method arguments use camelCase, while the Python SDK uses underscore_case (snake_case) for the same arguments.
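The naming rule and the documentset-only constraints above can be sketched as a small request-body builder. This is a minimal illustration, not part of the SDK: `snake_to_camel`, `to_camel_keys`, and `build_request_body` are hypothetical helper names, the table name and column name are made-up values, and the sketch only assembles a JSON body without sending any request.

```python
import json


def snake_to_camel(name: str) -> str:
    """Convert an underscore_case name to the camelCase used by the API."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_camel_keys(value):
    """Recursively convert dict keys to camelCase; other values pass through."""
    if isinstance(value, dict):
        return {snake_to_camel(k): to_camel_keys(v) for k, v in value.items()}
    return value


def build_request_body(table_name: str, **options) -> dict:
    """Assemble a camelCase body from underscore_case keyword arguments."""
    if not table_name:
        raise ValueError("table_name is required")

    # These two arguments are documented as valid only for documentsets.
    is_documentset = bool(options.get("is_documentset", False))
    for key in ("extract_bounding_boxes", "document_processing_config"):
        if options.get(key) and not is_documentset:
            raise ValueError(f"{key} is only valid when is_documentset is True")

    body = {"tableName": table_name}
    for key, value in options.items():
        if value is not None:  # omit optional arguments that were not set
            body[snake_to_camel(key)] = to_camel_keys(value)
    return body


if __name__ == "__main__":
    body = build_request_body(
        "example_documents",  # hypothetical table name
        is_documentset=True,
        extract_bounding_boxes=True,
        document_processing_config={"page_text_column": "page_text"},
    )
    print(json.dumps(body, indent=2))
```

Optional arguments left as None are dropped from the body, so the server-side defaults described in the table still apply.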

Response:

KEY TYPE DESCRIPTION
success Boolean true if the call succeeded, false if there was an error
result Upload
KEY TYPE Description
uploadId str The unique ID generated when the upload of a large file in smaller parts is initiated.
datasetUploadId str Same as upload_id. It is kept for backwards compatibility purposes.
status str The current status of the upload.
datasetId str A reference to the dataset this upload is adding data to.
datasetVersion str A reference to the dataset version the upload is adding data to.
modelId str A reference to the model the upload is creating a version for.
modelVersion str A reference to the model version the upload is creating.
batchPredictionId str A reference to the batch prediction the upload is creating.
parts List[dict] A list containing the order of the file parts that have been uploaded.
createdAt str The timestamp at which the upload was created.
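A response with the fields listed above could be unpacked as in the following sketch. The `Upload` dataclass and `parse_upload_response` function here are hypothetical names for illustration (not the SDK's own classes), the sample values are placeholders, and the shape of a failed response beyond `success` being false is an assumption, so the error path simply reports the whole payload.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Upload:
    """Subset of the documented Upload fields, in underscore_case."""
    upload_id: str
    status: Optional[str] = None
    dataset_id: Optional[str] = None
    dataset_version: Optional[str] = None
    parts: list = field(default_factory=list)
    created_at: Optional[str] = None


def parse_upload_response(payload: dict) -> Upload:
    """Check the success flag and extract the upload details."""
    if not payload.get("success"):
        # Assumption: the error details live somewhere in the payload.
        raise RuntimeError(f"createDatasetFromUpload failed: {payload}")

    result = payload["result"]
    # datasetUploadId mirrors uploadId for backwards compatibility.
    upload_id = result.get("uploadId") or result.get("datasetUploadId")
    if upload_id is None:
        raise KeyError("response contains neither uploadId nor datasetUploadId")

    return Upload(
        upload_id=upload_id,
        status=result.get("status"),
        dataset_id=result.get("datasetId"),
        dataset_version=result.get("datasetVersion"),
        parts=list(result.get("parts") or []),
        created_at=result.get("createdAt"),
    )


if __name__ == "__main__":
    # Placeholder values only; real IDs and statuses come from the API.
    sample = {
        "success": True,
        "result": {
            "uploadId": "example-upload-id",
            "datasetId": "example-dataset-id",
            "parts": [],
        },
    }
    print(parse_upload_response(sample))
```

The returned upload ID is the value to carry forward when uploading the file itself.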

Exceptions:

TYPE WHEN
InvalidEnumParameterError An invalid value is passed for fileFormat.
