Method
createDocumentRetriever POST

Returns a document retriever that stores embeddings for document chunks in a feature group.

Arguments:

REQUIRED KEY TYPE DESCRIPTION
Yes projectId str The ID of the project in which the Document Retriever is created.
Yes name str The name of the Document Retriever. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.
Yes featureGroupId str The ID of the feature group that the Document Retriever is associated with.
No documentRetrieverConfig DocumentRetrieverConfig The configuration, including chunk_size and chunk_overlap_fraction, for document retrieval.
KEY TYPE Description
chunkSize int The chunk size for the vector store, i.e., the maximum number of words per chunk.
chunkOverlapFraction float The fraction of overlap between two consecutive chunks.
textEncoder str The text encoder used to encode texts in the vector store.
scoreMultiplierColumn str The values in this metadata column are used to modify the relevance scores of returned chunks.
pruneVectors bool Corpus-specific transformation of vectors that applies dimensional reduction techniques to strip common components from the vectors.
indexMetadataColumns bool If True, metadata columns of the feature group will also be used for indexing and querying.
useDocumentSummary bool If True, uses the summary of the document in addition to chunks of the document for indexing and querying.
summaryInstructions str Instructions for the LLM to generate the document summary.
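To make chunkSize and chunkOverlapFraction concrete, here is a minimal sketch of word-based chunking with overlap. It is only an illustration of what the two settings mean; the function name `chunk_words` is hypothetical, and the service's actual chunking logic is not described here and may differ.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, chunk_overlap_fraction: float) -> List[str]:
    """Illustrative only: split text into chunks of at most `chunk_size` words,
    where consecutive chunks share roughly `chunk_overlap_fraction` of their words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap_fraction < 1:
        raise ValueError("chunk_overlap_fraction must be in [0, 1)")

    words = text.split()
    overlap = int(chunk_size * chunk_overlap_fraction)  # words shared with the previous chunk
    step = chunk_size - overlap  # at least 1, because overlap < chunk_size

    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With a chunk size of 4 and an overlap fraction of 0.5, each chunk in this sketch repeats the last two words of the previous one.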
Note: Arguments for the API methods use camelCase; the Python SDK uses underscore_case (snake_case) for the same arguments.
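The argument table and the naming note above can be sketched as a small request-body builder. This is a hedged illustration, not the SDK: `build_request` and `to_camel_case` are hypothetical helpers, the ID values in the usage example are placeholders, and the name check assumes ASCII characters and a minimum length of one, neither of which is stated above.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits; length 1 to 120.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,120}")


def to_camel_case(key: str) -> str:
    """Convert an underscore_case key such as 'chunk_size' to camelCase ('chunkSize')."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def build_request(project_id: str, name: str, feature_group_id: str, **config) -> dict:
    """Build a createDocumentRetriever request body from underscore_case arguments."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(
            "name must be 1-120 characters: letters, digits, and underscores only"
        )
    payload = {
        "projectId": project_id,
        "name": name,
        "featureGroupId": feature_group_id,
    }
    if config:  # documentRetrieverConfig is optional
        payload["documentRetrieverConfig"] = {
            to_camel_case(key): value for key, value in config.items()
        }
    return payload


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    print(build_request("proj_123", "my_retriever", "fg_456",
                        chunk_size=512, chunk_overlap_fraction=0.25))
```

Validating the name locally avoids a round trip for an obviously malformed request; the server remains the authority on what it accepts.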

Response:

KEY TYPE DESCRIPTION
success Boolean true if the call succeeded, false if there was an error
result DocumentRetriever
KEY TYPE Description
name str The name of the document retriever.
documentRetrieverId str The unique identifier of the vector store.
createdAt str When the vector store was created.
featureGroupId str The feature group id associated with the document retriever.
featureGroupName str The feature group name associated with the document retriever.
indexingRequired bool Whether the document retriever needs to be re-indexed because the underlying data has changed.
latestDocumentRetrieverVersion DocumentRetrieverVersion The latest version of the vector store.
KEY TYPE Description
documentRetrieverId str The unique identifier of the Document Retriever.
documentRetrieverVersion str The unique identifier of the Document Retriever version.
createdAt str When the Document Retriever was created.
status str The status of the Document Retriever version. It represents the indexing status until indexing is complete, and the deployment status after indexing is complete.
deploymentStatus str The status of deploying the Document Retriever version.
featureGroupId str The feature group id associated with the document retriever.
featureGroupVersion str The unique identifier of the feature group version at which the Document Retriever version is created.
error str The error message, if creation of the document retriever version failed.
numberOfChunks int The number of chunks for the document retriever.
embeddingFileSize int The size of the embedding file for the document retriever.
warnings list The warning messages when creating the document retriever.
resolvedConfig DocumentRetrieverConfig The resolved configurations, such as default settings, for indexing documents.
KEY TYPE Description
chunkSize int The chunk size for the vector store, i.e., the maximum number of words per chunk.
chunkOverlapFraction float The fraction of overlap between two consecutive chunks.
textEncoder str The text encoder used to encode texts in the vector store.
scoreMultiplierColumn str The values in this metadata column are used to modify the relevance scores of returned chunks.
pruneVectors bool Corpus-specific transformation of vectors that applies dimensional reduction techniques to strip common components from the vectors.
indexMetadataColumns bool If True, metadata columns of the feature group will also be used for indexing and querying.
useDocumentSummary bool If True, uses the summary of the document in addition to chunks of the document for indexing and querying.
summaryInstructions str Instructions for the LLM to generate the document summary.
documentRetrieverConfig DocumentRetrieverConfig The config for vector store creation.
KEY TYPE Description
chunkSize int The chunk size for the vector store, i.e., the maximum number of words per chunk.
chunkOverlapFraction float The fraction of overlap between two consecutive chunks.
textEncoder str The text encoder used to encode texts in the vector store.
scoreMultiplierColumn str The values in this metadata column are used to modify the relevance scores of returned chunks.
pruneVectors bool Corpus-specific transformation of vectors that applies dimensional reduction techniques to strip common components from the vectors.
indexMetadataColumns bool If True, metadata columns of the feature group will also be used for indexing and querying.
useDocumentSummary bool If True, uses the summary of the document in addition to chunks of the document for indexing and querying.
summaryInstructions str Instructions for the LLM to generate the document summary.
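A caller will usually want only a few of the response fields listed above. The sketch below pulls them out of an already-decoded JSON response; `summarize_response` is a hypothetical helper, and the sample values (IDs, timestamps, and the "PENDING" status string) are placeholders rather than values confirmed by this reference.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the fields a caller typically checks after createDocumentRetriever."""
    if not response.get("success"):
        raise RuntimeError("createDocumentRetriever call did not succeed")

    result = response.get("result") or {}
    latest = result.get("latestDocumentRetrieverVersion") or {}
    return {
        "documentRetrieverId": result.get("documentRetrieverId"),
        "indexingRequired": result.get("indexingRequired"),
        "latestVersion": latest.get("documentRetrieverVersion"),
        "status": latest.get("status"),
        "error": latest.get("error"),
        "warnings": latest.get("warnings") or [],
    }


if __name__ == "__main__":
    # Hand-written sample response; every value here is a placeholder.
    sample = {
        "success": True,
        "result": {
            "name": "my_retriever",
            "documentRetrieverId": "dr_001",
            "indexingRequired": False,
            "latestDocumentRetrieverVersion": {
                "documentRetrieverVersion": "drv_001",
                "status": "PENDING",
                "warnings": [],
            },
        },
    }
    print(summarize_response(sample))
```

Because status reports indexing progress first and deployment progress afterwards, a caller would typically re-read it until the version is ready to query.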

Exceptions:

TYPE WHEN
DataNotFoundError projectId is not found.
DataNotFoundError featureGroupId is not found.
