ChatLLMTrainingConfig

Training config for the CHAT_LLM problem type

KEY TYPE Description
DISABLE_DATA_SUMMARIZATION bool After executing a query, summarize the response and reply back with only the table and query run.
RESPONSE_FORMAT str When set to 'JSON', the LLM will generate a JSON formatted string.
FILTER_COLUMNS list Allow users to filter the document retrievers on these metadata columns.
HIDE_SQL_AND_CODE bool When running data queries, this will hide the generated SQL and Code in the response.
LOOKUP_REWRITE_INSTRUCTIONS None None
DATA_PROMPT_CONTEXT str Prompt context for the data feature group IDs.
ENABLE_CODE_EXECUTION bool Enable Python code execution in the ChatLLM. This equips the LLM with a Python kernel in which all its code is executed.
NUM_COMPLETION_TOKENS int Default maximum number of tokens for chat answers. Reducing this yields faster, more succinct responses.
DATA_PROMPT_COLUMN_CONTEXT Dict[str, str] Dict of 'table_name.column_name' and 'column_context' pairs to provide column context for some selected columns in the selected structured data table. This replaces the default auto-generated information about the column data.
RESPONSE_INSTRUCTIONS str Customized instructions for how the model should respond, including the format, persona and tone of the answers.
BEHAVIOR_INSTRUCTIONS str Customize the overall behaviour of the model. This controls things such as when to execute code (if enabled), write SQL queries, or search the web (if enabled).
ENABLE_INLINE_SOURCE_CITATIONS bool Enable inline citations of the sources in the response.
DATABASE_CONNECTOR_TABLES List[str] List of tables to use from the database connector for the ChatLLM.
KEYWORD_REQUIREMENT_INSTRUCTIONS str Instructions for an LLM call to automatically generate keyword requirements to retrieve relevant documents for the conversation.
DATABASE_CONNECTOR_ID str Database connector ID to use for connecting to an external database that gives the LLM access to structured data.
JSON_RESPONSE_INSTRUCTIONS str Instructions to be followed while generating the json_response if `response_format` is set to "JSON". This can include the schema information if the schema is dynamic and its keys cannot be pre-determined.
COLUMN_FILTERING_INSTRUCTIONS str Instructions for an LLM call to automatically generate filter expressions on document metadata to retrieve relevant documents for the conversation.
DATA_COLUMNS_TO_IGNORE List[str] Columns to ignore while encoding information about structured data tables in context for the LLM. A list of strings of the format "<feature_group_name>.<column_name>".
INCLUDE_BM25_RETRIEVAL bool Combine BM25 search score with vector search using reciprocal rank fusion.
DATA_FEATURE_GROUP_IDS List[str] List of feature group IDs that the ChatLLM may query. The created ChatLLM is commonly referred to as DataLLM.
ENABLE_TOOL_BAR bool Enable the tool bar in Enterprise ChatLLM to provide additional functionalities like tool_use, web_search, image_gen, etc.
ENABLE_RESPONSE_CACHING bool Enable caching of LLM responses to speed up response times and improve reproducibility.
QUERY_REWRITE_INSTRUCTIONS str Special instructions for the LLM which rewrites the RAG query.
RETRIEVAL_COLUMNS list Include the metadata column values in the retrieved search results.
TEMPERATURE float The generative LLM temperature.
SEARCH_SCORE_CUTOFF float Minimum search score to consider a document as a valid search result.
ENABLE_LLM_REWRITE bool If enabled, an LLM will rewrite the RAG queries sent to the document retriever. Disabled by default.
MAX_SEARCH_RESULTS int Maximum number of search results in the retrieval augmentation step. If we know that the questions are likely to have snippets which are easily matched in the documents, then a lower number will help with accuracy.
DOCUMENT_RETRIEVERS List[str] List of names or IDs of document retrievers to use as vector stores of information for RAG responses.
METADATA_COLUMNS None None
INCLUDE_GENERAL_KNOWLEDGE bool Allow the LLM to rely not just on RAG search results, but to fall back on general knowledge. Disabled by default.
JSON_RESPONSE_SCHEMA str Specifies the JSON schema that the model should adhere to if `response_format` is set to "JSON". This should be a JSON-formatted string where each field of the expected schema is mapped to a dictionary containing the fields 'type', 'required' and 'description'. For example: '{"sample_field": {"type": "integer", "required": true, "description": "Sample Field"}}'
ENABLE_WEB_SEARCH bool Allow the LLM to use Web Search Engines to retrieve information for better results.
UNKNOWN_ANSWER_PHRASE str Fallback response when the LLM can't find an answer.
DATA_PROMPT_TABLE_CONTEXT Dict[str, str] Dict of table name and table context pairs to provide table wise context for each structured data table.
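
Example (illustrative sketch)

Several keys above expect structured values: JSON_RESPONSE_SCHEMA is a JSON-formatted string, while DATA_PROMPT_TABLE_CONTEXT and DATA_PROMPT_COLUMN_CONTEXT are dictionaries. The sketch below shows one way such values might be assembled in plain Python. The helper build_json_response_schema, the table name "orders", the column "orders.status", and the use of a plain dict keyed by the upper-case names from the table are all assumptions made for illustration; how the values are actually passed to the platform is not covered here.

```python
import json

# Hypothetical helper (not part of any library): serializes a field mapping
# into the string form described for JSON_RESPONSE_SCHEMA.
def build_json_response_schema(fields):
    """Return a JSON string mapping each field to 'type', 'required', 'description'."""
    required_keys = {"type", "required", "description"}
    for name, spec in fields.items():
        missing = required_keys - spec.keys()
        if missing:
            raise ValueError(f"field {name!r} is missing {sorted(missing)}")
    return json.dumps(fields)

# Illustrative values only; table and column names are made up.
training_config = {
    "RESPONSE_FORMAT": "JSON",
    "JSON_RESPONSE_SCHEMA": build_json_response_schema({
        "sample_field": {
            "type": "integer",
            "required": True,
            "description": "Sample Field",
        },
    }),
    # Table name -> table context.
    "DATA_PROMPT_TABLE_CONTEXT": {"orders": "One row per customer order."},
    # 'table_name.column_name' -> column context.
    "DATA_PROMPT_COLUMN_CONTEXT": {
        "orders.status": "Order state: open, shipped, or cancelled.",
    },
    "TEMPERATURE": 0.0,
    "INCLUDE_GENERAL_KNOWLEDGE": False,
}

print(training_config["JSON_RESPONSE_SCHEMA"])
```

Building the schema from a dict and serializing it with json.dumps avoids hand-written quoting mistakes, and the small check catches a field that omits 'type', 'required' or 'description' before the string is used.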