Training config for the CHAT_LLM problem type
KEY | TYPE | Description |
---|---|---|
RETRIEVAL_COLUMNS | list | Include the metadata column values in the retrieved search results. |
NUM_COMPLETION_TOKENS | int | Default maximum number of tokens for chat answers. Reducing this yields faster, more succinct responses. |
INCLUDE_GENERAL_KNOWLEDGE | bool | Allow the LLM to rely not just on RAG search results, but to fall back on general knowledge. Disabled by default. |
UNKNOWN_ANSWER_PHRASE | str | Fallback response when the LLM can't find an answer. |
FILTER_COLUMNS | list | Allow users to filter the document retrievers on these metadata columns. |
TEMPERATURE | float | The generative LLM temperature. |
METADATA_COLUMNS | None | None |
MAX_SEARCH_RESULTS | int | Maximum number of search results in the retrieval augmentation step. If the questions are likely to contain snippets that are easily matched in the documents, a lower number will help with accuracy. |
ENABLE_TOOL_BAR | bool | Enable the tool bar in Enterprise ChatLLM to provide additional functionalities like tool_use, web_search, image_gen, etc. |
DATA_PROMPT_CONTEXT | str | Prompt context for the data feature group IDs. |
SEARCH_SCORE_CUTOFF | float | Minimum search score to consider a document as a valid search result. |
ENABLE_RESPONSE_CACHING | bool | Enable caching of LLM responses to speed up response times and improve reproducibility. |
JSON_RESPONSE_SCHEMA | str | Specifies the JSON schema that the model should adhere to if `response_format` is set to "JSON". This should be a JSON-formatted string where each field of the expected schema is mapped to a dictionary containing the fields 'type', 'required' and 'description'. For example: '{"sample_field": {"type": "integer", "required": true, "description": "Sample Field"}}' |
INCLUDE_BM25_RETRIEVAL | bool | Combine BM25 search score with vector search using reciprocal rank fusion. |
ENABLE_CODE_EXECUTION | bool | Enable python code execution in the ChatLLM. This equips the LLM with a python kernel in which all its code is executed. |
JSON_RESPONSE_INSTRUCTIONS | str | Instructions to be followed while generating the json_response if `response_format` is set to "JSON". This can include the schema information if the schema is dynamic and its keys cannot be pre-determined. |
ENABLE_INLINE_SOURCE_CITATIONS | bool | Enable inline citations of the sources in the response. |
HIDE_SQL_AND_CODE | bool | When running data queries, this will hide the generated SQL and Code in the response. |
DOCUMENT_RETRIEVERS | List[str] | List of names or IDs of document retrievers to use as vector stores of information for RAG responses. |
RESPONSE_INSTRUCTIONS | str | Customized instructions for how the model should respond, including the format, persona and tone of the answers. |
DATA_FEATURE_GROUP_IDS | List[str] | List of feature group IDs that the ChatLLM can query. The created ChatLLM is commonly referred to as DataLLM. |
DATABASE_CONNECTOR_TABLES | List[str] | List of tables to use from the database connector for the ChatLLM. |
DATA_COLUMNS_TO_IGNORE | List[str] | Columns to ignore while encoding information about structured data tables in context for the LLM. A list of strings of format " |
QUERY_REWRITE_INSTRUCTIONS | str | Special instructions for the LLM which rewrites the RAG query. |
DATA_PROMPT_COLUMN_CONTEXT | Dict[str, str] | Dict of 'table_name.column_name' and 'column_context' pairs to provide column context for some selected columns in the selected structured data table. This replaces the default auto-generated information about the column data. |
RESPONSE_FORMAT | str | When set to 'JSON', the LLM will generate a JSON formatted string. |
BEHAVIOR_INSTRUCTIONS | str | Customize the overall behaviour of the model. This controls, for example, when to execute code (if enabled), write a SQL query, or search the web (if enabled). |
DATA_PROMPT_TABLE_CONTEXT | Dict[str, str] | Dict of table name and table context pairs to provide per-table context for each structured data table. |
ENABLE_WEB_SEARCH | bool | Allow the LLM to use Web Search Engines to retrieve information for better results. |
ENABLE_LLM_REWRITE | bool | If enabled, an LLM will rewrite the RAG queries sent to the document retriever. Disabled by default. |
DISABLE_DATA_SUMMARIZATION | bool | After executing a query, summarize the response and reply back with only the table and query run. |
DATABASE_CONNECTOR_ID | str | ID of the database connector used to connect to an external database, which gives the LLM access to structured data. |
LOOKUP_REWRITE_INSTRUCTIONS | None | None |
KEYWORD_REQUIREMENT_INSTRUCTIONS | str | Instructions for an LLM call to automatically generate keyword requirements to retrieve relevant documents for the conversation. |
COLUMN_FILTERING_INSTRUCTIONS | str | Instructions for an LLM call to automatically generate filter expressions on document metadata to retrieve relevant documents for the conversation. |
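
The JSON_RESPONSE_SCHEMA row expects a JSON-formatted string rather than a dictionary, which is easy to get wrong when written by hand. The following is a minimal sketch of building that string in Python. The helper name `build_json_response_schema` and the plain `training_config` dictionary are illustrative assumptions, not part of any SDK; only the keys and the 'type', 'required' and 'description' fields come from the table above.

```python
import json

# Keys that the table says each schema field must carry.
REQUIRED_SPEC_KEYS = {"type", "required", "description"}


def build_json_response_schema(fields):
    """Validate a field spec and serialize it to a JSON string.

    Hypothetical helper: checks only that each field has the three keys
    named in the JSON_RESPONSE_SCHEMA description.
    """
    for name, spec in fields.items():
        missing = REQUIRED_SPEC_KEYS - spec.keys()
        if missing:
            raise ValueError(f"field {name!r} is missing keys: {sorted(missing)}")
    return json.dumps(fields)


# Illustrative config as a plain dict; how it is passed to the platform
# is outside the scope of this sketch.
training_config = {
    "RESPONSE_FORMAT": "JSON",
    "JSON_RESPONSE_SCHEMA": build_json_response_schema(
        {
            "sample_field": {
                "type": "integer",
                "required": True,
                "description": "Sample Field",
            }
        }
    ),
}
```

Serializing with `json.dumps` turns Python's `True` into the JSON literal `true`, matching the example string in the table.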
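
INCLUDE_BM25_RETRIEVAL mentions reciprocal rank fusion. As a rough illustration of the general technique, the sketch below merges two ranked lists of document IDs by summing `1 / (k + rank)` per list. The function name, the document IDs, and the constant `k = 60` (a value commonly used for this method) are assumptions for illustration; the table does not state how the platform implements or parameterizes the fusion.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. k is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a BM25 search and a vector search.
bm25_results = ["a", "b", "c"]
vector_results = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_results, vector_results])
print(fused)  # ['b', 'c', 'a', 'd']
```

Documents that appear in both lists ("b" and "c") rise above those found by only one retriever, which is the intended effect of combining keyword and vector search.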