Override AI model configuration
The ai_agent configuration allows you to override the default AI model configuration. It is available for the following endpoints: ask, text_gen, extract, and extract_structured.
Example overrides include:
- Replacing the default AI model with a custom one based on your organization's needs.
- Tweaking the base prompt to allow a more customized user experience.
- Changing a parameter, such as temperature, to make the results more or less creative.
Sample configuration
A complete configuration for ai/ask is as follows:
{
  "type": "ai_agent_ask",
  "basic_text": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_4o_mini",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "basic_text_multi": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_4o_mini",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_4o_mini",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text_multi": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_4o_mini",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  }
}
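An override does not have to be as long as the sample above. The Python sketch below builds a request body for ai/ask that overrides only the model and temperature for basic_text. It assumes the API accepts a partial ai_agent object and that the request body uses mode, prompt, and items fields; neither is confirmed on this page. The helper name build_ask_body, the single_item_qa mode value, and the file ID are hypothetical.

```python
import json


def build_ask_body(prompt: str, file_id: str, agent_override: dict) -> dict:
    """Assemble a request body for ai/ask with an ai_agent override.

    The mode, prompt, and items field names are assumptions about the
    request shape; only the ai_agent content follows the sample above.
    """
    return {
        "mode": "single_item_qa",  # assumed value for single-item requests
        "prompt": prompt,
        "items": [{"id": file_id, "type": "file"}],  # placeholder file ID
        "ai_agent": agent_override,
    }


# Override only the model and temperature for short, single-item content.
override = {
    "type": "ai_agent_ask",
    "basic_text": {
        "model": "azure__openai__gpt_4o_mini",
        "llm_endpoint_params": {"type": "openai_params", "temperature": 0},
    },
}

body = build_ask_body("Which option is cheapest?", "1234567890", override)
print(json.dumps(body, indent=2))
```

The snippet only prints the body; sending it requires an authenticated HTTP client, which is outside the scope of this sketch.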
Differences in parameter sets
The set of parameters available for ask, text_gen, extract, and extract_structured differs slightly, depending on the API call.
- The agent configuration for the ask endpoint includes basic_text, basic_text_multi, long_text, and long_text_multi parameters. This is because of the mode parameter you use to specify if the request is for a single item or multiple items. If you selected multiple_item_qa as the mode, you can also use multi parameters for overrides.
- The agent configuration for text_gen includes the basic_gen parameter that is used to generate text.
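The two rules above can be sketched as a small Python lookup. The function name override_sections is hypothetical. The extract and extract_structured endpoints are left out because this page does not list their sections.

```python
from typing import Optional


def override_sections(endpoint: str, mode: Optional[str] = None) -> list:
    """Return the ai_agent sections that can be overridden for an endpoint."""
    if endpoint == "text_gen":
        return ["basic_gen"]
    if endpoint == "ask":
        sections = ["basic_text", "long_text"]
        if mode == "multiple_item_qa":
            # Multi-item requests can also override the *_multi sections.
            sections += ["basic_text_multi", "long_text_multi"]
        return sections
    # Sections for other endpoints are not described in this guide.
    raise ValueError(f"no known sections for endpoint: {endpoint}")


print(override_sections("ask", "multiple_item_qa"))
print(override_sections("text_gen"))
```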
LLM endpoint params
The llm_endpoint_params configuration options differ depending on whether the AI model is Google, OpenAI, or AWS based.
For example, each llm_endpoint_params object accepts a temperature parameter, but the outcome differs depending on the model.
For Google and AWS models, the temperature is used for sampling during response generation, which occurs when top-P and top-K are applied. Temperature controls the degree of randomness in the token selection.
For OpenAI models, temperature is the sampling temperature with values between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. When introducing your own configuration, use temperature or top_p but not both.
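The OpenAI guidance above can be checked before a request is sent. The following Python sketch is a client-side check only: the function name check_openai_params is hypothetical, and it is not known whether the API enforces the same rules.

```python
def check_openai_params(params: dict) -> list:
    """Return a list of problems found in an openai_params object."""
    problems = []
    temperature = params.get("temperature")
    # OpenAI sampling temperature is expected to lie between 0 and 2.
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    # The guidance is to tune temperature or top_p, not both at once.
    if "temperature" in params and "top_p" in params:
        problems.append("set temperature or top_p, not both")
    return problems


print(check_openai_params({"type": "openai_params", "temperature": 0.2}))
print(check_openai_params({"type": "openai_params", "temperature": 3, "top_p": 1}))
```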
System message
The system_message parameter helps the LLM understand its role and what it is supposed to do.
For example, if your solution is processing travel itineraries, you can add a system message saying:
You are a travel agent aid. You are going to help support staff process large amounts of schedules, tickets, etc.
This message is separate from the content you send in, but it can improve the results.
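As a minimal sketch of applying such a message, the Python helper below copies an ai_agent_ask configuration and sets system_message in every section that is present. The helper name with_system_message is hypothetical. Setting the same message in every section is a design choice made for this example, not a requirement stated here.

```python
import copy

# Section names used by the ai/ask agent configuration.
ASK_SECTIONS = ("basic_text", "basic_text_multi", "long_text", "long_text_multi")


def with_system_message(agent: dict, message: str) -> dict:
    """Return a copy of the config with system_message set in each section present."""
    updated = copy.deepcopy(agent)  # leave the caller's config untouched
    for name in ASK_SECTIONS:
        if name in updated:
            updated[name]["system_message"] = message
    return updated


agent = {"type": "ai_agent_ask", "basic_text": {}, "long_text": {}}
message = (
    "You are a travel agent aid. You are going to help support staff "
    "process large amounts of schedules, tickets, etc."
)
print(with_system_message(agent, message))
```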
Number of tokens for completion
The num_tokens_for_completion parameter represents the number of tokens Box AI can return. This number can vary based on the model used.