Override AI model configuration
The `ai_agent` configuration allows you to override the default AI model configuration. It is available for the `ask`, `text_gen`, `extract`, and `extract_structured` endpoints.
Example overrides include:
- Replacing the default AI model with a custom one based on your organization's needs.
- Tweaking the base `prompt` to allow a more customized user experience.
- Changing a parameter, such as `temperature`, to make the results more or less creative.
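As an illustration, an override travels in the `ai_agent` field of the request body alongside the prompt and items. The sketch below builds such a body in Python; the surrounding field names (`mode`, `prompt`, `items`) follow the Box AI ask request shape, but verify them against the current API reference before relying on them.

```python
import json

# Sketch of an ask request body carrying an ai_agent override.
# Only the fields being overridden need to appear under ai_agent.
payload = {
    "mode": "single_item_qa",
    "prompt": "What is the total cost of this itinerary?",
    "items": [{"id": "1234567890", "type": "file"}],
    "ai_agent": {
        "type": "ai_agent_ask",
        "basic_text": {
            "llm_endpoint_params": {
                "type": "openai_params",
                "temperature": 0.2,  # lower temperature -> more deterministic answers
            }
        },
    },
}

# Serialize for the HTTP request body.
body = json.dumps(payload)
```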
Sample configuration
A complete configuration for `ai/ask` is as follows:
{
  "type": "ai_agent_ask",
  "basic_text": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "basic_text_multi": {
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  },
  "long_text_multi": {
    "embeddings": {
      "model": "openai__text_embedding_ada_002",
      "strategy": {
        "id": "basic",
        "num_tokens_per_chunk": 64
      }
    },
    "llm_endpoint_params": {
      "type": "openai_params",
      "frequency_penalty": 1.5,
      "presence_penalty": 1.5,
      "stop": "<|im_end|>",
      "temperature": 0,
      "top_p": 1
    },
    "model": "azure__openai__gpt_3_5_turbo_16k",
    "num_tokens_for_completion": 8400,
    "prompt_template": "It is `{current_date}`, consider these travel options `{content}` and answer the `{user_question}`.",
    "system_message": "You are a helpful travel assistant specialized in budget travel"
  }
}
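You rarely need to supply every field shown above. The sketch below narrows the override to a single sub-configuration; whether a partial override merges with the remaining defaults is an assumption about the API's behavior that you should verify.

```python
# Sketch: override only basic_text, reusing the model name from the sample
# configuration and raising the temperature; all other defaults are assumed
# to remain in effect.
ai_agent_override = {
    "type": "ai_agent_ask",
    "basic_text": {
        "model": "azure__openai__gpt_3_5_turbo_16k",
        "llm_endpoint_params": {
            "type": "openai_params",
            "temperature": 0.7,  # more creative than the sample's 0
        },
    },
}
```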
Differences in parameter sets
The set of parameters available for `ask`, `text_gen`, `extract`, and `extract_structured` differs slightly, depending on the API call.
- The agent configuration for the `ask` endpoint includes the `basic_text`, `basic_text_multi`, `long_text`, and `long_text_multi` parameters. This is because of the `mode` parameter you use to specify whether the request is for a single item or multiple items. If you selected `multiple_item_qa` as the `mode`, you can also use the `multi` parameters for overrides.
- The agent configuration for `text_gen` includes the `basic_gen` parameter that is used to generate text.
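Following the shape of the ask sample, a `text_gen` override would place its settings under `basic_gen`. The agent `type` string below mirrors the `ai_agent_ask` naming convention and is an assumption to check against the API reference.

```python
# Sketch of a text_gen agent override using basic_gen; field names mirror
# the ask sample above, and "ai_agent_text_gen" is an assumed type string.
text_gen_agent = {
    "type": "ai_agent_text_gen",
    "basic_gen": {
        "model": "azure__openai__gpt_3_5_turbo_16k",
        "llm_endpoint_params": {
            "type": "openai_params",
            "temperature": 0.9,  # a higher temperature for more varied drafts
        },
        "system_message": "You are a helpful travel assistant specialized in budget travel",
    },
}
```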
LLM endpoint params
The `llm_endpoint_params` configuration options differ depending on whether the overall AI model is Google-, OpenAI-, or AWS-based.
For example, both `llm_endpoint_params` objects accept a `temperature` parameter, but the outcome differs depending on the model.
For Google and AWS models, the `temperature` is used for sampling during response generation, which occurs when `top-P` and `top-K` are applied. Temperature controls the degree of randomness in the token selection.
For OpenAI models, `temperature` is the sampling temperature, with values between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. When introducing your own configuration, use `temperature` or `top_p`, but not both.
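The "one or the other, not both" rule can be enforced when building the parameters object. The helper below is hypothetical, not part of any SDK; it simply refuses to emit both knobs at once.

```python
def openai_params(temperature=None, top_p=None, **extra):
    """Build an openai_params object, enforcing the guidance that
    temperature and top_p should not both be set (hypothetical helper)."""
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    params = {"type": "openai_params", **extra}
    if temperature is not None:
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    return params

# Usage: pass exactly one of the two sampling knobs.
params = openai_params(temperature=0.2, stop="<|im_end|>")
```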
System message
The `system_message` parameter's aim is to help the LLM understand its role and what it's supposed to do.
For example, if your solution is processing travel itineraries, you can add a system message saying:
You are a travel agent aid. You are going to help support staff process large amounts of schedules, tickets, etc.
This message is separate from the content you send in, but it can improve the results.
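For instance, that message would sit in the `system_message` field of a sub-configuration such as `basic_text`; the sketch below follows the sample configuration's layout.

```python
# Sketch: the travel-itinerary system message placed in the same slot it
# occupies in the sample configuration above.
basic_text = {
    "system_message": (
        "You are a travel agent aid. You are going to help support staff "
        "process large amounts of schedules, tickets, etc."
    ),
}
```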
Number of tokens for completion
The `num_tokens_for_completion` parameter represents the number of tokens Box AI can return. This number can vary based on the model used.