Chapter 5. Enabling tool calling for Mistral 3 models


Configure a Mistral 3 model deployment to use tool calling with the vLLM OpenAI-compatible API.

Tool calling enables the model to request that your application execute an external function by returning a structured tool_calls object. Your application runs the tool and sends the result back to the model to continue the conversation.
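The execute-and-respond half of this loop can be sketched in Python. This is an illustrative sketch only: the `get_weather` function and the `TOOLS` dispatch table are application-side assumptions, not part of the server API; the `tool_calls` entry shape matches the OpenAI-compatible format that vLLM returns.

```python
import json

# Hypothetical local tool that the application exposes to the model.
def get_weather(location: str) -> str:
    return f"The weather in {location} is 18°C and sunny."

# Maps tool names (as declared in the request's "tools" array) to callables.
TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool_calls entry and build the 'tool' role message
    that is sent back to the model in the follow-up request."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = TOOLS[fn["name"]](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": result,
    }

# Example tool_calls entry, in the shape the server returns:
call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
}
print(run_tool_call(call)["content"])
```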

Prerequisites

  • You have deployed a Mistral 3 Instruct model with Red Hat AI Inference Server.
  • You have defined one or more tools that the model is allowed to call.
  • You are running the vLLM serving container included with AI Inference Server.

Procedure

  1. Deploy the RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4 model with AI Inference Server and enable tool calling:

    podman run --rm -it \
      --device nvidia.com/gpu=all \
      --shm-size=4g \
      -p 8000:8000 \
      --env "HUGGING_FACE_HUB_TOKEN=$HF_TOKEN" \
      registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.3.0 \
        --model RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4 \
        --tokenizer-mode mistral \
        --config-format mistral \
        --load-format mistral \
        --enable-auto-tool-choice \
        --tool-call-parser mistral \
        --host 0.0.0.0 \
        --port 8000
    • --enable-auto-tool-choice allows the server to return tool calls automatically when the model requests them.
    • --tool-call-parser mistral uses Mistral’s native tool calling format for parsing tool calls.

Verification

  1. Send a chat completion request that includes tool definitions, for example:

    $ curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4",
        "messages": [
          {
            "role": "user",
            "content": "What is the weather in Paris right now?"
          }
        ],
        "tools": [
          {
            "type": "function",
            "function": {
              "name": "get_weather",
              "description": "Get the current weather for a location",
              "parameters": {
                "type": "object",
                "properties": {
                  "location": {
                    "type": "string",
                    "description": "The city name"
                  }
                },
                "required": ["location"]
              }
            }
          }
        ],
        "tool_choice": "auto"
      }'

    If the model decides a tool is needed, the response includes a tool_calls array instead of a final answer.

    Note

    Tool execution is performed by your application, not by the model. The model only generates a structured request that describes which tool to call and which arguments to pass.
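When the model requests a tool, the response body has the general shape shown below. The following sketch extracts the tool call from such a response; all values (`call_123`, the arguments) are illustrative, and real responses contain additional fields that are omitted here.

```python
import json

# Illustrative response body; real IDs and values will differ.
response = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_123",
                "type": "function",
                "function": {"name": "get_weather",
                             "arguments": "{\"location\": \"Paris\"}"},
            }],
        },
    }],
}

choice = response["choices"][0]
if choice["finish_reason"] == "tool_calls":
    for call in choice["message"]["tool_calls"]:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        print(name, args)  # -> get_weather {'location': 'Paris'}
```

Checking `finish_reason` distinguishes a tool request from a final answer: a final answer arrives with `finish_reason` of `stop` and a non-null `content` field instead.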

  2. Execute the requested tool in your application and send the tool result back to the model. For example:

    $ curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4",
        "messages": [
          {
            "role": "user",
            "content": "What is the weather in Paris right now?"
          },
          {
            "role": "assistant",
            "content": null,
            "tool_calls": [
              {
                "id": "call_123",
                "type": "function",
                "function": {
                  "name": "get_weather",
                  "arguments": "{\"location\": \"Paris\"}"
                }
              }
            ]
          },
          {
            "role": "tool",
            "tool_call_id": "call_123",
            "content": "The weather in Paris is 18°C and sunny."
          }
        ]
      }'

    The model uses the tool output to generate a final natural language response, returned in the assistant message of the JSON response body.
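The step 2 request body can also be assembled programmatically before being POSTed to /v1/chat/completions. A minimal Python sketch, mirroring the curl example above (the `call_123` ID and the weather string are the illustrative values from that example):

```python
import json

user_msg = {"role": "user", "content": "What is the weather in Paris right now?"}

# The assistant's tool request, echoed back verbatim from the first response.
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": json.dumps({"location": "Paris"})},
    }],
}

# The tool result produced by the application, keyed to the call by its ID.
tool_msg = {
    "role": "tool",
    "tool_call_id": "call_123",
    "content": "The weather in Paris is 18°C and sunny.",
}

payload = {
    "model": "RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4",
    "messages": [user_msg, assistant_msg, tool_msg],
}
print(json.dumps(payload, indent=2))
```

Note that the assistant message carrying `tool_calls` must be included in the conversation history, and the `tool_call_id` of each tool message must match the ID the model generated, or the server rejects the request.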
