Chapter 5. Enabling tool calling for Mistral 3 models
Configure a Mistral 3 model deployment to use tool calling with the vLLM OpenAI-compatible API.
Tool calling enables the model to request that your application execute an external function by returning a structured tool_calls object. Your application runs the tool and sends the result back to the model to continue the conversation.
Prerequisites
- You have deployed a Mistral 3 Instruct model with Red Hat AI Inference Server.
- You have defined one or more tools that the model is allowed to call.
- You are running the vLLM serving container included with AI Inference Server.
Procedure
Deploy the RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4 model with AI Inference Server and enable tool calling:
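A serving command along the following lines starts the model with tool calling enabled. This is a sketch: the two tool-calling flags come from this procedure, while any parallelism or memory flags you add depend on your hardware.

```shell
# Start the vLLM OpenAI-compatible server with Mistral tool calling enabled.
# Add parallelism flags (for example --tensor-parallel-size) as required
# by your GPU topology.
vllm serve RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4 \
  --enable-auto-tool-choice \
  --tool-call-parser mistral
```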
- --enable-auto-tool-choice allows the server to return tool calls automatically when the model requests them.
- --tool-call-parser mistral uses Mistral's native tool-calling format for parsing tool calls.
Verification
Send a chat completion request that includes tool definitions, for example:
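The request below is a sketch: the port, the user message, and the get_weather tool definition are illustrative placeholders; substitute your own endpoint and tool schema.

```shell
# Chat completion request that declares one callable tool.
# The endpoint, tool name, and parameters are illustrative.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4",
    "messages": [
      {"role": "user", "content": "What is the weather in Boston today?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```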
If the model decides a tool is needed, the response includes a tool_calls array instead of a final answer.

Note: Tool execution is performed by your application, not by the model. The model only generates a structured request describing which tool to call and which arguments to use.
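A tool call response has roughly the following shape. The field values here, including the hypothetical get_weather function and the call id, are illustrative; Mistral's format uses 9-character alphanumeric tool call ids.

```json
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "abc123xyz",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Boston\"}"
            }
          }
        ]
      }
    }
  ]
}
```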
Execute the requested tool in your application and send the tool result back to the model. For example:
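Assuming a hypothetical get_weather tool produced a result, the follow-up request replays the conversation, appending the assistant's tool call and a tool-role message that carries the result under the matching tool_call_id. All names and values below are illustrative.

```shell
# Return the tool result to the model by appending a "tool" role message
# that references the tool call id from the previous response.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "RedHatAI/Mistral-Large-3-675B-Instruct-2512-NVFP4",
    "messages": [
      {"role": "user", "content": "What is the weather in Boston today?"},
      {"role": "assistant", "tool_calls": [
        {"id": "abc123xyz", "type": "function",
         "function": {"name": "get_weather",
                      "arguments": "{\"city\": \"Boston\"}"}}
      ]},
      {"role": "tool", "tool_call_id": "abc123xyz",
       "content": "{\"temperature_c\": 21, \"conditions\": \"sunny\"}"}
    ]
  }'
```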
The model uses the tool output to generate a final natural-language answer and returns it in the JSON response.