Validated models


Red Hat AI Inference Server 3.0

Red Hat AI Inference Server Validated models

Red Hat AI Documentation Team

Abstract

Learn about the validated models that you can run with Red Hat AI Inference Server.

Preface

Red Hat provides validated third-party models that you can serve with AI Inference Server. These models are designed for efficient deployment on the Red Hat AI platform. Models are validated using open source tools. Red Hat uses GuideLLM for performance benchmarking and Language Model Evaluation Harness for accuracy evaluations.

Note

You can explore the validated models, complete with model details and deployment instructions, in the Red Hat AI validated models - v1.0 collection on Hugging Face.

Chapter 1. Red Hat AI validated models

The following table lists the Red Hat AI validated models for use with Red Hat AI Inference Server 3.0.

  • If you are using AI Inference Server as a standalone product, use the Hugging Face images.
  • If you are using AI Inference Server as part of a RHEL AI deployment, use the model OCI artifact image.
  • If you are using AI Inference Server as part of an OpenShift AI deployment, use the model ModelCar image.

Important

AMD GPUs support FP8 (W8A8) and GGUF quantization variant models only. For more information, see Supported hardware.
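The deployment guidance and the AMD GPU note above can be expressed as a small lookup. The following Python sketch is illustrative only: the `Deployment` enum, the `image_source` and `variants_for_amd` helpers, and the label strings are hypothetical names, not part of any Red Hat tool. It also assumes that the AMD note restricts which quantized variants you can select and makes no statement about baseline models.

```python
from enum import Enum


class Deployment(Enum):
    """Hypothetical labels for the three deployment options."""
    STANDALONE = "standalone"
    RHEL_AI = "rhel-ai"
    OPENSHIFT_AI = "openshift-ai"


# Which column of the validated models table applies to each deployment type.
IMAGE_SOURCE = {
    Deployment.STANDALONE: "Hugging Face model card",
    Deployment.RHEL_AI: "OCI artifact image",
    Deployment.OPENSHIFT_AI: "ModelCar image",
}

# Quantized variants that the Important note lists for AMD GPUs (assumption:
# the labels match those used in the "Quantized variants" column).
AMD_SUPPORTED_VARIANTS = {"FP8", "GGUF"}


def image_source(deployment: Deployment) -> str:
    """Return the table column to use for the given deployment type."""
    return IMAGE_SOURCE[deployment]


def variants_for_amd(variants: list[str]) -> list[str]:
    """Keep only the quantized variants that the note lists for AMD GPUs."""
    return [v for v in variants if v in AMD_SUPPORTED_VARIANTS]


if __name__ == "__main__":
    print(image_source(Deployment.OPENSHIFT_AI))
    print(variants_for_amd(["INT4", "INT8", "FP8"]))
```

For example, a model listed with INT4, INT8, and FP8 variants would be narrowed to FP8 for an AMD GPU deployment. Check Supported hardware before relying on this filter.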

Table 1.1. Red Hat AI validated models
Model | Quantized variants | Hugging Face model cards [1] | OCI artifact images [2] | ModelCar images [3]

Llama-4-Scout-17B-16E-Instruct

INT4, FP8

  • Baseline:

    registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5

  • FP8:

    registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

Llama-4-Maverick-17B-128E-Instruct

FP8

  • Baseline:

    registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct:1.5

  • FP8:

    registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct-fp8:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct-fp8:1.5

Mistral-Small-3.1-24B-Instruct-2503

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503:1.5

  • INT4:

    registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

Llama-3.3-70B-Instruct

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/llama-3-3-70b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-fp8-dynamic:1.5

Llama-3.1-8B-Instruct

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/llama-3-1-8b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/llama-3-1-8b-instruct-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3.1-8b-instruct

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/granite-3-1-8b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/granite-3-1-8b-instruct-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/phi-4:1.5

  • INT4:

    registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/phi-4-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/phi-4-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-phi-4:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-phi-4-fp8-dynamic:1.5

Qwen2.5-7B-Instruct

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/qwen2-5-7b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/qwen2-5-7b-instruct-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-fp8-dynamic:1.5

Mistral-Small-24B-Instruct-2501

INT4, INT8, FP8

  • Baseline:

    registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501:1.5

  • INT4:

    registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w4a16:1.5

  • INT8:

    registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w8a8:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-fp8-dynamic:1.5

Mixtral-8x7B-Instruct-v0.1

None

  • Baseline:

    registry.redhat.io/rhelai1/mixtral-8x7b-instruct-v0-1:1.4

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-mixtral-8x7b-instruct-v0-1:1.4

granite-3.1-8b-base

INT4 (baseline currently unavailable)

  • INT4:

    registry.redhat.io/rhelai1/granite-3-1-8b-base-quantized-w4a16:1.5

  • INT4:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-base-quantized-w4a16:1.5

granite-3.1-8b-starter-v2

None

  • Unavailable on Hugging Face
  • Baseline:

    registry.redhat.io/rhelai1/granite-3.1-8b-starter-v2:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-starter-v2:1.5

Llama-3.1-Nemotron-70B-Instruct-HF

FP8

  • Baseline:

    registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf:1.5

  • FP8:

    registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline:

    registry.redhat.io/rhelai1/gemma-2-9b-it:1.5

  • FP8:

    registry.redhat.io/rhelai1/gemma-2-9b-it-fp8:1.5

  • Baseline:

    registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it:1.5

  • FP8:

    registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5

  1. For use with standalone Red Hat AI Inference Server
  2. For use with RHEL AI
  3. For use with Red Hat OpenShift AI
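
Most image references in Table 1.1 appear to follow one naming pattern: the `registry.redhat.io/rhelai1` repository path, an optional `modelcar-` prefix, a lowercase model name, an optional variant suffix such as `-quantized-w4a16`, and a version tag. The following Python sketch composes and splits references under that assumption. The `image_ref` and `split_ref` helpers are hypothetical, and the pattern is inferred from the table rather than documented. Some rows deviate from it: the FP8 suffix is `-fp8` for some models and `-fp8-dynamic` for others, and the Mixtral images use the `1.4` tag. Treat the table as authoritative.

```python
REGISTRY = "registry.redhat.io/rhelai1"


def image_ref(slug: str, suffix: str = "", modelcar: bool = False,
              tag: str = "1.5") -> str:
    """Compose an image reference from its parts.

    slug     -- model name as it appears in the table, for example "phi-4"
    suffix   -- variant suffix copied from the table, for example "-quantized-w8a8"
    modelcar -- True for a ModelCar image, False for an OCI artifact image
    tag      -- version tag copied from the table
    """
    prefix = "modelcar-" if modelcar else ""
    return f"{REGISTRY}/{prefix}{slug}{suffix}:{tag}"


def split_ref(ref: str) -> tuple[str, bool, str]:
    """Split a reference into (image name without prefix, is_modelcar, tag)."""
    repo, _, tag = ref.rpartition(":")
    name = repo.rsplit("/", 1)[-1]
    is_modelcar = name.startswith("modelcar-")
    if is_modelcar:
        name = name[len("modelcar-"):]
    return name, is_modelcar, tag


if __name__ == "__main__":
    print(image_ref("llama-3-1-8b-instruct", "-quantized-w4a16", modelcar=True))
    print(split_ref("registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5"))
```

Passing the suffix and tag explicitly, instead of deriving them from a variant label, keeps the helper from guessing where rows in the table differ.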

Legal Notice

Copyright © 2025 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.