Ce contenu n'est pas disponible dans la langue sélectionnée.

Chapter 1. Red Hat AI validated models


The Red Hat AI models are validated using open source tools. The model type you use depends on how you want to deploy the model.

Note

If you are using AI Inference Server as part of a RHEL AI deployment, use OCI artifact images.

If you are using AI Inference Server as part of a OpenShift AI deployment, use ModelCar images.

Red Hat uses GuideLLM for performance benchmarking and Language Model Evaluation Harness for accuracy evaluations.

Explore the Red Hat AI validated models collections on Hugging Face.

Important

AMD GPUs support FP8 (W8A8) and GGUF quantization variant models only. For more information, see Supported hardware.

1.1. Red Hat AI validated models - October 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.1. Red Hat AI validated models - October 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

gpt-oss-120b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

gpt-oss-20b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

NVIDIA-Nemotron-Nano-9B-v2

INT4, FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Qwen3-Coder-480B-A35B-Instruct

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Voxtral-Mini-3B-2507

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

whisper-large-v3-turbo

INT4

  • RHAIIS 3.2.2
  • RHOAI 2.25

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.2. Red Hat AI validated models - September 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

DeepSeek-R1-0528

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

gemma-3n-E4B-it

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

Kimi-K2-Instruct

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

Qwen3-8B

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

1.3. Validated models on Hugging Face - May 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.3. Red Hat AI validated models - May 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

gemma-2-9b-it

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-base

INT4

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-8B-Instruct

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-Nemotron-70B-Instruct-HF

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.3-70B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Maverick-17B-128E-Instruct

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Scout-17B-16E-Instruct

INT4, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Meta-Llama-3.1-8B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-24B-Instruct-2501

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-3.1-24B-Instruct-2503

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mixtral-8x7B-Instruct-v0.1

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

phi-4

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Qwen2.5-7B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

1.4. Validated OCI artifact model container images

Expand
Table 1.4. Validated OCI artifact model container images
ModelQuantized variantsModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-base-quantized-w4a16:1.5

granite-3.1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/granite-3.1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5

1.5. ModelCar container images

Expand
Table 1.5. ModelCar container images
ModelQuantized variantsModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-base-quantized-w4a16:1.5

granite-3-1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5
Retour au début
Red Hat logoGithubredditYoutubeTwitter

Apprendre

Essayez, achetez et vendez

Communautés

À propos de la documentation Red Hat

Nous aidons les utilisateurs de Red Hat à innover et à atteindre leurs objectifs grâce à nos produits et services avec un contenu auquel ils peuvent faire confiance. Découvrez nos récentes mises à jour.

Rendre l’open source plus inclusif

Red Hat s'engage à remplacer le langage problématique dans notre code, notre documentation et nos propriétés Web. Pour plus de détails, consultez le Blog Red Hat.

À propos de Red Hat

Nous proposons des solutions renforcées qui facilitent le travail des entreprises sur plusieurs plates-formes et environnements, du centre de données central à la périphérie du réseau.

Theme

© 2025 Red Hat