Dieser Inhalt ist in der von Ihnen ausgewählten Sprache nicht verfügbar.

Chapter 1. Red Hat AI validated models


The Red Hat AI models are validated using open source tools. The model type you use depends on how you want to deploy the model.

Note

If you are using AI Inference Server as part of a RHEL AI deployment, use OCI artifact images.

If you are using AI Inference Server as part of a OpenShift AI deployment, use ModelCar images.

Red Hat uses GuideLLM for performance benchmarking and Language Model Evaluation Harness for accuracy evaluations.

Explore the Red Hat AI validated models collections on Hugging Face.

Important

AMD GPUs support FP8 (W8A8) and GGUF quantization variant models only. For more information, see Supported hardware.

1.1. Red Hat AI validated models - October 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.1. Red Hat AI validated models - October 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

gpt-oss-120b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

gpt-oss-20b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

NVIDIA-Nemotron-Nano-9B-v2

INT4, FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Qwen3-Coder-480B-A35B-Instruct

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Voxtral-Mini-3B-2507

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

whisper-large-v3-turbo

INT4

  • RHAIIS 3.2.2
  • RHOAI 2.25

1.2. Validated models on Hugging Face - September 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.2. Red Hat AI validated models - September 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

DeepSeek-R1-0528

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

gemma-3n-E4B-it

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

Kimi-K2-Instruct

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

Qwen3-8B

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

1.3. Validated models on Hugging Face - May 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Expand
Table 1.3. Red Hat AI validated models - May 2025 collection
ModelQuantized variantsHugging Face model cardsValidated on

gemma-2-9b-it

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-base

INT4

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-8B-Instruct

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-Nemotron-70B-Instruct-HF

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.3-70B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Maverick-17B-128E-Instruct

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Scout-17B-16E-Instruct

INT4, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Meta-Llama-3.1-8B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-24B-Instruct-2501

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-3.1-24B-Instruct-2503

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mixtral-8x7B-Instruct-v0.1

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

phi-4

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Qwen2.5-7B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

1.4. Validated OCI artifact model container images

Expand
Table 1.4. Validated OCI artifact model container images
ModelQuantized variantsModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-base-quantized-w4a16:1.5

granite-3.1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/granite-3.1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5

1.5. ModelCar container images

Expand
Table 1.5. ModelCar container images
ModelQuantized variantsModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-base-quantized-w4a16:1.5

granite-3-1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5
Nach oben
Red Hat logoGithubredditYoutubeTwitter

Lernen

Testen, kaufen und verkaufen

Communitys

Über Red Hat Dokumentation

Wir helfen Red Hat Benutzern, mit unseren Produkten und Diensten innovativ zu sein und ihre Ziele zu erreichen – mit Inhalten, denen sie vertrauen können. Entdecken Sie unsere neuesten Updates.

Mehr Inklusion in Open Source

Red Hat hat sich verpflichtet, problematische Sprache in unserem Code, unserer Dokumentation und unseren Web-Eigenschaften zu ersetzen. Weitere Einzelheiten finden Sie in Red Hat Blog.

Über Red Hat

Wir liefern gehärtete Lösungen, die es Unternehmen leichter machen, plattform- und umgebungsübergreifend zu arbeiten, vom zentralen Rechenzentrum bis zum Netzwerkrand.

Theme

© 2025 Red Hat