
Chapter 1. Red Hat AI validated models


The Red Hat AI models are validated using open source tools. The model type you use depends on how you want to deploy the model.

Note

If you are using AI Inference Server as part of a RHEL AI deployment, use OCI artifact images.

If you are using AI Inference Server as part of an OpenShift AI deployment, use ModelCar images.
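
The rule in this note can be written down as a small lookup. The following is only an illustrative sketch: the deployment keys (`rhel-ai`, `openshift-ai`) and the helper name `image_type_for` are hypothetical labels chosen for this example, not identifiers from any Red Hat tool.

```python
# Hypothetical lookup that mirrors the note above.
# The keys are illustrative labels, not official product identifiers.
IMAGE_TYPE_BY_DEPLOYMENT = {
    "rhel-ai": "OCI artifact image",
    "openshift-ai": "ModelCar image",
}


def image_type_for(deployment: str) -> str:
    """Return the image type the note recommends for a deployment target."""
    try:
        return IMAGE_TYPE_BY_DEPLOYMENT[deployment]
    except KeyError:
        # Fail loudly for targets the note does not cover.
        raise ValueError(f"unknown deployment target: {deployment!r}") from None


print(image_type_for("rhel-ai"))       # OCI artifact image
print(image_type_for("openshift-ai"))  # ModelCar image
```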

Red Hat uses GuideLLM for performance benchmarking and Language Model Evaluation Harness for accuracy evaluations.

Explore the Red Hat AI validated models collections on Hugging Face.

Important

AMD GPUs support FP8 (W8A8) and GGUF quantization variant models only. For more information, see Supported hardware.
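
As a rough illustration of this restriction, the sketch below filters a model's quantized variant labels down to the ones the statement above allows on AMD GPUs. The function name `amd_compatible` is hypothetical, and the sketch considers quantized variant labels only. Check Supported hardware for the authoritative rules.

```python
# Quantization labels that the statement above lists as supported on AMD GPUs.
AMD_SUPPORTED_QUANTIZATION = {"FP8", "GGUF"}


def amd_compatible(variants):
    """Keep only the quantized variant labels allowed on AMD GPUs.

    Hypothetical helper: labels are compared case-insensitively.
    """
    return [v for v in variants if v.strip().upper() in AMD_SUPPORTED_QUANTIZATION]


# Example using the variant labels listed for phi-4 in this chapter.
print(amd_compatible(["INT4", "INT8", "FP8"]))  # ['FP8']
```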

1.1. Red Hat AI validated models - October 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.1. Red Hat AI validated models - October 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

gpt-oss-120b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

gpt-oss-20b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

NVIDIA-Nemotron-Nano-9B-v2

INT4, FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Qwen3-Coder-480B-A35B-Instruct

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Voxtral-Mini-3B-2507

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

whisper-large-v3-turbo

INT4

  • RHAIIS 3.2.2
  • RHOAI 2.25

1.2. Validated models on Hugging Face - September 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.2. Red Hat AI validated models - September 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

DeepSeek-R1-0528

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

gemma-3n-E4B-it

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

Kimi-K2-Instruct

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

Qwen3-8B

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

1.3. Validated models on Hugging Face - May 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.3. Red Hat AI validated models - May 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

gemma-2-9b-it

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-base

INT4

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-8B-Instruct

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-Nemotron-70B-Instruct-HF

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.3-70B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Maverick-17B-128E-Instruct

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Scout-17B-16E-Instruct

INT4, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Meta-Llama-3.1-8B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-24B-Instruct-2501

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-3.1-24B-Instruct-2503

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mixtral-8x7B-Instruct-v0.1

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

phi-4

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Qwen2.5-7B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

1.4. Validated OCI artifact model container images

Table 1.4. Validated OCI artifact model container images
Model | Quantized variants | OCI artifact images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-base-quantized-w4a16:1.5

granite-3.1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/granite-3.1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5

1.5. ModelCar container images

Table 1.5. ModelCar container images
Model | Quantized variants | ModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-base-quantized-w4a16:1.5

granite-3-1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5
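
The image references in Tables 1.4 and 1.5 appear to follow a common pattern: the same registry path, an optional `modelcar-` prefix, the model name, an optional variant suffix, and a tag. The sketch below composes a reference from those parts. It is an observation drawn from the tables, not a documented naming contract, and the helper name `image_ref` and its parameters are hypothetical. The variant suffix differs between models (for example `fp8` and `fp8-dynamic`), so it is passed in explicitly rather than derived. Always confirm the final reference against the tables.

```python
# Registry path shared by every image listed in Tables 1.4 and 1.5.
REGISTRY = "registry.redhat.io/rhelai1"


def image_ref(model: str, suffix: str = "", tag: str = "1.5", modelcar: bool = False) -> str:
    """Compose an image reference in the pattern seen in the tables.

    Hypothetical helper: `suffix` is the variant part of the name
    (for example "quantized-w4a16"); leave it empty for a baseline image.
    """
    prefix = "modelcar-" if modelcar else ""
    variant = f"-{suffix}" if suffix else ""
    return f"{REGISTRY}/{prefix}{model}{variant}:{tag}"


# OCI artifact image for the INT4 variant of phi-4.
print(image_ref("phi-4", "quantized-w4a16"))
# ModelCar image for the FP8 variant of phi-4.
print(image_ref("phi-4", "fp8-dynamic", modelcar=True))
```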

© 2025 Red Hat