Chapter 1. Red Hat AI validated models


Red Hat AI models are validated using open source tools. The model image type that you use depends on how you deploy the model.

Note

If you are using AI Inference Server as part of a RHEL AI deployment, use OCI artifact images.

If you are using AI Inference Server as part of an OpenShift AI deployment, use ModelCar images.
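
The note above amounts to a mapping from deployment platform to image type. The following is a minimal Python sketch of that mapping. The function name `image_type_for` and the platform keys `"rhel-ai"` and `"openshift-ai"` are hypothetical labels chosen for illustration, not product identifiers.

```python
# Hypothetical platform labels mapped to the image type named in the note.
IMAGE_TYPE_BY_PLATFORM = {
    "rhel-ai": "OCI artifact",
    "openshift-ai": "ModelCar",
}


def image_type_for(platform: str) -> str:
    """Return the image type recommended for the given deployment platform."""
    key = platform.strip().lower()
    try:
        return IMAGE_TYPE_BY_PLATFORM[key]
    except KeyError:
        # Any platform not covered by the note is rejected rather than guessed.
        raise ValueError(f"unknown platform: {platform!r}") from None
```

For example, `image_type_for("openshift-ai")` returns `"ModelCar"`, and a platform that the note does not cover raises `ValueError`.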

Red Hat uses GuideLLM for performance benchmarking and Language Model Evaluation Harness for accuracy evaluations.

Explore the Red Hat AI validated models collections on Hugging Face.

Important

AMD GPUs support FP8 (W8A8) and GGUF quantization variant models only. For more information, see Supported hardware.
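
As an illustration of the restriction above, the following Python sketch filters a row's quantized variants down to those the note lists for AMD GPUs. The names `AMD_SUPPORTED` and `amd_compatible` are hypothetical, and the sketch covers quantized variants only; it makes no claim about baseline (unquantized) models.

```python
# Quantization labels as they appear in the tables in this chapter.
# Assumption: the set below mirrors the Important note, nothing more.
AMD_SUPPORTED = {"FP8", "GGUF"}


def amd_compatible(variants: list[str]) -> list[str]:
    """Keep only the quantized variants that the note lists for AMD GPUs."""
    return [v for v in variants if v.strip().upper() in AMD_SUPPORTED]
```

Applied to a row that lists `INT4, INT8, FP8`, the function keeps only `FP8`. A row that lists only `INT4` yields an empty list.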

1.1. Red Hat AI validated models - October 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.1. Red Hat AI validated models - October 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

gpt-oss-120b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

gpt-oss-20b

None

  • RHAIIS 3.2.2
  • RHOAI 2.25

NVIDIA-Nemotron-Nano-9B-v2

INT4, FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Qwen3-Coder-480B-A35B-Instruct

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

Voxtral-Mini-3B-2507

FP8

  • RHAIIS 3.2.2
  • RHOAI 2.25

whisper-large-v3-turbo

INT4

  • RHAIIS 3.2.2
  • RHOAI 2.25

1.2. Validated models on Hugging Face - September 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.2. Red Hat AI validated models - September 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

DeepSeek-R1-0528

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

gemma-3n-E4B-it

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

Kimi-K2-Instruct

INT4

  • RHAIIS 3.2.1
  • RHOAI 2.24

Qwen3-8B

FP8

  • RHAIIS 3.2.1
  • RHOAI 2.24

1.3. Validated models on Hugging Face - May 2025 collection

The following models, available from RedHat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.

Table 1.3. Red Hat AI validated models - May 2025 collection
Model | Quantized variants | Hugging Face model cards | Validated on

gemma-2-9b-it

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-base

INT4

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

granite-3.1-8b-instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-8B-Instruct

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.1-Nemotron-70B-Instruct-HF

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-3.3-70B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Maverick-17B-128E-Instruct

FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Llama-4-Scout-17B-16E-Instruct

INT4, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Meta-Llama-3.1-8B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-24B-Instruct-2501

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mistral-Small-3.1-24B-Instruct-2503

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Mixtral-8x7B-Instruct-v0.1

None

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

phi-4

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

Qwen2.5-7B-Instruct

INT4, INT8, FP8

  • RHAIIS 3.0
  • RHELAI 1.5
  • RHOAI 2.20

1.4. Validated OCI artifact model container images

Table 1.4. Validated OCI artifact model container images
Model | Quantized variants | OCI artifact images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/granite-3-1-8b-base-quantized-w4a16:1.5

granite-3.1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/granite-3.1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5
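
Each image reference in the table above has the shape `registry/namespace/name:tag`, and the name suffix indicates the variant. The following Python sketch splits a reference into those parts and infers the variant label from the suffixes that appear in this table (`-quantized-w4a16`, `-quantized-w8a8`, `-fp8-dynamic`, `-fp8`). The names `ImageRef`, `parse_image_ref`, and `variant_of` are hypothetical helpers, and the suffix mapping is an observation about the listed entries, not a documented naming contract.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    namespace: str
    name: str
    tag: str


# Suffixes observed in the table; longer FP8 suffix is checked before "-fp8".
SUFFIX_TO_VARIANT = (
    ("-quantized-w4a16", "INT4"),
    ("-quantized-w8a8", "INT8"),
    ("-fp8-dynamic", "FP8"),
    ("-fp8", "FP8"),
)


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/namespace/name:tag' into its four parts."""
    path, sep, tag = ref.rpartition(":")
    if not sep or not tag or "/" in tag:
        raise ValueError(f"missing tag in {ref!r}")
    parts = path.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected registry/namespace/name in {ref!r}")
    return ImageRef(parts[0], parts[1], parts[2], tag)


def variant_of(ref: str) -> str:
    """Infer the variant label from the image name suffix."""
    name = parse_image_ref(ref).name
    for suffix, variant in SUFFIX_TO_VARIANT:
        if name.endswith(suffix):
            return variant
    return "Baseline"
```

For example, `variant_of("registry.redhat.io/rhelai1/phi-4-quantized-w4a16:1.5")` returns `"INT4"`, and a name with none of the listed suffixes is treated as a baseline image.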

1.5. ModelCar container images

Table 1.5. ModelCar container images
Model | Quantized variants | ModelCar images

llama-4-scout-17b-16e-instruct

INT4, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-quantized-w4a16:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-scout-17b-16e-instruct-fp8-dynamic:1.5

llama-4-maverick-17b-128e-instruct

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-4-maverick-17b-128e-instruct-fp8:1.5

mistral-small-3-1-24b-instruct-2503

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-3-1-24b-instruct-2503-fp8-dynamic:1.5

llama-3-3-70b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-3-70b-instruct-fp8-dynamic:1.5

llama-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-8b-instruct-fp8-dynamic:1.5

granite-3-1-8b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-instruct-fp8-dynamic:1.5

phi-4

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-phi-4:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-phi-4-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-phi-4-fp8-dynamic:1.5

qwen2-5-7b-instruct

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-qwen2-5-7b-instruct-fp8-dynamic:1.5

mistral-small-24b-instruct-2501

INT4, INT8, FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501:1.5
  • INT4: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w4a16:1.5
  • INT8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-quantized-w8a8:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-mistral-small-24b-instruct-2501-fp8-dynamic:1.5

mixtral-8x7b-instruct-v0-1

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-mixtral-8x7b-instruct-v0-1:1.4

granite-3-1-8b-base

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-base-quantized-w4a16:1.5

granite-3-1-8b-starter-v2

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-granite-3-1-8b-starter-v2:1.5

llama-3-1-nemotron-70b-instruct-hf

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-llama-3-1-nemotron-70b-instruct-hf-fp8-dynamic:1.5

gemma-2-9b-it

FP8

  • Baseline: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it:1.5
  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-2-9b-it-fp8:1.5

deepseek-r1-0528

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-deepseek-r1-0528-quantized-w4a16:1.5

qwen3-8b

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-8b-fp8-dynamic:1.5

kimi-k2-instruct

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-kimi-k2-instruct-quantized-w4a16:1.5

gemma-3n-e4b-it

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-gemma-3n-e4b-it-fp8-dynamic:1.5

gpt-oss-120b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-120b:1.5

gpt-oss-20b

None

  • Baseline: registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5

qwen3-coder-480b-a35b-instruct

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-qwen3-coder-480b-a35b-instruct-fp8:1.5

whisper-large-v3-turbo

INT4 (baseline currently unavailable)

  • INT4: registry.redhat.io/rhelai1/modelcar-whisper-large-v3-turbo-quantized-w4a16:1.5

voxtral-mini-3b-2507

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-voxtral-mini-3b-2507-fp8-dynamic:1.5

nvidia-nemotron-nano-9b-v2

FP8 (baseline currently unavailable)

  • FP8: registry.redhat.io/rhelai1/modelcar-nvidia-nemotron-nano-9b-v2-fp8-dynamic:1.5
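
The ModelCar image names listed above carry a `modelcar-` prefix. The following Python sketch checks for that prefix and wraps a reference in an `oci://` URI. It assumes that the serving platform accepts `oci://` storage URIs for ModelCar images, as KServe does; confirm this for your deployment. The function name `modelcar_storage_uri` is a hypothetical helper.

```python
MODELCAR_PREFIX = "modelcar-"


def modelcar_storage_uri(image_ref: str) -> str:
    """Return an oci:// URI for a ModelCar image reference.

    Assumption: the model server takes an oci:// URI as its storage location.
    """
    # The image name is the last path segment, before any ':tag'.
    name = image_ref.rsplit("/", 1)[-1]
    if not name.startswith(MODELCAR_PREFIX):
        raise ValueError(f"not a ModelCar image: {image_ref!r}")
    return f"oci://{image_ref}"
```

For example, `modelcar_storage_uri("registry.redhat.io/rhelai1/modelcar-phi-4:1.5")` returns `"oci://registry.redhat.io/rhelai1/modelcar-phi-4:1.5"`. A reference without the `modelcar-` prefix raises `ValueError`, which guards against passing an OCI artifact image by mistake.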

© 2025 Red Hat