Chapter 4. Validated models on Hugging Face - May 2025 collection
The following models, available from Red Hat AI on Hugging Face, are validated for use with Red Hat AI Inference Server.
| Model | Quantized variants | Hugging Face model card | Validated on |
|---|---|---|---|
| gemma-2-9b-it | FP8 | | |
| granite-3.1-8b-base | INT4 | | |
| granite-3.1-8b-instruct | INT4, INT8, FP8 | | |
| Llama-3.1-8B-Instruct | None | | |
| Llama-3.1-Nemotron-70B-Instruct-HF | FP8 | | |
| Llama-3.3-70B-Instruct | INT4, INT8, FP8 | | |
| Llama-4-Maverick-17B-128E-Instruct | FP8 | | |
| Llama-4-Scout-17B-16E-Instruct | INT4, FP8 | | |
| Meta-Llama-3.1-8B-Instruct | INT4, INT8, FP8 | | |
| Mistral-Small-24B-Instruct-2501 | INT4, INT8, FP8 | | |
| Mistral-Small-3.1-24B-Instruct-2503 | INT4, INT8, FP8 | | |
| Mixtral-8x7B-Instruct-v0.1 | None | | |
| phi-4 | INT4, INT8, FP8 | | |
| Qwen2.5-7B-Instruct | INT4, INT8, FP8 | | |
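
When choosing a model for a given precision, it can help to query the table programmatically. The following is a minimal Python sketch, not part of Red Hat AI Inference Server: the `VALIDATED_MODELS` dictionary and the `models_with_variant` helper are hypothetical names introduced here for illustration, and the data is transcribed from the Model and Quantized variants columns of the table.

```python
# Hypothetical lookup table transcribed from the "Model" and
# "Quantized variants" columns; an empty list corresponds to "None".
VALIDATED_MODELS = {
    "gemma-2-9b-it": ["FP8"],
    "granite-3.1-8b-base": ["INT4"],
    "granite-3.1-8b-instruct": ["INT4", "INT8", "FP8"],
    "Llama-3.1-8B-Instruct": [],
    "Llama-3.1-Nemotron-70B-Instruct-HF": ["FP8"],
    "Llama-3.3-70B-Instruct": ["INT4", "INT8", "FP8"],
    "Llama-4-Maverick-17B-128E-Instruct": ["FP8"],
    "Llama-4-Scout-17B-16E-Instruct": ["INT4", "FP8"],
    "Meta-Llama-3.1-8B-Instruct": ["INT4", "INT8", "FP8"],
    "Mistral-Small-24B-Instruct-2501": ["INT4", "INT8", "FP8"],
    "Mistral-Small-3.1-24B-Instruct-2503": ["INT4", "INT8", "FP8"],
    "Mixtral-8x7B-Instruct-v0.1": [],
    "phi-4": ["INT4", "INT8", "FP8"],
    "Qwen2.5-7B-Instruct": ["INT4", "INT8", "FP8"],
}


def models_with_variant(variant: str) -> list[str]:
    """Return the models that list the given quantized variant (case-insensitive)."""
    wanted = variant.upper()
    return [name for name, variants in VALIDATED_MODELS.items() if wanted in variants]


if __name__ == "__main__":
    for variant in ("INT4", "INT8", "FP8"):
        print(variant, models_with_variant(variant))
```

Calling `models_with_variant("int8")`, for example, returns only the models whose row lists INT8. Models whose row shows None map to an empty list and are never returned.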