Supported product and hardware configurations
Supported hardware and software configurations for deploying Red Hat AI Inference Server
Preface
This document describes the supported hardware, software, and delivery platforms that you can use to run Red Hat AI Inference Server in production environments.
Technology Preview and Developer Preview features provide early access to potential new features.
Technology Preview and Developer Preview features are not supported and are not recommended for production workloads.
Chapter 1. Product and version compatibility
The following table lists the supported product versions for Red Hat AI Inference Server 3.2.
| Red Hat AI Inference Server version | vLLM core version | LLM Compressor version |
|---|---|---|
| 3.2.2 | v0.10.1.1 | v0.7.1 |
| 3.2.1 | v0.10.0 | Not included in this release |
| 3.2.0 | v0.9.2 | Not included in this release |
Chapter 2. Supported AI accelerators
The following tables list the supported AI data center grade accelerators for Red Hat AI Inference Server 3.2.
Red Hat AI Inference Server only supports data center grade accelerators.
Red Hat AI Inference Server 3.2 is not compatible with CUDA versions lower than 12.8.
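The CUDA minimum can be checked before deployment. A minimal sketch, assuming `nvidia-smi` is available on the host: parse the CUDA version that the driver reports and compare it against the 12.8 minimum.

```shell
# Succeeds when $1 >= $2 in version-sort order (relies on sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | tail -n1)" = "$1" ]
}

# nvidia-smi prints "CUDA Version: X.Y" in its header; extract that value.
cuda_version=$(nvidia-smi 2>/dev/null | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')

if version_ge "$cuda_version" "12.8"; then
  echo "CUDA $cuda_version meets the 12.8 minimum"
else
  echo "CUDA version '$cuda_version' is below the required 12.8" >&2
fi
```

`sort -V` handles multi-digit components correctly (for example, 12.10 sorts above 12.8), which a plain string comparison would not.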
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | NVIDIA data center GPUs | | | Not included by default |

NVIDIA T4 and A100 accelerators do not support FP8 (W8A8) quantization.
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | | | x86 | x86 Technology Preview |

AMD GPUs support FP8 (W8A8) and GGUF quantization schemes only.
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | Google v4, v5e, v5p, v6e, Trillium | | x86 Developer Preview | Not supported |
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | IBM Spyre | | x86 Developer Preview | Not supported |
Chapter 3. Supported deployment environments
The following deployment environments for Red Hat AI Inference Server are supported.
| Environment | Supported versions | Deployment notes |
|---|---|---|
| OpenShift Container Platform (self‑managed) | 4.14 – 4.19 | Deploy on bare‑metal hosts or virtual machines. |
| Red Hat OpenShift Service on AWS (ROSA) | 4.14 – 4.19 | Requires a ROSA cluster with STS and GPU‑enabled P5 or G5 node types. See Prepare your environment for more information. |
| Red Hat Enterprise Linux (RHEL) | 9.2 – 10.0 | Deploy on bare‑metal hosts or virtual machines. |
| Linux (not RHEL) | - | Supported under third‑party policy when deployed on bare‑metal hosts or virtual machines. OpenShift Container Platform Operators are not required. |
| Kubernetes (not OpenShift Container Platform) | - | Supported under third‑party policy when deployed on bare‑metal hosts or virtual machines. |
Red Hat AI Inference Server is available only as a container image. The host operating system and kernel must support the required accelerator drivers. For more information, see Supported AI accelerators.
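Because the product is delivered as a container image, a deployment on RHEL reduces to a container run command. The sketch below prints (rather than executes) one possible single-GPU Podman invocation; the image path, tag, and model name are assumptions for illustration, not official values — check catalog.redhat.com for the real image references.

```shell
# Hypothetical image reference and model; substitute real values.
IMAGE="registry.access.redhat.com/rhaiis/vllm-cuda-rhel9:3.2"
MODEL="RedHatAI/Llama-3.2-1B-Instruct-FP8"

# --device nvidia.com/gpu=all exposes NVIDIA GPUs through CDI;
# --security-opt label=disable avoids SELinux relabeling of the model cache.
cmd="podman run --rm -p 8000:8000 \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  $IMAGE --model $MODEL"

# Print the command so it can be reviewed before running on a GPU host.
echo "$cmd"
```

Run the printed command on a host whose accelerator drivers meet the requirements in Supported AI accelerators.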
Chapter 4. OpenShift Container Platform software prerequisites for GPU deployments
The following table lists the OpenShift Container Platform software prerequisites for GPU deployments.
| Component | Minimum version | Operator |
|---|---|---|
| NVIDIA GPU Operator | 24.3 | |
| AMD GPU Operator | 6.2 | |
| Node Feature Discovery [1] | 4.14 | |

[1] Included by default with OpenShift Container Platform. Node Feature Discovery is required for scheduling NUMA-aware workloads.
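Installed operator versions can be compared against these minimums from the command line. A sketch, assuming `oc` is installed and logged in to the cluster; CSV and namespace names vary by installation method, so the pattern below matches loosely and is an assumption, not an official name.

```shell
# Hypothetical name pattern for the GPU and NFD operator CSVs.
pattern='gpu-operator|node-feature-discovery'

if command -v oc >/dev/null 2>&1; then
  # ClusterServiceVersions carry the installed operator versions.
  oc get csv --all-namespaces | grep -Ei "$pattern" \
    || echo "no matching operator CSVs found"
else
  echo "oc not found; run this from a cluster-connected workstation" >&2
fi
```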
Chapter 5. Lifecycle and update policy
Security and critical bug fixes are delivered as container images available from the registry.access.redhat.com/rhaiis container registry and are announced through RHSA advisories. See RHAIIS container images on catalog.redhat.com for more details.
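Because fixes arrive as rebuilt images, one way to detect that a security rebuild has landed is to compare image digests without pulling. A sketch, assuming `skopeo` is installed; the image path is a placeholder, so substitute a real RHAIIS image reference from catalog.redhat.com.

```shell
# Placeholder image reference; substitute a real one.
IMAGE="registry.access.redhat.com/rhaiis/vllm-cuda-rhel9:3.2"

if command -v skopeo >/dev/null 2>&1; then
  # Reads the manifest digest from the registry without downloading layers.
  skopeo inspect --format '{{.Digest}}' "docker://$IMAGE"
else
  echo "skopeo not found; skipping digest check" >&2
fi
```

Comparing the printed digest against the digest of the running container reveals whether an updated build is available.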