Supported product and hardware configurations
Supported hardware and software configurations for deploying Red Hat AI Inference Server
Preface
This document describes the supported hardware, software, and delivery platforms that you can use to run Red Hat AI Inference Server in production environments.
Technology Preview and Developer Preview features provide early access to potential new features. These features are not supported and are not recommended for production workloads.
Chapter 1. Product and version compatibility
The following table lists the supported product versions for Red Hat AI Inference Server 3.1.
| Product | Supported version |
|---|---|
| Red Hat AI Inference Server | 3.1 |
| vLLM core | 0.9.0.1 |
| LLM Compressor | 0.5.1 (Technology Preview) |
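To confirm the vLLM core version bundled in the image you are running, you can query the Python package inside the container. The following is a minimal sketch; the image reference is an assumption for illustration, so substitute the RHAIIS image you actually use:

```bash
# Print the vLLM version bundled in the container image.
# The image reference below is an illustrative assumption; replace it
# with the RHAIIS image you pulled from the Red Hat registry.
podman run --rm --entrypoint python3 \
    registry.access.redhat.com/rhaiis/vllm:3.1 \
    -c "import vllm; print(vllm.__version__)"
```

The reported version should match the vLLM core version listed in the table above.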
Chapter 2. Supported AI accelerators
The following tables list the supported AI accelerators for Red Hat AI Inference Server 3.1.
Red Hat AI Inference Server 3.1 is not compatible with CUDA versions lower than 12.8.
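To verify that the host driver stack meets this requirement on NVIDIA hardware, you can check the maximum supported CUDA version that `nvidia-smi` reports in its header. A minimal check, assuming the NVIDIA driver is already installed on the host:

```bash
# The header of nvidia-smi output includes the highest CUDA version
# that the installed driver supports; it must be 12.8 or later.
nvidia-smi | head -n 4
```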
NVIDIA CUDA accelerators:

| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| - | vLLM 0.9.0.1 | NVIDIA GPUs (CUDA) | CUDA 12.8 or later | x86 | Technology Preview |
AMD ROCm accelerators:

| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| - | vLLM 0.8.4 | AMD GPUs (ROCm) | - | x86 | Technology Preview |
Google TPU accelerators:

| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| - | vLLM 0.8.5 | Google TPU v6e | - | x86 (Developer Preview) | Not supported |
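Before deploying, pull the container image for your accelerator from the Red Hat registry. A sketch, with an assumed image name and tag for illustration; use the exact image reference listed for your accelerator:

```bash
# Pull the inference server image for your accelerator.
# The image name and tag below are illustrative assumptions.
podman pull registry.access.redhat.com/rhaiis/vllm:3.1
```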
Chapter 3. Supported deployment environments
The following deployment environments for Red Hat AI Inference Server are supported.
| Environment | Supported versions | Deployment notes |
|---|---|---|
| OpenShift Container Platform (self-managed) | 4.14 – 4.18 | Deploy on bare-metal hosts or virtual machines. |
| Red Hat OpenShift Service on AWS (ROSA) | 4.14 – 4.18 | Requires a ROSA STS cluster with GPU-enabled P5 or G5 node types. |
| Red Hat Enterprise Linux (RHEL) | 9.2 – 10.0 | Deploy on bare-metal hosts or virtual machines. |
| Linux (not RHEL) | - | Supported under third-party policy. Deploy on bare-metal hosts or virtual machines. OpenShift Container Platform Operators are not required. |
| Kubernetes (not OpenShift Container Platform) | - | Supported under third-party policy. Deploy on bare-metal hosts or virtual machines. |
Red Hat AI Inference Server is available only as a container image. The host operating system and kernel must support the required accelerator drivers. For more information, see Supported AI accelerators.
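On a bare-metal RHEL host, a typical deployment runs the container directly with Podman and passes the accelerator through to it. The following is a minimal sketch, assuming an NVIDIA host with the NVIDIA Container Toolkit CDI configuration in place and an image entrypoint that starts the vLLM server; the image reference and model name are illustrative assumptions:

```bash
# Serve a model on a RHEL host, exposing all NVIDIA GPUs to the
# container through CDI. The image reference and model name are
# illustrative assumptions.
podman run --rm -p 8000:8000 \
    --device nvidia.com/gpu=all \
    --security-opt=label=disable \
    registry.access.redhat.com/rhaiis/vllm:3.1 \
    --model RedHatAI/Llama-3.1-8B-Instruct-FP8-dynamic
```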
Chapter 4. OpenShift Container Platform software prerequisites for GPU deployments
The following table lists the OpenShift Container Platform software prerequisites for GPU deployments.
| Component | Minimum version |
|---|---|
| NVIDIA GPU Operator | 24.3 |
| AMD GPU Operator | 6.2 |
| Node Feature Discovery [1] | 4.14 |
[1] Included by default with OpenShift Container Platform. Node Feature Discovery is required for scheduling NUMA-aware workloads.
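On OpenShift Container Platform, you can confirm that the prerequisite Operators are installed and meet the minimum versions by listing their ClusterServiceVersions. A sketch, assuming commonly used default namespaces; adjust the namespace names to match your installation:

```bash
# List installed Operator versions. The namespace names below are
# assumptions based on common defaults; adjust them for your cluster.
oc get csv -n nvidia-gpu-operator
oc get csv -n kube-amd-gpu
oc get csv -n openshift-nfd
```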
Chapter 5. Lifecycle and update policy
Security and critical bug fixes are delivered as container images available from the registry.access.redhat.com/rhaiis container registry and are announced through RHSA advisories. For more details, see the RHAIIS container images on catalog.redhat.com.
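To pick up a security or bug-fix rebuild, you can compare the remote image digest against your local copy and re-pull when it changes. A sketch with an assumed image reference:

```bash
# Check the remote image digest without pulling, then pull the
# latest build. The image reference is an illustrative assumption.
skopeo inspect docker://registry.access.redhat.com/rhaiis/vllm:3.1 | grep -i '"Digest"'
podman pull registry.access.redhat.com/rhaiis/vllm:3.1
```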