Supported product and hardware configurations
Supported hardware and software configurations for deploying Red Hat AI Inference Server
Preface
This document describes the supported hardware, software, and delivery platforms that you can use to run Red Hat AI Inference Server in production environments.
Technology Preview and Developer Preview features provide early access to potential new features.
Technology Preview and Developer Preview features are not supported and are not recommended for production workloads.
Chapter 1. Product and version compatibility
The following table lists the supported product versions for Red Hat AI Inference Server 3.2.
| Red Hat AI Inference Server version | vLLM core version | LLM Compressor version |
|---|---|---|
| 3.2.2 | v0.10.1.1 | v0.7.1 |
| 3.2.1 | v0.10.0 | Not included in this release |
| 3.2.0 | v0.9.2 | Not included in this release |
Chapter 2. Supported AI accelerators
The following tables list the supported AI data center grade accelerators for Red Hat AI Inference Server 3.2.
Red Hat AI Inference Server only supports data center grade accelerators.
Red Hat AI Inference Server 3.2 is not compatible with CUDA versions lower than 12.8.
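The CUDA minimum can be checked before deployment. A minimal sketch, assuming `nvidia-smi` is available on the host: parse the CUDA version that the driver reports and compare it against the 12.8 minimum.

```shell
# Succeeds when $1 >= $2 in version-sort order (relies on sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | tail -n1)" = "$1" ]
}

# nvidia-smi prints "CUDA Version: X.Y" in its header; extract that value.
cuda_version=$(nvidia-smi 2>/dev/null | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p')

if version_ge "$cuda_version" "12.8"; then
  echo "CUDA $cuda_version meets the 12.8 minimum"
else
  echo "CUDA version '$cuda_version' is below the required 12.8" >&2
fi
```

`sort -V` handles multi-digit components correctly (for example, 12.10 sorts above 12.8), which a plain string comparison would not.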
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | NVIDIA data center GPUs | | | Not included by default |

NVIDIA T4 and A100 accelerators do not support FP8 (W8A8) quantization.
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | | | x86 | x86 Technology Preview |

AMD GPUs support FP8 (W8A8) and GGUF quantization schemes only.
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | Google v4, v5e, v5p, v6e, Trillium | | x86 Developer Preview | Not supported |
| Container image | vLLM release | AI accelerators | Requirements | vLLM architecture support | LLM Compressor support |
|---|---|---|---|---|---|
| | vLLM v0.10.1.1 | IBM Spyre | | x86 Developer Preview | Not supported |
Chapter 3. Supported deployment environments
The following deployment environments for Red Hat AI Inference Server are supported.
| Environment | Supported versions | Deployment notes |
|---|---|---|
| OpenShift Container Platform (self‑managed) | 4.14 – 4.19 | Deploy on bare‑metal hosts or virtual machines. |
| Red Hat OpenShift Service on AWS (ROSA) | 4.14 – 4.19 | Requires a ROSA cluster with STS and GPU‑enabled P5 or G5 node types. See Prepare your environment for more information. |
| Red Hat Enterprise Linux (RHEL) | 9.2 – 10.0 | Deploy on bare‑metal hosts or virtual machines. |
| Linux (not RHEL) | - | Supported under third‑party policy when deployed on bare‑metal hosts or virtual machines. OpenShift Container Platform Operators are not required. |
| Kubernetes (not OpenShift Container Platform) | - | Supported under third‑party policy when deployed on bare‑metal hosts or virtual machines. |
Red Hat AI Inference Server is available only as a container image. The host operating system and kernel must support the required accelerator drivers. For more information, see Supported AI accelerators.
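Because the product is delivered as a container image, a deployment on RHEL reduces to a container run command. The sketch below prints (rather than executes) one possible single-GPU Podman invocation; the image path, tag, and model name are assumptions for illustration, not official values — check catalog.redhat.com for the real image references.

```shell
# Hypothetical image reference and model; substitute real values.
IMAGE="registry.access.redhat.com/rhaiis/vllm-cuda-rhel9:3.2"
MODEL="RedHatAI/Llama-3.2-1B-Instruct-FP8"

# --device nvidia.com/gpu=all exposes NVIDIA GPUs through CDI;
# --security-opt label=disable avoids SELinux relabeling of the model cache.
cmd="podman run --rm -p 8000:8000 \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  $IMAGE --model $MODEL"

# Print the command so it can be reviewed before running on a GPU host.
echo "$cmd"
```

Run the printed command on a host whose accelerator drivers meet the requirements in Supported AI accelerators.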
Chapter 4. OpenShift Container Platform software prerequisites for GPU deployments
The following table lists the OpenShift Container Platform software prerequisites for GPU deployments.
| Component | Minimum version | Operator |
|---|---|---|
| NVIDIA GPU Operator | 24.3 | |
| AMD GPU Operator | 6.2 | |
| Node Feature Discovery [1] | 4.14 | |

[1] Included by default with OpenShift Container Platform. Node Feature Discovery is required for scheduling NUMA-aware workloads.
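Installed operator versions can be compared against these minimums from the command line. A sketch, assuming `oc` is installed and logged in to the cluster; CSV and namespace names vary by installation method, so the pattern below matches loosely and is an assumption, not an official name.

```shell
# Hypothetical name pattern for the GPU and NFD operator CSVs.
pattern='gpu-operator|node-feature-discovery'

if command -v oc >/dev/null 2>&1; then
  # ClusterServiceVersions carry the installed operator versions.
  oc get csv --all-namespaces | grep -Ei "$pattern" \
    || echo "no matching operator CSVs found"
else
  echo "oc not found; run this from a cluster-connected workstation" >&2
fi
```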
Chapter 5. Lifecycle and update policy
Security and critical bug fixes are delivered as container images available from the registry.access.redhat.com/rhaiis container registry and are announced through RHSA advisories. See RHAIIS container images on catalog.redhat.com for more details.
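Because fixes arrive as rebuilt images, one way to detect that a security rebuild has landed is to compare image digests without pulling. A sketch, assuming `skopeo` is installed; the image path is a placeholder, so substitute a real RHAIIS image reference from catalog.redhat.com.

```shell
# Placeholder image reference; substitute a real one.
IMAGE="registry.access.redhat.com/rhaiis/vllm-cuda-rhel9:3.2"

if command -v skopeo >/dev/null 2>&1; then
  # Reads the manifest digest from the registry without downloading layers.
  skopeo inspect --format '{{.Digest}}' "docker://$IMAGE"
else
  echo "skopeo not found; skipping digest check" >&2
fi
```

Comparing the printed digest against the digest of the running container reveals whether an updated build is available.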