Chapter 1. About this release
Red Hat AI Inference Server is now available. This Red Hat AI Inference Server 3.0 release provides container images that optimize inferencing with large language models (LLMs) for NVIDIA CUDA and AMD ROCm accelerators. The container images are available from registry.redhat.io:
- registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.0.0
- registry.redhat.io/rhaiis/vllm-rocm-rhel9:3.0.0
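As a minimal sketch of how the image references listed above might be used, the following Python snippet builds a `podman pull` command for each accelerator type. It assumes that Podman is installed and that you have already authenticated with `podman login registry.redhat.io`. The helper names `image_reference` and `pull_command` and the `cuda`/`rocm` keys are hypothetical and chosen for illustration only. The snippet prints the commands and does not run them.

```python
import shlex

# Registry and image paths are taken from the list above.
REGISTRY = "registry.redhat.io"
IMAGES = {
    "cuda": "rhaiis/vllm-cuda-rhel9:3.0.0",  # NVIDIA CUDA accelerators
    "rocm": "rhaiis/vllm-rocm-rhel9:3.0.0",  # AMD ROCm accelerators
}


def image_reference(accelerator: str) -> str:
    """Return the full image reference for an accelerator key (hypothetical helper)."""
    if accelerator not in IMAGES:
        raise ValueError(f"unknown accelerator: {accelerator!r}")
    return f"{REGISTRY}/{IMAGES[accelerator]}"


def pull_command(accelerator: str) -> list[str]:
    """Build the argument list for `podman pull` (assumes Podman is installed)."""
    return ["podman", "pull", image_reference(accelerator)]


if __name__ == "__main__":
    # Print the commands rather than executing them, so the sketch has no side effects.
    for name in IMAGES:
        print(shlex.join(pull_command(name)))
```

To actually pull an image, the argument list could be passed to `subprocess.run(pull_command("cuda"), check=True)` on a host where Podman is available.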
With Red Hat AI Inference Server, you can serve models and run inference with higher performance, lower cost, and enterprise-grade stability and security. Red Hat AI Inference Server is built on the upstream, open source vLLM software project.