Chapter 2. Version 3.3.3 release notes
Red Hat AI Inference Server 3.3.3 is a maintenance release containing security fixes and enhancements.
Version 3.3.2 was skipped to align version numbering across Red Hat AI Inference Server components.
The following container images are available from registry.redhat.io:
- registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.3.3
- registry.redhat.io/rhaiis/model-opt-cuda-rhel9:3.3.3
- registry.redhat.io/rhaiis/vllm-rocm-rhel9:3.3.3
- registry.redhat.io/rhaiis/vllm-spyre-rhel9:3.3.3
Red Hat AI Inference Server 3.3.3 packages vLLM v0.13.0 for CUDA, ROCm, and CPU container images, and vLLM v0.11.0 for Spyre container images (Power, Z, x86). These are the same vLLM versions as Red Hat AI Inference Server 3.3.1.
Red Hat AI Model Optimization Toolkit 3.3.3 packages LLM Compressor v0.9.0.3 in the model-opt container image. The Spyre container image packages LLM Compressor v0.7.1.2.
2.1. Security updates in Red Hat AI Inference Server 3.3.3 (May 2026)
This release provides security updates. For a complete list of updates, see the following errata advisories:
- RHSA-2026:16030 (CUDA)
- RHSA-2026:16008 (model-opt)
- RHSA-2026:16009 (ROCm)
- RHSA-2026:16174 (Spyre)
2.2. Enhancements
- TranslateGemma model support for CUDA AI accelerators
The google/translategemma-12b-it multimodal translation model is supported for inference serving on CUDA AI accelerators. TranslateGemma supports text and image translation across 100+ languages.
Note: The model requires the --chat-template-content-format openai flag when starting the server.
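
As an illustration of the note above, the following minimal sketch assembles a vLLM server command line that includes the required flag. The helper name build_serve_command and the port value are hypothetical and chosen only for this example. How the arguments reach the server inside the Red Hat AI Inference Server container image (for example, appended to a podman run command) is an assumption that depends on your deployment.

```python
import shlex

# Model named in the release note above.
MODEL = "google/translategemma-12b-it"


def build_serve_command(model: str = MODEL, port: int = 8000) -> list[str]:
    """Return an argv list for `vllm serve` with the content format flag set.

    Hypothetical helper for illustration; the port default is an assumption.
    """
    return [
        "vllm", "serve", model,
        "--port", str(port),
        # Flag required by the release note for TranslateGemma.
        "--chat-template-content-format", "openai",
    ]


if __name__ == "__main__":
    # Print the command as a single shell-safe string for inspection.
    print(shlex.join(build_serve_command()))
```

Building the command as a list rather than as a single string keeps each argument separate, so it can be passed to a process launcher or appended to a container run command without extra quoting.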