此内容没有您所选择的语言版本。

Chapter 2. Version 3.4.0-ea.1 release notes


Red Hat Enterprise Linux AI is a generative AI inference platform for Linux environments that uses Red Hat AI Inference Server for running and optimizing models, and includes Red Hat AI Model Optimization Toolkit for model quantization, sparsity, and general compression for supported AI accelerators. Red Hat AI Model Optimization Toolkit has native Hugging Face and vLLM support. You can seamlessly integrate optimized models with deployment pipelines for faster, cost-saving inference at scale, powered by the compressed-tensors model format.

Important

Red Hat Enterprise Linux AI 3.4.0-ea.1 is an Early Access release. Early Access releases are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Early Access releases for production or business-critical workloads. Use Early Access releases to test upcoming product features in advance of their possible inclusion in a Red Hat product offering, and to test functionality and provide feedback during the development process. These features might not have any documentation, are subject to change or removal at any time, and testing is limited. Red Hat might provide ways to submit feedback on Early Access features without an associated SLA.

Red Hat Enterprise Linux AI is packaged as a bootc container image for easy deployment on a Linux server appliance with NVIDIA CUDA or AMD ROCm AI accelerators installed. The following container images are available as early access releases from registry.redhat.io:

  • registry.redhat.io/rhelai-early-access/bootc-cuda-rhel9:3.4.0-ea.2
  • registry.redhat.io/rhelai-early-access/bootc-rocm-rhel9:3.4.0-ea.2
Important

There is no direct upgrade path from Red Hat Enterprise Linux AI 1.5 to Red Hat Enterprise Linux AI 3.0. You can upgrade from Red Hat Enterprise Linux AI 3.0 to 3.4 and all versions in-between.

Important

The registry.redhat.io/rhelai-early-access/bootc-rocm-rhel9:3.4.0-ea.2 image does not include Red Hat AI Model Optimization Toolkit, which is not supported for AMD ROCm AI accelerators.

2.1. New features

Red Hat Enterprise Linux AI 3.4.0-ea.1 packages Red Hat AI Inference Server 3.4.0-ea.1, which includes the following highlights:

Upgraded vLLM to v0.14.1
Red Hat AI Inference Server 3.4.0-ea.1 packages the upstream vLLM v0.14.1 release with asynchronous scheduling enabled by default, a new gRPC server entrypoint, auto-context length fitting, and security fixes including token leak prevention in crash logs.
New model support
Red Hat AI Inference Server 3.4.0-ea.1 adds support for Grok-2, Mistral 3, MiMo-V2-Flash, Nemotron Parse 1.1, and various other model architectures. LoRA multimodal support has been expanded for LLaVA, BLIP2, PaliGemma, Pixtral, and GLM4-V models. Tool calling enhancements include FunctionGemma and GLM-4.7 parsers.
Performance improvements
Asynchronous scheduling now overlaps engine core scheduling with GPU execution, improving throughput without manual configuration. CUTLASS MoE optimizations deliver up to 5.3% throughput gain and up to 10.8% time to first token improvement. Fused RoPE and MLA KV-cache write optimization improves DeepSeek-style model performance.
New AI accelerator support
Red Hat AI Inference Server 3.4.0-ea.1 adds RTX PRO 4500 Blackwell Server Edition GPU support for NVIDIA, AITER RMSNorm fusion for AMD, and chunked prefill and prefix caching for IBM Spyre accelerators. CPU backend adds support for head sizes 80 and 112.
Quantization advances
Marlin support extends to Turing (sm75) architecture. New Quark int4-fp8 w4a8 MoE support, MXFP4 W4A16 support for dense models, and ModelOpt FP8 variants are now available.
Large-scale serving updates
Extended Dual-Batch Overlap (XBO) implementation, NVIDIA Inference Xfer Library (NIXL) asymmetric tensor parallelism, and LMCache KV cache registration improve large-scale serving capabilities.

2.2. Known issues

There are no known issues for Red Hat Enterprise Linux AI 3.4.0-ea.1.

Red Hat logoGithubredditYoutubeTwitter

学习

尝试、购买和销售

社区

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

关于红帽文档

Legal Notice

Theme

© 2026 Red Hat
返回顶部