Chapter 3. New features and enhancements


This section describes new features and enhancements in Red Hat OpenShift AI 3.2.

3.1. Enhancements

PostgreSQL mandated as the production persistence layer for Llama Stack

PostgreSQL is now the only supported database for production Llama Stack deployments in OpenShift AI. The default configuration in run.yaml uses PostgreSQL for both core persistence and the RAG file provider.

This enhancement ensures that production deployments use a production-ready persistence layer that meets enterprise performance and scalability requirements. SQLite remains available for local development and testing scenarios where appropriate.
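
For reference, a PostgreSQL-backed persistence block in run.yaml might look like the following sketch. The field names follow upstream Llama Stack kvstore conventions and can differ between versions; the host, database, and credential values are placeholders.

    # Illustrative run.yaml excerpt: core metadata persisted in PostgreSQL.
    # Verify the exact keys against the run.yaml shipped with your release.
    metadata_store:
      type: postgres
      host: postgres.llama-stack.svc.cluster.local  # placeholder service host
      port: 5432
      db: llamastack                                # placeholder database name
      user: llamastack                              # placeholder user
      password: ${env.POSTGRES_PASSWORD}            # injected from a Secret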

Enhanced TLS security for llm-d components

This release enforces strict TLS validation for all internal llm-d component communication. All insecure skip-TLS-verify settings have been removed from the llm-d stack. All internal services, including the Gateway, Scheduler, and vLLM backends, now use TLS certificates automatically signed by the OpenShift Service CA. Clients are configured to trust this CA, so all connections are fully encrypted and validated, which prevents man-in-the-middle (MITM) attacks and enforces a zero-trust security posture.
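
The certificates follow the standard OpenShift Service CA pattern. The llm-d stack wires this up automatically; the following sketch only illustrates the underlying mechanism, and the resource names are hypothetical.

    # A Service annotated so that service-ca generates a signed certificate,
    # and a ConfigMap annotated so that clients receive the CA bundle.
    apiVersion: v1
    kind: Service
    metadata:
      name: llm-d-scheduler                 # hypothetical service name
      annotations:
        # service-ca stores a key pair signed by the Service CA in this Secret
        service.beta.openshift.io/serving-cert-secret-name: llm-d-scheduler-tls
    spec:
      selector:
        app: llm-d-scheduler
      ports:
        - port: 8443
          targetPort: 8443
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: llm-d-ca-bundle                 # hypothetical client-side bundle
      annotations:
        # service-ca injects the CA certificate under the key service-ca.crt,
        # which clients use to validate the serving certificates above
        service.beta.openshift.io/inject-cabundle: "true"
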
Model deployment wizard available from the model catalog

The OpenShift AI user interface for configuring and deploying large language models from the model catalog has been updated to use a new deployment wizard.

This streamlined interface simplifies common deployment scenarios by providing essential configuration options with sensible defaults when deploying models from the model catalog. The deployment wizard reduces setup complexity and helps users to deploy models from the model catalog more efficiently.

Model deployment wizard available from a model registry

The OpenShift AI user interface for configuring and deploying large language models from a model registry has been updated to use a new deployment wizard.

This streamlined interface simplifies common deployment scenarios by providing essential configuration options with sensible defaults when deploying models from a model registry. The deployment wizard reduces setup complexity and helps users to deploy models from a model registry more efficiently.

3.2. New features

Support added to run Red Hat OpenShift AI on OpenShift Kubernetes Engine (OKE)

You can now install and run Red Hat OpenShift AI on OpenShift Kubernetes Engine (OKE). Red Hat provides a specific licensing exception for OpenShift AI users so that the dependent Operators required by Red Hat OpenShift AI can be installed on OKE.

Note

This exception applies exclusively to Operators used to support Red Hat OpenShift AI workloads. Installing or using these Operators for purposes unrelated to Red Hat OpenShift AI is a violation of the OKE service agreement.

To learn more about OKE, see About OpenShift Kubernetes Engine.

Deployment strategy selection for model serving

You can now configure the deployment strategy for model deployments from the OpenShift AI dashboard. You can choose between Rolling update and Recreate strategies.

  • Rolling update (default): Maintains availability by gradually replacing old pods with new ones.
  • Recreate: Terminates the existing pod before starting the new one. This strategy is critical for large language models (LLMs) that consume significant GPU resources, because it prevents the resource contention that occurs when two instances run simultaneously during an update. See the sketch after this list for the underlying Kubernetes settings.
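
Conceptually, the two options correspond to the standard strategy field on the underlying Kubernetes Deployment. The following excerpt is illustrative only; the resource name and image are placeholders.

    # Illustrative Deployment excerpt: Recreate terminates the old pod before
    # the new one starts, so only one replica ever holds the GPUs.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-llm-predictor            # placeholder name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-llm
      strategy:
        type: Recreate                  # Rolling update uses type: RollingUpdate
      template:
        metadata:
          labels:
            app: my-llm
        spec:
          containers:
            - name: server
              image: example.com/vllm:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: 1
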
New chat functionality in Generative AI Studio

You can now start a new conversation in the Playground by clicking the New Chat button. Clicking the button clears the chat history while preserving your Playground configuration settings.

Enhanced filtering for serving runtime selection

Red Hat OpenShift AI now includes improved filtering and distinct recommendations for selecting a serving runtime. You can choose how the serving runtime is determined by using the following options:

  • Auto-select the best runtime for your model based on model type, model format, and hardware profile: This option automatically selects a serving runtime when there is exactly one match. Matching also takes the hardware profile's accelerator into account. For example, if you have a hardware profile with the NVIDIA GPU accelerator, the system suggests the vLLM NVIDIA GPU ServingRuntime for KServe runtime (see the template excerpt after this list).

    Note

    If a cluster administrator enables the Use distributed inference with llm-d by default when deploying generative models option in the administrator settings, the system suggests the Distributed inference with llm-d runtime instead.

  • Select from a list of serving runtimes, including custom ones: This option displays all global and project-scoped serving runtime templates available to you.
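
Auto-selection works against the metadata that each serving runtime template declares. The following ServingRuntime excerpt is illustrative; the annotation value and model format names vary by runtime, and the name and image are placeholders.

    # Illustrative ServingRuntime excerpt: supportedModelFormats with
    # autoSelect drives runtime matching; the recommended-accelerators
    # annotation feeds hardware profile matching in the dashboard.
    apiVersion: serving.kserve.io/v1alpha1
    kind: ServingRuntime
    metadata:
      name: vllm-nvidia-gpu-runtime     # placeholder name
      annotations:
        opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
    spec:
      supportedModelFormats:
        - name: vLLM
          autoSelect: true
      containers:
        - name: kserve-container
          image: example.com/vllm:latest   # placeholder image
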
Feature Store integration with workbenches

Feature Store is now fully integrated with data science projects and workbenches. Capabilities such as centrally managed role-based access control (RBAC) and feature lifecycle and lineage visibility are now production-ready and fully supported. You can use Feature Store to standardize feature reuse and governance across projects, which allows data scientists to work within workbenches while platform teams maintain centralized control, security, and scalability.

Feature Store now supports the AI computing frameworks Ray and Apache Spark. These frameworks enable scalable, distributed feature engineering for machine learning (ML) and generative AI workloads.
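
Spark-backed feature engineering is configured through the standard Feast feature_store.yaml. The following sketch is illustrative only: store types and keys depend on the Feast version included with OpenShift AI, and every value shown is a placeholder.

    # Illustrative feature_store.yaml sketch with a Spark offline store.
    project: my_project                  # placeholder project name
    registry: s3://feast/registry.db     # placeholder registry location
    provider: local
    offline_store:
      type: spark                        # distributed feature retrieval via Spark
      spark_conf:
        spark.master: "local[*]"         # placeholder; point at your Spark cluster
    online_store:
      type: postgres                     # placeholder online store
      host: postgres.feast.svc           # placeholder host
      port: 5432
      database: feast
      user: feast
      password: ${POSTGRES_PASSWORD}     # placeholder; read from the environment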
