Chapter 5. Developer Preview features


Important

This section describes Developer Preview features in Red Hat OpenShift AI 3.0. Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview features for production or business-critical workloads. Developer Preview features provide early access to functionality in advance of possible inclusion in a Red Hat product offering. Customers can use these features to test functionality and provide feedback during the development process. Developer Preview features might not have any documentation, are subject to change or removal at any time, and have received limited testing. Red Hat might provide ways to submit feedback on Developer Preview features without an associated SLA.

For more information about the support scope of Red Hat Developer Preview features, see Developer Preview Support Scope.

Model-as-a-Service (MaaS) integration

This feature is available as a Developer Preview.

OpenShift AI now includes Model-as-a-Service (MaaS) to address resource consumption and governance challenges associated with serving large language models (LLMs).

MaaS provides centralized control over model access and resource usage by exposing models through managed API endpoints, allowing administrators to enforce consumption policies across teams.

This Developer Preview introduces the following capabilities:

  • Policy and quota management
  • Authentication and authorization
  • Usage tracking
  • User management
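
For illustration, the following minimal sketch shows how a team might consume a model through a MaaS-managed endpoint. The endpoint URL, model name, and token are placeholders, and the OpenAI-compatible API surface is an assumption rather than a documented MaaS contract:

```python
# Minimal sketch: consuming a MaaS-managed model endpoint.
# The base_url, api_key, and model id below are placeholders; MaaS is assumed
# to expose an OpenAI-compatible endpoint, which is not guaranteed here.
from openai import OpenAI

client = OpenAI(
    base_url="https://maas.example.com/v1",  # managed endpoint issued by MaaS
    api_key="team-api-token",                # token granted under a team quota
)

reply = client.chat.completions.create(
    model="granite-3-8b-instruct",           # hypothetical published model
    messages=[{"role": "user", "content": "Summarize this incident report."}],
)
print(reply.choices[0].message.content)
```

Because every call goes through the managed endpoint, administrators can meter usage and enforce quotas per token rather than per deployment.
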
AI Available Assets integration with Model-as-a-Service (MaaS)

This feature is available as a Developer Preview.

You can now access and consume Model-as-a-Service (MaaS) models directly from the AI Available Assets page in the GenAI Studio.

Administrators can configure a model as a service by enabling the toggle on the Model Deployments page. When a model is marked as a service, it becomes global and is visible across all projects in the cluster.

Additional fields added to Model Deployments for AI Available Assets integration

This feature is available as a Developer Preview.

Administrators can now add metadata to models during deployment so that they are automatically listed on the AI Available Assets page.

The following table describes the new metadata fields that streamline the process of making models discoverable and consumable by other teams:

| Field name | Field type | Description |
| --- | --- | --- |
| Use Case | Free-form text | Describes the model’s primary purpose, for example, "Customer Churn Prediction" or "Image Classification for Product Catalog". |
| Description | Free-form text | Provides more detailed context and functionality notes for the model. |
| Add to AI Assets | Checkbox | When enabled, automatically publishes the model and its metadata to the AI Available Assets page. |

Compatibility of Llama Stack remote providers and SDK with MCP HTTP streaming protocol

This feature is available as a Developer Preview.

Llama Stack remote providers and the SDK are now compatible with the Model Context Protocol (MCP) HTTP streaming protocol.

This enhancement enables developers to build fully stateless MCP servers, simplify deployment on standard Llama Stack infrastructure (including serverless environments), and improve scalability. It also prepares for future enhancements such as connection resumption and provides a smooth transition away from Server-Sent Events (SSE).
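
As an illustration, the following sketch registers an MCP toolgroup that uses the HTTP streaming transport with a Llama Stack client. The server URL and toolgroup ID are placeholders; verify the registration call against your installed llama-stack-client version:

```python
# Sketch: registering an MCP toolgroup that uses the HTTP streaming transport
# instead of the older SSE endpoint. URLs and IDs are placeholders.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

client.toolgroups.register(
    toolgroup_id="mcp::example-tools",                  # hypothetical toolgroup id
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://localhost:8000/mcp"},  # streamable HTTP endpoint
)

# List the tools exposed by the newly registered MCP server.
for tool in client.tools.list(toolgroup_id="mcp::example-tools"):
    print(tool.identifier)
```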

Packaging of ITS Hub dependencies to the Red Hat-maintained Python index

This feature is available as a Developer Preview.

All Inference Time Scaling (ITS) runtime dependencies are now packaged in the Red Hat-maintained Python index, allowing Red Hat AI and OpenShift AI customers to install its_hub and its dependencies directly by using pip.

This enhancement enables users to build custom inference images with ITS algorithms that improve model accuracy at inference time without requiring model retraining, such as the following (see the installation sketch after this list):

  • Particle filtering
  • Best-of-N
  • Beam search
  • Self-consistency
  • Verifier or PRM-guided search

    For more information, see the ITS Hub on GitHub.
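
As a minimal illustration, installation and verification might look like the following. The index URL is a placeholder for the Red Hat-maintained Python index, and no its_hub API calls are shown because the library interface is not described here:

```python
# Sketch: install its_hub from the Red Hat-maintained Python index and verify
# the installation. The index URL below is a placeholder, not the real URL.
#
#   pip install its_hub --extra-index-url https://<red-hat-python-index>/simple
#
from importlib.metadata import version

print(version("its_hub"))  # confirms the package and its dependencies resolved
```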

Dynamic hardware-aware continual training strategy

This feature is available as a Developer Preview.

Static hardware profile support is now available to help users select training methods, models, and hyperparameters based on VRAM requirements and reference benchmarks. This approach ensures predictable and reliable training workflows without dynamic hardware discovery.

The following components are included:

  • API Memory Estimator: Accepts model, training method, dataset metadata, and assumed hyperparameters as input and returns an estimated VRAM requirement for the training job. Delivered as an API within Training Hub. (A hypothetical sketch of this contract follows the note below.)
  • Reference Profiles and Benchmarks: Provides end-to-end training time benchmarks for OpenShift AI Innovation (OSFT) and Performance Team (LAB SFT) baselines, delivered as static tables and documentation in Training Hub.
  • Hyperparameter Guidance: Publishes safe starting ranges for key hyperparameters such as learning rate, batch size, epochs, and LoRA rank. Integrated into example notebooks maintained by the AI Innovation team.

    Important

    Hardware discovery is not included in this release. Only static reference tables and guidance are provided; automated GPU or CPU detection is not yet supported.
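
The following hypothetical sketch illustrates the Memory Estimator contract described above: inputs that describe the model, training method, and hyperparameters map to an estimated VRAM figure. The function name, fields, and constants are illustrative assumptions, not the Training Hub API:

```python
# Hypothetical sketch of the Memory Estimator contract described above.
# The function name, inputs, and constants are illustrative assumptions,
# not the actual Training Hub API.
def estimate_vram_gb(
    model_params_b: float,      # model size in billions of parameters
    method: str,                # "full" or "lora"
    batch_size: int,
    seq_len: int,
    lora_rank: int | None = None,
) -> float:
    """Return a rough VRAM estimate in GB for a training job (illustrative)."""
    bytes_per_param = 2  # assume bf16 weights
    weights = model_params_b * 1e9 * bytes_per_param
    if method == "lora" and lora_rank is not None:
        # adapters only: optimizer state is a small fraction of the weights
        optimizer = weights * 0.02 * (lora_rank / 16)
    else:
        # full fine-tuning: gradients plus Adam moments (very rough)
        optimizer = weights * 6
    activations = batch_size * seq_len * 1e5  # crude activation-memory term
    return (weights + optimizer + activations) / 1e9

print(f"{estimate_vram_gb(8, 'lora', batch_size=4, seq_len=4096, lora_rank=16):.1f} GB")
```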

Human-in-the-Loop (HIL) functionality in the Llama Stack agent

This feature is available as a Developer Preview.

Human-in-the-Loop (HIL) functionality has been added to the Llama Stack agent to allow users to approve untrusted tool calls before execution.

This enhancement includes the following capabilities (a sketch of the approval flow follows this list):

  • Users can approve or reject untrusted tool calls through the Responses API.
  • Configuration options specify which tool calls require HIL approval.
  • Tool calls pause until user approval is received for HIL-enabled tools.
  • Tool calls that do not require HIL continue to run without interruption.
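
The sketch below illustrates this approval flow against an OpenAI-compatible Responses API, which Llama Stack exposes. The base URL, model ID, and tool server URL are placeholders, and the approval item types follow the OpenAI MCP approval pattern, so treat them as assumptions for the Llama Stack implementation:

```python
# Sketch of the HIL approval flow over an OpenAI-compatible Responses API.
# The base_url, model id, and tool server URL are placeholders; the item
# types follow the OpenAI MCP approval pattern and are assumptions here.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

resp = client.responses.create(
    model="llama-3-8b-instruct",  # hypothetical model id
    tools=[{
        "type": "mcp",
        "server_label": "files",
        "server_url": "http://localhost:8000/mcp",
        "require_approval": "always",  # mark these tool calls as HIL-enabled
    }],
    input="Delete build artifacts older than 30 days.",
)

# The run pauses on an approval request; approve or reject it to continue.
for item in resp.output:
    if item.type == "mcp_approval_request":
        resp = client.responses.create(
            model="llama-3-8b-instruct",
            previous_response_id=resp.id,
            input=[{
                "type": "mcp_approval_response",
                "approval_request_id": item.id,
                "approve": True,  # set to False to reject the call
            }],
        )
print(resp.output_text)
```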