Chapter 5. Developer Preview features
This section describes Developer Preview features in Red Hat OpenShift AI 3.0. Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview features for production or business-critical workloads. Developer Preview features provide early access to functionality in advance of possible inclusion in a Red Hat product offering. Customers can use these features to test functionality and provide feedback during the development process. Developer Preview features might not have any documentation, are subject to change or removal at any time, and have received limited testing. Red Hat might provide ways to submit feedback on Developer Preview features without an associated SLA.
For more information about the support scope of Red Hat Developer Preview features, see Developer Preview Support Scope.
- Model-as-a-Service (MaaS) integration
This feature is available as a Developer Preview.
OpenShift AI now includes Model-as-a-Service (MaaS) to address resource consumption and governance challenges associated with serving large language models (LLMs).
MaaS provides centralized control over model access and resource usage by exposing models through managed API endpoints, allowing administrators to enforce consumption policies across teams.
This Developer Preview introduces the following capabilities:
- Policy and quota management
- Authentication and authorization
- Usage tracking
- User management
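For illustration, a team granted access to a managed endpoint could consume it with any OpenAI-compatible client. The following sketch is purely hypothetical: the endpoint URL, model name, and token variable are placeholder assumptions, not values defined by OpenShift AI.

```python
# Hypothetical sketch: consuming a MaaS-managed endpoint with an
# OpenAI-compatible client. The URL, model name, and token variable are
# placeholders, not values defined by OpenShift AI.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://maas.example.com/v1",    # placeholder MaaS endpoint
    api_key=os.environ["MAAS_ACCESS_TOKEN"],   # token issued to the team
)

# The MaaS gateway authenticates the request and counts it against the
# team's quota before it reaches the model server.
response = client.chat.completions.create(
    model="granite-3-8b-instruct",             # placeholder model name
    messages=[{"role": "user", "content": "Summarize the quarterly report."}],
)
print(response.choices[0].message.content)
```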
- AI Available Assets integration with Model-as-a-Service (MaaS)
This feature is available as a Developer Preview.
You can now access and consume Model-as-a-Service (MaaS) models directly from the AI Available Assets page in the GenAI Studio.
Administrators can configure a model as a service by enabling the toggle on the Model Deployments page. When a model is marked as a service, it becomes global and visible across all projects in the cluster.
- Additional fields added to Model Deployments for AI Available Assets integration
This feature is available as a Developer Preview.
Administrators can now add metadata to models during deployment so that they are automatically listed on the AI Available Assets page.
The following table describes the new metadata fields that streamline the process of making models discoverable and consumable by other teams:
| Field name | Field type | Description |
|---|---|---|
| Use Case | Free-form text | Describes the model’s primary purpose, for example, "Customer Churn Prediction" or "Image Classification for Product Catalog." |
| Description | Free-form text | Provides more detailed context and functionality notes for the model. |
| Add to AI Assets | Checkbox | When enabled, automatically publishes the model and its metadata to the AI Available Assets page. |
- Compatibility of Llama Stack remote providers and SDK with MCP HTTP streaming protocol
This feature is available as a Developer Preview.
Llama Stack remote providers and the SDK are now compatible with the Model Context Protocol (MCP) HTTP streaming protocol.
This enhancement enables developers to build fully stateless MCP servers, simplify deployment on standard Llama Stack infrastructure (including serverless environments), and improve scalability. It also prepares for future enhancements such as connection resumption and provides a smooth transition away from Server-Sent Events (SSE).
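For example, the streamable HTTP transport lets an MCP server run without per-session state. The following is a minimal sketch using the upstream MCP Python SDK (the mcp package); it assumes a recent SDK release that supports stateless streamable HTTP and is not OpenShift AI-specific code.

```python
# Minimal sketch of a stateless MCP server using the upstream MCP Python SDK
# with its streamable HTTP transport. Assumes a recent "mcp" release; the
# server name and tool are illustrative only.
from mcp.server.fastmcp import FastMCP

# stateless_http=True keeps no per-session state between requests, which is
# what makes serverless deployment and horizontal scaling straightforward.
mcp = FastMCP("inventory-tools", stateless_http=True)

@mcp.tool()
def count_items(category: str) -> int:
    """Return a dummy item count for the given category."""
    return {"widgets": 42, "gadgets": 7}.get(category, 0)

if __name__ == "__main__":
    # Serve over the MCP streamable HTTP transport instead of SSE.
    mcp.run(transport="streamable-http")
```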
- Packaging of ITS Hub dependencies to the Red Hat–maintained Python index
This feature is available as a Developer Preview.
All Inference Time Scaling (ITS) runtime dependencies are now packaged in the Red Hat-maintained Python index, allowing Red Hat AI and OpenShift AI customers to install its_hub and its dependencies directly by using pip.
This enhancement enables users to build custom inference images with ITS algorithms that improve model accuracy at inference time without requiring model retraining, such as:
- Particle filtering
- Best-of-N
- Beam search
- Self-consistency
- Verifier or PRM-guided search
For more information, see the ITS Hub on GitHub.
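As an illustration, the package can be installed during an image build or in a setup script. The index URL below is a placeholder, not a real Red Hat URL; substitute the Python index configured for your environment.

```python
# Illustrative sketch only: installing its_hub from a Red Hat-maintained
# Python index. "<index-url>" is a placeholder, not a real URL.
import subprocess
import sys

subprocess.check_call([
    sys.executable, "-m", "pip", "install",
    "--index-url", "<index-url>",  # placeholder index URL
    "its_hub",
])

# Once installed, its_hub and its ITS algorithms (best-of-N, beam search,
# and so on) are available when building a custom inference image.
```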
- Dynamic hardware-aware continual training strategy
This feature is available as a Developer Preview.
Static hardware profile support is now available to help users select training methods, models, and hyperparameters based on VRAM requirements and reference benchmarks. This approach ensures predictable and reliable training workflows without dynamic hardware discovery.
The following components are included:
- API Memory Estimator: Accepts the model, training method, dataset metadata, and assumed hyperparameters as input and returns an estimated VRAM requirement for the training job. Delivered as an API within Training Hub (see the illustrative sketch at the end of this item).
- Reference Profiles and Benchmarks: Provides end-to-end training time benchmarks for OpenShift AI Innovation (OSFT) and Performance Team (LAB SFT) baselines, delivered as static tables and documentation in Training Hub.
- Hyperparameter Guidance: Publishes safe starting ranges for key hyperparameters such as learning rate, batch size, epochs, and LoRA rank. Integrated into example notebooks maintained by the AI Innovation team.
Important: Hardware discovery is not included in this release. Only static reference tables and guidance are provided; automated GPU or CPU detection is not yet supported.
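The following arithmetic sketch is purely illustrative of the kind of estimate the Memory Estimator returns; it is not the Training Hub API. It assumes full fine-tuning with bf16 weights and gradients, fp32 AdamW optimizer states, and a flat activation allowance.

```python
# Illustrative arithmetic only; NOT the Training Hub Memory Estimator API.
# Assumptions: full fine-tuning, bf16 weights and gradients (2 bytes each),
# fp32 AdamW moments (8 bytes per parameter), flat activation allowance.

def estimate_vram_gb(num_params_billion: float,
                     activation_overhead_gb: float = 10.0) -> float:
    """Return a rough VRAM estimate in GB for full fine-tuning."""
    params = num_params_billion * 1e9
    bytes_per_param = 2 + 2 + 8  # weights + gradients + optimizer states
    model_state_gb = params * bytes_per_param / 1024**3
    return model_state_gb + activation_overhead_gb

# Example: an 8B-parameter model needs about 8e9 * 12 bytes, roughly 89 GB,
# for model states alone, so roughly 99 GB with the activation allowance.
print(f"{estimate_vram_gb(8):.0f} GB")
```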
- Human-in-the-Loop (HIL) functionality in the Llama Stack agent
This feature is available as a Developer Preview.
Human-in-the-Loop (HIL) functionality has been added to the Llama Stack agent to allow users to approve pending tool calls before execution.
This enhancement includes the following capabilities:
- Users can approve or reject pending tool calls through the Responses API (see the sketch after this list).
- Configuration options specify which tool calls require HIL approval.
- Tool calls pause until user approval is received for HIL-enabled tools.
- Tool calls that do not require HIL continue to run without interruption.
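The following sketch shows what an approval round trip might look like. It follows the OpenAI-compatible Responses API approval pattern; the exact field names in the Llama Stack Responses API, the base URL, and the model name are assumptions and may differ.

```python
# Hedged sketch of a HIL approval round trip against a Llama Stack Responses
# API endpoint. Field names follow the OpenAI-compatible approval pattern and
# may differ in Llama Stack; the URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")  # placeholder endpoint

# First turn: the model wants to call a HIL-enabled tool. Instead of running
# it, the response carries an approval request and the tool call pauses.
first = client.responses.create(
    model="granite-3-8b-instruct",  # placeholder model name
    input="Delete staging database snapshots older than 30 days.",
)
approvals = [item for item in first.output if item.type == "mcp_approval_request"]

# Second turn: the user approves (or rejects) the pending call; execution
# resumes only for approved tool calls.
if approvals:
    client.responses.create(
        model="granite-3-8b-instruct",
        previous_response_id=first.id,
        input=[{
            "type": "mcp_approval_response",
            "approval_request_id": approvals[0].id,
            "approve": True,
        }],
    )
```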