Chapter 4. Technology Preview features
This section describes Technology Preview features in Red Hat OpenShift AI 3.2. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
- Llama Stack versions in OpenShift AI 3.2
- OpenShift AI 3.2.0 uses the Open Data Hub Llama Stack version 0.3.5+rhai0 in the Llama Stack Distribution, which is based on the upstream Llama Stack version 0.3.5.
- Llama Stack servers now require installation of the PostgreSQL Operator
- In OpenShift AI 3.2, the PostgreSQL Operator is now required to deploy a Llama Stack server. For more information, see the Deploying a Llama Stack server documentation.
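A minimal `LlamaStackDistribution` custom resource might look like the following sketch. The `apiVersion`, distribution name, model name, and port shown here are illustrative assumptions, not values taken from this release; consult the Deploying a Llama Stack server documentation for the exact fields supported in your version:

```yaml
# Illustrative sketch of a LlamaStackDistribution custom resource.
# The apiVersion, distribution name, and environment values below are
# assumptions for demonstration purposes only.
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: example-llamastack
spec:
  replicas: 1
  server:
    distribution:
      name: rh-dev          # assumed distribution name
    containerSpec:
      port: 8321
      env:
        - name: INFERENCE_MODEL
          value: "llama-3-2-3b-instruct"   # assumed model name
```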
- Enabling high availability on Llama Stack
- As a Technology Preview feature, you can configure Llama Stack servers to remain operational when a single component fails. To do so, enable the PostgreSQL high-availability settings in your `LlamaStackDistribution` custom resource. For more information, see the Enabling high availability on Llama Stack (Optional) documentation.
- Custom embeddings on Llama Stack
- OpenShift AI 3.2 allows you to customize your embedding models as a Technology Preview feature. In the version of Llama Stack shipped in OpenShift AI 3.2, vLLM controls embeddings by default. You can update the `VLLM_EMBEDDING_URL` environment variable in your `LlamaStackDistribution` custom resource to enable embeddings, or you can use a custom embeddings provider. For example:

  ```yaml
  - name: ENABLE_SENTENCE_TRANSFORMERS
    value: "true"
  - name: EMBEDDING_PROVIDER
    value: "sentence-transformers"
  ```
- NVIDIA NeMo Guardrails
- You can use NVIDIA NeMo Guardrails as a Technology Preview feature to add guardrails and safety controls to your deployed models in Red Hat OpenShift AI. NeMo Guardrails provides a framework for controlling conversations with large language models, enabling you to define a variety of rails, such as sensitive data detection, content filtering, or custom validation rules.
- Stop button for chatbot in Generative AI Studio
- You can interrupt the chatbot as it is composing a response to a prompt. In the Playground, after you send a prompt, the Send button in the chat input field changes to a Stop button. Click it if you want to interrupt the model’s response, for example, when the response takes longer than you anticipated or if you notice that you made an error in your prompt. The chatbot posts "You stopped this message" to confirm your stop request.
- Kubeflow Trainer v2
Kubeflow Trainer v2 is now available as a Technology Preview feature in OpenShift AI 3.2.
Kubeflow Trainer v2 is the next generation of distributed training for OpenShift AI, replacing the Kubeflow Training Operator v1 (KFTOv1). This Kubernetes-native solution simplifies how data scientists and ML engineers run PyTorch training workloads at scale using a unified TrainJob API and Python SDK.
This Technology Preview release introduces the following capabilities:
- Simplified job definitions using TrainJob and TrainingRuntime resources
- Python SDK for programmatic job creation and management
- A new web-based user interface for inspecting and interacting with training jobs
- Real-time progress tracking with visibility into training steps, epochs, and metrics
- Smart checkpoint management with automatic preservation during pod preemption or termination
- Pausing and resuming train jobs
- Resource-aware scheduling via native integration with Red Hat build of Kueue
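The unified TrainJob API described above can be sketched as follows. The `apiVersion`, runtime name, and resource values in this example are illustrative assumptions; see the Kubeflow Trainer v2 documentation for the training runtimes and fields available in your cluster:

```yaml
# Illustrative TrainJob sketch. The runtime name and resource values
# are assumptions for demonstration, not values from this release.
apiVersion: trainer.kubeflow.org/v1alpha1
kind: TrainJob
metadata:
  name: pytorch-example
spec:
  runtimeRef:
    name: torch-distributed   # assumed TrainingRuntime name
  trainer:
    numNodes: 2               # number of training nodes
    resourcesPerNode:
      limits:
        nvidia.com/gpu: 1
```

A `TrainJob` references a `TrainingRuntime` (or `ClusterTrainingRuntime`) through `runtimeRef`, so job definitions stay short while the runtime carries the framework-specific configuration.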
Users of the deprecated Kubeflow Training Operator v1 (KFTOv1) should migrate their workloads to Kubeflow Trainer v2 before KFTOv1 is removed. For guidance and more details, see the migration guide.
For more information about Kubeflow Trainer v2 features and usage, see the Kubeflow Trainer v2 documentation.