Chapter 4. Technology Preview features

Important

This section describes Technology Preview features in Red Hat OpenShift AI 3.2. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

pgvector support as a remote vector store provider in Llama Stack

Starting with OpenShift AI 3.2, you can use PostgreSQL with the pgvector extension as a remote vector store provider for the Llama Stack vector_store endpoint as a Technology Preview feature.

This enhancement enables vector storage backed by PostgreSQL, providing durable and transactional persistence for vector embeddings. For more information, see Llama Stack API provider support and Deploying a PostgreSQL instance with pgvector.

Llama Stack versions in OpenShift AI 3.2: OpenShift AI 3.2.0 uses the Open Data Hub Llama Stack version 0.3.5+rhai0 in the Llama Stack Distribution, which is based on the upstream Llama Stack version 0.3.5.

Llama Stack servers now require installation of the PostgreSQL Operator: In OpenShift AI 3.2, you must install the PostgreSQL Operator to deploy a Llama Stack server. For more information, see the Deploying a Llama Stack server documentation.

Enabling high availability on Llama Stack: As a Technology Preview feature, you can configure Llama Stack servers to remain operational if a single component fails. To do so, enable the PostgreSQL high-availability settings in your LlamaStackDistribution custom resource. For more information, see the Enabling high availability on Llama Stack (Optional) documentation.

Custom embeddings on Llama Stack

As a Technology Preview feature, OpenShift AI 3.2 allows you to customize your embedding models. In the version of Llama Stack shipped in OpenShift AI 3.2, vLLM handles embeddings by default. To enable embeddings, either set the VLLM_EMBEDDING_URL environment variable in your LlamaStackDistribution custom resource or use a custom embedding provider. For example, the following environment variables select the sentence-transformers provider:

  - name: ENABLE_SENTENCE_TRANSFORMERS
    value: "true"
  - name: EMBEDDING_PROVIDER
    value: "sentence-transformers"

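The following sketch shows where such environment variables might sit in a LlamaStackDistribution custom resource. It is illustrative only: the API version, the spec.server.containerSpec.env path, the resource name, the distribution name, and the commented-out URL are assumptions that do not come from this section, so check them against the Deploying a Llama Stack server documentation before use.

```yaml
# Illustrative sketch, not a tested manifest.
# Assumed: apiVersion, field layout, and all names below.
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: example-llama-stack          # hypothetical resource name
spec:
  replicas: 1
  server:
    distribution:
      name: rh-dev                   # assumed distribution name
    containerSpec:
      env:
        # Custom embedding provider, as in the example above.
        - name: ENABLE_SENTENCE_TRANSFORMERS
          value: "true"
        - name: EMBEDDING_PROVIDER
          value: "sentence-transformers"
        # Alternative: use a vLLM embedding endpoint instead.
        # The URL is a hypothetical placeholder.
        # - name: VLLM_EMBEDDING_URL
        #   value: "http://embedding-model.example.svc.cluster.local:8000/v1"
```

Keeping only one of the two options active makes it clear which embedding path the server is expected to use.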

NVIDIA NeMo Guardrails: You can use NVIDIA NeMo Guardrails as a Technology Preview feature to add guardrails and safety controls to your deployed models in Red Hat OpenShift AI. NeMo Guardrails provides a framework for controlling conversations with large language models, enabling you to define a variety of rails, such as sensitive data detection, content filtering, or custom validation rules.
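
As a rough illustration of what a rail definition can look like, the following fragment follows the configuration layout used by the upstream NeMo Guardrails project for sensitive data detection. How OpenShift AI exposes or mounts this configuration is not described in this section, so treat the file name and the chosen entities as assumptions. A complete configuration also needs a model definition, which is omitted here.

```yaml
# config.yml fragment (upstream NeMo Guardrails layout; assumed for illustration).
rails:
  config:
    sensitive_data_detection:
      input:
        # Entity types to look for in user input (example selection).
        entities:
          - PERSON
          - EMAIL_ADDRESS
  input:
    flows:
      # Upstream flow that checks user input for the entities listed above.
      - detect sensitive data on input
```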

Stop button for chatbot in Generative AI Studio: You can interrupt the chatbot as it is composing a response to a prompt. In the Playground, after you send a prompt, the Send button in the chat input field changes to a Stop button. Click it if you want to interrupt the model’s response, for example, when the response takes longer than you anticipated or if you notice that you made an error in your prompt. The chatbot posts "You stopped this message" to confirm your stop request.

Kubeflow Trainer v2

Kubeflow Trainer v2 is now available as a Technology Preview feature in OpenShift AI 3.2.

Kubeflow Trainer v2 is the next generation of distributed training for OpenShift AI, replacing the Kubeflow Training Operator v1 (KFTOv1). This Kubernetes-native solution simplifies how data scientists and ML engineers run PyTorch training workloads at scale using a unified TrainJob API and Python SDK.

This Technology Preview release introduces the following capabilities:

Simplified job definitions using TrainJob and TrainingRuntime resources
Python SDK for programmatic job creation and management
A new web-based user interface for inspecting and interacting with training jobs
Real-time progress tracking with visibility into training steps, epochs, and metrics
Smart checkpoint management with automatic preservation during pod preemption or termination
Pausing and resuming training jobs
Resource-aware scheduling via native integration with Red Hat build of Kueue

Users of the deprecated Kubeflow Training Operator v1 (KFTOv1) should migrate their workloads to Kubeflow Trainer v2 before KFTOv1 is removed. For guidance and more details, see the migration guide.

For more information about Kubeflow Trainer v2 features and usage, see the Kubeflow Trainer v2 documentation.
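
To give a sense of the simplified job definitions mentioned above, the following is a minimal TrainJob sketch based on the upstream Kubeflow Trainer v2 API. The runtime name, container image, command, and resource values are hypothetical placeholders, and the runtimes available in OpenShift AI might differ, so consult the Kubeflow Trainer v2 documentation for the supported values.

```yaml
# Illustrative sketch based on the upstream Kubeflow Trainer v2 API.
# Assumed: runtime name, image, command, and resource values.
apiVersion: trainer.kubeflow.org/v1alpha1
kind: TrainJob
metadata:
  name: example-pytorch-job          # hypothetical job name
spec:
  # Reference to a runtime that defines how the job is launched.
  runtimeRef:
    name: torch-distributed          # assumed runtime name
  # Set to true to pause the job; set back to false to resume it.
  suspend: false
  trainer:
    numNodes: 2                      # number of training nodes
    image: quay.io/example/pytorch-train:latest   # hypothetical image
    command:
      - python
      - train.py                     # hypothetical training script
    resourcesPerNode:
      requests:
        cpu: "4"
        memory: 16Gi
```

The job itself carries only the workload-specific settings, while the referenced runtime holds the reusable launch configuration.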
