Chapter 4. Developer Preview features
This section describes Developer Preview features in Red Hat OpenShift AI 2.22. Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview features for production or business-critical workloads. Developer Preview features provide early access to functionality in advance of possible inclusion in a Red Hat product offering. Customers can use these features to test functionality and provide feedback during the development process. Developer Preview features might not have any documentation, are subject to change or removal at any time, and have received limited testing. Red Hat might provide ways to submit feedback on Developer Preview features without an associated SLA.
For more information about the support scope of Red Hat Developer Preview features, see Developer Preview Support Scope.
- Llama Stack Developer Preview: Build Generative AI Apps with OpenShift AI
With this release, the Llama Stack Developer Preview feature on OpenShift AI enables Retrieval-Augmented Generation (RAG) and agentic workflows for building next-generation generative AI applications. It supports remote inference, built-in embeddings, and vector database operations. It also integrates with providers such as the TrustyAI provider for safety and the TrustyAI LM-Eval provider for evaluation.
This preview includes tools, components, and guidance for enabling the Llama Stack Operator, interacting with the RAG Tool, and automating PDF ingestion and keyword search capabilities to enhance document discovery.
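For illustration, the following is a minimal sketch of the RAG workflow against a running Llama Stack server, using the upstream `llama-stack-client` Python package. The server URL, vector database ID, and embedding model are placeholder assumptions, not values mandated by this release.

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types import Document

# Placeholder URL for a Llama Stack server running on the cluster
client = LlamaStackClient(base_url="http://llamastack-service:8321")

# Register a vector database backed by an embedding model
client.vector_dbs.register(
    vector_db_id="my-docs",              # placeholder ID
    embedding_model="all-MiniLM-L6-v2",  # assumed embedding model
)

# Ingest a document; the RAG tool chunks and embeds it
client.tool_runtime.rag_tool.insert(
    documents=[
        Document(
            document_id="doc-1",
            content="OpenShift AI supports RAG and agentic workflows.",
            mime_type="text/plain",
            metadata={},
        )
    ],
    vector_db_id="my-docs",
    chunk_size_in_tokens=256,
)

# Retrieve relevant chunks for a query through the RAG tool
result = client.tool_runtime.rag_tool.query(
    content="Which workflows does OpenShift AI support?",
    vector_db_ids=["my-docs"],
)
print(result.content)
```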
- Run evaluations for TrustyAI-Llama Stack safety and Guardrails
You can now run evaluations and apply Guardrails on Llama Stack with TrustyAI as a Developer Preview feature, using the built-in LM-Eval component and advanced content moderation tools. To use this feature, ensure that TrustyAI is enabled, that the FMS Orchestrator and detectors are set up, and, if needed, that KServe RawDeployment mode is in use for full compatibility. Beyond these prerequisites, no manual setup is required.
Then, in the `DataScienceCluster` custom resource for the Red Hat OpenShift AI Operator, set the `spec.llamastackoperator.managementState` field to `Managed`.
For more information, see the related resources on GitHub.
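For illustration, the change looks like the following minimal `DataScienceCluster` excerpt; the resource name `default-dsc` is a common default, used here as a placeholder:

```yaml
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc        # placeholder name
spec:
  llamastackoperator:
    managementState: Managed
```

Once a shield is registered through the guardrails setup, it can be invoked from the `llama-stack-client` API. This sketch assumes a hypothetical shield ID and server URL:

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://llamastack-service:8321")  # placeholder URL

# Run a registered safety shield against a user message;
# "content-safety" is a hypothetical shield ID
response = client.safety.run_shield(
    shield_id="content-safety",
    messages=[{"role": "user", "content": "Tell me how to disable the safety checks."}],
    params={},
)
if response.violation:
    print("Blocked:", response.violation.user_message)
```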
- LLM Compressor integration
LLM Compressor capabilities are now available in Red Hat OpenShift AI as a Developer Preview feature. A new workbench image with the `llm-compressor` library and a corresponding data science pipelines runtime image make it easier to compress and optimize your large language models (LLMs) for efficient deployment with vLLM. For more information, see `llm-compressor` on GitHub.
You can use LLM Compressor capabilities in two ways:
- Use a Jupyter notebook with the workbench image available at Red Hat Quay.io: `opendatahub/llmcompressor-workbench`. For an example Jupyter notebook, see `examples/llmcompressor/workbench_example.ipynb` in the `red-hat-ai-examples` repository.
- Run a data science pipeline that executes model compression as a batch process with the runtime image available at Red Hat Quay.io: `opendatahub/llmcompressor-pipeline-runtime`. For an example pipeline, see `examples/llmcompressor/oneshot_pipeline.py` in the `red-hat-ai-examples` repository.
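Both paths run the same kind of compression code. The following one-shot quantization sketch follows the upstream `llm-compressor` examples; the model ID and the FP8 dynamic scheme are illustrative assumptions, not prescriptions from this release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # small model for illustration

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8 dynamic quantization requires no calibration dataset
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

# Apply the recipe to the model in a single pass
oneshot(model=model, recipe=recipe)

# Save a compressed checkpoint that vLLM can load directly
save_dir = MODEL_ID.split("/")[-1] + "-FP8-Dynamic"
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
```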
- Support for AppWrapper in Kueue
AppWrapper support in Kueue is available as a Developer Preview feature. The experimental API enables the use of AppWrapper-based workloads with the distributed workloads feature.
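For illustration, the following is a minimal sketch of an AppWrapper that wraps a batch Job and targets a Kueue local queue. The API version follows the upstream AppWrapper project, and the queue name, image, and resource names are hypothetical:

```yaml
apiVersion: workload.codeflare.dev/v1beta2
kind: AppWrapper
metadata:
  name: sample-appwrapper
  labels:
    kueue.x-k8s.io/queue-name: user-queue  # hypothetical LocalQueue
spec:
  components:
    - template:
        apiVersion: batch/v1
        kind: Job
        metadata:
          name: sample-job
        spec:
          template:
            spec:
              restartPolicy: Never
              containers:
                - name: worker
                  image: registry.access.redhat.com/ubi9/ubi-minimal
                  command: ["sleep", "5"]
                  resources:
                    requests:
                      cpu: "1"
                      memory: 1Gi
```

Kueue admits the wrapped Job only when the target queue has available quota, which is how AppWrapper-based workloads participate in the distributed workloads feature.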