Chapter 3. New features and enhancements


This section describes new features and enhancements in Red Hat OpenShift AI 3.3.2.

3.1. New features

Migration from OpenShift AI 2.25.4 to 3.3.2

OpenShift AI 3.3.2 is the first 3.x release to support migration from OpenShift AI 2.25.4. The OpenShift AI 3.x release introduces significant technology and component changes, making a direct upgrade from 2.25 technically complex.

For instructions on how to plan your migration from OpenShift AI 2.25.4 to 3.3.2, see Assess and plan for migration from OpenShift AI 2.25.4 to 3.3.2.

Enable targeted deployment of workbenches to specific worker nodes in Red Hat OpenShift AI Dashboard using node selectors

Starting with OpenShift AI version 3.0, hardware profiles are fully supported and Generally Available (GA). The hardware profiles feature enables users to target specific worker nodes, specific accelerator types, or CPU-only nodes for workbenches and model-serving workloads.

This feature replaces the deprecated accelerator profiles feature and container size selector field, offering a broader set of capabilities for targeting different hardware configurations. While accelerator profiles, taints, and tolerations provide some capabilities for matching workloads to hardware, they do not ensure that workloads land on specific nodes, especially if some nodes lack the appropriate taints.

The hardware profiles feature supports both accelerator and CPU-only configurations, along with node selectors, to enhance targeting capabilities for specific worker nodes. Administrators can configure hardware profiles in the settings menu. Users can select the enabled profiles by using the UI for workbenches, model serving, and AI pipelines where applicable.
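As an illustration, a hardware profile that pins workloads to GPU nodes with a node selector and toleration might look like the following sketch. The API group, field names, and label values shown here are assumptions for illustration only; check the product's hardware profile reference for the exact schema.

```yaml
# Hypothetical sketch of a hardware profile with a node selector.
# The apiVersion, kind, and field names are illustrative assumptions.
apiVersion: infrastructure.opendatahub.io/v1alpha1
kind: HardwareProfile
metadata:
  name: nvidia-a100-nodes
  namespace: redhat-ods-applications
spec:
  identifiers:
    - displayName: NVIDIA GPU
      identifier: nvidia.com/gpu
      minCount: 1
      defaultCount: 1
  scheduling:
    type: Node
    node:
      # Ensure workloads land only on nodes carrying this label.
      nodeSelector:
        nvidia.com/gpu.product: NVIDIA-A100-SXM4-80GB
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```

The node selector is what guarantees placement on specific nodes; taints and tolerations alone cannot, because untainted nodes still accept the workload.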

Model serving support for IBM Spyre AI accelerators on IBM Power

Model serving with IBM Spyre AI accelerators is now Generally Available (GA) on the IBM Power platform. The IBM Spyre Operator automates the installation and integration of key components, including the device plugin, secondary scheduler, and monitoring tools.
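Operators such as this one are typically installed through an Operator Lifecycle Manager (OLM) Subscription. The sketch below assumes a package name, channel, and catalog source; verify the actual values in the Red Hat Ecosystem Catalog entry before use.

```yaml
# OLM Subscription sketch for installing the IBM Spyre Operator.
# Package name, channel, and source are assumptions; confirm them
# against the operator's Ecosystem Catalog listing.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibm-spyre-operator
  namespace: openshift-operators
spec:
  channel: stable
  name: ibm-spyre-operator
  source: certified-operators
  sourceNamespace: openshift-marketplace
```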

For more information, see the IBM Spyre Operator - Red Hat Ecosystem Catalog.

Allow and disallow functionality added to the model catalog
The model catalog now provides an administrative capability in the OpenShift AI dashboard to selectively hide, disallow, or remove specific models from the visible catalog. This new feature helps ensure compliance with internal security, policy, or regulatory restrictions.
Kubeflow Trainer v2

Kubeflow Trainer v2 is Generally Available (GA) in Red Hat OpenShift AI 3.3.

Kubeflow Trainer v2 is the next generation of distributed training for OpenShift AI, replacing the Kubeflow Training Operator v1 (KFTOv1). This Kubernetes-native solution simplifies how data scientists and ML engineers run PyTorch training workloads at scale using a unified TrainJob API, pre-built ClusterTrainingRuntimes, and the Kubeflow Python SDK.
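A minimal TrainJob that runs a PyTorch script across two nodes using a pre-built runtime might look like the following sketch; the runtime name, container image, and script path are illustrative assumptions, not product defaults.

```yaml
# Sketch of a Kubeflow Trainer v2 TrainJob referencing a
# pre-built ClusterTrainingRuntime. Runtime name, image, and
# command are illustrative assumptions.
apiVersion: trainer.kubeflow.org/v1alpha1
kind: TrainJob
metadata:
  name: pytorch-distributed-example
spec:
  runtimeRef:
    name: torch-distributed   # a pre-built ClusterTrainingRuntime
  trainer:
    numNodes: 2
    image: quay.io/example/pytorch-train:latest
    command:
      - torchrun
      - train.py
```

The same job can also be submitted programmatically through the Kubeflow Python SDK instead of applying YAML directly.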

Hugging Face model catalog integration

With this release, data scientists and machine learning (ML) engineers can deploy public Hugging Face models directly from the model catalog. By using the KServe Container Storage Interface (CSI) and the hf: protocol, you can bypass manual staging and initiate model serving from the dashboard to accelerate experimentation.

This release includes the following capabilities:

  • Direct discovery: Public, non-gated Hugging Face models are available in the model catalog.
  • Instant deployment: You can use the Deploy action to trigger model serving without manual configuration.
  • Protocol integration: OpenShift AI provides native support for the hf: prefix to fetch models during the deployment process.

    Consider the following limitations and requirements for this feature:

  • Public non-gated models only: This feature does not support gated or private models that require API keys or secret management.
  • Connectivity requirements: Deployment is not supported in disconnected or air-gapped environments. The cluster must have external network access to Hugging Face.
  • Model validation: Red Hat does not validate or guarantee the security of third-party models. You are responsible for vetting all external content.
  • Performance: Deployment times depend on model size and network speed, which might increase initial startup latency.
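Under the hood, the hf: prefix appears in the model's storage URI. A hedged sketch of a KServe InferenceService using it follows; the model name, model format, and overall shape are assumptions for illustration, and the dashboard Deploy action generates the equivalent resource for you.

```yaml
# Sketch of a KServe InferenceService pulling a public, non-gated
# Hugging Face model via the hf:// protocol. Model name and
# modelFormat are illustrative assumptions.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hf-model-example
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM
      # The storage initializer fetches the model at deploy time,
      # so no manual staging to S3 or a PVC is needed.
      storageUri: hf://Qwen/Qwen2.5-0.5B-Instruct
```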

3.2. Enhancements

Updated naming of resources to include data-science prefix
To ensure a consistent naming convention across the product, resource names now include the data-science- prefix.
vLLM-Gaudi 1.23 support
Red Hat OpenShift AI 3.3 now supports vllm-gaudi version 1.23, enhancing the performance and stability of vLLM applications.
Model catalog performance data with advanced search and filtering

The model catalog provides comprehensive model validation data, including performance benchmarks, hardware compatibility, and other relevant metrics for Red Hat validated third-party models.

This includes advanced search and filtering, for example, on throughput and latency for benchmarked hardware profiles, so users can quickly find validated models for their use case and available resources. This feature provides a unified discovery experience for models in the Red Hat OpenShift AI hub.

Configuring AuthN/AuthZ for llm-d

A documentation guide for configuring Authentication (AuthN) and Authorization (AuthZ) for Distributed Inference with llm-d is now available. This guide ensures that users can configure distributed inference workloads to be protected against unauthorized access and lateral movement within the cluster.

This is a documentation update only. Distributed inference with llm-d functionality remains unchanged.

Comprehensive High Performance Networking Guide for RDMA over Converged Ethernet (RoCE)

A documentation guide provides the roadmap for establishing high-availability, production-grade distributed inference with llm-d environments using RoCE. This guide breaks down the complexities of high-performance networking to ensure your multi-GPU fabric remains lossless and stable, maximizing TFLOPS (Tera Floating-Point Operations Per Second) efficiency and minimizing tail latency at scale.

This is a documentation update only. Distributed inference with llm-d functionality remains unchanged.

Red Hat Operator catalogs moved from OperatorHub to the software catalog in the console

In OpenShift 4.20, the Red Hat-provided Operator catalogs have moved from OperatorHub to the software catalog and the Operators navigation item is renamed to Ecosystem in the console. The unified software catalog presents Operators, Helm charts, and other installable content in the same console view.

  • To access the Red Hat-provided Operator catalogs in the console, select Ecosystem → Software Catalog.
  • To manage, update, and remove installed Operators, select Ecosystem → Installed Operators.

For more information, see Red Hat Operator catalogs moved from OperatorHub to the software catalog in the console.
