Chapter 2. New features and enhancements


This section describes new features and enhancements in Red Hat OpenShift AI 2.16.

2.1. New features

Customizable serving runtime parameters
You can now pass parameter values and environment variables to your runtimes when serving a model. Customizing runtime parameters is particularly important in generative AI use cases that involve vLLM.
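
For example, the following sketch uses the Kubernetes Python client to create a KServe InferenceService that passes extra arguments and environment variables to a vLLM runtime. The namespace, model name, runtime name, storage URI, and the specific arguments are illustrative placeholders, not prescribed values.

    # Minimal sketch: deploy a model with extra runtime arguments and
    # environment variables through a KServe InferenceService.
    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() inside a pod

    inference_service = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": "granite-vllm", "namespace": "my-project"},
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": "vLLM"},
                    "runtime": "vllm-runtime",  # placeholder ServingRuntime name
                    "storageUri": "oci://registry.example.com/models/granite:latest",
                    # Extra parameters passed to the vLLM server process
                    "args": ["--max-model-len=4096", "--dtype=half"],
                    # Environment variables set on the runtime container
                    "env": [{"name": "HF_HUB_OFFLINE", "value": "1"}],
                }
            }
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="serving.kserve.io",
        version="v1beta1",
        namespace="my-project",
        plural="inferenceservices",
        body=inference_service,
    )
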
Support for deploying quantized models
You can use the vLLM ServingRuntime for KServe runtime to deploy models that are quantized for the Marlin kernel. If your model is quantized for Marlin, vLLM automatically uses the Marlin kernel based on the underlying hardware. For other quantized models, you can use the --quantization=marlin custom parameter. For information about supported hardware, see Supported Hardware for Quantization Kernels on the vLLM website.
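
As an illustration, the following sketch uses vLLM's offline Python API, where the quantization argument corresponds to the --quantization=marlin server parameter. The model ID is a placeholder; for a model that is already quantized for Marlin, the explicit argument can be omitted because vLLM selects the kernel automatically.

    # Minimal sketch: load a quantized model with vLLM's offline API.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="example-org/model-marlin",  # placeholder model ID
        quantization="marlin",             # same effect as --quantization=marlin
    )

    outputs = llm.generate(
        ["Summarize the benefits of quantization in one sentence."],
        SamplingParams(max_tokens=64),
    )
    print(outputs[0].outputs[0].text)
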
code-server workbench image

The code-server workbench image included in Red Hat OpenShift AI, previously available as a Technology Preview feature, is now generally available. For more information, see Working in code-server.

With the code-server workbench image, you can customize your workbench environment by using a variety of extensions to add new languages, themes, and debuggers, and to connect to additional services. You can also enhance the efficiency of your data science work with syntax highlighting, auto-indentation, and bracket matching.

Note

Elyra-based pipelines are not available with the code-server workbench image.

2.2. Enhancements

Custom connection types
Administrators can use the enhanced connections feature to configure custom connections to data sources such as databases, making it easier for users to access data for model development. In addition, a built-in connection type for URI-based repositories lets users access models from repositories such as Hugging Face for model serving.
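
In OpenShift AI, a connection is stored as a Kubernetes Secret in the data science project. The following sketch creates a URI-based connection with the Kubernetes Python client; the label and annotation keys and the uri-v1 type name are assumptions about how the dashboard identifies connections, so verify them against the connection type definitions in your cluster.

    # Minimal sketch: create a URI-based connection as a Kubernetes Secret.
    # The labels/annotations below are assumed, not confirmed by this document.
    from kubernetes import client, config

    config.load_kube_config()

    secret = client.V1Secret(
        metadata=client.V1ObjectMeta(
            name="granite-model-uri",
            namespace="my-project",  # placeholder data science project
            labels={"opendatahub.io/dashboard": "true"},
            annotations={
                "opendatahub.io/connection-type": "uri-v1",  # assumed type name
                "openshift.io/display-name": "Granite model (URI)",
            },
        ),
        string_data={"URI": "https://huggingface.co/example-org/model/resolve/main"},
    )

    client.CoreV1Api().create_namespaced_secret(namespace="my-project", body=secret)
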
NVIDIA Triton Inference Server version 24.10 runtime: additional models tested and verified

The NVIDIA Triton Inference Server version 24.10 runtime has been tested with the following models for both KServe (REST and gRPC) and ModelMesh (REST):

  • Forest Inference Library (FIL)
  • Python
  • TensorRT
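
As a quick check against a deployed Triton model over REST, the following sketch uses the tritonclient Python package (pip install tritonclient[http]). The endpoint URL, model name, and tensor names are illustrative placeholders that must match your model's configuration.

    # Minimal sketch: query a Triton model over its REST endpoint.
    import numpy as np
    import tritonclient.http as httpclient

    triton = httpclient.InferenceServerClient(url="triton.example.com:8000")

    if triton.is_model_ready("fil-model"):  # placeholder model name
        # Build a request that matches the model's declared input signature.
        inp = httpclient.InferInput("input__0", [1, 4], "FP32")
        inp.set_data_from_numpy(np.random.rand(1, 4).astype(np.float32))
        result = triton.infer(model_name="fil-model", inputs=[inp])
        print(result.as_numpy("output__0"))
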
Distributed workloads: additional training images tested and verified

Several additional training images are tested and verified:

  • ROCm-compatible KFTO cluster image

    A new ROCm-compatible KFTO cluster image, quay.io/modh/training:py311-rocm61-torch241, is tested and verified. This image is compatible with AMD accelerators that are supported by ROCm 6.1.

  • ROCm-compatible Ray cluster images

    The ROCm-compatible Ray cluster images quay.io/modh/ray:2.35.0-py39-rocm61 and quay.io/modh/ray:2.35.0-py311-rocm61, previously available as a Developer Preview feature, are tested and verified. These images are compatible with AMD accelerators that are supported by ROCm 6.1.

  • CUDA-compatible KFTO image

    The CUDA-compatible KFTO cluster image, previously available as a Developer Preview feature, is tested and verified. The image is now available in a new location: quay.io/modh/training:py311-cuda121-torch241. This image is compatible with NVIDIA GPUs that are supported by CUDA 12.1.

These images are built for the AMD64 architecture and might not work on other architectures. For more information about the latest available training images in Red Hat OpenShift AI, see Red Hat OpenShift AI Supported Configurations.
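
For example, the following sketch uses the CodeFlare SDK to request a Ray cluster that runs one of the verified ROCm-compatible images. The project name and worker count are illustrative, and configuration parameter names can vary between SDK versions.

    # Minimal sketch: request a Ray cluster on a verified ROCm image.
    from codeflare_sdk import Cluster, ClusterConfiguration

    cluster = Cluster(ClusterConfiguration(
        name="rocm-ray",
        namespace="my-project",  # placeholder data science project
        num_workers=2,
        image="quay.io/modh/ray:2.35.0-py311-rocm61",
    ))

    cluster.up()          # create the Ray cluster
    cluster.wait_ready()  # block until the cluster is usable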

Improved search terms for Red Hat OpenShift AI Operator

In the Administrator perspective of the OpenShift console, on the Operators > OperatorHub page, the Red Hat OpenShift AI Operator can now be found by entering any of the following terms in the Filter by keyword search field:

  • AI
  • RHOAI
  • OAI
  • ML
  • Machine Learning
  • Data Science
  • ODH
  • Open Data Hub