Chapter 2. New features and enhancements
This section describes new features and enhancements in Red Hat OpenShift AI 3.4 EA1 and 3.4 EA2.
2.1. New features
2.1.1. 3.4 EA2 new features
- Multi-architecture support for the model catalog
The model catalog includes support for IBM Power (ppc64le) architecture. With this enhancement, you can discover and deploy models directly from the dashboard. Support is available for the following validated models:
  - registry.redhat.io/rhai/modelcar-granite-3-3-8b-instruct
  - registry.redhat.io/rhai/modelcar-granite-4-0-h-small:3.0
  - registry.redhat.io/rhai/modelcar-granite-4-0-h-tiny:3.0
- Just-In-Time Checkpointing and S3 Storage for Kubeflow Trainer
Kubeflow Trainer now provides Just-In-Time (JIT) and periodic checkpointing for distributed training jobs on OpenShift AI. This enhancement automatically saves the training state, including model weights, optimizer state, and training step, both at regular intervals and immediately before interruptions such as preemption, eviction, or maintenance. Interrupted jobs automatically resume from the latest valid checkpoint, which significantly reduces wasted GPU compute and improves overall training efficiency.
Checkpoints can be stored on PersistentVolumeClaims (PVCs) or S3-compatible object storage. With S3, checkpoints are uploaded in the background without pausing training, enabling low-overhead, continuous protection of progress. S3-backed storage also provides a cost-efficient, portable alternative to PVCs, allowing checkpoints to be retained, shared, and reused across clusters.
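The combination of periodic and just-in-time checkpointing described above can be illustrated with a minimal, generic Python sketch. This is not the Kubeflow Trainer implementation or API: the class name `CheckpointingLoop`, the JSON file layout, and the helper `install_sigterm_handler` are all hypothetical, and the "training step" is a stand-in for a real optimizer update. The sketch assumes the platform signals an upcoming interruption with SIGTERM.

```python
import json
import os
import signal
import tempfile


class CheckpointingLoop:
    """Toy training loop with periodic and just-in-time checkpoints (illustrative only)."""

    def __init__(self, ckpt_dir, every_n_steps=100):
        self.ckpt_dir = ckpt_dir
        self.every_n_steps = every_n_steps
        self.stop_requested = False
        # Hypothetical training state: a step counter and one fake weight.
        self.state = {"step": 0, "weights": [0.0]}
        os.makedirs(ckpt_dir, exist_ok=True)

    def request_stop(self, signum=None, frame=None):
        # Usable as a signal handler: only set a flag, save at a safe point later.
        self.stop_requested = True

    def save(self):
        # Write to a temporary file, then rename atomically, so a partially
        # written checkpoint is never mistaken for a valid one.
        fd, tmp_path = tempfile.mkstemp(dir=self.ckpt_dir, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
        final_name = f"step-{self.state['step']:08d}.json"
        os.replace(tmp_path, os.path.join(self.ckpt_dir, final_name))

    def load_latest(self):
        # Try checkpoints from newest to oldest and skip unreadable ones.
        names = sorted(n for n in os.listdir(self.ckpt_dir) if n.endswith(".json"))
        for name in reversed(names):
            try:
                with open(os.path.join(self.ckpt_dir, name)) as f:
                    self.state = json.load(f)
                return True
            except (OSError, json.JSONDecodeError):
                continue
        return False

    def train(self, total_steps):
        """Run until total_steps; return False if interrupted, True if finished."""
        self.load_latest()  # resume if a valid checkpoint exists
        while self.state["step"] < total_steps:
            self.state["step"] += 1
            self.state["weights"][0] += 0.1  # stand-in for a real optimizer step
            if self.stop_requested:
                self.save()  # just-in-time checkpoint before exiting
                return False
            if self.state["step"] % self.every_n_steps == 0:
                self.save()  # periodic checkpoint
        self.save()
        return True


def install_sigterm_handler(loop):
    # Assumption: the interruption arrives as SIGTERM in the main thread.
    signal.signal(signal.SIGTERM, loop.request_stop)
```

A job would call `install_sigterm_handler(loop)` once and then `loop.train(total_steps)`. When the job is restarted with the same checkpoint directory, `train` picks up from the most recent readable checkpoint instead of step 0.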
2.1.2. 3.4 EA1 new features
Model deployments are not visible under the model registry deployments tab on IBM Power (ppc64le) in RHOAI 3.4-EA1.
- Workbench and runtime images default to Red Hat Python index
Workbench and runtime images default to the Red Hat Python index. When you install or update Python packages, they are pulled from the Red Hat Python index rather than PyPI. This gives you Python packages that are built and supported by Red Hat.
- Garak evaluation provider available in Llama Stack distribution
The Garak evaluation provider is available in the Llama Stack distribution. Garak provides security scanning capabilities for large language models to help identify potential vulnerabilities and safety issues. The provider is available in two versions: an inline version that runs scans in the same process as the Llama Stack server, and a remote version that runs scans by using Kubeflow Pipelines.
- PostgreSQL database support for Model Registry
You can configure a PostgreSQL database as the backend for Model Registry from the OpenShift AI dashboard.
- Default database solution for Model Registry
Model Registry includes a default database solution for testing. Use this solution to start using Model Registry without configuring an external database.
Note: The default database is not intended for production workloads.
2.2. Enhancements
2.2.1. 3.4 EA2 enhancements
- Simplified configuration for llm-d scheduler settings
Configure llm-d scheduler settings by using the endpointPickerConfig field in the LLMInferenceService specification. You can specify the configuration inline or reference a ConfigMap. This approach replaces the previous method, which required extensive EndpointPicker configuration in the scheduler's --configText argument.
- Configure vLLM runtime arguments using Kubernetes container args field
You can configure vLLM runtime arguments using the standard Kubernetes container args field in LLMInferenceService resources. User-specified arguments are merged with system defaults, allowing you to add new arguments or override specific defaults without replacing the entire argument list.
The previous VLLM_ADDITIONAL_ARGS environment variable method continues to work for backward compatibility.
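To make the merge behavior concrete, the following Python sketch shows one plausible way that user-specified arguments could override or extend a default list. It is only an illustration of the semantics described above, not the actual OpenShift AI merge logic: the functions `parse_args` and `merge_args` are hypothetical, and the default argument list in the usage example is an assumption rather than the real system defaults.

```python
def parse_args(args):
    """Turn an argument list into an ordered {flag: value-or-None} mapping.

    Accepts "--flag=value", "--flag value", and bare "--flag" forms.
    """
    parsed = {}
    i = 0
    while i < len(args):
        token = args[i]
        if token.startswith("--") and "=" in token:
            flag, value = token.split("=", 1)
            parsed[flag] = value
        elif (
            token.startswith("--")
            and i + 1 < len(args)
            and not args[i + 1].startswith("--")
        ):
            # "--flag value": consume the next token as the value.
            parsed[token] = args[i + 1]
            i += 1
        else:
            # Bare boolean flag.
            parsed[token] = None
        i += 1
    return parsed


def merge_args(defaults, user):
    """Merge user arguments over defaults; a user flag replaces the same default flag."""
    merged = parse_args(defaults)
    merged.update(parse_args(user))  # overrides keep their position, new flags are appended
    return [flag if value is None else f"{flag}={value}" for flag, value in merged.items()]
```

For example, with assumed defaults of `["--port=8000", "--max-model-len", "4096", "--enable-prefix-caching"]` and user arguments of `["--max-model-len=8192", "--gpu-memory-utilization=0.9"]`, `merge_args` returns `["--port=8000", "--max-model-len=8192", "--enable-prefix-caching", "--gpu-memory-utilization=0.9"]`: one default is overridden, one argument is added, and the remaining defaults are untouched.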
2.2.2. 3.4 EA1 enhancements
- Hybrid search support for Qdrant remote vector database provider
Vector Store Search supports hybrid and keyword search for the Qdrant Vector IO provider.