Chapter 8. Known issues
This section describes known issues in Red Hat OpenShift AI 3.3 and any known methods of working around these issues.
RHOAIENG-53865 - MaaS tier-based rate limiting fails when configured through the dashboard UI
When you configure Models-as-a-Service (MaaS) tier-based rate limiting through the Red Hat OpenShift AI dashboard UI, the following issues occur:
- The system creates separate `TokenRateLimitPolicy` resources for each tier instead of a single, combined policy. This default configuration causes rate limiting to silently fail for most tiers, allowing users in unprotected tiers to exceed their intended limits.
- The dashboard UI does not read or display rate limits configured through the CLI. When you edit tier settings through the dashboard UI, the UI-configured settings overwrite the CLI configuration.
- Workaround
- There are two possible workarounds to ensure that rate limiting is enforced for all tiers:
  - Manually update the `TokenRateLimitPolicy` resources with merged limits for each tier.
  - Create a single, combined `TokenRateLimitPolicy` resource with limits for all tiers.
To manually update the `TokenRateLimitPolicy` resources to use a merge strategy:

1. List the existing `TokenRateLimitPolicy` resources:

   ```
   $ oc get tokenratelimitpolicy -n openshift-ingress
   ```

2. Edit each tier policy to use `defaults.strategy: merge` instead of `atomic`. For example, edit the Free tier policy, `tier-free-token-rate-limits`:

   ```
   $ oc edit tokenratelimitpolicy tier-free-token-rate-limits -n openshift-ingress
   ```

3. In the editor, locate the `spec.defaults` section and change the strategy from `atomic` to `merge`:

   ```yaml
   spec:
     defaults:
       strategy: merge # Change from atomic to merge
       limits:
         free-tokens: # Ensure this has a distinct name
           when:
           - predicate: auth.identity.tier == "free"
           rates:
           - limit: 1000
             window: 1m0s
   ```

4. Save and exit the editor.
5. Repeat steps 2-4 for the Premium and Enterprise tier policies (`tier-premium-token-rate-limits` and `tier-enterprise-token-rate-limits`), ensuring that each limit has a distinct name, such as `premium-tokens` and `enterprise-tokens`.
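The requirement that each limit have a distinct name can be illustrated in Python: when a merge strategy combines the per-tier policies, limits are keyed by name, so a duplicate name would silently replace another tier's entry. This is a hypothetical sketch, not Kuadrant's actual merge code; the Premium rate value is invented for illustration.

```python
# Hypothetical illustration of why merged limits need distinct names.
# Each policy contributes its "limits" entries keyed by name; a duplicate
# name would collide, so this sketch rejects duplicates outright.

def merge_limits(policies):
    """Combine the limits sections of several policies into one mapping."""
    merged = {}
    for policy in policies:
        for name, limit in policy["limits"].items():
            if name in merged:
                raise ValueError(f"duplicate limit name: {name}")
            merged[name] = limit
    return merged

# Rates mirror the documented Free tier; the Premium rate is an assumption.
free = {"limits": {"free-tokens": {"rates": [{"limit": 1000, "window": "1m0s"}]}}}
premium = {"limits": {"premium-tokens": {"rates": [{"limit": 10000, "window": "1m0s"}]}}}

combined = merge_limits([free, premium])
print(sorted(combined))  # ['free-tokens', 'premium-tokens']
```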
To create a single, combined `TokenRateLimitPolicy` resource with limits for all tiers:

1. Configure rate limits by using the CLI to create a single, combined `TokenRateLimitPolicy` resource with limits for all tiers, as in the following example configuration for the Free tier:

   ```yaml
   apiVersion: kuadrant.io/v1alpha1
   kind: TokenRateLimitPolicy
   metadata:
     name: tier-free-token-rate-limits
   spec:
     targetRef:
       kind: Gateway
       name: maas-default-gateway
     defaults:
       strategy: merge
       limits:
         free-tokens:
           when:
           - predicate: auth.identity.tier == "free"
           rates:
           - limit: 1000
             window: 1m0s
   ```

2. Do not edit tier settings through the dashboard UI after applying CLI-configured policies, because changes in the UI overwrite any CLI configurations.
Verification
Verify that the policy is enforced:
- In the side navigation menu of the OpenShift AI dashboard, click Administration > CustomResourceDefinitions.
- In the CustomResourceDefinitions list, search for and click TokenRateLimitPolicy.
- Click the Instances tab to view the list of policies.
- In the Name column, click the name of the specific policy that you want to verify. For example, `gateway-default-deny`.
- On the TokenRateLimitPolicy details page, locate the Enforced status field:
  - True: The policy is being picked up by the controller.
  - False or -: The policy is not being used.
- To diagnose why a policy is not being used, scroll to the Conditions section to view the status details:
  - Review the Reason column for error codes, such as TargetNotFound.
  - Review the Message column for a detailed explanation of the issue, such as a missing target gateway.
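The same check can be performed against the policy's status conditions, which follow the standard Kubernetes conditions convention. The sketch below is illustrative: the status dict is hand-written, and on a real cluster you would fetch it with `oc get tokenratelimitpolicy <name> -o json` instead.

```python
# Sketch: interpret a TokenRateLimitPolicy status the same way the CRD
# Instances view does. The status dict is a hand-written example following
# the usual Kubernetes conditions convention (type/status/reason/message).

def enforcement_summary(status):
    """Return a human-readable summary of the Enforced condition."""
    for cond in status.get("conditions", []):
        if cond.get("type") == "Enforced":
            if cond.get("status") == "True":
                return "enforced"
            return f"not enforced: {cond.get('reason')} - {cond.get('message')}"
    return "not enforced: no Enforced condition reported"

# Example of a policy whose target gateway is missing.
broken = {"conditions": [{"type": "Enforced", "status": "False",
                          "reason": "TargetNotFound",
                          "message": "the target gateway was not found"}]}
print(enforcement_summary(broken))
```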
RHOAIENG-52057 - LLMInferenceService deployment fails without Leader WorkerSet operator
When deploying an LLMInferenceService object for Distributed Inference Server, the deployment fails with the following error:

```
failed to reconcile multi-node main workload: failed to build the expected main LWS: failed to get expected leader worker set demo-llm/qwen-kserve-mn: no matches for kind "LeaderWorkerSet" in version "leaderworkerset.x-k8s.io/v1"
```
- Workaround
- Install the LeaderWorkerSet Operator.
RHOAIENG-50523 - Unable to upload RAG documents in Gen AI Playground on disconnected clusters
On disconnected clusters, uploading documents in the Gen AI Playground RAG section fails. The progress bar never exceeds 50% because Llama Stack attempts to download the ibm-granite/granite-embedding-125m-english embedding model from HuggingFace, even though the model is already included in the Llama Stack Distribution image in OpenShift AI 3.3.
- Workaround
Modify the `LlamaStackDistribution` custom resource to include the following environment variables:

```
export MY_PROJECT=my-project
oc patch llamastackdistribution lsd-genai-playground \
  -n $MY_PROJECT \
  --type='json' \
  -p='[
  { "op": "add", "path": "/spec/server/containerSpec/env/-", "value": { "name": "SENTENCE_TRANSFORMERS_HOME", "value": "/opt/app-root/src/.cache/huggingface/hub" } },
  { "op": "add", "path": "/spec/server/containerSpec/env/-", "value": { "name": "HF_HUB_OFFLINE", "value": "1" } },
  { "op": "add", "path": "/spec/server/containerSpec/env/-", "value": { "name": "TRANSFORMERS_OFFLINE", "value": "1" } },
  { "op": "add", "path": "/spec/server/containerSpec/env/-", "value": { "name": "HF_DATASETS_OFFLINE", "value": "1" } }
]'
```

The Llama Stack pod restarts automatically after applying this configuration.
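The patch payload above is a JSON Patch (RFC 6902) list of four `add` operations. As a minimal sketch, the same payload can be generated programmatically, which avoids copy-paste errors when the set of variables changes; the variable names and values come from the documented workaround.

```python
import json

# Build the same JSON Patch payload as the oc patch command above.
# Each entry appends one environment variable to the container spec.
offline_env = {
    "SENTENCE_TRANSFORMERS_HOME": "/opt/app-root/src/.cache/huggingface/hub",
    "HF_HUB_OFFLINE": "1",
    "TRANSFORMERS_OFFLINE": "1",
    "HF_DATASETS_OFFLINE": "1",
}

patch = [
    {"op": "add", "path": "/spec/server/containerSpec/env/-",
     "value": {"name": name, "value": value}}
    for name, value in offline_env.items()
]

print(json.dumps(patch, indent=2))
```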
RHAIENG-2827 - Unsecured routes created by older CodeFlare SDK versions
Existing 2.x workbenches continue to use an older version of the CodeFlare SDK when used in OpenShift AI 3.x. The older version of the SDK creates unsecured OpenShift routes on behalf of the user.
- Workaround
- To resolve this issue, update your workbench to the latest image provided in OpenShift AI 3.x before using the CodeFlare SDK.
RHOAIENG-48867 - TrainJob fails to resume after Red Hat OpenShift AI upgrade due to immutable JobSet spec
TrainJobs that are suspended (for example, queued by Kueue) before a Red Hat OpenShift AI upgrade cannot resume after the upgrade completes, because the Trainer controller fails to update the immutable JobSet `spec.replicatedJobs` field.
- Workaround
- To resolve this issue, delete and recreate the affected TrainJob after the upgrade.
RHOAIENG-45142 - Dashboard URLs return 404 errors after upgrading Red Hat OpenShift AI from 2.x to 3.x
The Red Hat OpenShift AI dashboard URL subdomain changed from `rhods-dashboard-redhat-ods-applications.apps.<cluster>` to `data-science-gateway.apps.<cluster>` due to the use of Gateways in OpenShift AI version 3.x. Existing bookmarks that use the default `rhods-dashboard-redhat-ods-applications.apps.<cluster>` format no longer function after you upgrade to OpenShift AI version 3.0 or later. Update your bookmarks and any internal documentation to use the new URL format: `data-science-gateway.apps.<cluster>`.
- Workaround
- To resolve this issue, deploy an nginx-based redirect solution that recreates the old route name and redirects traffic to the new gateway URL. For instructions, see Dashboard URLs return 404 errors after RHOAI upgrade from 2.x to 3.x.
Cluster administrators must provide the new dashboard URL to all Red Hat OpenShift AI administrators and users. In a future release, URL redirects may be supported.
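The URL change described above affects only the host prefix; the cluster domain and any path are preserved. A minimal sketch of the mapping, using a hypothetical helper function (the example hostnames are placeholders):

```python
# Hypothetical helper that maps an old-style 2.x dashboard URL to the
# new 3.x gateway URL. Only the leading host label changes.
OLD_PREFIX = "rhods-dashboard-redhat-ods-applications.apps."
NEW_PREFIX = "data-science-gateway.apps."

def migrate_dashboard_url(url):
    """Rewrite the first occurrence of the old host prefix."""
    return url.replace(OLD_PREFIX, NEW_PREFIX, 1)

old = "https://rhods-dashboard-redhat-ods-applications.apps.example.com/projects"
print(migrate_dashboard_url(old))
# https://data-science-gateway.apps.example.com/projects
```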
RHOAIENG-43686 - Red Hat build of Kueue 1.2 installation or upgrade fails with Kueue CRD reconciliation error
Installing Red Hat build of Kueue 1.2 or upgrading from Red Hat build of Kueue 1.1 to 1.2 fails if legacy Kueue CustomResourceDefinitions (CRDs) remain in the cluster from a previous Red Hat OpenShift AI 2.x installation. As a result, when the legacy v1alpha1 CRDs are present, the Kueue operator cannot reconcile successfully and the Data Science Cluster (DSC) remains in a Not Ready state.
- Workaround
- To resolve this issue, delete the legacy Kueue CRDs, `cohorts.kueue.x-k8s.io/v1alpha1` or `topologies.kueue.x-k8s.io/v1alpha1`, from the cluster. For detailed instructions, see Red Hat Build of Kueue 1.2 installation or upgrade fails with Kueue CRD reconciliation error.
RHOAIENG-49389 - Tier management unavailable after deleting all tiers
If you delete all service tiers from Settings > Tiers, the Create tier button is no longer displayed. You cannot create tiers through the dashboard until at least one tier exists. To avoid this issue, ensure at least one tier remains in the system at all times.
- Workaround
Create a basic tier by using the CLI, then configure its settings through the dashboard. You must have cluster administrator privileges for your OpenShift cluster to perform these steps:

1. Retrieve the `tier-to-group-mapping` ConfigMap:

   ```
   $ oc get configmap tier-to-group-mapping -n redhat-ods-applications -o yaml > tier-config.yaml
   ```

2. Edit the ConfigMap to add a basic tier definition:

   ```yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: tier-to-group-mapping
     namespace: redhat-ods-applications
   data:
     tiers.yaml: |
       - name: basic
         displayName: Basic Tier
         level: 0
         groups:
         - system:authenticated
   ```

3. Apply the updated ConfigMap:

   ```
   $ oc apply -f tier-config.yaml
   ```

4. In the dashboard, navigate to Settings > Tiers to configure rate limits for the newly created tier.
RHOAIENG-47589 - Missing Kueue validation for TrainJob
Creating a TrainJob without a defined Kueue LocalQueue passes without a validation check, even when a Kueue-managed namespace is enabled. As a result, it is possible to create a TrainJob that is not managed by Kueue in a Kueue-managed namespace.
- Workaround
- None.
RHOAIENG-49017 - Upgrade RAGAS provider to Llama Stack 0.4.z / 0.5.z
To use the Ragas provider in OpenShift AI 3.3, you must update your Llama Stack distribution to use `llama-stack-provider-ragas==0.5.4`, which works with Llama Stack >=0.4.2,<0.5.0. This version of the provider is a workaround release that uses the deprecated register endpoints. See the full compatibility matrix for more information.
- Workaround
- None.
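The version constraint above (>=0.4.2, <0.5.0) can be checked programmatically. The sketch below is a simplification that parses release versions as integer tuples and ignores pre-release suffixes; it is not an official compatibility check.

```python
# Sketch: check whether a Llama Stack version satisfies the provider's
# documented requirement (>=0.4.2, <0.5.0). Integer-tuple parsing is a
# simplification that ignores pre-release suffixes like "rc1".

def parse(version):
    """Parse "X.Y.Z" into a comparable tuple of integers."""
    return tuple(int(part) for part in version.split("."))

def ragas_compatible(llama_stack_version):
    return (0, 4, 2) <= parse(llama_stack_version) < (0, 5, 0)

print(ragas_compatible("0.4.2"))  # True
print(ragas_compatible("0.5.0"))  # False
```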
RHOAIENG-44516 - MLflow tracking server does not accept Kubernetes service account tokens
Red Hat OpenShift AI does not accept Kubernetes service account tokens when you authenticate through the dashboard MLflow URL.
- Workaround
To authenticate with a service account token, complete the following steps:
- Create an OpenShift Route directly to the MLflow service endpoints.
- Use the Route URL as the `MLFLOW_TRACKING_URI` when you authenticate.
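The steps above can be sketched from the client side. The Route hostname below is a placeholder, and `MLFLOW_TRACKING_TOKEN` is the environment variable the MLflow client reads for bearer-token authentication; verify both against your MLflow version's documentation before relying on them.

```python
import os

# Sketch: point an MLflow client at the Route created in the steps above.
# The hostname is a placeholder; replace <service-account-token> with a
# token obtained from your service account.
os.environ["MLFLOW_TRACKING_URI"] = "https://mlflow-route.apps.example.com"
os.environ["MLFLOW_TRACKING_TOKEN"] = "<service-account-token>"

print(os.environ["MLFLOW_TRACKING_URI"])
```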