Chapter 8. Managing workloads with Kueue
As a cluster administrator, you can manage AI and machine learning workloads at scale by integrating the Red Hat build of Kueue with Red Hat OpenShift AI. This integration provides capabilities for quota management, resource allocation, and prioritized job scheduling.
Starting with OpenShift AI 2.24, the embedded Kueue component for managing distributed workloads is deprecated. Kueue is now provided through Red Hat build of Kueue, which is installed and managed by the Red Hat build of Kueue Operator. You cannot install both the embedded Kueue and the Red Hat build of Kueue Operator on the same cluster because this creates conflicting controllers that manage the same resources.
OpenShift AI does not automatically migrate existing workloads. To ensure your workloads continue using queue management after upgrading, cluster administrators must manually migrate from the embedded Kueue to the Red Hat build of Kueue Operator. For more information, see Migrating to the Red Hat build of Kueue Operator.
8.1. Overview of managing workloads with Kueue
You can use Kueue in OpenShift AI to manage AI and machine learning workloads at scale. Kueue controls how cluster resources are allocated and shared through hierarchical quota management, dynamic resource allocation, and prioritized job scheduling. These capabilities help prevent cluster contention, ensure fair access across teams, and optimize the use of heterogeneous compute resources, such as hardware accelerators.
Kueue lets you schedule diverse workloads, including distributed training jobs (RayJob, RayCluster, PyTorchJob), workbenches (Notebook), and model serving (InferenceService). Kueue validation and queue enforcement apply only to workloads in namespaces with the kueue.openshift.io/managed=true label.
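For illustration, a project opts in to Kueue management with a single namespace label. The following sketch uses a hypothetical project name:

```yaml
# Illustrative only: this label enables Kueue validation and queue
# enforcement for workloads created in this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: my-data-science-project   # hypothetical project name
  labels:
    kueue.openshift.io/managed: "true"
```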
Using Kueue in OpenShift AI provides these benefits:
- Prevents resource conflicts and prioritizes workload processing
- Manages quotas across teams and projects
- Ensures consistent scheduling for all workload types
- Maximizes GPU and other specialized hardware utilization
8.1.1. Kueue management states
You configure how OpenShift AI interacts with Kueue by setting the managementState in the DataScienceCluster object.
Unmanaged
This state is supported for using Kueue with OpenShift AI. In the Unmanaged state, OpenShift AI integrates with an existing Kueue installation managed by the Red Hat build of Kueue Operator. You must have the Red Hat build of Kueue Operator installed and running on the cluster. When you enable Unmanaged mode, the OpenShift AI Operator creates a default Kueue custom resource (CR) if one does not already exist. This prompts the Red Hat build of Kueue Operator to activate Kueue on the cluster.
Managed
This state is deprecated. Previously, OpenShift AI deployed and managed an embedded Kueue distribution. Managed mode is not compatible with the Red Hat build of Kueue Operator. If both are installed, OpenShift AI stops reconciliation to avoid conflicts. You must migrate any environment using the Managed state to the Unmanaged state to ensure continued support.
Removed
This state disables Kueue in OpenShift AI. If the state was previously Managed, OpenShift AI uninstalls the embedded Kueue distribution. If the state was previously Unmanaged, OpenShift AI stops checking for the external Kueue integration but does not uninstall the Red Hat build of Kueue Operator. An empty managementState value also functions as Removed.
8.1.2. Queue enforcement for projects
To ensure workloads do not bypass the queuing system, a validating webhook automatically enforces queuing rules on any project that is enabled for Kueue management. You enable a project for Kueue management by applying the kueue.openshift.io/managed=true label to the project namespace.
This validating webhook enforcement method replaces the Validating Admission Policy that was used with the deprecated embedded Kueue component. The system also supports the legacy kueue-managed label for backward compatibility, but kueue.openshift.io/managed=true is the recommended label going forward.
After a project is enabled for Kueue management, the webhook requires that any new or updated workload has the kueue.x-k8s.io/queue-name label. If this label is missing, the webhook prevents the workload from being created or updated.
OpenShift AI creates a default, cluster-scoped ClusterQueue (if one does not already exist) and a namespace-scoped LocalQueue for that namespace (if one does not already exist). These default resources are created with the opendatahub.io/managed=false annotation, so they are not managed after creation. Cluster administrators can change or delete them.
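The default queue resources look roughly like the following sketch. The queue names shown are the predefined defaults, but the namespace is a hypothetical example and the exact spec on your cluster may differ:

```yaml
# Sketch of the default queue resources. The opendatahub.io/managed=false
# annotation means OpenShift AI does not reconcile them after creation,
# so administrators can modify or delete them.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: default
  annotations:
    opendatahub.io/managed: "false"
spec:
  namespaceSelector: {}   # admits workloads from any eligible namespace
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default
  namespace: my-data-science-project   # hypothetical project namespace
  annotations:
    opendatahub.io/managed: "false"
spec:
  clusterQueue: default
```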
The webhook enforces this rule on the create and update operations for the following resource types:
- InferenceService
- Notebook
- PyTorchJob
- RayCluster
- RayJob
You can apply hardware profiles to other workload types, but the validation webhook enforces the kueue.x-k8s.io/queue-name label requirement only for these specific resource types.
8.1.3. Restrictions for managing workloads with Kueue
When you use Kueue to manage workloads in OpenShift AI, the following restrictions apply:
- Namespaces must be labeled with kueue.openshift.io/managed=true to enable Kueue validation and queue enforcement.
- All workloads that you create from the OpenShift AI dashboard, such as workbenches and model servers, must use a hardware profile that specifies a local queue.
- When you specify a local queue in a hardware profile, OpenShift AI automatically applies the corresponding kueue.x-k8s.io/queue-name label to workloads that use that profile.
- You cannot use hardware profiles that contain node selectors or tolerations for node placement. To direct workloads to specific nodes, use a hardware profile that specifies a local queue that is associated with a cluster queue configured with the appropriate resource flavors.
- You cannot use accelerator profiles with Kueue. You must migrate any existing accelerator profiles to hardware profiles.
- Because workbenches are not suspendable workloads, you can only assign them to a local queue that is associated with a non-preemptive cluster queue. The default cluster queue that OpenShift AI creates is non-preemptive.
8.1.4. Kueue workflow
Managing workloads with Kueue in OpenShift AI involves tasks for OpenShift cluster administrators, OpenShift AI administrators, and machine learning (ML) engineers or data scientists:
Cluster administrator
Installs and configures Kueue:
- Installs the Red Hat build of Kueue Operator on the cluster, as described in the Red Hat build of Kueue documentation.
- Activates the Kueue integration by setting the managementState to Unmanaged in the DataScienceCluster custom resource, as described in Configuring workload management with Kueue.
- Configures quotas to optimize resource allocation for user workloads, as described in the Red Hat build of Kueue documentation.
- Enables Kueue in the dashboard by setting disableKueue to false in the OdhDashboardConfig custom resource, as described in Enabling Kueue in the dashboard.
Note: When Kueue is enabled in the dashboard, OpenShift AI automatically enables Kueue management for all new projects created from the dashboard. For existing projects, or for projects created by using the OpenShift CLI (oc), you must enable Kueue management manually by applying the kueue.openshift.io/managed=true label to the project namespace.
OpenShift AI administrator
Prepares the OpenShift AI environment:
- Creates Kueue-enabled hardware profiles so that users can submit workloads from the OpenShift AI dashboard, as described in Working with hardware profiles.
ML Engineer or data scientist
Submits workloads to the queuing system:
- For workloads created from the OpenShift AI dashboard, such as workbenches and model servers, selects a Kueue-enabled hardware profile during creation.
- For workloads created by using a command-line interface or an SDK, such as distributed training jobs, adds the kueue.x-k8s.io/queue-name label to the workload’s YAML manifest and sets its value to the target LocalQueue name.
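As a minimal sketch, a distributed training job submitted from the CLI carries the label in its metadata. The job name, namespace, queue name, and image below are illustrative placeholders:

```yaml
# Sketch: the kueue.x-k8s.io/queue-name label routes the job to a LocalQueue
# in the same namespace. In a Kueue-managed namespace, the validating
# webhook rejects the job if this label is missing.
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: train-example                      # hypothetical job name
  namespace: my-data-science-project       # hypothetical project namespace
  labels:
    kueue.x-k8s.io/queue-name: default     # target LocalQueue name
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      template:
        spec:
          containers:
          - name: pytorch
            image: quay.io/example/train:latest   # placeholder image
```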
8.2. Configuring workload management with Kueue
To use workload queuing in OpenShift AI, install the Red Hat build of Kueue Operator and activate the Kueue integration in OpenShift AI.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You are using OpenShift 4.18 or later.
- You have installed and configured the cert-manager Operator for Red Hat OpenShift for your cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
Procedure
- In a terminal window, log in to the OpenShift CLI (oc) as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>

- Install the Red Hat build of Kueue Operator on your OpenShift cluster as described in the Red Hat build of Kueue documentation.
- Activate the Kueue integration. You can use the predefined names for the default cluster queue and default local queue, or specify custom names.
  - To use the predefined queue names (default), run the following command. Replace <operator-namespace> with your operator namespace. The default operator namespace is redhat-ods-operator.

      $ oc patch datasciencecluster default-dsc --type='merge' -p '{"spec":{"components":{"kueue":{"managementState":"Unmanaged"}}}}' -n <operator-namespace>

  - To specify custom queue names, run the following command. Replace <example-cluster-queue> and <example-local-queue> with your custom queue names, and replace <operator-namespace> with your operator namespace. The default operator namespace is redhat-ods-operator.

      $ oc patch datasciencecluster default-dsc --type='merge' -p '{"spec":{"components":{"kueue":{"managementState":"Unmanaged","defaultClusterQueueName":"<example-cluster-queue>","defaultLocalQueueName":"<example-local-queue>"}}}}' -n <operator-namespace>
Verification
- Verify that the Red Hat build of Kueue pods are running:

    $ oc get pods -n openshift-kueue-operator

  You should see output similar to the following example:

    kueue-controller-manager-d9fc745df-ph77w     1/1   Running
    openshift-kueue-operator-69cfbf45cf-lwtpm    1/1   Running

- Verify that the default ClusterQueue was created:

    $ oc get clusterqueues
Next steps
- Configure quotas by creating and modifying ResourceFlavor, ClusterQueue, and LocalQueue objects. For details, see the Red Hat build of Kueue documentation.
- Enable Kueue in the dashboard so that users can select Kueue-enabled options when creating workloads. When you enable Kueue, you also enable Kueue management for all new projects created from the dashboard. See Enabling Kueue in the dashboard.
- Cluster administrators and OpenShift AI administrators can create hardware profiles so that users can submit workloads from the OpenShift AI dashboard. See Working with hardware profiles.
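A minimal quota configuration chains the three objects together: a ResourceFlavor describes a category of nodes, a ClusterQueue defines quota against that flavor, and a LocalQueue in a project namespace points at the ClusterQueue. All names and quota values below are illustrative:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-queue                 # hypothetical name
spec:
  namespaceSelector: {}            # admit workloads from any eligible namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 16
      - name: memory
        nominalQuota: 64Gi
      - name: nvidia.com/gpu
        nominalQuota: 4
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-local-queue           # hypothetical name
  namespace: my-data-science-project
spec:
  clusterQueue: team-queue
```

Workloads labeled with kueue.x-k8s.io/queue-name: team-local-queue are then admitted only while quota is available in team-queue.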
8.2.1. Enabling Kueue in the dashboard
Enable Kueue in the OpenShift AI dashboard so that users can select Kueue-enabled options when creating workloads.
When you enable Kueue in the dashboard, OpenShift AI automatically enables Kueue management for all new projects created from the dashboard. For these projects, OpenShift AI applies the kueue.openshift.io/managed=true label to the namespace and creates a LocalQueue object if one does not already exist. The LocalQueue object is created with the opendatahub.io/managed=false annotation, so it is not managed after creation. Cluster administrators can modify or delete it as needed. A validating webhook then enforces that any new or updated workload resource in a Kueue-enabled project includes the kueue.x-k8s.io/queue-name label.
For existing projects, or for projects created by using the OpenShift CLI (oc), you must enable Kueue management manually by applying the kueue.openshift.io/managed=true label to the project namespace:

$ oc label namespace <project-namespace> kueue.openshift.io/managed=true --overwrite
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You are using OpenShift 4.18 or later.
- You have installed and activated the Red Hat build of Kueue Operator, as described in Configuring workload management with Kueue.
- You have configured quotas, as described in the Red Hat build of Kueue documentation.
Procedure
- In a terminal window, log in to the OpenShift CLI (oc) as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>

- Update the odh-dashboard-config custom resource in the OpenShift AI applications namespace. Replace <applications-namespace> with your OpenShift AI applications namespace. The default is redhat-ods-applications.

    $ oc patch odhdashboardconfig odh-dashboard-config \
        -n <applications-namespace> \
        --type merge \
        -p '{"spec":{"dashboardConfig":{"disableHardwareProfiles":false,"disableKueue":false}}}'
Verification
- From the OpenShift AI dashboard, create a new project.
- Verify that the project namespace is labeled for Kueue management:

    $ oc get ns <project-namespace> -o jsonpath='{.metadata.labels.kueue\.openshift\.io/managed}{"\n"}'

  The output should be true.

- Confirm that a default LocalQueue exists for the project namespace:

    $ oc get localqueues -n <project-namespace>

- Create a test workload (for example, a Notebook) and verify that it includes the kueue.x-k8s.io/queue-name label.
Next step
- Cluster administrators and OpenShift AI administrators can create hardware profiles so that users can submit workloads from the OpenShift AI dashboard. See Working with hardware profiles.
8.3. Troubleshooting common problems with Kueue
If your users are experiencing errors in Red Hat OpenShift AI related to Kueue workloads, read this section to understand the likely causes and how to resolve them.
If the problem is not documented here or in the release notes, contact Red Hat Support.
8.3.1. A user receives a "failed to call webhook" error message for Kueue
Problem
After the user runs the cluster.apply() command, the following error is shown:
ApiException: (500)
Reason: Internal Server Error
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Internal error occurred: failed calling webhook \"mraycluster.kb.io\": failed to call webhook: Post \"https://kueue-webhook-service.redhat-ods-applications.svc:443/mutate-ray-io-v1-raycluster?timeout=10s\": no endpoints available for service \"kueue-webhook-service\"","reason":"InternalError","details":{"causes":[{"message":"failed calling webhook \"mraycluster.kb.io\": failed to call webhook: Post \"https://kueue-webhook-service.redhat-ods-applications.svc:443/mutate-ray-io-v1-raycluster?timeout=10s\": no endpoints available for service \"kueue-webhook-service\""}]},"code":500}
Diagnosis
The Kueue pod might not be running.
Resolution
- In the OpenShift console, select the user’s project from the Project list.
-
Click Workloads
Pods. - Verify that the Kueue pod is running. If necessary, restart the Kueue pod.
Review the logs for the Kueue pod to verify that the webhook server is serving, as shown in the following example:
{"level":"info","ts":"2024-06-24T14:36:24.255137871Z","logger":"controller-runtime.webhook","caller":"webhook/server.go:242","msg":"Serving webhook server","host":"","port":9443}
8.3.2. A user receives a "Default Local Queue … not found" error message
Problem
After the user runs the cluster.apply() command, the following error is shown:
Default Local Queue with kueue.x-k8s.io/default-queue: true annotation not found please create a default Local Queue or provide the local_queue name in Cluster Configuration.
Diagnosis
No default local queue is defined, and a local queue is not specified in the cluster configuration.
Resolution
Check whether a local queue exists in the user’s project, as follows:
- In the OpenShift console, select the user’s project from the Project list.
- Click Home → Search, and from the Resources list, select LocalQueue.
- If no local queues are found, create a local queue.
- Provide the user with the details of the local queues in their project, and advise them to add a local queue to their cluster configuration.
- Define a default local queue.
For information about creating a local queue and defining a default local queue, see Configuring quota management for distributed workloads.
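A default local queue is an ordinary LocalQueue that carries the annotation named in the error message. The following sketch uses hypothetical queue and namespace names:

```yaml
# Sketch: the kueue.x-k8s.io/default-queue annotation marks this LocalQueue
# as the default for workloads that do not specify a local_queue name.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default-queue                        # hypothetical name
  namespace: my-data-science-project         # user's project namespace
  annotations:
    kueue.x-k8s.io/default-queue: "true"
spec:
  clusterQueue: default                      # must reference an existing ClusterQueue
```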
8.3.3. A user receives a "local_queue provided does not exist" error message
Problem
After the user runs the cluster.apply() command, the following error is shown:
local_queue provided does not exist or is not in this namespace. Please provide the correct local_queue name in Cluster Configuration.
Diagnosis
An incorrect value is specified for the local queue in the cluster configuration, or an incorrect default local queue is defined. The specified local queue either does not exist, or exists in a different namespace.
Resolution
- In the OpenShift console, select the user’s project from the Project list.
- Click Search, and from the Resources list, select LocalQueue.
- Resolve the problem in one of the following ways:
  - If no local queues are found, create a local queue.
  - If one or more local queues are found, provide the user with the details of the local queues in their project. Advise the user to ensure that they spelled the local queue name correctly in their cluster configuration, and that the namespace value in the cluster configuration matches their project name.
- Define a default local queue.
For information about creating a local queue and defining a default local queue, see Configuring quota management for distributed workloads.
8.3.4. The pod provisioned by Kueue is terminated before the image is pulled
Problem
Kueue waits for a period of time for all of the workload pods to become provisioned and running before marking a workload as ready. By default, Kueue waits for 5 minutes. If the pod image is very large and is still being pulled after the 5-minute waiting period elapses, Kueue fails the workload and terminates the related pods.
Diagnosis
- In the OpenShift console, select the user’s project from the Project list.
- Click Workloads → Pods.
- Click the user’s pod name to open the pod details page.
- Click the Events tab, and review the pod events to check whether the image pull completed successfully.
Resolution
If the pod takes more than 5 minutes to pull the image, resolve the problem in one of the following ways:
- Add an OnFailure restart policy for resources that are managed by Kueue.
- Configure a custom timeout for the waitForPodsReady property in the Kueue custom resource (CR). The CR is installed in the openshift-kueue-operator namespace by the Red Hat build of Kueue Operator.
For more information about this configuration option, see Enabling waitForPodsReady in the Kueue documentation.
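As a sketch, the setting follows the upstream Kueue configuration format shown below. The exact field path in the Kueue CR provided by the Red Hat build of Kueue Operator may differ, so verify the values against the CR schema installed on your cluster:

```yaml
# Upstream Kueue configuration format (assumption: the Red Hat build of
# Kueue CR exposes equivalent fields; check your cluster's CR schema).
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
waitForPodsReady:
  enable: true
  timeout: 10m          # default is 5m; increase for large images
  blockAdmission: true  # hold new admissions while pods start
```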
8.4. Migrating to the Red Hat build of Kueue Operator
Starting with OpenShift AI 2.24, the embedded Kueue component for managing distributed workloads is deprecated.
OpenShift AI now uses the Red Hat build of Kueue Operator to provide enhanced workload scheduling for distributed training, workbench, and model serving workloads.
Check if your environment is using the embedded Kueue component by verifying the spec.components.kueue.managementState field in the DataScienceCluster custom resource. If the field is set to Managed, you must migrate to the Red Hat build of Kueue Operator before upgrading OpenShift AI to avoid controller conflicts and ensure continued support for queue-based workloads.
OpenShift AI does not automatically migrate workloads, and you cannot install both the embedded Kueue and the Red Hat build of Kueue Operator on the same cluster.
Prerequisites
- Your environment is currently using the embedded Kueue component. That is, the spec.components.kueue.managementState field in the DataScienceCluster custom resource is set to Managed.
  Note: If spec.components.kueue.managementState is set to Removed or Unmanaged, skip this migration.
- You have cluster administrator privileges for your OpenShift cluster.
- You are using OpenShift 4.18 or later.
- You have installed and configured the cert-manager Operator for Red Hat OpenShift for your cluster.
Procedure
- Optional: When you migrate from the embedded Kueue to the Red Hat build of Kueue, the OpenShift AI Operator automatically moves your existing Kueue configuration from the kueue-manager-config ConfigMap to the Kueue custom resource (CR).
  If you want to keep the kueue-manager-config ConfigMap for reference, run the following command. Replace <applications-namespace> with your OpenShift AI applications namespace. The default namespace is redhat-ods-applications.

    $ oc annotate configmap kueue-manager-config -n <applications-namespace> opendatahub.io/managed=false

- Log in to the OpenShift web console as a cluster administrator.
- Uninstall the embedded Kueue component to avoid potential configuration conflicts.
  Note: If you need to keep workloads running without interruption, you can skip this step. However, skipping it is not recommended because it might cause temporary configuration issues during the OpenShift AI upgrade.
  - In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  - Click the Data Science Cluster tab.
  - Click the default-dsc object.
  - Click the YAML tab.
  - Set spec.components.kueue.managementState to Removed as shown:

      spec:
        components:
          kueue:
            managementState: Removed

  - Click Save.
  - Wait for the OpenShift AI Operator to reconcile, and then verify that the embedded Kueue was removed:
    - On the Details tab of the default-dsc object, check that the KueueReady condition has a Status of False and a Reason of Removed.
    - Go to Workloads → Deployments, select the project where OpenShift AI is installed (for example, redhat-ods-applications), and confirm that Kueue-related deployments (for example, kueue-controller-manager) are no longer present.
Install the Red Hat build of Kueue Operator on your OpenShift cluster:
- Follow the steps to install the Red Hat build of Kueue Operator, as described in the Red Hat build of Kueue documentation.
- Go to Operators → Installed Operators and confirm that the Red Hat build of Kueue Operator is listed with Status as Succeeded.
Activate the Red Hat build of Kueue Operator in OpenShift AI:
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- Click the default-dsc object.
- Click the YAML tab.
- Set spec.components.kueue.managementState to Unmanaged. You can either use the predefined names (default) for the default cluster queue and default local queue, or specify custom names, as shown in the following examples.

  To use the predefined queue names, apply the following configuration:

    spec:
      components:
        kueue:
          managementState: Unmanaged

  To specify custom queue names, apply the following configuration, replacing <example-cluster-queue> and <example-local-queue> with your custom values:

    spec:
      components:
        kueue:
          managementState: Unmanaged
          defaultClusterQueueName: <example-cluster-queue>
          defaultLocalQueueName: <example-local-queue>

- Click Save.
- Enable Kueue management for existing projects by applying the kueue.openshift.io/managed=true label to each project namespace:

    $ oc label namespace <project-namespace> kueue.openshift.io/managed=true --overwrite

  Replace <project-namespace> with the name of your project.
  Note: Kueue validation and queue enforcement apply only to workloads in namespaces labeled with kueue.openshift.io/managed=true.
Verification
- Verify that the embedded Kueue component was removed.
- Verify that the DataScienceCluster resource shows a healthy Unmanaged status for Kueue.
- Verify that existing workloads in the queue continue to be processed by the Red Hat build of Kueue controllers. Submit a new test workload to confirm functionality.
Next steps
- Configure quotas by creating and modifying ResourceFlavor, ClusterQueue, and LocalQueue objects. For details, see the Red Hat build of Kueue documentation.
- Enable Kueue in the dashboard so that users can select Kueue-enabled options when creating workloads. When enabled, Kueue management is automatically applied to all new projects created from the dashboard. See Enabling Kueue in the dashboard.
- Cluster administrators and OpenShift AI administrators can create hardware profiles so that users can submit workloads from the OpenShift AI dashboard. See Working with hardware profiles.