Chapter 4. JobSet Operator


4.1. JobSet Operator overview

Use the JobSet Operator on OpenShift Container Platform to easily manage and run large-scale, coordinated workloads like high-performance computing (HPC) and AI training. The JobSet Operator can help you gain fast recovery and efficient resource use through features like multi-template job support and stable networking.

Important

JobSet Operator is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

4.1.1. About the JobSet Operator

Use the JobSet Operator on OpenShift Container Platform to manage large, distributed, and coordinated computing workloads, such as high-performance computing (HPC) or artificial intelligence (AI) training, and gain automatic stability, coordination, and failure recovery.

The JobSet Operator is based on the JobSet open source project.

The JobSet Operator is designed to manage a group of jobs as a single, coordinated unit. This is especially useful for workloads such as HPC and the training of massive AI models, where you need a team of machines to run for hours or days.

You can use the JobSet Operator to solve problems that are too big or too complex for a standard OpenShift Container Platform job. The JobSet Operator provides coordination, stability, and recovery.

The JobSet Operator automatically sets up a stable headless service so that workers can find and communicate with each other, even after a failure and restart. It also provides automatic failure recovery: if one small part of a large training job fails, you can configure the Operator to restart the entire group of workers from a saved checkpoint. This saves time and computing costs.

The JobSet Operator offers startup control, which lets you define a specific startup sequence to ensure that dependencies are met. For example, you can make sure that the leader is running before any workers attempt to connect.
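
As an illustration of the networking, failure recovery, and startup control features described above, the following shell sketch writes a hypothetical JobSet manifest to a file. The `jobset.x-k8s.io/v1alpha2` API version and the field names (`network.enableDNSHostnames`, `startupPolicy`, `failurePolicy`, `replicatedJobs`) follow the upstream JobSet project and are assumptions for the version shipped with this Operator. The resource name, image, and commands are placeholders.

```shell
#!/bin/sh
# Sketch only: writes a hypothetical JobSet manifest; nothing is applied to a cluster.
# Assumptions: API version and field names follow the upstream JobSet project;
# the names, image, and commands below are placeholders.
cat > jobset-sketch.yaml <<'EOF'
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: training-sketch            # hypothetical name
spec:
  network:
    enableDNSHostnames: true       # stable hostnames through a headless service
  startupPolicy:
    startupPolicyOrder: InOrder    # start replicated jobs in the order listed
  failurePolicy:
    maxRestarts: 3                 # recreate all jobs up to 3 times on failure
  replicatedJobs:
  - name: leader
    replicas: 1
    template:
      spec:
        template:
          spec:
            restartPolicy: Never
            containers:
            - name: leader
              image: registry.example.com/trainer:latest   # placeholder image
              command: ["sleep", "3600"]
  - name: workers
    replicas: 1
    template:
      spec:
        parallelism: 4
        completions: 4
        template:
          spec:
            restartPolicy: Never
            containers:
            - name: worker
              image: registry.example.com/trainer:latest   # placeholder image
              command: ["sleep", "3600"]
EOF
echo "Wrote jobset-sketch.yaml"
```

After reviewing the file, you could apply it with `oc apply -f jobset-sketch.yaml` in a project of your choice. The sketch only sets a restart limit; saving and resuming from a checkpoint is assumed to be handled by the training application itself.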

JobSet Operator makes managing large, distributed, and coordinated computing tasks on OpenShift Container Platform easier, turning many individual components into one resilient and manageable system.

4.2. Installing the JobSet Operator

Install the JobSet Operator on OpenShift Container Platform to enable management of large-scale, coordinated computing workloads, giving your applications a unified API and failure recovery.

4.2.1. Installing the JobSet Operator

Install the JobSet Operator on OpenShift Container Platform using the web console to begin managing large-scale, coordinated computing workloads.

Prerequisites

  • You have access to the cluster with cluster-admin privileges.
  • You have access to the OpenShift Container Platform web console.
  • You have installed the cert-manager Operator for Red Hat OpenShift.

Procedure

  1. Log in to the OpenShift Container Platform web console.
  2. Verify that the cert-manager Operator for Red Hat OpenShift is installed.
  3. Install the JobSet Operator.

    1. Navigate to Ecosystem → Software Catalog.
    2. Search for and select the openshift-operators project.
    3. Enter JobSet Operator into the filter box.
    4. Select the JobSet Operator and click Install.
    5. On the Install Operator page:

      1. The Update channel is set to tech-preview-v0.1, which installs the latest stable release of JobSet Operator 0.1.
      2. Under Installation mode, select A specific namespace on the cluster.
      3. Under Installed Namespace, select Operator recommended Namespace: openshift-jobset-operator.
      4. Under Update approval, select one of the following update strategies:

        • The Automatic strategy allows Operator Lifecycle Manager (OLM) to automatically update the Operator when a new version is available.
        • The Manual strategy requires a user with appropriate credentials to approve the Operator update.
      5. Click Install.
  4. Create the custom resource (CR) for the JobSet Operator:

    1. Navigate to Installed Operators → JobSet Operator.
    2. Under Provided APIs, click Create instance in the JobSetOperator pane. The Create JobSetOperator page opens.
    3. Set the name to cluster.
    4. Set the managementState to Managed.
    5. Click Create.
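
The console procedure above can also be sketched as manifests for use with the CLI. The following shell sketch only writes the files and does not contact a cluster. The namespace, channel, installation mode, CR name, and `managementState` come from the procedure. The catalog package name is not stated in this document, so `PACKAGE_NAME` is a placeholder that you must look up. The `redhat-operators` catalog source, the `operator.openshift.io/v1` API version for the CR, and the OperatorGroup and Subscription names are assumptions.

```shell
#!/bin/sh
# Sketch only: writes manifests equivalent to the console steps; nothing is applied.
# PACKAGE_NAME is a placeholder: the catalog package name is not stated in this
# document. Look it up first, for example with:
#   oc get packagemanifests -n openshift-marketplace | grep -i jobset
PACKAGE_NAME="${PACKAGE_NAME:-REPLACE_WITH_PACKAGE_NAME}"

cat > jobset-operator-install.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-jobset-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-jobset-operator      # hypothetical name
  namespace: openshift-jobset-operator
spec:
  targetNamespaces:
  - openshift-jobset-operator          # "A specific namespace on the cluster"
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: jobset-operator                # hypothetical name
  namespace: openshift-jobset-operator
spec:
  channel: tech-preview-v0.1
  name: ${PACKAGE_NAME}
  source: redhat-operators             # assumed catalog source
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic       # or Manual
EOF

# Assumption: the JobSetOperator CR is served under operator.openshift.io/v1;
# confirm with: oc api-resources | grep -i jobset
cat > jobset-operator-cr.yaml <<'EOF'
apiVersion: operator.openshift.io/v1
kind: JobSetOperator
metadata:
  name: cluster
spec:
  managementState: Managed
EOF
echo "Wrote jobset-operator-install.yaml and jobset-operator-cr.yaml"
```

With the placeholder replaced, you could apply `jobset-operator-install.yaml` first with `oc apply -f`, wait for the Operator to install, and then apply `jobset-operator-cr.yaml`. Whether the CR is namespaced is not stated here, so check the API resource before applying it.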

Verification

  • Check that the JobSet Operator and operand pods are running by entering the following command:

    $ oc get pod -n openshift-jobset-operator

    Example output

    NAME                                        READY   STATUS    RESTARTS   AGE
    jobset-controller-manager-5595547fb-b4g2x   1/1     Running   0          48s
    jobset-operator-596cb848c6-q2dmp            1/1     Running   0          2m33s
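
If you want to script this verification, the check can be sketched as a small shell function that reads the table printed by `oc get pod` and succeeds only when every listed pod is Running with all containers ready. The function name `pods_all_ready` is hypothetical, and the demonstration uses the example output above rather than a live cluster.

```shell
#!/bin/sh
# Sketch only: reads `oc get pod` table output on stdin and succeeds when at
# least one pod is listed and every pod is Running with all containers ready.
# The function name pods_all_ready is hypothetical. It expects the header row,
# so do not combine it with --no-headers.
pods_all_ready() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF == 0 { next }                      # skip blank lines
    {
      seen++
      split($2, ready, "/")               # READY column, for example 1/1
      if ($3 != "Running" || ready[1] != ready[2]) bad++
    }
    END { if (seen > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Demonstration with the example output instead of a live cluster.
sample='NAME                                        READY   STATUS    RESTARTS   AGE
jobset-controller-manager-5595547fb-b4g2x   1/1     Running   0          48s
jobset-operator-596cb848c6-q2dmp            1/1     Running   0          2m33s'

if printf '%s\n' "$sample" | pods_all_ready; then
  echo "all pods ready"
else
  echo "pods not ready"
fi
```

On a live cluster, the same check could be run as `oc get pod -n openshift-jobset-operator | pods_all_ready`.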

4.3. JobSet Operator release notes

Track the development, features, and fixes for the JobSet Operator, which manages coordinated, large-scale computing workloads on OpenShift Container Platform.

For more information, see About the JobSet Operator.

4.3.1. Release notes for JobSet Operator 0.1.0

Review the new features and advisories for the initial Technology Preview release of JobSet Operator 0.1.0.

Issued: 4 November 2025

The following advisories are available for the JobSet Operator 0.1.0:

4.3.1.1. New features and enhancements

  • This is the initial Technology Preview release of the JobSet Operator.