Chapter 1. Enabling data science pipelines 2.0


From OpenShift AI version 2.9, data science pipelines are based on Kubeflow Pipelines (KFP) version 2.0. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.

Note

The PipelineConf class is deprecated, and there is no KFP 2.0 equivalent.

Important

Data science pipelines 2.0 contains an installation of Argo Workflows. OpenShift AI does not support direct customer usage of this installation of Argo Workflows.

To install or upgrade to OpenShift AI 2.9 or later with data science pipelines, ensure that your cluster does not have an existing installation of Argo Workflows that is not installed by OpenShift AI.

Argo Workflows resources that are created by OpenShift AI have the following labels in the OpenShift Console under Administration > CustomResourceDefinitions, in the argoproj.io group:

    labels:
      app.kubernetes.io/part-of: data-science-pipelines-operator
      app.opendatahub.io/data-science-pipelines-operator: 'true'
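
As an illustration, the following Python sketch shows one way to use these labels to look for a conflicting installation. It assumes that you have exported the CustomResourceDefinitions of your cluster as JSON (for example, the items list from `oc get crd -o json`) and loaded them into a list of dictionaries. The function names are hypothetical and are not part of OpenShift AI.

```python
from __future__ import annotations

# Labels that this chapter lists for Argo Workflows resources created by OpenShift AI.
EXPECTED_LABELS = {
    "app.kubernetes.io/part-of": "data-science-pipelines-operator",
    "app.opendatahub.io/data-science-pipelines-operator": "true",
}


def is_managed_by_openshift_ai(crd: dict) -> bool:
    """Return True if the CRD carries both expected labels (hypothetical helper)."""
    labels = crd.get("metadata", {}).get("labels") or {}
    return all(labels.get(key) == value for key, value in EXPECTED_LABELS.items())


def find_unmanaged_argo_crds(crds: list[dict]) -> list[str]:
    """Return the names of argoproj.io CRDs that lack the OpenShift AI labels."""
    return [
        crd["metadata"]["name"]
        for crd in crds
        if crd.get("spec", {}).get("group") == "argoproj.io"
        and not is_managed_by_openshift_ai(crd)
    ]
```

If `find_unmanaged_argo_crds` returns any names, those resources likely belong to a separate installation of Argo Workflows and are worth reviewing before you install or upgrade.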

1.1. Installing OpenShift AI with data science pipelines 2.0

To install OpenShift AI 2.9 or later with data science pipelines, first ensure that your cluster has no installation of Argo Workflows other than one installed by data science pipelines. Then follow the installation steps described in Installing and uninstalling OpenShift AI Self-Managed or, for disconnected environments, in Installing and uninstalling Red Hat OpenShift AI in a disconnected environment.

If there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, data science pipelines will be disabled after you install OpenShift AI 2.9 or later.

To enable data science pipelines, remove the separate installation of Argo Workflows from your cluster. Data science pipelines will be enabled automatically.

1.2. Upgrading to data science pipelines 2.0

Important

After you upgrade to OpenShift AI 2.9 or later, pipelines created with data science pipelines 1.0 continue to run, but are inaccessible from the OpenShift AI dashboard. If you are a current data science pipelines user, do not upgrade to OpenShift AI with data science pipelines 2.0 until you are ready to migrate to the new pipelines solution.

To upgrade to data science pipelines 2.0, follow these steps:

  1. Ensure that your cluster does not have an existing installation of Argo Workflows that is not installed by OpenShift AI, and then follow the upgrade steps described in Upgrading OpenShift AI Self-Managed, or for disconnected environments, Upgrading Red Hat OpenShift AI in a disconnected environment.

    If you upgrade to OpenShift AI 2.9 or later with data science pipelines enabled, and there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, OpenShift AI components will not be upgraded. To complete the component upgrade, disable data science pipelines or remove the separate installation of Argo Workflows from your cluster. The component upgrade will then complete automatically.

  2. Update your workbenches to use the notebook image version 2024.1 or later. For more information, see Updating a project workbench.
  3. Manually migrate your pipelines from data science pipelines 1.0 to 2.0. For more information, see Migrating pipelines from data science pipelines 1.0 to 2.0.
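
For step 2, a small Python sketch such as the following could help you decide which workbenches need a newer notebook image. It assumes that image versions follow the `YEAR.RELEASE` form used in this chapter (for example, 2024.1). The function names are hypothetical.

```python
from __future__ import annotations


def parse_image_version(tag: str) -> tuple[int, int]:
    """Split a version such as '2024.1' into (2024, 1). Assumes a YEAR.RELEASE form."""
    year, release = tag.strip().split(".")
    return int(year), int(release)


def needs_update(tag: str, minimum: str = "2024.1") -> bool:
    """Return True if the notebook image version is older than the minimum."""
    return parse_image_version(tag) < parse_image_version(minimum)
```

Comparing the parsed integer pairs rather than the raw strings avoids ordering mistakes with multi-digit release numbers.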

1.3. Migrating pipelines from data science pipelines 1.0 to 2.0

OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. To use existing pipelines with data science pipelines 2.0, you must manually migrate them.

  1. On OpenShift AI 2.9 or later, create a new data science project.
  2. Configure a new pipeline server.
  3. Update and recompile your data science pipelines 1.0 pipelines as described in Migrate from KFP SDK v1: v1 to v2 migration instructions and breaking changes.

    Note

    Data science pipelines 2.0 does not use the kfp-tekton library. In most cases, you can replace usage of kfp-tekton with the kfp library.

  4. Import your updated pipelines to your new data science pipelines 2.0-based data science project.
  5. (Optional) Remove your data science pipelines 1.0 pipeline server.

Important

Data science pipelines 1.0 used the kfp-tekton Python library. Data science pipelines 2.0 does not use kfp-tekton. You can uninstall kfp-tekton when there are no remaining data science pipelines 1.0 pipeline servers in use on your cluster.

For data science pipelines 2.0, use the latest version of the KFP SDK. For more information, see the Kubeflow Pipelines SDK API Reference.
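
When you update your pipeline code, it can help to locate every place where it still imports kfp-tekton. The following Python sketch, which uses only the standard library, parses a source file and reports imports of the `kfp_tekton` module. The function name is hypothetical, and the sketch only finds imports; it does not rewrite code or detect other KFP SDK v1 constructs.

```python
from __future__ import annotations

import ast


def find_kfp_tekton_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module name) for each kfp_tekton import in the source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            # Match the top-level package and any of its submodules.
            if module == "kfp_tekton" or module.startswith("kfp_tekton."):
                hits.append((node.lineno, module))
    return sorted(hits)
```

You might run this function over each pipeline file in a repository and treat any non-empty result as a file that still needs migration to the kfp library.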

1.4. Accessing data science pipelines 1.0 pipelines and history

You can view historical data science pipelines 1.0 pipeline run information in the OpenShift Console under Pipelines > Project > PipelineRuns.

You can still connect to the KFP API server by using the kfp-tekton SDK for programmatic access to your pipelines and pipeline run history. For more information, see Kubeflow Pipelines SDK for Tekton.

1.5. Uninstalling the OpenShift Pipelines Operator

After your migration to data science pipelines 2.0 is complete, you can remove the OpenShift Pipelines Operator if you do not use OpenShift Pipelines for any purpose other than data science pipelines 1.0.

Important

Before removing the OpenShift Pipelines Operator, ensure that migration of your data science pipelines 1.0 pipelines to 2.0 is complete, and that there are no remaining data science pipelines 1.0 pipeline servers in use on your cluster.
