Chapter 1. Enabling data science pipelines 2.0
From OpenShift AI version 2.9, data science pipelines are based on Kubeflow Pipelines (KFP) version 2.0. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
The PipelineConf class is deprecated, and there is no KFP 2.0 equivalent.
Data science pipelines 2.0 contains an installation of Argo Workflows. OpenShift AI does not support direct customer usage of this installation of Argo Workflows.
To install or upgrade to OpenShift AI 2.9 or later with data science pipelines, ensure that your cluster does not have an existing installation of Argo Workflows that is not installed by OpenShift AI.
Argo Workflows resources that are created by OpenShift AI have the following labels in the OpenShift Console under Administration > CustomResourceDefinitions, in the argoproj.io group:
labels:
  app.kubernetes.io/part-of: data-science-pipelines-operator
  app.opendatahub.io/data-science-pipelines-operator: 'true'
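The label check described above can be sketched in Python. This is a minimal illustration, not part of OpenShift AI: it assumes you have already exported the cluster's CustomResourceDefinitions as JSON in the standard Kubernetes list shape (an `items` array of objects with `metadata` and `spec.group`). The function names and the sample data are hypothetical. Note that other Argo projects, such as Argo CD, also register CRDs in the argoproj.io group, so treat the result as a list of candidates to review rather than proof of a separate Argo Workflows installation.

```python
from typing import List

# Labels that OpenShift AI applies to the Argo Workflows resources it creates.
EXPECTED_LABELS = {
    "app.kubernetes.io/part-of": "data-science-pipelines-operator",
    "app.opendatahub.io/data-science-pipelines-operator": "true",
}


def is_managed_by_openshift_ai(crd: dict) -> bool:
    """Return True if the CRD carries every expected label with the expected value."""
    labels = crd.get("metadata", {}).get("labels") or {}
    return all(labels.get(key) == value for key, value in EXPECTED_LABELS.items())


def find_unmanaged_argo_crds(crd_list: dict) -> List[str]:
    """Return the names of argoproj.io CRDs that lack the OpenShift AI labels."""
    unmanaged = []
    for crd in crd_list.get("items", []):
        if crd.get("spec", {}).get("group") != "argoproj.io":
            continue  # Only CRDs in the argoproj.io group are of interest here.
        if not is_managed_by_openshift_ai(crd):
            unmanaged.append(crd["metadata"]["name"])
    return unmanaged


# Hypothetical sample input, for illustration only.
SAMPLE_CRD_LIST = {
    "items": [
        {
            "metadata": {
                "name": "workflows.argoproj.io",
                "labels": dict(EXPECTED_LABELS),
            },
            "spec": {"group": "argoproj.io"},
        },
        {
            "metadata": {"name": "workflowtemplates.argoproj.io"},
            "spec": {"group": "argoproj.io"},
        },
        {
            "metadata": {"name": "routes.route.openshift.io"},
            "spec": {"group": "route.openshift.io"},
        },
    ]
}

if __name__ == "__main__":
    for name in find_unmanaged_argo_crds(SAMPLE_CRD_LIST):
        print(f"Review before installing or upgrading: {name}")
```

In the sample, only the argoproj.io CRD without labels is reported; the labeled CRD and the CRD from another API group are skipped.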
1.1. Installing OpenShift AI with data science pipelines 2.0
To install OpenShift AI 2.9 or later with data science pipelines, ensure that there is no installation of Argo Workflows that is not installed by data science pipelines on your cluster, and follow the installation steps described in Installing and uninstalling OpenShift AI Self-Managed, or for disconnected environments, see Installing and uninstalling Red Hat OpenShift AI in a disconnected environment.
If there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, data science pipelines will be disabled after you install OpenShift AI 2.9 or later.
To enable data science pipelines, remove the separate installation of Argo Workflows from your cluster. Data science pipelines will be enabled automatically.
1.2. Upgrading to data science pipelines 2.0
After you upgrade to OpenShift AI 2.9 or later, pipelines created with data science pipelines 1.0 continue to run, but are inaccessible from the OpenShift AI dashboard. If you are a current data science pipelines user, do not upgrade to OpenShift AI with data science pipelines 2.0 until you are ready to migrate to the new pipelines solution.
To upgrade to data science pipelines 2.0, follow these steps:
- Ensure that your cluster does not have an existing installation of Argo Workflows that is not installed by OpenShift AI, and then follow the upgrade steps described in Upgrading OpenShift AI Self-Managed, or for disconnected environments, Upgrading Red Hat OpenShift AI in a disconnected environment.
  If you upgrade to OpenShift AI 2.9 or later with data science pipelines enabled, and there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, OpenShift AI components will not be upgraded. To complete the component upgrade, disable data science pipelines or remove the separate installation of Argo Workflows from your cluster. The component upgrade will then complete automatically.
- Update your workbenches to use the notebook image version 2024.1 or later. For more information, see Updating a project workbench.
- Manually migrate your pipelines from data science pipelines 1.0 to 2.0. For more information, see Migrating pipelines from data science pipelines 1.0 to 2.0.
1.3. Migrating pipelines from data science pipelines 1.0 to 2.0
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. To use existing pipelines with data science pipelines 2.0, you must manually migrate them.
- On OpenShift AI 2.9 or later, create a new data science project.
- Configure a new pipeline server.
- Update and recompile your data science pipelines 1.0 pipelines as described in Migrate from KFP SDK v1: v1 to v2 migration instructions and breaking changes.
  Note: Data science pipelines 2.0 does not use the kfp-tekton library. In most cases, you can replace usage of kfp-tekton with the kfp library.
- Import your updated pipelines to your new data science pipelines 2.0-based data science project.
- (Optional) Remove your data science pipelines 1.0 pipeline server.
Data science pipelines 1.0 used the kfp-tekton Python library. Data science pipelines 2.0 does not use kfp-tekton. You can uninstall kfp-tekton when there are no remaining data science pipelines 1.0 pipeline servers in use on your cluster.
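Before uninstalling kfp-tekton, it can help to find the pipeline sources that still import it. The following is a minimal sketch that uses only the Python standard library. It assumes that the kfp-tekton package is imported under the module name `kfp_tekton`; the function name and the example source string are hypothetical.

```python
import ast
from typing import List


def find_kfp_tekton_imports(source: str) -> List[str]:
    """Return the sorted, de-duplicated kfp_tekton modules imported by the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Handles "import kfp_tekton" and "import kfp_tekton.compiler".
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Handles "from kfp_tekton.compiler import TektonCompiler".
            names = [node.module]
        else:
            continue
        found.extend(
            name for name in names
            if name == "kfp_tekton" or name.startswith("kfp_tekton.")
        )
    return sorted(set(found))


# Hypothetical data science pipelines 1.0 source, for illustration only.
EXAMPLE_V1_SOURCE = """
import kfp
import kfp_tekton
from kfp_tekton.compiler import TektonCompiler
"""

if __name__ == "__main__":
    for module in find_kfp_tekton_imports(EXAMPLE_V1_SOURCE):
        print(f"Still imports: {module}")
```

Because the sketch parses the source instead of importing it, it works even in an environment where kfp-tekton is already uninstalled. It does not detect dynamic imports.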
For data science pipelines 2.0, use the latest version of the KFP SDK. For more information, see the Kubeflow Pipelines SDK API Reference.
1.4. Accessing data science pipelines 1.0 pipelines and history
You can view historical data science pipelines 1.0 pipeline run information in the OpenShift Console under Pipelines > Project > PipelineRuns.
You can still connect to the KFP API server by using the kfp-tekton SDK for programmatic access to your pipelines and pipeline run history. For more information, see Kubeflow Pipelines SDK for Tekton.
1.5. Uninstalling the OpenShift Pipelines Operator
When your migration to data science pipelines 2.0 is complete, and if you are not using OpenShift Pipelines for any purpose other than data science pipelines 1.0, you can remove the OpenShift Pipelines Operator.
Before removing the OpenShift Pipelines Operator, ensure that migration of your data science pipelines 1.0 pipelines to 2.0 is complete, and that there are no remaining data science pipelines 1.0 pipeline servers in use on your cluster.