Chapter 1. Managing AI pipelines


1.1. Configuring a pipeline server

Before you can successfully create an AI pipeline in OpenShift AI, you must configure a pipeline server. This task includes configuring where your pipeline artifacts and data are stored.

Note

You are not required to specify any storage directories when configuring a connection for your pipeline server. When you import a pipeline, the /pipelines folder is created in the root folder of the bucket, containing a YAML file for the pipeline. If you upload a new version of the same pipeline, a new YAML file with a different ID is added to the /pipelines folder.

When you run a pipeline, the artifacts are stored in the /pipeline-name folder in the root folder of the bucket.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have created a project that you can add a pipeline server to.
  • You have an existing S3-compatible object storage bucket and you have configured write access to your S3 bucket on your storage account.
  • If you are configuring a pipeline server for production pipeline workloads, you have an existing external MySQL or MariaDB database.
  • If you are configuring a pipeline server with an external MySQL database, your database must use at least MySQL version 5.x. However, Red Hat recommends that you use MySQL version 8.x.

    Note

    The mysql_native_password authentication plugin is required for the ML Metadata component to successfully connect to your database. mysql_native_password is disabled by default in MySQL 8.4 and later. If your database uses MySQL 8.4 or later, you must update your MySQL deployment to enable the mysql_native_password plugin.

    For more information about enabling the mysql_native_password plugin, see Native Pluggable Authentication in the MySQL documentation.

  • If you are configuring a pipeline server with a MariaDB database, your database must use MariaDB version 10.3 or later. However, Red Hat recommends that you use at least MariaDB version 10.5.
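
For a self-managed MySQL 8.4 or later deployment, one way to enable the plugin is through the server configuration file. The following is a sketch only; option names and defaults depend on your MySQL distribution and deployment method, so verify against the MySQL documentation:

```ini
# my.cnf sketch (assumes a self-managed MySQL 8.4+ server).
# Re-enables the deprecated mysql_native_password authentication
# plugin, which is disabled by default in MySQL 8.4 and later.
[mysqld]
mysql_native_password=ON
```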

Procedure

  1. From the OpenShift AI dashboard, click Projects.
  2. On the Projects page, click the name of the project that you want to configure a pipeline server for.

    The project details page opens.

  3. Click the Pipelines tab.
  4. Click Configure pipeline server.

    The Configure pipeline server dialog opens.

  5. In the Object storage connection section, provide values for the mandatory fields:

    1. In the Access key field, enter the access key ID for the S3-compatible object storage provider.
    2. In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
    3. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
    4. In the Region field, enter the default region of your S3-compatible object storage account.
    5. In the Bucket field, enter the name of your S3-compatible object storage bucket.

      Important

      If you specify incorrect connection settings, you cannot update these settings on the same pipeline server. Therefore, you must delete the pipeline server and configure another one.

      If you want to use an existing artifact that was not generated by a task in a pipeline, you can use the kfp.dsl.importer component to import the artifact from its URI. You can only import these artifacts to the S3-compatible object storage bucket that you define in the Bucket field in your pipeline server configuration. For more information about the kfp.dsl.importer component, see Special Case: Importer Components.

  6. Click Advanced settings to display the Database, Pipeline definition storage, and Pipeline caching sections.
  7. In the Database section, choose one of the following options to specify where to store your pipeline metadata and run information:

    • Select Default database on the cluster to deploy a MariaDB database in your project.

      Important

      The Default database on the cluster option is intended for development and testing purposes only. For production pipeline workloads, select the External MySQL database option to use an external MySQL or MariaDB database.

    • Select External MySQL database to add a new connection to an external MySQL or MariaDB database that your pipeline server can access.

      1. In the Host field, enter the database hostname.
      2. In the Port field, enter the database port.
      3. In the Username field, enter the default user name that is connected to the database.
      4. In the Password field, enter the password for the default user account.
      5. In the Database field, enter the database name.
  8. Optional: By default, pipeline definitions are stored as Kubernetes resources, enabling version control, GitOps workflows, and integration with OpenShift GitOps or similar tools. To store pipeline definitions in the internal database instead, clear the Store pipeline definitions in Kubernetes checkbox in the Pipeline definition storage section.
  9. Optional: By default, caching is configurable at both the pipeline and task levels. To disable caching for all pipelines and tasks in the pipeline server and override any pipeline-level and task-level caching settings, clear the Allow caching to be configured per pipeline and task checkbox in the Pipeline caching section.
  10. Click Configure pipeline server.

Verification

On the Pipelines tab for the project:

  • The Import pipeline button is available.
  • When you click the action menu (⋮) and then click Manage pipeline server configuration, the pipeline server details are displayed.

1.1.1. Configuring a pipeline server with an external Amazon RDS database

To configure a pipeline server with an external Amazon Relational Database Service (RDS) database, you must configure OpenShift AI to trust the certificates issued by its certificate authorities (CAs).

Important

If you are configuring a pipeline server for production pipeline workloads, Red Hat recommends that you use an external MySQL or MariaDB database.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have logged in to Red Hat OpenShift AI.
  • You have created a project that you can add a pipeline server to.
  • You have an existing S3-compatible object storage bucket, and you have configured your storage account with write access to your S3 bucket.

Procedure

  1. Before configuring your pipeline server, from Amazon RDS: Certificate bundles by AWS Region, download the PEM certificate bundle for the region that the database was created in.

    For example, if the database was created in the us-east-1 region, download us-east-1-bundle.pem.

  2. In a terminal window, log in to the OpenShift cluster where OpenShift AI is deployed.

    oc login api.<cluster_name>.<cluster_domain>:6443 --web
  3. Run the following command to fetch the current OpenShift AI trusted CA configuration and store it in a new file:

    oc get dscinitializations.dscinitialization.opendatahub.io default-dsci -o json | jq -r '.spec.trustedCABundle.customCABundle' > /tmp/my-custom-ca-bundles.crt
  4. Run the following command to append the PEM certificate bundle that you downloaded to the new custom CA configuration file:

    cat us-east-1-bundle.pem >> /tmp/my-custom-ca-bundles.crt
  5. Run the following command to update the OpenShift AI trusted CA configuration to trust certificates issued by the CAs included in the new custom CA configuration file:

    oc patch dscinitialization default-dsci --type='json' -p='[{"op":"replace","path":"/spec/trustedCABundle/customCABundle","value":"'"$(awk '{printf "%s\\n", $0}' /tmp/my-custom-ca-bundles.crt)"'"}]'
  6. Configure a pipeline server, as described in Configuring a pipeline server.
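
The awk program in step 5 replaces each literal newline with the two-character sequence \n so that the multi-line PEM bundle fits in a single JSON string value. A minimal Python sketch of the same transformation (the certificate content shown is illustrative):

```python
# Sketch: escape newlines so a multi-line PEM bundle can be embedded in a
# single-line JSON patch value, as the awk program in step 5 does.
bundle = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"
escaped = "".join(f"{line}\\n" for line in bundle.splitlines())
print(escaped)
```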

Verification

  • The pipeline server starts successfully.
  • You can import and run AI pipelines.

1.2. Defining a pipeline

The Kubeflow Pipelines SDK enables you to define end-to-end machine learning and AI pipelines. Use the latest Kubeflow Pipelines 2.0 SDK to build your AI pipeline in Python code. After you have built your pipeline, use the SDK to compile it into an Intermediate Representation (IR) YAML file. For more information about compiling pipelines, see Compiling the pipeline YAML with the Kubeflow Pipelines SDK and Compiling Kubernetes-native manifests with the Kubeflow Pipelines SDK. Compiling to Kubernetes-native manifests is optional and applies only when your pipeline server is configured to use Kubernetes API storage. After defining the pipeline, you can import the YAML file to the OpenShift AI dashboard to enable you to configure its execution settings.

Important

If you are using OpenShift AI on a cluster running in FIPS mode, any custom container images for AI pipelines must be based on UBI 9 or RHEL 9. This ensures compatibility with FIPS-approved pipeline components and prevents errors related to mismatched OpenSSL or GNU C Library (glibc) versions.
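
For example, a custom pipeline component image might start from a UBI 9 base image. The following is a sketch only; the base image, tag, and installed packages are illustrative, not prescriptive:

```dockerfile
# Sketch of a FIPS-compatible custom image; base image and tag are
# illustrative examples, not a supported recommendation.
FROM registry.access.redhat.com/ubi9/python-311:latest
RUN pip install --no-cache-dir kfp
```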

You can also use the Elyra JupyterLab extension to create and run AI pipelines within JupyterLab. For more information about creating pipelines in JupyterLab, see Working with pipelines in JupyterLab. For more information about the Elyra JupyterLab extension, see Elyra Documentation.

1.2.1. Compiling the pipeline YAML with the Kubeflow Pipelines SDK

Before you can define your pipeline in the cluster, you must convert your Python-defined pipeline into YAML format. You can use the Kubeflow Pipelines (KFP) Software Development Kit (SDK) to compile your pipeline code into a deployable YAML file for declarative GitOps deployment.

Prerequisites

  • You have installed Python 3.11 or later in your local environment.
  • You have installed the Kubeflow Pipelines SDK package (kfp) version 2.14.3 or later.
  • You have a valid Python pipeline definition file.

Procedure

Compile your pipeline by using the KFP SDK to generate the pipeline YAML file.

In the following example, replace <pipeline_file>.py with the name of your Python pipeline file and specify an output file for the compiled YAML:

$ kfp dsl compile \
    --py <pipeline_file>.py \
    --output <compiled_pipeline_file>.yaml
Note

The generated <compiled_pipeline_file>.yaml file contains the compiled pipeline specification in YAML format. You can use this content as the value of the pipelineSpec field when you create a PipelineVersion custom resource (CR). You can also store the file in Git for declarative or GitOps-based deployment.
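
As a sketch of how the compiled output can be embedded, the following Python builds a PipelineVersion manifest by indenting the compiled IR YAML under spec.pipelineSpec. All names and the sample spec are placeholders; in practice you would read the compiled file from disk:

```python
# Sketch: wrap compiled KFP IR YAML into a PipelineVersion manifest.
# All names and the sample spec below are placeholders; in practice,
# read the compiled YAML from the file produced by `kfp dsl compile`.
import textwrap

def build_pipeline_version(name, namespace, pipeline_name, spec_yaml):
    """Indent the compiled spec under spec.pipelineSpec of a PipelineVersion."""
    indented = textwrap.indent(spec_yaml.rstrip(), "    ")
    return (
        "apiVersion: pipelines.kubeflow.org/v2beta1\n"
        "kind: PipelineVersion\n"
        "metadata:\n"
        f"  name: {name}\n"
        f"  namespace: {namespace}\n"
        "spec:\n"
        f"  pipelineName: {pipeline_name}\n"
        f"  displayName: {name}\n"
        "  pipelineSpec:\n"
        f"{indented}\n"
    )

# Stand-in for the contents of <compiled_pipeline_file>.yaml
compiled = "pipelineInfo:\n  name: my-pipeline\nroot:\n  dag: {}\n"
print(build_pipeline_version("my-pipeline-v1", "my-project", "my-pipeline", compiled))
```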

Verification

Verify that the generated file includes a pipelineSpec key followed by the compiled pipeline definition:

$ head -n 10 <compiled_pipeline_file>.yaml

1.2.2. Compiling Kubernetes-native manifests with the Kubeflow Pipelines SDK

If your pipeline server uses the Kubernetes native API mode, you can compile your pipeline directly to Kubernetes manifests. The output includes Pipeline and PipelineVersion custom resources with spec.pipelineSpec and, when you use Kubernetes resource configuration, an optional spec.platformSpec.

Prerequisites

  • You have installed Python 3.11 or later in your local environment.
  • You have installed the Kubeflow Pipelines SDK package (kfp) version 2.14.3 or later.
  • You have a valid Python pipeline definition file.

Procedure

  1. Save the following code as a new file named compile.py in your working directory.

    The example uses the KubernetesManifestOptions class from the kfp.compiler.compiler_utils module to define pipeline metadata such as the name, version, and namespace.

    Example compile script

    from kfp import dsl, compiler
    from kfp.compiler.compiler_utils import KubernetesManifestOptions
    
    @dsl.pipeline(name="<pipeline_name>")
    def my_pipeline():
        pass  # define your tasks
    
    compiler.Compiler().compile(
        pipeline_func=my_pipeline,
        package_path="<output_file>.yaml",
        kubernetes_manifest_format=True,
        kubernetes_manifest_options=KubernetesManifestOptions(
            pipeline_name="<pipeline_name>",
            pipeline_version_name="<version_name>",
            namespace="<namespace>",
            include_pipeline_manifest=True,
        ),
    )

  2. Run the script to compile your pipeline and generate the Kubernetes manifests:

    $ python compile.py

Verification

Verify that the compiled output includes the expected resources:

apiVersion: pipelines.kubeflow.org/v2beta1
kind: Pipeline
---
apiVersion: pipelines.kubeflow.org/v2beta1
kind: PipelineVersion
spec:
  pipelineSpec: ...
  platformSpec: ...   # present when Kubernetes resource configuration is used
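
If you prefer to script this check, the following sketch extracts the kind of each document in the compiled file without requiring a YAML parser. The manifest contents are shown inline for illustration; reading the file from disk works the same way:

```python
# Sketch: list the "kind" of each document in a compiled manifest,
# using plain string handling (no YAML parser required).
manifest = """\
apiVersion: pipelines.kubeflow.org/v2beta1
kind: Pipeline
---
apiVersion: pipelines.kubeflow.org/v2beta1
kind: PipelineVersion
"""

kinds = [
    line.split(":", 1)[1].strip()
    for doc in manifest.split("\n---\n")
    for line in doc.splitlines()
    if line.startswith("kind:")
]
print(kinds)  # ['Pipeline', 'PipelineVersion']
```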

1.2.3. Connecting the KFP SDK to a pipeline server

You can connect the Kubeflow Pipelines (KFP) SDK to a pipeline server that is exposed by OpenShift AI. The pipeline server route is protected by OpenShift OAuth, so you must provide a valid access token when you create the KFP client.

Prerequisites

  • You have logged in to the OpenShift CLI (oc) as a user who can access the project.
  • You have created a project and configured a pipeline server for that project.
  • You have installed Python and the required packages in your environment.
  • Optional: If your cluster uses a custom or self-signed certificate, you know the path to the trusted certificate bundle that your environment uses.

Procedure

  1. Set environment variables for your project and pipeline server route:

    export NAMESPACE=<project_namespace>
    export DSPA_NAME=$(oc -n "$NAMESPACE" get dspa -o jsonpath='{.items[0].metadata.name}')
    export API_URL="https://$(oc -n "$NAMESPACE" get route "ds-pipeline-${DSPA_NAME}" -o jsonpath='{.spec.host}')"

    Replace <project_namespace> with the name of your project.

  2. Obtain an OpenShift access token for the current user:

    export OCP_TOKEN=$(oc whoami --show-token)
    Note

    Avoid pasting the access token directly into commands or scripts. The token can appear in your shell history or in process listings if you pass it as a literal argument.

    To reduce this risk, store the token in an environment variable and reference it from your code or commands. For example:

    ./.venv/bin/python my_script.py --kfp-server-host "$API_URL" --namespace "$NAMESPACE" --token "$OCP_TOKEN"

    Alternatively, use a prompt with read -s to input the token securely at runtime.

  3. Optional: If you are running outside the cluster or you use a custom or self-signed certificate, set an environment variable for your trusted certificate bundle:

    export SSL_CA_CERT=/etc/pki/tls/custom-certs/ca-bundle.crt

    Adjust the path if your environment uses a different certificate location.

  4. In your Python environment, create a KFP client that uses the pipeline server route and OpenShift access token:

    import os
    from kfp.client import Client
    
    api_url = os.environ["API_URL"]
    token = os.environ["OCP_TOKEN"]
    namespace = os.environ["NAMESPACE"]
    
    # Optional: Use a custom certificate bundle if required
    ssl_ca_cert = os.environ.get("SSL_CA_CERT", None)
    
    client_args = {
        "host": api_url,
        "existing_token": token,
        "namespace": namespace,
    }
    
    if ssl_ca_cert:
        client_args["ssl_ca_cert"] = ssl_ca_cert
    
    client = Client(**client_args)
  5. Verify the connection by calling the API. For example, list experiments or pipelines:

    print(client.list_experiments())
    # or
    print(client.list_pipelines())

Verification

  • The Python code runs without authentication errors.
  • The command output lists experiments or pipelines that are defined on the pipeline server for the specified project.

Next steps

  • Use the KFP SDK to compile and upload pipelines, create pipeline runs, or manage pipeline versions against the authenticated pipeline server.
  • If required, integrate this client configuration into your own automation scripts or external applications that orchestrate pipelines on OpenShift AI.

1.2.4. Defining pipelines by using the Kubernetes API

You can define AI pipelines and pipeline versions by using the Kubernetes API, which stores them as custom resources in the cluster instead of the internal database. This approach makes it easier to use OpenShift GitOps (Argo CD) or similar tools to manage pipelines and pipeline versions, while still allowing you to manage them through the OpenShift AI dashboard, API, and the Kubeflow Pipelines (KFP) Software Development Kit (SDK). You can generate the required manifests by using the Kubeflow Pipelines SDK; see Compiling the pipeline YAML with the Kubeflow Pipelines SDK or Compiling Kubernetes-native manifests with the Kubeflow Pipelines SDK.

Note

If your pipeline server is already configured to use Kubernetes API storage, you can still use the OpenShift AI dashboard and REST API to view pipeline details, run pipelines, and create schedules. In this mode, the Kubernetes API acts as the storage backend, so your existing tools continue to work as expected.

Prerequisites

  • You have OpenShift AI administrator privileges or you are the project owner.
  • You have a project with a running pipeline server.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

  • If you plan to create a PipelineVersion custom resource, you have either:

    • Compiled your Python pipeline to IR YAML by using the KFP SDK. See Compiling the pipeline YAML with the Kubeflow Pipelines SDK.
    • Compiled Kubernetes-native manifests by using the KFP SDK. See Compiling Kubernetes-native manifests with the Kubeflow Pipelines SDK.

Procedure

  1. In a terminal window, log in to your OpenShift cluster by using the OpenShift CLI (oc):

    $ oc login -u <user_name>

    When prompted, enter the OpenShift server URL, connection type, and your password.

  2. To configure the pipeline server to use Kubernetes API storage instead of the default database option, set the spec.apiServer.pipelineStore field to kubernetes in your project’s DataSciencePipelinesApplication (DSPA) custom resource.

    In the following command, replace <dspa_name> with the name of your DSPA custom resource, and replace <namespace> with the name of your project:

    $ oc patch dspa <dspa_name> -n <namespace> \
      --type=merge \
      -p '{"spec": {"apiServer": {"pipelineStore": "kubernetes"}}}'
    Warning

    When you switch the pipeline server from database storage to Kubernetes API storage, existing pipelines that were stored in the internal database are no longer visible in the OpenShift AI dashboard or REST API. To view or manage those pipelines again, change the spec.apiServer.pipelineStore field back to database.

  3. Define a Pipeline custom resource in a YAML file with the following contents:

    Example pipeline definition

    apiVersion: pipelines.kubeflow.org/v2beta1
    kind: Pipeline
    metadata:
      name: <name>
      namespace: <namespace>
    spec:
      displayName: <displayName>

    • name: The immutable Kubernetes resource name of your pipeline.
    • namespace: The name of your project.
    • displayName: The user-friendly display name of your pipeline, which is shown in the dashboard and REST API.
  4. Apply the pipeline definition to create the Pipeline custom resource in your cluster.

    In the following command, replace <pipeline_yaml_file> with the name of your YAML file:

    Example command

    $ oc apply -f <pipeline_yaml_file>.yaml

  5. Alternatively, if you compiled Kubernetes-native manifests with the KFP SDK, you can apply the generated file directly without manually creating separate YAML files:

    $ oc apply -f <output_file>.yaml

    The generated file includes both Pipeline and PipelineVersion resources. You can skip the following manual definition steps and proceed to the verification step.

  6. Define a PipelineVersion custom resource in a YAML file with the following contents:

    Example pipeline version definition

    apiVersion: pipelines.kubeflow.org/v2beta1
    kind: PipelineVersion
    metadata:
      name: <name>
      namespace: <namespace>
    spec:
      pipelineName: <pipelineName>
      displayName: <displayName>
      description: This is the first version of the pipeline.
      pipelineSpec:
        # ... YAML generated by compiling the Python pipeline with the KFP SDK ...

    • name: The name of your pipeline version.
    • namespace: The name of your project.
    • pipelineName: The immutable Kubernetes resource name of your pipeline. This value must match the metadata.name value in the Pipeline custom resource.
    • displayName: The user-friendly display name of your pipeline version, which is shown in the dashboard and REST API.
    • pipelineSpec: The YAML content that you generated by using the Kubeflow Pipelines (KFP) SDK.
  7. Apply the pipeline version definition to create the PipelineVersion custom resource in your cluster.

    In the following command, replace <pipeline_version_yaml_file> with the name of your YAML file:

    Example command

    $ oc apply -f <pipeline_version_yaml_file>.yaml

    After creating the pipeline version, the system automatically applies the following labels to the pipeline version for easier filtering:

    Example automatic labels

    pipelines.kubeflow.org/pipeline-id: <metadata.uid of the pipeline>
    pipelines.kubeflow.org/pipeline: <pipeline name>
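
Because these are ordinary Kubernetes labels, you can select all versions of a pipeline with a label selector such as -l pipelines.kubeflow.org/pipeline=<pipeline_name>. The following sketch simulates that filtering in Python; all resource names are illustrative:

```python
# Sketch: filter pipeline versions by the automatically applied label,
# as a selector like "pipelines.kubeflow.org/pipeline=my-pipeline" would.
versions = [
    {"name": "my-pipeline-v1",
     "labels": {"pipelines.kubeflow.org/pipeline": "my-pipeline"}},
    {"name": "my-pipeline-v2",
     "labels": {"pipelines.kubeflow.org/pipeline": "my-pipeline"}},
    {"name": "other-v1",
     "labels": {"pipelines.kubeflow.org/pipeline": "other"}},
]

selected = [v["name"] for v in versions
            if v["labels"].get("pipelines.kubeflow.org/pipeline") == "my-pipeline"]
print(selected)  # ['my-pipeline-v1', 'my-pipeline-v2']
```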

Verification

  1. Check that the Pipeline custom resource was successfully created:

    $ oc get pipeline <pipeline_name> -n <namespace>
  2. Check that the PipelineVersion custom resource was successfully created:

    $ oc get pipelineversion <pipeline_version_name> -n <namespace>

1.2.5. Migrating pipelines and pipeline versions to the Kubernetes API

You can migrate existing pipelines and pipeline versions from the internal database to Kubernetes custom resources. This makes it easier to use OpenShift GitOps (Argo CD) or similar tools to manage pipelines and pipeline versions, while still allowing you to manage them through the OpenShift AI dashboard, API, and the Kubeflow Pipelines (KFP) Software Development Kit (SDK).

This procedure uses a community-supported Kubeflow Pipelines migration script to export pipelines from the AI Pipelines API and generate corresponding Pipeline and PipelineVersion custom resources for import into your cluster.

Important

The migration script in this procedure is maintained by the Kubeflow Pipelines community and is not supported by Red Hat. Before you use the script, review the repository and validate it in a non-production environment.

Warning

The pipeline and pipeline version IDs change during migration, so existing pipeline runs do not map to the migrated pipeline version. The original ID is stored in the pipelines.kubeflow.org/original-id label.

Prerequisites

  • You have OpenShift AI administrator privileges or you are the project owner.
  • You have a project with a running pipeline server.
  • The pipeline server is configured with spec.apiServer.pipelineStore: database.
  • You have Python 3.11 installed in your local environment.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  1. In a terminal window, log in to your OpenShift cluster by using the OpenShift CLI (oc):

    $ oc login -u <user_name>

    When prompted, enter the OpenShift server URL, connection type, and your password.

  2. Set environment variables for your project and get the pipeline API route.

    In the export command, replace <namespace> with the name of your project:

    echo "Setting the prerequisite variables"
    export NAMESPACE=<namespace>
    export DSPA_NAME=$(oc -n "$NAMESPACE" get dspa -o jsonpath='{.items[0].metadata.name}')
    export API_URL="https://$(oc -n "$NAMESPACE" get route "ds-pipeline-${DSPA_NAME}" -o jsonpath='{.spec.host}')"
  3. Create a Python virtual environment and install the required dependencies.

    echo "Set up the Python prerequisites"
    python3.11 -m venv .venv
    ./.venv/bin/pip install kfp requests PyYAML
  4. Download and run the Kubeflow Pipelines community migration script.

    The script connects to the AI Pipelines API, exports all pipelines and versions from the specified project, and generates one YAML file per pipeline in a local kfp-exported-pipelines/ directory. Each file includes a Pipeline resource followed by all associated PipelineVersion resources.

    1. Run the following command:

      curl -L https://raw.githubusercontent.com/kubeflow/pipelines/refs/heads/master/tools/k8s-native/migration.py -o migration.py
      ./.venv/bin/python migration.py --skip-tls-verify --kfp-server-host $API_URL --namespace $NAMESPACE --token "$(oc whoami --show-token)"
      Note

      The --skip-tls-verify option disables certificate validation and should be used only in development environments or when connecting to a server with a self-signed certificate. In production environments, provide a valid certificate bundle instead.

      Additionally, passing the access token directly on the command line might expose it in shell history or process lists. To reduce this risk, store the token in an environment variable and reference it in your command:

      export KFP_TOKEN=$(oc whoami --show-token)
      ./.venv/bin/python migration.py --kfp-server-host $API_URL --namespace $NAMESPACE --token "$KFP_TOKEN"

      Alternatively, use a prompt with read -s to input the token securely at runtime.

    2. Optional: For more information about the script, run the following command:

      ./.venv/bin/python migration.py --help
    3. If you plan to create new or updated PipelineVersion custom resources after migration, you can compile your pipeline code by using the Kubeflow Pipelines SDK. For more information, see Compiling the pipeline YAML with the Kubeflow Pipelines SDK and Compiling Kubernetes-native manifests with the Kubeflow Pipelines SDK.
  5. Apply the exported Kubernetes custom resources to your cluster.

    oc apply -f ./kfp-exported-pipelines
  6. Change the pipeline server to use Kubernetes API storage.

    oc -n "$NAMESPACE" patch dspa "$DSPA_NAME" --type=merge -p '{"spec":{"apiServer":{"pipelineStore":"kubernetes"}}}'
    Note

    To view pipelines that were stored in the internal database and not migrated, you can temporarily change the pipeline server back to database storage.

    oc -n "$NAMESPACE" patch dspa "$DSPA_NAME" --type=merge -p '{"spec":{"apiServer":{"pipelineStore":"database"}}}'
  7. Repeat this procedure for each additional project that you want to migrate, changing NAMESPACE to the appropriate project name.
  8. Optional: Clean up the local environment.

    rm -rf .venv migration.py

Verification

  1. Check that the Pipeline and PipelineVersion custom resources were created in your project:

    $ oc -n <namespace> get pipelines.pipelines.kubeflow.org
    $ oc -n <namespace> get pipelineversions.pipelines.kubeflow.org
  2. Verify that the pipeline server is using Kubernetes API storage:

    $ oc -n <namespace> get dspa <dspa_name> -o jsonpath='{.spec.apiServer.pipelineStore}{"\n"}'

    The command should return kubernetes.

1.3. Importing a pipeline

To help you begin working with AI pipelines in OpenShift AI, you can import a YAML file containing your pipeline’s code to an active pipeline server, or you can import the YAML file from a URL. This file contains a Kubeflow pipeline compiled by using the Kubeflow compiler. After you have imported the pipeline to a pipeline server, you can execute the pipeline by creating a pipeline run.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a configured pipeline server.
  • You have compiled your pipeline with the Kubeflow compiler and you have access to the resulting YAML file.
  • If you are uploading your pipeline from a URL, the URL is publicly accessible.

Note

If your pipeline is defined in Python code instead of a YAML file, compile it first by using the KFP SDK. For more information, see Compiling the pipeline YAML with the Kubeflow Pipelines SDK.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that you want to import a pipeline to.
  3. Click Import pipeline.
  4. In the Import pipeline dialog, enter the details for the pipeline that you want to import.

    1. In the Pipeline name field, enter a name for the pipeline that you want to import.
    2. In the Pipeline description field, enter a description for the pipeline that you want to import.
    3. Select where you want to import your pipeline from by performing one of the following actions:

      • Select Upload a file to upload your pipeline from your local machine’s file system. Import your pipeline by clicking Upload, or by dragging and dropping a file.
      • Select Import by url to upload your pipeline from a URL, and then enter the URL into the text box.
    4. Click Import pipeline.

Verification

  • The pipeline that you imported is displayed on the Pipeline definitions page and on the Pipelines tab on the project details page.

1.4. Deleting a pipeline

If you no longer require access to your AI pipeline on the dashboard, you can delete it so that it does not appear on the Pipeline definitions page.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • There are active pipelines available on the Pipeline definitions page.
  • The pipeline that you want to delete does not contain any pipeline versions. For more information, see Deleting a pipeline version.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipeline that you want to delete.
  3. Click the action menu (⋮) beside the pipeline that you want to delete, and then click Delete pipeline.
  4. In the Delete pipeline dialog, enter the pipeline name in the text field to confirm that you intend to delete it.
  5. Click Delete pipeline.

Verification

  • The AI pipeline that you deleted is no longer displayed on the Pipeline definitions page.

1.5. Deleting a pipeline server

After you have finished running your AI pipelines, you can delete the pipeline server. Deleting a pipeline server automatically deletes all of its associated pipelines, pipeline versions, and runs. If your pipeline data is stored in a database, the database and its metadata are also deleted. In addition, after deleting a pipeline server, you cannot create new pipelines or pipeline runs until you create another pipeline server.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a pipeline server.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipeline server that you want to delete.
  3. From the Pipeline server actions list, select Delete pipeline server.
  4. In the Delete pipeline server dialog, enter the name of the pipeline server in the text field to confirm that you intend to delete it.
  5. Click Delete.

Verification

  • Pipelines previously assigned to the deleted pipeline server no longer appear on the Pipeline definitions page for the relevant project.
  • Pipeline runs previously assigned to the deleted pipeline server no longer appear on the Runs page for the relevant project.

1.6. Viewing the details of a pipeline server

You can view the details of pipeline servers configured in OpenShift AI, such as the pipeline server’s connection details and where its data is stored.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that contains an active and available pipeline server.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipeline server that you want to view.
  3. From the Pipeline server actions list, select Manage pipeline server configuration.

Verification

  • You can view the pipeline server details in the Manage pipeline server dialog.

1.7. Viewing existing pipelines

You can view the details of pipelines that you have imported to Red Hat OpenShift AI, such as the pipeline’s last run, when it was created, the pipeline’s executed runs, and details of any associated pipeline versions.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a pipeline server.
  • You have imported a pipeline to an active pipeline server.
  • Existing pipelines are available.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipelines that you want to view.
  3. Optional: Click Expand ( rhoai expand icon ) on the row of a pipeline to view its pipeline versions.

Verification

  • A list of pipelines is displayed on the Pipeline definitions page.

1.8. Overview of pipeline versions

You can manage incremental changes to pipelines in OpenShift AI by using versioning. This allows you to develop and deploy pipelines iteratively, preserving a record of your changes. You can track and manage your changes on the OpenShift AI dashboard, allowing you to schedule and execute runs against all available versions of your pipeline.

1.9. Uploading a pipeline version

You can upload a YAML file to an active pipeline server that contains the latest version of your pipeline, or you can upload the YAML file from a URL. The YAML file must contain a Kubeflow pipeline compiled by using the Kubeflow Pipelines compiler. After you upload a pipeline version to a pipeline server, you can execute it by creating a pipeline run.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a configured pipeline server.
  • You have a pipeline version available and ready to upload.
  • If you are uploading your pipeline version from a URL, the URL is publicly accessible.
  • If your pipeline version is based on Python code, compile it to YAML before uploading. For more information, see Compiling the pipeline YAML with the Kubeflow Pipelines SDK.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that you want to upload a pipeline version to.
  3. Click the Import pipeline drop-down list, and then select Upload new version.
  4. In the Upload new version dialog, enter the details for the pipeline version that you are uploading.

    1. From the Pipeline list, select the pipeline that you want to upload your pipeline version to.
    2. In the Pipeline version name field, confirm the name for the pipeline version, and change it if necessary.
    3. In the Pipeline version description field, enter a description for the pipeline version.
    4. Select where you want to upload your pipeline version from by performing one of the following actions:

      • Select Upload a file to upload your pipeline version from your local machine’s file system. Import your pipeline version by clicking Upload, or by dragging and dropping a file.
      • Select Import by url to upload your pipeline version from a URL, and then enter the URL into the text box.
    5. Click Upload.

Verification

  • The pipeline version that you uploaded is displayed on the Pipeline definitions page. Click Expand ( rhoai expand icon ) on the row containing the pipeline to view its versions.
  • On the Pipeline definitions page, the Version column on the row containing the pipeline increments by one.

1.10. Deleting a pipeline version

You can delete specific versions of a pipeline when you no longer require them. Deleting a default pipeline version automatically changes the default pipeline version to the next most recent version. If no pipeline versions exist, the pipeline persists without a default version.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a pipeline server.
  • You have imported a pipeline to an active pipeline server.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.

    The Pipeline definitions page opens.

  2. Delete the pipeline versions that you no longer require:

    • To delete a single pipeline version:

      1. From the Project list, select the project that contains a version of a pipeline that you want to delete.
      2. On the row containing the pipeline, click Expand ( rhoai expand icon ).
      3. Click the action menu (⋮) beside the version that you want to delete, and then click Delete pipeline version.

        The Delete pipeline version dialog opens.

      4. Enter the name of the pipeline version in the text field to confirm that you intend to delete it.
      5. Click Delete.
    • To delete multiple pipeline versions:

      1. On the row containing each pipeline version that you want to delete, select the checkbox.
      2. Click the action menu (⋮) next to the Import pipeline drop-down list, and then select Delete from the list.

Verification

  • The pipeline version that you deleted is no longer displayed on the Pipeline definitions page or on the Pipelines tab for the project.

1.11. Viewing the details of a pipeline version

You can view the details of a pipeline version that you have uploaded to Red Hat OpenShift AI, such as its graph and YAML code.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a pipeline server.
  • You have a pipeline available on an active and available pipeline server.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipeline versions that you want to view details for.
  3. Click the pipeline name to view further details of its most recent version. The pipeline version details page opens, displaying the Graph, Summary, and Pipeline spec tabs.

    Alternatively, click Expand ( rhoai expand icon ) on the row containing the pipeline that you want to view versions for, and then click the pipeline version that you want to view the details of.

Verification

  • On the pipeline version details page, you can view the pipeline graph, summary details, and YAML code.

1.12. Downloading a pipeline version

To make further changes to an AI pipeline version that you previously uploaded to OpenShift AI, you can download pipeline version code from the user interface.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • You have previously created a project that is available and contains a configured pipeline server.
  • You have created and imported a pipeline to an active pipeline server that is available to download.

Procedure

  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the version that you want to download.
  3. Click Expand ( rhoai expand icon ) beside the pipeline that contains the version that you want to download.
  4. Click the pipeline version that you want to download.

    The pipeline version details page opens.

  5. Click the Pipeline spec tab, and then click the Download button ( rhoai download icon ) to download the YAML file that contains the pipeline version code to your local machine.

Verification

  • The pipeline version code downloads to your browser’s default directory for downloaded files.

1.13. Overview of pipelines caching

You can use caching within AI pipelines to optimize execution times and improve resource efficiency. Caching reduces redundant task execution by reusing results from previous runs with identical inputs.

Caching is particularly beneficial for iterative tasks, where intermediate steps might not need to be repeated. Understanding caching can help you design more efficient pipelines and save time in model development.

Caching operates by storing the outputs of successfully completed tasks and comparing the inputs of new tasks against previously cached ones. If a match is found, OpenShift AI reuses the cached results instead of re-executing the task, reducing computation time and resource usage.
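The input-matching behavior described above can be modeled as fingerprinting a task: if every element of the fingerprint is unchanged from a previous run, the cached result is reused. The cache_key function below is an illustrative sketch only; the real cache key computation is internal to Kubeflow Pipelines:

```python
import hashlib
import json

def cache_key(task_name, inputs, component_spec):
    # Illustrative only: models the idea that any change to inputs,
    # parameters, or the component spec produces a different fingerprint,
    # which forces the task to re-execute instead of using the cache.
    payload = json.dumps(
        {"task": task_name, "inputs": inputs, "spec": component_spec},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

unchanged = cache_key("train", {"lr": 0.1}, "spec-v1")
rerun = cache_key("train", {"lr": 0.1}, "spec-v1")    # same inputs: cache hit
changed = cache_key("train", {"lr": 0.2}, "spec-v1")  # changed parameter: re-run
```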

1.13.1. Caching criteria

For caching to be effective, the following criteria determine if a task can use previously cached results:

  • Input data and parameters: If the input data and parameters for a task are unchanged from a previous run, cached results are eligible for reuse.
  • Task code and configuration: Changes to the task code or configurations invalidate the cache to ensure that modifications are always reflected.
  • Pipeline environment: Changes to the pipeline environment, such as dependency versions, also affect caching eligibility to maintain consistency.

1.13.2. Viewing cached steps

Cached steps in pipelines are visually indicated in the user interface (UI):

  • Tasks that use cached results display a green icon, helping you quickly identify which steps were cached. The Status field in the side panel displays Cached for cached tasks.
  • The UI also includes information about when the task was previously executed, allowing for easy verification of cache usage.

To check the caching status of specific tasks, navigate to the pipeline details view in the UI. Cached and non-cached tasks are clearly indicated. Cached tasks do not display execution logs because they reuse previously generated outputs and are not re-executed.

1.13.3. Controlling caching in pipelines

Caching is enabled by default in OpenShift AI to improve performance. However, there are instances when disabling caching might be necessary for specific tasks, an entire pipeline, or all pipelines. For example, caching might not be beneficial for tasks that rely on frequently updated data or unique computational needs. In other cases, such as debugging, development, or when deterministic re-execution is required, you might want to disable caching for all pipelines.

Important

Disabling caching at the pipeline or pipeline server level causes all tasks to re-run, potentially increasing compute time and resource usage.

You can control caching for AI pipelines in the following ways:

  • Individual task: Data scientists can disable caching for specific steps in a pipeline.
  • Pipeline (submit time): Data scientists can disable caching when submitting a pipeline run.
  • Pipeline (compile time): Data scientists can disable caching when compiling a pipeline.
  • All pipelines (pipeline server): You can disable caching for all pipelines in the pipeline server, which overrides all pipeline and task-level caching settings.

1.13.3.1. Disabling caching for individual tasks

To disable caching for a particular task, apply the set_caching_options method directly to the task in your pipeline code:

task_name.set_caching_options(False)

After applying this setting, OpenShift AI runs the task in future pipeline runs, ignoring any cached results.

You can re-enable caching for an individual task by calling set_caching_options(True), or by omitting the set_caching_options call entirely.

This setting is ignored if caching is disabled in the pipeline server.

1.13.3.2. Disabling caching at pipeline submission

To disable caching for the entire pipeline when you submit a pipeline run, set the enable_caching parameter to False in your pipeline code. This setting ensures that no steps are cached during pipeline execution. The enable_caching parameter is available only when you use the kfp.Client to submit pipelines or start pipeline runs, for example, with the run_pipeline method.

Example:

import kfp
client = kfp.Client()
client.run_pipeline(
    experiment_id=experiment.id,
    pipeline_id=pipeline.id,
    job_name="no-cache-run",
    params={},                # optional
    enable_caching=False,
)

This setting is ignored if caching is disabled during pipeline compilation or in the pipeline server.

1.13.3.3. Disabling caching at pipeline compilation

To disable caching for the entire pipeline during compilation, set one of the following options in your local environment or workbench:

  • Environment variable:

    export KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT=true
  • CLI flag (when using kfp dsl compile):

    kfp dsl compile --disable-execution-caching-by-default

These settings are ignored if caching is disabled in the pipeline server.

1.13.3.4. Disabling caching for all pipelines

To disable caching for all pipelines on the pipeline server, overriding all pipeline and task-level caching settings, use either of the following methods:

Pipeline server configuration
  1. From the OpenShift AI dashboard, click Develop & train → Pipelines → Pipeline definitions.
  2. On the Pipeline definitions page, from the Project drop-down list, select the project that contains the pipeline server that you want to configure.
  3. From the Pipeline server actions list, select Manage pipeline server configuration.
  4. In the Pipeline caching section, clear the Allow caching to be configured per pipeline and task checkbox.
  5. Click Save.
DataSciencePipelinesApplication (cluster administrator)

In the OpenShift console or CLI, set the cacheEnabled field to false in the DataSciencePipelinesApplication (DSPA) custom resource for the project.

Example:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
kind: DataSciencePipelinesApplication
metadata:
  name: my-dspa
  namespace: my-namespace
spec:
  apiServer:
    cacheEnabled: false

After you apply this setting, all pipeline and task-level caching settings are ignored.

To allow caching to be configured at the pipeline and task level again, set the cacheEnabled field to true in the DSPA custom resource.

Note

Changing this setting updates the CACHEENABLED environment variable in the pipeline server deployment.

Verification

After configuring caching settings, you can verify its behavior by using one of the following methods:

  • Check the UI: Locate the green icons in the task list to identify cached steps.
  • Test task re-runs: Disable caching on specific tasks or the pipeline to confirm that steps re-execute as expected.
  • Validate inputs: Ensure the task inputs, parameters, and runtime settings are unchanged when caching is applied.

Note

You can also disable caching for a single node or for your entire pipeline in JupyterLab using Elyra. For more information, see Disabling node caching in Elyra.
