OpenShift AI tutorial - Fraud detection example
Use OpenShift AI to train and deploy an example fraud detection model
Abstract
Chapter 1. Introduction
Welcome. In this tutorial, you learn how to incorporate data science, artificial intelligence (AI), and machine learning (ML) into an OpenShift development workflow.
You complete the following tasks in Red Hat OpenShift AI without the need to install anything on your computer:
- Explore a pre-trained fraud detection model by using a Jupyter notebook.
- Deploy the model by using OpenShift AI model serving.
- Refine and train the model by using automated pipelines.
- Learn how to train the model by using distributed computing frameworks.
1.1. About the example fraud detection model
The example fraud detection model monitors credit card transactions for potential fraudulent activity. It analyzes the following credit card transaction details:
- The geographical distance from an earlier credit card transaction.
- The price of the current transaction, compared to the median price of all the user’s transactions.
- Whether the user completed the transaction by using the hardware chip in the credit card, by entering a PIN number, or by making an online purchase.
Based on this data, the model outputs the likelihood of the transaction being fraudulent.
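As a concrete illustration of the inputs and output described above, the following sketch shows the kind of feature vector the model consumes and the kind of score it returns. The feature names and the scoring rule are illustrative only; the real model is a trained neural network, which you build later in this tutorial.

```python
# Illustrative only: a hand-written rule, not the trained model.
# The inputs mirror the transaction details described above.
def fraud_score(distance_from_last_tx, ratio_to_median_price,
                used_chip, used_pin, online_order):
    """Return a made-up fraud likelihood in [0, 1]."""
    score = 0.0
    if distance_from_last_tx > 100:   # far from the previous transaction
        score += 0.4
    if ratio_to_median_price > 4:     # unusually expensive purchase
        score += 0.3
    if online_order and not (used_chip or used_pin):
        score += 0.3                  # card-not-present purchase
    return min(score, 1.0)

print(fraud_score(250.0, 6.0, used_chip=False, used_pin=False,
                  online_order=True))  # prints 1.0
```

The trained model replaces this rule with weights learned from historical transaction data, but the interface — a handful of numeric features in, a likelihood out — is the same idea.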
1.2. Before you begin
You must have access to an OpenShift cluster that has Red Hat OpenShift AI installed.
If you do not have access to a cluster that has an instance of OpenShift AI, see the Red Hat OpenShift AI page on the Red Hat Developer website. There, you can create an account and access the free Red Hat Developer Sandbox or you can learn how to install OpenShift AI on your own OpenShift cluster.
Important: If your cluster uses self-signed certificates, before you begin the tutorial, your OpenShift AI administrator must add self-signed certificates for OpenShift AI as described in Working with certificates (Self-Managed) or Working with certificates (Cloud Service).
Make sure that the version of this tutorial matches the OpenShift AI version on your cluster.
Red Hat OpenShift AI v2.25 is the version described in this tutorial.
To find the OpenShift AI version for your cluster:
- Log in to the OpenShift AI dashboard.
- In the top navigation bar, click the help icon and then select About. The About page shows the installed version.
- Check the Red Hat OpenShift AI Supported Configurations page to verify that your installed version is a supported version.
- If this version of the tutorial does not match the installed version, access the matching version of the tutorial by navigating to the top-level Red Hat OpenShift AI documentation page and selecting the matching version from the drop-down list.
If you’re ready, start the tutorial.
Chapter 2. Setting up a project and storage
2.2. Setting up your data science project
To implement a data science workflow, you must create a data science project (as described in the following procedure). Projects help your team organize and collaborate on resources within separated namespaces. From a project you can create multiple workbenches, each with its own IDE environment (for example, JupyterLab) and its own connections and cluster storage. In addition, the workbenches can share models and data with pipelines and model servers.
Prerequisites
- You have logged in to Red Hat OpenShift AI.
Procedure
- On the navigation menu, select Data science projects. This page lists any existing projects that you have access to.
If you are using the Red Hat Developer Sandbox, it provides a default data science project (for example, myname-dev). Select it and skip to the Verification section.
If you are using your own OpenShift cluster, you can select an existing project (if any) or create a new one. Click Create project.
Note: You can start a Jupyter notebook by clicking the Start basic workbench button, selecting a notebook image, and clicking Start server. However, in that case, it is a one-off Jupyter notebook run in isolation.
- In the Create project modal, enter a display name and description.
- Click Create.
Verification
You can see your project’s initial state. Individual tabs show more information about the project components and project access permissions:
- Workbenches are instances of your development and experimentation environment. They typically contain integrated development environments (IDEs), such as JupyterLab, RStudio, and Visual Studio Code.
- Pipelines contain the data science pipelines which run within the project.
- Models enable you to quickly serve a trained model for real-time inference. You can have many model servers per data science project. One model server can host many models.
- Cluster storage is a persistent volume that retains the files and data you’re working on within a workbench. A workbench has access to one or more cluster storage instances.
- Connections contain required configuration parameters for connecting to a data source, such as an S3 object bucket.
- Permissions define which users and groups can access the project.
Next step
2.3. Storing data with connections
Add connections to workbenches to connect your project to data inputs and object storage buckets. A connection is a resource that has the configuration parameters needed to connect to a data source or data sink, such as an AWS S3 object storage bucket.
For this tutorial, you run a provided script that creates the following local MinIO storage buckets for you:
- My Storage - Use this bucket for storing your models and data. You can reuse this bucket and its connection for your notebooks and model servers.
- Pipelines Artifacts - Use this bucket as storage for your pipeline artifacts. When you create a pipeline server, you need a pipeline artifacts bucket. For this tutorial, create this bucket to separate it from the first storage bucket for clarity.
Although you can use one storage bucket for both storing models and data and for storing pipeline artifacts, this tutorial follows best practice and uses separate storage buckets for each purpose.
The provided script also creates a connection to each storage bucket.
To run the script that installs local MinIO storage buckets and creates connections to them, follow the steps in Running a script to install local object storage buckets and create connections.
If you want to use your own S3-compatible object storage buckets (instead of using the provided script), follow the steps in Creating connections to your own S3-compatible object storage.
2.3.1. Running a script to install local object storage buckets and create connections
For convenience, run a script (provided in the following procedure) that automatically completes these tasks:
- Creates a MinIO instance in your project.
- Creates two storage buckets in that MinIO instance.
- Generates a random user ID and password for your MinIO instance.
- Creates two connections in your project, one for each bucket, both using the same credentials.
- Installs required network policies for service mesh functionality.
This script is based on the guide for deploying MinIO.
The MinIO-based object storage that the script creates is not meant for production use.
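The random credentials mentioned above can be generated with a few lines of Python. This is only a sketch of the idea, not the actual code in the setup script:

```python
import secrets
import string

def make_minio_credentials(length=16):
    """Generate a random access key and secret key pair, similar in
    spirit to what the setup script does (the real script's details
    may differ)."""
    alphabet = string.ascii_letters + string.digits
    access_key = "".join(secrets.choice(alphabet) for _ in range(length))
    secret_key = "".join(secrets.choice(alphabet) for _ in range(length))
    return access_key, secret_key
```

Because the script generates the credentials for you and stores them in the connections, you never need to type or remember them yourself.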
If you want to connect to your own storage, see Creating connections to your own S3-compatible object storage.
Prerequisites
You must know the OpenShift resource name for your data science project so that you run the provided script in the correct project. To get the project’s resource name:
In the OpenShift AI dashboard, select Data science projects and then click the ? icon next to the project name. A text box opens with information about the project, including its resource name.
The following procedure describes how to run the script from the OpenShift console. If you are knowledgeable in OpenShift and can access the cluster from the command line, instead of following the steps in this procedure, you can use the following command to run the script:
oc apply -n <your-project-name> -f https://github.com/rh-aiservices-bu/fraud-detection/raw/main/setup/setup-s3.yaml
Procedure
- In the OpenShift AI dashboard, click the application launcher icon and then select the OpenShift Console option.
- In the OpenShift console, click + in the top navigation bar.
- Select your project from the list of projects.
- Verify that you selected the correct project.
- Copy the following code and paste it into the Import YAML editor.
Note: This code gets and applies the setup-s3-no-sa.yaml file.
- Click Create.
Verification
In the OpenShift console, there is a "Resources successfully created" message and a list of the following resources:
- demo-setup
- demo-setup-edit
- create-s3-storage
In the OpenShift AI dashboard:
- Select Data science projects and then click the name of your project, Fraud detection.
- Click Connections. There are two connections listed: My Storage and Pipeline Artifacts.
If your cluster uses self-signed certificates, your OpenShift AI administrator might need to configure a certificate authority (CA) to securely connect to the S3 object storage, as described in Accessing S3-compatible object storage with self-signed certificates (Self-Managed) or Accessing S3-compatible object storage with self-signed certificates (Cloud Service).
Next step
If you want to complete the pipelines section of this tutorial, go to Enabling data science pipelines.
Otherwise, skip to Creating a workbench.
2.3.2. Creating connections to your own S3-compatible object storage
If you have existing S3-compatible storage buckets that you want to use for this tutorial, you must create a connection to one storage bucket for saving your data and models. If you want to complete the pipelines section of this tutorial, create another connection to a different storage bucket for saving pipeline artifacts.
If you do not have your own S3-compatible storage, or if you want to use a disposable local MinIO instance instead, skip this task and follow the steps in Running a script to install local object storage buckets and create connections. The provided script automatically creates a MinIO instance in your project, creates two storage buckets in that instance, creates two connections in your project (one for each bucket, both using the same credentials), and installs required network policies for service mesh functionality.
Prerequisites
To create connections to your existing S3-compatible storage buckets, you need the following credential information for the storage buckets:
- Endpoint URL
- Access key
- Secret key
- Region
- Bucket name
If you do not have this information, contact your storage administrator.
Procedure
Create a connection for saving your data and models:
- In the OpenShift AI dashboard, navigate to the page for your data science project.
- Click the Connections tab, and then click Create connection.
- In the Add connection modal, for the Connection type select S3 compatible object storage - v1.
- Complete the Add connection form and name your connection My Storage. This connection is for saving your personal work, including data and models.
- Click Create.
Create a connection for saving pipeline artifacts:
Note: If you do not intend to complete the pipelines section of the tutorial, you can skip this step.
- Click Add connection.
- Complete the form and name your connection Pipeline Artifacts.
- Click Create.
Verification
In the Connections tab for the project, check to see that your connections are listed.
If your cluster uses self-signed certificates, your OpenShift AI administrator might need to provide a certificate authority (CA) to securely connect to the S3 object storage, as described in Accessing S3-compatible object storage with self-signed certificates (Self-Managed) or Accessing S3-compatible object storage with self-signed certificates (Cloud Service).
Next step
If you want to complete the pipelines section of this tutorial, go to Enabling data science pipelines.
Otherwise, skip to Creating a workbench.
2.4. Enabling data science pipelines
You must prepare your tutorial environment so that you can use data science pipelines.
If you do not intend to complete the pipelines section of this tutorial, you can skip this step and move on to the next section, Setting up Kueue resources.
Later in this tutorial, you implement an example pipeline by using the JupyterLab Elyra extension. With Elyra, you can create a visual end-to-end pipeline workflow that executes in OpenShift AI.
Prerequisites
- You have installed local object storage buckets and created connections, as described in Storing data with connections.
Procedure
- In the OpenShift AI dashboard, on the Fraud Detection page, click the Pipelines tab.
- Click Configure pipeline server.
- In the Configure pipeline server form, in the Access key field next to the key icon, click the dropdown menu and then click Pipeline Artifacts. The Configure pipeline server form autofills with credentials for the connection.
- In the Advanced Settings section, leave the default values.
- Click Configure pipeline server.
- Wait until the loading spinner disappears and Start by importing a pipeline is displayed.
Important: You must wait until the pipeline configuration is complete before you continue and create your workbench. If you create your workbench before the pipeline server is ready, your workbench cannot submit pipelines to it.
If you have waited more than 5 minutes, and the pipeline server configuration does not complete, you can delete the pipeline server and create it again.
You can also ask your OpenShift AI administrator to verify that they applied self-signed certificates on your cluster as described in Working with certificates (Self-Managed) or Working with certificates (Cloud Service).
Verification
- Navigate to the Pipelines tab for the project.
- Next to Import pipeline, click the action menu (⋮) and then select View pipeline server configuration. An information box opens and displays the object storage connection information for the pipeline server.
Next step
2.5. Setting up Kueue resources
You must prepare your tutorial environment so that you can use Kueue to schedule distributed training jobs with the Training Operator.
In the Distributing training jobs with the Training Operator section of this tutorial, you implement a distributed training job by using Kueue for managing job resources. With Kueue, you can manage cluster resource quotas and how different workloads consume them.
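The procedure below applies YAML that creates the Kueue objects for the project. As a rough sketch of what such resources look like (the quota values here are illustrative, and the tutorial's actual YAML may differ):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}        # match all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: "cpu"
        nominalQuota: 8        # illustrative quota
      - name: "memory"
        nominalQuota: 16Gi
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: local-queue
spec:
  clusterQueue: cluster-queue
```

A LocalQueue in your project points at a ClusterQueue, and the ClusterQueue's quotas decide when a submitted training job is admitted to run.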
If you are using the Red Hat Developer Sandbox, or if you do not intend to use Kueue to schedule your training jobs in the Distributing training jobs with the Training Operator section of this tutorial, skip this procedure and continue to the next section, Creating a workbench and selecting a workbench image.
Procedure
- In the OpenShift AI dashboard, click the application launcher icon and then select the OpenShift Console option.
- In the OpenShift console, click + in the top navigation bar.
- Select your project from the list of projects.
- Verify that you selected the correct project.
- Copy the following code and paste it into the Import YAML editor.
- Click Create.
Verification
In the OpenShift console, there is a "Resources successfully created" message with a list of the following resources:
- default-flavor
- cluster-queue
- local-queue
Chapter 3. Creating a workbench and using notebooks
3.1. Creating a workbench and selecting a workbench image
A workbench is an instance of your development and experimentation environment. When you create a workbench, you select a workbench image that has the tools and libraries that you need for developing models.
Prerequisites
- You created a My Storage connection as described in Storing data with connections.
- If you intend to complete the pipelines section of this tutorial, you configured a pipeline server as described in Enabling data science pipelines.
Procedure
- Navigate to the project detail page for the data science project that you created in Setting up your data science project.
- Click the Workbenches tab, and then click the Create workbench button.
- Fill out the name and description.
- In the Workbench image section, select one of the default images or a custom image that an administrator has set up for you. Red Hat provides several supported workbench images. The TensorFlow image has the libraries needed for this tutorial, so select the latest TensorFlow image.
- Choose a small deployment size.
- If your OpenShift cluster has available GPUs, the Create workbench form includes an Accelerator option. Select None. This tutorial does not require any GPUs.
- Leave the default environment variables and storage options.
- For Connections, click Attach existing connection.
- Select My Storage (the object storage that you configured earlier) and then click Attach.
- Click Create workbench.
Verification
In the Workbenches tab for the project, the status of the workbench changes from Starting to Running.
If you made a mistake, you can edit the workbench to make changes.
3.2. Importing the tutorial files into the JupyterLab environment
The JupyterLab environment is a web-based environment, but everything you do inside it happens on Red Hat OpenShift AI and is powered by the OpenShift cluster. This means that, without having to install and maintain anything on your own computer, and without using valuable local resources such as CPU, GPU and RAM, you can conduct your data science work in this powerful and stable managed environment.
Prerequisites
- You created a workbench, as described in Creating a workbench and selecting a workbench image.
Procedure
- Click the link for your workbench. If prompted, log in and allow JupyterLab to authorize your user. Your JupyterLab environment window opens. This file-browser window shows the files and folders that are saved inside your own personal space in OpenShift AI.
- Bring the content of this tutorial inside your JupyterLab environment:
  - On the toolbar, click the Git Clone icon.
  - Enter the following tutorial Git HTTPS URL:
    https://github.com/rh-aiservices-bu/fraud-detection.git
  - Select the Include submodules option, and then click Clone.
- In the file browser, double-click the newly created fraud-detection folder.
- In the left navigation bar, click the Git icon, and then click Current Branch to expand the branches and tags selector panel.
- On the Branches tab, in the Filter field, enter v2.25.
- Select origin/v2.25. The current branch changes to v2.25.
Verification
In the file browser, view the notebooks that you cloned from Git.
Next step
3.3. Running code in a notebook
If you’re already at ease with Jupyter, you can skip to the next section.
A notebook is an environment where you have cells that can display formatted text or code.
A cell can be empty, or it can contain code.
Code cells contain Python code that you can run interactively. You can edit the code and then run it. The code does not run on your computer or in the browser, but directly in your connected environment, Red Hat OpenShift AI in your case.
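For example, a minimal code cell (this one is hypothetical, not from the tutorial notebooks) might look like the following; running it executes the code on the cluster and shows the result below the cell:

```python
# A typical small code cell: define a function, call it, show the result.
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

print(celsius_to_fahrenheit(100))  # prints 212.0 below the cell
```

You can change the code and run the cell again; the displayed output always reflects the most recent run.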
You can run a code cell from the notebook interface or from the keyboard:
- From the user interface: Select the cell (by clicking inside the cell or to the left side of the cell) and then click Run from the toolbar.
- From the keyboard: Press CTRL+ENTER to run a cell, or press SHIFT+ENTER to run the cell and automatically select the next one.
After you run a cell, you can see the result of its code and information about when the code in the cell ran.
When you save a notebook, the code and the results are saved. You can reopen the notebook to view the results without having to run the program again, while still having access to the code.
Notebooks are so named because they are like a physical notebook: you can take notes about your experiments, along with the code itself, including any parameters that you set. You can see the output of the experiment inline (this is the result after a cell runs), along with all the notes that you want to take. To take notes, from the menu switch the cell type from Code to Markdown.
Prerequisites
- You have imported the tutorial files into your JupyterLab environment as described in Importing the tutorial files into the JupyterLab environment.
Procedure
- In your JupyterLab environment, locate the 0_sandbox.ipynb file and double-click it to launch the notebook. The notebook opens in a new tab in the content section of the environment.
- Experiment by, for example, running the existing cells, adding more cells, and creating functions. You can do what you want - it is your environment and there is no risk of breaking anything or impacting other users. This environment isolation is also a great advantage brought by OpenShift AI.
- Optionally, create a new notebook in which the code cells are run by using a Python 3 kernel:
  - Create a new notebook by either selecting File → New → Notebook or by clicking the Python 3 tile in the Notebook section of the launcher window.
  - You can use different kernels, with different languages or versions, to run in your notebook.
Additional resources
Next step
3.4. Training a model
In your notebook environment, open the 1_experiment_train.ipynb file and follow the instructions directly in the notebook. The instructions guide you through some simple data exploration, experimentation, and model training tasks.
When you save the model, you convert the model to the portable Open Neural Network Exchange (ONNX) format. By using ONNX, you can transfer models between frameworks with minimal preparation and without the need for rewriting the models.
Next step
Chapter 4. Deploying and testing a model
4.1. Preparing a model for deployment
After you train a model, you can deploy it by using the OpenShift AI model serving capabilities.
To prepare a model for deployment, you must move the model from your workbench to your S3-compatible object storage. Use the connection that you created in the Storing data with connections section and upload the model from a notebook.
Prerequisites
- You created the My Storage connection and added it to your workbench.
Procedure
- In your JupyterLab environment, open the 2_save_model.ipynb file.
- Follow the instructions in the notebook to make the model accessible in storage.
Verification
When you have completed the notebook instructions, the models/fraud/1/model.onnx file is in your object storage and it is ready for your model server to use.
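The models/fraud/1/model.onnx key follows a versioned layout, with the model name and a numeric version folder ahead of the model file. A small sketch of building such a key (this helper is hypothetical, not part of the tutorial notebooks):

```python
from pathlib import PurePosixPath

def onnx_storage_key(model_name, version=1, prefix="models"):
    """Build the object-storage key for a versioned ONNX model:
    <prefix>/<model_name>/<version>/model.onnx"""
    return str(PurePosixPath(prefix) / model_name / str(version) / "model.onnx")

print(onnx_storage_key("fraud"))  # prints models/fraud/1/model.onnx
```

Keeping the version folder in the path means you can upload a retrained model as version 2 later without overwriting the original.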
Next step
4.2. Deploying a model
Given that the model is accessible in storage and saved in the portable ONNX format, you can use an OpenShift AI model server to deploy it as an API.
OpenShift AI offers two options for model serving:
- Single-model serving - Each model in the project deploys on its own model server. This platform works well for large models or models that need dedicated resources.
- Multi-model serving - All models in the project deploy on the same model server. This platform is suitable for sharing resources among deployed models. Multi-model serving is the only option offered in the Red Hat Developer Sandbox environment.
For this tutorial, because you are deploying only one model, you can select either serving type. The steps for deploying the fraud detection model depend on the type of model serving platform that you select:
4.2.1. Deploying a model on a single-model server
OpenShift AI single-model servers host only one model. You create a new model server and deploy your model to it.
Prerequisites
- A user with admin privileges has enabled the single-model serving platform on your OpenShift cluster.
Procedure
- In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.
  Note: Depending on the model serving configuration on your cluster, you might see only one model serving platform option.
- In the Single-model serving platform tile, click Select single-model.
- In the form, specify the following values:
  - For Model deployment name, type fraud.
  - For Serving runtime, select OpenVINO Model Server.
  - For Model framework (name - version), select onnx-1.
  - For Deployment mode, select Advanced.
  - For Existing connection, select My Storage.
  - For the path, type the path that leads to the version folder that has your model file: models/fraud
  - Leave the other fields with the default settings.
- Click Deploy.
Verification
Notice the loading symbol under the Status section. The symbol changes to a checkmark when the deployment completes successfully.
Next step
4.2.2. Deploying a model on a multi-model server
OpenShift AI multi-model servers can host several models at once. You create a new model server and deploy your model to it.
Prerequisites
- A user with admin privileges has enabled the multi-model serving platform on your OpenShift cluster.
Procedure
- In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.
  Note: Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.
- In the Multi-model serving platform tile, click Select multi-model.
- In the form, specify the following values:
  - For Model server name, type a name, for example Model Server.
  - For Serving runtime, select OpenVINO Model Server.
  - Leave the other fields with the default settings.
- Click Add.
- In the Models and model servers list, next to the new model server, click Deploy model.
- In the form, specify the following values:
  - For Model deployment name, type fraud.
  - For Model framework (name - version), select onnx-1.
  - For Existing connection, select My Storage.
  - For the path, type the path that leads to the version folder that has your model file: models/fraud
  - Leave the other fields with the default settings.
- Click Deploy.
Verification
Notice the loading symbol under the Status section. The symbol changes to a checkmark when the deployment completes successfully.
Next step
4.3. Testing the model API
After you deploy the model, you can test its API endpoints.
Procedure
- In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.
- Take note of the model's Inference endpoint URL. You need this information when you test the model API. If the Inference endpoint field has an Internal endpoint details link, click the link to open a text box that shows the URL details, and then take note of the restUrl value.
- Return to the JupyterLab environment and try out your new endpoint.
  If you deployed your model with the multi-model serving platform, follow the directions in 3_rest_requests_multi_model.ipynb to try a REST API call and 4_grpc_requests_multi_model.ipynb to try a gRPC API call.
  If you deployed your model with the single-model serving platform, follow the directions in 5_rest_requests_single_model.ipynb to try a REST API call.
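The REST notebooks send a JSON body in the open inference (KServe v2) format to the endpoint's /v2/models/&lt;name&gt;/infer path. A minimal sketch of building such a request (the input name dense_input and the FP32 datatype are assumptions for illustration; check the notebook for your model's actual values):

```python
import json

def build_infer_request(features, model_name="fraud", input_name="dense_input"):
    """Build the URL path and JSON body for a v2 REST inference call.
    The input name and datatype here are illustrative assumptions."""
    path = f"/v2/models/{model_name}/infer"
    body = {
        "inputs": [{
            "name": input_name,
            "shape": [1, len(features)],   # one transaction, N features
            "datatype": "FP32",
            "data": features,
        }]
    }
    return path, json.dumps(body)

path, body = build_infer_request([0.31, 1.94, 1.0, 0.0, 0.0])
print(path)  # prints /v2/models/fraud/infer
```

In the notebook, this body is POSTed to the restUrl that you noted earlier, and the response contains the model's fraud likelihood.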
Next step
(Optional) Running a data science pipeline generated from Python code
Chapter 5. Implementing pipelines
5.1. Automating workflows with data science pipelines
Earlier, you used a notebook to train and save your model. Optionally, you can automate these tasks by using Red Hat OpenShift AI pipelines. Pipelines offer a way to automate the execution of many notebooks and Python code. By using pipelines, you can run long training jobs or retrain your models on a schedule without having to manually run them in a notebook.
To explore the pipeline editor, complete the steps in the following procedures to create your own pipeline.
Alternatively, you can skip the following procedures and instead run the 6 Train Save.pipeline file.
5.1.1. Creating a pipeline
You can create a simple pipeline by using the GUI pipeline editor. The pipeline uses the notebook that you used earlier to train a model and then save it to S3 storage.
Prerequisites
- You configured a pipeline server as described in Enabling data science pipelines.
- If you configured the pipeline server after you created your workbench, you stopped and then started your workbench.
Procedure
- Open your workbench's JupyterLab environment. If the launcher is not visible, click + to open it.
- Click Pipeline Editor. You have created a blank pipeline.
- Set the default runtime image for when you run your notebook or Python code:
  - In the pipeline editor, click Open Panel.
  - Select the Pipeline Properties tab.
  - In the Pipeline Properties panel, scroll down to Generic Node Defaults and Runtime Image. Set the value to Tensorflow with Cuda and Python 3.11 (UBI 9).
- Select File → Save Pipeline.
Verification
- In the file-browser window, you can see the pipeline file that you created.
Next step
5.1.2. Adding nodes to your pipeline
Add some steps, or nodes, in your pipeline for the 1_experiment_train.ipynb and 2_save_model.ipynb notebooks.
Prerequisites
- You created a pipeline file as described in Creating a pipeline.
Procedure
- From the JupyterLab file-browser panel, drag the 1_experiment_train.ipynb and 2_save_model.ipynb notebooks onto the pipeline canvas.
- Click the output port of 1_experiment_train.ipynb and drag a connecting line to the input port of 2_save_model.ipynb.
- Save the pipeline.
Verification
- Your pipeline has two nodes.
5.1.3. Specifying the training file as a dependency
Set node properties in your pipeline to specify the training file as a dependency.
If you do not set this file dependency, the file is not included in the node when it runs and the training job fails.
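The dependency value you set below, data/*.csv, is a glob pattern evaluated relative to the pipeline file. A small sketch of what such a pattern matches (the helper is hypothetical; Elyra evaluates the pattern itself when it packages the node):

```python
from pathlib import Path

def matched_dependencies(root, pattern="data/*.csv"):
    """Return the files that a file-dependency glob pattern such as
    data/*.csv would pick up under the given root directory."""
    return sorted(str(p.relative_to(root)) for p in Path(root).glob(pattern))
```

Only the matched files are shipped with the node at run time, which is why an incorrect pattern causes the training job to fail with missing data.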
Prerequisites
- You added two nodes to your pipeline as described in Adding nodes to your pipeline.
Procedure
- Click the 1_experiment_train.ipynb node.
- In the Properties panel, click the Node Properties tab.
- Scroll down to the File Dependencies section and then click Add.
- Set the value to data/*.csv, which has the data to train your model.
- Select the Include Subdirectories option.
- Save the pipeline.
Verification
- The training file is a dependency of the first node in your pipeline.
5.1.4. Creating and storing the ONNX-formatted output file
You must set the models/fraud/1/model.onnx file as the output file for both nodes in your pipeline.
Prerequisites
- You added the training file as a dependency to the first node in your pipeline, as described in Specifying the training file as a dependency.
Procedure
- Select node 1.
- Select the Node Properties tab.
- Scroll down to the Output Files section, and then click Add.
- Set the value to models/fraud/1/model.onnx.
- Repeat steps 2-4 for node 2.
- Click Save Pipeline.
Verification
- In node 1, the notebook creates the models/fraud/1/model.onnx file.
- In node 2, the notebook uploads the models/fraud/1/model.onnx file to the S3 storage bucket.
Next step
5.1.5. Configuring the connection to storage
In node 2, the notebook uploads the model to the S3 storage bucket. You must set the S3 storage bucket keys by using the secret created by the My Storage connection that you set up in Storing data with connections.
You can use this secret in your pipeline nodes without having to save the information in your pipeline code. This is important if, for example, you want to save your pipelines to source control without including any secret keys.
The name of the secret is aws-connection-my-storage.
If you named your connection something other than My Storage, you can obtain the secret name in the OpenShift AI dashboard by hovering over the help (?) icon in the Connections tab.
The aws-connection-my-storage secret includes the following fields:
- AWS_ACCESS_KEY_ID
- AWS_DEFAULT_REGION
- AWS_S3_BUCKET
- AWS_S3_ENDPOINT
- AWS_SECRET_ACCESS_KEY
You must set the secret name and key for each of these fields.
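Inside node 2, these secret entries surface as ordinary environment variables. The following is a minimal sketch of how notebook code can pick them up; the placeholder values are hypothetical, and in the real pipeline the values are injected from the aws-connection-my-storage secret rather than set in code:

```python
import os

# Hypothetical placeholder values standing in for the secret-injected
# environment variables. Do not hard-code real credentials like this.
os.environ.setdefault("AWS_ACCESS_KEY_ID", "example-key-id")
os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "example-secret")
os.environ.setdefault("AWS_S3_ENDPOINT", "https://s3.example.com")
os.environ.setdefault("AWS_DEFAULT_REGION", "us-east-1")
os.environ.setdefault("AWS_S3_BUCKET", "my-storage")

# Collect the connection settings the way a notebook in node 2 could,
# failing fast if a variable was not wired up in the node properties.
required = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ENDPOINT",
    "AWS_DEFAULT_REGION",
    "AWS_S3_BUCKET",
)
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing S3 settings: {missing}")

s3_config = {name: os.environ[name] for name in required}
# A notebook would typically hand these to an S3 client, for example:
# boto3.client("s3", endpoint_url=s3_config["AWS_S3_ENDPOINT"], ...)
print(sorted(s3_config))
```

Because the values come from the secret at run time, none of them appear in the pipeline file you commit to source control.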
Prerequisites
- You created the My Storage connection, as described in Storing data with connections.
- You set the models/fraud/1/model.onnx file as the output file for both nodes in your pipeline, as described in Creating and storing the ONNX-formatted output file.
Procedure
Remove any pre-filled environment variables:
- Select node 2, and then select the Node Properties tab.
- Under Additional Properties, note that some environment variables have been pre-filled. The pipeline editor inferred that you need them from the notebook code.
- Because you do not want to save these values in your pipeline, click Remove for each of the pre-filled environment variables.
Add the S3 bucket and keys by using the Kubernetes secret:
- Under Kubernetes Secrets, click Add.
- Enter the following values and then click Add:
- Environment Variable: AWS_ACCESS_KEY_ID; Secret Name: aws-connection-my-storage; Secret Key: AWS_ACCESS_KEY_ID

Repeat these substeps for each of the following Kubernetes secrets:

- Environment Variable: AWS_SECRET_ACCESS_KEY; Secret Name: aws-connection-my-storage; Secret Key: AWS_SECRET_ACCESS_KEY
- Environment Variable: AWS_S3_ENDPOINT; Secret Name: aws-connection-my-storage; Secret Key: AWS_S3_ENDPOINT
- Environment Variable: AWS_DEFAULT_REGION; Secret Name: aws-connection-my-storage; Secret Key: AWS_DEFAULT_REGION
- Environment Variable: AWS_S3_BUCKET; Secret Name: aws-connection-my-storage; Secret Key: AWS_S3_BUCKET
- Select File → Save Pipeline As to save and rename the pipeline. For example, rename it to My Train Save.pipeline.
Verification
- You set the S3 storage bucket keys by using the secret created by the My Storage connection.
5.1.6. Running your pipeline
Upload the pipeline to your cluster and run it. You can do so directly from the pipeline editor, using either your own newly created pipeline or the pipeline in the provided 6 Train Save.pipeline file.
Prerequisites
- You set the S3 storage bucket keys, as described in Configuring the connection to storage.
Procedure
- Click the play button in the toolbar of the pipeline editor.
- Enter a name for your pipeline.
- Verify that Runtime Configuration is set to Data Science Pipeline.
- Click OK.
Note: If you see an error message stating that no runtime configuration for Data Science Pipeline is defined, you might have created your workbench before the pipeline server was available. To address this error, verify that you configured the pipeline server, and then restart the workbench.
- Follow these steps in the OpenShift AI dashboard:
Check the status of the pipeline server:
- In your Fraud Detection project, click the Pipelines tab.
- If you see the Configure pipeline server option, follow the steps in Enabling data science pipelines.
- If you see the Import a pipeline option, the pipeline server is configured. Continue to the next step.
Restart your Fraud Detection workbench:
- Click the Workbenches tab.
- Click Stop and then click Stop workbench.
- After the workbench status is Stopped, click Start.
- Wait until the workbench status is Running.
- Return to your workbench’s JupyterLab environment and run the pipeline.
- In the OpenShift AI dashboard, open your data science project and expand the newly created pipeline.
- Click View runs.
- Click your run and then view the pipeline run in progress.
Verification
The models/fraud/1/model.onnx file is in your S3 bucket. You can serve the model, as described in Preparing a model for deployment.
Next step
(Optional) Running a data science pipeline generated from Python code
5.2. Running a data science pipeline generated from Python code
Earlier, you created a simple pipeline by using the GUI pipeline editor. You might want to create pipelines by using code that can be version-controlled and shared with others. The Kubeflow Pipelines (kfp) SDK provides a Python API for creating pipelines. The SDK is available as a Python package that you can install by using the pip install kfp command. With this package, you can use Python code to create a pipeline and then compile it to YAML format. Then you can import the YAML file into OpenShift AI.
This tutorial does not describe the details of how to use the SDK. Instead, it provides the files for you to view and upload.
Optionally, view the provided Python code in your JupyterLab environment by navigating to the fraud-detection-notebooks project’s pipeline directory. It contains the following files:
- 7_get_data_train_upload.py is the main pipeline code.
- build.sh is a script that builds the pipeline and creates the YAML file.
For your convenience, the output of the build.sh script is provided in the 7_get_data_train_upload.yaml file. The 7_get_data_train_upload.yaml output file is located in the top-level fraud-detection directory.
- Right-click the 7_get_data_train_upload.yaml file and then click Download.
- Upload the 7_get_data_train_upload.yaml file to OpenShift AI: in the OpenShift AI dashboard, navigate to your data science project page, click the Pipelines tab, and then click Import pipeline.
- Enter values for Pipeline name and Pipeline description.
- Click Upload and then select 7_get_data_train_upload.yaml from your local files to upload the pipeline.
- Click Import pipeline to import and save the pipeline.
The pipeline appears in the graph view.
- Select Actions → Create run.
On the Create run page, provide the following values:
- For Experiment, leave the value as Default.
- For Name, type any name, for example Run 1.
- For Pipeline, select the pipeline that you uploaded.
You can leave the other fields with their default values.
Click Create run to create the run.
A new run starts immediately.
Chapter 6. Running a distributed workload
You can distribute the training of a machine learning model across many CPUs by using Ray or the Training Operator.
6.1. Distributing training jobs
Earlier, you trained the fraud detection model directly in a notebook and then in a pipeline. You can also distribute the training of a machine learning model across many CPUs.
Distributing training is not necessary for a simple model. However, by applying it to the example fraud model, you learn how to train more complex models that require more compute power.
You can try one or both of the following options:
- The Ray distributed computing framework, as described in Distributing training jobs with Ray.
- The Training Operator, as described in Distributing training jobs with the Training Operator.
6.1.1. Distributing training jobs with Ray
You can use Ray, a distributed computing framework, to parallelize Python code across many CPUs or GPUs.
In your notebook environment, open the 8_distributed_training.ipynb file and follow the instructions directly in the notebook. The instructions guide you through setting up authentication, creating Ray clusters, and working with jobs.
Optionally, if you want to view the Python code for this step, you can find it in the ray-scripts/train_tf_cpu.py file.
For more information about TensorFlow training on Ray, see the Ray TensorFlow guide.
6.1.2. Distributing training jobs with the Training Operator
The Training Operator is a tool for scalable distributed training of machine learning (ML) models created with various ML frameworks, such as PyTorch.
You can use the Training Operator to distribute the training of a machine learning model across many hardware resources.
In your notebook environment, open the 9_distributed_training_kfto.ipynb file and follow the instructions directly in the notebook. The instructions guide you through setting up authentication, initializing the Training Operator client, and submitting a PyTorchJob.
Optionally, you can view the complete Python code in the kfto-scripts/train_pytorch_cpu.py file.
For more information about PyTorchJob training with the Training Operator, see the Training Operator PyTorchJob guide.
Chapter 7. Conclusion
Congratulations. In this tutorial, you learned how to incorporate data science, artificial intelligence, and machine learning into an OpenShift development workflow.
You used an example fraud detection model and completed the following tasks:
- Explored a pre-trained fraud detection model by using a Jupyter notebook.
- Deployed the model by using OpenShift AI model serving.
- Refined and trained the model by using automated pipelines.
- Learned how to train the model by using distributed computing frameworks.