Chapter 2. Using project workbenches
2.1. Creating a workbench and selecting an IDE
A workbench is an isolated area where you can examine and work with ML models. You can also work with data and run programs, for example to prepare and clean data. While a workbench is not required if, for example, you only want to serve an existing model, one is needed for most data science workflow tasks, such as writing code to process data or training a model.
When you create a workbench, you specify an image (an IDE, packages, and other dependencies). Supported IDEs include JupyterLab, code-server, and RStudio (Technology Preview).
The IDEs use a client-server architecture: each IDE provides a server that runs in a container on the OpenShift cluster, while the user interface (the client) is displayed in your web browser on your local computer. For example, the Jupyter notebook server runs in a container on the Red Hat OpenShift cluster, the client is the JupyterLab interface that opens in your web browser, and all of the commands that you enter in JupyterLab are executed by the notebook server. Other IDEs, such as code-server and RStudio Server, follow the same pattern. This architecture lets you interact through a browser on your local computer while all processing occurs on the cluster, which provides the benefits of larger available resources and better security because the data being processed never leaves the cluster.
In a workbench, you can also configure connections (to access external data for training models and to save models so that you can deploy them) and cluster storage (for persisting data). Workbenches within the same project can share models and data through object storage with the data science pipelines and model servers.
For data science projects that require data retention, you can add container storage to the workbench you are creating.
Within a project, you can create multiple workbenches. When to create a new workbench depends on considerations, such as the following:
- The workbench configuration (for example, CPU, RAM, or IDE). If you want to avoid editing an existing workbench’s configuration to accommodate a new task, you can create a new workbench instead.
- Separation of tasks or activities. For example, you might want to use one workbench for your Large Language Models (LLM) experimentation activities, another workbench dedicated to a demo, and another workbench for testing.
2.1.1. About workbench images
A workbench image (sometimes referred to as a notebook image) is optimized with the tools and libraries that you need for model development. You can use the provided workbench images or an OpenShift AI administrator can create custom workbench images adapted to your needs.
To provide a consistent, stable platform for your model development, many provided workbench images contain the same version of Python. Most workbench images available on OpenShift AI are pre-built and ready for you to use immediately after OpenShift AI is installed or upgraded.
For information about Red Hat support of workbench images and packages, see Red Hat OpenShift AI: Supported Configurations.
The following table lists the workbench images that are installed with Red Hat OpenShift AI by default.
If the preinstalled packages that are provided in these images are not sufficient for your use case, you have the following options:
- Install additional libraries after launching a default image. This option is good if you want to add libraries on an ad hoc basis as you develop models. However, it can be challenging to manage the dependencies of installed libraries and your changes are not saved when the workbench restarts.
- Create a custom image that includes the additional libraries or packages. For more information, see Creating custom workbench images.
Workbench images denoted with (Technology Preview) in this table are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using Technology Preview features in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Image name | Description |
---|---|
CUDA | If you are working with compute-intensive data science models that require GPU support, use the Compute Unified Device Architecture (CUDA) workbench image to gain access to the NVIDIA CUDA Toolkit. Using this toolkit, you can optimize your work by using GPU-accelerated libraries and optimization tools. |
Standard Data Science | Use the Standard Data Science workbench image for models that do not require TensorFlow or PyTorch. This image contains commonly-used libraries to assist you in developing your machine learning models. |
TensorFlow | TensorFlow is an open source platform for machine learning. With TensorFlow, you can build, train and deploy your machine learning models. TensorFlow contains advanced data visualization features, such as computational graph visualizations. It also allows you to easily monitor and track the progress of your models. |
PyTorch | PyTorch is an open source machine learning library optimized for deep learning. If you are working with computer vision or natural language processing models, use the PyTorch workbench image. |
Minimal Python | If you do not require advanced machine learning features, or additional resources for compute-intensive data science work, you can use the Minimal Python image to develop your models. |
TrustyAI | Use the TrustyAI workbench image to add model explainability, tracing, accountability, and runtime monitoring to your data science work. See the TrustyAI Explainability repository for example Jupyter notebooks. |
code-server | With the code-server workbench image, you can customize your workbench environment by using a variety of extensions to add new languages, themes, and debuggers, and to connect to additional services. Syntax highlighting, auto-indentation, and bracket matching, together with an automatic task runner, enhance the efficiency of your data science work. For more information, see code-server in GitHub. NOTE: Elyra-based pipelines are not available with the code-server workbench image. |
RStudio Server (Technology Preview) | Use the RStudio Server workbench image to access the RStudio IDE, an integrated development environment for R, a programming language for statistical computing and graphics. For more information, see the RStudio Server site. To use the RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI. |
CUDA - RStudio Server (Technology Preview) | Use the CUDA - RStudio Server workbench image to access the RStudio IDE and NVIDIA CUDA Toolkit. RStudio is an integrated development environment for R, a programming language for statistical computing and graphics. With the NVIDIA CUDA toolkit, you can optimize your work by using GPU-accelerated libraries and optimization tools. For more information, see the RStudio Server site. To use the CUDA - RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI. Important: The CUDA - RStudio Server workbench image contains NVIDIA CUDA technology. CUDA licensing information is available at https://docs.nvidia.com/cuda/. Review the licensing terms before you use this sample workbench. |
ROCm | Use the ROCm notebook image to run AI and machine learning workloads on AMD GPUs in OpenShift AI. It includes ROCm libraries and tools optimized for high-performance GPU acceleration, supporting custom AI workflows and data processing tasks. Use this image when integrating additional frameworks or dependencies tailored to your specific AI development needs. |
ROCm-PyTorch | Use the ROCm-PyTorch notebook image to optimize PyTorch workloads on AMD GPUs in OpenShift AI. It includes ROCm-accelerated PyTorch libraries, enabling efficient deep learning training, inference, and experimentation. This image is designed for data scientists working with PyTorch-based workflows, offering integration with GPU scheduling. |
ROCm-TensorFlow | Use the ROCm-TensorFlow notebook image to optimize TensorFlow workloads on AMD GPUs in OpenShift AI. It includes ROCm-accelerated TensorFlow libraries to support high-performance deep learning model training and inference. This image simplifies TensorFlow development on AMD GPUs and integrates with OpenShift AI for resource scaling and management. |
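After you open a workbench built from one of the GPU-enabled images (CUDA, ROCm, or their PyTorch and TensorFlow variants), a quick check can confirm whether the container actually sees an accelerator. The following is a minimal sketch using PyTorch; it assumes a PyTorch-based image, so the import is guarded for images where the `torch` package is absent:

```python
# Minimal sketch: check whether the workbench container can see a GPU.
# Assumes a PyTorch-based image (for example, PyTorch or ROCm-PyTorch);
# in other images the torch package may be absent, so the import is guarded.
try:
    import torch
    gpu_available = torch.cuda.is_available()  # also True on ROCm builds
    device = "cuda" if gpu_available else "cpu"
except ImportError:
    gpu_available = None  # torch is not installed in this image
    device = "cpu"

print("GPU available:", gpu_available, "- using device:", device)
```

If the check reports no GPU in a GPU-enabled image, verify that the workbench was created with a hardware profile that allocates an accelerator.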
2.1.2. Creating a workbench
When you create a workbench, you specify an image (an IDE, packages, and other dependencies). You can also configure connections, cluster storage, and add container storage.
Prerequisites
- You have logged in to Red Hat OpenShift AI.
- If you use OpenShift AI groups, you are part of the user group or admin group (for example, `rhoai-users` or `rhoai-admins`) in OpenShift.
- You have created a project.
- If you created a Simple Storage Service (S3) account outside of Red Hat OpenShift AI and you want to create connections to your existing S3 storage buckets, you have the following credential information for the storage buckets:
- Endpoint URL
- Access key
- Secret key
- Region
- Bucket name
For more information, see Working with data in an S3-compatible object store.
Procedure
- From the OpenShift AI dashboard, click Data science projects.
  The Data science projects page opens.
- Click the name of the project that you want to add the workbench to.
  A project details page opens.
- Click the Workbenches tab.
- Click Create workbench.
  The Create workbench page opens.
- In the Name field, enter a unique name for your workbench.
- Optional: If you want to change the default resource name for your workbench, click Edit resource name.
  The resource name is what your resource is labeled in OpenShift. Valid characters include lowercase letters, numbers, and hyphens (-). The resource name cannot exceed 30 characters, and it must start with a letter and end with a letter or number.
  Note: You cannot change the resource name after the workbench is created. You can edit only the display name and the description.
- Optional: In the Description field, enter a description for your workbench.
- In the Notebook image section, complete the fields to specify the workbench image to use with your workbench.
  - From the Image selection list, select a workbench image that suits your use case. A workbench image includes an IDE and Python packages (reusable code). Optionally, click View package information to view a list of packages that are included in the image that you selected.
  - If the workbench image has multiple versions available, select the workbench image version to use from the Version selection list. To use the latest package versions, Red Hat recommends that you use the most recently added image.
  Note: You can change the workbench image after you create the workbench.
- In the Deployment size section, from the Hardware profile list, select a suitable hardware profile for your workbench. The hardware profile specifies the number of CPUs and the amount of memory allocated to the container, setting the guaranteed minimum (request) and maximum (limit) for both. To change these default values, click Customize resource requests and limit and enter new minimum (request) and maximum (limit) values.
  Important: By default, hardware profiles are hidden from the dashboard navigation menu and user interface, while user interface components associated with the deprecated accelerator profiles functionality are still displayed. To show the Settings → Hardware profiles option in the dashboard navigation menu and the user interface components associated with hardware profiles, set the `disableHardwareProfiles` value to `false` in the `OdhDashboardConfig` custom resource (CR) in OpenShift. For more information, see Dashboard configuration options.
- Optional: In the Environment variables section, select and specify values for any environment variables.
Setting environment variables during the workbench configuration helps you save time later because you do not need to define them in the body of your notebooks, or with the IDE command line interface.
  If you are using S3-compatible storage, add these recommended environment variables:
  - `AWS_ACCESS_KEY_ID` specifies your Access Key ID for Amazon Web Services.
  - `AWS_SECRET_ACCESS_KEY` specifies your Secret access key for the account specified in `AWS_ACCESS_KEY_ID`.
OpenShift AI stores the credentials as Kubernetes secrets in a protected namespace if you select Secret when you add the variable.
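Once these variables are set on the workbench, code in any notebook can pick them up without hard-coding credentials. The helper below is a minimal sketch that reads only the two recommended variables; the function name is illustrative and not part of any OpenShift AI API:

```python
import os

def s3_credentials_from_env():
    """Read the recommended workbench environment variables.

    Returns a dict suitable for passing to an S3 client constructor,
    for example boto3.client("s3", **creds, endpoint_url=...).
    Raises RuntimeError when a credential is missing.
    """
    access_key = os.environ.get("AWS_ACCESS_KEY_ID")
    secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
    if not access_key or not secret_key:
        raise RuntimeError(
            "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY "
            "in the workbench Environment variables section"
        )
    return {"aws_access_key_id": access_key,
            "aws_secret_access_key": secret_key}
```

Keeping credentials in environment variables (stored as secrets) rather than in notebook cells also prevents them from being committed to version control along with your notebooks.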
- In the Cluster storage section, configure the storage for your workbench. Select one of the following options:
  - Create new persistent storage to create storage that is retained after you shut down your workbench. Complete the relevant fields to define the storage:
    - Enter a name for the cluster storage.
    - Enter a description for the cluster storage.
    - Select a storage class for the cluster storage.
      Note: You cannot change the storage class after you add the cluster storage to the workbench.
    - Under Persistent storage size, enter a new size in gibibytes or mebibytes.
  - Use existing persistent storage to reuse existing storage and select the storage from the Persistent storage list.
- Optional: Add a connection to your workbench. A connection is a resource that contains the configuration parameters needed to connect to a data source or sink, such as an object storage bucket. You can use storage buckets for storing data, models, and pipeline artifacts. You can also use a connection to specify the location of a model that you want to deploy. In the Connections section, use an existing connection or create a new connection:
  - Use an existing connection as follows:
    - Click Attach existing connections.
    - From the Connection list, select a connection that you previously defined.
  - Create a new connection as follows:
    - Click Create connection. The Add connection dialog appears.
    - From the Connection type drop-down list, select the type of connection. The Connection details section appears.
    - If you selected S3 compatible object storage in the preceding step, configure the connection details:
      - In the Connection name field, enter a unique name for the connection.
      - Optional: In the Description field, enter a description for the connection.
      - In the Access key field, enter the access key ID for the S3-compatible object storage provider.
      - In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
      - In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
      - In the Region field, enter the default region of your S3-compatible object storage account.
      - In the Bucket field, enter the name of your S3-compatible object storage bucket.
      - Click Create.
    - If you selected URI in the preceding step, configure the connection details:
      - In the Connection name field, enter a unique name for the connection.
      - Optional: In the Description field, enter a description for the connection.
      - In the URI field, enter the Uniform Resource Identifier (URI).
      - Click Create.
- Click Create workbench.
Verification
- The workbench that you created appears on the Workbenches tab for the project.
- Any cluster storage that you associated with the workbench during the creation process appears on the Cluster storage tab for the project.
- The Status column on the Workbenches tab displays a status of Starting when the workbench server is starting, and Running when the workbench has successfully started.
- Optional: Click the open icon to open the IDE in a new window.
2.2. Starting a workbench
You can manually start a data science project’s workbench from the Workbenches tab on the project details page. By default, workbenches start immediately after you create them.
Prerequisites
- You have logged in to Red Hat OpenShift AI.
- If you are using OpenShift AI groups, you are part of the user group or admin group (for example, `rhoai-users` or `rhoai-admins`) in OpenShift.
- You have created a data science project that contains a workbench.
Procedure
- From the OpenShift AI dashboard, click Data science projects.
  The Data science projects page opens.
- Click the name of the project whose workbench you want to start.
  A project details page opens.
- Click the Workbenches tab.
- In the Status column for the workbench that you want to start, click Start.
  The Status column changes from Stopped to Starting when the workbench server is starting, and then to Running when the workbench has successfully started.
- Optional: Click the open icon to open the IDE in a new window.
Verification
- The workbench that you started appears on the Workbenches tab for the project, with the status of Running.
2.3. Updating a project workbench
If your data science work requires you to change your workbench’s notebook image, container size, or identifying information, you can update the properties of your project’s workbench. If you require extra power for use with large datasets, you can assign accelerators to your workbench to optimize performance.
Prerequisites
- You have logged in to Red Hat OpenShift AI.
- If you use OpenShift AI groups, you are part of the user group or admin group (for example, `rhoai-users` or `rhoai-admins`) in OpenShift.
- You have created a data science project that has a workbench.
Procedure
- From the OpenShift AI dashboard, click Data science projects.
  The Data science projects page opens.
- Click the name of the project whose workbench you want to update.
  A project details page opens.
- Click the Workbenches tab.
- Click the action menu (⋮) beside the workbench that you want to update and then click Edit workbench.
  The Edit workbench page opens.
- Update any of the workbench properties and then click Update workbench.
Verification
- The workbench that you updated appears on the Workbenches tab for the project.
2.4. Deleting a workbench from a data science project
You can delete workbenches from your data science projects to help you remove Jupyter notebooks that are no longer relevant to your work.
Prerequisites
- You have logged in to Red Hat OpenShift AI.
- If you are using OpenShift AI groups, you are part of the user group or admin group (for example, `rhoai-users` or `rhoai-admins`) in OpenShift.
- You have created a data science project with a workbench.
Procedure
- From the OpenShift AI dashboard, click Data science projects.
  The Data science projects page opens.
- Click the name of the project that you want to delete the workbench from.
  A project details page opens.
- Click the Workbenches tab.
- Click the action menu (⋮) beside the workbench that you want to delete and then click Delete workbench.
  The Delete workbench dialog opens.
- Enter the name of the workbench in the text field to confirm that you intend to delete it.
- Click Delete workbench.
Verification
- The workbench that you deleted is no longer displayed on the Workbenches tab for the project.
- The custom resource (CR) associated with the workbench’s Jupyter notebook is deleted.