Chapter 4. Creating a workbench and selecting an IDE


A workbench is an isolated area where you can examine and work with machine learning (ML) models. You can also work with data and run programs, for example to prepare and clean data. While a workbench is not required if, for example, you only want to serve an existing model, one is needed for most data science workflow tasks, such as writing code to process data or training a model.

When you create a workbench, you specify an image (an IDE, packages, and other dependencies). Supported IDEs include JupyterLab, code-server, and RStudio (Technology Preview).

The IDEs are based on a server-client architecture. Each IDE provides a server that runs in a container on the OpenShift cluster, while the user interface (the client) is displayed in your web browser. For example, the Jupyter notebook server runs in a container on the Red Hat OpenShift cluster, and the client is the JupyterLab interface that opens in a web browser on your local computer. All of the commands that you enter in JupyterLab are executed by the notebook server. code-server and RStudio Server work the same way. With this architecture, you interact through a browser on your local computer while all processing occurs on the cluster. Running on the cluster gives you access to larger resources and improves security, because the data being processed never leaves the cluster.

In a workbench, you can also configure connections (to access external data for training models and to save models so that you can deploy them) and cluster storage (for persisting data). Workbenches within the same project can share models and data through object storage with the data science pipelines and model servers.

For data science projects that require data retention, you can add container storage to the workbench you are creating.

Within a project, you can create multiple workbenches. When to create a new workbench depends on considerations, such as the following:

  • The workbench configuration (for example, CPU, RAM, or IDE). If you want to avoid editing the configuration of an existing workbench to accommodate a new task, you can create a new workbench instead.
  • Separation of tasks or activities. For example, you might want to use one workbench for your large language model (LLM) experimentation activities, another workbench dedicated to a demo, and another workbench for testing.

4.1. About workbench images

A workbench image (sometimes referred to as a notebook image) is optimized with the tools and libraries that you need for model development. You can use the provided workbench images, or an OpenShift AI administrator can create custom workbench images adapted to your needs.

To provide a consistent, stable platform for your model development, many provided workbench images contain the same version of Python. Most workbench images available on OpenShift AI are pre-built and ready for you to use immediately after OpenShift AI is installed or upgraded.

For information about Red Hat support of workbench images and packages, see Red Hat OpenShift AI: Supported Configurations.

The following table lists the workbench images that are installed with Red Hat OpenShift AI by default.

If the preinstalled packages that are provided in these images are not sufficient for your use case, you have the following options:

  • Install additional libraries after launching a default image. This option is good if you want to add libraries on an ad hoc basis as you develop models. However, it can be challenging to manage the dependencies of installed libraries and your changes are not saved when the workbench restarts.
  • Create a custom image that includes the additional libraries or packages. For more information, see Creating custom workbench images.
Important

Workbench images denoted with (Technology Preview) in this table are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using Technology Preview features in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Table 4.1. Default workbench images
Image name | Description

CUDA

If you are working with compute-intensive data science models that require GPU support, use the Compute Unified Device Architecture (CUDA) workbench image to gain access to the NVIDIA CUDA Toolkit. Using this toolkit, you can optimize your work by using GPU-accelerated libraries and optimization tools.

Standard Data Science

Use the Standard Data Science workbench image for models that do not require TensorFlow or PyTorch. This image contains commonly-used libraries to assist you in developing your machine learning models.

TensorFlow

TensorFlow is an open source platform for machine learning. With TensorFlow, you can build, train, and deploy your machine learning models. TensorFlow contains advanced data visualization features, such as computational graph visualizations. It also allows you to easily monitor and track the progress of your models.

PyTorch

PyTorch is an open source machine learning library optimized for deep learning. If you are working with computer vision or natural language processing models, use the PyTorch workbench image.

Minimal Python

If you do not require advanced machine learning features, or additional resources for compute-intensive data science work, you can use the Minimal Python image to develop your models.

TrustyAI

Use the TrustyAI workbench image to add model explainability, tracing, accountability, and runtime monitoring to your data science work. See the TrustyAI Explainability repository for some example Jupyter notebooks.

code-server

With the code-server workbench image, you can customize your workbench environment to meet your needs using a variety of extensions to add new languages, themes, debuggers, and connect to additional services. Enhance the efficiency of your data science work with syntax highlighting, auto-indentation, and bracket matching, as well as an automatic task runner for seamless automation. For more information, see code-server in GitHub.

NOTE: Elyra-based pipelines are not available with the code-server workbench image.

RStudio Server (Technology Preview)

Use the RStudio Server workbench image to access the RStudio IDE, an integrated development environment for R, a programming language for statistical computing and graphics. For more information, see the RStudio Server site.

To use the RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the rstudio-rhel9 image stream. For more information, see Building the RStudio Server workbench images.

Important

Disclaimer:
Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through https://rstudio.org/ and is subject to RStudio licensing terms. Review the licensing terms before you use this sample workbench.

CUDA - RStudio Server (Technology Preview)

Use the CUDA - RStudio Server workbench image to access the RStudio IDE and NVIDIA CUDA Toolkit. RStudio is an integrated development environment for R, a programming language for statistical computing and graphics. With the NVIDIA CUDA toolkit, you can optimize your work using GPU-accelerated libraries and optimization tools. For more information, see the RStudio Server site.

To use the CUDA - RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the cuda-rstudio-rhel9 image stream. For more information, see Building the RStudio Server workbench images.

Important

Disclaimer:
Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through https://rstudio.org/ and is subject to RStudio licensing terms. Review the licensing terms before you use this sample workbench.

The CUDA - RStudio Server workbench image contains NVIDIA CUDA technology. CUDA licensing information is available at https://docs.nvidia.com/cuda/. Review the licensing terms before you use this sample workbench.

4.2. Building the RStudio Server workbench images

Important

The RStudio Server and CUDA - RStudio Server workbench images are currently available in Red Hat OpenShift AI as Technology Preview features. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Note

The RStudio Server workbench images are currently unavailable for disconnected environments.

Red Hat OpenShift AI includes the following RStudio Server workbench images:

  • RStudio Server workbench image

    With the RStudio Server workbench image, you can access the RStudio IDE, an integrated development environment for the R programming language. R is used for statistical computing and graphics to support data analysis and predictions.

    Important

    Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench.

  • CUDA - RStudio Server workbench image

    With the CUDA - RStudio Server workbench image, you can access the RStudio IDE and NVIDIA CUDA Toolkit. The RStudio IDE is an integrated development environment for the R programming language for statistical computing and graphics. With the NVIDIA CUDA toolkit, you can enhance your work by using GPU-accelerated libraries and optimization tools.

    Important

    Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench.

    The CUDA - RStudio Server workbench image contains NVIDIA CUDA technology. CUDA licensing information is available in the CUDA Toolkit documentation. You should review their licensing terms before you use this sample workbench.

To use the RStudio Server and CUDA - RStudio Server workbench images, you must first build them by creating a secret and triggering the BuildConfig, and then enable them in the OpenShift AI UI by editing the rstudio-rhel9 and cuda-rstudio-rhel9 image streams.

Prerequisites

  • Before you start the RStudio Server build process, your cluster has at least 1 CPU and 2Gi of memory available for rstudio-server-rhel9, and 1.5 CPUs and 8Gi of memory available for cuda-rstudio-server-rhel9.
  • You are logged in to your OpenShift cluster.
  • You have the cluster-admin role in OpenShift.
  • You have an active Red Hat Enterprise Linux (RHEL) subscription.

Procedure

  1. Create a secret with Subscription Manager credentials. These are usually your Red Hat Customer Portal username and password.

    Note: The secret must be named rhel-subscription-secret, and its USERNAME and PASSWORD keys must be in capital letters.

    oc create secret generic rhel-subscription-secret --from-literal=USERNAME=<username> --from-literal=PASSWORD=<password> -n redhat-ods-applications
  2. Start the build:

    1. To start the lightweight RStudio Server build:

      oc start-build rstudio-server-rhel9 -n redhat-ods-applications --follow
    2. To start the CUDA-enabled RStudio Server build, trigger the cuda-rhel9 BuildConfig:

      oc start-build cuda-rhel9 -n redhat-ods-applications --follow

      The cuda-rhel9 build is a prerequisite for cuda-rstudio-rhel9. The cuda-rstudio-rhel9 build starts automatically.

  3. Use the following command to confirm that the build process completed successfully. Successful builds appear as Complete.

    oc get builds -n redhat-ods-applications
  4. After the builds complete successfully, use the following commands to make the workbench images available in the OpenShift AI UI.

    1. To enable the RStudio Server workbench image:

      oc label -n redhat-ods-applications imagestream rstudio-rhel9 opendatahub.io/notebook-image='true'
    2. To enable the CUDA - RStudio Server workbench image:

      oc label -n redhat-ods-applications imagestream cuda-rstudio-rhel9 opendatahub.io/notebook-image='true'
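
The build check in step 3 can also be scripted instead of read by eye. The following is a minimal sketch, not part of the documented procedure. It assumes that each build reports its state in the `.status.phase` field, and `all_builds_complete` is a hypothetical helper name. The function reads one `name<TAB>phase` pair per line and succeeds only when at least one build is listed and every build is Complete.

```shell
# Sketch: succeed only if every build listed on stdin is Complete.
# Input format, one build per line: <name><TAB><phase>
# all_builds_complete is a hypothetical helper, not an oc subcommand.
all_builds_complete() {
  abc_seen=0
  abc_pending=0
  abc_tab=$(printf '\t')
  while IFS="$abc_tab" read -r abc_name abc_phase; do
    # Skip empty lines.
    [ -n "$abc_name" ] || continue
    abc_seen=1
    if [ "$abc_phase" != "Complete" ]; then
      echo "not complete: $abc_name ($abc_phase)"
      abc_pending=1
    fi
  done
  # Fail when no builds were listed or when any build is not Complete.
  [ "$abc_seen" -eq 1 ] && [ "$abc_pending" -eq 0 ]
}

# Example use on a cluster (assumes you are logged in with oc):
#   oc get builds -n redhat-ods-applications \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | all_builds_complete && echo "all builds complete"
```

Treating an empty list as a failure is a deliberate choice here: it avoids reporting success when no build was ever started.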

Verification

  • You can see RStudio Server and CUDA - RStudio Server images on the Applications → Enabled menu in the Red Hat OpenShift AI dashboard.
  • You can see RStudio Server or CUDA - RStudio Server in the Data Science Projects → Workbenches → Create workbench → Notebook image → Image selection drop-down list.
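
As a command-line complement to the dashboard checks, you can list the image streams that carry the label applied in step 4 of the procedure. This is a sketch under assumptions: `list_enabled_workbench_images` is a hypothetical wrapper name, and the command confirms only that the label is present, not that the image is shown in the dashboard.

```shell
# Sketch: list image streams labeled as workbench images.
# list_enabled_workbench_images is a hypothetical wrapper name.
# Requires an active oc login when you call the function.
list_enabled_workbench_images() {
  oc get imagestream -n redhat-ods-applications \
    -l opendatahub.io/notebook-image=true -o name
}
```

After you label both image streams, calling `list_enabled_workbench_images` should include rstudio-rhel9 and cuda-rstudio-rhel9 in its output.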

4.3. Creating a workbench

When you create a workbench, you specify an image (an IDE, packages, and other dependencies). You can also configure connections and cluster storage, and add container storage.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you use OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins) in OpenShift.
  • You created a project.
  • If you created a Simple Storage Service (S3) account outside of Red Hat OpenShift AI and you want to create connections to your existing S3 storage buckets, you have the following credential information for the storage buckets:

    • Endpoint URL
    • Access key
    • Secret key
    • Region
    • Bucket name

    For more information, see Working with data in an S3-compatible object store.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.

    The Data Science Projects page opens.

  2. Click the name of the project that you want to add the workbench to.

    A project details page opens.

  3. Click the Workbenches tab.
  4. Click Create workbench.

    The Create workbench page opens.

  5. In the Name field, enter a unique name for your workbench.
  6. Optional: If you want to change the default resource name for your workbench, click Edit resource name.

    The resource name is what your resource is labeled in OpenShift. Valid characters include lowercase letters, numbers, and hyphens (-). The resource name cannot exceed 30 characters, and it must start with a letter and end with a letter or number.

    Note: You cannot change the resource name after the workbench is created. You can edit only the display name and the description.

  7. Optional: In the Description field, enter a description for your workbench.
  8. In the Notebook image section, complete the fields to specify the workbench image to use with your workbench.

    From the Image selection list, select a workbench image that suits your use case. A workbench image includes an IDE and Python packages (reusable code). Optionally, click View package information to view a list of packages that are included in the image that you selected.

    If the workbench image has multiple versions available, select the workbench image version to use from the Version selection list. To use the latest package versions, Red Hat recommends that you use the most recently added image.

    Note

    You can change the workbench image after you create the workbench.

  9. In the Deployment size section, from the Container size list, select a container size for your server. The container size specifies the number of CPUs and the amount of memory allocated to the container, setting the guaranteed minimum (request) and maximum (limit) for both.
  10. Optional: In the Environment variables section, select and specify values for any environment variables.

    Setting environment variables during the workbench configuration helps you save time later because you do not need to define them in the body of your notebooks, or with the IDE command line interface.

    If you are using S3-compatible storage, add these recommended environment variables:

    • AWS_ACCESS_KEY_ID specifies your Access Key ID for Amazon Web Services.
    • AWS_SECRET_ACCESS_KEY specifies your Secret access key for the account specified in AWS_ACCESS_KEY_ID.

    If you select Secret when you add the variable, OpenShift AI stores the credentials as Kubernetes secrets in a protected namespace.

  11. In the Cluster storage section, configure the storage for your workbench. Select one of the following options:

    • Create new persistent storage to create storage that is retained after you shut down your workbench. Complete the relevant fields to define the storage:

      1. Enter a name for the cluster storage.
      2. Enter a description for the cluster storage.
      3. Select a storage class for the cluster storage.

        Note

        You cannot change the storage class after you add the cluster storage to the workbench.

      4. Under Persistent storage size, enter a new size in gibibytes or mebibytes.
    • Use existing persistent storage to reuse existing storage and select the storage from the Persistent storage list.
  12. Optional: You can add a connection to your workbench. A connection is a resource that contains the configuration parameters needed to connect to a data source or sink, such as an object storage bucket. You can use storage buckets for storing data, models, and pipeline artifacts. You can also use a connection to specify the location of a model that you want to deploy.

    In the Connections section, use an existing connection or create a new connection:

    • Use an existing connection as follows:

      1. Click Attach existing connections.
      2. From the Connection list, select a connection that you previously defined.
    • Create a new connection as follows:

      1. Click Create connection. The Add connection dialog appears.
      2. From the Connection type drop-down list, select the type of connection. The Connection details section appears.
      3. If you selected S3 compatible object storage in the preceding step, configure the connection details:

        1. In the Connection name field, enter a unique name for the connection.
        2. Optional: In the Description field, enter a description for the connection.
        3. In the Access key field, enter the access key ID for the S3-compatible object storage provider.
        4. In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
        5. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
        6. In the Region field, enter the default region of your S3-compatible object storage account.
        7. In the Bucket field, enter the name of your S3-compatible object storage bucket.
        8. Click Create.
      4. If you selected URI in the preceding step, configure the connection details:

        1. In the Connection name field, enter a unique name for the connection.
        2. Optional: In the Description field, enter a description for the connection.
        3. In the URI field, enter the Uniform Resource Identifier (URI).
        4. Click Create.
  13. Click Create workbench.
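
The resource name rules in step 6 can be checked before you type a name into the dashboard. The following is a minimal sketch that encodes only the rules stated in that step (lowercase letters, numbers, and hyphens; at most 30 characters; starts with a letter; ends with a letter or number). `is_valid_resource_name` is a hypothetical helper name, and the dashboard might apply further checks that are not covered here.

```shell
# Sketch: check a candidate resource name against the rules in step 6.
# is_valid_resource_name is a hypothetical helper name.
# The body runs in a subshell so that LC_ALL=C does not leak to the caller;
# the C locale keeps the a-z range limited to lowercase ASCII letters.
is_valid_resource_name() (
  LC_ALL=C
  rn=$1
  # Length must be between 1 and 30 characters.
  [ "${#rn}" -ge 1 ] && [ "${#rn}" -le 30 ] || return 1
  case $rn in
    *[!a-z0-9-]*) return 1 ;;  # contains a character that is not allowed
    [!a-z]*)      return 1 ;;  # does not start with a letter
    *-)           return 1 ;;  # ends with a hyphen
  esac
  return 0
)
```

For example, `is_valid_resource_name "llm-experiments" && echo ok` prints `ok`, while a name such as `Demo` or `demo-` is rejected.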

Verification

  • The workbench that you created appears on the Workbenches tab for the project.
  • Any cluster storage that you associated with the workbench during the creation process appears on the Cluster storage tab for the project.
  • The Status column on the Workbenches tab displays a status of Starting when the workbench server is starting, and Running when the workbench has successfully started.
  • Optional: Click the Open link to open the IDE in a new window.
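
After the workbench is running, you can confirm from a terminal inside the IDE that the S3 environment variables from step 10 of the procedure reached the container. The following is a minimal sketch, assuming that you added both recommended variables. `check_s3_env` is a hypothetical helper name, and it reports only whether each variable is set, never its value.

```shell
# Sketch: report whether the recommended S3 variables are set.
# check_s3_env is a hypothetical helper name. Values are never printed.
check_s3_env() {
  cse_missing=0
  for cse_var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect variable lookup that also works in plain POSIX shells.
    eval "cse_val=\${$cse_var:-}"
    if [ -n "$cse_val" ]; then
      echo "$cse_var is set"
    else
      echo "$cse_var is missing"
      cse_missing=1
    fi
  done
  unset cse_val
  # Succeed only when both variables are present.
  [ "$cse_missing" -eq 0 ]
}
```

Run `check_s3_env` in the workbench terminal; a nonzero exit status means that at least one variable was not set in the workbench configuration.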

© 2024 Red Hat, Inc.