Search

Chapter 4. Creating a workbench and selecting an IDE

download PDF

A workbench is an isolated area where you can examine and work with ML models. You can also work with data and run programs, for example to prepare and clean data. While a workbench is not required if, for example, you only want to service an existing model, one is needed for most data science workflow tasks, such as writing code to process data, or training a model.

When you create a workbench, you specify an image (an IDE, packages, and other dependencies). Supported IDEs include JupyterLab, code-server (Technology Preview), and RStudio (Technology Preview).

The IDEs are based on a server-client architecture. Each IDE provides a server that runs in a container on the OpenShift cluster, while the user interface (the client) is displayed in your web browser. For example, the Jupyter notebook server runs in a container on the Red Hat OpenShift cluster. The client is the JupyterLab interface that opens in your web browser on your local computer. All of the commands that you enter in JupyterLab are executed by the notebook server. Similarly, other IDEs like code-server or RStudio Server provide a server that runs in a container on the OpenShift cluster, while the user interface is displayed in your web browser. This architecture allows you to interact through your local computer in a browser environment, while all processing occurs on the cluster. The cluster provides the benefits of larger available resources and security because the data being processed never leaves the cluster.

In a workbench, you also configure data connections (to access external data for training models and to save models so that you can deploy them) and cluster storage (for persisting data). Workbenches within the same project can share models and data through object storage with the data science pipelines and model servers.

For data science projects that require data retention, you can add container storage to the workbench you are creating.

Within a project, you can create multiple workbenches. When to create a new workbench depends on considerations, such as the following:

  • The workbench configuration (for example, CPU, RAM, or IDE). If you want to avoid editing the configuration of an existing workbench’s configuration to accomodate a new task, you can create a new workbench instead.
  • Separation of tasks or activities. For example, you might want to use one workbench for your Large Language Models (LLM) experimentation activities, another workbench dedicated to a demo, and another workbench for testing.

4.1. About workbench images

A workbench image (sometimes referred to as a notebook image) is optimized with the tools and libraries that you need for model development. You can use the provided workbench images or an OpenShift AI admin user can create custom workbench images adapted to your needs.

To provide a consistent, stable platform for your model development, many provided workbench images contain the same version of Python. Most workbench images available on OpenShift AI are pre-built and ready for you to use immediately after OpenShift AI is installed or upgraded.

For information about Red Hat support of workbench images and packages, see Red Hat OpenShift AI: Supported Configurations.

Red Hat OpenShift AI contains the following notebook images that are available by default.

Important

Notebook images denoted with (Technology Preview) in this table are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using Technology Preview features in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Table 4.1. Default notebook images
Image nameDescription

CUDA

If you are working with compute-intensive data science models that require GPU support, use the Compute Unified Device Architecture (CUDA) notebook image to gain access to the NVIDIA CUDA Toolkit. Using this toolkit, you can optimize your work by using GPU-accelerated libraries and optimization tools.

Standard Data Science

Use the Standard Data Science notebook image for models that do not require TensorFlow or PyTorch. This image contains commonly-used libraries to assist you in developing your machine learning models.

TensorFlow

TensorFlow is an open source platform for machine learning. With TensorFlow, you can build, train and deploy your machine learning models. TensorFlow contains advanced data visualization features, such as computational graph visualizations. It also allows you to easily monitor and track the progress of your models.

PyTorch

PyTorch is an open source machine learning library optimized for deep learning. If you are working with computer vision or natural language processing models, use the Pytorch notebook image.

Minimal Python

If you do not require advanced machine learning features, or additional resources for compute-intensive data science work, you can use the Minimal Python image to develop your models.

TrustyAI

Use the TrustyAI notebook image to leverage your data science work with model explainability, tracing, and accountability, and runtime monitoring.

HabanaAI

The HabanaAI notebook image optimizes high-performance deep learning (DL) with Habana Gaudi devices. Habana Gaudi devices accelerate DL training workloads and maximize training throughput and efficiency.

code-server (Technology Preview)

With the code-server notebook image, you can customize your notebook environment to meet your needs using a variety of extensions to add new languages, themes, debuggers, and connect to additional services. Enhance the efficiency of your data science work with syntax highlighting, auto-indentation, and bracket matching, as well as an automatic task runner for seamless automation. For more information, see code-server in GitHub.

Note

Elyra-based pipelines are not available with the code-server notebook image.

RStudio Server (Technology preview)

Use the RStudio Server notebook image to access the RStudio IDE, an integrated development environment for R, a programming language for statistical computing and graphics. For more information, see the RStudio Server site.

To use the RStudio Server notebook image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the rstudio-rhel9 image stream. For more information, see Building the RStudio Server workbench images.

Important

Disclaimer:
Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through https://rstudio.org/ and is subject to RStudio licensing terms. Review the licensing terms before you use this sample workbench.

CUDA - RStudio Server (Technology preview)

Use the CUDA - RStudio Server notebook image to access the RStudio IDE and NVIDIA CUDA Toolkit. RStudio is an integrated development environment for R, a programming language for statistical computing and graphics. With the NVIDIA CUDA toolkit, you can optimize your work using GPU-accelerated libraries and optimization tools. For more information, see the RStudio Server site.

To use the CUDA - RStudio Server notebook image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the cuda-rstudio-rhel9 image stream. For more information, see Building the RStudio Server workbench images.

Important

Disclaimer:
Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through https://rstudio.org/ and is subject to RStudio licensing terms. Review the licensing terms before you use this sample workbench.

The CUDA - RStudio Server notebook image contains NVIDIA CUDA technology. CUDA licensing information is available at https://docs.nvidia.com/cuda/. Review the licensing terms before you use this sample workbench.

4.2. Building the RStudio Server workbench images

Important

The RStudio Server and CUDA - RStudio Server workbench images are currently available in Red Hat OpenShift AI as Technology Preview features.

Note

The RStudio Server workbench images are currently unavailable for disconnected environments.

Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Red Hat OpenShift AI includes the following RStudio Server workbench images:

  • RStudio Server workbench image

    With the RStudio Server workbench image, you can access the RStudio IDE, an integrated development environment for the R programming language. R is used for statistical computing and graphics to support data analysis and predictions.

    Important

    Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench.

  • CUDA - RStudio Server workbench image

    With the CUDA - RStudio Server workbench image, you can access the RStudio IDE and NVIDIA CUDA Toolkit. The RStudio IDE is an integrated development environment for the R programming language for statistical computing and graphics. With the NVIDIA CUDA toolkit, you can enhance your work by using GPU-accelerated libraries and optimization tools.

    Important

    Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench.

    The CUDA - RStudio Server workbench image contains NVIDIA CUDA technology. CUDA licensing information is available in the CUDA Toolkit documentation. You should review their licensing terms before you use this sample workbench.

To use the RStudio Server and CUDA - RStudio Server workbench images, you must first build them by creating a secret and triggering the BuildConfig, and then enable them in the OpenShift AI UI by editing the rstudio-rhel9 and cuda-rstudio-rhel9 image streams.

Prerequisites

  • Before starting the RStudio Server build process, you have at least 1 CPU and 2Gi memory available for rstudio-server-rhel9, and 1.5 CPUs and 8Gi memory available for cuda-rstudio-server-rhel9 on your cluster.
  • You are logged in to your OpenShift cluster.
  • You have the cluster-admin role in OpenShift Container Platform.
  • You have an active Red Hat Enterprise Linux (RHEL) subscription.

Procedure

  1. Create a secret with Subscription Manager credentials. These are usually your Red Hat Customer Portal username and password.

    Note: The secret must be named rhel-subscription-secret, and its USERNAME and PASSWORD keys must be in capital letters.

    oc create secret generic rhel-subscription-secret --from-literal=USERNAME=<username> --from-literal=PASSWORD=<password> -n redhat-ods-applications
  2. Start the build:

    1. To start the lightweight RStudio Server build:

      oc start-build rstudio-server-rhel9 -n redhat-ods-applications --follow
    2. To start the CUDA-enabled RStudio Server build, trigger the cuda-rhel9 BuildConfig:

      oc start-build cuda-rhel9 -n redhat-ods-applications --follow

      The cuda-rhel9 build is a prerequisite for cuda-rstudio-rhel9. The cuda-rstudio-rhel9 build starts automatically.

  3. Confirm that the build process has completed successfully using the following command. Successful builds appear as Complete.

    oc get builds -n redhat-ods-applications
  4. After the builds complete successfully, use the following commands to make the workbench images available in the OpenShift AI UI.

    1. To enable the RStudio Server workbench image:

      oc label -n redhat-ods-applications imagestream rstudio-rhel9 opendatahub.io/notebook-image='true'
    2. To enable the CUDA - RStudio Server workbench image:

      oc label -n redhat-ods-applications imagestream cuda-rstudio-rhel9 opendatahub.io/notebook-image='true'

Verification

  • You can see RStudio Server and CUDA - RStudio Server images on the Applications Enabled menu in the Red Hat OpenShift AI dashboard.
  • You can see R Studio Server or CUDA - RStudio Server in the Data Science Projects Workbenches Create workbench Notebook image Image selection dropdown list.

4.3. Creating a workbench

When you create a workbench, you specify an image (an IDE, packages, and other dependencies). You can also configure data connections, cluster storage, and add container storage.

Prerequisites

  • You have logged in to Red Hat OpenShift AI.
  • If you use specialized OpenShift AI groups, you are part of the user group or admin group (for example, rhoai-users or rhoai-admins ) in OpenShift.
  • You created a project.
  • If you created a Simple Storage Service (S3) account outside of Red Hat OpenShift AI and you want to create data connections to your existing S3 storage buckets, you have the following credential information for the storage buckets:

    • Endpoint URL
    • Access key
    • Secret key
    • Region
    • Bucket name

    For information about working with data stored in AWS S3, see Integrating data from Amazon S3.

Procedure

  1. From the OpenShift AI dashboard, click Data Science Projects.
  2. Click the name of the project that you want to add the workbench to.

    A Details page for the project opens.

  3. In the Workbenches section, click Create a workbench.
  4. In the Create workbench page, configure the properties of the workbench that you are creating.

    1. In the Name field, enter a name for your workbench.
    2. Optional: In the Description field, enter a description to define your workbench.
    3. In the Notebook image section, complete the fields to specify the workbench image to use with your workbench.

      From the Image selection list, select a workbench image that suits your use case. A workbench image includes an IDE and Python packages (reusable code). Optionally, click the View package information option to view a list of packages that are included in the image that you selected.

      If the workbench image has multiple versions available, select the workbench image version to use from the Versions section. To use the latest package versions, Red Hat recommends that you use the most recently added image.

      Note

      You can change the workbench image after you create the workbench.

    4. In the Deployment size section, from the Container size list, select a container size for your server. The container size controls the number of CPUs, the amount of memory, and the minimum and maximum request capacity of the container.
    5. Optional: Select and specify values for any environment variables.

      Setting environment variables during the workbench configuration helps you save time later because you do not need to define them in the body of your notebooks, or with the IDE command line interface.

      If you are using S3-compatible storage, add these recommended environment variables:

      • AWS_ACCESS_KEY_ID specifies your Access Key ID for Amazon Web Services.
      • AWS_SECRET_ACCESS_KEY specifies your Secret access key for the account specified in AWS_ACCESS_KEY_ID.

      OpenShift AI stores the credentials as Kubernetes secrets in a protected namespace if you select Secret when you add the variable.

    6. Configure the storage for your workbench. Select one of the following options:

      • Create new persistent storage to create storage that is retained after you shut down your workbench. Complete the relevant fields to define the storage.
      • Use existing persistent storage to reuse existing storage and select the storage from the Persistent storage list.
    7. Optionally, you can add a data connection to your workbench. A data connection is a resource that contains the configuration parameters needed to connect to a data source or an object storage bucket. Currently, only S3-Compatible data connections are supported. You can use storage buckets for storing data, models, and pipeline artifacts. You can also use a data connection to specify the location of a model that you want to deploy.

      In the Data connections section, select the Use a data connection checkbox.

      • Create a new data connection as follows:

        1. Select Create new data connection.
        2. In the Name field, enter a unique name for the data connection.
        3. In the Access key field, enter the access key ID for the S3-compatible object storage provider.
        4. In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
        5. In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
        6. In the Region field, enter the default region of your S3-compatible object storage account.
        7. In the Bucket field, enter the name of your S3-compatible object storage bucket.
      • Use an existing data connection as follows:

        1. Select Use existing data connection.
        2. From the Data connection list, select a data connection that you previously defined.
  5. Click Create workbench.

Verification

  • The workbench that you created appears on the Workbenches tab for the project.
  • Any cluster storage that you associated with the workbench during the creation process appears on the Cluster storage tab for the project.
  • The Status column on the Workbenches tab displays a status of Starting when the workbench server is starting, and Running when the workbench has successfully started.
  • Optionally, click the Open link to open the IDE in a new window.
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.