Chapter 2. Creating custom workbench images


Red Hat OpenShift AI includes a selection of default workbench images that a data scientist can select when they create or edit a workbench.

In addition, you can import a custom workbench image, for example, if you want to add libraries that data scientists often use, or if your data scientists require a specific version of a library that is different from the version provided in a default image. Custom workbench images are also useful if your data scientists require operating system packages or applications because they cannot install them directly in their running environment (data scientist users do not have root access, which is needed for those operations).

A custom workbench image is simply a container image. You build one as you would build any standard container image, by using a Containerfile (or Dockerfile). You start from an existing image (the FROM instruction), and then add your required elements.

You have the following options for creating a custom workbench image:

Important

Red Hat supports adding custom workbench images to your deployment of OpenShift AI, ensuring that they are available for selection when creating a workbench. However, Red Hat does not support the contents of your custom workbench image. That is, if your custom workbench image is available for selection during workbench creation, but does not create a usable workbench, Red Hat does not provide support to fix your custom workbench image.

Additional resources

For a list of the OpenShift AI default workbench images and their preinstalled packages, see Supported Configurations.

For more information about creating images, see the following resources:

2.1. Creating a custom image from a default OpenShift AI image

After Red Hat OpenShift AI is installed on a cluster, you can find the default workbench images in the OpenShift console, under Builds ImageStreams for the redhat-ods-applications project.

You can create a custom image by adding OS packages or applications to a default OpenShift AI image.

Prerequisites

  • You know which default image you want to use as the base for your custom image.

    See Supported Configurations for a list of the OpenShift AI default workbench images and their preinstalled packages.

  • You have cluster-admin access to the OpenShift console for the cluster where OpenShift AI is installed.

Procedure

  1. Obtain the location of the default image that you want to use as the base for your custom image.

    1. In the OpenShift console, select Builds ImageStreams.
    2. Select the redhat-ods-applications project.
    3. From the list of installed imagestreams, click the name of the image that you want to use as the base for your custom image. For example, click pytorch.
    4. On the ImageStream details page, click YAML.
    5. In the spec:tags section, find the tag for the version of the image that you want to use.

      The location of the original image is shown in the tag’s from:name section, for example:

      name: 'quay.io/modh/odh-pytorch-notebook@sha256:b68e0192abf7d…'

    6. Copy this location for use in your custom image.
  2. Create a standard Containerfile or Dockerfile.
  3. For the FROM instruction, specify the base image location that you copied in Step 1, for example:

    FROM quay.io/modh/odh-pytorch-notebook@sha256:b68e0…

  4. Optional: Install OS images:

    1. Switch to USER 0 (USER 0 is required to install OS packages).
    2. Install the packages.
    3. Switch back to USER 1001.

      The following example creates a custom workbench image that adds Java to the default PyTorch image:

       FROM quay.io/modh/odh-pytorch-notebook@sha256:b68e0…
      
       USER 0
      
       RUN INSTALL_PKGS="java-11-openjdk java-11-openjdk-devel" && \
          dnf install -y --setopt=tsflags=nodocs $INSTALL_PKGS && \
          dnf -y clean all --enablerepo=*
      
       USER 1001
  5. Optional: Add Python packages:

    1. Specify USER 1001.
    2. Copy the requirements.txt file.
    3. Install the packages.

      The following example installs packages from the requirements.txt file in the default PyTorch image:

       FROM quay.io/modh/odh-pytorch-notebook@sha256:b68e0…
      
       USER 1001
      
       COPY requirements.txt ./requirements.txt
      
       RUN pip install -r requirements.txt
  6. Build the image file. For example, you can use podman build locally where the image file is located and then push the image to a registry that is accessible to OpenShift AI:

    $ podman build -t my-registry/my-custom-image:0.0.1 .
    $ podman push my-registry/my-custom-image:0.0.1

    Alternatively, you can leverage OpenShift’s image build capabilities by using BuildConfig.

2.2. Creating a custom image from your own image

You can build your own custom image. However, you must make sure that your image is compatible with OpenShift and OpenShift AI.

Additional resources

2.2.1. Basic guidelines for creating your own workbench image

The following basic guidelines provide information to consider when you build your own custom workbench image.

Designing your image to run with USER 1001

In OpenShift, your container will run with a random UID and a GID of 0. Make sure that your image is compatible with these user and group requirements, especially if you need write access to directories. Best practice is to design your image to run with USER 1001.

Avoid placing artifacts in $HOME

The persistent volume attached to the workbench will be mounted on /opt/app-root/src. This location is also the location of $HOME. Therefore, do not put any files or other resources directly in $HOME because they won’t be visible after the workbench is deployed (and the persistent volume is mounted).

Specifying the API endpoint

OpenShift readiness and liveness probes will query the /api endpoint. For a Jupyter IDE, this is the default endpoint. For other IDEs, you must implement the /api endpoint.

2.2.2. Advanced guidelines for creating your own workbench image

The following guidelines provide information to consider when you build your own custom workbench image.

Minimizing image size

A workbench image uses a "layered" file system. Every time you use a COPY or a RUN command in your workbench image file, a new layer is created. Artifacts are not deleted. When you remove an artifact, for example, a file, it is "masked" in the next layer. Therefore, consider the following guidelines when you create your workbench image file.

  • Avoid using the dnf update command.

    • If you start from an image that is constantly updated, such as ubi9/python-39 from the Red Hat Catalog, you might not need to use the dnf update command. This command fetches new metadata, updates files that might not have impact, and increases the workbench image size.
    • Point to a newer version of your base image rather than performing a dnf update on an older version.
  • Group RUN commands. Chain your commands by adding && \ at the end of each line.
  • If you must compile code (such as a library or an application) to include in your custom image, implement multi-stage builds so that you avoid including the build artifacts in your final image. That is, compile the library or application in an intermediate image and then copy the result to your final image, leaving behind build artifacts that you do not want included.

Setting access to files and directories

  • Set the ownership of files and folders to 1001:0 (user "default", group "0"), for example:

    COPY --chown=1001:0 os-packages.txt ./

    On OpenShift, every container is in a standard namespace (unless you modify security). The container runs with a user that has a random user ID (uid) and with a group ID (gid) of 0. Therefore, all folders that you want to write to - and all the files you want to (temporarily) modify - in your image must be accessible by the user that has the random user ID (uid). Alternatively, you can set access to any user, as shown in the following example:

    COPY --chmod=775 os-packages.txt ./
  • Build your image with /opt/app-root/src as the default location for the data that you want persisted, for example:

    WORKDIR /opt/app-root/src

    When a user launches a workbench from the OpenShift AI Applications Enabled page, the "personal" volume of the user is mounted at /opt/app-root/src. Because this location is not configurable, when you build your custom image, you must specify this default location for persisted data.

  • Fix permissions to support PIP (the package manager for Python packages) in OpenShift environments. Add the following command to your custom image (if needed, change python3.9 to the Python version that you are using):

    chmod -R g+w /opt/app-root/lib/python3.9/site-packages && \
       fix-permissions /opt/app-root -P
  • A service within your workbench image must answer at ${NB_PREFIX}/api, otherwise the OpenShift liveness/readiness probes fail and delete the pod for the workbench image.

    The NB_PREFIX environment variable specifies the URL path where the container is expected to be listening.

    The following is an example of an Nginx configuration:

    location = ${NB_PREFIX}/api {
    	return 302  /healthz;
    	access_log  off;
    }
  • For idle culling to work, the ${NB_PREFIX}/api/kernels URL must return a specifically-formatted JSON payload, as shown in the following example:

    The following is an example of an Nginx configuration:

    location = ${NB_PREFIX}/api/kernels {
    	return 302 $custom_scheme://$http_host/api/kernels/;
    	access_log  off;
    }
    
    location ${NB_PREFIX}/api/kernels/ {
    	return 302 $custom_scheme://$http_host/api/kernels/;
    	access_log  off;
    }
    
    location /api/kernels/ {
      index access.cgi;
      fastcgi_index access.cgi;
      gzip  off;
      access_log	off;
     }

    The returned JSON payload should be:

    {"id":"rstudio","name":"rstudio","last_activity":(time in ISO8601 format),"execution_state":"busy","connections": 1}

Enabling CodeReady Builder (CRB) and Extra Packages for Enterprise Linux (EPEL)

CRB and EPEL are repositories that provide packages which are absent from a standard Red Hat Enterprise Linux (RHEL) or Universal Base Image (UBI) installation. They are useful and required for installing some software, for example, RStudio.

On UBI9 images, CRB is enabled by default. To enable EPEL on UBI9-based images, run the following command:

 RUN yum install -y https://download.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm

To enable CRB and EPEL on Centos Stream 9-based images, run the following command:

 RUN yum install -y yum-utils && \
    yum-config-manager --enable crb && \
    yum install -y https://download.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm

Adding Elyra compatibility

Support for data science pipelines V2 (provided with the odh-elyra package) is available in Red Hat OpenShift AI version 2.9 and later. Previous versions of OpenShift AI support data science pipelines V1 (provided with the elyra package).

If you want your custom image to support data science pipelines V2, you must address the following requirements:

  • Include the odh-elyra package for having support with Data Science pipeline V2 (not the elyra package), for example:

     USER 1001
    
     RUN pip install odh-elyra
  • If you want to include the data science pipeline configuration automatically, as a runtime configuration, add an annotation when you import a custom workbench image.

2.3. Enabling custom images in OpenShift AI

All OpenShift AI administrators can import custom workbench images, by default, by selecting the Settings Notebook images navigation option in the OpenShift AI dashboard.

If the Settings Notebook images option is not available, check the following settings, depending on which navigation element does not appear in the dashboard:

  • The Settings menu does not appear in the OpenShift AI navigation bar

    The visibility of the OpenShift AI dashboard Settings menu is determined by your user permissions. By default, the Settings menu is available to OpenShift AI administration users (users that are members of the rhoai-admins group). Users with the OpenShift cluster-admin role are automatically added to the rhoai-admins group and are granted administrator access in OpenShift AI.

    For more information about user permissions, see Managing users and groups.

  • The Notebook images menu option does not appear under the Settings menu

    The visibility of the Notebook images menu option is controlled in the dashboard configuration, by the value of the dashboardConfig: disableBYONImageStream option. It is set to false (the Notebook images menu option is visible) by default.

    You need cluster-admin permissions to edit the dashboard configuration.

    For more information about setting dashboard configuration options, see Customizing the dashboard.

2.4. Importing a custom workbench image

In addition to workbench images provided and supported by Red Hat and independent software vendors (ISVs), you can import custom workbench images that cater to your project’s specific requirements.

You must import it so that your OpenShift AI users (data scientists) can access it when they create a project workbench.

Red Hat supports adding custom workbench images to your deployment of OpenShift AI, ensuring that they are available for selection when creating a workbench. However, Red Hat does not support the contents of your custom workbench image. That is, if your custom workbench image is available for selection during workbench creation, but does not create a usable workbench, Red Hat does not provide support to fix your custom workbench image.

Prerequisites

  • You have logged in to OpenShift AI as a user with OpenShift AI administrator privileges.
  • Your custom image exists in an image registry that is accessible to OpenShift AI.
  • The Settings Notebook images dashboard navigation menu option is enabled, as described in Enabling custom workbench images in OpenShift AI.
  • If you want to associate an accelerator with the custom image that you want to import, you know the accelerator’s identifier - the unique string that identifies the hardware accelerator.

Procedure

  1. From the OpenShift AI dashboard, click Settings Notebook images.

    The Notebook images page appears. Previously imported images are displayed. To enable or disable a previously imported image, on the row containing the relevant image, click the toggle in the Enable column.

  2. Optional: If you want to associate an accelerator and you have not already created an accelerator profile, click Create profile on the row containing the image and complete the relevant fields. If the image does not contain an accelerator identifier, you must manually configure one before creating an associated accelerator profile.
  3. Click Import new image. Alternatively, if no previously imported images were found, click Import image.

    The Import Notebook images dialog appears.

  4. In the Image location field, enter the URL of the repository containing the image. For example: quay.io/my-repo/my-image:tag, quay.io/my-repo/my-image@sha256:xxxxxxxxxxxxx, or docker.io/my-repo/my-image:tag.
  5. In the Name field, enter an appropriate name for the image.
  6. Optional: In the Description field, enter a description for the image.
  7. Optional: From the Accelerator identifier list, select an identifier to set its accelerator as recommended with the image. If the image contains only one accelerator identifier, the identifier name displays by default.
  8. Optional: Add software to the image. After the import has completed, the software is added to the image’s meta-data and displayed on the Jupyter server creation page.

    1. Click the Software tab.
    2. Click the Add software button.
    3. Click Edit ( The Edit icon ).
    4. Enter the Software name.
    5. Enter the software Version.
    6. Click Confirm ( The Confirm icon ) to confirm your entry.
    7. To add additional software, click Add software, complete the relevant fields, and confirm your entry.
  9. Optional: Add packages to the notebook images. After the import has completed, the packages are added to the image’s meta-data and displayed on the Jupyter server creation page.

    1. Click the Packages tab.
    2. Click the Add package button.
    3. Click Edit ( The Edit icon ).
    4. Enter the Package name. For example, if you want to include data science pipeline V2 automatically, as a runtime configuration, type odh-elyra.
    5. Enter the package Version. For example, type 3.16.7.
    6. Click Confirm ( The Confirm icon ) to confirm your entry.
    7. To add an additional package, click Add package, complete the relevant fields, and confirm your entry.
  10. Click Import.

Verification

  • The image that you imported is displayed in the table on the Notebook images page.
  • Your custom image is available for selection when a user creates a workbench.
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.