Chapter 2. Set up your working environment
To set up your working environment for customizing models, complete these tasks:
- For disconnected environments, mirror the Python index.
- Create a custom workbench image from a base image that is configured to use the Red Hat Python index, and then install packages in it. Install JupyterLab in your custom workbench image so that you can run example notebooks.
- From your running workbench, import example notebooks.
2.1. About the Red Hat Python index
Red Hat AI includes a maintained Python package index that provides secure and reliable access to supported libraries, with full support for disconnected environments. For details about Red Hat support for the Python package index, see Support philosophy: A secure platform.
Table 2.1 lists the images that are configured to use the Red Hat Python index.
| Accelerator (UBI9) | List of packages | Registry URL | Catalog URL |
|---|---|---|---|
| CPU | https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/ | registry.redhat.io/rhai/base-image-cpu-rhel9:3.2.0-1764872006 | https://catalog.redhat.com/software/containers/rhai/base-image-cpu-rhel9/690377f9d1c73dd1e81192f0 |
| CUDA | https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cuda12.9-ubi9/simple/ | registry.redhat.io/rhai/base-image-cuda-rhel9:3.2.0-1765367347 | https://catalog.redhat.com/software/containers/rhai/base-image-cuda-rhel9/690377f9e1522d6afa972cc6 |
| ROCm | https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/rocm6.4-ubi9/simple/ | registry.redhat.io/rhai/base-image-rocm-rhel9:3.2.0-1764877298 | https://catalog.redhat.com/software/containers/rhai/base-image-rocm-rhel9/690377f9e1522d6afa972cc9 |
Notes:
- NVIDIA CUDA, AMD GPU, and AMD ROCm RPM repositories are configured, but disabled.
The images listed in Table 2.1 have RHEL RPM repositories enabled. A RHEL RPM is a package file used for the Red Hat Package Manager system on Red Hat Enterprise Linux (RHEL). An RPM file contains all the necessary components for an application, such as executable files, configuration files, and documentation. It simplifies the process of distributing, installing, and managing software by bundling everything into a single, standalone file.
You can install additional RPMs, but you must have a Red Hat Extended Update Support (EUS) subscription and you must run your container image in root mode (for example, podman run --user 0).
For more information about Red Hat Package Manager, see Introduction to RPM.
2.2. Mirror the Python index for your disconnected environment
If you are using a disconnected environment, use the following code example to access the Red Hat Python index content and copy it locally. You can then upload the packages into your own internal hosting service:
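A minimal sketch of one way to copy index content locally. It assumes that the Red Hat Python index follows the standard "simple" repository layout (PEP 503), where each project has a page of links to its files, and that the index is reachable without extra authentication from a connected host. The helper names (`file_links`, `mirror_project`) and the destination directory are hypothetical; the index URL is the CPU entry from Table 2.1.

```python
import shutil
from html.parser import HTMLParser
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urljoin, urlsplit
from urllib.request import urlopen

# Index URL for the CPU image, copied from Table 2.1.
INDEX_URL = "https://console.redhat.com/api/pypi/public-rhai/rhoai/3.2/cpu-ubi9/simple/"


class _LinkParser(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def file_links(page_html, page_url):
    """Return (filename, absolute_url) pairs for the links on a project page."""
    parser = _LinkParser()
    parser.feed(page_html)
    links = []
    for href in parser.hrefs:
        url = urljoin(page_url, href)
        # The URL path excludes any "#sha256=..." fragment; keep its last segment.
        filename = PurePosixPath(unquote(urlsplit(url).path)).name
        links.append((filename, url))
    return links


def mirror_project(project, dest_dir, index_url=INDEX_URL):
    """Download every file listed for one project into dest_dir/project/."""
    # Assumption: the project name is already in normalized form, e.g. "sdg-hub".
    page_url = urljoin(index_url, project + "/")
    with urlopen(page_url) as response:
        page_html = response.read().decode("utf-8")
    target = Path(dest_dir) / project
    target.mkdir(parents=True, exist_ok=True)
    for filename, url in file_links(page_html, page_url):
        # Stream to disk so that large wheels are not held in memory.
        with urlopen(url) as response, open(target / filename, "wb") as out:
            shutil.copyfileobj(response, out)
```

For example, calling `mirror_project("docling", "./rhai-mirror")` would copy the files listed for docling into `./rhai-mirror/docling/`. The sketch copies only the projects that you name and does not resolve their dependencies. How you upload the resulting files depends on your internal hosting service.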
2.3. Install packages and JupyterLab
To ensure reliable and secure access to supported libraries, start your model customization workflow by creating a workbench image that is based on a Red Hat base image that is configured to use the Red Hat Python index. These base images are listed in Table 2.1.
Note: When you create a custom workbench image that is based on one of the images listed in Table 2.1, install JupyterLab. You can use JupyterLab to run the example model customization notebooks.
For guidance on custom workbenches, see Creating a custom workbench image from your own image.
When you use one of the images listed in Table 2.1 as a base image, both pip and uv commands are pre-configured to use the Red Hat Python index and system trust store for HTTPS.
When you run a pip install command, it installs the package version referenced in the Red Hat Python index, ensuring that you are installing a version of the library that is secure and reliably accessible.
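To confirm which index pip is using inside a running image, you can inspect the configuration that pip reports. The following is a sketch only: it assumes that the index is set through a pip configuration file or environment variable, both of which appear in `pip config list` output as `key='value'` lines. The function names are hypothetical, and the sketch does not cover uv.

```python
import subprocess
import sys


def parse_pip_config(output):
    """Turn `pip config list` output (key='value' lines) into a dict."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip("'\"")
    return settings


def configured_index_url():
    """Return the index URL that pip reports, or None if none is set."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "config", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    settings = parse_pip_config(result.stdout)
    # pip prefixes each key with its scope, for example "global.index-url".
    for key, value in settings.items():
        if key.endswith(".index-url"):
            return value
    return None
```

Calling `configured_index_url()` in a workbench that is based on one of the Table 2.1 images should then return the matching index URL from that table, if the assumption about how the image is configured holds.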
For example, use the following commands to install the model customization libraries:
Install the data processing library:
pip install docling
Install the synthetic data generation library:
pip install sdg-hub
Install the model training library:
pip install training-hub
Install the model training library with CUDA support:
pip install training-hub[cuda]
Note: For additional options and details for installing the model training library, see Training Hub installation guidelines.
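After installation, a quick check from a notebook cell or Python session can confirm which versions were installed. This sketch uses the standard library `importlib.metadata` module and assumes that the distribution names match the names used in the pip commands; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Report the model customization libraries installed in this environment.
for name, ver in installed_versions(["docling", "sdg-hub", "training-hub"]).items():
    print(f"{name}: {ver or 'not installed'}")
```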
2.4. Import example notebooks
To get started with customizing your models, you can run provided example notebooks and scripts. Table 2.2 lists the Git repositories that provide example notebooks for each model customization component.
For a comprehensive tutorial that demonstrates an AI/ML workflow, see the Knowledge Tuning example on the Red Hat AI examples site.
The Knowledge Tuning tutorial is a curated collection of Jupyter notebooks that includes examples of using Docling to process data, training-hub to fine-tune a model on that data, and KServe to deploy the final model for a Question and Answer application.
| Model customization component | Git clone example repository | Branch | Directory |
|---|---|---|---|
| Data processing using docling | | | |
| Synthetic data generation | | | |
| Training | | | |
| End-to-end example for model customization with these components | | | |
2.4.1. Clone an example Git repository
Follow these steps to clone a Git repository from the JupyterLab environment provided with your OpenShift AI workbench.
Prerequisites
- You have the https URL and branch for one of the example Git repositories listed in Table 2.2.
Procedure
- From the OpenShift AI dashboard, go to the project where you created a workbench.
- Click the link for your workbench. If prompted, log in and allow JupyterLab to authorize your user.
  Your JupyterLab environment window opens.
  The file-browser window shows the files and directories that are saved inside your own personal space in OpenShift AI.
- Bring the content of an example Git repo inside your JupyterLab environment:
  - On the toolbar, click the Git Clone icon.
  - Enter a Git https URL.
  - Select the Include submodules option, and then click Clone.
- If you want to use a branch other than main (for example, the data processing example repo uses the stable-3.0 branch), change the branch:
  - In the left navigation bar, click the Git icon, and then click Current Branch to expand the branches and tags selector panel.
  - On the Branches tab, in the Filter field, enter the branch name.
  - Select the branch.
    The current branch changes to the branch that you selected.
Verification
- In the file browser, double-click the newly created directory to see the example files.
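The same clone can also be scripted instead of using the JupyterLab Git panel. The following sketch assumes that the `git` command is available in the workbench; the function names are hypothetical, and the URL in the usage example is a placeholder, not one of the repositories from Table 2.2. The `--recurse-submodules` flag corresponds to the Include submodules option, and `--branch` checks out the named branch directly.

```python
import subprocess


def clone_command(url, branch=None, dest=None):
    """Build the git command line for cloning a repository with submodules."""
    cmd = ["git", "clone", "--recurse-submodules"]
    if branch:
        # Check out this branch instead of the default branch of the remote.
        cmd += ["--branch", branch]
    cmd.append(url)
    if dest:
        cmd.append(dest)
    return cmd


def clone(url, branch=None, dest=None):
    """Run the clone and raise CalledProcessError if git fails."""
    subprocess.run(clone_command(url, branch, dest), check=True)
```

For example, `clone("https://example.com/org/example-notebooks.git", branch="stable-3.0")` would clone the placeholder repository and check out its stable-3.0 branch.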