Building and maintaining your environment
Abstract

Creating accounts, initializing RHEL AI, downloading and organizing models, and customizing model serving and chat.
Chapter 1. Configuring accounts for RHEL AI
There are a few accounts you need to set up before interacting with RHEL AI.
- Creating a Red Hat account
- You can create a Red Hat account by registering on the Red Hat website. You can follow the procedure in Register for a Red Hat account.
- Creating a Red Hat registry account
Before you can download models from the Red Hat registry, you need to create a registry account and log in with the CLI. You can view your account user name and password by selecting the Regenerate Token button on the webpage.
- You can create a Red Hat registry account by selecting the New Service Account button on the Registry Service Accounts page.
- There are several ways you can log in to your registry account with the CLI. Follow the procedure in Red Hat Container Registry authentication to log in on your machine.
- Configuring Red Hat Insights for hybrid cloud deployments
Red Hat Insights is an offering that gives you visibility into the environments you deploy. This platform can also help identify operational and vulnerability risks in your system. For more information about Red Hat Insights, see Red Hat Insights data and application security.
You can create a Red Hat Insights account using an activation key and organization parameters by following the procedure in Viewing an activation key.
You can then configure your account on your machine by running the following command:
$ sudo rhc connect --organization <org id> --activation-key <created key>

To run RHEL AI in a disconnected environment, or to opt out of Red Hat Insights, run the following commands:

$ sudo mkdir -p /etc/ilab
$ sudo touch /etc/ilab/insights-opt-out

You can also enable persistent credentials and stay logged in to the registry by running the following command:

$ podman login registry.redhat.io

Then add the auth.json file to the /etc/ostree/ directory:

$ sudo cp /run/user/1000/containers/auth.json /etc/ostree/
This allows you to stay logged in to the Red Hat registry after upgrading Red Hat Enterprise Linux AI.
If your system is configured as a root user, you do not need to use sudo when running the commands.
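The Insights opt-out described above is just a marker file, so you can audit a machine directly. A minimal sketch for checking whether a system has opted out of Red Hat Insights:

```shell
# The presence of /etc/ilab/insights-opt-out is what disables Insights
# reporting for RHEL AI; this only reports whether the marker exists.
if [ -f /etc/ilab/insights-opt-out ]; then
  echo "Red Hat Insights: opted out"
else
  echo "Red Hat Insights: enabled"
fi
```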
Chapter 2. Initializing InstructLab
You must initialize the InstructLab environments to begin working with the Red Hat Enterprise Linux AI models.
2.1. Creating your RHEL AI environment
You can start interacting with LLMs and the RHEL AI tooling by initializing the InstructLab environment.
System profiles for AMD machines are a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Prerequisites
- You installed RHEL AI with the bootable container image.
- You have root user access on your machine.
Procedure
Optional: You can view your machine’s information by running the following command:
$ ilab system info

Initialize InstructLab by running the following command:

$ ilab config init

The RHEL AI CLI starts setting up your environment and config.yaml file. The CLI automatically detects your machine's hardware and selects a system profile based on the GPU types. System profiles populate the config.yaml file with the proper parameter values based on your detected hardware.

Example output of profile auto-detection

Generating config file and profiles:
    /home/user/.config/instructlab/config.yaml
    /home/user/.local/share/instructlab/internal/system_profiles/

We have detected the NVIDIA H100 X4 profile as an exact match for your system.

--------------------------------------------
    Initialization completed successfully!
  You're ready to start using `ilab`. Enjoy!
--------------------------------------------

If the CLI does not detect an exact match for your system, you can manually select a system profile when prompted. Select the hardware vendor and configuration that most closely match your system.
Example output of selecting system profiles
Please choose a system profile to use.
System profiles apply to all parts of the config file and set hardware specific defaults for each command.
First, please select the hardware vendor your system falls into
[0] NO SYSTEM PROFILE
[1] NVIDIA
Enter the number of your choice [0]: 1
You selected: NVIDIA
Next, please select the specific hardware configuration that most closely matches your system.
[0] No system profile
[1] NVIDIA H100 X2
[2] NVIDIA H100 X8
[3] NVIDIA H100 X4
[4] NVIDIA L4 X8
[5] NVIDIA A100 X2
[6] NVIDIA A100 X8
[7] NVIDIA A100 X4
[8] NVIDIA L40S X4
[9] NVIDIA L40S X8
Enter the number of your choice [hit enter for hardware defaults] [0]: 3

Example output of a completed ilab config init run

You selected: /Users/<user>/.local/share/instructlab/internal/system_profiles/nvidia/H100/h100_x4.yaml

--------------------------------------------
    Initialization completed successfully!
  You're ready to start using `ilab`. Enjoy!
--------------------------------------------

If you want to use the skeleton taxonomy tree, which includes two skills and one knowledge qna.yaml file, you can clone the skeleton repository and place it in the taxonomy directory by running the following command:

rm -rf ~/.local/share/instructlab/taxonomy/ ; git clone https://github.com/RedHatOfficial/rhelai-sample-taxonomy.git ~/.local/share/instructlab/taxonomy/

If the incorrect system profile is auto-detected, you can run the following command:
$ ilab config init --profile <path-to-system-profile>

where:

- <path-to-system-profile>
- Specify the path to the correct system profile. You can find the system profiles in the ~/.local/share/instructlab/internal/system_profiles path.

Example profile selection command
$ ilab config init --profile ~/.local/share/instructlab/internal/system_profiles/amd/mi300x/mi300x_x8.yaml
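If you are unsure which profile files ship on your system, you can list them before rerunning `ilab config init`. A minimal sketch, assuming the default InstructLab data directory described above:

```shell
# List the bundled system profile YAML files so you can pass one to
# `ilab config init --profile`. The directory exists only after an
# initial `ilab config init` run.
profile_dir="$HOME/.local/share/instructlab/internal/system_profiles"
find "$profile_dir" -name '*.yaml' 2>/dev/null | sort
```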
Directory structure of the InstructLab environment
├─ ~/.config/instructlab/config.yaml
├─ ~/.cache/instructlab/models/
├─ ~/.local/share/instructlab/datasets
├─ ~/.local/share/instructlab/taxonomy
├─ ~/.local/share/instructlab/phased/<phase1-or-phase2>/checkpoints/
1 ~/.config/instructlab/config.yaml: Contains the config.yaml configuration file.
2 ~/.cache/instructlab/models/: Contains all downloaded large language models, including the saved output of ones you generate with RHEL AI.
3 ~/.local/share/instructlab/datasets/: Contains data output from the SDG phase, built on modifications to the taxonomy repository.
4 ~/.local/share/instructlab/taxonomy/: Contains the skill and knowledge data.
5 ~/.local/share/instructlab/phased/<phase1-or-phase2>/checkpoints/: Contains the output of the multi-phase training process.
Verification
You can view the full config.yaml file by running the following command:

$ ilab config show

You can also manually edit the config.yaml file by running the following command:

$ ilab config edit
Chapter 3. Downloading Large Language Models
Red Hat Enterprise Linux AI allows you to customize or chat with various Large Language Models (LLMs) provided and built by Red Hat and IBM. You can download these models from the Red Hat RHEL AI registry. You can upload any custom model to an S3 bucket.
| Large Language Models (LLMs) | Type | Size | Purpose | Model family | NVIDIA Accelerator Support | AMD Accelerator Support | Intel Accelerator Support |
|---|---|---|---|---|---|---|---|
| | LAB fine-tuned granite starter model | 16.0 GB | Version 2 of the default Granite 3.1 base model for customizing and fine-tuning | Granite 3.1 | Generally Available | Generally Available | Not Available |
| | LAB fine-tuned granite model | 16.0 GB | Version 2 of the default Granite 3.1 model for inference serving | Granite 3.1 | Generally Available | Generally Available | Not Available |
| | LAB fine-tuned granite starter model | 16.0 GB | Version 2 of the default Granite 3.1 base model for customizing and fine-tuning | Granite 3.1 | Not Available | Not Available | Technology Preview |
| | LAB fine-tuned granite model | 16.0 GB | Version 2 of the default Granite 3.1 model for inference serving | Granite 3.1 | Not Available | Not Available | Technology Preview |
| | LAB fine-tuned granite code model | 15.0 GB | LAB fine-tuned granite code model for inference serving | Granite Code models | Technology Preview | Technology Preview | Technology Preview |
| | Granite fine-tuned code model | 15.0 GB | Granite code model for inference serving | Granite Code models | Technology Preview | Technology Preview | Technology Preview |
| | Default teacher model | 87.0 GB | Default teacher model for running Synthetic data generation (SDG) | Mixtral | Generally Available | Generally Available | Technology Preview |
| | Optional teacher model | 74.0 GB | Optional teacher model for running Synthetic data generation (SDG) | Llama | Technology Preview | Not Available | Not Available |
| | Evaluation judge model | 87.0 GB | Judge model for multi-phase training and evaluation | Prometheus 2 | Generally Available | Generally Available | Technology Preview |
Using the granite-8b-code-instruct or granite-8b-code-base Large Language models (LLMs) is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Models required for customizing the Granite LLM
- The granite-7b-starter or granite-8b-starter-v1 base LLM, depending on your hardware vendor.
- The mixtral-8x7b-instruct-v0-1 teacher model for SDG.
- The prometheus-8x7b-v2-0 judge model for training and evaluation.
Additional tools required for customizing an LLM
The Low-Rank Adaptation (LoRA) adapters enhance the efficiency of the Synthetic Data Generation (SDG) process.

- The skills-adapter-v3 LoRA layered skills adapter for SDG.
- The knowledge-adapter-v3 LoRA layered knowledge adapter for SDG.

Example command for downloading the adapters

$ ilab model download --repository docker://registry.redhat.io/rhelai1/knowledge-adapter-v3 --release latest

The LoRA layered adapters do not show up in the output of the ilab model list command. You can see the skills-adapter-v3 and knowledge-adapter-v3 files in the ~/.cache/instructlab/models directory.
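Because the adapters are hidden from ilab model list, a quick directory check is the easiest way to verify the download. A minimal sketch, assuming the default model cache location:

```shell
# The LoRA adapters land in the model cache but are not shown by
# `ilab model list`, so inspect the cache directory directly.
ls "$HOME/.cache/instructlab/models" 2>/dev/null | grep -i 'adapter' \
  || echo "no LoRA adapters found"
```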
3.1. Downloading the models from a Red Hat repository
You can download the additional optional models created by Red Hat and IBM.
Prerequisites
- You installed RHEL AI with the bootable container image.
- You initialized InstructLab.
- You created a Red Hat registry account and logged in on your machine.
- You have root user access on your machine.
Procedure
To download the additional LLM models, run the following command:
$ ilab model download --repository docker://<repository_and_model> --release <release>

where:

- <repository_and_model>
- Specifies the repository location of the model as well as the model. You can access the models from the registry.redhat.io/rhelai1/ repository.
- <release>
- Specifies the version of the model. Set to 1.5 for the models that are supported on RHEL AI version 1.5. Set to latest for the latest version of the model.
Example command
$ ilab model download --repository docker://registry.redhat.io/rhelai1/granite-3.1-8b-starter-v1 --release latest
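The starter, teacher, and judge models needed for Granite customization can be fetched with one loop. A minimal sketch that prints each download command for review first; remove the echo to execute. The model names follow the examples in this chapter, so substitute the set that matches your hardware vendor and release:

```shell
# Print (dry run) the download command for each model required for
# customization: starter (base), teacher (SDG), and judge (evaluation).
for model in granite-3.1-8b-starter-v1 mixtral-8x7b-instruct-v0-1 prometheus-8x7b-v2-0; do
  echo ilab model download \
      --repository "docker://registry.redhat.io/rhelai1/${model}" \
      --release latest
done
```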
Verification
You can view all the downloaded models, including the new models after training, on your system with the following command:
$ ilab model list

Example output

+-----------------------------------+---------------------+---------+
| Model Name                        | Last Modified       | Size    |
+-----------------------------------+---------------------+---------+
| models/prometheus-8x7b-v2-0       | 2024-08-09 13:28:50 | 87.0 GB |
| models/mixtral-8x7b-instruct-v0-1 | 2024-08-09 13:28:24 | 87.0 GB |
| models/granite-3.1-8b-starter-v1  | 2024-08-09 14:28:40 | 16.6 GB |
| models/granite-3.1-8b-lab-v1      | 2024-08-09 14:40:35 | 16.6 GB |
+-----------------------------------+---------------------+---------+

You can also list the downloaded models in the ~/.cache/instructlab/models directory by running the following command:

$ ls ~/.cache/instructlab/models

Example output

granite-3.1-8b-starter-v1
granite-3.1-8b-lab-v1
Chapter 4. Model management
There are various ways you can organize and manage your custom or downloaded models on RHEL AI.
4.1. Uploading your models to a registry
After you fine-tune a model, you can upload the model to an external registry. RHEL AI currently supports uploading models to AWS S3 buckets.
Prerequisites
- You installed RHEL AI on your preferred platform.
- You initialized InstructLab.
- You logged in to your preferred registry.
Procedure
You can upload your models to a specific registry with the following command:

$ ilab model upload --model <name-of-model> --destination <registry-location> --dest-type <registry-type>

where:

- <name-of-model>
- Specify the checkpoint name you want to upload. For example, --model samples_0801. You can also specify the path to the checkpoint.
- <registry-location>
- Specify where you want to upload the model. For example, --destination example-s3-bucket.
- <registry-type>
- Specify the destination type. Valid values include: s3.

Example ilab model upload command to an S3 bucket

$ ilab model upload --model samples_0801 --destination example-s3-bucket --dest-type s3
Chapter 5. Serving and chatting with the models
To interact with the various models on Red Hat Enterprise Linux AI, you must first serve a model, which hosts it on a server. You can then chat with the model.
5.1. Serving the model
To interact with the models, you must first activate a model on your machine by serving it. The ilab model serve command starts a vLLM server that allows you to chat with the model.
Prerequisites
- You installed RHEL AI with the bootable container image.
- You initialized InstructLab.
- You installed your preferred Granite LLMs.
- You have root user access on your machine.
Procedure
If you do not specify a model, the following command serves the default model, granite-7b-redhat-lab:

$ ilab model serve

To serve a specific model, run the following command:

$ ilab model serve --model-path <model-path>

Example command

$ ilab model serve --model-path ~/.cache/instructlab/models/granite-8b-code-instruct

Example output when the model is served and ready

INFO 2024-03-02 02:21:11,352 lab.py:201 Using model 'models/granite-8b-code-instruct' with -1 gpu-layers and 4096 max context size.
Starting server process
After application startup complete see http://127.0.0.1:8000/docs for API.
Press CTRL+C to shut down the server.
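Before opening a chat session, you can confirm the server is answering. A minimal sketch against the default serving address; the /v1/models route is the standard OpenAI-compatible listing endpoint that vLLM exposes:

```shell
# Query the OpenAI-compatible models endpoint; a JSON response means the
# server is ready, otherwise print a fallback note instead of failing.
curl -sf http://127.0.0.1:8000/v1/models || echo "server not reachable yet"
```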
5.1.1. Optional: Running ilab model serve as a service
You can set up a systemd service so that the ilab model serve command runs as a persistent service. The systemd service runs the ilab model serve command in the background and restarts it if it crashes or fails. You can configure the service to start upon system boot.
Prerequisites
- You installed the Red Hat Enterprise Linux AI image on bare metal.
- You initialized InstructLab.
- You downloaded your preferred Granite LLMs.
- You have root user access on your machine.
Procedure
Create a directory for your systemd user service by running the following command:

$ mkdir -p $HOME/.config/systemd/user

Create your systemd service file with the following example configurations:

$ cat << EOF > $HOME/.config/systemd/user/ilab-serve.service
[Unit]
Description=ilab model serve service

[Install]
WantedBy=multi-user.target default.target 1

[Service]
ExecStart=ilab model serve --model-family granite
Restart=always
EOF

1 Specifies to start by default on boot.

Reload the systemd manager configuration by running the following command:

$ systemctl --user daemon-reload

Start the ilab model serve systemd service by running the following command:

$ systemctl --user start ilab-serve.service

You can check that the service is running with the following command:

$ systemctl --user status ilab-serve.service

You can check the service logs by running the following command:

$ journalctl --user-unit ilab-serve.service

To allow the service to start on boot, run the following command:

$ sudo loginctl enable-linger

Optional: There are a few optional commands you can run for maintaining your systemd service.

- You can stop the ilab-serve systemd service by running the following command:

$ systemctl --user stop ilab-serve.service

- You can prevent the service from starting on boot by removing "WantedBy=multi-user.target default.target" from the $HOME/.config/systemd/user/ilab-serve.service file.
5.1.2. Optional: Allowing access to a model from a secure endpoint
You can serve an inference endpoint and allow others to interact with models provided with Red Hat Enterprise Linux AI on secure connections by creating a systemd service and setting up a nginx reverse proxy that exposes a secure endpoint. This allows you to share the secure endpoint with others so they can chat with the model over a network.
The following procedure uses self-signed certificates, but it is recommended to use certificates issued by a trusted Certificate Authority (CA).
The following procedure is supported only on bare metal platforms.
Prerequisites
- You installed the Red Hat Enterprise Linux AI image on bare-metal.
- You initialized InstructLab.
- You downloaded your preferred Granite LLMs.
- You have root user access on your machine.
Procedure
Create a directory for your certificate file and key by running the following command:
$ mkdir -p `pwd`/nginx/ssl/

Create an OpenSSL configuration file with the proper configurations by running the following command:

$ cat > openssl.cnf <<EOL
[ req ]
default_bits       = 2048
distinguished_name = req_distinguished_name
x509_extensions    = v3_req
prompt             = no

[ req_distinguished_name ]
C  = US
ST = California
L  = San Francisco
O  = My Company
OU = My Division
CN = rhelai.redhat.com

[ v3_req ]
subjectAltName         = @alt_names
basicConstraints       = critical, CA:true
subjectKeyIdentifier   = hash
authorityKeyIdentifier = keyid:always,issuer

[ alt_names ]
DNS.1 = rhelai.redhat.com
DNS.2 = www.rhelai.redhat.com
EOL

Generate a self-signed certificate with a Subject Alternative Name (SAN) enabled by running the following command:

$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout `pwd`/nginx/ssl/rhelai.redhat.com.key -out `pwd`/nginx/ssl/rhelai.redhat.com.crt -config openssl.cnf

Create the Nginx configuration file and add it to `pwd`/nginx/conf.d by running the following commands:

$ mkdir -p `pwd`/nginx/conf.d
$ echo 'server {
    listen 8443 ssl;
    server_name rhelai.redhat.com; 1
    ssl_certificate /etc/nginx/ssl/rhelai.redhat.com.crt;
    ssl_certificate_key /etc/nginx/ssl/rhelai.redhat.com.key;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
' > `pwd`/nginx/conf.d/rhelai.redhat.com.conf

1 Specify the name of your server. In the example, the server name is rhelai.redhat.com.
Run the Nginx container with the new configurations by running the following command:

$ podman run --net host -v `pwd`/nginx/conf.d:/etc/nginx/conf.d:ro,Z -v `pwd`/nginx/ssl:/etc/nginx/ssl:ro,Z nginx

If you want to use port 443, you must run the podman run command as a root user.

You can now connect to a serving ilab machine using a secure endpoint URL. Example command:

$ ilab model chat -m /instructlab/instructlab/granite-7b-redhat-lab --endpoint-url https://rhelai.redhat.com:8443/v1

You can also connect to the serving RHEL AI machine with the following command:
$ curl --location 'https://rhelai.redhat.com:8443/v1' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api-key>' \
--data '{
    "model": "/var/home/cloud-user/.cache/instructlab/models/granite-7b-redhat-lab",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
}' | jq .

where:

- <api-key>
- Specify your API key. You can create your own API key by following the procedure in "Creating an API key for chatting with a model".

Optional: You can also get the server certificate and append it to the Certifi CA bundle.

Get the server certificate by running the following command:

$ openssl s_client -connect rhelai.redhat.com:8443 </dev/null 2>/dev/null | openssl x509 -outform PEM > server.crt

Copy the certificate to your system's trusted CA storage directory and update the CA trust store with the following commands:

$ sudo cp server.crt /etc/pki/ca-trust/source/anchors/
$ sudo update-ca-trust

You can append your certificate to the Certifi CA bundle by running the following command:

$ cat server.crt >> $(python -m certifi)

You can now run ilab model chat with a self-signed certificate. Example command:

$ ilab model chat -m /instructlab/instructlab/granite-7b-redhat-lab --endpoint-url https://rhelai.redhat.com:8443/v1
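To sanity-check the certificate the proxy actually presents, you can inspect its subject. A minimal sketch; rhelai.redhat.com:8443 is the example endpoint from this procedure, so substitute your own server name:

```shell
# Fetch the proxy's certificate and print its subject line; the CN
# should match the server_name configured in nginx. Falls back to a
# note if the endpoint is unreachable.
openssl s_client -connect rhelai.redhat.com:8443 </dev/null 2>/dev/null \
  | openssl x509 -noout -subject 2>/dev/null \
  || echo "could not fetch certificate"
```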
5.2. Chatting with the model
After you serve your model, you can chat with it.
The model you are chatting with must match the model you are serving. With the default config.yaml file, the granite-7b-redhat-lab model is the default for serving and chatting.
Prerequisites
- You installed RHEL AI with the bootable container image.
- You initialized InstructLab.
- You downloaded your preferred Granite LLMs.
- You are serving a model.
- You have root user access on your machine.
Procedure
- Since you are serving the model in one terminal window, you must open another terminal to chat with the model.
To chat with the default model, run the following command:
$ ilab model chat

To chat with a specific model, run the following command:

$ ilab model chat --model <model-path>

Example command
$ ilab model chat --model ~/.cache/instructlab/models/granite-8b-code-instruct
Example output of the chatbot
$ ilab model chat
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────── system ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Welcome to InstructLab Chat w/ GRANITE-8B-CODE-INSTRUCT (type /h for help) │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
>>> [S][default]
Type exit to leave the chatbot.
5.2.1. Optional: Creating an API key for chatting with a model
By default, the ilab CLI does not use authentication. If you want to expose your server to the internet, you can create an API key that connects to your server with the following procedure.
Prerequisites
- You installed the Red Hat Enterprise Linux AI image on bare metal.
- You initialized InstructLab.
- You downloaded your preferred Granite LLMs.
- You have root user access on your machine.
Procedure
Create an API key that is held in the $VLLM_API_KEY parameter by running the following command:

$ export VLLM_API_KEY=$(python -c 'import secrets; print(secrets.token_urlsafe())')

You can view the API key by running the following command:

$ echo $VLLM_API_KEY

Update the config.yaml by running the following command:

$ ilab config edit

Add the following parameters to the vllm_args section of your config.yaml file:

serve:
  vllm:
    vllm_args:
    - --api-key
    - <api-key-string>

where:

- <api-key-string>
- Specify your API key string.

You can verify that the server is using API key authentication by running the following command:

$ ilab model chat

You then see the following error, which shows an unauthorized user:

openai.AuthenticationError: Error code: 401 - {'error': 'Unauthorized'}

Verify that your API key is working by running the following command:

$ ilab model chat -m granite-7b-redhat-lab --endpoint-url https://inference.rhelai.com/v1 --api-key $VLLM_API_KEY
$ ilab model chat
╭───────────────────────────── system ─────────────────────────────╮
│ Welcome to InstructLab Chat w/ GRANITE-7B-LAB (type /h for help) │
╰──────────────────────────────────────────────────────────────────╯
>>> [S][default]