
Chapter 4. Serving and chatting with the models


To interact with various models on Red Hat Enterprise Linux AI, you must first serve the model, which hosts it on a server. You can then chat with the model.

4.1. Serving the model

To interact with the models, you must first activate a model on your machine by serving it. The ilab model serve command starts a vLLM server that allows you to chat with the model.

Prerequisites

  • You installed RHEL AI with the bootable container image.
  • You initialized InstructLab.
  • You installed your preferred Granite LLMs.
  • You have root user access on your machine.

Procedure

  1. If you do not specify a model, you can serve the default model, granite-7b-redhat-lab, by running the following command:

    $ ilab model serve
  2. To serve a specific model, run the following command:

    $ ilab model serve --model-path <model-path>

    Example command

    $ ilab model serve --model-path ~/.cache/instructlab/models/granite-7b-code-instruct

    Example output when the model is served and ready

    INFO 2024-03-02 02:21:11,352 lab.py:201 Using model 'models/granite-7b-code-instruct' with -1 gpu-layers and 4096 max context size.
    Starting server process
    After application startup complete see http://127.0.0.1:8000/docs for API.
    Press CTRL+C to shut down the server.
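
The served model exposes an OpenAI-compatible API on the address shown in the output. As a quick check from another terminal, you can list the models that the server is hosting. This is a minimal sketch that assumes the server is listening on the default 127.0.0.1:8000 address:

    $ curl http://127.0.0.1:8000/v1/models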

4.1.1. Optional: Running ilab model serve as a service

You can set up a systemd service so that the ilab model serve command runs as a service. The systemd service runs the ilab model serve command in the background and restarts it if it crashes or fails. You can also configure the service to start when the system boots.

Prerequisites

  • You installed the Red Hat Enterprise Linux AI image on bare metal.
  • You initialized InstructLab.
  • You downloaded your preferred Granite LLMs.
  • You have root user access on your machine.

Procedure

  1. Create a directory for your systemd user service by running the following command:

    $ mkdir -p $HOME/.config/systemd/user
  2. Create your systemd service file with the following example configurations:

    $ cat << EOF > $HOME/.config/systemd/user/ilab-serve.service
    [Unit]
    Description=ilab model serve service
    
    [Install]
    WantedBy=multi-user.target default.target 1
    
    [Service]
    ExecStart=ilab model serve --model-family granite
    Restart=always
    EOF
    1
    Specifies that the service starts by default on boot.
  3. Reload the systemd manager configuration by running the following command:

    $ systemctl --user daemon-reload
  4. Start the ilab model serve systemd service by running the following command:

    $ systemctl --user start ilab-serve.service
  5. You can check that the service is running with the following command:

    $ systemctl --user status ilab-serve.service
  6. You can check the service logs by running the following command:

    $ journalctl --user-unit ilab-serve.service
  7. To allow the service to start on boot, run the following command:

    $ sudo loginctl enable-linger
  8. Optional: You can run the following commands to maintain your systemd service.

    • You can stop the ilab-serve systemd service by running the following command:

      $ systemctl --user stop ilab-serve.service
    • You can prevent the service from starting on boot by removing the "WantedBy=multi-user.target default.target" line from the $HOME/.config/systemd/user/ilab-serve.service file.
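    • Alternatively, you can manage start-on-boot with systemctl instead of editing the unit file. This is a sketch using standard systemd user-session commands:

      $ systemctl --user enable ilab-serve.service
      $ systemctl --user disable ilab-serve.service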

4.2. Chatting with the model

After you serve your model, you can chat with it.

Important

The model you are chatting with must match the model you are serving. With the default config.yaml file, the granite-7b-redhat-lab model is the default for serving and chatting.

Prerequisites

  • You installed RHEL AI with the bootable container image.
  • You initialized InstructLab.
  • You downloaded your preferred Granite LLMs.
  • You are serving a model.
  • You have root user access on your machine.

Procedure

  1. Since you are serving the model in one terminal window, you must open another terminal to chat with the model.
  2. To chat with the default model, run the following command:

    $ ilab model chat
  3. To chat with a specific model, run the following command:

    $ ilab model chat --model <model-path>

    Example command

    $ ilab model chat --model ~/.cache/instructlab/models/granite-7b-code-instruct

Example output of the chatbot

$ ilab model chat
╭────────────────────────────────── system ──────────────────────────────────╮
│ Welcome to InstructLab Chat w/ GRANITE-7B-CODE-INSTRUCT (type /h for help) │
╰────────────────────────────────────────────────────────────────────────────╯
>>>                                                              [S][default]

Type exit to leave the chatbot.
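
Because the server exposes an OpenAI-compatible API, you can also send a single chat request without the interactive chatbot, for example from a script. This is a minimal sketch that assumes the default local endpoint; the model identifier can vary by deployment, so list the exact name with the /v1/models endpoint first:

    $ curl http://127.0.0.1:8000/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{"model": "granite-7b-code-instruct", "messages": [{"role": "user", "content": "What is InstructLab?"}]}'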

4.2.1. Optional: Creating an API key for model chatting

By default, the ilab CLI does not use authentication. If you want to expose your server to the internet, you can create an API key for connecting to your server by using the following procedure.

Prerequisites

  • You installed the Red Hat Enterprise Linux AI image on bare metal.
  • You initialized InstructLab.
  • You downloaded your preferred Granite LLMs.
  • You have root user access on your machine.

Procedure

  1. Create an API key that is stored in the $VLLM_API_KEY environment variable by running the following command:

    $ export VLLM_API_KEY=$(python -c 'import secrets; print(secrets.token_urlsafe())')
  2. You can view the API key by running the following command:

    $ echo $VLLM_API_KEY
  3. Update the config.yaml file by running the following command:

    $ ilab config edit
  4. Add the following parameters to the vllm_args section of your config.yaml file:

    serve:
        vllm:
            vllm_args:
            - --api-key
            - <api-key-string>

    where

    <api-key-string>
    Specify your API key string.
  5. You can verify that the server is using API key authentication by running the following command:

    $ ilab model chat

    The command returns the following error, showing an unauthorized user:

    openai.AuthenticationError: Error code: 401 - {'error': 'Unauthorized'}
  6. Verify that your API key is working by running the following command:

    $ ilab model chat -m granite-7b-redhat-lab --endpoint-url https://inference.rhelai.com/v1 --api-key $VLLM_API_KEY

    Example output

    $ ilab model chat -m granite-7b-redhat-lab --endpoint-url https://inference.rhelai.com/v1 --api-key $VLLM_API_KEY
    ╭───────────────────────────── system ─────────────────────────────╮
    │ Welcome to InstructLab Chat w/ GRANITE-7B-LAB (type /h for help) │
    ╰───────────────────────────────────────────────────────────────────╯
    >>>                                                    [S][default]
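
    You can also verify the key directly against the OpenAI-compatible API by passing it as a bearer token. This is a minimal sketch that assumes the default local endpoint:

    $ curl -H "Authorization: Bearer $VLLM_API_KEY" http://127.0.0.1:8000/v1/models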

4.2.2. Optional: Allowing chat access to a model from a secure endpoint

You can serve an inference endpoint and allow others to interact with models provided with Red Hat Enterprise Linux AI over secure connections by creating a systemd service and setting up an Nginx reverse proxy that exposes a secure endpoint. You can then share the secure endpoint with others so they can chat with the model over a network.

The following procedure uses self-signed certificates, but it is recommended to use certificates issued by a trusted Certificate Authority (CA).

Note

The following procedure is supported only on bare metal platforms.

Prerequisites

  • You installed the Red Hat Enterprise Linux AI image on bare metal.
  • You initialized InstructLab.
  • You downloaded your preferred Granite LLMs.
  • You have root user access on your machine.

Procedure

  1. Create a directory for your certificate file and key by running the following command:

    $ mkdir -p `pwd`/nginx/ssl/
  2. Create an OpenSSL configuration file with the proper configurations by running the following command:

    $ cat > openssl.cnf <<EOL
    [ req ]
    default_bits = 2048
    distinguished_name = req_distinguished_name
    x509_extensions = v3_req
    prompt = no

    [ req_distinguished_name ]
    C  = US
    ST = California
    L  = San Francisco
    O  = My Company
    OU = My Division
    CN = rhelai.redhat.com 1

    [ v3_req ]
    subjectAltName = @alt_names
    basicConstraints = critical, CA:true
    subjectKeyIdentifier = hash
    authorityKeyIdentifier = keyid:always,issuer

    [ alt_names ]
    DNS.1 = rhelai.redhat.com 2
    DNS.2 = www.rhelai.redhat.com 3
    EOL
    1
    Specify the common name for your server. In the example, the server name is rhelai.redhat.com.
    2 3
    Specify the alternate DNS names for your server. In the example, these are rhelai.redhat.com and www.rhelai.redhat.com.
  3. Generate a self-signed certificate with a Subject Alternative Name (SAN) enabled by running the following command:

    $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout `pwd`/nginx/ssl/rhelai.redhat.com.key -out `pwd`/nginx/ssl/rhelai.redhat.com.crt -config openssl.cnf
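    Optionally, you can confirm that the Subject Alternative Name was embedded in the generated certificate by inspecting it with standard openssl tooling:

    $ openssl x509 -in `pwd`/nginx/ssl/rhelai.redhat.com.crt -noout -text | grep -A1 "Subject Alternative Name"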
  4. Create the Nginx configuration file and add it to the `pwd`/nginx/conf.d directory by running the following commands:

    mkdir -p `pwd`/nginx/conf.d
    
    echo 'server {
        listen 8443 ssl;
        server_name <rhelai.redhat.com>; 1
    
        ssl_certificate /etc/nginx/ssl/rhelai.redhat.com.crt;
        ssl_certificate_key /etc/nginx/ssl/rhelai.redhat.com.key;
    
        location / {
            proxy_pass http://127.0.0.1:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
    ' > `pwd`/nginx/conf.d/rhelai.redhat.com.conf
    1
    Specify the name of your server. In the example, the server name is rhelai.redhat.com.
  5. Start the Nginx container with the new configuration by running the following command:

    $ podman run --net host -v `pwd`/nginx/conf.d:/etc/nginx/conf.d:ro,Z -v `pwd`/nginx/ssl:/etc/nginx/ssl:ro,Z nginx

    If you want to use port 443, you must run the podman run command as the root user.
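    Before sharing the endpoint, you can check that the proxy terminates TLS and forwards requests to the vLLM server. This sketch assumes that rhelai.redhat.com resolves to the machine running Nginx, for example through an /etc/hosts entry:

    $ curl --cacert `pwd`/nginx/ssl/rhelai.redhat.com.crt https://rhelai.redhat.com:8443/v1/models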

  6. You can now connect to the serving ilab machine by using the secure endpoint URL. Example command:

    $ ilab model chat -m /instructlab/instructlab/granite-7b-redhat-lab --endpoint-url https://rhelai.redhat.com:8443/v1
  7. Optional: You can also get the server certificate and append it to the Certifi CA bundle.

    1. Get the server certificate by running the following command:

      $ openssl s_client -connect rhelai.redhat.com:8443 </dev/null 2>/dev/null | openssl x509 -outform PEM > server.crt
    2. Copy the certificate to your system’s trusted CA storage directory and update the CA trust store with the following commands:

      $ sudo cp server.crt /etc/pki/ca-trust/source/anchors/
      $ sudo update-ca-trust
    3. You can append your certificate to the Certifi CA bundle by running the following command:

      $ cat server.crt >> $(python -m certifi)
    4. You can now run ilab model chat with a self-signed certificate. Example command:

      $ ilab model chat -m /instructlab/instructlab/granite-7b-redhat-lab --endpoint-url https://rhelai.redhat.com:8443/v1
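      To confirm that the Certifi bundle update took effect, you can make a TLS request from Python, which verifies the server certificate against the Certifi bundle. A minimal sketch, assuming rhelai.redhat.com resolves to your server:

      $ python -c "import certifi, ssl, urllib.request; ctx = ssl.create_default_context(cafile=certifi.where()); print(urllib.request.urlopen('https://rhelai.redhat.com:8443/v1/models', context=ctx).status)"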