Chapter 3. Converting models with Speculators


Convert an existing Eagle 3 speculator model to the Speculators format for use with Red Hat AI Inference Server. Use this procedure when you have an externally trained Eagle 3 checkpoint that is not already in the Speculators format.

Prerequisites

  • You have installed Podman or Docker.
  • You are logged in as a user with sudo access.
  • You have access to the registry.redhat.io image registry and have logged in.
  • You have a Hugging Face account and have generated a Hugging Face access token.
  • You have access to a Linux server with at least one NVIDIA AI accelerator installed.
  • You have installed the relevant NVIDIA drivers.
  • You have installed the NVIDIA Container Toolkit.
Note

This example uses the meta-llama/Meta-Llama-3.1-8B-Instruct model, which requires accepting a license agreement. Before running this procedure, request access at meta-llama/Llama-3.1-8B-Instruct on Hugging Face.

Procedure

  1. Pull the Red Hat AI Model Optimization Toolkit container image:

    $ podman pull registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2
  2. Verify the Speculators version installed in the container:

    $ podman run --rm -it \
      registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2 \
      pip show speculators | grep Version

    Example output

    Version: 0.4.0a1

  3. Create a working directory and clone the upstream Speculators repository:

    $ mkdir model-opt && \
    cd model-opt && \
    git clone https://github.com/vllm-project/speculators.git
  4. Check out the Speculators tag that matches the version installed in the container:

    $ cd speculators && \
    git checkout v0.4.0+rhaiis
  5. Create a private.env file containing your Hugging Face access token and source it:

    $ echo "export HF_TOKEN=<YOUR_HF_TOKEN>" > private.env
    $ source private.env
  6. If your system has SELinux enabled, configure SELinux to allow device access:

    $ sudo setsebool -P container_use_devices 1
  7. Run the apply_eagle3_eagle.sh conversion example in the Red Hat AI Model Optimization Toolkit container:

    $ podman run --rm \
      -v "$(pwd):/opt/app-root/model-opt" \
      --device nvidia.com/gpu=0 \
      --ipc=host \
      -e HF_TOKEN=$HF_TOKEN \
      registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2 \
      bash /opt/app-root/model-opt/speculators/examples/convert/eagle3/apply_eagle3_eagle.sh

    The script downloads the Eagle 3 checkpoint, converts it to the Speculators format, and validates the result.
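    As a rough illustration of what a post-conversion check can look like, the sketch below inspects a model directory for a config.json that carries a Speculators marker key. The speculators_model_type field name is an assumption based on the upstream speculators project and may differ between releases; the script's own validation is authoritative. The sketch runs against a mock directory, so no GPU, registry access, or download is needed.

    ```python
    import json
    import tempfile
    from pathlib import Path

    def looks_like_speculators_checkpoint(model_dir: str) -> bool:
        """Heuristic check that a directory resembles a converted checkpoint.

        Assumes the converter writes a config.json containing a
        'speculators_model_type' key (an assumption based on the upstream
        speculators project; field names may change between releases).
        """
        config_path = Path(model_dir) / "config.json"
        if not config_path.is_file():
            return False
        config = json.loads(config_path.read_text())
        return "speculators_model_type" in config

    # Demonstrate against a mock directory standing in for a converted
    # model such as eagle3-llama-3.1-8b-instruct-converted.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "config.json").write_text(
            json.dumps({"speculators_model_type": "eagle3"})
        )
        print(looks_like_speculators_checkpoint(tmp))  # prints: True
    ```

    The same function returns False for an empty or unconverted directory, which makes it a cheap guard before pointing an inference server at the output path.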

Verification

  • Verify that the script output includes the message Validation succeeded.
  • Confirm that the converted model directory exists in your working directory, for example eagle3-llama-3.1-8b-instruct-converted.

Example output

2026-04-17 13:58:49.830 | INFO     | speculators.convert.eagle.eagle3_converter:convert:41 - Converting Eagle-3 checkpoint: yuhuili/EAGLE3-LLaMA3.1-Instruct-8B
Fetching 2 files: 100%|██████████| 2/2 [00:06<00:00,  3.04s/it]
2026-04-17 13:59:01.127 | SUCCESS  | speculators.convert.eagle.eagle3_converter:convert:88 - Saved to: eagle3-llama-3.1-8b-instruct-converted
2026-04-17 13:59:03.888 | SUCCESS  | speculators.convert.eagle.eagle3_converter:_validate_converted_checkpoint:220 - Validation succeeded
