Chapter 3. Converting models with Speculators


Convert an existing Eagle 3 speculator model to the Speculators format for use with Red Hat AI Inference Server. Use this procedure when you have an externally trained Eagle 3 checkpoint that is not already in the Speculators format.

Prerequisites

  • You have installed Podman or Docker.
  • You are logged in as a user with sudo access.
  • You have access to the registry.redhat.io image registry and have logged in.
  • You have a Hugging Face account and have generated a Hugging Face access token.
  • You have access to a Linux server with at least one NVIDIA AI accelerator installed.
  • You have installed the relevant NVIDIA drivers.
  • You have installed the NVIDIA Container Toolkit.
Note

This example uses the meta-llama/Meta-Llama-3.1-8B-Instruct model, which requires accepting a license agreement. Before running this procedure, request access at meta-llama/Llama-3.1-8B-Instruct on Hugging Face.

Procedure

  1. Pull the Red Hat AI Model Optimization Toolkit container image:

    $ podman pull registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2
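
    Optionally, confirm that the image is now present in local storage:

    $ podman images registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9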
  2. Verify the Speculators version installed in the container:

    $ podman run --rm -it \
      registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2 \
      pip show speculators | grep Version

    Example output

    Version: 0.4.0a1
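
    If you prefer a Python-level check, the same information is available through importlib.metadata; this alternative assumes the container image provides a python executable alongside pip:

    $ podman run --rm \
      registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2 \
      python -c "import importlib.metadata; print(importlib.metadata.version('speculators'))"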

  3. Create a working directory and clone the upstream Speculators repository:

    $ mkdir model-opt && \
    cd model-opt && \
    git clone https://github.com/vllm-project/speculators.git
  4. Check out the Speculators branch that matches the version installed in the container:

    $ cd speculators && \
    git checkout v0.4.0+rhaiis
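
    Optionally, confirm that the working tree is on the expected branch. This assumes v0.4.0+rhaiis is a branch, as this step states; if it is a tag instead, git checkout leaves you in a detached HEAD state and the command prints nothing:

    $ git branch --show-current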
  5. Create a private.env file that exports your Hugging Face token as the HF_TOKEN environment variable, and source it:

    $ echo "export HF_TOKEN=<YOUR_HF_TOKEN>" > private.env
    $ source private.env
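
    Optionally, confirm that the token is set in your current shell without echoing its value:

    $ [ -n "$HF_TOKEN" ] && echo "HF_TOKEN is set" || echo "HF_TOKEN is empty"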
  6. If your system has SELinux enabled, configure SELinux to allow device access:

    $ sudo setsebool -P container_use_devices 1
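
    Optionally, verify the new boolean value:

    $ getsebool container_use_devices

    Example output

    container_use_devices --> on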
  7. From the model-opt directory (run cd .. if you are still in the speculators directory, so that the mounted path below resolves), run the apply_eagle3_eagle.sh conversion example using the Red Hat AI Model Optimization Toolkit container:

    $ podman run --rm \
      -v "$(pwd):/opt/app-root/model-opt" \
      --device nvidia.com/gpu=0 \
      --ipc=host \
      -e HF_TOKEN=$HF_TOKEN \
      registry.redhat.io/rhaii-early-access/model-opt-cuda-rhel9:3.4.0-ea.2 \
      bash /opt/app-root/model-opt/speculators/examples/convert/eagle3/apply_eagle3_eagle.sh

    The script downloads the Eagle 3 checkpoint, converts it to the Speculators format, and validates the result.
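
    The --device nvidia.com/gpu=0 flag selects the first GPU through the Container Device Interface (CDI). If the container cannot see the accelerator, you can list the CDI device names registered on the host; this assumes you generated a CDI specification while setting up the NVIDIA Container Toolkit:

    $ nvidia-ctk cdi list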

Verification

  • Verify that the output includes Validation succeeded.
  • Confirm that the converted model directory exists in your working directory, for example eagle3-llama-3.1-8b-instruct-converted.

Example output

2026-04-17 13:58:49.830 | INFO     | speculators.convert.eagle.eagle3_converter:convert:41 - Converting Eagle-3 checkpoint: yuhuili/EAGLE3-LLaMA3.1-Instruct-8B
Fetching 2 files: 100%|██████████| 2/2 [00:06<00:00,  3.04s/it]
2026-04-17 13:59:01.127 | SUCCESS  | speculators.convert.eagle.eagle3_converter:convert:88 - Saved to: eagle3-llama-3.1-8b-instruct-converted
2026-04-17 13:59:03.888 | SUCCESS  | speculators.convert.eagle.eagle3_converter:_validate_converted_checkpoint:220 - Validation succeeded
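
As a further check, you can inspect the converted checkpoint's config.json for the Speculators metadata. Run this from the directory that contains the converted model; the speculators_config key used here is an assumption about the converted config.json layout, based on how Speculators-format checkpoints are typically identified, so open the file directly if the key differs:

$ python3 -c "import json; cfg = json.load(open('eagle3-llama-3.1-8b-instruct-converted/config.json')); print(json.dumps(cfg.get('speculators_config'), indent=2))"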
