Chapter 1. About Speculators


Speculators is a unified library for building, training, and storing speculative decoding algorithms for large language model (LLM) inference in frameworks such as vLLM.

Important

Speculators is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Speculative decoding is an optimization technique that improves inference performance for the LLM that you are serving. Red Hat AI Inference Server supports Eagle 3, a speculative decoding algorithm that pairs a small, single-layer draft model (the speculator) with a full-sized verifier model, which is the LLM that you are serving. The Eagle 3 speculator model auto-regressively predicts several tokens, and the verifier model then processes these tokens in parallel. Because the verifier model can accept multiple tokens per forward pass, effective throughput increases. When the verifier model rejects a token, it discards that token and any draft tokens after it, and then samples a corrected token from its own distribution, so the output matches what the verifier model would produce alone.
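The draft-and-verify loop described above can be illustrated with a minimal Python sketch. This sketch is not the Speculators or vLLM API: `verifier_next`, `draft_next`, and `speculative_generate` are hypothetical names, both "models" are toy arithmetic functions, and acceptance is simplified to greedy matching rather than the sampling-based acceptance that the paragraph describes. Under those assumptions, it shows why the output is identical to what the verifier would produce alone while needing fewer verifier passes.

```python
from typing import List, Tuple

Token = int


def verifier_next(context: List[Token]) -> Token:
    """Toy stand-in (assumption) for the full-sized verifier model's greedy next token."""
    return (context[-1] * 3 + len(context)) % 11


def draft_next(context: List[Token]) -> Token:
    """Toy stand-in (assumption) for the small speculator; it disagrees on every 4th position."""
    guess = verifier_next(context)
    return (guess + 1) % 11 if len(context) % 4 == 0 else guess


def speculative_generate(prompt: List[Token], max_new_tokens: int, k: int = 3) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns the tokens and the number of verifier passes."""
    tokens = list(prompt)
    target_len = len(prompt) + max_new_tokens
    verifier_passes = 0
    while len(tokens) < target_len:
        # 1. The speculator proposes k tokens auto-regressively.
        draft: List[Token] = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. The verifier evaluates all k + 1 positions. A real server does this
        #    in one batched forward pass; the toy version loops but counts it once.
        verifier_passes += 1
        checks = [verifier_next(tokens + draft[:i]) for i in range(k + 1)]
        # 3. Accept the longest prefix of draft tokens that the verifier agrees with.
        accepted = 0
        while accepted < k and draft[accepted] == checks[accepted]:
            accepted += 1
        # 4. Keep the accepted tokens, then add the verifier's own token: a
        #    correction after a rejection, or a bonus token if all k were accepted.
        tokens.extend(draft[:accepted])
        tokens.append(checks[accepted])
    return tokens[:target_len], verifier_passes


def verifier_only_generate(prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: one verifier pass per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(verifier_next(tokens))
    return tokens


if __name__ == "__main__":
    spec_tokens, passes = speculative_generate([1, 2], max_new_tokens=20)
    baseline = verifier_only_generate([1, 2], max_new_tokens=20)
    print("same output as verifier alone:", spec_tokens == baseline)
    print("verifier passes:", passes, "vs baseline passes:", 20)
```

In this sketch every pass yields at least one token, because the verifier always contributes its own token after the accepted prefix. The gain therefore depends on how often the speculator's guesses are accepted.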

Speculative decoding provides the following advantages:

  • Latency decreases because the verifier model validates several tokens in parallel.
  • Eagle 3 speculator models add minimal processing overhead because they are small.
  • Output quality matches what the verifier model would produce alone.