Distributed Inference with llm-d
Red Hat AI Inference 3.4
Architecture, components, and deployment of Distributed Inference with llm-d for scalable LLM serving on Kubernetes
Abstract
Learn about Distributed Inference with llm-d, a Kubernetes-native framework for serving large language models at scale.