LLM Compressor
Red Hat AI Inference Server 3.0
Compressing large language models with the LLM Compressor library
Abstract
Describes the LLM Compressor library and how you can use it to optimize and compress large language models before inferencing.
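As a brief illustration of the workflow this guide covers, the following sketch shows a one-shot quantization pass with the open source llm-compressor Python library. The model ID, quantization scheme, and output directory are illustrative assumptions, not values taken from this document; check the library's documentation for the exact API of your installed version.

```python
# A minimal sketch of one-shot FP8 dynamic quantization with llm-compressor.
# Assumes the llmcompressor and transformers packages are installed; the model
# ID and output directory below are placeholders, not values from this guide.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative model choice

# Load the original full-precision model and tokenizer.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Define a quantization recipe: quantize all Linear layers to FP8 with
# dynamic activation scales, leaving the output head in full precision.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"]
)

# Apply the recipe in a single one-shot pass (no retraining required).
oneshot(model=model, recipe=recipe)

# Save the compressed model so it can be served for inference.
OUTPUT_DIR = MODEL_ID.split("/")[-1] + "-FP8-Dynamic"  # illustrative path
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
```

The saved directory can then be loaded by an inference server in the same way as the original checkpoint, with reduced memory use at serving time.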