Chapter 6. Deprecated metrics

The following metrics are deprecated and will be removed in a future version of AI Inference Server:

vllm:num_requests_swapped
vllm:cpu_cache_usage_perc
vllm:cpu_prefix_cache_hit_rate. KV cache offloading is not used in V1.
vllm:gpu_prefix_cache_hit_rate. In V1, this metric is replaced by separate queries and hits counters.
vllm:time_in_queue_requests. This metric duplicates vllm:request_queue_time_seconds.
vllm:model_forward_time_milliseconds
vllm:model_execute_time_milliseconds. Use the prefill, decode, or inference time metrics instead of this metric and vllm:model_forward_time_milliseconds.
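
To check whether a deployment still exposes any of these metrics, you can scan the text returned by the server's Prometheus metrics endpoint. The following Python sketch is illustrative only: the function name find_deprecated_metrics, the sample payload, and the model_name label value are assumptions made for this example, and the suffix list reflects common Prometheus client conventions rather than anything specific to AI Inference Server.

```python
# Deprecated metric names, copied from the list above.
DEPRECATED_METRICS = (
    "vllm:num_requests_swapped",
    "vllm:cpu_cache_usage_perc",
    "vllm:cpu_prefix_cache_hit_rate",
    "vllm:gpu_prefix_cache_hit_rate",
    "vllm:time_in_queue_requests",
    "vllm:model_forward_time_milliseconds",
    "vllm:model_execute_time_milliseconds",
)

# Assumption: suffixes that Prometheus clients commonly append to
# histogram and counter sample names.
SAMPLE_SUFFIXES = ("", "_bucket", "_sum", "_count", "_total", "_created")


def find_deprecated_metrics(exposition_text):
    """Return the sorted deprecated metric names found in a metrics payload."""
    found = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The sample name ends at the first "{" (labels) or whitespace.
        parts = line.split("{", 1)[0].split()
        if not parts:
            continue
        name = parts[0]
        for metric in DEPRECATED_METRICS:
            if any(name == metric + suffix for suffix in SAMPLE_SUFFIXES):
                found.add(metric)
    return sorted(found)


# Hypothetical payload used only to exercise the function.
sample = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="example-model"} 2.0
vllm:num_requests_swapped{model_name="example-model"} 0.0
vllm:time_in_queue_requests_sum{model_name="example-model"} 1.5
"""

print(find_deprecated_metrics(sample))
# ['vllm:num_requests_swapped', 'vllm:time_in_queue_requests']
```

Running a check like this against dashboards or alert rules before an upgrade can show which queries still depend on a metric that is about to be hidden or removed.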

Important

When a metric is deprecated in version X.Y, it is hidden in version X.Y+1 but can be re-enabled by using the --show-hidden-metrics-for-version=X.Y escape hatch. The metric is removed completely in version X.Y+2.
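
The deprecation schedule above can be sketched as a small Python helper. This is a minimal illustration, not product code: the function names metric_visibility and show_hidden_flag and the version numbers in the usage lines are hypothetical, and the sketch assumes that versions are (major, minor) pairs within the same major version.

```python
def metric_visibility(deprecated_in, current):
    """Classify a metric deprecated in `deprecated_in` as seen from `current`.

    Both arguments are (major, minor) tuples. Assumption: only minor
    versions within a single major version are compared.
    """
    dep_major, dep_minor = deprecated_in
    cur_major, cur_minor = current
    if cur_major != dep_major:
        raise ValueError("this sketch only compares minor versions within one major version")
    offset = cur_minor - dep_minor
    if offset < 0:
        return "active"      # not yet deprecated
    if offset == 0:
        return "deprecated"  # version X.Y: still exposed
    if offset == 1:
        return "hidden"      # version X.Y+1: hidden, can be re-enabled
    return "removed"         # version X.Y+2 and later


def show_hidden_flag(deprecated_in):
    """Build the escape-hatch flag for metrics deprecated in `deprecated_in`."""
    major, minor = deprecated_in
    return f"--show-hidden-metrics-for-version={major}.{minor}"


# Illustrative version numbers only.
print(metric_visibility((0, 7), (0, 8)))  # hidden
print(show_hidden_flag((0, 7)))           # --show-hidden-metrics-for-version=0.7
```

The flag only helps while a metric is in the hidden state. Once the metric reaches the removed state, queries must move to the replacement metrics.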
