8.2. Configuring scale bounds Knative Serving autoscaling
The minScale
and maxScale
annotations can be used to configure the minimum and maximum number of Pods that can serve applications. These annotations can be used to prevent cold starts or to help control computing costs.
- minScale
-
If the
minScale
annotation is not set, Pods will scale to zero (or to 1 if enable-scale-to-zero is false per theConfigMap
). - maxScale
-
If the
maxScale
annotation is not set, there will be no upper limit for the number of Pods created.
minScale
and maxScale
can be configured as follows in the revision template:
spec: template: metadata: autoscaling.knative.dev/minScale: "2" autoscaling.knative.dev/maxScale: "10"
Using these annotations in the revision template will propagate this confguration to PodAutoscaler
objects.
These annotations apply for the full lifetime of a revision. Even when a revision is not referenced by any route, the minimal Pod count specified by minScale
will still be provided. Keep in mind that non-routeable revisions may be garbage collected, which enables Knative to reclaim the resources.