Chapter 16. Auto Scaling
Kubernetes includes the HorizontalPodAutoscaler
which allows StatefulSets or Deployments to be automatically scaled up or down based upon specified metrics. The Infinispan CR exposes the .status.scale
sub-resource, which enables HorizontalPodAutoscaler
resources to target the Infinispan CR.
Before defining a HorizontalPodAutoscaler
configuration, consider the types of Data Grid caches that you define. Distributed and Replicated caches have very different scaling requirements, so defining a HorizontalPodAutoscaler
for server’s running a combination of these cache types may not be advantageous. For example, defining a HorizontalPodAutoscaler
that scales when memory usage reaches a certain percentage will allow overall cache capacity to be increased when defining Distributed caches as cache entries are spread across pods, however it will not work with replicated cache as every pod hosts all cache entries. Conversely, configuring a HorizontalPodAutoscaler
based upon CPU usage will be more beneficial for clusters with replicated cache as every pod contains all cache entries and so distributing read requests across additional nodes will allow a greater number of requests to be processed simultaneously.
16.1. Configuring HorizontalPodAutoscaler
Create a HorizontalPodAutoScaler resource that targets your Infinispan CR.
Procedure
Define a
HorizontalPodAutoscaler
resource in the same namespace as yourInfinispan
CRapiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: infinispan-auto spec: scaleTargetRef: apiVersion: infinispan.org/v1 kind: Infinispan name: example 1 minReplicas: 1 maxReplicas: 10 metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 50
- 1
- The name of your
Infinispan
CR
If using metric resource of type cpu
or memory
, you must configure request/limits for this resource in your Infinispan
CR.
HorizontalPodAutoscaler should be removed when upgrading a Data Grid cluster, as the automatic scaling will cause the upgrade process to enter unexpected state, as the Operator needs to scale the cluster down to 0 pods.