Chapter 43. Configuring the cluster auto-scaler in AWS
You can configure an auto-scaler on your OpenShift Container Platform cluster in Amazon Web Services (AWS) to provide elasticity for your application workload. The auto-scaler ensures that enough nodes are active to run your pods and that the number of active nodes is proportional to current demand.
You can run the auto-scaler only on AWS.
43.1. About the OpenShift Container Platform auto-scaler
The auto-scaler in OpenShift Container Platform repeatedly checks to see how many pods are pending node allocation. If pods are pending allocation and the auto-scaler has not met its maximum capacity, then new nodes are continuously provisioned to accommodate the current demand. When demand drops and fewer nodes are required, the auto-scaler removes unused nodes. After you install the auto-scaler, its behavior is automatic. You only need to add the desired number of replicas to the deployment.
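The scale-up and scale-down behavior described above can be approximated with simple arithmetic. The following is a minimal sketch, not the auto-scaler's actual algorithm, which evaluates each pending pod against node capacity. The function name `target_nodes` and the assumption that every node fits the same number of pods are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: estimate the node count that satisfies current demand.
# Arguments: current nodes, pending pods, pods that fit on one node
# (assumed uniform), minimum nodes, maximum nodes.
target_nodes() {
    current=$1 pending=$2 pods_per_node=$3 min=$4 max=$5
    # Round up: a partially filled node is still a whole node.
    extra=$(( (pending + pods_per_node - 1) / pods_per_node ))
    target=$(( current + extra ))
    # Stay within the configured minimum and maximum capacity.
    if [ "$target" -lt "$min" ]; then target=$min; fi
    if [ "$target" -gt "$max" ]; then target=$max; fi
    echo "$target"
}

# 1 node running, 11 pods pending, 3 pods per node, limits 1 to 6.
target_nodes 1 11 3 1 6
```

With these sample values the sketch prints 5: four extra nodes cover the 11 pending pods, and the result stays under the maximum of 6.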
In OpenShift Container Platform version 3.11, you can deploy the auto-scaler only on Amazon Web Services (AWS). The auto-scaler uses some standard AWS objects to manage your cluster size, including Auto Scaling groups and Launch Configurations.
The auto-scaler uses the following assets:
- Auto Scaling groups
- An Auto Scaling group is a logical representation of a set of machines. You configure an Auto Scaling group with a minimum number of instances to run, the maximum number of instances that can run, and your desired number of instances to run. An Auto Scaling group starts by launching enough instances to meet your desired capacity. You can configure an Auto Scaling group to start with zero instances.
- Launch Configurations
A Launch Configuration is a template that an Auto Scaling group uses to launch instances. When you create a Launch Configuration, you specify information such as:
- The ID of the Amazon Machine Image (AMI) to use as the base image
- The instance type, such as m4.large
- A key pair
- One or more security groups
- The subnets to apply the Launch Configuration to
- OpenShift Container Platform primed images
- When the Auto Scaling group provisions a new instance, the image that it launches must have OpenShift Container Platform already prepared. The Auto Scaling group uses this image to both automatically bootstrap the node and enroll it within the cluster without any manual intervention.
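The three sizes of an Auto Scaling group must be consistent with each other. As a rough illustration, the check below encodes that relationship; the function name `asg_sizes_valid` is hypothetical and the values are samples.

```shell
#!/bin/sh
# Hypothetical check: an Auto Scaling group is consistent when
# minimum <= desired <= maximum. A minimum of zero is allowed.
asg_sizes_valid() {
    min=$1 desired=$2 max=$3
    [ "$min" -ge 0 ] && [ "$min" -le "$desired" ] && [ "$desired" -le "$max" ]
}

if asg_sizes_valid 1 3 6; then
    echo "valid"
else
    echo "invalid"
fi
```

On an existing group, you can read the three values with `aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <name> --query 'AutoScalingGroups[0].[MinSize,DesiredCapacity,MaxSize]' --output text` and pass them to the function.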
43.2. Creating a primed image
You can use Ansible playbooks to automatically create a primed image for the auto-scaler to use. You must provide attributes from your existing Amazon Web Services (AWS) cluster.
If you already have a primed image, you can use it instead of creating a new one.
Procedure
On the host that you used to create your OpenShift Container Platform cluster, create a primed image:
Create a new Ansible inventory file on your local host. This file requires variables that assign the cloudprovider flag to enable autoscaling on the participating nodes. Without these variables, the build_ami.yml playbook cannot use the openshift_cloud_provider role:
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
openshift_deployment_type=openshift-enterprise
ansible_ssh_user=ec2-user
openshift_clusterid=mycluster
ansible_become=yes
openshift_cloudprovider_kind=aws
openshift_cloudprovider_aws_access_key=<aws_access_key>
openshift_cloudprovider_aws_secret_key=<aws_secret_key>

[masters]

[etcd]

[nodes]
Create a provisioning file, build-ami-provisioning-vars.yaml, on your local host:
openshift_deployment_type: openshift-enterprise

openshift_aws_clusterid: mycluster # (1)

openshift_aws_region: us-east-1 # (2)

openshift_aws_create_vpc: false # (3)

openshift_aws_vpc_name: production # (4)

openshift_aws_subnet_az: us-east-1d # (5)

openshift_aws_create_security_groups: false # (6)

openshift_aws_ssh_key_name: production-ssh-key # (7)

openshift_aws_base_ami: ami-12345678 # (8)

openshift_aws_create_s3: False # (9)

openshift_aws_build_ami_group: default # (10)

openshift_aws_vpc: # (11)
  name: "{{ openshift_aws_vpc_name }}"
  cidr: 172.18.0.0/16
  subnets:
    us-east-1:
    - cidr: 172.18.0.0/20
      az: "us-east-1d"

container_runtime_docker_storage_type: overlay2 # (12)
container_runtime_docker_storage_setup_device: /dev/xvdb # (13)

# atomic-openshift-node service requires gquota to be set on the
# filesystem that hosts /var/lib/origin/openshift.local.volumes (OCP
# emptydir). Often it is not ideal or cost effective to deploy a vol
# for emptydir. This pushes emptydir up to the / filesystem. Base ami
# often does not ship with gquota enabled for /. Set this bool true to
# enable gquota on / filesystem when using Red Hat Cloud Access RHEL7
# AMI or Amazon Market RHEL7 AMI.
openshift_aws_ami_build_set_gquota_on_slashfs: true # (14)

rhsub_user: user@example.com # (15)
rhsub_pass: password # (16)
rhsub_pool: pool-id # (17)
- 1
- Provide the name of the existing cluster.
- 2
- Provide the region the existing cluster is currently running in.
- 3
- Specify False to disable the creation of a VPC.
- 4
- Provide the existing VPC name that the cluster is running in.
- 5
- Provide the name of a subnet the existing cluster is running in.
- 6
- Specify False to disable the creation of security groups.
- 7
- Provide the AWS key name to use for SSH access.
- 8
- Provide the AMI image ID to use as the base image for the primed image. See Red Hat® Cloud Access.
- 9
- Specify False to disable the creation of an S3 bucket.
- 10
- Provide the security group name.
- 11
- Provide the VPC subnets the existing cluster is running in.
- 12
- Specify overlay2 as the Docker storage type.
- 13
- Specify the mount point for LVM and the /var/lib/docker directory.
- 14
- If you use Red Hat Cloud, set this parameter value to true to enable gquota on the file system.
- 15
- Specify an email address for a Red Hat account with an active OpenShift Container Platform subscription.
- 16
- Specify the password for the Red Hat account.
- 17
- Specify a pool ID for an OpenShift Container Platform subscription. You can use the same pool ID that you used when you created your cluster.
Run the build_ami.yml playbook to generate a primed image:
# ansible-playbook -i </path/to/inventory/file> \
    /usr/openshift-ansible/playbooks/aws/openshift-cluster/build_ami.yml \
    -e @build-ami-provisioning-vars.yaml
After the playbook runs, you see a new image ID, or AMI, in its output. You specify the AMI that it generated when you create the Launch Configuration.
43.3. Creating the launch configuration and Auto Scaling group
Before you deploy the cluster auto-scaler, you must create an Amazon Web Services (AWS) launch configuration and Auto Scaling group that reference a primed image. You must configure the launch configuration so that the new node automatically joins the existing cluster when it starts.
Prerequisites
- Install an OpenShift Container Platform cluster in AWS.
- Create a primed image.
- If you deployed the EFK stack in your cluster, set the node label to logging-infra-fluentd=true.
Procedure
Create the bootstrap.kubeconfig file by generating it from a master node:
$ ssh master "sudo oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper" > ~/bootstrap.kubeconfig
Create the user-data.txt cloud-init file from the bootstrap.kubeconfig file:
$ cat <<EOF > user-data.txt
#cloud-config
write_files:
- path: /root/openshift_bootstrap/openshift_settings.yaml
  owner: 'root:root'
  permissions: '0640'
  content: |
    openshift_node_config_name: node-config-compute
- path: /etc/origin/node/bootstrap.kubeconfig
  owner: 'root:root'
  permissions: '0640'
  encoding: b64
  content: |
    $(base64 ~/bootstrap.kubeconfig | sed '2,$s/^/    /')

runcmd:
- [ ansible-playbook, /root/openshift_bootstrap/bootstrap.yml]
- [ systemctl, restart, systemd-hostnamed]
- [ systemctl, restart, NetworkManager]
- [ systemctl, enable, atomic-openshift-node]
- [ systemctl, start, atomic-openshift-node]
EOF
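EC2 accepts at most 16 KB of raw user data, so it can be useful to check the size of the generated cloud-init file before you decide how to supply it. The following is a minimal sketch; the function name `fits_user_data` is hypothetical, and the example runs against a small temporary file rather than your real user-data.txt.

```shell
#!/bin/sh
# EC2 accepts at most 16 KB (16384 bytes) of raw user data.
USER_DATA_LIMIT=16384

# Succeeds when the given file is small enough to pass as user data.
fits_user_data() {
    # Arithmetic expansion strips any padding that wc adds to its output.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$USER_DATA_LIMIT" ]
}

# Example run against a small temporary file.
sample=$(mktemp)
printf '#cloud-config\n' > "$sample"
if fits_user_data "$sample"; then
    echo "small enough to pass as user data"
else
    echo "too large: upload the file instead"
fi
rm -f "$sample"
```

To check the real file, call `fits_user_data user-data.txt` from the directory where you created it.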
- Upload a launch configuration template to an AWS S3 bucket.
Create the launch configuration by using the AWS CLI:
$ aws autoscaling create-launch-configuration \
    --launch-configuration-name mycluster-LC \ (1)
    --region us-east-1 \ (2)
    --image-id ami-987654321 \ (3)
    --instance-type m4.large \ (4)
    --security-groups sg-12345678 \ (5)
    --template-url https://s3-.amazonaws.com/.../yourtemplate.json \ (6)
    --key-name production-key (7)
- 1
- Specify a launch configuration name.
- 2
- Specify the region to launch the image in.
- 3
- Specify the primed image AMI that you created.
- 4
- Specify the type of instance to launch.
- 5
- Specify the security groups to attach to the launched image.
- 6
- Specify the launch configuration template that you uploaded.
- 7
- Specify the SSH key-pair name.
Note: If your template is smaller than 16 KB before you encode it, you can provide it by using the AWS CLI, substituting --template-url with --user-data.
Create the Auto Scaling group by using the AWS CLI:
$ aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name mycluster-ASG \ (1)
    --launch-configuration-name mycluster-LC \ (2)
    --min-size 1 \ (3)
    --max-size 6 \ (4)
    --vpc-zone-identifier subnet-12345678 \ (5)
    --tags ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=Name,Value=mycluster-ASG-node,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/mycluster,Value=true,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/node-role.kubernetes.io/compute,Value=true,PropagateAtLaunch=true (6)
- 1
- Specify the name of the Auto Scaling group, which you use when you deploy the auto-scaler.
- 2
- Specify the name of the Launch Configuration that you created.
- 3
- Specify the minimum number of nodes that the auto-scaler maintains. At least one node is required.
- 4
- Specify the maximum number of nodes the scale group can expand to.
- 5
- Specify the VPC subnet-id, which is the same subnet that the cluster uses.
- 6
- Specify this string to ensure that Auto Scaling group tags are propagated to the nodes when they launch.
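The three tag entries in the command above share one structure and differ only in key and value. As an illustration, the sketch below generates them from the cluster and group names; the helper name `asg_tag` is hypothetical, and the names are the sample values used in this procedure.

```shell
#!/bin/sh
# Hypothetical helper: print one --tags entry for an Auto Scaling group.
# Arguments: group name, tag key, tag value.
asg_tag() {
    printf 'ResourceId=%s,ResourceType=auto-scaling-group,Key=%s,Value=%s,PropagateAtLaunch=true\n' \
        "$1" "$2" "$3"
}

cluster=mycluster
group=mycluster-ASG

# Instance name, cluster membership, and the compute node role label.
asg_tag "$group" Name "$group-node"
asg_tag "$group" "kubernetes.io/cluster/$cluster" true
asg_tag "$group" k8s.io/cluster-autoscaler/node-template/label/node-role.kubernetes.io/compute true
```

Each printed line is one argument to --tags. Generating them this way keeps the cluster name in the tag key consistent with the group name.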
43.4. Deploying the auto-scaler components on your cluster
After you create the Launch Configuration and Auto Scaling group, you can deploy the auto-scaler components onto the cluster.
Prerequisites
- Install an OpenShift Container Platform cluster in AWS.
- Create a primed image.
- Create a Launch Configuration and Auto Scaling group that reference the primed image.
Procedure
To deploy the auto-scaler:
Update your cluster to run the auto-scaler:
Add the following parameter to the inventory file that you used to create the cluster, by default /etc/ansible/hosts:
openshift_master_bootstrap_auto_approve=true
To obtain the auto-scaler components, change to the playbook directory and run the playbook again:
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i </path/to/inventory/file> \
    playbooks/openshift-master/enable_bootstrap.yml
Confirm that the bootstrap-autoapprover pod is running:
$ oc get pods --all-namespaces | grep bootstrap-autoapprover
NAMESPACE         NAME                       READY     STATUS    RESTARTS   AGE
openshift-infra   bootstrap-autoapprover-0   1/1       Running   0
Create a namespace for the auto-scaler:
$ oc apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-autoscaler
  annotations:
    openshift.io/node-selector: ""
EOF
Create a service account for the auto-scaler:
$ oc apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: cluster-autoscaler
EOF
Create a cluster role to grant the required permissions to the service account:
$ oc apply -n cluster-autoscaler -f - <<EOF
apiVersion: v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
rules:
- apiGroups: # (1)
  - ""
  resources:
  - pods/eviction
  verbs:
  - create
  attributeRestrictions: null
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  - persistentvolumes
  - pods
  - replicationcontrollers
  - services
  verbs:
  - get
  - list
  - watch
  attributeRestrictions: null
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - get
  - list
  - watch
  - patch
  - create
  attributeRestrictions: null
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
  - patch
  - update
  attributeRestrictions: null
- apiGroups:
  - extensions
  - apps
  resources:
  - daemonsets
  - replicasets
  - statefulsets
  verbs:
  - get
  - list
  - watch
  attributeRestrictions: null
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - get
  - list
  - watch
  attributeRestrictions: null
EOF
- 1
- If the cluster-autoscaler object exists, ensure that the pods/eviction rule exists with the verb create.
Create a role for the deployment auto-scaler:
$ oc apply -n cluster-autoscaler -f - <<EOF
apiVersion: v1
kind: Role
metadata:
  name: cluster-autoscaler
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - cluster-autoscaler
  - cluster-autoscaler-status
  verbs:
  - create
  - get
  - patch
  - update
  attributeRestrictions: null
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - create
  attributeRestrictions: null
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  attributeRestrictions: null
EOF
Create a creds file to store AWS credentials for the auto-scaler:
$ cat <<EOF > creds
[default]
aws_access_key_id = your-aws-access-key-id
aws_secret_access_key = your-aws-secret-access-key
EOF
The auto-scaler uses these credentials to launch new instances.
Create a secret that contains the AWS credentials:
$ oc create secret -n cluster-autoscaler generic autoscaler-credentials --from-file=creds
The auto-scaler uses this secret to launch instances within AWS.
Grant the cluster-autoscaler cluster role, the cluster-autoscaler role, and the cluster-reader cluster role to the cluster-autoscaler service account that you created:
$ oc adm policy add-cluster-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
$ oc adm policy add-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler --role-namespace cluster-autoscaler -n cluster-autoscaler
$ oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
Deploy the cluster auto-scaler:
$ oc apply -n cluster-autoscaler -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
      role: infra
  template:
    metadata:
      labels:
        app: cluster-autoscaler
        role: infra
    spec:
      containers:
      - args:
        - /bin/cluster-autoscaler
        - --alsologtostderr
        - --v=4
        - --skip-nodes-with-local-storage=False
        - --leader-elect-resource-lock=configmaps
        - --namespace=cluster-autoscaler
        - --cloud-provider=aws
        - --nodes=0:6:mycluster-ASG
        env:
        - name: AWS_REGION
          value: us-east-1
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: /var/run/secrets/aws-creds/creds
        image: registry.redhat.io/openshift3/ose-cluster-autoscaler:v3.11
        name: autoscaler
        volumeMounts:
        - mountPath: /var/run/secrets/aws-creds
          name: aws-creds
          readOnly: true
      dnsPolicy: ClusterFirst
      nodeSelector:
        node-role.kubernetes.io/infra: "true"
      serviceAccountName: cluster-autoscaler
      terminationGracePeriodSeconds: 30
      volumes:
      - name: aws-creds
        secret:
          defaultMode: 420
          secretName: autoscaler-credentials
EOF
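The --nodes argument in the deployment packs three values into one string: the minimum node count, the maximum node count, and the name of the Auto Scaling group. The sketch below splits such a value so that each part can be compared with the group that you created; the function name `parse_nodes_arg` is hypothetical.

```shell
#!/bin/sh
# Split a --nodes value of the form <min>:<max>:<Auto Scaling group name>.
parse_nodes_arg() {
    value=$1
    min=${value%%:*}    # text before the first colon
    rest=${value#*:}    # text after the first colon
    max=${rest%%:*}     # text before the next colon
    name=${rest#*:}     # the remainder is the group name
    echo "min=$min max=$max group=$name"
}

parse_nodes_arg 0:6:mycluster-ASG
```

For the sample value the sketch prints `min=0 max=6 group=mycluster-ASG`. The group name is expected to match the name that you specified when you created the Auto Scaling group.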
43.5. Testing the auto-scaler
After you add the auto-scaler to your Amazon Web Services (AWS) cluster, you can confirm that the auto-scaler works by deploying more pods than the current nodes can run.
Prerequisites
- You added the auto-scaler to your OpenShift Container Platform cluster that runs on AWS.
Procedure
Create the scale-up.yaml file that contains the deployment configuration to test auto-scaling:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-up
  labels:
    app: scale-up
spec:
  replicas: 20 # (1)
  selector:
    matchLabels:
      app: scale-up
  template:
    metadata:
      labels:
        app: scale-up
    spec:
      containers:
      - name: origin-base
        image: openshift/origin-base
        resources:
          requests:
            memory: 2Gi
        command:
        - /bin/sh
        - "-c"
        - "echo 'this should be in the logs' && sleep 86400"
      terminationGracePeriodSeconds: 0
- 1
- This deployment specifies 20 replicas, but the initial size of the cluster cannot run all of the pods without first increasing the number of compute nodes.
Create a namespace for the deployment:
$ oc apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: autoscaler-demo
EOF
Deploy the configuration:
$ oc apply -n autoscaler-demo -f scale-up.yaml
View the pods in your namespace:
View the running pods in your namespace:
$ oc get pods -n autoscaler-demo | grep Running
cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
scale-up-79684ff956-45sbg             1/1       Running   0          31s
scale-up-79684ff956-4kzjv             1/1       Running   0          31s
scale-up-79684ff956-859d2             1/1       Running   0          31s
scale-up-79684ff956-h47gv             1/1       Running   0          31s
scale-up-79684ff956-htjth             1/1       Running   0          31s
scale-up-79684ff956-m996k             1/1       Running   0          31s
scale-up-79684ff956-pvvrm             1/1       Running   0          31s
scale-up-79684ff956-qs9pp             1/1       Running   0          31s
scale-up-79684ff956-zwdpr             1/1       Running   0          31s
View the pending pods in your namespace:
$ oc get pods -n autoscaler-demo | grep Pending
scale-up-79684ff956-5jdnj             0/1       Pending   0          40s
scale-up-79684ff956-794d6             0/1       Pending   0          40s
scale-up-79684ff956-7rlm2             0/1       Pending   0          40s
scale-up-79684ff956-9m2jc             0/1       Pending   0          40s
scale-up-79684ff956-9m5fn             0/1       Pending   0          40s
scale-up-79684ff956-fr62m             0/1       Pending   0          40s
scale-up-79684ff956-q255w             0/1       Pending   0          40s
scale-up-79684ff956-qc2cn             0/1       Pending   0          40s
scale-up-79684ff956-qjn7z             0/1       Pending   0          40s
scale-up-79684ff956-tdmqt             0/1       Pending   0          40s
scale-up-79684ff956-xnjhw             0/1       Pending   0          40s
These pending pods cannot run until the cluster auto-scaler automatically provisions new compute nodes to run the pods on. It can take several minutes for the nodes to reach the Ready state in the cluster.
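While you wait, counting the pods in each state can be easier than reading the full listings. The following is a minimal sketch that counts pods by status from `oc get pods` output; the function name `count_pods_with_status` is hypothetical, and the sketch assumes the default column layout with STATUS in the third column.

```shell
#!/bin/sh
# Count pods in a given STATUS from `oc get pods` output read on stdin.
# Assumes the default column layout: NAME READY STATUS RESTARTS AGE.
count_pods_with_status() {
    awk -v status="$1" '$3 == status { n++ } END { print n + 0 }'
}

# Sample input in the format of the listings above. On a live cluster,
# pipe in the output of `oc get pods -n autoscaler-demo` instead.
count_pods_with_status Pending <<'EOF'
NAME                        READY     STATUS    RESTARTS   AGE
scale-up-79684ff956-45sbg   1/1       Running   0          31s
scale-up-79684ff956-5jdnj   0/1       Pending   0          40s
scale-up-79684ff956-794d6   0/1       Pending   0          40s
EOF
```

For the sample input the sketch prints 2. When the count of Pending pods reaches 0, the new nodes have absorbed the demand.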
After several minutes, check the list of nodes to see if new nodes are ready:
$ oc get nodes
NAME                            STATUS    ROLES     AGE       VERSION
ip-172-31-49-172.ec2.internal   Ready     infra     1d        v1.11.0+d4cacc0
ip-172-31-53-217.ec2.internal   Ready     compute   7m        v1.11.0+d4cacc0
ip-172-31-55-89.ec2.internal    Ready     compute   9h        v1.11.0+d4cacc0
ip-172-31-56-21.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
ip-172-31-56-71.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
ip-172-31-63-234.ec2.internal   Ready     master    1d        v1.11.0+d4cacc0
When more nodes are ready, view the running pods in your namespace again:
$ oc get pods -n autoscaler-demo
NAME                                  READY     STATUS    RESTARTS   AGE
cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
scale-up-79684ff956-45sbg             1/1       Running   0          8m
scale-up-79684ff956-4kzjv             1/1       Running   0          8m
scale-up-79684ff956-5jdnj             1/1       Running   0          8m
scale-up-79684ff956-794d6             1/1       Running   0          8m
scale-up-79684ff956-7rlm2             1/1       Running   0          8m
scale-up-79684ff956-859d2             1/1       Running   0          8m
scale-up-79684ff956-9m2jc             1/1       Running   0          8m
scale-up-79684ff956-9m5fn             1/1       Running   0          8m
scale-up-79684ff956-fr62m             1/1       Running   0          8m
scale-up-79684ff956-h47gv             1/1       Running   0          8m
scale-up-79684ff956-htjth             1/1       Running   0          8m
scale-up-79684ff956-m996k             1/1       Running   0          8m
scale-up-79684ff956-pvvrm             1/1       Running   0          8m
scale-up-79684ff956-q255w             1/1       Running   0          8m
scale-up-79684ff956-qc2cn             1/1       Running   0          8m
scale-up-79684ff956-qjn7z             1/1       Running   0          8m
scale-up-79684ff956-qs9pp             1/1       Running   0          8m
scale-up-79684ff956-tdmqt             1/1       Running   0          8m
scale-up-79684ff956-xnjhw             1/1       Running   0          8m
scale-up-79684ff956-zwdpr             1/1       Running   0          8m
...