Chapter 3. Specifying dedicated nodes
A Kubernetes cluster runs on many virtual machines or nodes (generally between 2 and 20). Pods can be scheduled on any of these nodes. When you create or schedule a new pod, use the topology_spread_constraints setting to configure how new pods are distributed across the underlying nodes.
Do not schedule your pods on a single node, because if that node fails, the services that those pods provide also fail.
Schedule the control plane pods to run on different nodes from the automation job pods. If the control plane pods share nodes with the job pods, the control plane can become resource starved and degrade the performance of the whole application.
3.1. Assigning pods to specific nodes
You can constrain the automation controller pods created by the operator to run on a certain subset of nodes.
- node_selector and postgres_selector constrain the automation controller pods to run only on the nodes that match all of the specified key/value pairs.
- tolerations and postgres_tolerations enable the automation controller pods to be scheduled onto nodes with matching taints. See Taints and Tolerations in the Kubernetes documentation for further details.
The following table shows the settings and fields that can be set on the automation controller’s specification section of the YAML (or using the OpenShift UI form).
| Name | Description | Default |
|---|---|---|
| postgres_image | Path of the image to pull | postgres |
| postgres_image_version | Image version to pull | 13 |
| node_selector | AutomationController pods’ nodeSelector | '' |
| topology_spread_constraints | AutomationController pods’ topologySpreadConstraints | '' |
| tolerations | AutomationController pods’ tolerations | '' |
| annotations | AutomationController pods’ annotations | '' |
| postgres_selector | Postgres pods’ nodeSelector | '' |
| postgres_tolerations | Postgres pods’ tolerations | '' |
topology_spread_constraints can help optimize the spread of your control plane pods across the compute nodes that match your node selector. For example, setting the maxSkew parameter of this option to 100 spreads the pods as evenly as possible across the available nodes: if there are three matching compute nodes and three pods, one pod is assigned to each compute node. This helps prevent the control plane pods from competing with each other for resources.
Example of a custom configuration for constraining controller pods to specific nodes
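The following minimal sketch shows how these fields might be combined on an AutomationController custom resource. The label key disktype and the taint key dedicated are illustrative assumptions; adjust the keys and values to match your cluster.

```yaml
# Minimal sketch of an AutomationController resource; the label key "disktype"
# and the taint key "dedicated" are illustrative assumptions.
apiVersion: automationcontroller.ansible.com/v1beta1
kind: AutomationController
metadata:
  name: example-controller
spec:
  # Run controller pods only on nodes labeled disktype=ssd
  node_selector: |
    disktype: ssd
  # Spread controller pods as evenly as possible across the matching nodes
  topology_spread_constraints: |
    - maxSkew: 100
      topologyKey: "kubernetes.io/hostname"
      whenUnsatisfiable: "ScheduleAnyway"
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: "example-controller"
  # Tolerate a dedicated=AutomationController:NoSchedule taint on those nodes
  tolerations: |
    - key: "dedicated"
      operator: "Equal"
      value: "AutomationController"
      effect: "NoSchedule"
  # Apply the same constraints to the database pod
  postgres_selector: |
    disktype: ssd
  postgres_tolerations: |
    - key: "dedicated"
      operator: "Equal"
      value: "AutomationController"
      effect: "NoSchedule"
```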
3.2. Specify nodes for job execution
You can add a node selector to the container group pod specification to ensure that jobs run only on certain nodes. First, add a label to the nodes you want to run jobs against.
The following procedure adds a label to a node.
Procedure
List the nodes in your cluster, along with their labels:
kubectl get nodes --show-labels
The output is similar to this:
| Name | Status | Roles | Age | Version | Labels |
|---|---|---|---|---|---|
| worker0 | Ready | <none> | 1d | v1.13.0 | …,kubernetes.io/hostname=worker0 |
| worker1 | Ready | <none> | 1d | v1.13.0 | …,kubernetes.io/hostname=worker1 |
| worker2 | Ready | <none> | 1d | v1.13.0 | …,kubernetes.io/hostname=worker2 |
Choose one of your nodes, and add a label to it by using the following command:
kubectl label nodes <your-node-name> <aap_node_type>=<execution>
For example:
kubectl label nodes <your-node-name> disktype=ssd
where <your-node-name> is the name of your chosen node.
Verify that your chosen node has a disktype=ssd label:
kubectl get nodes --show-labels
The output is similar to this:
| Name | Status | Roles | Age | Version | Labels |
|---|---|---|---|---|---|
| worker0 | Ready | <none> | 1d | v1.13.0 | …,disktype=ssd,kubernetes.io/hostname=worker0 |
| worker1 | Ready | <none> | 1d | v1.13.0 | …,kubernetes.io/hostname=worker1 |
| worker2 | Ready | <none> | 1d | v1.13.0 | …,kubernetes.io/hostname=worker2 |
You can see that the worker0 node now has a disktype=ssd label.
In the automation controller UI, specify that label in your customized pod specification for the container group, as shown in the following sketch.
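A customized container group pod specification that uses the disktype=ssd label might look like the following sketch. The namespace, image, and resource values are illustrative assumptions; the nodeSelector field is what restricts job pods to the labeled nodes.

```yaml
# Sketch of a container group pod spec; the namespace, image, and resource
# values are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  namespace: ansible-automation-platform
spec:
  serviceAccountName: default
  automountServiceAccountToken: false
  # Schedule job pods only on nodes labeled disktype=ssd
  nodeSelector:
    disktype: ssd
  containers:
    - name: worker
      image: quay.io/ansible/awx-ee:latest
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
      resources:
        requests:
          cpu: 250m
          memory: 100Mi
```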
3.3. Extra settings
With extra_settings, you can pass many custom settings through the awx-operator. The extra_settings parameter is appended to /etc/tower/settings.py and can be an alternative to the extra_volumes parameter.
| Name | Description | Default |
|---|---|---|
| extra_settings | Extra settings | '' |
Example configuration of extra_settings parameter
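A configuration along the following lines appends the listed settings to /etc/tower/settings.py; the setting name and value shown here are illustrative.

```yaml
# Sketch of extra_settings on the AutomationController spec;
# the setting name and value are illustrative.
spec:
  extra_settings:
    - setting: MAX_PAGE_SIZE
      value: "500"
```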
Custom pod timeouts
A container group job in automation controller transitions to the running state just before the pod is submitted to the Kubernetes API. Automation controller then expects the pod to enter the Running state before AWX_CONTAINER_GROUP_POD_PENDING_TIMEOUT seconds have elapsed. This setting defines how long automation controller waits from the creation of a pod until the Ansible work begins in the pod; the default value is two hours. You can set AWX_CONTAINER_GROUP_POD_PENDING_TIMEOUT to a higher value if you want automation controller to wait longer before canceling jobs that fail to enter the Running state, for example when pods cannot be scheduled because of resource constraints. Set it by using extra_settings on the automation controller specification.
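For example, the following sketch raises the timeout to three hours (10800 seconds); the value is illustrative.

```yaml
spec:
  extra_settings:
    - setting: AWX_CONTAINER_GROUP_POD_PENDING_TIMEOUT
      value: "10800"
```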
Increase this value if you are consistently launching many more jobs than Kubernetes can schedule and jobs spend longer than AWX_CONTAINER_GROUP_POD_PENDING_TIMEOUT in the pending state.
Jobs are not launched until control capacity is available. If many more jobs are being launched than the container group has capacity to run, consider scaling up your Kubernetes worker nodes.
3.4. Jobs scheduled on the worker nodes
Both automation controller and Kubernetes play a role in scheduling a job.
When a job is launched, its dependencies are fulfilled, meaning any project updates or inventory updates are launched by automation controller as required by the job template, project, and inventory settings.
If the job is not blocked by other business logic in automation controller and there is control capacity in the control plane to start the job, the job is submitted to the dispatcher. By default, the "cost" to control a job is 1 capacity, so a control pod with 100 capacity can control up to 100 jobs at a time. Given control capacity, the job transitions from pending to waiting.
The dispatcher, which is a background process in the control plane pod, starts a worker process to run the job. This worker communicates with the Kubernetes API by using a service account associated with the container group, and uses the pod specification defined on the container group in automation controller to provision the pod. The job status in automation controller is shown as running.
Kubernetes now schedules the pod. A pod can remain in the pending state for up to AWX_CONTAINER_GROUP_POD_PENDING_TIMEOUT seconds. If the pod is denied through a ResourceQuota, the job starts over at pending. You can configure a resource quota on a namespace to limit how many resources the pods in the namespace can consume. For further information about resource quotas, see Resource Quotas.
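The following sketch shows a ResourceQuota that caps the number of pods and the total resources they can request in a namespace; the namespace name and limits are illustrative assumptions.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: job-pod-quota
  namespace: ansible-automation-platform   # illustrative namespace
spec:
  hard:
    pods: "10"            # at most 10 pods in the namespace
    requests.cpu: "4"     # total CPU requested by all pods combined
    requests.memory: 8Gi  # total memory requested by all pods combined
```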