Automation mesh for VM environments
Automate at scale in a cloud-native way
Abstract
Preface
Thank you for your interest in Red Hat Ansible Automation Platform. Ansible Automation Platform is a commercial offering that helps teams manage complex multi-tier deployments by adding control, knowledge, and delegation to Ansible-powered environments.
This guide helps you to understand the requirements and processes behind setting up an automation mesh on a VM-based installation of Red Hat Ansible Automation Platform. This document has been updated to include information for the latest release of Ansible Automation Platform.
Providing feedback on Red Hat documentation
If you have a suggestion to improve this documentation, or find an error, you can contact technical support at https://access.redhat.com to open a request.
Chapter 1. Planning for automation mesh in your VM-based Red Hat Ansible Automation Platform environment
The following topics contain information to help plan an automation mesh deployment in your VM-based Ansible Automation Platform environment. The subsequent sections explain the concepts that comprise automation mesh and provide topology examples, from simple to complex, that illustrate the various ways you can deploy automation mesh.
1.1. About automation mesh
Automation mesh is an overlay network intended to ease the distribution of work across a large and dispersed collection of workers through nodes that establish peer-to-peer connections with each other using existing networks.
Red Hat Ansible Automation Platform 2 replaces Ansible Tower and isolated nodes with automation controller and automation hub. Automation controller provides the control plane for automation through its UI, RESTful API, RBAC, workflows and CI/CD integration, while automation mesh can be used for setting up, discovering, changing or modifying the nodes that form the control and execution layers.
Automation mesh uses TLS encryption for communication, so traffic that traverses external networks (the internet or other) is encrypted in transit.
Automation mesh introduces:
- Dynamic cluster capacity that scales independently, enabling you to create, register, group, ungroup and deregister nodes with minimal downtime.
- Control and execution plane separation that enables you to scale playbook execution capacity independently from control plane capacity.
- Deployment choices that are resilient to latency, reconfigurable without outage, and that dynamically re-route to choose a different path when outages occur.
- Connectivity that includes bi-directional, multi-hopped mesh communication possibilities which are Federal Information Processing Standards (FIPS) compliant.
1.2. Control and execution planes
Automation mesh makes use of unique node types to create both the control and execution planes. Learn more about the control and execution planes and their node types before designing your automation mesh topology.
1.2.1. Control plane
The control plane consists of hybrid and control nodes. Instances in the control plane run persistent automation controller services such as the web server and task dispatcher, in addition to project updates and management jobs.
- Hybrid nodes - this is the default node type for control plane nodes, responsible for automation controller runtime functions like project updates, management jobs and ansible-runner task operations. Hybrid nodes are also used for automation execution.
- Control nodes - control nodes run project and inventory updates and system jobs, but not regular jobs. Execution capabilities are disabled on these nodes.
1.2.2. Execution plane
The execution plane consists of execution nodes that execute automation on behalf of the control plane and have no control functions. Hop nodes serve to route traffic across the mesh. Nodes in the execution plane only run user-space jobs, and may be geographically separated, with high latency, from the control plane.
- Execution nodes - execution nodes run jobs under ansible-runner with podman isolation. This node type is similar to isolated nodes. This is the default node type for execution plane nodes.
- Hop nodes - similar to a jump host, hop nodes route traffic to other execution nodes. Hop nodes cannot execute automation.
1.2.3. Peers
Peer relationships define node-to-node connections. You can define peers within the [automationcontroller] and [execution_nodes] groups, or by using the [automationcontroller:vars] or [execution_nodes:vars] groups.
1.2.4. Defining automation mesh node types
The examples in this section demonstrate how to set the node type for the hosts in your inventory file.
You can set the node_type for single nodes in the control plane or execution plane inventory groups. To define the node type for an entire group of nodes, set the node_type in the vars stanza for the group.
- The permitted values for node_type in the control plane [automationcontroller] group are hybrid (default) and control.
- The permitted values for node_type in the [execution_nodes] group are execution (default) and hop.
Hybrid node
The following inventory consists of a single hybrid node in the control plane:
[automationcontroller]
control-plane-1.example.com
Control node
The following inventory consists of a single control node in the control plane:
[automationcontroller]
control-plane-1.example.com node_type=control
If you set node_type to control in the vars stanza for the control plane nodes, then all of the nodes in the control plane are control nodes.
[automationcontroller]
control-plane-1.example.com
[automationcontroller:vars]
node_type=control
Execution node
The following stanza defines a single execution node in the execution plane:
[execution_nodes]
execution-plane-1.example.com
Hop node
The following stanza defines a single hop node and an execution node in the execution plane. The node_type variable is set at the host level for the hop node; the execution node uses the default node type.
[execution_nodes]
execution-plane-1.example.com node_type=hop
execution-plane-2.example.com
If you want to set the node_type at the group level, you must create separate groups for the execution nodes and the hop nodes.
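A sketch of that layout, using the hostnames from the previous example; the execution_group and hop_group names are illustrative:

```ini
[execution_nodes]
execution-plane-1.example.com
execution-plane-2.example.com

[execution_group]
execution-plane-2.example.com

[execution_group:vars]
node_type=execution

[hop_group]
execution-plane-1.example.com

[hop_group:vars]
node_type=hop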
For a container-based installation of Ansible Automation Platform, use the receptor_peers= variable instead of peers=.
The value of receptor_peers must be a comma-separated list of hostnames. Do not use inventory group names. For more information, see Adding execution nodes.
Create node-to-node connections using the peers= host variable. The following example connects control-plane-1.example.com to execution-node-1.example.com and execution-node-1.example.com to execution-node-2.example.com:
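A sketch of that peering, expressed as inventory host variables:

```ini
[automationcontroller]
control-plane-1.example.com peers=execution-node-1.example.com

[execution_nodes]
execution-node-1.example.com peers=execution-node-2.example.com
execution-node-2.example.com
```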
Chapter 2. Setting up automation mesh
Configure the Ansible Automation Platform installer to set up automation mesh for your Ansible environment. Perform additional tasks to customize your installation, such as importing a Certificate Authority (CA) certificate.
2.1. Automation mesh installation
For a VM-based install of Ansible Automation Platform, you use the installation program to set up automation mesh or to upgrade to automation mesh. To provide Ansible Automation Platform with details about the nodes, groups, and peer relationships in your mesh network, you define them in the inventory file in the installer bundle. For managed cloud, OpenShift, or operator environments, see Automation mesh for managed cloud or operator environments.
2.2. Editing the Red Hat Ansible Automation Platform installer inventory file
You can use the Red Hat Ansible Automation Platform installer inventory file to specify your installation scenario.
Procedure
Navigate to the installer:

[RPM installed package]
$ cd /opt/ansible-automation-platform/installer/

[bundled installer]
$ cd ansible-automation-platform-setup-bundle-<latest-version>

[online installer]
$ cd ansible-automation-platform-setup-<latest-version>
- Open the inventory file with a text editor.
- Edit inventory file parameters to specify your installation scenario. For further information, see Editing the Red Hat Ansible Automation Platform installer inventory file.
2.3. Running the Red Hat Ansible Automation Platform installer setup script
After you update the inventory file with required parameters, run the installer setup script.
Procedure
Run the setup.sh script:

$ sudo ./setup.sh
If you are running the setup as a non-root user with sudo privileges, you can use the following command:

$ ANSIBLE_BECOME_METHOD='sudo' ANSIBLE_BECOME=True ./setup.sh
Installation of Red Hat Ansible Automation Platform will begin.
Additional resources
See Understanding privilege escalation for additional setup.sh script examples.
If you want to add additional nodes to your automation mesh after the initial setup, edit the inventory file to add the new node, then rerun the setup.sh script.
2.4. Importing a Certificate Authority (CA) certificate
A Certificate Authority (CA) verifies and signs individual node certificates in an automation mesh environment. You can provide your own CA by specifying the path to the certificate and the private RSA key file in the inventory file of your Red Hat Ansible Automation Platform installer.
The Ansible Automation Platform installation program generates a CA if you do not provide one.
Procedure
- Open the inventory file for editing.
- Add the mesh_ca_keyfile variable and specify the full path to the private RSA key (.key).
- Add the mesh_ca_certfile variable and specify the full path to the CA certificate file (.crt).
- Save the changes to the inventory file.
Example
[all:vars]
mesh_ca_keyfile=/tmp/<mesh_CA>.key
mesh_ca_certfile=/tmp/<mesh_CA>.crt
With the CA files added to the inventory file, run the installation program to apply the CA. This process copies the CA to the /etc/receptor/tls/ca/ directory on each control and execution node on your mesh network.
2.4.1. Using custom signed certificates in automation mesh
Learn how to replace the default automation mesh installer-provided certificates with custom, organization-specific certificates.
In the following procedure, replace <FQDN/IP Address> and <IP Address> with the Fully Qualified Domain Name (FQDN) or IP address of the node.
Procedure
Stop the receptor service on all automation controller and execution nodes:

# systemctl stop receptor

Generate a new Certificate Authority (CA) for your mesh network. Replace "common ca" in the following command with the required common name:

# receptor --cert-init commonname="common ca" bits=4096 outcert=/etc/receptor/tls/ca/mesh-CA.crt outkey=/etc/receptor/tls/ca/mesh-CA.key

Generate a self-signed certificate request for each controller and execution node:

# receptor --cert-makereq commonname=<FQDN/IP Address> bits=4096 nodeid=<FQDN/IP Address> outreq=/etc/receptor/tls/<FQDN/IP Address>.csr outkey=/etc/receptor/tls/<FQDN/IP Address>.key ipaddress=<IP Address>

Sign the newly created certificates with your CA. Make sure you adjust the notafter= date to meet your organizational requirements. The example shown uses a date far in the future:

# receptor --cert-signreq verify=yes cacert=/etc/receptor/tls/ca/mesh-CA.crt cakey=/etc/receptor/tls/ca/mesh-CA.key req=/etc/receptor/tls/<FQDN/IP Address>.csr outcert=/etc/receptor/tls/<FQDN/IP Address>.crt notafter="2034-07-29T20:48:02Z"
Transfer the newly created and signed certificates to their nodes in the
/etc/receptor/tls/directory. -
The
mesh-CA.crtfile must be placed in/etc/receptor/tls/ca. Ensure that the permissions and ownership of the certificate files are set correctly.
- All files should be owned by the receptor user.
- All certificate files should have 0640 permissions.

# chown -R receptor: /etc/receptor; chmod 0640 /etc/receptor/tls/<FQDN/IP Address>.crt
Start the receptor service on all controller and execution nodes:

# systemctl start receptor

Verify the node status in the Ansible Automation Platform UI:
- In the navigation panel, navigate to the instance groups view.
- Select the default instance group, then go to the Instances tab.
- Ensure that the status of all nodes is marked as Ready.
If any node is marked as Unavailable:
- Select the Unavailable node.
- Click .
- Refresh the page, and the node should now display as Ready.
2.4.2. Correcting multiple signed certificates
If /etc/receptor/tls/ca/mesh-CA.crt (for RPM-based installs) or $HOME/aap/receptor/etc/mesh-CA.crt (for containerized installs) contains more than 10 certificates, an error occurs.
Take the following steps on all automation controller and execution nodes within the Ansible Automation Platform environment.
For an RPM-based install
Procedure
Make a backup of the mesh-CA.crt file:

cp -p /etc/receptor/tls/ca/mesh-CA.crt /etc/receptor/tls/ca/mesh-CA.crt-$(date +%F)

Delete everything past the first certificate within the mesh-CA.crt file; that is, keep only the first certificate that is present at the top of the file.

Restart receptor:

systemctl restart receptor
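The "keep only the first certificate" step can be scripted with awk. The following is a sketch demonstrated on a sample file in /tmp; after taking the backup described above, the same awk command can be applied to your real mesh-CA.crt:

```shell
# Create a sample file with two concatenated certificate blocks to demonstrate on.
# (Apply the awk command to /etc/receptor/tls/ca/mesh-CA.crt after backing it up.)
printf -- '-----BEGIN CERTIFICATE-----\nfirst\n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\nsecond\n-----END CERTIFICATE-----\n' > /tmp/sample-CA.crt

# Print every line up to and including the first END CERTIFICATE marker, then stop.
awk '/END CERTIFICATE/ {print; exit} {print}' /tmp/sample-CA.crt > /tmp/sample-CA-first.crt
```

Review the resulting file before copying it over the original.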
For a containerized install

Make a backup of the mesh-CA.crt file:

cp -p $HOME/aap/receptor/etc/mesh-CA.crt $HOME/aap/receptor/etc/mesh-CA.crt-$(date +%F)

Delete everything past the first certificate within the mesh-CA.crt file; that is, keep only the first certificate that is present at the top of the file.

Restart receptor:

systemctl --user restart receptor
Chapter 3. Automation mesh design patterns
The automation mesh topologies in this section provide examples you can use to design a mesh deployment in your environment. Examples range from a simple, hybrid node deployment to a complex pattern that deploys numerous automation controller instances, employing several execution and hop nodes.
If you are creating a mesh similar to the following in a containerized environment:

- Replace the node_type variable with receptor_type.
- Replace the peers variable with receptor_peers.
- Replace inventory group names with explicit comma-separated lists of hostnames.

The value of receptor_peers must be a comma-separated list of hostnames. Do not use inventory group names. For more information, see Adding execution nodes.
Prerequisites
- You reviewed conceptual information on node types and relationships.
The following examples include images that illustrate the mesh topology. The arrows in the images indicate the direction of peering. After peering is established, the connection between the nodes allows bidirectional communication.
3.1. Multiple hybrid nodes inventory file example
This example inventory file deploys a control plane consisting of multiple hybrid nodes. The nodes in the control plane are automatically peered to one another.
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com
aap_c_3.example.com
The following image displays the topology of this mesh network.
The default node_type for nodes in the control plane is hybrid. You can explicitly set the node_type of individual nodes to hybrid in the [automationcontroller] group:
[automationcontroller]
aap_c_1.example.com node_type=hybrid
aap_c_2.example.com node_type=hybrid
aap_c_3.example.com node_type=hybrid
Alternatively, you can set the node_type of all nodes in the [automationcontroller] group. When you add new nodes to the control plane, they are automatically set to hybrid nodes.
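For example, setting the type once for the whole group (a sketch; node_type=hybrid is already the default, so this simply makes it explicit):

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com
aap_c_3.example.com

[automationcontroller:vars]
node_type=hybrid
```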
If you think that you might add control nodes to your control plane in future, it is better to define a separate group for the hybrid nodes, and set the node_type for the group:
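A sketch of that arrangement; the hybrid_group name is illustrative:

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com
aap_c_3.example.com

[hybrid_group]
aap_c_1.example.com
aap_c_2.example.com
aap_c_3.example.com

[hybrid_group:vars]
node_type=hybrid
```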
3.2. Single node control plane with single execution node
This example inventory file deploys a single-node control plane and establishes a peer relationship to an execution node.
The following image displays the topology of this mesh network.
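Based on the stanza descriptions in this section, such an inventory might look like the following sketch:

```ini
[automationcontroller]
aap_c_1.example.com

[automationcontroller:vars]
node_type=control
peers=execution_nodes

[execution_nodes]
aap_e_1.example.com
```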
The [automationcontroller] stanza defines the control nodes. If you add a new node to the automationcontroller group, it will automatically peer with the aap_c_1.example.com node.
The [automationcontroller:vars] stanza sets the node type to control for all nodes in the control plane and defines how the nodes peer to the execution nodes:
- If you add a new node to the execution_nodes group, the control plane nodes automatically peer to it.
- If you add a new node to the automationcontroller group, the node type is set to control.
The [execution_nodes] stanza lists all the execution and hop nodes in the inventory. The default node type is execution. You can specify the node type for an individual node:
[execution_nodes]
aap_e_1.example.com node_type=execution
Alternatively, you can set the node_type of all execution nodes in the [execution_nodes] group. When you add new nodes to the group, they are automatically set to execution nodes.
[execution_nodes]
aap_e_1.example.com
[execution_nodes:vars]
node_type=execution
If you plan to add hop nodes to your inventory in future, it is better to define a separate group for the execution nodes, and set the node_type for the group:
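A sketch of that arrangement; the local_execution_group name is illustrative:

```ini
[execution_nodes]
aap_e_1.example.com

[local_execution_group]
aap_e_1.example.com

[local_execution_group:vars]
node_type=execution
```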
3.3. Minimum resilient configuration
This example inventory file deploys a control plane consisting of two control nodes, and two execution nodes. All nodes in the control plane are automatically peered to one another. All nodes in the control plane are peered with all nodes in the execution_nodes group. This configuration is resilient because the execution nodes are reachable from all control nodes.
The capacity algorithm determines which control node is chosen when a job is launched. Refer to Automation controller capacity determination and job impact in Configuring automation execution for more information.
The following inventory file defines this configuration.
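A sketch of such an inventory, using hostnames in the style of the other examples in this chapter:

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com

[automationcontroller:vars]
node_type=control
peers=execution_nodes

[execution_nodes]
aap_e_1.example.com
aap_e_2.example.com
```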
The [automationcontroller] stanza defines the control nodes. All nodes in the control plane are peered to one another. If you add a new node to the automationcontroller group, it will automatically peer with the original nodes.
The [automationcontroller:vars] stanza sets the node type to control for all nodes in the control plane and defines how the nodes peer to the execution nodes:
- If you add a new node to the execution_nodes group, the control plane nodes automatically peer to it.
- If you add a new node to the automationcontroller group, the node type is set to control.
The following image displays the topology of this mesh network.
3.4. Segregated local and remote execution configuration
This configuration adds a hop node and a remote execution node to the resilient configuration. The remote execution node is reachable from the hop node.
You can use this setup if you are setting up execution nodes in a remote location, or if you need to run automation in a DMZ network.
The following image displays the topology of this mesh network.
The [automationcontroller:vars] stanza sets the node types for all nodes in the control plane and defines how the control nodes peer to the local execution nodes:
- All nodes in the control plane are automatically peered to one another.
- All nodes in the control plane are peered with all local execution nodes.
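One way to sketch this topology; the instance_group_local group name, the hop node aap_h_1, and the remote execution node aap_e_3 are illustrative:

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com

[automationcontroller:vars]
node_type=control
peers=instance_group_local

[execution_nodes]
aap_e_1.example.com
aap_e_2.example.com
aap_h_1.example.com node_type=hop peers=automationcontroller
aap_e_3.example.com peers=aap_h_1.example.com

[instance_group_local]
aap_e_1.example.com
aap_e_2.example.com
```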
If the name of a group of nodes begins with instance_group_, the installer recognizes it as an instance group and adds it to the Ansible Automation Platform user interface.
3.5. Multi-hopped execution node
In this configuration, resilient controller nodes are peered with resilient local execution nodes. Resilient local hop nodes are peered with the controller nodes. A remote execution node and a remote hop node are peered with the local hop nodes.
You can use this setup if you need to run automation in a DMZ network from a remote network.
The following image displays the topology of this mesh network.
The [automationcontroller:vars] stanza sets the node types for all nodes in the control plane and defines how the control nodes peer to the local execution nodes:
- All nodes in the control plane are automatically peered to one another.
- All nodes in the control plane are peered with all local execution nodes.
The [local_hop:vars] stanza peers all nodes in the [local_hop] group with all the control nodes.
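One way to sketch this topology; the hostnames, group names, and the chain from the remote execution node through the remote hop node are illustrative assumptions:

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com

[automationcontroller:vars]
node_type=control
peers=instance_group_local

[execution_nodes]
aap_e_1.example.com
aap_e_2.example.com
aap_h_1.example.com
aap_h_2.example.com
aap_h_3.example.com node_type=hop peers=local_hop
aap_e_3.example.com peers=aap_h_3.example.com

[instance_group_local]
aap_e_1.example.com
aap_e_2.example.com

[local_hop]
aap_h_1.example.com
aap_h_2.example.com

[local_hop:vars]
node_type=hop
peers=automationcontroller
```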
If the name of a group of nodes begins with instance_group_, the installer recognizes it as an instance group and adds it to the Ansible Automation Platform user interface.
3.6. Outbound only connections to controller nodes
This example inventory file deploys a control plane consisting of two control nodes and several execution nodes. Only outbound connections to the controller nodes are allowed. All nodes in the [execution_nodes] group are peered with all nodes in the control plane.
The following image displays the topology of this mesh network.
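A sketch matching this description; because only outbound connections to the controllers are allowed, the execution nodes initiate the peering:

```ini
[automationcontroller]
aap_c_1.example.com
aap_c_2.example.com

[automationcontroller:vars]
node_type=control

[execution_nodes]
aap_e_1.example.com
aap_e_2.example.com
aap_e_3.example.com

[execution_nodes:vars]
peers=automationcontroller
```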
Chapter 4. Deprovisioning individual nodes or groups
You can deprovision automation mesh nodes and instance groups by using the Ansible Automation Platform installer. Learn how to deprovision specific nodes or entire groups, with example inventory files.
Containerized Ansible Automation Platform deployments do not support the node_state=deprovision parameter. For containerized deployments, use the awx-manage command to deprovision nodes and groups. For more information, see Deprovisioning isolated nodes and Deprovisioning isolated instance groups.
4.1. Deprovisioning individual nodes using the installer
You can deprovision nodes from your automation mesh using the Ansible Automation Platform installer. Edit the inventory file to mark the nodes to deprovision, then run the installer. Running the installer also removes all configuration files and logs attached to the node.
You can deprovision any of your inventory’s hosts except for the first host specified in the [automationcontroller] group.
Procedure
Append node_state=deprovision to the nodes in the inventory file that you want to deprovision.

The following example inventory file deprovisions two nodes from an automation mesh configuration.
Example 4.1. Deprovision nodes
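A sketch of such an inventory; hostnames are illustrative. The nodes marked node_state=deprovision are removed when the installer runs (the first host in [automationcontroller] cannot be deprovisioned):

```ini
[automationcontroller]
controller-1.example.com
controller-2.example.com
controller-3.example.com node_state=deprovision

[execution_nodes]
execution-node-1.example.com
execution-node-2.example.com node_state=deprovision
```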
4.2. Deprovisioning isolated nodes
You have the option to manually remove any isolated nodes using the awx-manage deprovisioning utility.
Use the deprovisioning command to remove only isolated nodes that have not migrated to execution nodes. To deprovision execution nodes from your automation mesh architecture, use the Deprovisioning individual nodes using the installer method instead.
Procedure
Shut down the instance:

$ automation-controller-service stop

Run the deprovision command from another instance, replacing <host_name> with the name of the node as listed in the inventory file:

$ awx-manage deprovision_instance --hostname=<host_name>
4.3. Deprovisioning groups using the installer
You can deprovision entire groups from your automation mesh using the Ansible Automation Platform installer. Running the installer will remove all configuration files and logs attached to the nodes in the group.
You can deprovision any hosts in your inventory except for the first host specified in the [automationcontroller] group.
Procedure
- Add node_state=deprovision to the [group:vars] associated with the group you want to deprovision.
Example 4.2. Group deprovision
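A sketch of a group deprovision; the instance_group_remove group name and hostnames are illustrative:

```ini
[automationcontroller]
controller-1.example.com

[execution_nodes]
execution-node-1.example.com
execution-node-2.example.com

[instance_group_remove]
execution-node-1.example.com
execution-node-2.example.com

[instance_group_remove:vars]
node_state=deprovision
```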
4.4. Deprovisioning isolated instance groups
You have the option to manually remove any isolated instance groups using the awx-manage deprovisioning utility.
Use the deprovisioning command to only remove isolated instance groups. To deprovision instance groups from your automation mesh architecture, use the Deprovisioning groups using the installer method instead.
Procedure
Run the following command, replacing <name> with the name of the instance group:

$ awx-manage unregister_queue --queuename=<name>