Appendix A. Troubleshooting containerized Ansible Automation Platform
Use this information to troubleshoot your containerized Ansible Automation Platform installation.
A.1. Troubleshooting containerized Ansible Automation Platform installation
The installation takes a long time or fails with errors. What should I check?
- Ensure your system meets the minimum requirements as outlined in the installation guide. Items such as improper storage choices and high latency when distributing across many hosts will all have a significant impact.
- Check the installation log file, located by default at ./aap_install.log unless otherwise changed within the local installer ansible.cfg.
- Enable task profiling callbacks on an ad hoc basis to give an overview of where the installation program spends the most time. To do this, use the local ansible.cfg file. Add a callback line such as this under the [defaults] section:
$ cat ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks
Automation controller returns an error of 413
This error is due to manifest.zip license files that are larger than the nginx_client_max_body_size setting. If this error occurs, you will need to change the installation inventory file to include the following variables:
nginx_disable_hsts: false
nginx_http_port: 8081
nginx_https_port: 8444
nginx_client_max_body_size: 20m
nginx_user_headers: []
The current default setting of 20m should be enough to avoid this issue.
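To judge whether a given manifest is likely to hit this limit, you can compare its size with the configured value. The following is a minimal sketch: the function name exceeds_body_limit is hypothetical and not part of the installation program, and the uploaded request body can be larger than the file itself, so treat the result as a rough indication only.

```shell
# Hypothetical helper (not part of the installer): reports whether a file of
# size_bytes would exceed an nginx size limit given in megabytes.
# nginx treats the "m" suffix as 1024 * 1024 bytes.
exceeds_body_limit() {
    size_bytes="$1"
    limit_mb="$2"
    limit_bytes=$(( limit_mb * 1024 * 1024 ))
    if [ "$size_bytes" -gt "$limit_bytes" ]; then
        echo "too large"
    else
        echo "ok"
    fi
}

# Example usage on a RHEL host (GNU stat prints the size in bytes with -c %s):
#   exceeds_body_limit "$(stat -c %s manifest.zip)" 20
```

If the helper reports "too large", raise nginx_client_max_body_size in the inventory file before running the installation program again.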
The installation failed with a “502 Bad Gateway” when going to the controller UI.
This error can occur and manifest itself in the installation application output as:
TASK [ansible.containerized_installer.automationcontroller : Wait for the Controller API to te ready] ******************************************************
fatal: [daap1.lan]: FAILED! => {"changed": false, "connection": "close", "content_length": "150", "content_type": "text/html", "date": "Fri, 29 Sep 2023 09:42:32 GMT", "elapsed": 0, "msg": "Status code was 502 and not [200]: HTTP Error 502: Bad Gateway", "redirected": false, "server": "nginx", "status": 502, "url": "https://daap1.lan:443/api/v2/ping/"}
- Check if you have an automation-controller-web container running and a systemd service.
The systemd service runs at the regular unprivileged user level, not system-wide. If you have used su to switch to the user running the containers, you must set your XDG_RUNTIME_DIR environment variable to the correct value to be able to interact with the user systemctl units.
export XDG_RUNTIME_DIR="/run/user/$UID"
podman ps | grep web
systemctl --user | grep web
No output indicates a problem.
- Try restarting the automation-controller-web service:
systemctl start automation-controller-web.service --user
systemctl --user | grep web
systemctl status automation-controller-web.service --user
Sep 29 10:55:16 daap1.lan automation-controller-web[29875]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 29 10:55:16 daap1.lan automation-controller-web[29875]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
The output indicates that the port is already, or still, in use by another service, in this case nginx.
- Run:
sudo pkill nginx
- Restart and status check the web service again.
Normal service output should look similar to the following, and should still be running:
You can run the installation program again to ensure everything installs as expected.
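Before stopping a process, it can help to confirm which ports the web service failed to bind. The following is a minimal sketch that extracts the port numbers from nginx bind() errors like those in the status output above. The function name failed_bind_ports is hypothetical, and the sketch assumes the log lines keep the "bind() to <address>:<port> failed" wording.

```shell
# Hypothetical helper: reads log text on stdin and prints each port that
# nginx reported it could not bind, one per line, in numeric order.
failed_bind_ports() {
    grep -o 'bind() to [0-9.]*:[0-9]* failed' \
        | sed -E 's/.*:([0-9]+) failed/\1/' \
        | sort -un
}

# Example usage as the user that runs the containers:
#   journalctl --user -u automation-controller-web.service | failed_bind_ports
# Then list the process listening on a reported port, for example 443:
#   sudo ss -ltnp 'sport = :443'
```

The ss output names the process that holds the port, which shows whether a leftover nginx process is the cause before you stop it.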
When attempting to install containerized Ansible Automation Platform in Amazon Web Services you receive output that there is no space left on device
TASK [ansible.containerized_installer.automationcontroller : Create the receptor container] ***************************************************
fatal: [ec2-13-48-25-168.eu-north-1.compute.amazonaws.com]: FAILED! => {"changed": false, "msg": "Can't create container receptor", "stderr": "Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1\n", "stderr_lines": ["Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1"], "stdout": "", "stdout_lines": []}
If you are installing into the /home filesystem on a default Amazon Web Services Marketplace RHEL instance, it might be too small because /home is part of the root / filesystem. You will need to make more space available. The documentation specifies a minimum of 40GB for a single-node deployment of containerized Ansible Automation Platform.
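A quick way to check the space available to the installing user is to compare the df figure for the home directory with the documented minimum. This is a minimal sketch: the function name has_min_space is hypothetical, and comparing available (rather than total) space against 40GB is a deliberately conservative assumption.

```shell
# Hypothetical helper: compares available gigabytes with a required minimum.
has_min_space() {
    avail_gb="$1"
    min_gb="$2"
    if [ "$avail_gb" -ge "$min_gb" ]; then
        echo "enough"
    else
        echo "insufficient"
    fi
}

# Example usage with GNU df (as shipped with RHEL); -BG reports whole
# gigabytes and tr strips the trailing "G" and any padding:
#   avail=$(df --output=avail -BG "$HOME" | tail -n 1 | tr -dc '0-9')
#   has_min_space "$avail" 40
```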
A.2. Troubleshooting containerized Ansible Automation Platform configuration
Sometimes the post install for seeding my Ansible Automation Platform content errors out. This could manifest itself as output similar to this:
TASK [infra.controller_configuration.projects : Configure Controller Projects | Wait for finish the projects creation] ***************************************
Friday 29 September 2023 11:02:32 +0100 (0:00:00.443) 0:00:53.521 ******
FAILED - RETRYING: [daap1.lan]: Configure Controller Projects | Wait for finish the projects creation (1 retries left).
failed: [daap1.lan] (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '536962174348.33944', 'results_file': '/home/aap/.ansible_async/536962174348.33944', 'changed': False, '__controller_project_item': {'name': 'AAP Config-As-Code Examples', 'organization': 'Default', 'scm_branch': 'main', 'scm_clean': 'no', 'scm_delete_on_update': 'no', 'scm_type': 'git', 'scm_update_on_launch': 'no', 'scm_url': 'https://github.com/user/repo.git'}, 'ansible_loop_var': '__controller_project_item'}) => {"__projects_job_async_results_item": {"__controller_project_item": {"name": "AAP Config-As-Code Examples", "organization": "Default", "scm_branch": "main", "scm_clean": "no", "scm_delete_on_update": "no", "scm_type": "git", "scm_update_on_launch": "no", "scm_url": "https://github.com/user/repo.git"}, "ansible_job_id": "536962174348.33944", "ansible_loop_var": "__controller_project_item", "changed": false, "failed": 0, "finished": 0, "results_file": "/home/aap/.ansible_async/536962174348.33944", "started": 1}, "ansible_job_id": "536962174348.33944", "ansible_loop_var": "__projects_job_async_results_item", "attempts": 30, "changed": false, "finished": 0, "results_file": "/home/aap/.ansible_async/536962174348.33944", "started": 1, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
The infra.controller_configuration.dispatch role uses an asynchronous loop with 30 retries to apply each configuration type, and the default delay between retries is 1 second. If the configuration is large, this might not be enough time to apply everything before the last retry occurs.
Increase the retry delay by setting the controller_configuration_async_delay variable to something other than 1 second. For example, setting it to 2 seconds doubles the retry time. The place to do this would be in the repository where the controller configuration is defined. It could also be added to the [all:vars] section of the installation program inventory file.
A few instances have shown that no additional modification is required, and re-running the installation program again worked.
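As a rough illustration of the retry arithmetic described above, the total time the role waits is approximately the number of retries multiplied by the delay. The sketch below ignores the time each status check itself takes, and the function name retry_window is hypothetical.

```shell
# Approximate wait window in seconds: retries multiplied by the delay.
retry_window() {
    retries="$1"
    delay="$2"
    echo $(( retries * delay ))
}

retry_window 30 1   # default delay: prints 30
retry_window 30 2   # delay raised to 2 seconds: prints 60

# Inventory fragment (INI format) that applies the longer delay:
#   [all:vars]
#   controller_configuration_async_delay=2
```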
A.3. Containerized Ansible Automation Platform reference
Can you provide details of the architecture for the Ansible Automation Platform containerized design?
We use as much of the underlying native RHEL technology as possible. For the container runtime and management of services we use Podman. Many Podman services and commands are used to show and investigate the solution.
For instance, use podman ps, and podman images to see some of the foundational and running pieces:
Containerized Ansible Automation Platform runs as rootless containers for maximum out-of-the-box security. This means you can install containerized Ansible Automation Platform by using any local unprivileged user account. Privilege escalation is needed only for certain root-level tasks, and by default you do not need to use the root account directly.
Once installed, you will notice that certain items have been populated on the filesystem where the installation program is run (the underlying RHEL host).
Other containerized services that make use of things such as Podman volumes reside under the installation root directory used. Here are some examples for further reference:
The containers directory contains some of the Podman specifics used and installed for the execution plane:
The controller directory has some of the installed configuration and runtime data points:
The receptor directory has the automation mesh configuration:
After installation, you will also find other pieces in the local user's home directory, such as the .cache directory:
.cache/
├── containers
│   └── short-name-aliases.conf.lock
└── rhsm
    └── rhsm.log
As we run by default in the most secure manner, such as rootless Podman, we can also use other services such as running systemd as non-privileged users. Under systemd you can see some of the component service controls available:
The .config directory:
This is specific to Podman and conforms to the Open Container Initiative (OCI) specifications. Whereas Podman run as the root user would use /var/lib/containers by default, for standard users the hierarchy under $HOME/.local is used.
The .local directory:
We isolate the execution plane from the control plane main services (PostgreSQL, Redis, automation controller, receptor, automation hub and Event-Driven Ansible).
Control plane services run with the standard Podman configuration (~/.local/share/containers/storage).
Execution plane services use a dedicated configuration or storage (~/aap/containers/storage) to prevent execution plane containers from interacting with the control plane.
How can I see host resource utilization statistics?
- Run:
$ podman container stats -a
The previous example is from an installation of the containerized Ansible Automation Platform solution sold and offered by Dell (DAAP), which uses approximately 1.8 GB of RAM.
How much storage is used and where?
Because we run rootless Podman, the container volume storage is under the local user at $HOME/.local/share/containers/storage/volumes.
To view the details of each volume run:
$ podman volume ls
Then run:
$ podman volume inspect <volume_name>
Here is an example:
Several files created by the installation program are located in $HOME/aap/ and bind-mounted into various running containers.
To view the mounts associated with a container run:
$ podman ps --format "{{.ID}}\t{{.Command}}\t{{.Names}}"
Then run:
$ podman inspect <container_name> | jq -r '.[].Mounts[].Source'
If the jq RPM is not installed, install with:
$ sudo dnf -y install jq