Appendix A. Troubleshooting containerized Ansible Automation Platform
Use this information to troubleshoot your containerized Ansible Automation Platform installation.
A.1. Gathering Ansible Automation Platform logs
With the sos utility, you can collect configuration, diagnostic, and troubleshooting data, and give those files to Red Hat Technical Support. An sos report is a common starting point for Red Hat technical support engineers when performing analysis of a service request for Ansible Automation Platform.
You can collect an sos report for each host in your containerized Ansible Automation Platform deployment by running the log_gathering playbook with the appropriate parameters.
Procedure
- Go to the Ansible Automation Platform installation directory.
- Run the log_gathering playbook. This playbook connects to each host in the inventory file, installs the sos tool, and generates the sos report.

$ ansible-playbook -i <path_to_inventory_file> ansible.containerized_installer.log_gathering

Optional: To define additional parameters, specify them with the -e option. For example:

$ ansible-playbook -i <path_to_inventory_file> ansible.containerized_installer.log_gathering -e 'target_sos_directory=<path_to_files>' -e 'case_number=0000000' -e 'clean=true' -e 'upload=true' -s

You can use the -s option to step through each task in the playbook and confirm its execution. This is optional but can be helpful for debugging.

The following is a list of the parameters you can use with the log_gathering playbook:

Table A.1. Parameter reference

| Parameter name | Description | Default |
|---|---|---|
| target_sos_directory | Changes the default location for the sos report files. | The /tmp directory of the current server. |
| case_number | Specifies the support case number if relevant to the log gathering. | |
| clean | Obfuscates sensitive data that might be present in the sos report. | false |
| upload | Automatically uploads the sos report data to Red Hat. | false |

- Gather the sos report files described in the playbook output and share them with the support engineer, or upload the sos report directly to Red Hat by using the upload=true additional parameter.
A.2. Diagnosing the problem
For general container-based troubleshooting, you can inspect the container logs for any running service to help troubleshoot underlying issues.
Identifying the running containers
To get a list of the running container names, run the following command:
$ podman ps --all --format "{{.Names}}"
| Component group | Container name | Purpose |
|---|---|---|
| Automation controller | automation-controller-rsyslog | Handles centralized logging for automation controller. |
| Automation controller | automation-controller-task | Manages and runs tasks related to automation controller, such as running playbooks and interacting with inventories. |
| Automation controller | automation-controller-web | A web server that provides a REST API for automation controller. This is accessed and routed through platform gateway for user interaction. |
| Event-Driven Ansible | automation-eda-api | Exposes the API for Event-Driven Ansible, allowing external systems to trigger and manage event-driven automations. |
| Event-Driven Ansible | automation-eda-daphne | A web server for Event-Driven Ansible, handling WebSocket connections and serving static files. |
| Event-Driven Ansible | automation-eda-web | A web server that provides a REST API for Event-Driven Ansible. This is accessed and routed through platform gateway for user interaction. |
| Event-Driven Ansible | automation-eda-worker-<number> | These containers run the automation rules and playbooks based on incoming events. |
| Event-Driven Ansible | automation-eda-activation-worker-<number> | These containers manage the activation of automation rules, ensuring they run when specific conditions are met. |
| Event-Driven Ansible | automation-eda-scheduler | Responsible for scheduling and managing recurring tasks and rule activations. |
| Platform gateway | automation-gateway-proxy | Acts as a reverse proxy, routing incoming requests to the appropriate Ansible Automation Platform services. |
| Platform gateway | automation-gateway | Responsible for authentication, authorization, and overall request handling for the platform, all of which is exposed through a REST API and served by a web server. |
| Automation hub | automation-hub-api | Provides the API for automation hub, enabling interaction with collection content, user management, and other automation hub functionality. |
| Automation hub | automation-hub-content | Manages and serves Ansible Content Collections, roles, and modules stored in automation hub. |
| Automation hub | automation-hub-web | A web server that provides a REST API for automation hub. This is accessed and routed through platform gateway for user interaction. |
| Automation hub | automation-hub-worker-<number> | These containers handle background tasks for automation hub, such as content synchronization, indexing, and validation. |
| Performance Co-Pilot | pcp | If Performance Co-Pilot Monitoring is enabled, this container is used for system performance monitoring and data collection. |
| PostgreSQL | postgresql | Hosts the PostgreSQL database for Ansible Automation Platform. |
| Receptor | receptor | Facilitates secure and reliable communication within Ansible Automation Platform. |
| Redis | redis | Responsible for caching, real-time analytics, and fast data retrieval. |
Inspecting the logs
Containerized Ansible Automation Platform uses journald for Podman logging. To inspect any running container logs, run the journalctl command:
$ journalctl CONTAINER_NAME=<container_name>
Example command with output:
$ journalctl CONTAINER_NAME=automation-gateway-proxy
Oct 08 01:40:12 aap.example.org automation-gateway-proxy[1919]: [2024-10-08 00:40:12.885][2][info][upstream] [external/envoy/source/common/upstream/cds_ap>
Oct 08 01:40:12 aap.example.org automation-gateway-proxy[1919]: [2024-10-08 00:40:12.885][2][info][upstream] [external/envoy/source/common/upstream/cds_ap>
Oct 08 01:40:19 aap.example.org automation-gateway-proxy[1919]: [2024-10-08T00:40:16.753Z] "GET /up HTTP/1.1" 200 - 0 1138 10 0 "192.0.2.1" "python->
To view the logs of a running container in real-time, run the podman logs -f command:
$ podman logs -f <container_name>
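For example, to follow the logs of the platform gateway proxy container in real time:

$ podman logs -f automation-gateway-proxy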
Controlling container operations
You can control operations for a container by running the systemctl command:
$ systemctl --user status <container_name>
For example, to check the status of the platform gateway proxy container:
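$ systemctl --user status automation-gateway-proxy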
Getting container information about the execution plane
To get container information about automation controller, Event-Driven Ansible, and execution nodes, prefix any Podman commands with one of the following:
CONTAINER_HOST=unix://run/user/<user_id>/podman/podman.sock
or
CONTAINERS_STORAGE_CONF=<user_home_directory>/aap/containers/storage.conf
Example with output:
$ CONTAINER_HOST=unix://run/user/1000/podman/podman.sock podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.redhat.io/ansible-automation-platform-25/ee-supported-rhel8 latest 59d1bc680a7c 6 days ago 2.24 GB
registry.redhat.io/ansible-automation-platform-25/ee-minimal-rhel8 latest a64b9fc48094 6 days ago 338 MB
A.3. Troubleshooting containerized Ansible Automation Platform installation
Use this information to troubleshoot your containerized installation of Ansible Automation Platform.
The installation takes a long time or has errors. What should I check?
- Ensure your system meets the minimum requirements as outlined in System requirements. Factors such as improper storage choices and high latency when distributing across many hosts will all have an impact on installation time.
- Review the installation log file, which is located by default at ./aap_install.log. You can change the log file location within the ansible.cfg file in the installation directory.
- Enable task profiling callbacks on an ad hoc basis to give an overview of where the installation program spends the most time. To do this, use the local ansible.cfg file. Add a callback line under the [defaults] section, for example:

$ cat ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks
Automation controller returns an error of 413
This error occurs when a manifest.zip license file is larger than the controller_nginx_client_max_body_size setting allows. If this error occurs, update the inventory file to include the following variable:

controller_nginx_client_max_body_size=5m
The default setting of 5m should prevent this issue, but you can increase the value as needed.
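For example, to allow larger manifest files, you could raise the limit in the inventory file. The [all:vars] placement mirrors the other installation variables in this guide, and the 10m value is only illustrative:

[all:vars]
controller_nginx_client_max_body_size=10m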
When attempting to install containerized Ansible Automation Platform in Amazon Web Services you receive output that there is no space left on device
TASK [ansible.containerized_installer.automationcontroller : Create the receptor container] ***************************************************
fatal: [ec2-13-48-25-168.eu-north-1.compute.amazonaws.com]: FAILED! => {"changed": false, "msg": "Can't create container receptor", "stderr": "Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1\n", "stderr_lines": ["Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1"], "stdout": "", "stdout_lines": []}
If you are installing into the /home filesystem on a default Amazon Web Services marketplace RHEL instance, it might be too small because /home is part of the root (/) filesystem. To resolve this issue, you must make more space available. For more information about the system requirements, see System requirements.
"Install container tools" task fails due to unavailable packages
This error appears in the installation process output when the container tools packages are unavailable to the target hosts, typically because the hosts are not registered with Red Hat Subscription Management.
To fix this error, run the following command on the target hosts:
sudo subscription-manager register
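You can then confirm that the host is registered before retrying the installation:

$ sudo subscription-manager status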
A.4. Troubleshooting containerized Ansible Automation Platform configuration
Use this information to troubleshoot your containerized Ansible Automation Platform configuration.
Sometimes the post install for seeding my Ansible Automation Platform content errors out
This could manifest itself as output similar to this:
TASK [infra.controller_configuration.projects : Configure Controller Projects | Wait for finish the projects creation] ***************************************
Friday 29 September 2023 11:02:32 +0100 (0:00:00.443) 0:00:53.521 ******
FAILED - RETRYING: [daap1.lan]: Configure Controller Projects | Wait for finish the projects creation (1 retries left).
failed: [daap1.lan] (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '536962174348.33944', 'results_file': '/home/aap/.ansible_async/536962174348.33944', 'changed': False, '__controller_project_item': {'name': 'AAP Config-As-Code Examples', 'organization': 'Default', 'scm_branch': 'main', 'scm_clean': 'no', 'scm_delete_on_update': 'no', 'scm_type': 'git', 'scm_update_on_launch': 'no', 'scm_url': 'https://github.com/user/repo.git'}, 'ansible_loop_var': '__controller_project_item'}) => {"__projects_job_async_results_item": {"__controller_project_item": {"name": "AAP Config-As-Code Examples", "organization": "Default", "scm_branch": "main", "scm_clean": "no", "scm_delete_on_update": "no", "scm_type": "git", "scm_update_on_launch": "no", "scm_url": "https://github.com/user/repo.git"}, "ansible_job_id": "536962174348.33944", "ansible_loop_var": "__controller_project_item", "changed": false, "failed": 0, "finished": 0, "results_file": "/home/aap/.ansible_async/536962174348.33944", "started": 1}, "ansible_job_id": "536962174348.33944", "ansible_loop_var": "__projects_job_async_results_item", "attempts": 30, "changed": false, "finished": 0, "results_file": "/home/aap/.ansible_async/536962174348.33944", "started": 1, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
The infra.controller_configuration.dispatch role uses an asynchronous loop with 30 retries to apply each configuration type, and the default delay between retries is 1 second. If the configuration is large, this might not be enough time to apply everything before the last retry occurs.
Increase the retry delay by setting the controller_configuration_async_delay variable, for example to 2 seconds. You can set this variable in the [all:vars] section of the installation program inventory file.
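For example, the inventory file change could look like this, using the 2 second delay mentioned above:

[all:vars]
controller_configuration_async_delay=2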
Re-run the installation program to ensure everything works as expected.
A.5. Containerized Ansible Automation Platform reference
Use this information to understand the architecture for your containerized Ansible Automation Platform deployment.
Can you give details of the architecture for the Ansible Automation Platform containerized design?
We use as much of the underlying native Red Hat Enterprise Linux technology as possible. Podman is used for the container runtime and management of services.
Use podman ps to list the running containers on the system:
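$ podman ps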
Use podman images to display information about locally stored images:
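$ podman images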
Containerized Ansible Automation Platform runs as rootless containers by default for enhanced security. This means you can install containerized Ansible Automation Platform by using any local unprivileged user account. Privilege escalation is only needed for certain root-level tasks, and you do not need to use the root user directly by default.
The installation program adds the following files to the filesystem where you run the installation program on the underlying Red Hat Enterprise Linux host:
The installation root directory includes other containerized services that make use of Podman volumes.
Here are some examples for further reference:
The containers directory includes some of the Podman specifics used and installed for the execution plane:
The controller directory has some of the installed configuration and runtime data points:
The receptor directory has the automation mesh configuration:
After installation, you will also find other files in the local user’s /home directory such as the .cache directory:
.cache/
├── containers
│ └── short-name-aliases.conf.lock
└── rhsm
└── rhsm.log
Because services run under rootless Podman by default, supporting services such as systemd also run as non-privileged users. Under systemd you can see some of the component service controls that are available:
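For example, you can list the user-level service units that manage the containers; the exact unit names depend on which components are deployed on the host:

$ systemctl --user list-units --type=service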
The .config directory:
This is specific to Podman and conforms to the Open Container Initiative (OCI) specifications. When you run Podman as the root user, /var/lib/containers is used by default. For standard users, the hierarchy under $HOME/.local is used.
The .local directory:
As an example, .local/share/containers/storage/volumes contains what the output from podman volume ls provides.
The execution plane is isolated from the main control plane services to ensure that it does not affect them.
Control plane services run with the standard Podman configuration and can be found in ~/.local/share/containers/storage.
Execution plane services (automation controller, Event-Driven Ansible and execution nodes) use a dedicated configuration found in ~/aap/containers/storage.conf. This separation prevents execution plane containers from affecting the control plane services.
You can view the execution plane configuration with one of the following commands:
CONTAINERS_STORAGE_CONF=~/aap/containers/storage.conf podman <subcommand>

CONTAINER_HOST=unix://run/user/<user_id>/podman/podman.sock podman <subcommand>
How can I see host resource utilization statistics?
Run the following command to display host resource utilization statistics:
$ podman container stats -a
For reference, a Dell sold and offered containerized Ansible Automation Platform solution (DAAP) installation uses approximately 1.8 GB of RAM.
How much storage is used and where?
The container volume storage is under the local user at $HOME/.local/share/containers/storage/volumes.
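To see how much space each volume consumes, you can run a du command over that directory; the path below assumes the default rootless storage location:

$ du -sh $HOME/.local/share/containers/storage/volumes/*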
To view the details of each volume, run the following command:
$ podman volume ls

Run the following command to display detailed information about a specific volume:

$ podman volume inspect <volume_name>
For example:
$ podman volume inspect postgresql
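The command returns a JSON description of the volume. A minimal sketch of its shape, with illustrative values that depend on the installation user and host:

[
    {
        "Name": "postgresql",
        "Driver": "local",
        "Mountpoint": "/home/<user>/.local/share/containers/storage/volumes/postgresql/_data",
        "CreatedAt": "...",
        "Labels": {},
        "Scope": "local",
        "Options": {}
    }
]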
Several files created by the installation program are located in $HOME/aap/ and bind-mounted into various running containers.
To view the mounts associated with a container, run the following command:

$ podman ps --format "{{.ID}}\t{{.Command}}\t{{.Names}}"

Then run the following command to list the source paths that are mounted into a specific container:

$ podman inspect <container_name> | jq -r .[].Mounts[].Source

If the jq RPM is not installed, install it by running the following command:

$ sudo dnf -y install jq