Troubleshooting Ansible Automation Platform


Red Hat Ansible Automation Platform 2.5

Troubleshoot issues with Ansible Automation Platform

Red Hat Customer Content Services

Abstract

This guide provides troubleshooting topics for Red Hat Ansible Automation Platform.

Preface

Use the Troubleshooting Ansible Automation Platform guide to troubleshoot your Ansible Automation Platform installation.

Providing feedback on Red Hat documentation

If you have a suggestion to improve this documentation, or find an error, you can contact technical support at https://access.redhat.com to open a request.

Disclaimer: Links contained in this information to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

Chapter 1. Diagnosing the problem

To start troubleshooting Ansible Automation Platform, use the must-gather command on OpenShift Container Platform or the sos utility on a VM-based installation to collect configuration and diagnostic information. You can attach the output of these utilities to your support case.

The oc adm must-gather command line interface (CLI) command collects information from your Ansible Automation Platform installation deployed on OpenShift Container Platform. It gathers information that is often needed for debugging issues, including resource definitions and service logs.

Running the oc adm must-gather CLI command creates a new directory containing the collected data that you can use to troubleshoot or attach to your support case.

If your OpenShift environment does not have access to registry.redhat.io and you cannot run the must-gather command, then run the oc adm inspect command instead.

Prerequisites

  • The OpenShift CLI (oc) is installed.

Procedure

  1. Log in to your cluster:

    oc login <openshift_url>
  2. Run one of the following commands based on your level of access in the cluster:

    • Run must-gather across the entire cluster:

      oc adm must-gather --image=registry.redhat.io/ansible-automation-platform-25/aap-must-gather-rhel8 --dest-dir <dest_dir>
      • --image specifies the image that gathers data
      • --dest-dir specifies the directory for the output
    • Run must-gather for a specific namespace in the cluster:

      oc adm must-gather --image=registry.redhat.io/ansible-automation-platform-25/aap-must-gather-rhel8 --dest-dir <dest_dir> -- /usr/bin/ns-gather <namespace>
      • -- /usr/bin/ns-gather limits the must-gather data collection to a specified namespace
  3. To attach the must-gather data to your support case, create a compressed file from the must-gather directory created in the previous step and attach it to your support case.

    • For example, on a computer that uses a Linux operating system, run the following command, replacing <must-gather.local.5421342344627712289/> with the must-gather directory name:

      $ tar cvaf must-gather.tar.gz <must-gather.local.5421342344627712289/>
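
The compression step can also be scripted. The following is a minimal Python sketch, not part of the product tooling, that uses the standard tarfile module to produce the same kind of .tar.gz archive. The function name archive_must_gather and its default archive name are illustrative assumptions.

```python
import tarfile
from pathlib import Path


def archive_must_gather(src_dir, archive_path="must-gather.tar.gz"):
    """Create a gzip-compressed tar archive of a must-gather directory.

    archive_must_gather is a hypothetical helper name used for illustration.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    out = Path(archive_path)
    # "w:gz" writes a gzip-compressed archive, matching the .tar.gz suffix.
    with tarfile.open(out, "w:gz") as tar:
        # arcname stores only the directory name, not the full local path.
        tar.add(src, arcname=src.name)
    return out
```

For example, archive_must_gather("must-gather.local.5421342344627712289") writes must-gather.tar.gz to the current directory. Write the archive outside the must-gather directory so that the archive does not include itself.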

The sos utility collects configuration, diagnostic, and troubleshooting data from your Ansible Automation Platform on a VM-based installation.

For more information about installing and using the sos utility, see Generating an sos report for technical support.

Find information about troubleshooting automation controller performance and logging issues.

Chapter 3. Backup and recovery

Find information about troubleshooting backup and recovery operations for Ansible Automation Platform.

  • For information about troubleshooting backup and recovery for installations of Ansible Automation Platform Operator on OpenShift Container Platform, see the Troubleshooting section in Backup and recovery for operator environments.

Chapter 4. Execution environments

Resolve issues with execution environment images, including problems with the "Use in Controller" option.

You cannot use the Use in Controller option for an execution environment image on private automation hub. You also receive the error message: “No Controllers available”.

To resolve this issue, connect automation controller to your private automation hub instance.

Procedure

  1. Edit the /etc/pulp/settings.py file on private automation hub and add one of the following parameters depending on your configuration:

    • Single controller

      CONNECTED_ANSIBLE_CONTROLLERS = ['<https://my.controller.node>']
    • Many controllers behind a load balancer

      CONNECTED_ANSIBLE_CONTROLLERS = ['<https://my.controller.loadbalancer>']
    • Many controllers without a load balancer

      CONNECTED_ANSIBLE_CONTROLLERS = ['<https://my.controller.node1>', '<https://my.controller2.node2>']
  2. Stop all of the private automation hub services:

    # systemctl stop pulpcore.service pulpcore-api.service pulpcore-content.service pulpcore-worker@1.service pulpcore-worker@2.service nginx.service redis.service
  3. Start all of the private automation hub services again:

    # systemctl start pulpcore.service pulpcore-api.service pulpcore-content.service pulpcore-worker@1.service pulpcore-worker@2.service nginx.service redis.service

    Verification

    • Verify that you can now use the Use in Controller option in private automation hub.
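
Before restarting the services, it can help to check the list for leftover placeholders. The following is a minimal Python sketch that flags entries that are not well-formed URLs. It assumes, as in the examples above, that every entry is an https URL. The helper name invalid_controller_urls is hypothetical and is not part of private automation hub.

```python
from urllib.parse import urlparse


def invalid_controller_urls(urls):
    """Return the entries that are not https:// URLs with a host name.

    Assumption: controllers are addressed over https, as in the examples.
    """
    bad = []
    for url in urls:
        parsed = urlparse(url)
        # A leftover placeholder such as '<https://host>' has no valid
        # scheme, and a bare host name has neither scheme nor host part.
        if parsed.scheme != "https" or not parsed.hostname:
            bad.append(url)
    return bad
```

An empty result means every entry has the expected shape. It does not confirm that the controller is reachable from private automation hub.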

Chapter 5. Installation

Find information about troubleshooting containerized, operator, and RPM-based installations of Ansible Automation Platform.

Chapter 6. Jobs

Resolve common job issues including module resolution errors, timeout errors, pending jobs, and permission errors.

Jobs are failing with the error message “ERROR! couldn’t resolve module/action 'module name'. This often indicates a misspelling, missing collection, or incorrect module path”.

This error can happen when the collection associated with the module is missing from the execution environment.

The recommended resolution is to create a custom execution environment and add the required collections inside of that execution environment. For more information about creating an execution environment, see Using Ansible Builder in Creating and using execution environments.

Alternatively, you can complete these steps:

Procedure

  1. Create a collections folder inside of the project repository.
  2. Add a requirements.yml file inside of the collections folder and add the collection:

    collections:
    - <collection_name>

A timeout error can happen when the timeout value is too small, causing the job to stop before completion. The default timeout value for connection plugins is 10 seconds.

To resolve the issue, increase the timeout value by completing one of the following methods.

Note

The following changes will affect all of the jobs in automation controller. To use a timeout value for a specific project, add an ansible.cfg file in the root of the project directory and add the timeout parameter value to that ansible.cfg file.

Procedure

  • Increase the timeout value by using one of the following methods:

    • Add ANSIBLE_TIMEOUT as an environment variable in the automation controller UI:

      1. Go to automation controller.
      2. From the navigation panel, select Settings → Jobs settings.
      3. Under Extra Environment Variables add the following:

        {
        "ANSIBLE_TIMEOUT": 60
        }
    • Add a timeout value in the [defaults] section of the ansible.cfg file:

      1. Edit the /etc/ansible/ansible.cfg file and add the following:

        [defaults]
        timeout = 60
    • Run ad hoc commands with a timeout:

      1. To run an ad hoc playbook in the command line, add the --timeout flag to the ansible-playbook command, for example:

        # ansible-playbook --timeout=60 <your_playbook.yml>
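
The note in this section mentions a project-level ansible.cfg file. As a minimal sketch, assuming the project is checked out in a local directory, the following Python function adds or updates the timeout entry in the [defaults] section of that file. The function name set_project_timeout is hypothetical. The standard configparser module discards comments when it rewrites a file, so this approach suits new or simple files only.

```python
import configparser
from pathlib import Path


def set_project_timeout(project_dir, seconds):
    """Set timeout in [defaults] of <project_dir>/ansible.cfg.

    set_project_timeout is a hypothetical helper, not an Ansible API.
    """
    cfg_path = Path(project_dir) / "ansible.cfg"
    # interpolation=None keeps any '%' characters in existing values intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)  # a missing file is skipped without error
    if not parser.has_section("defaults"):
        parser.add_section("defaults")
    parser.set("defaults", "timeout", str(seconds))
    with cfg_path.open("w") as handle:
        parser.write(handle)
    return cfg_path
```

For example, set_project_timeout("/path/to/project", 60) produces a file whose [defaults] section contains timeout = 60, which applies only to jobs that run from that project.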

After launching jobs in automation controller, the jobs stay in a pending state and do not start.

There are a few reasons jobs can become stuck in a pending state. For more information about troubleshooting this issue, see Playbook stays in pending in Configuring automation execution.

Procedure

  1. Run the following commands to list all of the pending jobs:

    # awx-manage shell_plus
    >>> UnifiedJob.objects.filter(status='pending')
  2. Cancel the pending jobs by using one of the following methods:

    • To cancel all pending jobs, run the following command:

      >>> UnifiedJob.objects.filter(status='pending').update(status='canceled')
    • To cancel a single job, run the following command, replacing <job_id> with the job ID to cancel:

      >>> UnifiedJob.objects.filter(id=<job_id>).update(status='canceled')

Jobs are failing with the error message "denied: requested access to the resource is denied, unauthorized: Insufficient permissions". This happens when using an execution environment in private automation hub.

This issue occurs when you protect private automation hub with a password or token but do not assign the registry credential to the execution environment.

Procedure

  1. Go to automation controller.
  2. From the navigation panel, select Administration → Execution Environments.
  3. Click the execution environment assigned to the job template that is failing.
  4. Click Edit.
  5. Assign the appropriate Registry credential from your private automation hub to the execution environment.

Chapter 7. Networking

Resolve networking issues including subnet conflicts and SSL/TLS certificate problems.

The default subnet used in Ansible Automation Platform containers conflicts with the internal network, resulting in "No route to host" errors.

To resolve this issue, update the default classless inter-domain routing (CIDR) value so it does not conflict with the CIDR used by the default Podman networking plugin.

Procedure

  1. In all controller and hybrid nodes, run the following commands to create a file called custom.py:

    # touch /etc/tower/conf.d/custom.py
    # chmod 640 /etc/tower/conf.d/custom.py
    # chown root:awx /etc/tower/conf.d/custom.py
  2. Add the following to the /etc/tower/conf.d/custom.py file:

    DEFAULT_CONTAINER_RUN_OPTIONS = ['--network', 'slirp4netns:enable_ipv6=true,cidr=192.168.1.0/24']
    • 192.168.1.0/24 is the value for the new CIDR in this example.
  3. Stop and start the automation controller service in all controller and hybrid nodes:

    # automation-controller-service stop
    # automation-controller-service start

    All containers will start on the new CIDR.
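
When choosing the new CIDR, you can check it against the networks already in use. The following is a minimal Python sketch that uses the standard ipaddress module. The function name conflicting_networks is hypothetical, and the networks in the usage example are placeholders, not values taken from a real installation.

```python
import ipaddress


def conflicting_networks(candidate, in_use):
    """Return the entries of in_use that overlap the candidate CIDR."""
    cand = ipaddress.ip_network(candidate)
    conflicts = []
    for cidr in in_use:
        net = ipaddress.ip_network(cidr)
        # Only networks of the same IP version can overlap.
        if net.version == cand.version and cand.overlaps(net):
            conflicts.append(cidr)
    return conflicts
```

For example, conflicting_networks("192.168.1.0/24", ["10.88.0.0/16", "192.168.0.0/16"]) reports 192.168.0.0/16 as a conflict, so a different candidate would be needed in that environment. An empty list means that the candidate does not overlap any of the listed networks.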

7.2. Troubleshooting SSL/TLS issues

To troubleshoot SSL/TLS issues, verify the certificate chain, use the correct certificates, and confirm that a trusted Certificate Authority (CA) signed the certificate.

Procedure

  1. Check if the server is reachable over SSL/TLS.

    1. Run the following command to confirm whether the server is reachable over SSL/TLS and to see the full certificate chain:

      # true | openssl s_client -showcerts -connect <fqdn_or_ip>:<port>
    2. Replace <fqdn_or_ip> and <port> with suitable values.
  2. Verify the certificate details.

    1. Run the following command to view the details of a certificate:

      # openssl x509 -in <path_to_certificate> -noout -text
    2. Replace <path_to_certificate> with the path to the certificate file you want to inspect.

      The result of the command shows information such as:

      • Subject - The entity the certificate has been issued to.
      • Issuer - The CA that issued the certificate.
      • Validity "Not Before" - The date from which the certificate is valid.
      • Validity "Not After" - The date the certificate expires.
  3. Verify a trusted CA signed the certificate.

    1. Run the following command to verify that a specific certificate is valid and was signed by a trusted CA:

      openssl verify -CAfile <path_to_ca_public_certificate> <path_to_server_certificate_file_to_verify>
    2. If the command returns OK, it means the certificate file is valid and signed by a trusted CA.
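
As a complement to the openssl commands, the expiry check can be scripted. The following is a minimal Python sketch that uses the standard ssl and socket modules. The helper names fetch_not_after and days_remaining are hypothetical. fetch_not_after needs network access to the server and validates the chain against the system trust store, so an untrusted certificate raises ssl.SSLCertVerificationError instead of returning a date.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the "Not After" field of the certificate a server presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname enables SNI and host name verification.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_remaining(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days until the given "Not After" date.

    A negative value means that the certificate has already expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

For example, days_remaining(fetch_not_after("<fqdn_or_ip>", <port>)) returns the remaining validity in days for the certificate that the server presents.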

Chapter 8. Playbooks

You can use automation content navigator to interactively troubleshoot your playbook. For more information, see Troubleshooting Ansible content with automation content navigator.

Chapter 9. Upgrading

Troubleshoot issues when upgrading to Ansible Automation Platform 2.5.

When upgrading from Ansible Automation Platform 2.4 to 2.5, the upgrade completes successfully. However, connections to the platform gateway URL fail if you are using automation controller behind a load balancer.

You see this error message in the logs:

Error connecting to Controller API

Procedure

  1. To resolve this issue, perform the following tasks for all controller hosts:

    1. For each controller host, add the platform gateway URL as a trusted source in the CSRF_TRUSTED_ORIGINS setting in the settings.py file.

      For example, if you configured the platform gateway URL as https://www.example.com, you must also add that URL in the settings.py file, as shown:

      CSRF_TRUSTED_ORIGINS = ['https://appX.example.com:8443','https://www.example.com']
    2. Restart each controller host by using the automation-controller-service restart command so that the URL changes are implemented. For the procedure, see Start, stop, and restart automation controller in Configuring automation execution.
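
To confirm the setting on each host, you can compare the configured list with the URLs that must be trusted. The following is a minimal Python sketch of that comparison. The function name missing_trusted_origins is hypothetical, and the comparison (case-insensitive, ignoring a trailing slash) is a simplifying assumption, not the exact matching logic that the platform applies.

```python
def missing_trusted_origins(required, configured):
    """Return the required origins that are absent from the configured list.

    Assumption: origins are compared case-insensitively and without a
    trailing slash; this is a simplification for illustration.
    """
    normalized = {origin.rstrip("/").lower() for origin in configured}
    return [
        origin for origin in required
        if origin.rstrip("/").lower() not in normalized
    ]
```

For example, with only https://appX.example.com:8443 configured, checking for https://www.example.com returns that URL, which indicates that the platform gateway URL still needs to be added.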

Legal Notice

Copyright © 2025 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.