Appendix A. Troubleshooting containerized Ansible Automation Platform
Use this information to troubleshoot your containerized Ansible Automation Platform installation.
A.1. Troubleshooting containerized Ansible Automation Platform installation
The installation takes a long time or fails with errors. What should I check?
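- Ensure your system meets the minimum requirements outlined in the installation guide. Factors such as poor storage choices and high latency when distributing across many hosts have a significant impact on installation time.
- Check the installation log file, located by default at ./aap_install.log unless changed in the local installer ansible.cfg.
- Temporarily enable the task profiling callback to get an overview of where the installation program spends the most time. To do this, use the local ansible.cfg file and add a callback line under the [defaults] section, for example: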
$ cat ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks
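Before starting a long installation run, it can help to confirm that the callback line is really present in the file you expect the installer to read. The following is a minimal sketch, not part of the installer: `profile_tasks_enabled` is a hypothetical helper name, and the check only looks for a `callbacks_enabled` line that mentions `ansible.posix.profile_tasks`; it does not verify that the line sits under `[defaults]`.

```shell
# Hypothetical helper (not shipped with the installer): succeed if the given
# ansible.cfg has a callbacks_enabled line that includes profile_tasks.
# Limitation: the section the line belongs to is not checked.
profile_tasks_enabled() {
    cfg="${1:-ansible.cfg}"
    [ -f "$cfg" ] || return 1
    grep -Eq '^[[:space:]]*callbacks_enabled[[:space:]]*=.*ansible\.posix\.profile_tasks' "$cfg"
}

# Example use:
#   profile_tasks_enabled ./ansible.cfg && echo "task profiling is enabled"
```

Because profiling is meant to be enabled only temporarily, the same check can be used afterwards to confirm the line has been removed again.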
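Automation controller returns a 413 error
This error occurs when the manifest.zip license file is larger than the nginx_client_max_body_size setting. If this error occurs, change the installation inventory file to include the following variables: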
nginx_disable_hsts: false
nginx_http_port: 8081
nginx_https_port: 8444
nginx_client_max_body_size: 20m
nginx_user_headers: []
The current default setting of 20m should be enough to avoid this issue.
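To see ahead of time whether a given manifest would exceed the limit, the file size can be compared with the configured value. This is a rough sketch under stated assumptions: `size_to_bytes` and `fits_body_limit` are hypothetical helper names, and the size suffixes are interpreted as binary multiples (k = 1024 bytes, m = 1024 k, g = 1024 m), which is how nginx size values are commonly read.

```shell
# Hypothetical helper: convert a size such as "20m", "512k", "1g", or a plain
# byte count into bytes. Suffixes are treated as binary multiples (assumption).
size_to_bytes() {
    value=$(printf '%s' "$1" | tr 'A-Z' 'a-z')   # accept "20M" as well as "20m"
    number="${value%[kmg]}"                       # strip a trailing unit, if any
    case "$value" in
        *k) echo $(( number * 1024 )) ;;
        *m) echo $(( number * 1024 * 1024 )) ;;
        *g) echo $(( number * 1024 * 1024 * 1024 )) ;;
        *)  echo "$number" ;;
    esac
}

# Hypothetical helper: succeed if the file is no larger than the limit.
# Returns 2 if the file does not exist.
fits_body_limit() {
    file="$1"
    limit="$2"
    [ -f "$file" ] || return 2
    file_bytes=$(( $(wc -c < "$file") ))
    [ "$file_bytes" -le "$(size_to_bytes "$limit")" ]
}

# Example use:
#   fits_body_limit manifest.zip 20m || echo "manifest.zip exceeds the limit"
```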
The installation fails with "502 Bad Gateway" when you go to the controller UI.
This error can also appear in the installation program output, as follows:
TASK [ansible.containerized_installer.automationcontroller : Wait for the Controller API to te ready] ******************************************************
fatal: [daap1.lan]: FAILED! => {"changed": false, "connection": "close", "content_length": "150", "content_type": "text/html", "date": "Fri, 29 Sep 2023 09:42:32 GMT", "elapsed": 0, "msg": "Status code was 502 and not [200]: HTTP Error 502: Bad Gateway", "redirected": false, "server": "nginx", "status": 502, "url": "https://daap1.lan:443/api/v2/ping/"}
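- Check that the automation-controller-web container is running and that its systemd service exists.
These run at the regular unprivileged user level, not system-wide. If you used su to switch to the user that runs the containers, you must set the XDG_RUNTIME_DIR environment variable to the correct value so that you can interact with the user systemctl units.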
export XDG_RUNTIME_DIR="/run/user/$UID"
podman ps | grep web
systemctl --user | grep web
No output indicates a problem.
- Try restarting the automation-controller-web service:
systemctl start automation-controller-web.service --user
systemctl --user | grep web
systemctl status automation-controller-web.service --user
Sep 29 10:55:16 daap1.lan automation-controller-web[29875]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 29 10:55:16 daap1.lan automation-controller-web[29875]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
The output indicates that the port is already, or still, in use by another service, in this case nginx. Run:
sudo pkill nginx
- Restart the web service and check its status again.
The service output should look normal, and the service should still be running.
You can run the installation program again to ensure that everything is installed as expected.
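When the service status shows bind() failures like the ones above, it can be useful to pull out exactly which ports are affected before deciding what to stop. The following is a minimal sketch: `ports_in_use` is a hypothetical helper name, and it assumes the log lines follow the same nginx message format shown in the example output.

```shell
# Hypothetical helper: read log text on stdin and print each port that nginx
# failed to bind with "Address already in use", one per line, sorted, no repeats.
ports_in_use() {
    sed -n 's/.*bind() to [^ ]*:\([0-9][0-9]*\) failed (98: Address already in use).*/\1/p' | sort -un
}

# Example use (run as the user that owns the containers):
#   journalctl --user -u automation-controller-web.service | ports_in_use
# For each reported port, a tool such as ss can then show the listening process:
#   sudo ss -ltnp 'sport = :443'
```

Checking which process holds the port first avoids stopping an unrelated service that merely happens to be named nginx.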
When installing containerized Ansible Automation Platform in Amazon Web Services, the output reports that there is no space left on the device
TASK [ansible.containerized_installer.automationcontroller : Create the receptor container] ***************************************************
fatal: [ec2-13-48-25-168.eu-north-1.compute.amazonaws.com]: FAILED! => {"changed": false, "msg": "Can't create container receptor", "stderr": "Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1\n", "stderr_lines": ["Error: creating container storage: creating an ID-mapped copy of layer \"98955f43cc908bd50ff43585fec2c7dd9445eaf05eecd1e3144f93ffc00ed4ba\": error during chown: storage-chown-by-maps: lchown usr/local/lib/python3.9/site-packages/azure/mgmt/network/v2019_11_01/operations/__pycache__/_available_service_aliases_operations.cpython-39.pyc: no space left on device: exit status 1"], "stdout": "", "stdout_lines": []}
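If you are installing onto the /home filesystem of a default Amazon Web Services marketplace RHEL instance, it might be too small because /home is part of the root / filesystem. You need to make more space available. The documentation specifies a minimum of 40 GB for a single-node deployment of containerized Ansible Automation Platform.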
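A quick check of the available space before running the installer can catch this situation early. This is only a sketch: `available_gib` and `has_free_space` are hypothetical helper names, and using 40 as the default threshold for free space is an assumption made for illustration, since the documented figure is a minimum disk size rather than a free-space guarantee.

```shell
# Hypothetical helper: print the space available, in whole GiB, on the
# filesystem that holds the given path (defaults to the home directory).
available_gib() {
    path="${1:-$HOME}"
    # -P forces single-line POSIX output; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$path" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Hypothetical helper: succeed if at least the required number of GiB is free.
# The default of 40 is an illustrative threshold, not an installer rule.
has_free_space() {
    required_gib="${2:-40}"
    [ "$(available_gib "$1")" -ge "$required_gib" ]
}

# Example use:
#   has_free_space "$HOME" 40 || echo "less than 40 GiB free under $HOME"
```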