Chapter 7. Monitoring and logging
You can send metrics to the Google Cloud Platform monitoring system to be visualized in the Google Cloud Platform UI. Metrics and logging for Ansible Automation Platform from GCP Marketplace are disabled by default because sending this data to GCP incurs a cost. Refer to Cloud Monitoring and Cloud Logging respectively for more information.
You can set up GCP monitoring and logging either:
- At deployment time, or
- After the deployment
7.1. Setting up monitoring and logging at deployment time
Procedure
- In the GCP UI, navigate to .
- Check the Connect Logging and Connect Metrics checkboxes.
Note: These checkboxes are only available in the foundation deployment.
7.2. Setting up monitoring and logging after deployment
You can start or stop logging and monitoring after deployment by using the gcp_setup_logging_monitoring playbook, which is available in the operational container image from registry.redhat.io.
7.2.1. Pulling the ansible-on-clouds-ops container image
Pull the Docker image for the Ansible on Clouds operational container whose tag matches the version of your foundation deployment.
For example, if your foundation deployment version is 2.3.20230221-00, you must pull the operational image with the tag 2.3.20230221.
Use the following commands:
$ export IMAGE=registry.redhat.io/ansible-on-clouds/ansible-on-clouds-ops-rhel8:2.3.20230221
$ docker pull $IMAGE --platform=linux/amd64
For EMEA regions (Europe, Middle East, Africa), run the following commands instead:
$ export IMAGE=registry.redhat.io/ansible-on-clouds/ansible-on-clouds-ops-emea-rhel8:2.3.20230221
$ docker pull $IMAGE --platform=linux/amd64
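The tag rule described above can be sketched in shell. This is a minimal sketch that assumes the operational image tag is always the deployment version with its trailing "-NN" suffix removed, which is inferred from the single example in this section. The helper name ops_image_tag is hypothetical.

```shell
# Hypothetical helper: strip the trailing "-NN" suffix from a foundation
# deployment version to obtain the operational image tag.
# Assumption: the tag is always the version without that suffix.
ops_image_tag() {
  printf '%s\n' "${1%-*}"
}

DEPLOYMENT_VERSION=2.3.20230221-00   # example version from this section
TAG=$(ops_image_tag "$DEPLOYMENT_VERSION")
IMAGE="registry.redhat.io/ansible-on-clouds/ansible-on-clouds-ops-rhel8:${TAG}"
echo "$IMAGE"
```

For EMEA regions, the same tag would be appended to the ansible-on-clouds-ops-emea-rhel8 image name instead.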
7.2.2. Generating data files by running the ansible-on-clouds-ops container
The following commands generate the required data file. These commands create a directory, and an empty data template that, when populated, is used to generate the playbook.
Procedure
- Create a folder to hold the configuration files.
$ mkdir command_generator_data
- Populate the command_generator_data folder with the configuration file template.
$ docker run --rm -v $(pwd)/command_generator_data:/data $IMAGE \
command_generator_vars gcp_setup_logging_monitoring \
--output-data-file /data/logging-monitoring.yml
This command produces the following output:
===============================================
Playbook: gcp_setup_logging_monitoring
Description: This playbook setup the logging and monitoring.
-----------------------------------------------
Install monitoring tools and configure them to send data to the Google Cloud Monitoring service.

-----------------------------------------------
Command generator template:

docker run --rm -v <local_data_file_directory>:/data $IMAGE command_generator \
gcp_setup_logging_monitoring --data-file /data/logging-monitoring.yml
When you have run these commands, a command_generator_data/logging-monitoring.yml template file is created.
Note: In the following example file, ansible_config_path is optional.
This template file resembles the following:
gcp_setup_logging_monitoring:
  ansible_config_path:
  cloud_credentials_path:
  deployment_name:
  extra_vars:
    components:
    default_collector_interval:
    logging_enabled:
    monitoring_enabled:
7.2.3. Updating the data file
If you do not require a parameter, remove that parameter from the configuration file.
Procedure
- Edit the command_generator_data/logging-monitoring.yml file and set the following parameters:
  - ansible_config_path: (Optional) By default, the standard configuration for the ansible-on-clouds offering is used. If your environment has extra requirements, you can specify your own configuration file.
  - cloud_credentials_path: The path to your credentials. This must be an absolute path.
  - deployment_name: The name of the deployment.
  - components: An array containing the types of component on which to perform the setup. The default is [ "controller", "hub" ], which means that logging and monitoring are enabled on both automation controller and automation hub.
  - monitoring_enabled: Set to true to enable monitoring, false otherwise. Default = false.
  - logging_enabled: Set to true to enable logging, false otherwise. Default = false.
  - default_collector_interval: The interval at which the monitoring data is sent to Google Cloud. Default = 59s.
Note: The Google cost of this service depends on this interval: the higher the value of the collector interval, the lower the cost.
Do not set values less than 59 seconds.
Note: If monitoring and logging are disabled, the value of default_collector_interval is automatically set to 0.
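As an illustration of the parameters above, the following sketch writes a populated data file from the shell. The credentials path and deployment name are hypothetical placeholders, and the nesting of the last four parameters under extra_vars is an assumption based on the generated template. The optional ansible_config_path is omitted.

```shell
# Write an example data file. All values are illustrative only.
mkdir -p command_generator_data
cat > command_generator_data/logging-monitoring.yml <<'EOF'
gcp_setup_logging_monitoring:
  # Hypothetical absolute path to a service account key file
  cloud_credentials_path: /home/user/gcp/service-account.json
  # Hypothetical deployment name
  deployment_name: my-deployment
  extra_vars:
    components: ["controller", "hub"]
    default_collector_interval: 59s
    logging_enabled: true
    monitoring_enabled: true
EOF
```

Replace the placeholder values with those of your own deployment before running the command generator.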
7.2.4. Generating the playbook
To generate the playbook, run the command generator to generate the CLI command.
$ docker run --rm -v $(pwd)/command_generator_data:/data $IMAGE command_generator gcp_setup_logging_monitoring \
--data-file /data/logging-monitoring.yml
This provides the following command:
docker run --rm --env PLATFORM=GCP -v </path/to/gcp/service-account.json>:/home/runner/.gcp/credentials:ro \
--env ANSIBLE_CONFIG=../gcp-ansible.cfg --env DEPLOYMENT_NAME=<deployment_name> --env GENERATE_INVENTORY=true \
$IMAGE redhat.ansible_on_clouds.gcp_setup_logging_monitoring -e 'gcp_deployment_name=<deployment_name> \
gcp_service_account_credentials_json_path=/home/runner/.gcp/credentials monitoring_enabled=<monitoring_enabled> \
logging_enabled=<logging_enabled> default_collector_interval=<interval>'
Run the supplied command to run the playbook.
$ docker run --rm --env PLATFORM=GCP -v /path/to/credentials:/home/runner/.gcp/credentials:ro \
--env ANSIBLE_CONFIG=../gcp-ansible.cfg --env DEPLOYMENT_NAME=mu-deployment \
--env GENERATE_INVENTORY=true $IMAGE redhat.ansible_on_clouds.gcp_setup_logging_monitoring \
-e 'gcp_deployment_name=mu-deployment \
gcp_service_account_credentials_json_path=/home/runner/.gcp/credentials components=["hubs","controllers"]\
monitoring_enabled=True logging_enabled=True default_collector_interval=60s'
The process may take some time, and provides output similar to the following:
TASK [redhat.ansible_on_clouds.setup_logging_monitoring : Update runtime variable logging_enabled] ***
changed: [<user_name> -> localhost]

TASK [redhat.ansible_on_clouds.setup_logging_monitoring : Update runtime variable monitoring_enabled] ***
changed: [<user_name> -> localhost]

PLAY RECAP *********************************************************************
<user_name> : ok=20 changed=6 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
7.3. Customizing monitoring and logging
Metrics are provided by Ansible, Podman and the Google Ops Agent. Ansible metrics are installed only on automation hub. The Google Ops Agent and Podman metrics are installed on both automation hub and automation controllers.
A configurable process (collector) runs on each automation controller and automation hub to export the collected Ansible and Podman metrics to Google Cloud Platform Monitoring. As the Google Ops Agent is part of the Google Cloud solution, it has its own configuration file.
The Google Ops Agent is also responsible for the logging configuration.
The service APIs monitoring.googleapis.com and logging.googleapis.com must be enabled for the monitoring and logging capabilities, respectively.
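One way to enable both service APIs is the gcloud CLI. The following is a minimal sketch that assumes gcloud is installed and authenticated. The project ID is a hypothetical placeholder. The script only builds and prints the command so that you can review it before running it.

```shell
# PROJECT_ID is a placeholder; replace it with your own GCP project ID.
PROJECT_ID=my-gcp-project
APIS="monitoring.googleapis.com logging.googleapis.com"

# Build the command and print it for review.
ENABLE_CMD="gcloud services enable $APIS --project $PROJECT_ID"
echo "$ENABLE_CMD"

# Uncomment the next line to run the command:
# $ENABLE_CMD
```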
Configuration
Configuration files are located on a disk shared by each automation controller and automation hub. Modify the file /aap/bootstrap/config_file_templates/<controller|hub>/monitoring.yml
to configure all exporters and agents.
7.3.1. Ansible and podman configuration
The file /aap/bootstrap/config_file_templates/<controller|hub>/monitoring.yaml
on automation controller or automation hub contains the configuration for collecting and sending Ansible and podman metrics to GCP.
The default configuration for automation controller looks like this:
# This value will be set at deployment time.
# Set to zero if monitoringEnabled is false otherwise 59s
# The collection interval for each collector will be the minimum
# between the defaultCollectorInterval and all sendInterval
# of a given collector
# NB: The awx exporter should not run on controllers as
# it duplicates the number of records sent to GCP Monitoring

defaultCollectorInterval: $DEFAULT_COLLECTOR_INTERVAL
collectors:
- name: podman
  endpoint: http://localhost:9882/podman/metrics
  enabled: true
  # list of metrics to exclude
  # excludedMetrics:
  # - podman_container_created_seconds
  metrics:
  - name: podman_container_exit_code
    # interval on which the metric must be pushed to gcp
    sendInterval: 59s
The default configuration for automation hub looks like:
# This value will be set at deployment time.
# Set to zero if monitoringEnabled is false otherwise 59s
# The collection interval for each collector will be the minimum
# between the defaultCollectorInterval and all sendInterval
# of a given collector
# NB: The awx exporter should not run on controllers as
# it duplicates the number of records sent to GCP Monitoring

defaultCollectorInterval: 59s
collectors:
- name: awx
  userName: admin
  endpoint: http://<Controller_LB_IP>/api/v2/metrics/
  enabled: true
  metrics:
  - name: awx_inventories_total
    # interval on which the metric must be pushed to gcp
    sendInterval: 59s
- name: podman
  endpoint: http://localhost:9882/podman/metrics
  enabled: true
  # list of metrics to exclude
  # excludedMetrics:
  # - podman_container_created_seconds
  metrics:
  - name: podman_container_exit_code
    # interval on which the metric must be pushed to gcp
    sendInterval: 59s
where collectors is a configuration array with one item per collector, that is, awx and podman.
The awx collector requires authentication, so userName must be set to admin. The password is retrieved from the secret-manager.
The endpoint should not be changed.
defaultCollectorInterval specifies the default interval at which the exporter collects the information from the metric endpoint and sends it to Google Cloud Platform Monitoring.
Setting this value to 0 or omitting this attribute disables all collectors.
Each collector can be enabled or disabled separately by setting enabled to true or false.
A collector returns all available metrics grouped by families, but you can exclude the families that should not be sent to Google Cloud Platform Monitoring by adding their names to the excludedMetrics array.
For all other metric families, you can specify the interval at which you want to collect and send them to Google Cloud Platform Monitoring. The collector interval is the minimum of all metric family intervals and the defaultCollectorInterval. This ensures that a collection is made for each set of metrics sent to Google Cloud Platform Monitoring.
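The minimum rule for the collector interval can be illustrated with a small shell function. This is only a sketch of the arithmetic, not the exporter's implementation. The function name collector_interval is hypothetical, intervals are assumed to be whole seconds with an "s" suffix, and the special case of 0 (which disables all collectors) is not handled.

```shell
# Print the smallest of the given intervals.
# Each argument is a whole number of seconds with an "s" suffix, for example 59s.
collector_interval() {
  min=""
  for value in "$@"; do
    seconds=${value%s}                      # drop the trailing "s"
    if [ -z "$min" ] || [ "$seconds" -lt "$min" ]; then
      min=$seconds
    fi
  done
  printf '%ss\n' "$min"
}

# A defaultCollectorInterval of 120s with two metric families at 59s and 300s
collector_interval 120s 59s 300s
```

With these example values the function prints 59s, because the shortest sendInterval determines how often the collector must run.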
7.3.2. Google Cloud Ops Agent configuration
For details of the configuration file format, refer to the Google Cloud Ops Agent documentation.
The configuration file is located in /etc/google-cloud-ops-agent/config.yml.
This is a symbolic link to the shared disk file /aap/bootstrap/config_file_templates/controller/gcp-ops-agent-config.yml or /aap/bootstrap/config_file_templates/hub/gcp-ops-agent-config.yml, depending on the component type.
The configuration file contains a number of receivers specifying what should be collected by the ops agent.
Your selection of Connect Logging and Connect Metrics during deployment determines which pipelines are included in the file and therefore which logs and metrics are collected and sent to GCP.
If you need to add more pipelines post-deployment, you can insert them in /aap/bootstrap/config_file_templates/<controller|hub>/gcp-ops-agent-config.yml.
A cron job restarts the agent if gcp-ops-agent-config.yml has changed in the last 10 minutes. The agent rereads its configuration after a restart.
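The kind of check such a job performs can be sketched as follows. This is not the script shipped with the product. The function name changed_recently is hypothetical, the find -mmin option is assumed to be available (it is provided by GNU and BSD find), and the restart command in the comment is an assumption.

```shell
# Return success if the given file was modified within the last 10 minutes.
changed_recently() {
  [ -n "$(find "$1" -mmin -10 2>/dev/null)" ]
}

CONFIG=/aap/bootstrap/config_file_templates/controller/gcp-ops-agent-config.yml
if changed_recently "$CONFIG"; then
  echo "Configuration changed, the agent needs a restart"
  # Assumed restart command, left commented out in this sketch:
  # systemctl restart google-cloud-ops-agent
fi
```

If the file does not exist, the function returns a failure status and nothing is restarted.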