Chapter 4. Gathering Information About the Environment
4.1. Monitoring and observability
This chapter describes several ways to monitor and obtain metrics and logs from your Red Hat Virtualization system. These methods include:
- Using Data Warehouse and Grafana to monitor RHV
- Sending metrics to a remote instance of Elasticsearch
- Deploying Insights in Red Hat Virtualization Manager
4.1.1. Using Data Warehouse and Grafana to monitor RHV
4.1.1.1. Grafana overview
Grafana is a web-based UI tool used to display reports based on data collected from the oVirt Data Warehouse PostgreSQL database, which uses the database name ovirt_engine_history. For details of the available report dashboards, see Grafana dashboards and Grafana website - dashboards.
Data from the Manager is collected every minute and aggregated in hourly and daily aggregations. The data is retained according to the scale setting defined in the Data Warehouse configuration during engine-setup (Basic or Full scale):
- Basic (default) - sample data is saved for 24 hours and hourly data is saved for 1 month; no daily aggregations are saved.
- Full (recommended) - sample data is saved for 24 hours, hourly data is saved for 2 months, and daily aggregations are saved for 5 years.
The Full scale setting may require migrating the Data Warehouse to a separate virtual machine.
- For Data Warehouse scaling instructions, see Changing the Data Warehouse Sampling Scale.
- For instructions on migrating the Data Warehouse to or installing on a separate machine, see Migrating Data Warehouse to a Separate Machine and Installing and Configuring Data Warehouse on a Separate Machine.
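If you want to explore the collected history data directly, you can connect to the ovirt_engine_history database on the Data Warehouse machine. This is a minimal sketch; it assumes the database is hosted locally and that you can switch to the postgres user:
# su - postgres -c "psql -d ovirt_engine_history"
From the psql prompt, \dt lists the available tables, including the sample, hourly, and daily history tables that hold the aggregation levels described above.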
Although you can install each of these components on a separate machine, Red Hat only supports installing the Data Warehouse database, the Data Warehouse service, and Grafana together on the same machine.
4.1.1.2. Installation
Grafana integration is enabled and installed by default when you run engine-setup for the Red Hat Virtualization Manager, in both standalone Manager and self-hosted engine installations.
Grafana is not installed by default in some scenarios, such as upgrading from an earlier version of RHV, restoring a backup, or migrating the Data Warehouse to a separate machine; in these cases you may need to install it manually.
To enable Grafana integration manually:
- Put the environment in global maintenance mode:
# hosted-engine --set-maintenance --mode=global
- Log in to the machine where you want to install Grafana. This should be the same machine where the Data Warehouse is configured; usually the Manager machine.
- Run the engine-setup command as follows:
# engine-setup --reconfigure-optional-components
- Answer Yes to install Grafana on this machine:
Configure Grafana on this host (Yes, No) [Yes]:
- Disable global maintenance mode:
# hosted-engine --set-maintenance --mode=none
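After the setup completes, you can confirm that the Grafana service is running before you open the dashboards. This quick check assumes Grafana is managed by the standard grafana-server systemd unit:
# systemctl status grafana-server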
To access the Grafana dashboards:
- Go to https://<engine FQDN or IP address>/ovirt-engine-grafana
or
- Click Monitoring Portal in the web administration welcome page for the Administration Portal.
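To verify that the Grafana endpoint is reachable from a client machine, you can send a test request. The FQDN below is a placeholder for your Manager's FQDN; the -k option skips certificate verification and is only needed if the client does not yet trust the Manager CA:
# curl -k -I https://engine.example.com/ovirt-engine-grafana/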
4.1.1.2.1. Configuring Grafana for Single Sign-on
The Manager engine-setup automatically configures Grafana to allow existing Manager users to log in with SSO from the Administration Portal, but it does not automatically create Grafana users. You need to create new users (Invite in the Grafana UI) and confirm them before they can log in.
- Set an email address for the user in the Manager, if it is not already defined.
- Log in to Grafana with an existing admin user (the initially configured admin).
- In the Grafana UI, go to the user management section and select Invite.
- Enter the email address and name, and select a Role.
- Send the invitation using one of these options:
  - Select the option to send an invitation email and submit it. For this option, you need an operational local mail server configured on the Grafana machine.
  or
  - Select Pending Invites, then:
    - Locate the entry you want.
    - Select the option to copy the invite link.
    - Copy and use this link to create the account by pasting it directly into a browser address bar, or by sending it to another user.
If you use the Pending Invites option, no email is sent, and the email address does not actually need to exist; any valid-looking address works, as long as it is configured as the email address of a Manager user.
To log in with this account:
- Log in to the Red Hat Virtualization web administration welcome page using the account that has this email address.
- Select Monitoring Portal to open the Grafana dashboard.
- Log in to Grafana using the single sign-on option.
4.1.1.3. Built-in Grafana dashboards
The following dashboards are available in the initial Grafana setup to report Data Center, Cluster, Host, and Virtual Machine data:
Dashboard type | Content |
---|---|
Executive dashboards | |
Inventory dashboards | |
Service Level dashboards | |
Trend dashboards | |
The Grafana dashboards include direct links to the Red Hat Virtualization Administration Portal, allowing you to quickly view additional details for your clusters, hosts, and virtual machines.
4.1.1.4. Customized Grafana dashboards
You can create customized dashboards or copy and modify existing dashboards according to your reporting needs.
Built-in dashboards cannot be customized.
4.1.2. Sending metrics and logs to a remote instance of Elasticsearch
Red Hat does not own or maintain Elasticsearch. You need to have a working familiarity with Elasticsearch setup and maintenance to deploy this option.
You can configure the Red Hat Virtualization Manager and hosts to send metrics data and logs to your existing Elasticsearch instance.
To do this, run the Ansible role that configures collectd and rsyslog on the Manager and all hosts to collect engine.log, vdsm.log, and collectd metrics, and send them to your Elasticsearch instance.
For more information, including a full list with explanations of available Metrics Schema, see Sending RHV monitoring data to a remote Elasticsearch instance.
4.1.2.1. Installing collectd and rsyslog
Deploy collectd and rsyslog on the hosts to collect logs and metrics.
You do not need to repeat this procedure for new hosts. Every new host that is added is automatically configured by the Manager to send the data to Elasticsearch during host-deploy.
Procedure
- Log in to the Manager machine using SSH.
- Copy /etc/ovirt-engine-metrics/config.yml.example to create /etc/ovirt-engine-metrics/config.yml.d/config.yml:
# cp /etc/ovirt-engine-metrics/config.yml.example /etc/ovirt-engine-metrics/config.yml.d/config.yml
- Edit the ovirt_env_name and elasticsearch_host parameters in config.yml and save the file (see the example configuration after this procedure). The following additional parameters can be added to the file:
use_omelasticsearch_cert: false
rsyslog_elasticsearch_usehttps_metrics: !!str off
rsyslog_elasticsearch_usehttps_logs: !!str off
  - When using certificates, set use_omelasticsearch_cert to true.
  - To disable logs or metrics, use the rsyslog_elasticsearch_usehttps_metrics and/or rsyslog_elasticsearch_usehttps_logs parameters.
- Deploy collectd and rsyslog on the hosts:
# /usr/share/ovirt-engine-metrics/setup/ansible/configure_ovirt_machines_for_metrics.sh
The configure_ovirt_machines_for_metrics.sh script runs an Ansible role that includes linux-system-roles (see Administration and configuration tasks using System Roles in RHEL) and uses it to deploy and configure rsyslog on the host. rsyslog collects metrics from collectd and sends them to Elasticsearch.
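For reference, a completed config.yml might look like the following sketch. The environment name and Elasticsearch host are placeholder values; replace them with the values for your deployment:
ovirt_env_name: production                      # placeholder environment name
elasticsearch_host: elasticsearch.example.com   # placeholder; use your Elasticsearch FQDN or IP
use_omelasticsearch_cert: false
rsyslog_elasticsearch_usehttps_metrics: !!str off
rsyslog_elasticsearch_usehttps_logs: !!str off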
4.1.2.2. Logging schema and analyzing logs
Use the Discover page to interactively explore data collected from RHV. Each set of results that is collected is referred to as a document. Documents are collected from the following log files:
- engine.log - contains all oVirt Engine UI crashes, Active Directory lookups, database issues, and other events.
- vdsm.log - the log file for VDSM, the Manager's agent on the virtualization hosts; it contains host-related events.
The following fields are available:
Parameter | Description |
---|---|
_id | The unique ID of the document. |
_index | The ID of the index to which the document belongs. |
hostname | For engine.log, the hostname of the Manager. For vdsm.log, the hostname of the host. |
level | The log record severity: TRACE, DEBUG, INFO, WARN, ERROR, FATAL. |
message | The body of the document message. |
ovirt.class | The name of the Java class that produced this log record. |
ovirt.correlationid | For engine.log only. This ID is used to correlate the multiple parts of a single task performed by the Manager. |
ovirt.thread | The name of the Java thread inside which the log record was produced. |
tag | Predefined sets of metadata that can be used to filter the data. |
@timestamp | The time that the record was issued. |
_score | N/A |
_type | N/A |
ipaddr4 | The machine's IP address. |
ovirt.cluster_name | For vdsm.log only. The name of the cluster to which the host belongs. |
ovirt.engine_fqdn | The Manager's FQDN. |
ovirt.module_lineno | The file and line number within the file that ran the command. |
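For example, to narrow the Discover view to error-level records from the Manager, you can combine these fields in a search query. The hostname value is a placeholder for your Manager's hostname:
hostname:"rhvm.example.com" AND level:ERROR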
4.1.3. Deploying Insights
To deploy Red Hat Insights on an existing Red Hat Enterprise Linux (RHEL) system with Red Hat Virtualization Manager installed, complete these tasks:
- Register the system to the Red Hat Insights application.
- Enable data collection from the Red Hat Virtualization environment.
4.1.3.1. Register the system to Red Hat Insights
Register the system to communicate with the Red Hat Insights service and to view results displayed in the Red Hat Insights console.
[root@server ~]# insights-client --register
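To confirm that the registration succeeded, you can check the client's status:
[root@server ~]# insights-client --status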
4.1.3.2. Enable data collection from the Red Hat Virtualization environment
Modify the /etc/ovirt-engine/rhv-log-collector-analyzer/rhv-log-collector-analyzer.conf file to include the following line:
upload-json=True
4.1.3.3. View your Insights results in the Insights Console
System and infrastructure results can be viewed in the Insights console. The Overview tab provides a dashboard view of current risks to your infrastructure. From this starting point, you can investigate how a specific rule is affecting your system, or take a system-based approach to view all the rule matches that pose a risk to the system.
Procedure
- Select Rule hits by severity to view rules by the Total Risk they pose to your infrastructure (Critical, Important, Moderate, or Low).
or
- Select Rule hits by category to see the type of risk they pose to your infrastructure (Availability, Stability, Performance, or Security).
- Search for a specific rule by name, or scroll through the list of rules to see high-level information about risk, systems exposed, and availability of an Ansible Playbook to automate remediation.
- Click a rule to see a description of the rule, learn more from relevant knowledge base articles, and view a list of systems that are affected.
- Click a system to see specific information about detected issues and steps to resolve the issue.