Chapter 3. Managing alarms

You can use the Telemetry Alarming service (aodh) to trigger actions based on defined rules against metric or event data collected by Ceilometer or Gnocchi.

Alarms can be in one of the following states:

ok: The metric or event is in an acceptable state.
firing: The metric or event is outside of the defined ok state. The Alarming service reports this state as alarm in the state field.
insufficient data: The alarm state is unknown, for example because there is no data for the requested granularity or because the check has not been executed yet.

3.1. Viewing existing alarms

You can view existing Telemetry alarm information and list the meters assigned to a resource to check the current state of the metrics.

Procedure

List the existing Telemetry alarms:

# openstack alarm list
+--------------------------------------+--------------------------------------------+----------------------------+-------------------+----------+---------+
| alarm_id                             | type                                       | name                       | state             | severity | enabled |
+--------------------------------------+--------------------------------------------+----------------------------+-------------------+----------+---------+
| 922f899c-27c8-4c7d-a2cf-107be51ca90a | gnocchi_aggregation_by_resources_threshold | iops-monitor-read-requests | insufficient data | low      | True    |
+--------------------------------------+--------------------------------------------+----------------------------+-------------------+----------+---------+

To list the meters assigned to a resource, specify the UUID of the resource. For example:

# openstack metric resource show 22592ae1-922a-4f51-b935-20c938f48753
+-----------------------+-------------------------------------------------------------------+
| Field                 | Value                                                             |
+-----------------------+-------------------------------------------------------------------+
| created_by_project_id | 1adaed3aaa7f454c83307688c0825978                                  |
| created_by_user_id    | d8429405a2764c3bb5184d29bd32c46a                                  |
| creator               | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 |
| ended_at              | None                                                              |
| id                    | 22592ae1-922a-4f51-b935-20c938f48753                              |
| metrics               | cpu: a0375b0e-f799-47ea-b4ba-f494cf562ad8                         |
|                       | disk.ephemeral.size: cd082824-dfd6-49c3-afdf-6bfc8c12bd2a         |
|                       | disk.root.size: cd88dc61-ba85-45eb-a7b9-4686a6a0787b              |
|                       | memory.usage: 7a1e787c-5fa7-4ac3-a2c6-4c3821eaf80a                |
|                       | memory: ebd38ef7-cdc1-49f1-87c1-0b627d7c189e                      |
|                       | vcpus: cc9353f1-bb24-4d37-ab8f-d3e887ca8856                       |
| original_resource_id  | 22592ae1-922a-4f51-b935-20c938f48753                              |
| project_id            | cdda46e0b5be4782bc0480dac280832a                                  |
| revision_end          | None                                                              |
| revision_start        | 2021-09-16T17:00:41.227453+00:00                                  |
| started_at            | 2021-09-16T16:17:08.444032+00:00                                  |
| type                  | instance                                                          |
| user_id               | f00de1d74408428cadf483ea7dbb2a83                                  |
+-----------------------+-------------------------------------------------------------------+

3.2. Creating an alarm

Use the Telemetry Alarming service (aodh) to create an alarm that triggers when a particular condition is met, for example, when a threshold value is reached. In this example, the alarm activates and adds a log entry when the average CPU utilization for an individual instance exceeds 80%.

Procedure

Archive policies are pre-populated during the deployment process, and you rarely need to create a new one. However, if no archive policy is configured, you must create one. To create an archive policy that stores 86400 points at a granularity of 5 seconds (5 days of data), use the following command:
```
# openstack metric archive-policy create <name> \
       -d granularity:5s,points:86400 \
       -b 3 -m mean -m rate:mean
```
Replace <name> with the name of the archive policy.
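The retention that an archive policy definition covers follows from its two values: granularity multiplied by points. The following is a minimal sketch that checks the 5-day figure quoted above with shell arithmetic. The variable names are illustrative only and are not part of any OpenStack tool.

```shell
# Retention covered by one archive policy definition:
# retention (seconds) = granularity (seconds) x points
granularity_s=5
points=86400

retention_s=$(( granularity_s * points ))
retention_days=$(( retention_s / 86400 ))   # 86400 seconds per day

echo "retention: ${retention_s}s (${retention_days} days)"
```

The same arithmetic applies to any other granularity and points pair that you plan to pass to the -d option.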
Note
Set the evaluation period for the Telemetry Alarming service to an integer greater than 60. The Ceilometer polling interval is linked to the evaluation period: set the polling interval to a number between 60 and 600 that is greater than the evaluation period. A polling interval that is too low can severely impact system load.

Create an alarm and use a query to isolate the specific ID of the instance for monitoring purposes. The ID of the instance in the following example is 94619081-abf5-4f1f-81c7-9cedaa872403.

Note

To calculate the threshold value, use the following formula: 1,000,000,000 x {granularity} x {percentage_in_decimal}

# openstack alarm create \
--type gnocchi_aggregation_by_resources_threshold \
--name cpu_usage_high \
--granularity 5 \
--metric cpu \
--threshold 4000000000 \
--aggregation-method rate:mean \
--resource-type instance \
--query '{"=": {"id": "94619081-abf5-4f1f-81c7-9cedaa872403"}}' --alarm-action 'log://'
+---------------------------+-------------------------------------------------------+
| Field                     | Value                                                 |
+---------------------------+-------------------------------------------------------+
| aggregation_method        | rate:mean                                             |
| alarm_actions             | [u'log://']                                           |
| alarm_id                  | b794adc7-ed4f-4edb-ace4-88cbe4674a94                  |
| comparison_operator       | eq                                                    |
| description               | gnocchi_aggregation_by_resources_threshold alarm rule |
| enabled                   | True                                                  |
| evaluation_periods        | 1                                                     |
| granularity               | 5                                                     |
| insufficient_data_actions | []                                                    |
| metric                    | cpu                                                   |
| name                      | cpu_usage_high                                        |
| ok_actions                | []                                                    |
| project_id                | 13c52c41e0e543d9841a3e761f981c20                      |
| query                     | {"=": {"id": "94619081-abf5-4f1f-81c7-9cedaa872403"}} |
| repeat_actions            | False                                                 |
| resource_type             | instance                                              |
| severity                  | low                                                   |
| state                     | insufficient data                                     |
| state_timestamp           | 2016-12-09T05:18:53.326000                            |
| threshold                 | 4000000000.0                                          |
| time_constraints          | []                                                    |
| timestamp                 | 2016-12-09T05:18:53.326000                            |
| type                      | gnocchi_aggregation_by_resources_threshold            |
| user_id                   | 32d3f2c9a234423cb52fb69d3741dbbc                      |
+---------------------------+-------------------------------------------------------+
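The threshold formula in the note above can be checked with shell arithmetic before you pass a value to --threshold. This is a sketch only: cpu_threshold is a hypothetical helper name, and because shell arithmetic is integer-only, the percentage is given as a whole number and divided by 100 instead of being written as a decimal.

```shell
# threshold = 1,000,000,000 x granularity x percentage_in_decimal
# $1 = granularity in seconds, $2 = percentage as a whole number (80 for 80%)
cpu_threshold() {
    echo $(( 1000000000 * $1 * $2 / 100 ))
}

threshold_5s=$(cpu_threshold 5 80)
threshold_60s=$(cpu_threshold 60 80)

echo "80% at 5s granularity:  ${threshold_5s}"
echo "80% at 60s granularity: ${threshold_60s}"
```

The result scales linearly with the granularity, so a threshold calculated for one granularity does not represent the same percentage at another.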

3.3. Editing an alarm

When you edit an alarm, you can increase or decrease its threshold value.

Procedure

To update the threshold value, use the openstack alarm update command. For example, to set the threshold value of the cpu_usage_high alarm to 75, use the following command:
```
# openstack alarm update --name cpu_usage_high --threshold 75
```

3.4. Disabling an alarm

You can disable and enable alarms.

Procedure

Disable the alarm:

# openstack alarm update --name cpu_usage_high --enabled=false

3.5. Deleting an alarm

Use the openstack alarm delete command to delete an alarm.

Procedure

To delete an alarm, enter the following command:
```
# openstack alarm delete --name cpu_usage_high
```

3.6. Example: Monitoring the disk activity of instances

This example demonstrates how to use an alarm that is part of the Telemetry Alarming service to monitor the cumulative disk activity for all the instances contained within a particular project.

Procedure

Review the existing projects and select the appropriate UUID of the project that you want to monitor. This example uses the admin tenant:

$ openstack project list
+----------------------------------+----------+
| ID                               | Name     |
+----------------------------------+----------+
| 745d33000ac74d30a77539f8920555e7 | admin    |
| 983739bb834a42ddb48124a38def8538 | services |
| be9e767afd4c4b7ead1417c6dfedde2b | demo     |
+----------------------------------+----------+

Use the project UUID to create an alarm that analyzes the sum of all read requests generated by the instances in the admin tenant. You can further restrict the query by using the --query parameter:

# openstack alarm create \
--type gnocchi_aggregation_by_resources_threshold \
--name iops-monitor-read-requests \
--metric disk.read.requests.rate \
--threshold 42000 \
--aggregation-method sum \
--resource-type instance \
--query '{"=": {"project_id": "745d33000ac74d30a77539f8920555e7"}}'
+---------------------------+-----------------------------------------------------------+
| Field                     | Value                                                     |
+---------------------------+-----------------------------------------------------------+
| aggregation_method        | sum                                                       |
| alarm_actions             | []                                                        |
| alarm_id                  | 192aba27-d823-4ede-a404-7f6b3cc12469                      |
| comparison_operator       | eq                                                        |
| description               | gnocchi_aggregation_by_resources_threshold alarm rule     |
| enabled                   | True                                                      |
| evaluation_periods        | 1                                                         |
| granularity               | 60                                                        |
| insufficient_data_actions | []                                                        |
| metric                    | disk.read.requests.rate                                   |
| name                      | iops-monitor-read-requests                                |
| ok_actions                | []                                                        |
| project_id                | 745d33000ac74d30a77539f8920555e7                          |
| query                     | {"=": {"project_id": "745d33000ac74d30a77539f8920555e7"}} |
| repeat_actions            | False                                                     |
| resource_type             | instance                                                  |
| severity                  | low                                                       |
| state                     | insufficient data                                         |
| state_timestamp           | 2016-11-08T23:41:22.919000                                |
| threshold                 | 42000.0                                                   |
| time_constraints          | []                                                        |
| timestamp                 | 2016-11-08T23:41:22.919000                                |
| type                      | gnocchi_aggregation_by_resources_threshold                |
| user_id                   | 8c4aea738d774967b4ef388eb41fef5e                          |
+---------------------------+-----------------------------------------------------------+
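The --query value used above is a JSON filter that contains the project UUID. When the UUID comes from a variable, quoting the JSON by hand is error-prone, so the following sketch builds the string with printf. The variable names project_id and query are illustrative, and the UUID is the admin project from the example listing.

```shell
# Build the JSON filter for --query from a project UUID held in a variable.
project_id="745d33000ac74d30a77539f8920555e7"

# printf substitutes the UUID into the filter; the single quotes keep the
# JSON double quotes literal.
query=$(printf '{"=": {"project_id": "%s"}}' "$project_id")

echo "$query"
```

You can then pass the result to the alarm command as --query "$query", with double quotes around the variable so that the shell passes the filter as a single argument.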

3.7. Example: Monitoring CPU use

To monitor the performance of an instance, examine the Gnocchi database to identify which metrics you can monitor, such as memory or CPU usage.

Procedure

To identify the metrics you can monitor, enter the openstack metric resource show command with an instance UUID:

$ openstack metric resource show --type instance 22592ae1-922a-4f51-b935-20c938f48753

+-----------------------+-------------------------------------------------------------------+
| Field                 | Value                                                             |
+-----------------------+-------------------------------------------------------------------+
| availability_zone     | nova                                                              |
| created_at            | 2021-09-16T16:16:24+00:00                                         |
| created_by_project_id | 1adaed3aaa7f454c83307688c0825978                                  |
| created_by_user_id    | d8429405a2764c3bb5184d29bd32c46a                                  |
| creator               | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 |
| deleted_at            | None                                                              |
| display_name          | foo-2                                                             |
| ended_at              | None                                                              |
| flavor_id             | 0e5bae38-a949-4509-9868-82b353ef7ffb                              |
| flavor_name           | workload_flavor_0                                                 |
| host                  | compute-0.redhat.local                                            |
| id                    | 22592ae1-922a-4f51-b935-20c938f48753                              |
| image_ref             | 3cde20b4-7620-49f3-8622-eeacbdc43d49                              |
| launched_at           | 2021-09-16T16:17:03+00:00                                         |
| metrics               | cpu: a0375b0e-f799-47ea-b4ba-f494cf562ad8                         |
|                       | disk.ephemeral.size: cd082824-dfd6-49c3-afdf-6bfc8c12bd2a         |
|                       | disk.root.size: cd88dc61-ba85-45eb-a7b9-4686a6a0787b              |
|                       | memory.usage: 7a1e787c-5fa7-4ac3-a2c6-4c3821eaf80a                |
|                       | memory: ebd38ef7-cdc1-49f1-87c1-0b627d7c189e                      |
|                       | vcpus: cc9353f1-bb24-4d37-ab8f-d3e887ca8856                       |
| original_resource_id  | 22592ae1-922a-4f51-b935-20c938f48753                              |
| project_id            | cdda46e0b5be4782bc0480dac280832a                                  |
| revision_end          | None                                                              |
| revision_start        | 2021-09-16T17:00:41.227453+00:00                                  |
| server_group          | None                                                              |
| started_at            | 2021-09-16T16:17:08.444032+00:00                                  |
| type                  | instance                                                          |
| user_id               | f00de1d74408428cadf483ea7dbb2a83                                  |
+-----------------------+-------------------------------------------------------------------+

In this result, the metrics value lists the components you can monitor with the Alarming service, for example cpu.

To monitor CPU usage, use the cpu metric:

$ openstack metric show --resource-id 22592ae1-922a-4f51-b935-20c938f48753 cpu
+--------------------------------+-------------------------------------------------------------------+
| Field                          | Value                                                             |
+--------------------------------+-------------------------------------------------------------------+
| archive_policy/name            | ceilometer-high-rate                                              |
| creator                        | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 |
| id                             | a0375b0e-f799-47ea-b4ba-f494cf562ad8                              |
| name                           | cpu                                                               |
| resource/created_by_project_id | 1adaed3aaa7f454c83307688c0825978                                  |
| resource/created_by_user_id    | d8429405a2764c3bb5184d29bd32c46a                                  |
| resource/creator               | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 |
| resource/ended_at              | None                                                              |
| resource/id                    | 22592ae1-922a-4f51-b935-20c938f48753                              |
| resource/original_resource_id  | 22592ae1-922a-4f51-b935-20c938f48753                              |
| resource/project_id            | cdda46e0b5be4782bc0480dac280832a                                  |
| resource/revision_end          | None                                                              |
| resource/revision_start        | 2021-09-16T17:00:41.227453+00:00                                  |
| resource/started_at            | 2021-09-16T16:17:08.444032+00:00                                  |
| resource/type                  | instance                                                          |
| resource/user_id               | f00de1d74408428cadf483ea7dbb2a83                                  |
| unit                           | ns                                                                |
+--------------------------------+-------------------------------------------------------------------+

The archive_policy/name field identifies the archive policy of the metric. The archive policy defines the aggregation intervals that are used to calculate values such as std, count, min, max, sum, and mean.

Inspect the currently selected archive policy for the cpu metric:

$ openstack metric archive-policy show ceilometer-high-rate

+---------------------+-------------------------------------------------------------------+
| Field               | Value                                                             |
+---------------------+-------------------------------------------------------------------+
| aggregation_methods | rate:mean, mean                                                   |
| back_window         | 0                                                                 |
| definition          | - timespan: 1:00:00, granularity: 0:00:01, points: 3600           |
|                     | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440    |
|                     | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 |
| name                | ceilometer-high-rate                                              |
+---------------------+-------------------------------------------------------------------+
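Each line of the definition field relates its three values by points = timespan / granularity. The following sketch recomputes the points column of the output above from the timespan and granularity columns, all expressed in seconds. The points helper is a hypothetical name used only for this illustration.

```shell
# points = timespan / granularity (both in seconds)
points() {
    echo $(( $1 / $2 ))
}

hour=3600
day=86400

p_1s=$(points "$hour" 1)                  # 1 hour at 1-second granularity
p_1m=$(points "$day" 60)                  # 1 day at 1-minute granularity
p_1h=$(points $(( 365 * day )) "$hour")   # 365 days at 1-hour granularity

echo "$p_1s $p_1m $p_1h"
```

The granularities in these definitions (1, 60, and 3600 seconds) are the values to choose from when you set --granularity for an alarm on this metric.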

Use the Alarming service to create a monitoring task that queries cpu. This task triggers events based on the settings that you specify. For example, to raise a log entry when the CPU of an instance spikes over 80% for an extended duration, use the following command:

$ openstack alarm create \
  --project-id 3cee262b907b4040b26b678d7180566b \
  --name high-cpu \
  --type gnocchi_resources_threshold \
  --description 'High CPU usage' \
  --metric cpu \
  --threshold 800000000.0 \
  --comparison-operator ge \
  --aggregation-method rate:mean \
  --granularity 300 \
  --evaluation-periods 1 \
  --alarm-action 'log://' \
  --ok-action 'log://' \
  --resource-type instance \
  --resource-id 22592ae1-922a-4f51-b935-20c938f48753

  +---------------------------+--------------------------------------+
  | Field                     | Value                                |
  +---------------------------+--------------------------------------+
  | aggregation_method        | rate:mean                            |
  | alarm_actions             | ['log:']                             |
  | alarm_id                  | c7b326bd-a68c-4247-9d2b-56d9fb18bf38 |
  | comparison_operator       | ge                                   |
  | description               | High CPU usage                       |
  | enabled                   | True                                 |
  | evaluation_periods        | 1                                    |
  | granularity               | 300                                  |
  | insufficient_data_actions | []                                   |
  | metric                    | cpu                                  |
  | name                      | high-cpu                             |
  | ok_actions                | ['log:']                             |
  | project_id                | cdda46e0b5be4782bc0480dac280832a     |
  | repeat_actions            | False                                |
  | resource_id               | 22592ae1-922a-4f51-b935-20c938f48753 |
  | resource_type             | instance                             |
  | severity                  | low                                  |
  | state                     | insufficient data                    |
  | state_reason              | Not evaluated yet                    |
  | state_timestamp           | 2021-09-21T08:02:57.090592           |
  | threshold                 | 800000000.0                          |
  | time_constraints          | []                                   |
  | timestamp                 | 2021-09-21T08:02:57.090592           |
  | type                      | gnocchi_resources_threshold          |
  | user_id                   | f00de1d74408428cadf483ea7dbb2a83     |
  +---------------------------+--------------------------------------+

comparison-operator: The ge operator specifies that the alarm triggers if the CPU usage is greater than or equal to 80%.
granularity: Each metric has an associated archive policy, and a policy can define several granularities, for example, 5-minute aggregation for 1 hour plus 1-hour aggregation over a month. The granularity value must match one of the granularities defined in the archive policy.
evaluation-periods: The number of granularity periods that must pass before the alarm triggers. For example, if you set this value to 2, the CPU usage must be over 80% for two polling periods before the alarm triggers.
[u'log://']: When you set alarm_actions or ok_actions to [u'log://'], events such as the alarm triggering or returning to a normal state are recorded in the aodh log file.
Note
You can define different actions, such as a webhook URL, to run when an alarm is triggered (alarm_actions) and when it returns to a normal state (ok_actions).
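As the evaluation-periods description above implies, the time that a condition must hold before the alarm triggers is the granularity multiplied by the number of evaluation periods. The following sketch works this out for the granularity of 300 seconds used in the command and the evaluation-periods value of 2 from the description. The variable names are illustrative.

```shell
# Time the condition must hold = granularity x evaluation periods
granularity_s=300
evaluation_periods=2

window_s=$(( granularity_s * evaluation_periods ))

echo "condition must hold for ${window_s}s ($(( window_s / 60 )) minutes)"
```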

3.8. Viewing alarm history

To check if a particular alarm has been triggered, you can query the alarm history and view the event information.

Procedure

Use the openstack alarm-history show command:

$ openstack alarm-history show 1625015c-49b8-4e3f-9427-3c312a8615dd --fit-width
+----------------------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp                  | type             | detail                                                                                                                                            | event_id                             |
+----------------------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-11-16T05:21:47.850094 | state transition | {"transition_reason": "Transition to ok due to 1 samples inside threshold, most recent: 0.0366665763", "state": "ok"}                             | 3b51f09d-ded1-4807-b6bb-65fdc87669e4 |
+----------------------------+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
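The detail column holds a JSON string, so you can extract the new state and the transition reason from it in a script. The following is a rough sketch that uses sed on the detail value from the output above. It assumes the flat, single-line layout shown there, and a real JSON parser is more robust if one is available.

```shell
# The detail value copied from the alarm history output above.
detail='{"transition_reason": "Transition to ok due to 1 samples inside threshold, most recent: 0.0366665763", "state": "ok"}'

# Capture the quoted value that follows each key.
state=$(printf '%s\n' "$detail" | sed -n 's/.*"state": "\([^"]*\)".*/\1/p')
reason=$(printf '%s\n' "$detail" | sed -n 's/.*"transition_reason": "\([^"]*\)".*/\1/p')

echo "state:  $state"
echo "reason: $reason"
```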
