Chapter 4. Capacity metering using the Telemetry service
The OpenStack Telemetry service provides usage metrics that you can use for billing, charge-back, and show-back purposes. Third-party applications can also use this metrics data to plan cluster capacity, and OpenStack Heat can use it to auto-scale virtual instances. For more information, see Auto Scaling for Instances.
You can use the combination of Ceilometer and Gnocchi for monitoring and alarms. This combination is supported on small clusters, with known limitations. For real-time monitoring, Red Hat OpenStack Platform ships with agents that provide metrics data, which separate monitoring infrastructure and applications can consume. For more information, see Monitoring Tools Configuration.
If your Intel hardware and libvirt version support Cache Monitoring Technology (CMT), you can use the cpu_l3_cache meter to monitor the amount of L3 cache used by an instance.
Monitoring the L3 cache requires the following:
cmt in the LibvirtEnabledPerfEvents parameter.
cpu_l3_cache in the gnocchi_resources.yaml file.
cpu_l3_cache in the Ceilometer polling.yaml file.
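One way to confirm the two file-based requirements above is a short shell check. This is a minimal sketch: the helper names has_meter and check_l3_cache_meter are hypothetical, and the file locations are passed in by the caller because they vary by deployment.

```shell
# has_meter FILE METER: succeed if METER appears as a whole word in FILE.
# (Hypothetical helper, not part of any OpenStack tool.)
has_meter() {
    grep -qwF -- "$2" "$1"
}

# check_l3_cache_meter FILE...: report, for each file given, whether it
# lists the cpu_l3_cache meter. No file paths are assumed here.
check_l3_cache_meter() {
    for f in "$@"; do
        if has_meter "$f" cpu_l3_cache; then
            echo "$f: cpu_l3_cache found"
        else
            echo "$f: cpu_l3_cache missing"
        fi
    done
}
```

For example, run check_l3_cache_meter with the paths of your gnocchi_resources.yaml and polling.yaml files as arguments.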
Enabling L3 cache monitoring
To enable L3 cache monitoring:
Create a YAML file for telemetry (for example, ceilometer-environment.yaml) and add cmt to the LibvirtEnabledPerfEvents parameter.
parameter_defaults:
  LibvirtEnabledPerfEvents: cmt
You can use Aodh to create an alarm that activates when a threshold value is reached. In this example, the alarm activates and adds a log entry when the average CPU utilization for an individual instance exceeds 80%. A query isolates the specific instance ID (94619081-abf5-4f1f-81c7-9cedaa872403) for monitoring purposes.
The following example demonstrates how to use an Aodh alarm to monitor the cumulative disk activity for all the instances contained within a particular project.
1. Review the existing projects, and select the appropriate UUID of the project you need to monitor. This example uses the admin project:
$ openstack project list
+----------------------------------+----------+
| ID                               | Name     |
+----------------------------------+----------+
| 745d33000ac74d30a77539f8920555e7 | admin    |
| 983739bb834a42ddb48124a38def8538 | services |
| be9e767afd4c4b7ead1417c6dfedde2b | demo     |
+----------------------------------+----------+
2. Use the project’s UUID to create an alarm that analyses the sum() of all read requests generated by the instances in the admin project (the query can be narrowed further with the --query parameter).
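A minimal sketch of such an alarm follows. It assumes the gnocchi_aggregation_by_resources_threshold alarm type and the disk.read.requests.rate metric. The function name, the alarm name, and the threshold argument are illustrative choices rather than values taken from this chapter.

```shell
# create_project_read_alarm PROJECT_ID THRESHOLD
# Creates an alarm on the summed read-request rate of every instance in
# the given project. The metric name and alarm name are assumptions.
create_project_read_alarm() {
    project_id=$1
    threshold=$2
    openstack alarm create \
        --type gnocchi_aggregation_by_resources_threshold \
        --name iops-monitor-read-requests \
        --metric disk.read.requests.rate \
        --threshold "$threshold" \
        --aggregation-method sum \
        --resource-type instance \
        --query "{\"=\": {\"project_id\": \"$project_id\"}}"
}
```

For the admin project listed above, you might call create_project_read_alarm 745d33000ac74d30a77539f8920555e7 followed by a threshold that suits your workload.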
If you want to monitor an instance’s performance, you would start by examining the gnocchi database to identify which metrics you can monitor, such as memory or CPU usage. For example, run gnocchi resource show against an instance to identify which metrics can be monitored:
Query the available metrics for a particular instance UUID:
gnocchi resource show --type instance d71cdf9a-51dc-4bba-8170-9cd95edd3f66
archive_policy - Defines the aggregation interval for calculating the std, count, min, max, sum, mean values.
Use Aodh to create a monitoring task that queries cpu_util. This task will trigger events based on the settings you specify. For example, to raise a log entry when an instance’s CPU spikes over 80% for an extended duration:
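A minimal sketch of such an alarm follows. It assumes a 300-second granularity that matches the metric's archive policy and a single evaluation period. The function name, alarm name, and description are illustrative.

```shell
# create_cpu_alarm INSTANCE_ID
# Logs an entry when the mean cpu_util of one instance is greater than
# or equal to 80%. Granularity and evaluation periods are assumptions;
# adjust them to the archive policy of your metric.
create_cpu_alarm() {
    aodh alarm create \
        --name cpu_hi \
        --type gnocchi_resources_threshold \
        --description 'instance running hot' \
        --metric cpu_util \
        --threshold 80.0 \
        --comparison-operator ge \
        --aggregation-method mean \
        --granularity 300 \
        --evaluation-periods 1 \
        --alarm-action 'log://' \
        --resource-id "$1" \
        --resource-type instance
}
```

For the instance queried earlier, the call would be create_cpu_alarm d71cdf9a-51dc-4bba-8170-9cd95edd3f66.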
comparison-operator - The ge operator defines that the alarm will trigger if the CPU usage is greater than (or equal to) 80%.
granularity - Metrics have an archive policy associated with them; the policy can have various granularities (for example, 5 minutes aggregation for 1 hour + 1 hour aggregation over a month). The granularity value must match the duration described in the archive policy.
evaluation-periods - Number of granularity periods that need to pass before the alarm will trigger. For example, setting this value to 2 will mean that the CPU usage will need to be over 80% for two polling periods before the alarm will trigger.
[u'log://'] - This value will log events to your Aodh log file.
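The granularity and evaluation-periods values together determine how long the threshold must be breached before the alarm triggers. As a rough illustration, assuming a 300-second granularity and two evaluation periods (the helper name is hypothetical):

```shell
# alarm_window_seconds GRANULARITY EVALUATION_PERIODS
# Returns the total time, in seconds, that the metric must stay past the
# threshold before the alarm triggers.
alarm_window_seconds() {
    echo $(( $1 * $2 ))
}

alarm_window_seconds 300 2   # prints 600
```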
Note
You can define different actions to run when an alarm is triggered (alarm_actions), and when it returns to a normal state (ok_actions), such as a webhook URL.
To check if your alarm has been triggered, query the alarm’s history:
aodh alarm-history show 1625015c-49b8-4e3f-9427-3c312a8615dd --fit-width
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp                  | type             | detail                                                                                                                | event_id                             |
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-11-16T05:21:47.850094 | state transition | {"transition_reason": "Transition to ok due to 1 samples inside threshold, most recent: 0.0366665763", "state": "ok"} | 3b51f09d-ded1-4807-b6bb-65fdc87669e4 |
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------+--------------------------------------+
Telemetry resource types that were previously hardcoded can now be managed by the gnocchi client. You can use the gnocchi client to create, view, and delete resource types, and you can use the gnocchi API to update or delete attributes.
1. Create a new resource-type:
$ gnocchi resource-type create testResource01 -a bla:string:True:min_length=123
+----------------+------------------------------------------------------------+
| Field          | Value                                                      |
+----------------+------------------------------------------------------------+
| attributes/bla | max_length=255, min_length=123, required=True, type=string |
| name           | testResource01                                             |
| state          | active                                                     |
+----------------+------------------------------------------------------------+
2. Review the configuration of the resource-type:
$ gnocchi resource-type show testResource01
+----------------+------------------------------------------------------------+
| Field          | Value                                                      |
+----------------+------------------------------------------------------------+
| attributes/bla | max_length=255, min_length=123, required=True, type=string |
| name           | testResource01                                             |
| state          | active                                                     |
+----------------+------------------------------------------------------------+
3. Delete the resource-type:
$ gnocchi resource-type delete testResource01
Note
You cannot delete a resource type if a resource is using it.
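Because of this restriction, it can help to check for existing resources before attempting the deletion. The following is a minimal sketch, assuming the gnocchi client's resource list command with its --type filter and the standard -f value -c id output options. The function name is hypothetical.

```shell
# delete_resource_type_if_unused NAME
# Deletes the resource type only when no resource of that type exists.
delete_resource_type_if_unused() {
    rtype=$1
    # -f value -c id asks the client for bare resource IDs, one per line.
    in_use=$(gnocchi resource list --type "$rtype" -f value -c id) || return 1
    if [ -n "$in_use" ]; then
        echo "resource type $rtype is still in use; not deleting" >&2
        return 1
    fi
    gnocchi resource-type delete "$rtype"
}
```

For the example above, the call would be delete_resource_type_if_unused testResource01.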