3.7. 示例:监控 CPU 使用情况
若要监控实例的性能,请检查 Gnocchi 数据库,以确定您可以监控哪些指标,如内存或 CPU 使用量。
流程
要识别您可以监控的指标,使用实例 UUID 输入
openstack metric resource show
命令:$ openstack metric resource show --type instance 22592ae1-922a-4f51-b935-20c938f48753 +-----------------------+-------------------------------------------------------------------+ | Field | Value | +-----------------------+-------------------------------------------------------------------+ | availability_zone | nova | | created_at | 2021-09-16T16:16:24+00:00 | | created_by_project_id | 1adaed3aaa7f454c83307688c0825978 | | created_by_user_id | d8429405a2764c3bb5184d29bd32c46a | | creator | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 | | deleted_at | None | | display_name | foo-2 | | ended_at | None | | flavor_id | 0e5bae38-a949-4509-9868-82b353ef7ffb | | flavor_name | workload_flavor_0 | | host | compute-0.redhat.local | | id | 22592ae1-922a-4f51-b935-20c938f48753 | | image_ref | 3cde20b4-7620-49f3-8622-eeacbdc43d49 | | launched_at | 2021-09-16T16:17:03+00:00 | | metrics | cpu: a0375b0e-f799-47ea-b4ba-f494cf562ad8 | | | disk.ephemeral.size: cd082824-dfd6-49c3-afdf-6bfc8c12bd2a | | | disk.root.size: cd88dc61-ba85-45eb-a7b9-4686a6a0787b | | | memory.usage: 7a1e787c-5fa7-4ac3-a2c6-4c3821eaf80a | | | memory: ebd38ef7-cdc1-49f1-87c1-0b627d7c189e | | | vcpus: cc9353f1-bb24-4d37-ab8f-d3e887ca8856 | | original_resource_id | 22592ae1-922a-4f51-b935-20c938f48753 | | project_id | cdda46e0b5be4782bc0480dac280832a | | revision_end | None | | revision_start | 2021-09-16T17:00:41.227453+00:00 | | server_group | None | | started_at | 2021-09-16T16:17:08.444032+00:00 | | type | instance | | user_id | f00de1d74408428cadf483ea7dbb2a83 | +-----------------------+-------------------------------------------------------------------+
因此,metrics 值会列出您可以使用 Alarming 服务监控的组件,如
cpu
。要监控 CPU 用量,请使用
cpu
指标:$ openstack metric show --resource-id 22592ae1-922a-4f51-b935-20c938f48753 cpu +--------------------------------+-------------------------------------------------------------------+ | Field | Value | +--------------------------------+-------------------------------------------------------------------+ | archive_policy/name | ceilometer-high-rate | | creator | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 | | id | a0375b0e-f799-47ea-b4ba-f494cf562ad8 | | name | cpu | | resource/created_by_project_id | 1adaed3aaa7f454c83307688c0825978 | | resource/created_by_user_id | d8429405a2764c3bb5184d29bd32c46a | | resource/creator | d8429405a2764c3bb5184d29bd32c46a:1adaed3aaa7f454c83307688c0825978 | | resource/ended_at | None | | resource/id | 22592ae1-922a-4f51-b935-20c938f48753 | | resource/original_resource_id | 22592ae1-922a-4f51-b935-20c938f48753 | | resource/project_id | cdda46e0b5be4782bc0480dac280832a | | resource/revision_end | None | | resource/revision_start | 2021-09-16T17:00:41.227453+00:00 | | resource/started_at | 2021-09-16T16:17:08.444032+00:00 | | resource/type | instance | | resource/user_id | f00de1d74408428cadf483ea7dbb2a83 | | unit | ns | +--------------------------------+-------------------------------------------------------------------+
archive_policy 定义计算 std、count、min、max、sum 和 mean 值的聚合间隔。
检查
cpu
指标的当前选择归档策略:$ openstack metric archive-policy show ceilometer-high-rate +---------------------+-------------------------------------------------------------------+ | Field | Value | +---------------------+-------------------------------------------------------------------+ | aggregation_methods | rate:mean, mean | | back_window | 0 | | definition | - timespan: 1:00:00, granularity: 0:00:01, points: 3600 | | | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440 | | | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760 | | name | ceilometer-high-rate | +---------------------+-------------------------------------------------------------------+
使用 Alarming 服务创建查询
cpu
的监控任务。此任务会根据您指定的设置触发事件。例如,若要在实例负载超过 80% 时引发日志条目,请使用以下命令:$ openstack alarm create \ --project-id 3cee262b907b4040b26b678d7180566b \ --name high-cpu \ --type gnocchi_resources_threshold \ --description 'High CPU usage' \ --metric cpu \ --threshold 800,000,000.0 \ --comparison-operator ge \ --aggregation-method mean \ --granularity 300 \ --evaluation-periods 1 \ --alarm-action 'log://' \ --ok-action 'log://' \ --resource-type instance \ --resource-id 22592ae1-922a-4f51-b935-20c938f48753 +---------------------------+--------------------------------------+ | Field | Value | +---------------------------+--------------------------------------+ | aggregation_method | rate:mean | | alarm_actions | ['log:'] | | alarm_id | c7b326bd-a68c-4247-9d2b-56d9fb18bf38 | | comparison_operator | ge | | description | High CPU usage | | enabled | True | | evaluation_periods | 1 | | granularity | 300 | | insufficient_data_actions | [] | | metric | cpu | | name | high-cpu | | ok_actions | ['log:'] | | project_id | cdda46e0b5be4782bc0480dac280832a | | repeat_actions | False | | resource_id | 22592ae1-922a-4f51-b935-20c938f48753 | | resource_type | instance | | severity | low | | state | insufficient data | | state_reason | Not evaluated yet | | state_timestamp | 2021-09-21T08:02:57.090592 | | threshold | 800000000.0 | | time_constraints | [] | | timestamp | 2021-09-21T08:02:57.090592 | | type | gnocchi_resources_threshold | | user_id | f00de1d74408428cadf483ea7dbb2a83 | +---------------------------+--------------------------------------+
- comparison-operator:ge operator 定义当 CPU 使用量大于或等于 80% 时警报触发。
- granularity:指标关联有归档策略,策略可以具有各种粒度。例如,一个月内 1 小时进行聚合的 5 分钟聚合 + 1 小时聚合。granularity 值必须与归档策略中描述的持续时间匹配。
- 评估:在警报触发前需要经过的粒度周期数。例如,如果您将此值设置为 2,则 CPU 使用量必须在警报触发前有两个轮询期超过 80%。
[U'log://']:当您将
alarm_actions
或ok_actions
设置为[u'log://']
时,事件(例如,警报被触发)或返回到正常状态时,将记录到 aodh 日志文件。注意您可以定义在警报触发时运行的不同操作(alarm_actions),以及返回正常状态(ok_actions)时运行的不同操作,如 webhook URL。