4.4. Balancing individual instance workloads
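Use the Optimize service (watcher) and the workload balance strategy to move virtual machine instance workloads when the CPU or RAM utilization percentage of a physical host exceeds a specified threshold. After the migration, the workload of the host should be approximately equal to the average workload of all Compute nodes in your Red Hat OpenStack Services on OpenShift (RHOSO) cluster.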
| Goal | Strategy |
|---|---|
| workload_balancing | workload_balance |
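The threshold-and-average idea described above can be illustrated with a minimal sketch. It is not the algorithm that the Optimize service implements; the host names, the utilization figures, and the 50% threshold are all hypothetical values chosen for illustration, and the helper names `host_loads` and `flag_overloaded` are not part of any RHOSO tool.

```shell
#!/bin/sh
# Hypothetical per-host CPU utilization in percent (assumed data, not
# produced by any RHOSO command).
host_loads() {
  cat <<'EOF'
compute1 85
compute2 25
EOF
}

# flag_overloaded <threshold>
# Reads "<host> <percent>" lines on stdin, prints the cluster average,
# then prints every host whose utilization exceeds the threshold.
flag_overloaded() {
  awk -v threshold="$1" '
    NF >= 2 { load[$1] = $2; sum += $2; n++ }
    END {
      if (n == 0) exit 1
      printf "average %.1f\n", sum / n
      for (h in load)
        if (load[h] > threshold)
          printf "%s overloaded (%.1f > %s)\n", h, load[h], threshold
    }'
}

host_loads | flag_overloaded 50
```

With these assumed figures, only compute1 is above the threshold, so it would be the candidate source host; moving load off it brings it closer to the 55% cluster average.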
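Prerequisites
- You have a working RHOSO 18.0 environment with the Optimize service (watcher) running on it.
- Your RHOSO environment contains at least two Compute nodes, each running at least one instance.
- You have the oc command line tool installed on your workstation.
- You are logged on to a workstation that has access to the RHOSO control plane as a user with cluster-admin privileges.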
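The workstation prerequisites can be checked before you start. The following is a minimal sketch, not part of the documented procedure: `require_tools` is a hypothetical helper, and the permission check assumes that being allowed every verb on every resource is an acceptable stand-in for cluster-admin privileges in your cluster.

```shell
#!/bin/sh
# require_tools <command>...
# Prints "missing: <command>" for each command that is not on PATH.
# Returns 0 only if every command was found.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

if require_tools oc; then
  # `oc whoami` prints the current user and fails if you are not logged in.
  oc whoami || echo "not logged in to the cluster"
  # Asks whether the current user may perform any verb on any resource
  # in all namespaces (assumed here as a proxy for cluster-admin).
  oc auth can-i '*' '*' --all-namespaces
fi
```

If oc is not installed, the sketch only reports it as missing and skips the cluster checks.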
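Procedure
Access the remote shell for the OpenStackClient pod from your workstation:
    $ oc rsh -n openstack openstackclient
Verify that your RHOSO environment contains at least two Compute nodes, each running at least one instance: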
    $ openstack server list --long
Sample output
In this example, the test01 and test03 instances running on compute1 have a heavy CPU load. The aim is to use the Optimize service with the workload balance strategy to live migrate at least one instance to a Compute node with more capacity, without interrupting users:
    +--------+-------------------------------+
    | Name   | Host                          |
    +--------+-------------------------------+
    | test04 | compute2.ctlplane.localdomain |
    | test03 | compute1.ctlplane.localdomain |
    | test02 | compute2.ctlplane.localdomain |
    | test01 | compute1.ctlplane.localdomain |
    +--------+-------------------------------+
Create an audit template that uses the strategy workload_balance and the goal workload_balancing.
Example
In this example, the audit template is named WorkLoadBalance:
    $ openstack optimize audittemplate create -s workload_balance \
        WorkLoadBalance workload_balancing
Sample output
    +-------------+--------------------------------------+
    | Field       | Value                                |
    +-------------+--------------------------------------+
    | UUID        | 5990155a-e3f2-46f8-a81e-c87d0b2f09a2 |
    | Created At  | 2025-07-03T14:05:23.231411+00:00     |
    | Updated At  | None                                 |
    | Deleted At  | None                                 |
    | Description | None                                 |
    | Name        | WorkLoadBalance                      |
    | Goal        | workload_balancing                   |
    | Strategy    | workload_balance                     |
    | Audit Scope | []                                   |
    +-------------+--------------------------------------+
Confirm that the audit template was created:
    $ openstack optimize audittemplate list
Sample output
    +----------------------+------------------+
    | Goal                 | Strategy         |
    +----------------------+------------------+
    | workload_balancing   | workload_balance |
    +----------------------+------------------+
Run an audit based on the audit template that uses the strategy workload_balance and the goal workload_balancing. Update the strategy parameters with values that suit your environment.
Example
This audit uses the WorkLoadBalance audit template and sets the workload_balance strategy parameters to check CPU usage:
    $ openstack optimize audit create -a WorkLoadBalance -p granularity=30 \
        -p threshold=20 -p period=300 -p metrics=instance_cpu_usage
For more information about the parameters used by this strategy, see Workload balance migration strategy.
Sample output
    +---------------+------------------------------------------------------+
    | Field         | Value                                                |
    +---------------+------------------------------------------------------+
    | UUID          | ad815d54-5b7d-4562-aa12-17e1b64d0868                 |
    | Name          | workload_balance-2025-07-03T15:55:08.016161          |
    | Created At    | 2025-07-03T15:55:08.026706+00:00                     |
    | Updated At    | None                                                 |
    | Deleted At    | None                                                 |
    | State         | PENDING                                              |
    | Audit Type    | ONESHOT                                              |
    | Parameters    | {'granularity': 30, 'threshold': 20, 'period': 300,  |
    |               | 'metrics': 'instance_cpu_usage'}                     |
    | Interval      | None                                                 |
    | Goal          | workload_balancing                                   |
    | Strategy      | workload_balance                                     |
    | Audit Scope   | []                                                   |
    | Auto Trigger  | False                                                |
    | Next Run Time | None                                                 |
    | Hostname      | None                                                 |
    | Start Time    | None                                                 |
    | End Time      | None                                                 |
    | Force         | False                                                |
    +---------------+------------------------------------------------------+
Confirm that the Optimize service ran the audit:
    $ openstack optimize audit list
Sample output
If the value of the audit State is SUCCEEDED, the audit ran and created an action plan:
    +--------------------------------------+-----------+------------------+
    | UUID                                 | State     | Strategy         |
    +--------------------------------------+-----------+------------------+
    | ad815d54-5b7d-4562-aa12-17e1b64d0868 | SUCCEEDED | workload_balance |
    +--------------------------------------+-----------+------------------+
Check the action plan.
Example
    $ openstack optimize actionplan list \
        --audit ad815d54-5b7d-4562-aa12-17e1b64d0868
Sample output
In this example, the Global efficacy is Live_migrations_count: 25.00 %. This value indicates that if you execute the action plan, the Compute service live migrates 25% of the currently running instances, which is one of the four instances in this example:
    +---------------------------+-------------+----------------------------+
    | UUID                      | State       | Global efficacy            |
    +---------------------------+-------------+----------------------------+
    | f40dfa4e-1b96-4883-b85f-  | RECOMMENDED | Live_migrations_count:     |
    | 3bfa73554359              |             | 25.00 %                    |
    |                           |             |                            |
    +---------------------------+-------------+----------------------------+
List the actions contained in the action plan.
Example
    $ openstack optimize action list \
        --action-plan f40dfa4e-1b96-4883-b85f-3bfa73554359
Sample output
In this example, the action plan contains one action, migrate:
    +--------------------------------------+---------+---------+
    | UUID                                 | State   | Action  |
    +--------------------------------------+---------+---------+
    | 9a510bf9-ebac-450d-a4ea-a10b66d6d869 | PENDING | migrate |
    +--------------------------------------+---------+---------+
You can view more details about the action:
Example
    $ openstack optimize action show 9a510bf9-ebac-450d-a4ea-a10b66d6d869
Sample output
In this example, the action plan live migrates one of the instances with high CPU usage, test03, from compute1 to compute2, a Compute node whose instances use less CPU.
    +-------------+--------------------------------------------------------+
    | Field       | Value                                                  |
    +-------------+--------------------------------------------------------+
    | UUID        | 9a510bf9-ebac-450d-a4ea-a10b66d6d869                   |
    | Created At  | 2025-07-03T15:55:08+00:00                              |
    | Updated At  | None                                                   |
    | Deleted At  | None                                                   |
    | Parents     | []                                                     |
    | State       | PENDING                                                |
    | Action Plan | f40dfa4e-1b96-4883-b85f-3bfa73554359                   |
    | Action      | migrate                                                |
    | Parameters  | {'migration_type': 'live', 'source_node':              |
    |             | 'compute1.ctlplane.localdomain', 'destination_node':   |
    |             | 'compute2.ctlplane.localdomain', 'resource_name':      |
    |             | 'test03', 'resource_id':                               |
    |             | 'd6ae1c7c-8e69-45ae-92b3-6218b8c1570b'}                |
    | Description | Moving a VM instance from source_node to               |
    |             | destination_node                                       |
    +-------------+--------------------------------------------------------+
Execute the action plan.
Example
    $ openstack optimize actionplan start \
        f40dfa4e-1b96-4883-b85f-3bfa73554359
Sample output
    +---------------------+------------------------------------------------+
    | Field               | Value                                          |
    +---------------------+------------------------------------------------+
    | UUID                | f40dfa4e-1b96-4883-b85f-3bfa73554359           |
    | Created At          | 2025-07-03T15:55:08+00:00                      |
    | Updated At          | 2025-07-03T16:00:23+00:00                      |
    | Deleted At          | None                                           |
    | Audit               | f878bd64-96bc-4063-97a6-dc7500edfb55           |
    | Strategy            | workload_balance                               |
    | State               | PENDING                                        |
    | Efficacy indicators | [{'name': 'instance_migrations_count',         |
    |                     | 'description': 'The number of VM migrations to |
    |                     | be performed.', 'unit': None, 'value': 1.0},   |
    |                     | {'name': 'instances_count', 'description':     |
    |                     | 'The total number of audited instances in      |
    |                     | strategy.', 'unit': None, 'value': 4.0}]       |
    | Global efficacy     | [{'name': 'live_migrations_count',             |
    |                     | 'description': 'Ratio of migrated virtual      |
    |                     | machines to audited virtual machines', 'unit': |
    |                     | '%', 'value': 25.0}]                           |
    | Hostname            | None                                           |
    +---------------------+------------------------------------------------+
Confirm that the action succeeded.
Example
    $ openstack optimize action list \
        --action-plan f40dfa4e-1b96-4883-b85f-3bfa73554359
Sample output
    +------------------+---------+-----------+-------------------+---------+
    | UUID             | Parents | State     | Action Plan       | Action  |
    +------------------+---------+-----------+-------------------+---------+
    | 9a510bf9-ebac-   | []      | SUCCEEDED | f40dfa4e-1b96-    | migrate |
    | 450d-a4ea-       |         |           | 4883-b85f-        |         |
    | a10b66d6d869     |         |           | 3bfa73554359      |         |
    +------------------+---------+-----------+-------------------+---------+
Confirm that the instance with heavy CPU usage has migrated to a different Compute node:
    $ openstack server list --long
Sample output
In this example, the test03 instance is now running on a different node, compute2:
    +--------+-------------------------------+
    | Name   | Host                          |
    +--------+-------------------------------+
    | test04 | compute2.ctlplane.localdomain |
    | test03 | compute2.ctlplane.localdomain |
    | test02 | compute2.ctlplane.localdomain |
    | test01 | compute1.ctlplane.localdomain |
    +--------+-------------------------------+
Exit the openstackclient pod:
    $ exit
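If you repeat this procedure often, two small helpers may be convenient. The following is a sketch under stated assumptions, not part of the documented procedure: `instances_per_host`, `wait_for_state`, and `placement_after` are hypothetical names, and the placement data is copied from the final sample output above rather than read from a live environment.

```shell
#!/bin/sh
# instances_per_host
# Reads "<instance> <host>" lines on stdin and prints "<host> <count>"
# for each host, sorted by host name.
instances_per_host() {
  awk 'NF >= 2 { count[$2]++ } END { for (h in count) print h, count[h] }' | sort
}

# wait_for_state <wanted> <max_tries> <command>...
# Runs <command> up to <max_tries> times, one second apart, and returns 0
# as soon as its output equals <wanted>. Returns 1 if it never does.
wait_for_state() {
  wanted=$1
  tries=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    state=$("$@" 2>/dev/null) || state=""
    if [ "$state" = "$wanted" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Placement copied from the final sample output in this procedure.
placement_after() {
  cat <<'EOF'
test04 compute2.ctlplane.localdomain
test03 compute2.ctlplane.localdomain
test02 compute2.ctlplane.localdomain
test01 compute1.ctlplane.localdomain
EOF
}

placement_after | instances_per_host
# prints:
# compute1.ctlplane.localdomain 1
# compute2.ctlplane.localdomain 3
```

Inside the openstackclient pod, the same counting could be applied to live data by piping `openstack server list --long -f value -c Name -c Host` into `instances_per_host`, assuming the standard OpenStackClient output formatter options are available in your environment. Similarly, `wait_for_state SUCCEEDED 30` followed by a command that prints only the audit or action state could replace repeated manual listing; the exact command to use for that is left as an assumption to verify against your client version.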