Questo contenuto non è disponibile nella lingua selezionata.

Chapter 4. Testing and troubleshooting autoscaling

Use the Orchestration service (heat) to automatically scale instances up and down based on threshold definitions. To troubleshoot your environment, you can look for errors in the log files and history records.

4.1. Testing automatic scaling up of instances
Copia collegamento

You can use the Orchestration service (heat) to scale instances automatically based on the cpu_alarm_high threshold definition. When the CPU use reaches a value defined in the threshold parameter, another instance starts up to balance the load. The threshold value in the template.yaml file is set to 80%.

Procedure

Log in to the host environment as the stack user.
For standalone environments set the OS_CLOUD environment variable:
```
export OS_CLOUD=standalone
```
```
[stack@standalone ~]$ export OS_CLOUD=standalone
```
Copy to Clipboard Toggle word wrap
For director environments source the overcloudrc file:
```
source ~/overcloudrc
```
```
[stack@undercloud ~]$ source ~/overcloudrc
```
Copy to Clipboard Toggle word wrap

ssh -i ~/mykey.pem cirros@192.168.122.8

$ ssh -i ~/mykey.pem cirros@192.168.122.8

Copy to Clipboard

Toggle word wrap

Run multiple dd commands to generate the load:

sudo dd if=/dev/zero of=/dev/null &
sudo dd if=/dev/zero of=/dev/null &
sudo dd if=/dev/zero of=/dev/null &

[instance ~]$ sudo dd if=/dev/zero of=/dev/null &
[instance ~]$ sudo dd if=/dev/zero of=/dev/null &
[instance ~]$ sudo dd if=/dev/zero of=/dev/null &

Copy to Clipboard

Toggle word wrap

Exit from the running instance and return to the host.

After you run the dd commands, you can expect to have 100% CPU use in the instance. Verify that the alarm has been triggered:

openstack alarm list

$ openstack alarm list
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+
| alarm_id                             | type                                       | name                                | state | severity | enabled |
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+
| 022f707d-46cc-4d39-a0b2-afd2fc7ab86a | gnocchi_aggregation_by_resources_threshold | example-cpu_alarm_high-odj77qpbld7j | alarm | low      | True    |
| 46ed2c50-e05a-44d8-b6f6-f1ebd83af913 | gnocchi_aggregation_by_resources_threshold | example-cpu_alarm_low-m37jvnm56x2t  | ok    | low      | True    |
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+

Copy to Clipboard

Toggle word wrap

After approximately 60 seconds, Orchestration starts another instance and adds it to the group. To verify that an instance has been created, enter the following command:

openstack server list

$ openstack server list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                              |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+
| 477ee1af-096c-477c-9a3f-b95b0e2d4ab5 | ex-3gax-4urpikl5koff-yrxk3zxzfmpf-server-2hde4tp4trnk | ACTIVE | -          | Running     | internal1=10.10.10.13, 192.168.122.17 |
| e1524f65-5be6-49e4-8501-e5e5d812c612 | ex-3gax-5f3a4og5cwn2-png47w3u2vjd-server-vaajhuv4mj3j | ACTIVE | -          | Running     | internal1=10.10.10.9, 192.168.122.8   |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+

Copy to Clipboard

Toggle word wrap

After another short period of time, observe that the Orchestration service has autoscaled to three instances. The configuration is set to a maximum of three instances. Verify there are three instances:

openstack server list

$ openstack server list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks                              |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+
| 477ee1af-096c-477c-9a3f-b95b0e2d4ab5 | ex-3gax-4urpikl5koff-yrxk3zxzfmpf-server-2hde4tp4trnk | ACTIVE | -          | Running     | internal1=10.10.10.13, 192.168.122.17 |
| e1524f65-5be6-49e4-8501-e5e5d812c612 | ex-3gax-5f3a4og5cwn2-png47w3u2vjd-server-vaajhuv4mj3j | ACTIVE | -          | Running     | internal1=10.10.10.9, 192.168.122.8   |
| 6c88179e-c368-453d-a01a-555eae8cd77a | ex-3gax-fvxz3tr63j4o-36fhftuja3bw-server-rhl4sqkjuy5p | ACTIVE | -          | Running     | internal1=10.10.10.5, 192.168.122.5   |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+---------------------------------------+

Copy to Clipboard

Toggle word wrap

4.2. Testing automatic scaling down of instances
Copia collegamento

You can use the Orchestration service (heat) to automatically scale down instances based on the cpu_alarm_low threshold. In this example, the instances are scaled down when CPU use is below 5%.

Procedure

From within the workload instance, terminate the running dd processes and observe Orchestration begin to scale the instances back down.
```
killall dd
```
```
$ killall dd
```
Copy to Clipboard Toggle word wrap
Log in to the host environment as the stack user.
For standalone environments set the OS_CLOUD environment variable:
```
export OS_CLOUD=standalone
```
```
[stack@standalone ~]$ export OS_CLOUD=standalone
```
Copy to Clipboard Toggle word wrap
For director environments source the overcloudrc file:
```
source ~/overcloudrc
```
```
[stack@undercloud ~]$ source ~/overcloudrc
```
Copy to Clipboard Toggle word wrap

When you stop the dd processes, this triggers the cpu_alarm_low event alarm. As a result, Orchestration begins to automatically scale down and remove the instances. Verify that the corresponding alarm has triggered:

openstack alarm list

$ openstack alarm list
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+
| alarm_id                             | type                                       | name                                | state | severity | enabled |
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+
| 022f707d-46cc-4d39-a0b2-afd2fc7ab86a | gnocchi_aggregation_by_resources_threshold | example-cpu_alarm_high-odj77qpbld7j | ok    | low      | True    |
| 46ed2c50-e05a-44d8-b6f6-f1ebd83af913 | gnocchi_aggregation_by_resources_threshold | example-cpu_alarm_low-m37jvnm56x2t  | alarm | low      | True    |
+--------------------------------------+--------------------------------------------+-------------------------------------+-------+----------+---------+

Copy to Clipboard

Toggle word wrap

After a few minutes, Orchestration continually reduce the number of instances to the minimum value defined in the min_size parameter of the scaleup_group definition. In this scenario, the min_size parameter is set to 1.

4.3. Troubleshooting for autoscaling
Copia collegamento

If your environment is not working properly, you can look for errors in the log files and history records.

Procedure

Log in to the host environment as the stack user.
For standalone environments set the OS_CLOUD environment variable:
```
export OS_CLOUD=standalone
```
```
[stack@standalone ~]$ export OS_CLOUD=standalone
```
Copy to Clipboard Toggle word wrap
For director environments source the overcloudrc file:
```
source ~/overcloudrc
```
```
[stack@undercloud ~]$ source ~/overcloudrc
```
Copy to Clipboard Toggle word wrap

You must first launch the ephemeral Heat process to use the openstack stack commands:

openstack tripleo launch heat --heat-dir /home/stack/overcloud-deploy/overcloud/heat-launcher --restore-db
export OS_CLOUD=heat

(undercloud)$ openstack tripleo launch heat --heat-dir /home/stack/overcloud-deploy/overcloud/heat-launcher --restore-db
(undercloud)$ export OS_CLOUD=heat

Copy to Clipboard

Toggle word wrap

To retrieve information on state transitions, list the stack event records:

openstack stack event list example

$ openstack stack event list example
2017-03-06 11:12:43Z [example]: CREATE_IN_PROGRESS  Stack CREATE started
2017-03-06 11:12:43Z [example.scaleup_group]: CREATE_IN_PROGRESS  state changed
2017-03-06 11:13:04Z [example.scaleup_group]: CREATE_COMPLETE  state changed
2017-03-06 11:13:04Z [example.scaledown_policy]: CREATE_IN_PROGRESS  state changed
2017-03-06 11:13:05Z [example.scaleup_policy]: CREATE_IN_PROGRESS  state changed
2017-03-06 11:13:05Z [example.scaledown_policy]: CREATE_COMPLETE  state changed
2017-03-06 11:13:05Z [example.scaleup_policy]: CREATE_COMPLETE  state changed
2017-03-06 11:13:05Z [example.cpu_alarm_low]: CREATE_IN_PROGRESS  state changed
2017-03-06 11:13:05Z [example.cpu_alarm_high]: CREATE_IN_PROGRESS  state changed
2017-03-06 11:13:06Z [example.cpu_alarm_low]: CREATE_COMPLETE  state changed
2017-03-06 11:13:07Z [example.cpu_alarm_high]: CREATE_COMPLETE  state changed
2017-03-06 11:13:07Z [example]: CREATE_COMPLETE  Stack CREATE completed successfully
2017-03-06 11:19:34Z [example.scaleup_policy]: SIGNAL_COMPLETE  alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 95.4080102993)
2017-03-06 11:25:43Z [example.scaleup_policy]: SIGNAL_COMPLETE  alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 95.8869217299)
2017-03-06 11:33:25Z [example.scaledown_policy]: SIGNAL_COMPLETE  alarm state changed from ok to alarm (Transition to alarm due to 1 samples outside threshold, most recent: 2.73931707966)
2017-03-06 11:39:15Z [example.scaledown_policy]: SIGNAL_COMPLETE  alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 2.78110858552)

Copy to Clipboard

Toggle word wrap

Read the alarm history log:

openstack alarm-history show 022f707d-46cc-4d39-a0b2-afd2fc7ab86a

$ openstack alarm-history show 022f707d-46cc-4d39-a0b2-afd2fc7ab86a
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp                  | type             | detail                                                                                              | event_id                             |
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-03-06T11:32:35.510000 | state transition | {"transition_reason": "Transition to ok due to 1 samples inside threshold, most recent:             | 25e0e70b-3eda-466e-abac-42d9cf67e704 |
|                            |                  | 2.73931707966", "state": "ok"}                                                                      |                                      |
| 2017-03-06T11:17:35.403000 | state transition | {"transition_reason": "Transition to alarm due to 1 samples outside threshold, most recent:         | 8322f62c-0d0a-4dc0-9279-435510f81039 |
|                            |                  | 95.0964497325", "state": "alarm"}                                                                   |                                      |
| 2017-03-06T11:15:35.723000 | state transition | {"transition_reason": "Transition to ok due to 1 samples inside threshold, most recent:             | 1503bd81-7eba-474e-b74e-ded8a7b630a1 |
|                            |                  | 3.59330523447", "state": "ok"}                                                                      |                                      |
| 2017-03-06T11:13:06.413000 | creation         | {"alarm_actions": ["trust+http://fca6e27e3d524ed68abdc0fd576aa848:delete@192.168.122.126:8004/v1/fd | 224f15c0-b6f1-4690-9a22-0c1d236e65f6 |
|                            |                  | 1c345135be4ee587fef424c241719d/stacks/example/d9ef59ed-b8f8-4e90-bd9b-                              |                                      |
|                            |                  | ae87e73ef6e2/resources/scaleup_policy/signal"], "user_id": "a85f83b7f7784025b6acdc06ef0a8fd8",      |                                      |
|                            |                  | "name": "example-cpu_alarm_high-odj77qpbld7j", "state": "insufficient data", "timestamp":           |                                      |
|                            |                  | "2017-03-06T11:13:06.413455", "description": "Scale up if CPU > 80%", "enabled": true,              |                                      |
|                            |                  | "state_timestamp": "2017-03-06T11:13:06.413455", "rule": {"evaluation_periods": 1, "metric":        |                                      |
|                            |                  | "cpu_util", "aggregation_method": "mean", "granularity": 300, "threshold": 80.0, "query": "{\"=\":   |                                      |
|                            |                  | {\"server_group\": \"d9ef59ed-b8f8-4e90-bd9b-ae87e73ef6e2\"}}", "comparison_operator": "gt",        |                                      |
|                            |                  | "resource_type": "instance"}, "alarm_id": "022f707d-46cc-4d39-a0b2-afd2fc7ab86a",                   |                                      |
|                            |                  | "time_constraints": [], "insufficient_data_actions": null, "repeat_actions": true, "ok_actions":    |                                      |
|                            |                  | null, "project_id": "fd1c345135be4ee587fef424c241719d", "type":                                     |                                      |
|                            |                  | "gnocchi_aggregation_by_resources_threshold", "severity": "low"}                                    |                                      |
+----------------------------+------------------+-----------------------------------------------------------------------------------------------------+-------------------------------------

Copy to Clipboard

Toggle word wrap

To view the records of scale-out or scale-down operations that heat collects for the existing stack, you can use the awk command to parse the heat-engine.log:

awk '/Stack UPDATE started/,/Stack CREATE completed successfully/ {print $0}' /var/log/containers/heat/heat-engine.log

$ awk '/Stack UPDATE started/,/Stack CREATE completed successfully/ {print $0}' /var/log/containers/heat/heat-engine.log

Copy to Clipboard

Toggle word wrap

To view aodh-related information, examine the evaluator.log:

grep -i alarm /var/log/containers/aodh/evaluator.log | grep -i transition

$ grep -i alarm /var/log/containers/aodh/evaluator.log | grep -i transition

Copy to Clipboard

Toggle word wrap

Remove the ephemeral Heat process from the undercloud:
```
openstack tripleo launch heat --kill
```
```
(undercloud)$ openstack tripleo launch heat --kill
```
Copy to Clipboard Toggle word wrap

4.4. Using CPU telemetry values for autoscaling threshold when using rate:mean aggregration
Copia collegamento

When using the OS::Heat::Autoscaling heat orchestration template (HOT) and setting a threshold value for CPU, the value is expressed in nanoseconds of CPU time which is a dynamic value based on the number of virtual CPUs allocated to the instance workload. In this reference guide we’ll explore how to calculate and express the CPU nanosecond value as a percentage when using the Gnocchi rate:mean aggregration method.

4.4.1. Calculating CPU telemetry values as a percentage
Copia collegamento

CPU telemetry is stored in Gnocchi (OpenStack time-series data store) as CPU utilization in nanoseconds. When using CPU telemetry to define autoscaling thresholds it is useful to express the values as a percentage of CPU utilization since that is more natural when defining the threshold values. When defining the scaling policies used as part of an autoscaling group, we can take our desired threshold defined as a percentage and calculate the required threshold value in nanoseconds which is used in the policy definitions.

Expand

Value (ns)	Granularity (s)	Percentage
60000000000	60	100
54000000000	60	90
48000000000	60	80
42000000000	60	70
36000000000	60	60
30000000000	60	50
24000000000	60	40
18000000000	60	30
12000000000	60	20
6000000000	60	10

4.4.2. Displaying instance workload vCPU as a percentage
Copia collegamento

You can display the gnocchi-stored CPU telemetry data as a percentage rather than the nanosecond values for instances by using the openstack metric aggregates command.

Prerequisites

Create a heat stack using the autoscaling group resource that results in an instance workload.

Procedure

Retrieve the ID of the autoscaling group heat stack:

openstack stack show vnf -c id -c stack_status

$ openstack stack show vnf -c id -c stack_status
+--------------+--------------------------------------+
| Field        | Value                                |
+--------------+--------------------------------------+
| id           | e0a15cee-34d1-418a-ac79-74ad07585730 |
| stack_status | CREATE_COMPLETE                      |
+--------------+--------------------------------------+

Copy to Clipboard

Toggle word wrap

Set the value of the stack ID to an environment variable:

export STACK_ID=$(openstack stack show vnf -c id -f value)

$ export STACK_ID=$(openstack stack show vnf -c id -f value)

Copy to Clipboard

Toggle word wrap

Return the metrics as an aggregate by resource type instance (server ID) with the value calculated as a percentage. The aggregate is returned as a value of nanoseconds of CPU time. We divide that number by 1000000000 to get the value in seconds. We then divide the value by our granularity, which in this example is 60 seconds. That value is then converted to a percentage by multiplying by 100. Finally, we divide the total value by the number of vCPU provided by the flavor assigned to the instance, in this example a value of 2 vCPU, providing us a value expressed as a percentage of CPU time:

openstack metric aggregates --resource-type instance --sort-column timestamp --sort-descending '(/ (* (/ (/ (metric cpu rate:mean) 1000000000) 60) 100) 2)' server_group="$STACK_ID"

$ openstack metric aggregates --resource-type instance --sort-column timestamp --sort-descending '(/ (* (/ (/ (metric cpu rate:mean) 1000000000) 60) 100) 2)' server_group="$STACK_ID"
+----------------------------------------------------+---------------------------+-------------+--------------------+
| name                                               | timestamp                 | granularity |              value |
+----------------------------------------------------+---------------------------+-------------+--------------------+
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T21:03:00+00:00 |        60.0 |  3.158333333333333 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T21:02:00+00:00 |        60.0 | 2.6333333333333333 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T21:02:00+00:00 |        60.0 |  2.533333333333333 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T21:01:00+00:00 |        60.0 |  2.833333333333333 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T21:01:00+00:00 |        60.0 | 3.0833333333333335 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T21:00:00+00:00 |        60.0 | 13.450000000000001 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T21:00:00+00:00 |        60.0 |               2.45 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T21:00:00+00:00 |        60.0 | 2.6166666666666667 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T20:59:00+00:00 |        60.0 | 60.583333333333336 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:59:00+00:00 |        60.0 |               2.35 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:59:00+00:00 |        60.0 |              2.525 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T20:58:00+00:00 |        60.0 |  71.35833333333333 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:58:00+00:00 |        60.0 |              3.025 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:58:00+00:00 |        60.0 |                9.3 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T20:57:00+00:00 |        60.0 |  66.19166666666668 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:57:00+00:00 |        60.0 |              2.275 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:57:00+00:00 |        60.0 |  56.31666666666667 |
| 61bfb555-9efb-46f1-8559-08dec90f94ed/cpu/rate:mean | 2022-11-07T20:56:00+00:00 |        60.0 |  59.50833333333333 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:56:00+00:00 |        60.0 |              2.375 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:56:00+00:00 |        60.0 | 63.949999999999996 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:55:00+00:00 |        60.0 | 15.558333333333335 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:55:00+00:00 |        60.0 |              93.85 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:54:00+00:00 |        60.0 |  59.54999999999999 |
| 199b0cb9-6ed6-4410-9073-0fb2e7842b65/cpu/rate:mean | 2022-11-07T20:54:00+00:00 |        60.0 |  61.23333333333334 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:53:00+00:00 |        60.0 |  74.73333333333333 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:52:00+00:00 |        60.0 |  57.86666666666667 |
| a95ab818-fbe8-4acd-9f7b-58e24ade6393/cpu/rate:mean | 2022-11-07T20:51:00+00:00 |        60.0 | 60.416666666666664 |
+----------------------------------------------------+---------------------------+-------------+--------------------+

Copy to Clipboard

Toggle word wrap

4.4.3. Retrieving available telemetry for an instance workload
Copia collegamento

Retrieve the available telemetry for an instance workload and express the vCPU utilization as a percentage.

Prerequisites

Create a heat stack using the autoscaling group resource that results in an instance workload.

Procedure

Retrieve the ID of the autoscaling group heat stack:

openstack stack show vnf -c id -c stack_status

$ openstack stack show vnf -c id -c stack_status
+--------------+--------------------------------------+
| Field        | Value                                |
+--------------+--------------------------------------+
| id           | e0a15cee-34d1-418a-ac79-74ad07585730 |
| stack_status | CREATE_COMPLETE                      |
+--------------+--------------------------------------+

Copy to Clipboard

Toggle word wrap

Set the value of the stack ID to an environment variable:

export STACK_ID=$(openstack stack show vnf -c id -f value)

$ export STACK_ID=$(openstack stack show vnf -c id -f value)

Copy to Clipboard

Toggle word wrap

Retrieve the ID of the workload instance you want to return data for. We are using the server list long form and filtering for instances that are part of our autoscaling group:

openstack server list --long --fit-width | grep "metering.server_group='$STACK_ID'"

$ openstack server list --long --fit-width | grep "metering.server_group='$STACK_ID'"
| bc1811de-48ed-44c1-ae22-c01f36d6cb02 | vn-xlfb4jb-yhbq6fkk2kec-qsu2lr47zigs-vnf-y27wuo25ce4e | ACTIVE | None       | Running     | private=192.168.100.139, 192.168.25.179 | fedora36   | d21f1aaa-0077-4313-8a46-266c39b705c1 | m1.small    | 692533fe-0912-417e-b706-5d085449db53 | nova              | standalone.localdomain | metering.server_group='e0a15cee-34d1-418a-ac79-74ad07585730' |

Copy to Clipboard

Toggle word wrap

Set the instance ID for one of the returned instance workload names:

INSTANCE_NAME='vn-xlfb4jb-yhbq6fkk2kec-qsu2lr47zigs-vnf-y27wuo25ce4e' ; export INSTANCE_ID=$(openstack server list --name $INSTANCE_NAME -c ID -f value)

$ INSTANCE_NAME='vn-xlfb4jb-yhbq6fkk2kec-qsu2lr47zigs-vnf-y27wuo25ce4e' ; export INSTANCE_ID=$(openstack server list --name $INSTANCE_NAME -c ID -f value)

Copy to Clipboard

Toggle word wrap

Verify metrics have been stored for the instance resource ID. If no metrics are available it’s possible not enough time has elapsed since the instance was created. If enough time has elapsed, you can check the logs for the data collection service in /var/log/containers/ceilometer/ and logs for the time-series database service gnocchi in /var/log/containers/gnocchi/:

openstack metric resource show --column metrics $INSTANCE_ID

$ openstack metric resource show --column metrics $INSTANCE_ID
+---------+---------------------------------------------------------------------+
| Field   | Value                                                               |
+---------+---------------------------------------------------------------------+
| metrics | compute.instance.booting.time: 57ca241d-764b-4c58-aa32-35760d720b08 |
|         | cpu: d7767d7f-b10c-4124-8893-679b2e5d2ccd                           |
|         | disk.ephemeral.size: 038b11db-0598-4cfd-9f8d-4ba6b725375b           |
|         | disk.root.size: 843f8998-e644-41f6-8635-e7c99e28859e                |
|         | memory.usage: 1e554370-05ac-4107-98d8-9330265db750                  |
|         | memory: fbd50c0e-90fa-4ad9-b0df-f7361ceb4e38                        |
|         | vcpus: 0629743e-6baa-4e22-ae93-512dc16bac85                         |
+---------+---------------------------------------------------------------------+

Copy to Clipboard

Toggle word wrap

Verify there are available measures for the resource metric and note the granularity value as we’ll use it when running the openstack metric aggregates command:

openstack metric measures show --resource-id $INSTANCE_ID --aggregation rate:mean cpu

$ openstack metric measures show --resource-id $INSTANCE_ID --aggregation rate:mean cpu
+---------------------------+-------------+---------------+
| timestamp                 | granularity |         value |
+---------------------------+-------------+---------------+
| 2022-11-08T14:12:00+00:00 |        60.0 | 71920000000.0 |
| 2022-11-08T14:13:00+00:00 |        60.0 | 88920000000.0 |
| 2022-11-08T14:14:00+00:00 |        60.0 | 76130000000.0 |
| 2022-11-08T14:15:00+00:00 |        60.0 | 17640000000.0 |
| 2022-11-08T14:16:00+00:00 |        60.0 |  3330000000.0 |
| 2022-11-08T14:17:00+00:00 |        60.0 |  2450000000.0 |
...

Copy to Clipboard

Toggle word wrap

Retrieve the number of vCPU cores applied to the workload instance by reviewing the configured flavor for the instance workload:

openstack server show $INSTANCE_ID -cflavor -f value
openstack flavor show 692533fe-0912-417e-b706-5d085449db53 -c vcpus -f value

$ openstack server show $INSTANCE_ID -cflavor -f value
m1.small (692533fe-0912-417e-b706-5d085449db53)

$ openstack flavor show 692533fe-0912-417e-b706-5d085449db53 -c vcpus -f value
2

Copy to Clipboard

Toggle word wrap

Return the metrics as an aggregate by resource type instance (server ID) with the value calculated as a percentage. The aggregate is returned as a value of nanoseconds of CPU time. We divide that number by 1000000000 to get the value in seconds. We then divide the value by our granularity, which in this example is 60 seconds (as previously retrieved with openstack metric measures show command). That value is then converted to a percentage by multiplying by 100. Finally, we divide the total value by the number of vCPU provided by the flavor assigned to the instance, in this example a value of 2 vCPU, providing us a value expressed as a percentage of CPU time:

openstack metric aggregates --resource-type instance --sort-column timestamp --sort-descending '(/ (* (/ (/ (metric cpu rate:mean) 1000000000) 60) 100) 2)' id=$INSTANCE_ID

$ openstack metric aggregates --resource-type instance --sort-column timestamp --sort-descending '(/ (* (/ (/ (metric cpu rate:mean) 1000000000) 60) 100) 2)' id=$INSTANCE_ID
+----------------------------------------------------+---------------------------+-------------+--------------------+
| name                                               | timestamp                 | granularity |              value |
+----------------------------------------------------+---------------------------+-------------+--------------------+
| bc1811de-48ed-44c1-ae22-c01f36d6cb02/cpu/rate:mean | 2022-11-08T14:26:00+00:00 |        60.0 |               2.45 |
| bc1811de-48ed-44c1-ae22-c01f36d6cb02/cpu/rate:mean | 2022-11-08T14:25:00+00:00 |        60.0 |             11.075 |
| bc1811de-48ed-44c1-ae22-c01f36d6cb02/cpu/rate:mean | 2022-11-08T14:24:00+00:00 |        60.0 |               61.3 |
| bc1811de-48ed-44c1-ae22-c01f36d6cb02/cpu/rate:mean | 2022-11-08T14:23:00+00:00 |        60.0 |  74.78333333333332 |
| bc1811de-48ed-44c1-ae22-c01f36d6cb02/cpu/rate:mean | 2022-11-08T14:22:00+00:00 |        60.0 | 55.383333333333326 |
...

Copy to Clipboard

Toggle word wrap

Questo contenuto non è disponibile nella lingua selezionata.

Chapter 4. Testing and troubleshooting autoscaling

4.1. Testing automatic scaling up of instances
Copia collegamento

4.2. Testing automatic scaling down of instances
Copia collegamento

4.3. Troubleshooting for autoscaling
Copia collegamento

4.4. Using CPU telemetry values for autoscaling threshold when using rate:mean aggregration
Copia collegamento

4.4.1. Calculating CPU telemetry values as a percentage
Copia collegamento

4.4.2. Displaying instance workload vCPU as a percentage
Copia collegamento

4.4.3. Retrieving available telemetry for an instance workload
Copia collegamento

Formazione

Prova, acquista e vendi

Community

Informazioni sulla documentazione di Red Hat

Rendiamo l’open source più inclusivo

Informazioni su Red Hat

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

Questo contenuto non è disponibile nella lingua selezionata.

Chapter 4. Testing and troubleshooting autoscaling

4.1. Testing automatic scaling up of instancesCopia collegamentoCollegamento copiato negli appunti!

4.2. Testing automatic scaling down of instancesCopia collegamentoCollegamento copiato negli appunti!

4.3. Troubleshooting for autoscalingCopia collegamentoCollegamento copiato negli appunti!

4.4. Using CPU telemetry values for autoscaling threshold when using rate:mean aggregrationCopia collegamentoCollegamento copiato negli appunti!

4.4.1. Calculating CPU telemetry values as a percentageCopia collegamentoCollegamento copiato negli appunti!

4.4.2. Displaying instance workload vCPU as a percentageCopia collegamentoCollegamento copiato negli appunti!

4.4.3. Retrieving available telemetry for an instance workloadCopia collegamentoCollegamento copiato negli appunti!

Formazione

Prova, acquista e vendi

Community

Informazioni sulla documentazione di Red Hat

Rendiamo l’open source più inclusivo

Informazioni su Red Hat

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

4.1. Testing automatic scaling up of instances
Copia collegamento

4.2. Testing automatic scaling down of instances
Copia collegamento

4.3. Troubleshooting for autoscaling
Copia collegamento

4.4. Using CPU telemetry values for autoscaling threshold when using rate:mean aggregration
Copia collegamento

4.4.1. Calculating CPU telemetry values as a percentage
Copia collegamento

4.4.2. Displaying instance workload vCPU as a percentage
Copia collegamento

4.4.3. Retrieving available telemetry for an instance workload
Copia collegamento