Configuring the Compute service for instance creation
Configuring and managing the Red Hat OpenStack Platform Compute service (nova) for creating instances
Abstract
Providing feedback on Red Hat documentation
We appreciate your input on our documentation. Tell us how we can make it better.
Use the Create Issue form to provide feedback on the documentation for Red Hat OpenStack Services on OpenShift (RHOSO) or earlier releases of Red Hat OpenStack Platform (RHOSP). When you create an issue for RHOSO or RHOSP documents, the issue is recorded in the RHOSO Jira project, where you can track the progress of your feedback.
To complete the Create Issue form, ensure that you are logged in to Jira. If you do not have a Red Hat Jira account, you can create an account at https://issues.redhat.com.
- Click the following link to open a Create Issue page: Create Issue
- Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue. Do not modify any other fields in the form.
- Click Create.
Chapter 1. Compute service (nova) functionality
You use the Compute (nova) service to create, provision, and manage virtual machine instances and bare metal servers in a Red Hat OpenStack Platform (RHOSP) environment. The Compute service abstracts the underlying hardware that it runs on, rather than exposing specifics about the underlying host platforms. For example, rather than exposing the types and topologies of CPUs running on hosts, the Compute service exposes a number of virtual CPUs (vCPUs) and allows for overcommitting of these vCPUs.
The Compute service uses the KVM hypervisor to execute Compute service workloads. The libvirt driver interacts with QEMU to handle all interactions with KVM, and enables the creation of virtual machine instances. To create and provision instances, the Compute service interacts with the following RHOSP services:
- Identity (keystone) service for authentication.
- Placement service for resource inventory tracking and selection.
- Image Service (glance) for disk and instance images.
- Networking (neutron) service for provisioning the virtual or physical networks that instances connect to on boot.
The Compute service consists of daemon processes and services, named nova-*. The following are the core Compute services:
- Compute service (nova-compute) - This service creates, manages, and terminates instances by using hypervisor APIs, such as libvirt for KVM or QEMU, and updates the database with instance states.
- Compute conductor (nova-conductor) - This service mediates interactions between the Compute service and the database, which insulates Compute nodes from direct database access. Do not deploy this service on nodes where the nova-compute service runs.
- Compute scheduler (nova-scheduler) - This service takes an instance request from the queue and determines on which Compute node to host the instance.
- Compute API (nova-api) - This service provides the external REST API to users.
- API database - This database tracks instance location information, and provides a temporary location for instances that are built but not scheduled. In multi-cell deployments, this database also contains cell mappings that specify the database connection for each cell.
- Cell database - This database contains most of the information about instances. It is used by the API database, the conductor, and the Compute services.
- Message queue - This messaging service is used by all services to communicate with each other within the cell and with the global services.
- Compute metadata - This service stores data specific to instances. Instances access the metadata service at http://169.254.169.254 or over IPv6 at the link-local address fe80::a9fe:a9fe. The Networking (neutron) service is responsible for forwarding requests to the metadata API server. You must use the NeutronMetadataProxySharedSecret parameter to set a secret keyword in the configuration of both the Networking service and the Compute service to allow the services to communicate. The Compute metadata service can be run globally, as part of the Compute API, or in each cell.
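For example, the following is a minimal sketch of setting the shared secret in a heat environment file; the parameter name is the standard heat parameter, and the value is a placeholder you must replace:

    parameter_defaults:
      NeutronMetadataProxySharedSecret: '<shared_secret>'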
You can deploy more than one Compute node. The hypervisor that operates instances runs on each Compute node. Each Compute node requires a minimum of two network interfaces. The Compute node also runs a Networking service agent that connects instances to virtual networks and provides firewalling services to instances through security groups.
By default, director installs the overcloud with a single cell for all Compute nodes. This cell contains all the Compute services and databases that control and manage the virtual machine instances, and all the instances and instance metadata. For larger deployments, you can deploy the overcloud with multiple cells to accommodate a larger number of Compute nodes. You can add cells to your environment when you install a new overcloud or at any time afterwards.
If you are using the openstack client, the microversion is set to 2.88 by default, which is the latest microversion supported in RHOSP 17. If you need an older microversion, you can request it by passing --os-compute-api-version <version> on the command line.
Chapter 2. Configuring the Compute service (nova)
As a cloud administrator, you use environment files to customize the Compute (nova) service. Puppet generates and stores this configuration in the /var/lib/config-data/puppet-generated/<nova_container>/etc/nova/nova.conf file. Use the following configuration methods to customize the Compute service configuration, in the following order of precedence:
- Heat parameters - as detailed in the Compute (nova) Parameters section in the Overcloud parameters guide. For example, you can use heat parameters to set the default scheduler filters and to configure an NFS back end for the Compute service. A sketch of such an environment file follows this list.
- Puppet parameters - as defined in /etc/puppet/modules/nova/manifests/*:

    parameter_defaults:
      ComputeExtraConfig:
        nova::compute::force_raw_images: True

  Note: Only use this method if an equivalent heat parameter does not exist.
- Manual hieradata overrides - for customizing parameters when no heat or Puppet parameter exists. For example, the following sets the timeout_nbd parameter in the [DEFAULT] section on the Compute role:

    parameter_defaults:
      ComputeExtraConfig:
        nova::config::nova_config:
          DEFAULT/timeout_nbd:
            value: '20'
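The following is a minimal sketch of the heat parameter method referenced in the first item of the list above. The filter list, share path, and IP address are assumptions for illustration; NovaSchedulerEnabledFilters, NovaNfsEnabled, and NovaNfsShare are the relevant heat parameters, but set them to values that match your environment:

    parameter_defaults:
      NovaSchedulerEnabledFilters:
        - AggregateInstanceExtraSpecsFilter
        - ComputeFilter
        - ComputeCapabilitiesFilter
        - ImagePropertiesFilter
      NovaNfsEnabled: true
      NovaNfsShare: '192.0.2.254:/export/nova'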
If a heat parameter exists, use it instead of the Puppet parameter. If a Puppet parameter exists, but not a heat parameter, use the Puppet parameter instead of the manual override method. Use the manual override method only if there is no equivalent heat or Puppet parameter.
Follow the guidance in Identifying parameters that you want to modify to determine if a heat or Puppet parameter is available for customizing a particular configuration.
For more information about how to configure overcloud services, see Heat parameters in the Customizing your Red Hat OpenStack Platform deployment guide.
Chapter 3. Creating flavors for launching instances
An instance flavor is a resource template that specifies the virtual hardware profile for the instance. Cloud users must specify a flavor when they launch an instance.
A flavor can specify the quantity of the following resources the Compute service must allocate to an instance:
- The number of vCPUs.
- The RAM, in MB.
- The root disk, in GB.
- The virtual storage, including secondary ephemeral storage and swap disk.
You can specify who can use flavors by making the flavor public to all projects, or private to specific projects or domains.
Flavors can use metadata, also referred to as "extra specs", to specify instance hardware support and quotas. The flavor metadata influences the instance placement, resource usage limits, and performance. For a complete list of available metadata properties, see Flavor metadata.
You can also use the flavor metadata keys to find a suitable host aggregate to host the instance, by matching the extra_specs metadata set on the host aggregate. To schedule an instance on a host aggregate, you must scope the flavor metadata by prefixing the extra_specs key with the aggregate_instance_extra_specs: namespace. For more information, see Creating and managing host aggregates.
A Red Hat OpenStack Platform (RHOSP) deployment includes the following set of default public flavors that your cloud users can use.
| Name | vCPUs | RAM | Root Disk Size |
|---|---|---|---|
| m1.nano | 1 | 128 MB | 1 GB |
| m1.micro | 1 | 192 MB | 1 GB |
Behavior set by using flavor properties overrides behavior set by using images. When a cloud user launches an instance, the properties of the flavor they specify override the properties of the image they specify.
3.1. Creating a flavor
You can create and manage specialized flavors for specific functionality or behaviors, for example:
- Change default memory and capacity to suit the underlying hardware needs.
- Add metadata to force a specific I/O rate for the instance or to match a host aggregate.
Procedure
- Create a flavor that specifies the basic resources to make available to an instance:

    (overcloud)$ openstack flavor create --ram <size_mb> \
     --disk <size_gb> --vcpus <no_vcpus> \
     [--private --project <project_id>] <flavor_name>

  - Replace <size_mb> with the size of RAM to allocate to an instance created with this flavor.
  - Replace <size_gb> with the size of root disk to allocate to an instance created with this flavor.
  - Replace <no_vcpus> with the number of vCPUs to reserve for an instance created with this flavor.
  - Optional: Specify the --private and --project options to make the flavor accessible only by a particular project or group of users. Replace <project_id> with the ID of the project that can use this flavor to create instances. If you do not specify the accessibility, the flavor defaults to public, which means that it is available to all projects.

    Note: You cannot make a public flavor private after it has been created.

  - Replace <flavor_name> with a unique name for your flavor.

  For more information about flavor arguments, see Flavor arguments.

- Optional: To specify flavor metadata, set the required properties by using key-value pairs:

    (overcloud)$ openstack flavor set \
     --property <key=value> --property <key=value> ... <flavor_name>

  - Replace <key> with the metadata key of the property you want to allocate to an instance that is created with this flavor. For a list of available metadata keys, see Flavor metadata.
  - Replace <value> with the value of the metadata key you want to allocate to an instance that is created with this flavor.
  - Replace <flavor_name> with the name of your flavor.

  For example, an instance that is launched by using the following flavor has two CPU sockets, each with two cores:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_sockets=2 \
     --property hw:cpu_cores=2 processor_topology_flavor
3.2. Flavor arguments
The openstack flavor create command has one positional argument, <flavor_name>, to specify the name of your new flavor.
The following table details the optional arguments that you can specify as required when you create a new flavor.
| Optional argument | Description |
|---|---|
| --id | Unique ID for the flavor. The default value, auto, generates a UUID for the flavor. |
| --ram | (Mandatory) Size of memory to make available to the instance, in MB. Default: 256 MB |
| --disk | (Mandatory) Amount of disk space to use for the root (/) partition, in GB. The root disk is an ephemeral disk that the base image is copied into. When an instance boots from a persistent volume, the root disk is not used. Note: Creation of an instance with a flavor that has a root disk size of 0 requires a volume-backed instance, where the instance boots from a volume. Default: 0 GB |
| --ephemeral | Amount of disk space to use for the ephemeral disks, in GB. Defaults to 0 GB, which means that no secondary ephemeral disk is created. Ephemeral disks offer machine local disk storage linked to the lifecycle of the instance. Ephemeral disks are not included in any snapshots. This disk is destroyed and all data is lost when the instance is deleted. Default: 0 GB |
| --swap | Swap disk size in MB. Do not specify swap if the Compute service back end storage is Red Hat Ceph Storage, because swap disks are not supported for that back end. Default: 0 GB |
| --vcpus | (Mandatory) Number of virtual CPUs for the instance. Default: 1 |
| --public | The flavor is available to all projects. By default, a flavor is public and available to all projects. |
| --private | The flavor is only available to the projects specified by using the --project argument. |
| --property | Metadata, or "extra specs", specified by using key-value pairs in the following format: --property <key=value>. Repeat this option to set multiple properties. |
| --project | Specifies the project that can use the private flavor. You must use this argument with the --private argument. Repeat this option to allow access to multiple projects. |
| --project-domain | Specifies the project domain that can use the private flavor. You must use this argument with the --private argument. Repeat this option to allow access to multiple project domains. |
| --description | Description of the flavor. Limited to 65535 characters in length. You can use only printable characters. |
3.3. Flavor metadata
Use the --property option to specify flavor metadata when you create a flavor. Flavor metadata is also referred to as extra specs. Flavor metadata determines instance hardware support and quotas, which influence instance placement, instance limits, and performance.
Instance resource usage
Use the property keys in the following table to configure limits on CPU, memory and disk I/O usage by instances.
The extra specs for limiting instance CPU resource usage are host-specific tunable properties that are passed directly to libvirt, which then passes the limits onto the host OS. Therefore, the supported instance CPU resource limits configurations are dependent on the underlying host OS.
For more information on how to configure instance CPU resource usage for the Compute nodes in your RHOSP deployment, see Understanding cgroups in the RHEL 9 documentation, and CPU Tuning in the Libvirt documentation.
| Key | Description |
|---|---|
| quota:cpu_shares | Specifies the proportional weighted share of CPU time for the domain. Defaults to the OS provided defaults. The Compute scheduler weighs this value relative to the setting of this property on other instances in the same domain. For example, an instance that is configured with quota:cpu_shares=2048 is allocated double the CPU time of an instance that is configured with quota:cpu_shares=1024. |
| quota:cpu_period | Specifies the period of time within which to enforce the quota:cpu_quota value, in microseconds. Within a period, each vCPU cannot consume more than its allocated quota of runtime. Set to a value in the range 1000 - 1000000. Set to 0 to disable. |
| quota:cpu_quota | Specifies the maximum allowed bandwidth for the vCPU in each quota:cpu_period, in microseconds. Set to 0 to disable, or set to a negative value to allow infinite bandwidth. You can use quota:cpu_quota in combination with quota:cpu_period to ensure that all vCPUs run at the same speed. For example: openstack flavor set cpu_limits_flavor \ --property quota:cpu_quota=10000 \ --property quota:cpu_period=20000 |
Instance disk tuning
Use the property keys in the following table to tune the instance disk performance.
The Compute service applies the following quality of service settings to storage that the Compute service has provisioned, such as ephemeral storage. To tune the performance of Block Storage (cinder) volumes, you must also configure and associate a Quality of Service (QoS) specification for the volume type. For more information, see Block Storage service (cinder) Quality of Service specifications in the Configuring persistent storage guide.
| Key | Description |
|---|---|
| quota:disk_read_bytes_sec | Specifies the maximum disk reads available to an instance, in bytes per second. |
| quota:disk_read_iops_sec | Specifies the maximum disk reads available to an instance, in IOPS. |
| quota:disk_write_bytes_sec | Specifies the maximum disk writes available to an instance, in bytes per second. |
| quota:disk_write_iops_sec | Specifies the maximum disk writes available to an instance, in IOPS. |
| quota:disk_total_bytes_sec | Specifies the maximum I/O operations available to an instance, in bytes per second. |
| quota:disk_total_iops_sec | Specifies the maximum I/O operations available to an instance, in IOPS. |
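For example, the following command is a sketch of how these properties might be applied; the flavor name and limit values are illustrative only:

    (overcloud)$ openstack flavor set disk_tuned_flavor \
     --property quota:disk_read_bytes_sec=10240000 \
     --property quota:disk_write_bytes_sec=10240000 \
     --property quota:disk_total_iops_sec=500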
Instance network traffic bandwidth
Use the property keys in the following table to configure bandwidth limits on the instance network traffic by configuring the VIF I/O options.
The quota:vif_* properties are deprecated. Instead, use the Networking (neutron) service Quality of Service (QoS) policies. For more information about QoS policies, see Configuring Quality of Service (QoS) policies in the Configuring Red Hat OpenStack Platform networking guide. The quota:vif_* properties are only supported when you use the ML2/OVS mechanism driver with NeutronOVSFirewallDriver set to iptables_hybrid.
| Key | Description |
|---|---|
| quota:vif_inbound_average | (Deprecated) Specifies the required average bit rate on the traffic incoming to the instance, in kbps. |
| quota:vif_inbound_burst | (Deprecated) Specifies the maximum amount of incoming traffic that can be burst at peak speed, in KB. |
| quota:vif_inbound_peak | (Deprecated) Specifies the maximum rate at which the instance can receive incoming traffic, in kbps. |
| quota:vif_outbound_average | (Deprecated) Specifies the required average bit rate on the traffic outgoing from the instance, in kbps. |
| quota:vif_outbound_burst | (Deprecated) Specifies the maximum amount of outgoing traffic that can be burst at peak speed, in KB. |
| quota:vif_outbound_peak | (Deprecated) Specifies the maximum rate at which the instance can send outgoing traffic, in kbps. |
Hardware video RAM
Use the property key in the following table to configure limits on the instance RAM to use for video devices.
| Key | Description |
|---|---|
| hw_video:ram_max_mb | Specifies the maximum RAM to use for video devices, in MB. Use with the hw_video_ram image property. The value of hw_video_ram must be less than or equal to the value of hw_video:ram_max_mb. |
Watchdog behavior
Use the property key in the following table to enable the virtual hardware watchdog device on the instance.
| Key | Description |
|---|---|
| hw:watchdog_action | Specify to enable the virtual hardware watchdog device and set its behavior. Watchdog devices perform the configured action if the instance hangs or fails. The watchdog uses the i6300esb device, which emulates a PCI Intel 6300ESB. If hw:watchdog_action is not specified, the watchdog is disabled. Set to one of the following valid values: disabled - (Default) The device is not attached. reset - Force instance reset. poweroff - Force the instance to shut down. pause - Pause the instance. none - Enable the watchdog, but do nothing if the instance hangs or fails. |
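For example, the following sketch configures a flavor that resets an instance if it hangs; the flavor name is illustrative:

    (overcloud)$ openstack flavor set watchdog_flavor \
     --property hw:watchdog_action=reset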
Random number generator (RNG)
Use the property keys in the following table to enable the RNG device on the instance.
| Key | Description |
|---|---|
| hw_rng:allowed | Set to False to disable the random number generator device that is added to the instance through its image properties. Default: True |
| hw_rng:rate_bytes | Specifies the maximum number of bytes that the instance can read from the entropy of the host, per period. |
| hw_rng:rate_period | Specifies the duration of the read period in milliseconds. |
Virtual Performance Monitoring Unit (vPMU)
Use the property key in the following table to enable the vPMU for the instance.
| Key | Description |
|---|---|
| hw:pmu | Set to True to enable a virtual Performance Monitoring Unit (vPMU) for the instance. Tools such as perf use the vPMU on the instance to provide more accurate information for profiling and monitoring instance performance. For real-time workloads, the emulation of a vPMU can introduce additional latency, which might be undesirable. If the telemetry that the vPMU provides is not required, set hw:pmu=False. |
Virtual Trusted Platform Module (vTPM) devices
Use the property keys in the following table to enable a vTPM device for the instance.
| Key | Description |
|---|---|
| hw:tpm_version | Set to the version of TPM to use. TPM version 2.0 is the only supported version. |
| hw:tpm_model | Set to the model of TPM device to use. Ignored if hw:tpm_version is not configured. Set to one of the following valid values: tpm-tis - (Default) TPM Interface Specification. tpm-crb - Command-Response Buffer. Compatible only with TPM version 2.0. |
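For example, the following sketch enables a TPM 2.0 device for instances launched with a flavor; the flavor name is illustrative:

    (overcloud)$ openstack flavor set vtpm_flavor \
     --property hw:tpm_version=2.0 \
     --property hw:tpm_model=tpm-crb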
Instance CPU topology
Use the property keys in the following table to define the topology of the processors in the instance.
| Key | Description |
|---|---|
| hw:cpu_sockets | Specifies the preferred number of sockets for the instance. Default: the number of vCPUs requested |
| hw:cpu_cores | Specifies the preferred number of cores per socket for the instance. Default: 1 |
| hw:cpu_threads | Specifies the preferred number of threads per core for the instance. Default: 1 |
| hw:cpu_max_sockets | Specifies the maximum number of sockets that users can select for their instances by using image properties. Example: hw:cpu_max_sockets=2 |
| hw:cpu_max_cores | Specifies the maximum number of cores per socket that users can select for their instances by using image properties. |
| hw:cpu_max_threads | Specifies the maximum number of threads per core that users can select for their instances by using image properties. |
Serial ports
Use the property key in the following table to configure the number of serial ports per instance.
| Key | Description |
|---|---|
| hw:serial_port_count | Maximum serial ports per instance. |
CPU pinning policy
By default, instance virtual CPUs (vCPUs) are sockets with one core and one thread. You can use properties to create flavors that pin the vCPUs of instances to the physical CPU cores (pCPUs) of the host. You can also configure the behavior of hardware CPU threads in a simultaneous multithreading (SMT) architecture where one or more cores have thread siblings.
Use the property keys in the following table to define the CPU pinning policy of the instance.
| Key | Description |
|---|---|
| hw:cpu_policy | Specifies the CPU policy to use. Set to one of the following valid values: shared - (Default) The instance vCPUs float across host pCPUs. dedicated - Pin the instance vCPUs to a dedicated set of host pCPUs that no other instance can use. mixed - Use a mix of dedicated (pinned) and shared (floating) CPUs. Use hw:cpu_dedicated_mask to specify which CPUs are dedicated. |
| hw:cpu_thread_policy | Specifies the CPU thread policy to use when hw:cpu_policy=dedicated. Set to one of the following valid values: prefer - (Default) The host might or might not have an SMT architecture. If an SMT architecture is present, thread siblings are preferred. isolate - The host must not have an SMT architecture, or must emulate a non-SMT architecture. require - The host must have an SMT architecture, and each vCPU is allocated on thread siblings. If the host does not have an SMT architecture or enough cores with available thread siblings, scheduling fails. |
| hw:cpu_dedicated_mask | Specifies which CPUs are dedicated (pinned) or shared (unpinned/floating). To specify dedicated CPUs, specify the CPU number or CPU range. To specify shared CPUs, prepend the CPU number or CPU range with a caret (^). Valid only when hw:cpu_policy=mixed. |
Instance PCI NUMA affinity policy
Use the property key in the following table to create flavors that specify the NUMA affinity policy for PCI passthrough devices and SR-IOV interfaces.
| Key | Description |
|---|---|
| hw:pci_numa_affinity_policy | Specifies the NUMA affinity policy for PCI passthrough devices and SR-IOV interfaces. Set to one of the following valid values: required - The Compute service creates an instance that requests a PCI device only when at least one of the NUMA nodes of the instance has affinity with the PCI device. preferred - The Compute service attempts a best-effort selection of PCI devices based on NUMA affinity. legacy - (Default) NUMA affinity is required only when information about the NUMA node of the PCI device is available. socket - The instance is affined to the same host socket as the PCI device. |
Instance NUMA topology
You can use properties to create flavors that define the host NUMA placement for the instance vCPU threads, and the allocation of instance vCPUs and memory from the host NUMA nodes.
Defining a NUMA topology for the instance improves the performance of the instance OS for flavors whose memory and vCPU allocations are larger than the size of NUMA nodes in the Compute hosts.
The Compute scheduler uses these properties to determine a suitable host for the instance. For example, a cloud user launches an instance by using a flavor that defines two instance NUMA nodes, one with 3 GB of RAM and six vCPUs, and the other with 1 GB of RAM and two vCPUs. A sketch of such a flavor follows this paragraph.
The Compute scheduler searches for a host that has two NUMA nodes, one with 3 GB of RAM and the ability to run six CPUs, and the other with 1 GB of RAM and two CPUs. If a host has a single NUMA node with the capability to run eight CPUs and 4 GB of RAM, the Compute scheduler does not consider it a valid match.
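The following is a sketch of such a flavor, reconstructed from the description above; the flavor name and the exact vCPU numbering are illustrative:

    (overcloud)$ openstack flavor set numa_top_flavor \
     --property hw:numa_nodes=2 \
     --property hw:numa_cpus.0=0,1,2,3,4,5 \
     --property hw:numa_cpus.1=6,7 \
     --property hw:numa_mem.0=3072 \
     --property hw:numa_mem.1=1024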
NUMA topologies defined by a flavor cannot be overridden by NUMA topologies defined by the image. The Compute service raises an ImageNUMATopologyForbidden error if the image NUMA topology conflicts with the flavor NUMA topology.
You cannot use this feature to constrain instances to specific host CPUs or NUMA nodes. Use this feature only after you complete extensive testing and performance measurements. You can use the hw:pci_numa_affinity_policy property instead.
Use the property keys in the following table to define the instance NUMA topology.
| Key | Description |
|---|---|
| hw:numa_nodes | Specifies the number of host NUMA nodes to restrict execution of instance vCPU threads to. If not specified, the vCPU threads can run on any number of the available host NUMA nodes. |
| hw:numa_cpus.N | A comma-separated list of instance vCPUs to map to instance NUMA node N. If this key is not specified, vCPUs are evenly divided among available NUMA nodes. N starts from 0. Use hw:numa_cpus.N values with caution, and only if you have at least two NUMA nodes. This property is valid only if you have set hw:numa_nodes, and is required only if the NUMA nodes of the instance have an asymmetrical allocation of CPUs and RAM. |
| hw:numa_mem.N | The number of MB of instance memory to map to instance NUMA node N. If this key is not specified, memory is evenly divided among available NUMA nodes. N starts from 0. Use hw:numa_mem.N values with caution, and only if you have at least two NUMA nodes. This property is valid only if you have set hw:numa_nodes, and is required only if the NUMA nodes of the instance have an asymmetrical allocation of CPUs and RAM. |
If the combined values of hw:numa_cpus.N or hw:numa_mem.N are greater than the available number of CPUs or memory respectively, the Compute service raises an exception.
CPU real-time policy
Use the property keys in the following table to define the real-time policy of the processors in the instance.
- Although most of your instance vCPUs can run with a real-time policy, you must mark at least one vCPU as non-real-time to use for both non-real-time guest processes and emulator overhead processes.
- To use this extra spec, you must enable pinned CPUs.
| Key | Description |
|---|---|
| hw:cpu_realtime | Set to yes to create a flavor that assigns a real-time policy to the instance vCPUs. Default: no |
| hw:cpu_realtime_mask | Specifies the vCPUs to not assign a real-time policy to. You must prepend the mask value with a caret symbol (^). The following example indicates that all vCPUs except vCPUs 0 and 1 have a real-time policy: openstack flavor set <flavor> \ --property hw:cpu_realtime="yes" \ --property hw:cpu_realtime_mask=^0-1 |
Emulator threads policy
You can assign a pCPU to an instance to use for emulator threads. Emulator threads are emulator processes that are not directly related to the instance. A dedicated emulator thread pCPU is required for real-time workloads. To use the emulator threads policy, you must enable pinned CPUs by setting the following property:
--property hw:cpu_policy=dedicated
Use the property key in the following table to define the emulator threads policy of the instance.
| Key | Description |
|---|---|
| hw:emulator_threads_policy | Specifies the emulator threads policy to use for instances. Set to one of the following valid values: share - The emulator threads float across the pCPUs defined in the NovaComputeCpuSharedSet heat parameter. isolate - Reserves an additional dedicated pCPU per instance for emulator threads. |
Instance memory page size
Use the property keys in the following table to create an instance with an explicit memory page size.
| Key | Description |
|---|---|
| hw:mem_page_size | Specifies the size of large pages to use to back the instances. Use of this option creates an implicit NUMA topology of 1 NUMA node unless otherwise specified by hw:numa_nodes. Set to one of the following valid values: large - Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems. small - (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages). any - Selects the page size by using the hw_mem_page_size set on the image. If the page size is not specified by the image, selects the largest available page size, as determined by the libvirt driver. <pagesize> - Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix, for example, 4KB, 2MB, 2048, 1GB. |
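For example, the following sketch backs instance memory with 1 GB huge pages; the flavor name is illustrative, and the host must have 1 GB huge pages configured:

    (overcloud)$ openstack flavor set huge_pages_flavor \
     --property hw:mem_page_size=1GB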
PCI passthrough
Use the property key in the following table to attach a physical PCI device, such as a graphics card or a network device, to an instance. For more information about using PCI passthrough, see Configuring PCI passthrough.
| Key | Description |
|---|---|
| pci_passthrough:alias | Specifies the PCI device to assign to an instance by using the following format: <alias>:<count> - Replace <alias> with the alias that corresponds to a particular PCI device class. - Replace <count> with the number of PCI devices of type <alias> to assign to the instance. |
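For example, the following sketch requests two PCI devices that match an alias named a1; the alias name, count, and flavor name are illustrative, and the alias must match one configured for PCI passthrough in your deployment:

    (overcloud)$ openstack flavor set pci_passthrough_flavor \
     --property "pci_passthrough:alias"="a1:2"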
Hypervisor signature
Use the property key in the following table to hide the hypervisor signature from the instance.
| Key | Description |
|---|---|
| hide_hypervisor_id | Set to true to hide the hypervisor signature from the instance, to allow all drivers to load and work on the instance. |
UEFI Secure Boot
Use the property key in the following table to create an instance that is protected with UEFI Secure Boot.
Instances with UEFI Secure Boot must support UEFI and the GUID Partition Table (GPT) standard, and include an EFI system partition.
| Key | Description |
|---|---|
| os:secure_boot | Set to required to enable UEFI Secure Boot for instances launched with this flavor. |
Instance resource traits
Each resource provider has a set of traits. Traits are the qualitative aspects of a resource provider, for example, the type of storage disk, or the Intel CPU instruction set extension. An instance can specify which of these traits it requires.
The traits that you can specify are defined in the os-traits library. Example traits include the following:
- COMPUTE_TRUSTED_CERTS
- COMPUTE_NET_ATTACH_INTERFACE_WITH_TAG
- COMPUTE_IMAGE_TYPE_RAW
- HW_CPU_X86_AVX
- HW_CPU_X86_AVX512VL
- HW_CPU_X86_AVX512CD
For details about how to use the os-traits library, see https://docs.openstack.org/os-traits/latest/user/index.html.
Use the property key in the following table to define the resource traits of the instance.
| Key | Description |
|---|---|
| trait:<trait_name> | Specifies Compute node traits. Set the trait to one of the following valid values: required - The Compute node selected to host the instance must have the trait. forbidden - The Compute node selected to host the instance must not have the trait. Example: openstack flavor set --property trait:HW_CPU_X86_AVX512BW=required avx512-flavor |
Instance bare-metal resource class
Use the property key in the following table to request a bare-metal resource class for an instance.
| Key | Description |
|---|---|
| resources:<resource_class_name> | Use this property to specify standard bare-metal resource classes to override the values of, or to specify custom bare-metal resource classes that the instance requires. The standard resource classes that you can override are VCPU, MEMORY_MB, and DISK_GB. The name of custom resource classes must start with CUSTOM_. For example, to schedule instances on a node that has the custom resource class CUSTOM_BAREMETAL_SMALL, create the following flavor: openstack flavor set \ --property resources:CUSTOM_BAREMETAL_SMALL=1 \ --property resources:VCPU=0 --property resources:MEMORY_MB=0 \ --property resources:DISK_GB=0 compute-small |
Chapter 4. Configuring CPUs on Compute nodes
As a cloud administrator, you can configure the scheduling and placement of instances for optimal performance by creating customized flavors to target specialized workloads, including NFV and High Performance Computing (HPC).
Use the following features to tune your instances for optimal CPU performance:
- CPU pinning: Pin virtual CPUs to physical CPUs.
- Emulator threads: Pin emulator threads associated with the instance to physical CPUs.
- CPU feature flags: Configure the standard set of CPU feature flags that are applied to instances to improve live migration compatibility across Compute nodes.
4.1. Configuring CPU pinning on Compute nodes
You can configure each instance CPU process to run on a dedicated host CPU by enabling CPU pinning on the Compute nodes. When an instance uses CPU pinning, each instance vCPU process is allocated its own host pCPU that no other instance vCPU process can use. Instances that run on Compute nodes with CPU pinning enabled have a NUMA topology. Each NUMA node of the instance NUMA topology maps to a NUMA node on the host Compute node.
You can configure the Compute scheduler to schedule instances with dedicated (pinned) CPUs and instances with shared (floating) CPUs on the same Compute node. To configure CPU pinning on Compute nodes that have a NUMA topology, you must complete the following:
- Designate Compute nodes for CPU pinning.
- Configure the Compute nodes to reserve host cores for pinned instance vCPU processes, floating instance vCPU processes, and host processes.
- Deploy the overcloud.
- Create a flavor for launching instances that require CPU pinning.
- Create a flavor for launching instances that use shared, or floating, CPUs.
Configuring CPU pinning creates an implicit NUMA topology on the instance even if a NUMA topology is not requested. Do not run NUMA and non-NUMA virtual machines (VMs) on the same hosts. For more information, see Constraints when using NUMA.
4.1.1. Prerequisites
- You know the NUMA topology of your Compute node.
- You have configured NovaReservedHugePages on the Compute nodes. For more information, see Configuring huge pages on Compute nodes.
4.1.2. Designating Compute nodes for CPU pinning
To designate Compute nodes for instances with pinned CPUs, you must create a new role file to configure the CPU pinning role, and configure the bare metal nodes with a CPU pinning resource class to use to tag the Compute nodes for CPU pinning.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, you must use the scale down procedure to unprovision the node, then use the scale up procedure to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

    [stack@director ~]$ source ~/stackrc

- Generate a new roles data file named roles_data_cpu_pinning.yaml that includes the Controller, Compute, and ComputeCPUPinning roles, along with any other roles that you need for the overcloud:

    (undercloud)$ openstack overcloud roles \
     generate -o /home/stack/templates/roles_data_cpu_pinning.yaml \
     Compute:ComputeCPUPinning Compute Controller

- Open roles_data_cpu_pinning.yaml and edit or add the following parameters and sections:

  | Section/Parameter | Current value | New value |
  |---|---|---|
  | Role comment | Role: Compute | Role: ComputeCPUPinning |
  | Role name | name: Compute | name: ComputeCPUPinning |
  | description | Basic Compute Node role | CPU Pinning Compute Node role |
  | HostnameFormatDefault | %stackname%-novacompute-%index% | %stackname%-novacomputepinning-%index% |
  | deprecated_nic_config_name | compute.yaml | compute-cpu-pinning.yaml |
- Register the CPU pinning Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.
- Inspect the node hardware:

    (undercloud)$ openstack overcloud node introspect \
     --all-manageable --provide

  For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.

- Tag each bare metal node that you want to designate for CPU pinning with a custom CPU pinning resource class:

    (undercloud)$ openstack baremetal node set \
     --resource-class baremetal.CPU-PINNING <node>

  Replace <node> with the ID of the bare metal node.

- Add the ComputeCPUPinning role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes. A sketch of such a node definition file follows this step.

  You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. If you do not define the network definitions by using the network_config property, then the default network definitions are used.

  For more information about the properties you can use to configure node attributes in your node definition file, see Bare metal node provisioning attributes. For an example node definition file, see Example node definition file.
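The following is a minimal sketch of an overcloud-baremetal-deploy.yaml entry for the ComputeCPUPinning role; the node counts, resource class, and network template path are assumptions for illustration:

    - name: Controller
      count: 3
    - name: Compute
      count: 1
    - name: ComputeCPUPinning
      count: 1
      defaults:
        resource_class: baremetal.CPU-PINNING
        network_config:
          template: /home/stack/templates/nic-config/<cpu_pinning_net_top>.j2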
- Run the provisioning command to provision the new nodes for your role:

    (undercloud)$ openstack overcloud node provision \
     --stack <stack> \
     [--network-config \]
     --output /home/stack/templates/overcloud-baremetal-deployed.yaml \
     /home/stack/templates/overcloud-baremetal-deploy.yaml

  - Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
  - Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you do not define the network definitions by using the network_config property, then the default network definitions are used.

- Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:

    (undercloud)$ watch openstack baremetal node list

- If you did not run the provisioning command with the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:

    parameter_defaults:
      ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
      ComputeCPUPinningNetworkConfigTemplate: /home/stack/templates/nic-configs/<cpu_pinning_net_top>.j2
      ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2

  Replace <cpu_pinning_net_top> with the name of the file that contains the network topology of the ComputeCPUPinning role, for example, compute.yaml to use the default network topology.
4.1.3. Configuring Compute nodes for CPU pinning
Configure CPU pinning on your Compute nodes based on the NUMA topology of the nodes. Reserve some CPU cores across all the NUMA nodes for the host processes for efficiency. Assign the remaining CPU cores to managing your instances.
This procedure uses the following NUMA topology, with eight CPU cores spread across two NUMA nodes, to illustrate how to configure CPU pinning:
| NUMA Node 0 | NUMA Node 1 | ||
| Core 0 | Core 1 | Core 2 | Core 3 |
| Core 4 | Core 5 | Core 6 | Core 7 |
The procedure reserves cores 0 and 4 for host processes, cores 1, 3, 5 and 7 for instances that require CPU pinning, and cores 2 and 6 for floating instances that do not require CPU pinning.
Procedure
- Create an environment file to configure Compute nodes to reserve cores for pinned instances, floating instances, and host processes, for example, cpu_pinning.yaml.
- To schedule instances with a NUMA topology on NUMA-capable Compute nodes, add NUMATopologyFilter to the NovaSchedulerEnabledFilters parameter in your Compute environment file, if not already present. For more information on NUMATopologyFilter, see Compute scheduler filters.
- To reserve physical CPU cores for the dedicated instances, add the following configuration to cpu_pinning.yaml:

    parameter_defaults:
      ComputeCPUPinningParameters:
        NovaComputeCpuDedicatedSet: 1,3,5,7

- To reserve physical CPU cores for the shared instances, add the following configuration to cpu_pinning.yaml:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        NovaComputeCpuSharedSet: 2,6

- If you are not using file-backed memory, specify the amount of RAM to reserve for host processes:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        NovaReservedHugePages: <ram>

  Replace <ram> with the amount of RAM to reserve in MB.

- To ensure that host processes do not run on the CPU cores reserved for instances, set the IsolCpusList parameter to the CPU cores you have reserved for instances:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        IsolCpusList: 1-3,5-7

  Specify the value of the IsolCpusList parameter using a list, or ranges, of CPU indices separated by a comma.

- Ensure that TunedProfileName is set to cpu-partitioning:

    parameter_defaults:
      ComputeCPUPinningParameters:
        ...
        TunedProfileName: "cpu-partitioning"

- Add your new files to the stack with your other environment files and deploy the overcloud. A sketch of the scheduler filter setting and the deployment command follows this procedure.
4.1.4. Creating a dedicated CPU flavor for instances
To enable your cloud users to create instances that have dedicated CPUs, you can create a flavor with a dedicated CPU policy for launching instances.
Prerequisites
- Simultaneous multithreading (SMT) is enabled on the host.
- The Compute node is configured to allow CPU pinning. For more information, see Configuring CPU pinning on the Compute nodes.
Procedure
- Source the overcloudrc file:

    (undercloud)$ source ~/overcloudrc

- Create a flavor for instances that require CPU pinning:

    (overcloud)$ openstack flavor create --ram <size_mb> \
     --disk <size_gb> --vcpus <no_reserved_vcpus> pinned_cpus

- To request pinned CPUs, set the hw:cpu_policy property of the flavor to dedicated:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_policy=dedicated pinned_cpus

- If you are not using file-backed memory, set the hw:mem_page_size property of the flavor to enable NUMA-aware memory allocation:

    (overcloud)$ openstack flavor set \
     --property hw:mem_page_size=<page_size> pinned_cpus

  Replace <page_size> with one of the following valid values:

  - large: Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
  - small: (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
  - any: Selects the page size by using the hw_mem_page_size set on the image. If the page size is not specified by the image, selects the largest available page size, as determined by the libvirt driver.
  - <pagesize>: Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.

  Note: To set hw:mem_page_size to small or any, you must have configured the amount of memory pages to reserve on each NUMA node for processes that are not instances. For more information, see Configuring huge pages on Compute nodes.

- To place each vCPU on thread siblings, set the hw:cpu_thread_policy property of the flavor to require:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_thread_policy=require pinned_cpus

  Note:

  - If the host does not have an SMT architecture or enough CPU cores with available thread siblings, scheduling fails. To prevent this, set hw:cpu_thread_policy to prefer instead of require. The prefer policy is the default policy that ensures that thread siblings are used when available.
  - If you use hw:cpu_thread_policy=isolate, you must have SMT disabled or use a platform that does not support SMT.
Verification
To verify that the flavor creates an instance with dedicated CPUs, use your new flavor to launch an instance:

    (overcloud)$ openstack server create --flavor pinned_cpus \
     --image <image> pinned_cpu_instance
4.1.6. Creating a mixed CPU flavor for instances
To enable your cloud users to create instances that have a mix of dedicated and shared CPUs, you can create a flavor with a mixed CPU policy for launching instances.
Procedure
- Source the overcloudrc file:

    (undercloud)$ source ~/overcloudrc

- Create a flavor for instances that require a mix of dedicated and shared CPUs:

    (overcloud)$ openstack flavor create --ram <size_mb> \
     --disk <size_gb> --vcpus <number_of_reserved_vcpus> \
     --property hw:cpu_policy=mixed mixed_CPUs_flavor

- Specify which CPUs must be dedicated or shared:

    (overcloud)$ openstack flavor set \
     --property hw:cpu_dedicated_mask=<CPU_number> \
     mixed_CPUs_flavor

  Replace <CPU_number> with the CPUs that must be either dedicated or shared:

  - To specify dedicated CPUs, specify the CPU number or CPU range. For example, set the property to 2-3 to specify that CPUs 2 and 3 are dedicated and all the remaining CPUs are shared.
  - To specify shared CPUs, prepend the CPU number or CPU range with a caret (^). For example, set the property to ^0-1 to specify that CPUs 0 and 1 are shared and all the remaining CPUs are dedicated.

- If you are not using file-backed memory, set the hw:mem_page_size property of the flavor to enable NUMA-aware memory allocation:

    (overcloud)$ openstack flavor set \
     --property hw:mem_page_size=<page_size> mixed_CPUs_flavor

  Replace <page_size> with one of the following valid values:

  - large: Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
  - small: (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
  - any: Selects the page size by using the hw_mem_page_size set on the image. If the page size is not specified by the image, selects the largest available page size, as determined by the libvirt driver.
  - <pagesize>: Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.

  Note: To set hw:mem_page_size to small or any, you must have configured the amount of memory pages to reserve on each NUMA node for processes that are not instances. For more information, see Configuring huge pages on Compute nodes.
4.1.7. Configuring CPU pinning on Compute nodes with simultaneous multithreading (SMT)
If a Compute node supports simultaneous multithreading (SMT), group thread siblings together in either the dedicated or the shared set. Thread siblings share some common hardware which means it is possible for a process running on one thread sibling to impact the performance of the other thread sibling.
For example, the host identifies four logical CPU cores in a dual core CPU with SMT: 0, 1, 2, and 3. Of these four, there are two pairs of thread siblings:
- Thread sibling 1: logical CPU cores 0 and 2
- Thread sibling 2: logical CPU cores 1 and 3
In this scenario, do not assign logical CPU cores 0 and 1 as dedicated and 2 and 3 as shared. Instead, assign 0 and 2 as dedicated and 1 and 3 as shared.
The files /sys/devices/system/cpu/cpuN/topology/thread_siblings_list, where N is the logical CPU number, contain the thread pairs. You can use the following command to identify which logical CPU cores are thread siblings:
# grep -H . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | sort -n -t ':' -k 2 -u

The following output indicates that logical CPU cores 0 and 2 are thread siblings on one core, and logical CPU cores 1 and 3 are thread siblings on another core:

/sys/devices/system/cpu/cpu0/topology/thread_siblings_list:0,2
/sys/devices/system/cpu/cpu1/topology/thread_siblings_list:1,3
4.1.8. Additional resources
4.2. Configuring emulator threads
Compute nodes have overhead tasks associated with the hypervisor for each instance, known as emulator threads. By default, emulator threads run on the same CPUs as the instance, which impacts the performance of the instance.
You can configure the emulator thread policy to run emulator threads on separate CPUs to those the instance uses.
- To avoid packet loss, you must never preempt the vCPUs in an NFV deployment.
Prerequisites
- CPU pinning must be enabled.
Procedure
- Log in to the undercloud as the stack user.
- Open your Compute environment file.
- To reserve physical CPU cores for instances that require CPU pinning, configure the NovaComputeCpuDedicatedSet parameter in the Compute environment file. For example, the following configuration sets the dedicated CPUs on a Compute node with a 32-core CPU:

    parameter_defaults:
      ...
      NovaComputeCpuDedicatedSet: 2-15,18-31
      ...

  For more information, see Configuring CPU pinning on the Compute nodes.

- To reserve physical CPU cores for the emulator threads, configure the NovaComputeCpuSharedSet parameter in the Compute environment file. For example, the following configuration sets the shared CPUs on a Compute node with a 32-core CPU:

    parameter_defaults:
      ...
      NovaComputeCpuSharedSet: 0,1,16,17
      ...

  Note: The Compute scheduler also uses the CPUs in the shared set for instances that run on shared, or floating, CPUs. For more information, see Configuring CPU pinning on Compute nodes.
- Add the Compute scheduler filter NUMATopologyFilter to the NovaSchedulerEnabledFilters parameter, if not already present.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
     -e [your environment files] \
     -e /home/stack/templates/<compute_environment_file>.yaml

- Configure a flavor that runs emulator threads for the instance on a dedicated CPU, which is selected from the shared CPUs configured using NovaComputeCpuSharedSet:

    (overcloud)$ openstack flavor set --property hw:cpu_policy=dedicated \
     --property hw:emulator_threads_policy=share \
     dedicated_emulator_threads

  For more information about configuration options for hw:emulator_threads_policy, see Emulator threads policy in Flavor metadata.
4.3. Configuring CPU feature flags for instances
You can enable or disable CPU feature flags for an instance without changing the settings on the host Compute node and rebooting the Compute node. By configuring the standard set of CPU feature flags that are applied to instances, you help to achieve live migration compatibility across Compute nodes. You also help to manage the performance and security of instances, by disabling flags that have a negative impact on the security or performance of instances with a particular CPU model, or by enabling flags that provide mitigation for a security problem or alleviate performance problems.
4.3.1. Configuring CPU feature flags for instances
Configure the Compute service to apply CPU feature flags to instances with specific vCPU models.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

    [stack@director ~]$ source ~/stackrc

- Open your Compute environment file.
- Configure the instance CPU mode:

    parameter_defaults:
      ComputeParameters:
        NovaLibvirtCPUMode: <cpu_mode>

  Replace <cpu_mode> with the CPU mode of each instance on the Compute node. Set to one of the following valid values:

  - host-model: (Default) Use the CPU model of the host Compute node. Use this CPU mode to automatically add critical CPU flags to the instance to provide mitigation from security flaws.
  - custom: Use to configure the specific CPU models each instance should use.

  Note: You can also set the CPU mode to host-passthrough to use the same CPU model and feature flags as the Compute node for the instances hosted on that Compute node.

- Optional: If you set NovaLibvirtCPUMode to custom, configure the instance CPU models that you want to customize:

    parameter_defaults:
      ComputeParameters:
        NovaLibvirtCPUMode: 'custom'
        NovaLibvirtCPUModels: <cpu_model>

  Replace <cpu_model> with a list of the CPU models that the host supports. List the CPU models in order, placing the more common and less advanced CPU models first in the list, and the more feature-rich CPU models last, for example:

    NovaLibvirtCPUModels:
      - SandyBridge
      - IvyBridge
      - Haswell-noTSX-IBRS

  For a list of model names, see the /usr/share/libvirt/cpu_map.xml file, or use one of the following commands on the host Compute node:

  - For a RHEL version 8.4 Compute node:

      $ sudo podman exec -it nova_libvirt virsh cpu-models <arch>

  - For a RHEL version 9.2 Compute node:

      $ sudo podman exec -it nova_virtqemud virsh cpu-models <arch>

  Replace <arch> with the name of the architecture of the Compute node, for example, x86_64.
- Configure the CPU feature flags for instances with the specified CPU models:

    parameter_defaults:
      ComputeParameters:
        ...
        NovaLibvirtCPUModelExtraFlags: <cpu_feature_flags>

  Replace <cpu_feature_flags> with a comma-separated list of feature flags to enable or disable. Prefix each flag with "+" to enable the flag, or "-" to disable it. If a prefix is not specified, the flag is enabled. For a list of the available feature flags for a given CPU model, see /usr/share/libvirt/cpu_map/*.xml.

  For example, you can enable the CPU feature flags pcid and ssbd for the IvyBridge and Cascadelake-Server models, and disable the feature flag mtrr. A reconstructed sketch of this configuration follows the procedure.

- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
    (undercloud)$ openstack overcloud deploy --templates \
     -e [your environment files] \
     -e /home/stack/templates/<compute_environment_file>.yaml
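The following is a sketch of the feature flag example referenced in the procedure above; it assumes the custom CPU mode and the IvyBridge and Cascadelake-Server models described there:

    parameter_defaults:
      ComputeParameters:
        NovaLibvirtCPUMode: 'custom'
        NovaLibvirtCPUModels:
          - IvyBridge
          - Cascadelake-Server
        NovaLibvirtCPUModelExtraFlags: 'pcid, +ssbd, -mtrr'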
Chapter 5. Configuring memory on Compute nodes
As a cloud administrator, you can configure the scheduling and placement of instances for optimal performance by creating customized flavors to target specialized workloads, including NFV and High Performance Computing (HPC).
Use the following features to tune your instances for optimal memory performance:
- Overallocation: Tune the virtual RAM to physical RAM allocation ratio.
- Swap: Tune the allocated swap size to handle memory overcommit.
- Huge pages: Tune instance memory allocation policies both for normal memory (4k pages) and huge pages (2 MB or 1 GB pages).
- File-backed memory: Use to expand your Compute node memory capacity.
- SEV: Use to enable your cloud users to create instances that use memory encryption.
5.1. Configuring memory for overallocation
When you use memory overcommit (NovaRAMAllocationRatio >= 1.0), you need to deploy your overcloud with enough swap space to support the allocation ratio.
If your NovaRAMAllocationRatio parameter is set to < 1, follow the RHEL recommendations for swap size. For more information, see Recommended system swap space in the RHEL Managing Storage Devices guide.
Prerequisites
- You have calculated the swap size your node requires. For more information, see Calculating swap size.
Procedure
- Copy the /usr/share/openstack-tripleo-heat-templates/environments/enable-swap.yaml file to your environment file directory:

    $ cp /usr/share/openstack-tripleo-heat-templates/environments/enable-swap.yaml /home/stack/templates/enable-swap.yaml

- Configure the swap size by adding the following parameters to your enable-swap.yaml file:

    parameter_defaults:
      swap_size_megabytes: <swap size in MB>
      swap_path: <full path to location of swap, default: /swap>

- Add the enable-swap.yaml environment file to the stack with your other environment files and deploy the overcloud:

    (undercloud)$ openstack overcloud deploy --templates \
     -e [your environment files] \
     -e /home/stack/templates/enable-swap.yaml
5.2. Calculating reserved host memory on Compute nodes
To determine the total amount of RAM to reserve for host processes, you need to allocate enough memory for each of the following:
- The resources that run on the host, for example, an OSD consumes 3 GB of memory.
- The emulator overhead required to host instances.
- The hypervisor for each instance.
After you calculate the additional demands on memory, use the following formula to help you determine the amount of memory to reserve for host processes on each node:
NovaReservedHostMemory = total_RAM - ( (vm_no * (avg_instance_size + overhead)) + (resource1 * resource_ram) + (resourcen * resource_ram))
- Replace vm_no with the number of instances.
- Replace avg_instance_size with the average amount of memory each instance can use.
- Replace overhead with the hypervisor overhead required for each instance.
- Replace resource1 and all resources up to <resourcen> with the number of a resource type on the node.
- Replace resource_ram with the amount of RAM each resource of this type requires.
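As a worked illustration with assumed numbers: a node with 262144 MB of total RAM that hosts 40 instances averaging 4096 MB each, with 512 MB of hypervisor overhead per instance, plus 2 OSDs that consume 3072 MB each, gives NovaReservedHostMemory = 262144 - ((40 * (4096 + 512)) + (2 * 3072)) = 262144 - 190464 = 71680 MB.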
5.3. Calculating swap size
The allocated swap size must be large enough to handle any memory overcommit. You can use the following formulas to calculate the swap size your node requires:
- overcommit_ratio = NovaRAMAllocationRatio - 1
- Minimum swap size (MB) = (total_RAM * overcommit_ratio) + RHEL_min_swap
- Recommended (maximum) swap size (MB) = total_RAM * (overcommit_ratio + percentage_of_RAM_to_use_for_swap)
The percentage_of_RAM_to_use_for_swap variable creates a buffer to account for QEMU overhead and any other resources consumed by the operating system or host services.
For instance, to use 25% of the available RAM for swap, with 64GB total RAM, and NovaRAMAllocationRatio set to 1:
- Recommended (maximum) swap size = 64000 MB * (0 + 0.25) = 16000 MB
For information about how to calculate the NovaReservedHostMemory value, see Calculating reserved host memory on Compute nodes.
For information about how to determine the RHEL_min_swap value, see Recommended system swap space in the RHEL Managing Storage Devices guide.
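As a further illustration with assumed numbers: with 64000 MB of total RAM and NovaRAMAllocationRatio set to 1.5, overcommit_ratio = 1.5 - 1 = 0.5, so the minimum swap size = (64000 * 0.5) + RHEL_min_swap = 32000 MB + RHEL_min_swap.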
5.4. Configuring huge pages on Compute nodes
As a cloud administrator, you can configure Compute nodes to enable instances to request huge pages.
Configuring huge pages creates an implicit NUMA topology on the instance even if a NUMA topology is not requested.
Procedure
- Open your Compute environment file.
Configure the amount of huge page memory to reserve on each NUMA node for processes that are not instances:

parameter_defaults:
  ComputeParameters:
    NovaReservedHugePages: ["node:0,size:1GB,count:1","node:1,size:1GB,count:1"]

Replace the size value for each node with the size of the allocated huge page. Set to one of the following valid values:

- 2048 (for 2 MB)
- 1GB

Replace the count value for each node with the number of huge pages used by OVS per NUMA node. For example, for 4096 MB of socket memory used by Open vSwitch, set this to 2.
Configure huge pages on the Compute nodes:

parameter_defaults:
  ComputeParameters:
    ...
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32"

Note: If you configure multiple huge page sizes, you must also mount the huge page folders during first boot. For more information, see Mounting multiple huge page folders during first boot.
Optional: To allow instances to allocate 1GB huge pages, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include pdpe1gb:

parameter_defaults:
  ComputeParameters:
    NovaLibvirtCPUMode: 'custom'
    NovaLibvirtCPUModels: 'Haswell-noTSX'
    NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb'

Note:

- CPU feature flags do not need to be configured to allow instances to only request 2 MB huge pages.
- You can only allocate 1G huge pages to an instance if the host supports 1G huge page allocation.
- You only need to set NovaLibvirtCPUModelExtraFlags to pdpe1gb when NovaLibvirtCPUMode is set to host-model or custom.
- If the host supports pdpe1gb, and host-passthrough is used as the NovaLibvirtCPUMode, then you do not need to set pdpe1gb as a NovaLibvirtCPUModelExtraFlags. The pdpe1gb flag is only included in the Opteron_G4 and Opteron_G5 CPU models; it is not included in any of the Intel CPU models supported by QEMU.
- To mitigate CPU hardware issues, such as Microarchitectural Data Sampling (MDS), you might need to configure other CPU flags. For more information, see RHOS Mitigation for MDS ("Microarchitectural Data Sampling") Security Flaws.
To avoid loss of performance after applying Meltdown protection, configure the CPU feature flags, NovaLibvirtCPUModelExtraFlags, to include +pcid:

parameter_defaults:
  ComputeParameters:
    NovaLibvirtCPUMode: 'custom'
    NovaLibvirtCPUModels: 'Haswell-noTSX'
    NovaLibvirtCPUModelExtraFlags: 'vmx, pdpe1gb, +pcid'

Tip: For more information, see Reducing the performance impact of Meltdown CVE fixes for OpenStack guests with "PCID" CPU feature flag.
- Add NUMATopologyFilter to the NovaSchedulerEnabledFilters parameter, if not already present.

Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml
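Optionally, after the deployment completes and the Compute nodes reboot with the new kernel arguments, you can confirm that the kernel allocated the requested huge pages by checking /proc/meminfo on a Compute node, for example:

$ grep Huge /proc/meminfo

The HugePages_Total and Hugepagesize values in the output should reflect the KernelArgs configuration above.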
5.4.1. Creating a huge pages flavor for instances
To enable your cloud users to create instances that use huge pages, you can create a flavor with the hw:mem_page_size extra spec key for launching instances.
Prerequisites
- The Compute node is configured for huge pages. For more information, see Configuring huge pages on Compute nodes.
Procedure
Create a flavor for instances that require huge pages:

$ openstack flavor create --ram <size_mb> --disk <size_gb> \
  --vcpus <no_reserved_vcpus> huge_pages

To request huge pages, set the hw:mem_page_size property of the flavor to the required size:

$ openstack flavor set huge_pages --property hw:mem_page_size=<page_size>

Replace <page_size> with one of the following valid values:

- large: Selects the largest page size supported on the host, which may be 2 MB or 1 GB on x86_64 systems.
- small: (Default) Selects the smallest page size supported on the host. On x86_64 systems this is 4 kB (normal pages).
- any: Selects the page size by using the hw_mem_page_size set on the image. If the page size is not specified by the image, selects the largest available page size, as determined by the libvirt driver.
- <pagesize>: Set an explicit page size if the workload has specific requirements. Use an integer value for the page size in KB, or any standard suffix. For example: 4KB, 2MB, 2048, 1GB.

To verify that the flavor creates an instance with huge pages, use your new flavor to launch an instance:

$ openstack server create --flavor huge_pages \
  --image <image> huge_pages_instance

The Compute scheduler identifies a host with enough free huge pages of the required size to back the memory of the instance. If the scheduler is unable to find a host and NUMA node with enough pages, then the request fails with a NoValidHost error.
5.4.2. Mounting multiple huge page folders during first boot
You can configure the Compute service (nova) to handle multiple page sizes as part of the first boot process. The first boot process adds the heat template configuration to all nodes the first time you boot the nodes. Subsequent inclusion of these templates, such as updating the overcloud stack, does not run these scripts.
Procedure
Create a first boot template file, hugepages.yaml, that runs a script to create the mounts for the huge page folders. You can use the OS::TripleO::MultipartMime resource type to send the configuration script.
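A minimal sketch of such a first boot template, assuming the standard OS::Heat::MultipartMime and OS::Heat::SoftwareConfig resources, the default /dev/hugepages<size> mount locations, and the 2M and 1G page sizes described in the task list that follows; adjust the hostname filter and page sizes for your environment:

heat_template_version: wallaby

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: hugepages_config}

  hugepages_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        # Only act on Compute hosts; adjust the pattern for your hostnames.
        hostname | grep -qiE 'co?mp' || exit 0
        # Mask the default hugepages mount so per-size mounts can be created.
        systemctl mask dev-hugepages.mount
        for pagesize in 2M 1G; do
          if ! [ -d "/dev/hugepages${pagesize}" ]; then
            mkdir -p "/dev/hugepages${pagesize}"
            # Create a systemd mount unit for this page size.
            cat <<EOF > /etc/systemd/system/dev-hugepages${pagesize}.mount
        [Unit]
        Description=${pagesize} hugepages mount
        [Mount]
        What=hugetlbfs
        Where=/dev/hugepages${pagesize}
        Type=hugetlbfs
        Options=pagesize=${pagesize}
        [Install]
        WantedBy=sysinit.target
        EOF
          fi
        done
        # Pick up the newly created unit files, then enable each mount.
        systemctl daemon-reload
        for pagesize in 2M 1G; do
          systemctl enable dev-hugepages${pagesize}.mount
        done

outputs:
  OS::stack_id:
    value: {get_resource: userdata}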
The config script in this template performs the following tasks:

- Filters the hosts to create the mounts for the huge page folders on, by specifying hostnames that match 'co?mp'. You can update the filter grep pattern for specific computes as required.
- Masks the default dev-hugepages.mount systemd unit file to enable new mounts to be created using the page size.
- Ensures that the folders are created first.
- Creates systemd mount units for each pagesize.
- Runs systemd daemon-reload after the first loop, to include the newly created unit files.
- Enables each mount for 2M and 1G pagesizes. You can update this loop to include additional pagesizes, as required.
Optional: The /dev folder is automatically bind mounted to the nova_compute and nova_libvirt containers. If you have used a different destination for the huge page mounts, then you need to pass the mounts to the nova_compute and nova_libvirt containers:

parameter_defaults:
  NovaComputeOptVolumes:
    - /opt/dev:/opt/dev
  NovaLibvirtOptVolumes:
    - /opt/dev:/opt/dev

Register your heat template as the
OS::TripleO::NodeUserData resource type in your ~/templates/firstboot.yaml environment file:

resource_registry:
  OS::TripleO::NodeUserData: ./hugepages.yaml

Important: You can only register the NodeUserData resources to one heat template for each resource. Subsequent usage overrides the heat template to use.

Add your first boot environment file to the stack with your other environment files and deploy the overcloud:

(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/firstboot.yaml \
  ...
5.5. Configuring Compute nodes to use file-backed memory for instances
You can use file-backed memory to expand your Compute node memory capacity, by allocating files within the libvirt memory backing directory as instance memory. You can configure the amount of host disk that is available for instance memory, and the location on the disk of the instance memory files.
The Compute service reports the capacity configured for file-backed memory to the Placement service as the total system memory capacity. This allows the Compute node to host more instances than would normally fit within the system memory.
To use file-backed memory for instances, you must enable file-backed memory on the Compute node.
Limitations
- You cannot live migrate instances between Compute nodes that have file-backed memory enabled and Compute nodes that do not have file-backed memory enabled.
- File-backed memory is not compatible with huge pages. Instances that use huge pages cannot start on a Compute node with file-backed memory enabled. Use host aggregates to ensure that instances that use huge pages are not placed on Compute nodes with file-backed memory enabled.
- File-backed memory is not compatible with memory overcommit.
- You cannot reserve memory for host processes using NovaReservedHostMemory. When file-backed memory is in use, reserved memory corresponds to disk space not set aside for file-backed memory. File-backed memory is reported to the Placement service as the total system memory, with RAM used as cache memory.
Prerequisites
- NovaRAMAllocationRatio must be set to "1.0" on the node and any host aggregate the node is added to.
- NovaReservedHostMemory must be set to "0".
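For reference, a minimal environment file sketch that combines these prerequisites with the file-backed memory size configured in the procedure that follows (the 102400 MiB value is illustrative, and the parameters are set globally here rather than per role):

parameter_defaults:
  NovaRAMAllocationRatio: 1.0
  NovaReservedHostMemory: 0
  NovaLibvirtFileBackedMemory: 102400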
Procedure
- Open your Compute environment file.
Configure the amount of host disk space, in MiB, to make available for instance RAM, by adding the following parameter to your Compute environment file:

parameter_defaults:
  NovaLibvirtFileBackedMemory: 102400

Optional: To configure the directory to store the memory backing files, set the QemuMemoryBackingDir parameter in your Compute environment file. If not set, the memory backing directory defaults to /var/lib/libvirt/qemu/ram/.

Note: You must locate your backing store in a directory at or above the default directory location, /var/lib/libvirt/qemu/ram/.

You can also change the host disk for the backing store. For more information, see Changing the memory backing directory host disk.

When the Compute service is not configured to use Red Hat Ceph Storage as nova’s storage back end, and the memory backing path as defined by QemuMemoryBackingDir is on the same block device as /var/lib/nova, you must reserve the same amount of host disk space as you allocated in NovaLibvirtFileBackedMemory, for example:

parameter_defaults:
  ComputeExtraConfig:
    nova::compute::reserved_host_disk: 102400

- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml
5.5.1. Changing the memory backing directory host disk
You can move the memory backing directory from the default primary disk location to an alternative disk.
Procedure
Create a file system on the alternative backing device. For example, enter the following command to create an ext4 file system on /dev/sdb:

# mkfs.ext4 /dev/sdb

Mount the backing device. For example, enter the following command to mount /dev/sdb on the default libvirt memory backing directory:

# mount /dev/sdb /var/lib/libvirt/qemu/ram

Note: The mount point must match the value of the QemuMemoryBackingDir parameter.
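To keep the mount in place across Compute node reboots, you would typically also add a matching entry to /etc/fstab; a minimal sketch, assuming the /dev/sdb example above:

/dev/sdb  /var/lib/libvirt/qemu/ram  ext4  defaults  0  0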
5.6. Configuring AMD SEV Compute nodes to provide memory encryption for instances
As a cloud administrator, you can provide cloud users the ability to create instances that run on SEV-capable Compute nodes with memory encryption enabled.
This feature is available from the 2nd Gen AMD EPYC™ 7002 Series ("Rome") onwards.
To enable your cloud users to create instances that use memory encryption, you must perform the following tasks:
- Designate the AMD SEV Compute nodes for memory encryption.
- Configure the Compute nodes for memory encryption.
- Deploy the overcloud.
- Create a flavor or image for launching instances with memory encryption.
If the AMD SEV hardware is limited, you can also configure a host aggregate to optimize scheduling on the AMD SEV Compute nodes. To schedule only instances that request memory encryption on the AMD SEV Compute nodes, create a host aggregate of the Compute nodes that have the AMD SEV hardware, and configure the Compute scheduler to place only instances that request memory encryption on the host aggregate. For more information, see Creating and managing host aggregates and Filtering by isolating host aggregates.
5.6.1. Secure Encrypted Virtualization (SEV)
Secure Encrypted Virtualization (SEV), provided by AMD, protects the data in DRAM that a running virtual machine instance is using. SEV encrypts the memory of each instance with a unique key.
SEV increases security when you use non-volatile memory technology (NVDIMM), because an NVDIMM chip can be physically removed from a system with the data intact, similar to a hard drive. Without encryption, any stored information such as sensitive data, passwords, or secret keys can be compromised.
For more information, see the AMD Secure Encrypted Virtualization (SEV) documentation.
Limitations of instances with memory encryption
- You cannot live migrate, or suspend and resume instances with memory encryption.
- You cannot use PCI passthrough to directly access devices on instances with memory encryption.
- You cannot use virtio-blk as the boot disk of instances with memory encryption with Red Hat Enterprise Linux (RHEL) kernels earlier than kernel-4.18.0-115.el8 (RHEL-8.1.0).

  Note: You can use virtio-scsi or SATA as the boot disk, or virtio-blk for non-boot disks.

- The operating system that runs in an encrypted instance must provide SEV support. For more information, see the Red Hat Knowledgebase solution Enabling AMD Secure Encrypted Virtualization in RHEL 8.
- Machines that support SEV have a limited number of slots in their memory controller for storing encryption keys. Each running instance with encrypted memory consumes one of these slots. Therefore, the number of instances with memory encryption that can run concurrently is limited to the number of slots in the memory controller. For example, on 1st Gen AMD EPYC™ 7001 Series ("Naples") the limit is 16, and on 2nd Gen AMD EPYC™ 7002 Series ("Rome") the limit is 255.
- Instances with memory encryption pin pages in RAM. The Compute service cannot swap these pages, therefore you cannot overcommit memory on a Compute node that hosts instances with memory encryption.
- You cannot use memory encryption with instances that have multiple NUMA nodes.
5.6.2. Designating AMD SEV Compute nodes for memory encryption
To designate AMD SEV Compute nodes for instances that use memory encryption, you must create a new role file to configure the AMD SEV role, and configure the bare metal nodes with an AMD SEV resource class to use to tag the Compute nodes for memory encryption.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, you must use the scale down procedure to unprovision the node, then use the scale up procedure to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
- Log in to the undercloud as the stack user.

Source the stackrc file:

[stack@director ~]$ source ~/stackrc
Generate a new roles data file that includes the ComputeAMDSEV role, along with any other roles that you need for the overcloud. The following example generates the roles data file roles_data_amd_sev.yaml, which includes the roles Controller and ComputeAMDSEV:

(undercloud)$ openstack overcloud roles \
  generate -o /home/stack/templates/roles_data_amd_sev.yaml \
  Compute:ComputeAMDSEV Controller

Open roles_data_amd_sev.yaml and edit or add the following parameters and sections:

| Section/Parameter | Current value | New value |
|---|---|---|
| Role comment | Role: Compute | Role: ComputeAMDSEV |
| Role name | name: Compute | name: ComputeAMDSEV |
| description | Basic Compute Node role | AMD SEV Compute Node role |
| HostnameFormatDefault | %stackname%-novacompute-%index% | %stackname%-novacomputeamdsev-%index% |
| deprecated_nic_config_name | compute.yaml | compute-amd-sev.yaml |
- Register the AMD SEV Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.

Inspect the node hardware:
(undercloud)$ openstack overcloud node introspect \
  --all-manageable --provide

For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.
Tag each bare metal node that you want to designate for memory encryption with a custom AMD SEV resource class:
(undercloud)$ openstack baremetal node set \
  --resource-class baremetal.AMD-SEV <node>

Replace <node> with the name or ID of the bare metal node.

Add the ComputeAMDSEV role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes.
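A minimal sketch of a ComputeAMDSEV entry in overcloud-baremetal-deploy.yaml; the counts, resource class, and NIC template path are illustrative:

- name: Controller
  count: 3
- name: ComputeAMDSEV
  count: 1
  defaults:
    resource_class: baremetal.AMD-SEV
    network_config:
      template: /home/stack/templates/nic-config/amd-sev-net-top.j2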
You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. If you do not define the network definitions by using the network_config property, then the default network definitions are used.
For more information about the properties you can use to configure node attributes in your node definition file, see Bare metal node provisioning attributes. For an example node definition file, see Example node definition file.
Run the provisioning command to provision the new nodes for your role:
(undercloud)$ openstack overcloud node provision \
  --stack <stack> \
  [--network-config \]
  --output /home/stack/templates/overcloud-baremetal-deployed.yaml \
  /home/stack/templates/overcloud-baremetal-deploy.yaml

- Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
- Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you do not define the network definitions by using the network_config property, then the default network definitions are used.

Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:

(undercloud)$ watch openstack baremetal node list

If you did not run the provisioning command with the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:

parameter_defaults:
  ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
  ComputeAMDSEVNetworkConfigTemplate: /home/stack/templates/nic-configs/<amd_sev_net_top>.j2
  ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2

Replace <amd_sev_net_top> with the name of the file that contains the network topology of the ComputeAMDSEV role, for example, compute.yaml to use the default network topology.
5.6.3. Configuring AMD SEV Compute nodes for memory encryption
To enable your cloud users to create instances that use memory encryption, you must configure the Compute nodes that have the AMD SEV hardware.
From RHOSP 17.0 onwards, Q35 is the default machine type. The Q35 machine type uses PCIe ports. You can manage the number of PCIe port devices by configuring the heat parameter NovaLibvirtNumPciePorts. The number of devices that can attach to a PCIe port is lower than on instances running on previous versions. If you want to use more devices, you must use the hw_disk_bus=scsi or hw_scsi_model=virtio-scsi image property. For more information, see Metadata properties for virtual hardware.
Prerequisites
Your deployment must include a Compute node that runs on AMD hardware capable of supporting SEV, such as an AMD EPYC CPU. You can use the following command to determine if your deployment is SEV-capable:
$ lscpu | grep sev
Procedure
- Open your Compute environment file.
Optional: Add the following configuration to your Compute environment file to specify the maximum number of memory-encrypted instances that the AMD SEV Compute nodes can host concurrently:
parameter_defaults:
  ComputeAMDSEVExtraConfig:
    nova::config::nova_config:
      libvirt/num_memory_encrypted_guests:
        value: 15

Note: The default value of the libvirt/num_memory_encrypted_guests parameter is none. If you do not set a custom value, the AMD SEV Compute nodes do not impose a limit on the number of memory-encrypted instances that the nodes can host concurrently. Instead, the hardware determines the maximum number of memory-encrypted instances that the AMD SEV Compute nodes can host concurrently, which might cause some memory-encrypted instances to fail to launch.

Optional: To specify that all x86_64 images use the q35 machine type by default, add the following configuration to your Compute environment file:

parameter_defaults:
  ComputeAMDSEVParameters:
    NovaHWMachineType: x86_64=q35

If you specify this parameter value, you do not need to set the hw_machine_type property to q35 on every AMD SEV instance image.

To ensure that the AMD SEV Compute nodes reserve enough memory for host-level services to function, add 16 MB for each potential AMD SEV instance:

parameter_defaults:
  ComputeAMDSEVParameters:
    ...
    NovaReservedHostMemory: <libvirt/num_memory_encrypted_guests * 16>

For example, with the libvirt/num_memory_encrypted_guests value of 15 shown above, add 15 * 16 = 240 MB to the memory that you reserve for host processes.

Configure the kernel parameters for the AMD SEV Compute nodes:

parameter_defaults:
  ComputeAMDSEVParameters:
    ...
    KernelArgs: "hugepagesz=1GB hugepages=32 default_hugepagesz=1GB kvm_amd.sev=1"

Note: When you first add the KernelArgs parameter to the configuration of a role, the overcloud nodes are automatically rebooted. If required, you can disable the automatic rebooting of nodes and instead perform node reboots manually after each overcloud deployment. For more information, see Configuring manual node reboot to define KernelArgs.

- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml
5.6.4. Creating an image for memory encryption
When the overcloud contains AMD SEV Compute nodes, you can create an AMD SEV instance image that your cloud users can use to launch instances that have memory encryption.
From RHOSP 17.0 onwards, Q35 is the default machine type. The Q35 machine type uses PCIe ports. You can manage the number of PCIe port devices by configuring the heat parameter NovaLibvirtNumPciePorts. The number of devices that can attach to a PCIe port is lower than on instances running on previous versions. If you want to use more devices, you must use the hw_disk_bus=scsi or hw_scsi_model=virtio-scsi image property. For more information, see Metadata properties for virtual hardware.
Procedure
Create a new image for memory encryption:
(overcloud)$ openstack image create ... \
  --property hw_firmware_type=uefi amd-sev-image

Note: If you use an existing image, the image must have the hw_firmware_type property set to uefi.

Optional: Add the property hw_mem_encryption=True to the image to enable AMD SEV memory encryption on the image:

(overcloud)$ openstack image set \
  --property hw_mem_encryption=True amd-sev-image

Tip: You can enable memory encryption on the flavor. For more information, see Creating a flavor for memory encryption.

Optional: Set the machine type to q35, if not already set in the Compute node configuration:

(overcloud)$ openstack image set \
  --property hw_machine_type=q35 amd-sev-image

Optional: To schedule memory-encrypted instances on a SEV-capable host aggregate, add the following trait to the image extra specs:

(overcloud)$ openstack image set \
  --property trait:HW_CPU_X86_AMD_SEV=required amd-sev-image

Tip: You can also specify this trait on the flavor. For more information, see Creating a flavor for memory encryption.
5.6.5. Creating a flavor for memory encryption
When the overcloud contains AMD SEV Compute nodes, you can create one or more AMD SEV flavors that your cloud users can use to launch instances that have memory encryption.
An AMD SEV flavor is necessary only when the hw_mem_encryption property is not set on an image.
Procedure
Create a flavor for memory encryption:
(overcloud)$ openstack flavor create --vcpus 1 --ram 512 --disk 2 \
  --property hw:mem_encryption=True m1.small-amd-sev

To schedule memory-encrypted instances on a SEV-capable host aggregate, add the following trait to the flavor extra specs:

(overcloud)$ openstack flavor set \
  --property trait:HW_CPU_X86_AMD_SEV=required m1.small-amd-sev
5.6.6. Launching an instance with memory encryption
To verify that you can launch instances on an AMD SEV Compute node with memory encryption enabled, use a memory encryption flavor or image to create an instance.
Procedure
Create an instance by using an AMD SEV flavor or image. The following example creates an instance by using the flavor created in Creating a flavor for memory encryption and the image created in Creating an image for memory encryption:
(overcloud)$ openstack server create --flavor m1.small-amd-sev \
  --image amd-sev-image amd-sev-instance

- Log in to the instance as a cloud user.
To verify that the instance uses memory encryption, enter the following command from the instance:
$ dmesg | grep -i sev
AMD Secure Encrypted Virtualization (SEV) active
Chapter 6. Configuring Compute service storage
You create an instance from a base image, which the Compute service copies from the Image (glance) service, and caches locally on the Compute nodes. The instance disk, which is the back end for the instance, is also based on the base image.
You can configure the Compute service to store ephemeral instance disk data locally on the host Compute node or remotely on either an NFS share or Ceph cluster. Alternatively, you can also configure the Compute service to store instance disk data in persistent storage provided by the Block Storage (Cinder) service.
You can configure image caching for your environment, and configure the performance and security of the instance disks. You can also configure the Compute service to download images directly from the RBD image repository without using the Image service API, when the Image service (glance) uses Red Hat Ceph RADOS Block Device (RBD) as the back end.
6.1. Configuration options for image caching
Use the parameters detailed in the following table to configure how the Compute service implements and manages an image cache on Compute nodes.
| Configuration method | Parameter | Description |
|---|---|---|
| Puppet |
|
Specifies the number of seconds to wait between runs of the image cache manager, which manages base image caching on Compute nodes. The Compute service uses this period to perform automatic removal of unused cached images when
Set to
Default: |
| Puppet |
| Specifies the maximum number of Compute nodes that can pre-cache images in parallel. Note
Default: |
| Puppet |
|
Set to
Default: |
| Puppet |
|
Specifies the minimum age that an unused resized base image must be to be removed from the cache, in seconds. Unused resized base images younger than this will not be removed. Set to
Default: |
| Puppet |
|
Specifies the name of the folder where cached images are stored, relative to
Default: |
| Heat |
| Specifies the length of time in seconds that the Compute service should continue caching an image when it is no longer used by any instances on the Compute node. The Compute service deletes images cached on the Compute node that are older than this configured lifetime from the cache directory until they are needed again. Default: 86400 (24 hours) |
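For example, the Heat-managed cache lifetime in the last row of this table is set in an environment file; a minimal sketch, assuming the NovaImageCacheTTL heat parameter and an illustrative 12-hour lifetime:

parameter_defaults:
  NovaImageCacheTTL: 43200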
6.2. Configuration options for instance ephemeral storage properties
Use the parameters detailed in the following table to configure the performance and security of ephemeral storage used by instances.
Red Hat OpenStack Platform (RHOSP) does not support the LVM image type for instance disks. Therefore, the [libvirt]/volume_clear configuration option, which wipes ephemeral disks when instances are deleted, is not supported because it only applies when the instance disk image type is LVM.
| Configuration method | Parameter | Description |
|---|---|---|
| Puppet |
| Specifies the default format that is used for a new ephemeral volume. Set to one of the following valid values:
The
Default: |
| Puppet |
|
Set to
Default: |
| Puppet |
|
Set to
Set to
Default: |
| Puppet |
| Specifies the preallocation mode for instance disks. Set to one of the following valid values:
Default: |
| Hieradata override |
|
Set to This parameter is not enabled by default because it enables the direct mounting of images which might otherwise be disabled for security reasons.
Default: |
| Hieradata override |
| Specifies the image type to use for instance disks. Set to one of the following valid values:
Note RHOSP does not support the LVM image type for instance disks.
When set to a valid value other than
The default value is determined by the configuration of
|
6.3. Configuring the maximum number of storage devices to attach to one instance
By default, you can attach an unlimited number of storage devices to a single instance. Attaching a large number of disk devices to an instance can degrade performance on the instance. You can tune the maximum number of devices that can be attached to an instance based on the boundaries of what your environment can support. The number of storage disks supported by an instance depends on the bus that the disk uses. For example, the IDE disk bus is limited to 4 attached devices. You can attach a maximum of 500 disk devices to instances with machine type Q35.
From RHOSP 17.0 onwards, Q35 is the default machine type. The Q35 machine type uses PCIe ports. You can manage the number of PCIe port devices by configuring the heat parameter NovaLibvirtNumPciePorts. The number of devices that can attach to a PCIe port is lower than on instances running on previous versions. If you want to use more devices, you must use the hw_disk_bus=scsi or hw_scsi_model=virtio-scsi image property. For more information, see Metadata properties for virtual hardware.
- Changing the value of the NovaMaxDiskDevicesToAttach parameter on a Compute node with active instances can cause rebuilds to fail if the maximum number is lower than the number of devices already attached to instances. For example, if instance A has 26 devices attached and you change NovaMaxDiskDevicesToAttach to 20, a request to rebuild instance A will fail.
- During cold migration, the configured maximum number of storage devices is enforced only on the source for the instance that you want to migrate. The destination is not checked before the move. This means that if Compute node A has 26 attached disk devices, and Compute node B has a configured maximum of 20 attached disk devices, a cold migration of an instance with 26 attached devices from Compute node A to Compute node B succeeds. However, a subsequent request to rebuild the instance in Compute node B fails because 26 devices are already attached which exceeds the configured maximum of 20.
- The configured maximum number of storage devices is not enforced on shelved offloaded instances, as they have no Compute node.
Procedure
- Log in to the undercloud host as the stack user.

Source the stackrc undercloud credentials file:

$ source ~/stackrc

- Create a new environment file, or open an existing environment file.
Configure the limit on the maximum number of storage devices that can be attached to a single instance by adding the following configuration to your environment file:
parameter_defaults:
  ...
  NovaMaxDiskDevicesToAttach: <max_device_limit>
  ...

Replace <max_device_limit> with the maximum number of storage devices that can be attached to an instance.

- Save the updates to your environment file.
Add your environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<environment_file>.yaml
6.5. Configuring image downloads directly from Red Hat Ceph RADOS Block Device (RBD)
When the Image service (glance) uses Red Hat Ceph RADOS Block Device (RBD) as the back end, and the Compute service uses local file-based ephemeral storage, you can configure the Compute service to download images directly from the RBD image repository without using the Image service API. This reduces the time it takes to download an image to the Compute node image cache at instance boot time, which improves instance launch time.
Prerequisites
- The Image service back end is a Red Hat Ceph RADOS Block Device (RBD).
- The Compute service is using a local file-based ephemeral store for the image cache and instance disks.
Procedure
- Log in to the undercloud as the stack user.
- Open your Compute environment file.
To download images directly from the RBD back end, add the following configuration to your Compute environment file:
parameter_defaults:
  ComputeParameters:
    NovaGlanceEnableRbdDownload: True
    NovaEnableRbdBackend: False
    ...

Optional: If the Image service is configured to use multiple Red Hat Ceph Storage back ends, add the following configuration to your Compute environment file to identify the RBD back end to download images from:

Replace <rbd_backend_id> with the ID used to specify the back end in the GlanceMultistoreConfig configuration, for example rbd2_store.

Add the following configuration to your Compute environment file to specify the Image service RBD back end, and the maximum length of time that the Compute service waits to connect to the Image service RBD back end, in seconds:

Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml

- To verify that the Compute service downloads images directly from RBD, create an instance then check the instance debug log for the entry "Attempting to export RBD image:".
Chapter 7. Configuring instance scheduling and placement
The Compute scheduler service determines on which Compute node or host aggregate to place an instance. When the Compute service (nova) receives a request to launch or move an instance, it uses the specifications provided in the request, the flavor, and the image to find a suitable host. For example, a flavor can specify the traits an instance requires a host to have, such as the type of storage disk, or the Intel CPU instruction set extension.
The Compute scheduler service uses the configuration of the following components, in the following order, to determine on which Compute node to launch or move an instance:
- Placement service prefilters: The Compute scheduler service uses the Placement service to filter the set of candidate Compute nodes based on specific attributes. For example, the Placement service automatically excludes disabled Compute nodes.
- Filters: Used by the Compute scheduler service to determine the initial set of Compute nodes on which to launch an instance.
- Weights: The Compute scheduler service prioritizes the filtered Compute nodes using a weighting system. The highest weight has the highest priority.
In the following diagram, host 1 and 3 are eligible after filtering. Host 1 has the highest weight and therefore has the highest priority for scheduling.
7.1. Prefiltering using the Placement service
The Compute service (nova) interacts with the Placement service when it creates and manages instances. The Placement service tracks the inventory and usage of resource providers, such as a Compute node, a shared storage pool, or an IP allocation pool, and their available quantitative resources, such as the available vCPUs. Any service that needs to manage the selection and consumption of resources can use the Placement service.
The Placement service also tracks the mapping of available qualitative resources to resource providers, such as the type of storage disk trait a resource provider has.
The Placement service applies prefilters to the set of candidate Compute nodes based on Placement service resource provider inventories and traits. You can create prefilters based on the following criteria:
- Supported image types
- Traits
- Projects or tenants
- Availability zone
7.1.1. Filtering by requested image type support
You can exclude Compute nodes that do not support the disk format of the image used to launch an instance. This is useful when your environment uses Red Hat Ceph Storage as an ephemeral backend, which does not support QCOW2 images. Enabling this feature ensures that the scheduler does not send requests to launch instances using a QCOW2 image to Compute nodes backed by Red Hat Ceph Storage.
Procedure
- Open your Compute environment file.
- To exclude Compute nodes that do not support the disk format of the image used to launch an instance, set the NovaSchedulerQueryImageType parameter to True in the Compute environment file, as shown in the sketch after this list.
- Save the updates to your Compute environment file.
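A minimal environment file sketch for this setting; it assumes the parameter is applied globally under parameter_defaults rather than per role:

parameter_defaults:
  NovaSchedulerQueryImageType: True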
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml
7.1.2. Filtering by resource provider traits
Each resource provider has a set of traits. Traits are the qualitative aspects of a resource provider, for example, the type of storage disk, or the Intel CPU instruction set extension.
The Compute node reports its capabilities to the Placement service as traits. An instance can specify which of these traits it requires, or which traits the resource provider must not have. The Compute scheduler can use these traits to identify a suitable Compute node or host aggregate to host an instance.
To enable your cloud users to create instances on hosts that have particular traits, you can define a flavor that requires or forbids a particular trait, and you can create an image that requires or forbids a particular trait.
For a list of the available traits, see the os-traits library. You can also create custom traits, as required.
Additional resources
7.1.2.1. Creating an image that requires or forbids a resource provider trait
You can create an instance image that your cloud users can use to launch instances on hosts that have particular traits.
Procedure
Create a new image:
(overcloud)$ openstack image create ... trait-image

Identify the trait you require a host or host aggregate to have. You can select an existing trait, or create a new trait:

To use an existing trait, list the existing traits to retrieve the trait name:

(overcloud)$ openstack --os-placement-api-version 1.6 trait list

To create a new trait, enter the following command:

(overcloud)$ openstack --os-placement-api-version 1.6 trait \
  create CUSTOM_TRAIT_NAME

Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.

Collect the existing resource provider traits of each host:

(overcloud)$ existing_traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')

Check the existing resource provider traits for the traits you require a host or host aggregate to have:

(overcloud)$ echo $existing_traits

If the traits you require are not already added to the resource provider, then add the existing traits and your required traits to the resource providers for each host:

(overcloud)$ openstack --os-placement-api-version 1.6 \
  resource provider trait set $existing_traits \
  --trait <TRAIT_NAME> \
  <host_uuid>

Replace <TRAIT_NAME> with the name of the trait that you want to add to the resource provider. You can use the --trait option more than once to add additional traits, as required.

Note: This command performs a full replacement of the traits for the resource provider. Therefore, you must retrieve the list of existing resource provider traits on the host and set them again to prevent them from being removed.

To schedule instances on a host or host aggregate that has a required trait, add the trait to the image extra specs. For example, to schedule instances on a host or host aggregate that supports AVX-512, add the following trait to the image extra specs:

(overcloud)$ openstack image set \
  --property trait:HW_CPU_X86_AVX512BW=required \
  trait-image

To filter out hosts or host aggregates that have a forbidden trait, add the trait to the image extra specs. For example, to prevent instances from being scheduled on a host or host aggregate that supports multi-attach volumes, add the following trait to the image extra specs:

(overcloud)$ openstack image set \
  --property trait:COMPUTE_VOLUME_MULTI_ATTACH=forbidden \
  trait-image
7.1.2.2. Creating a flavor that requires or forbids a resource provider trait
You can create flavors that your cloud users can use to launch instances on hosts that have particular traits.
Procedure
Create a flavor:
(overcloud)$ openstack flavor create --vcpus 1 --ram 512 \
  --disk 2 trait-flavor

Identify the trait you require a host or host aggregate to have. You can select an existing trait, or create a new trait:

To use an existing trait, list the existing traits to retrieve the trait name:

(overcloud)$ openstack --os-placement-api-version 1.6 trait list

To create a new trait, enter the following command:

(overcloud)$ openstack --os-placement-api-version 1.6 trait \
  create CUSTOM_TRAIT_NAME

Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.

Collect the existing resource provider traits of each host:

(overcloud)$ existing_traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')

Check the existing resource provider traits for the traits you require a host or host aggregate to have:

(overcloud)$ echo $existing_traits

If the traits you require are not already added to the resource provider, then add the existing traits and your required traits to the resource providers for each host:

(overcloud)$ openstack --os-placement-api-version 1.6 \
  resource provider trait set $existing_traits \
  --trait <TRAIT_NAME> \
  <host_uuid>

Replace <TRAIT_NAME> with the name of the trait that you want to add to the resource provider. You can use the --trait option more than once to add additional traits, as required.

Note: This command performs a full replacement of the traits for the resource provider. Therefore, you must retrieve the list of existing resource provider traits on the host and set them again to prevent them from being removed.

To schedule instances on a host or host aggregate that has a required trait, add the trait to the flavor extra specs. For example, to schedule instances on a host or host aggregate that supports AVX-512, add the following trait to the flavor extra specs:

(overcloud)$ openstack flavor set \
  --property trait:HW_CPU_X86_AVX512BW=required \
  trait-flavor

To filter out hosts or host aggregates that have a forbidden trait, add the trait to the flavor extra specs. For example, to prevent instances from being scheduled on a host or host aggregate that supports multi-attach volumes, add the following trait to the flavor extra specs:

(overcloud)$ openstack flavor set \
  --property trait:COMPUTE_VOLUME_MULTI_ATTACH=forbidden \
  trait-flavor
7.1.3. Filtering by isolating host aggregates
You can restrict scheduling on a host aggregate to only those instances whose flavor and image traits match the metadata of the host aggregate. The combination of flavor and image metadata must require all the host aggregate traits to be eligible for scheduling on Compute nodes in that host aggregate.
Procedure
- Open your Compute environment file.
- To isolate host aggregates to host only instances whose flavor and image traits match the aggregate metadata, set the NovaSchedulerEnableIsolatedAggregateFiltering parameter to True in the Compute environment file, as shown in the sketch after this list.
- Save the updates to your Compute environment file.
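A minimal environment file sketch for this setting; it assumes the parameter is applied globally under parameter_defaults rather than per role:

parameter_defaults:
  NovaSchedulerEnableIsolatedAggregateFiltering: True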
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml

Identify the traits you want to isolate the host aggregate for. You can select an existing trait, or create a new trait:
To use an existing trait, list the existing traits to retrieve the trait name:
openstack --os-placement-api-version 1.6 trait list
(overcloud)$ openstack --os-placement-api-version 1.6 trait listCopy to Clipboard Copied! Toggle word wrap Toggle overflow To create a new trait, enter the following command:
openstack --os-placement-api-version 1.6 trait \ create CUSTOM_TRAIT_NAME
(overcloud)$ openstack --os-placement-api-version 1.6 trait \ create CUSTOM_TRAIT_NAMECopy to Clipboard Copied! Toggle word wrap Toggle overflow Custom traits must begin with the prefix
CUSTOM_and contain only the letters A through Z, the numbers 0 through 9 and the underscore “_” character.
Collect the existing resource provider traits of each Compute node:
existing_traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')
(overcloud)$ existing_traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check the existing resource provider traits for the traits you want to isolate the host aggregate for:
echo $existing_traits
(overcloud)$ echo $existing_traitsCopy to Clipboard Copied! Toggle word wrap Toggle overflow If the traits you require are not already added to the resource provider, then add the existing traits and your required traits to the resource providers for each Compute node in the host aggregate:
openstack --os-placement-api-version 1.6 \ resource provider trait set $existing_traits \ --trait <TRAIT_NAME> \ <host_uuid>
(overcloud)$ openstack --os-placement-api-version 1.6 \ resource provider trait set $existing_traits \ --trait <TRAIT_NAME> \ <host_uuid>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<TRAIT_NAME>with the name of the trait that you want to add to the resource provider. You can use the--traitoption more than once to add additional traits, as required.NoteThis command performs a full replacement of the traits for the resource provider. Therefore, you must retrieve the list of existing resource provider traits on the host and set them again to prevent them from being removed.
- Repeat steps 6 - 8 for each Compute node in the host aggregate.
Add the metadata property for the trait to the host aggregate:
openstack --os-compute-api-version 2.53 aggregate set \ --property trait:<TRAIT_NAME>=required <aggregate_name>
(overcloud)$ openstack --os-compute-api-version 2.53 aggregate set \ --property trait:<TRAIT_NAME>=required <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Add the trait to a flavor or an image:
openstack flavor set \ --property trait:<TRAIT_NAME>=required <flavor> openstack image set \ --property trait:<TRAIT_NAME>=required <image>
(overcloud)$ openstack flavor set \ --property trait:<TRAIT_NAME>=required <flavor> (overcloud)$ openstack image set \ --property trait:<TRAIT_NAME>=required <image>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
7.1.4. Filtering by availability zone using the Placement service Copy linkLink copied to clipboard!
You can use the Placement service to honor availability zone requests. To use the Placement service to filter by availability zone, placement aggregates must exist that match the membership and UUID of the availability zone host aggregates.
Procedure
- Open your Compute environment file.
-
To use the Placement service to filter by availability zone, set the
NovaSchedulerQueryPlacementForAvailabilityZoneparameter toTruein the Compute environment file. -
Remove the
AvailabilityZoneFilterfilter from theNovaSchedulerEnabledFiltersparameter. - Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional resources
- For more information on creating a host aggregate to use as an availability zone, see Creating an availability zone.
7.2. Configuring filters and weights for the Compute scheduler service Copy linkLink copied to clipboard!
You configure the filters and weights for the Compute scheduler service to determine the initial set of Compute nodes on which to launch an instance.
Procedure
- Open your Compute environment file.
Add the filters you want the scheduler to use to the
NovaSchedulerEnabledFiltersparameter, for example:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Optional: By default, the Compute scheduler weighs host Compute nodes by all resource types and spreads instances evenly across all hosts. You can configure the multiplier to apply to each weigher. For example, to specify that the available RAM of a Compute node has a higher weight than the other default weighers, and that the Compute scheduler prefers Compute nodes with more available RAM over those nodes with less available RAM, use the following configuration:
parameter_defaults: ... ControllerExtraConfig: nova::scheduler::filter::ram_weight_multiplier: '2.0'parameter_defaults: ... ControllerExtraConfig: nova::scheduler::filter::ram_weight_multiplier: '2.0'Copy to Clipboard Copied! Toggle word wrap Toggle overflow TipYou can also set multipliers to a negative value. In the above example, to prefer Compute nodes with less available RAM over those nodes with more available RAM, set
ram_weight_multiplierto-2.0.For more information on the available attributes and their multipliers, see Compute scheduler weights.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional resources
7.3. Compute scheduler filters Copy linkLink copied to clipboard!
You configure the NovaSchedulerEnabledFilters parameter in your Compute environment file to specify the filters the Compute scheduler must apply when selecting an appropriate Compute node to host an instance. The default configuration applies the following filters:
-
AvailabilityZoneFilter: The Compute node must be in the requested availability zone. -
ComputeFilter: The Compute node can service the request. -
ComputeCapabilitiesFilter: The Compute node satisfies the flavor extra specs. -
ImagePropertiesFilter: The Compute node satisfies the requested image properties. -
ServerGroupAntiAffinityFilter: The Compute node is not already hosting an instance in a specified group. -
ServerGroupAffinityFilter: The Compute node is already hosting instances in a specified group.
You can add and remove filters. The following table describes all the available filters.
| Filter | Description |
|---|---|
|
| Use this filter to match the image metadata of an instance with host aggregate metadata. If any of the host aggregate metadata matches the metadata of the image, then the Compute nodes that belong to that host aggregate are candidates for launching instances from that image. The scheduler only recognises valid image metadata properties. For details on valid image metadata properties, see Image configuration parameters. |
|
| Use this filter to match namespaced properties defined in the flavor extra specs of an instance with host aggregate metadata.
You must scope your flavor If any of the host aggregate metadata matches the metadata of the flavor extra spec, then the Compute nodes that belong to that host aggregate are candidates for launching instances from that image. |
|
|
Use this filter to filter hosts by I/O operations with a per-aggregate |
|
|
Use this filter to limit the availability of Compute nodes in project-isolated host aggregates to a specified set of projects. Only projects specified using the Note
The project can still place instances on other hosts. To restrict this, use the |
|
|
Use this filter to limit the number of instances each Compute node in an aggregate can host. You can configure the maximum number of instances per-aggregate by using the |
|
|
Use this filter to pass hosts if no flavor metadata key is set, or the flavor aggregate metadata value contains the name of the requested flavor. The value of the flavor metadata entry is a string that may contain either a single flavor name or a comma-separated list of flavor names, such as |
|
| Use this filter to consider all available Compute nodes for instance scheduling. Note Using this filter does not disable other filters. |
|
| Use this filter to launch instances on a Compute node in the availability zone specified by the instance. |
|
|
Use this filter to match namespaced properties defined in the flavor extra specs of an instance against the Compute node capabilities. You must prefix the flavor extra specs with the
A more efficient alternative to using the |
|
| Use this filter to pass all Compute nodes that are operational and enabled. This filter should always be present. |
|
|
Use this filter to enable scheduling of an instance on a different Compute node from a set of specific instances. To specify these instances when launching an instance, use the openstack server create --image cedef40a-ed67-4d10-800e-17455edce175 \ --flavor 1 --hint different_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 \ --hint different_host=8c19174f-4220-44f0-824a-cd1eeef10287 server-1
|
|
| Use this filter to filter Compute nodes based on the following properties defined on the instance image:
Compute nodes that can support the specified image properties contained in the instance are passed to the scheduler. For more information on image properties, see Image configuration parameters. |
|
|
Use this filter to only schedule instances with isolated images on isolated Compute nodes. You can also prevent non-isolated images from being used to build instances on isolated Compute nodes by configuring
To specify the isolated set of images and hosts use the |
|
|
Use this filter to filter out hosts that have concurrent I/O operations that exceed the configured |
|
|
Use this filter to limit scheduling to Compute nodes that report the metrics configured by using To use this filter, add the following configuration to your Compute environment file: parameter_defaults:
ComputeExtraConfig:
nova::config::nova_config:
DEFAULT/compute_monitors:
value: 'cpu.virt_driver'
By default, the Compute scheduler service updates the metrics every 60 seconds. |
|
|
Use this filter to schedule instances with a NUMA topology on NUMA-capable Compute nodes. Use flavor |
|
|
Use this filter to filter out Compute nodes that have more instances running than specified by the |
|
|
Use this filter to schedule instances on Compute nodes that have the devices that the instance requests by using the flavor Use this filter if you want to reserve nodes with PCI devices, which are typically expensive and limited, for instances that request them. |
|
|
Use this filter to enable scheduling of an instance on the same Compute node as a set of specific instances. To specify these instances when launching an instance, use the openstack server create --image cedef40a-ed67-4d10-800e-17455edce175 \ --flavor 1 --hint same_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 \ --hint same_host=8c19174f-4220-44f0-824a-cd1eeef10287 server-1
|
|
| Use this filter to schedule instances in an affinity server group on the same Compute node. To create the server group, enter the following command: openstack server group create --policy affinity <group_name>
To launch an instance in this group, use the openstack server create --image <image> \ --flavor <flavor> \ --hint group=<group_uuid> <instance_name>
|
|
| Use this filter to schedule instances that belong to an anti-affinity server group on different Compute nodes. To create the server group, enter the following command: openstack server group create --policy anti-affinity <group_name>
To launch an instance in this group, use the openstack server create --image <image> \ --flavor <flavor> \ --hint group=<group_uuid> <instance_name>
|
|
|
Use this filter to schedule instances on Compute nodes that have a specific IP subnet range. To specify the required range, use the openstack server create --image <image> \ --flavor <flavor> \ --hint build_near_host_ip=<ip_address> \ --hint cidr=<subnet_mask> <instance_name>
|
7.4. Compute scheduler weights Copy linkLink copied to clipboard!
Each Compute node has a weight that the scheduler can use to prioritize instance scheduling. After the Compute scheduler applies the filters, it selects the Compute node with the largest weight from the remaining candidate Compute nodes.
The Compute scheduler determines the weight of each Compute node by performing the following tasks:
- The scheduler normalizes each weight to a value between 0.0 and 1.0.
- The scheduler multiplies the normalized weight by the weigher multiplier.
The Compute scheduler calculates the weight normalization for each resource type by using the lower and upper values for the resource availability across the candidate Compute nodes:
- Nodes with the lowest availability of a resource (minval) are assigned '0'.
- Nodes with the highest availability of a resource (maxval) are assigned '1'.
Nodes with resource availability within the minval - maxval range are assigned a normalized weight calculated by using the following formula:
(node_resource_availability - minval) / (maxval - minval)
(node_resource_availability - minval) / (maxval - minval)Copy to Clipboard Copied! Toggle word wrap Toggle overflow
If all the Compute nodes have the same availability for a resource then they are all normalized to 0.
For example, the scheduler calculates the normalized weights for available vCPUs across 10 Compute nodes, each with a different number of available vCPUs, as follows:
| Compute node | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| No of vCPUs | 5 | 5 | 10 | 10 | 15 | 20 | 20 | 15 | 10 | 5 |
| Normalized weight | 0 | 0 | 0.33 | 0.33 | 0.67 | 1 | 1 | 0.67 | 0.33 | 0 |
The Compute scheduler uses the following formula to calculate the weight of a Compute node:
(w1_multiplier * norm(w1)) + (w2_multiplier * norm(w2)) + ...
(w1_multiplier * norm(w1)) + (w2_multiplier * norm(w2)) + ...
The following table describes the available configuration options for weights. To customize a weight class and multiplier, use the following syntax to configure the option on the Controller:
ControllerExtraConfig: nova::scheduler::filter::scheduler_weight_classes: 'nova.scheduler.weights.ram.RAMWeigher' nova::scheduler::filter::ram_weight_multiplier: '2.0'
ControllerExtraConfig:
nova::scheduler::filter::scheduler_weight_classes: 'nova.scheduler.weights.ram.RAMWeigher'
nova::scheduler::filter::ram_weight_multiplier: '2.0'
Weights can be set on host aggregates using the aggregate metadata key with the same name as the options detailed in the following table. If set on the host aggregate, the host aggregate value takes precedence.
| Configuration option | Type | Description |
|---|---|---|
|
| String | Use this parameter to configure which of the following attributes to use for calculating the weight of each Compute node:
|
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available RAM. Set to a positive value to prefer hosts with more available RAM, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available RAM, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the RAM weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available disk space. Set to a positive value to prefer hosts with more available disk space, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available disk space, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the disk weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the available vCPUs. Set to a positive value to prefer hosts with more available vCPUs, which spreads instances across many hosts. Set to a negative value to prefer hosts with less available vCPUs, which fills up (stacks) hosts as much as possible before scheduling to a less-used host. The absolute value, whether positive or negative, controls how strong the vCPU weigher is relative to other weighers. Default: 1.0 - The scheduler spreads instances across all hosts evenly. |
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the host workload. Set to a negative value to prefer hosts with lighter workloads, which distributes the workload across more hosts. Set to a positive value to prefer hosts with heavier workloads, which schedules instances onto hosts that are already busy. The absolute value, whether positive or negative, controls how strong the I/O operations weigher is relative to other weighers. Default: -1.0 - The scheduler distributes the workload across more hosts. |
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts based on recent build failures. Set to a positive value to increase the significance of build failures recently reported by the host. Hosts with recent build failures are then less likely to be chosen.
Set to Default: 1000000.0 |
|
| Floating point | Use this parameter to specify the multiplier to use to weigh hosts during a cross-cell move. This option determines how much weight is placed on a host which is within the same source cell when moving an instance. By default, the scheduler prefers hosts within the same source cell when migrating an instance. Set to a positive value to prefer hosts within the same cell the instance is currently running. Set to a negative value to prefer hosts located in a different cell from that where the instance is currently running. Default: 1000000.0 |
|
| Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts based on the number of PCI devices on the host and the number of PCI devices requested by an instance. If an instance requests PCI devices, then the more PCI devices a Compute node has the higher the weight allocated to the Compute node. For example, if there are three hosts available, one with a single PCI device, one with multiple PCI devices and one without any PCI devices, then the Compute scheduler prioritizes these hosts based on the demands of the instance. The scheduler should prefer the first host if the instance requests one PCI device, the second host if the instance requires multiple PCI devices and the third host if the instance does not request a PCI device. Configure this option to prevent non-PCI instances from occupying resources on hosts with PCI devices. Default: 1.0 |
|
| Integer | Use this parameter to specify the size of the subset of filtered hosts from which to select the host. You must set this option to at least 1. A value of 1 selects the first host returned by the weighing functions. The scheduler ignores any value less than 1 and uses 1 instead. Set to a value greater than 1 to prevent multiple scheduler processes handling similar requests selecting the same host, creating a potential race condition. By selecting a host randomly from the N hosts that best fit the request, the chance of a conflict is reduced. However, the higher you set this value, the less optimal the chosen host may be for a given request. Default: 1 |
|
| Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts for group soft-affinity. Note You need to specify the microversion when creating a group with this policy: openstack --os-compute-api-version 2.15 server group create --policy soft-affinity <group_name>
Default: 1.0 |
|
| Positive floating point | Use this parameter to specify the multiplier to use to weigh hosts for group soft-anti-affinity. Note You need to specify the microversion when creating a group with this policy: openstack --os-compute-api-version 2.15 server group create --policy soft-affinity <group_name>
Default: 1.0 |
7.5. Declaring custom traits and resource classes Copy linkLink copied to clipboard!
As an administrator, you can declare which custom physical features and consumable resources are available on the Red Hat OpenStack Platform (RHOSP) overcloud nodes by using one of the following methods:
-
By defining a custom inventory of resources in a YAML file,
provider.yaml - By defining a custom inventory of resources that apply to a particular node role.
You can declare the availability of physical host features by defining custom traits, such as CUSTOM_DIESEL_BACKUP_POWER, CUSTOM_FIPS_COMPLIANT, and CUSTOM_HPC_OPTIMIZED. You can also declare the availability of consumable resources by defining resource classes, such as CUSTOM_DISK_IOPS, and CUSTOM_POWER_WATTS.
7.5.1. Declaring custom traits and resource classes in a YAML file Copy linkLink copied to clipboard!
As an administrator, you can declare which custom physical features and consumable resources are available on the Red Hat OpenStack Platform (RHOSP) overcloud nodes by defining a custom inventory of resources in a YAML file, provider.yaml.
You can declare the availability of physical host features by defining custom traits, such as CUSTOM_DIESEL_BACKUP_POWER, CUSTOM_FIPS_COMPLIANT, and CUSTOM_HPC_OPTIMIZED. You can also declare the availability of consumable resources by defining resource classes, such as CUSTOM_DISK_IOPS, and CUSTOM_POWER_WATTS.
You can use flavor metadata to request custom resources and custom traits. For more information, see Instance bare-metal resource class and Instance resource traits.
Procedure
-
Create a file in
/home/stack/templates/namedprovider.yaml. To configure the resource provider, add the following configuration to your
provider.yamlfile:meta: schema_version: '1.0' providers: - identification: uuid: <node_uuid>meta: schema_version: '1.0' providers: - identification: uuid: <node_uuid>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<node_uuid>with the UUID for the node, for example,'5213b75d-9260-42a6-b236-f39b0fd10561'. Alternatively, you can use thenameproperty to identify the resource provider:name: 'EXAMPLE_RESOURCE_PROVIDER'.
-
Replace
To configure the available custom resource classes for the resource provider, add the following configuration to your
provider.yamlfile:Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
CUSTOM_EXAMPLE_RESOURCE_CLASSwith the name of the resource class. Custom resource classes must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore “_” character. -
Replace
<total_available>with the number of availableCUSTOM_EXAMPLE_RESOURCE_CLASSfor this resource provider. -
Replace
<reserved>with the number of reservedCUSTOM_EXAMPLE_RESOURCE_CLASSfor this resource provider. -
Replace
<min_unit>with the minimum units of resources a single instance can consume. -
Replace
<max_unit>with the maximum units of resources a single instance can consume. -
Replace
<step_size>with the number of increments of consumption. -
Replace
<allocation_ratio>with the value to set the allocation ratio for the resource. Set to1.0to prevent overallocation. Set to a value greater than1.0to increase the availability of resource to more than the physical hardware.
-
Replace
To configure the available traits for the resource provider, add the following configuration to your
provider.yamlfile:Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
CUSTOM_EXAMPLE_TRAITwith the name of the trait. Custom traits must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore “_” character.Example
provider.yamlfileThe following example declares one custom resource class and one custom trait for a resource provider.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- This hypervisor has 22 units of last level cache (LLC).
- 2
- Two of the units of LLC are reserved for the host.
- 3 4
- The min_unit and max_unit values define how many units of resources a single VM can consume.
- 5
- The step size defines the increments of consumption.
- 6
- The allocation ratio configures the overallocation of resources.
-
Save and close the
provider.yamlfile. Add the
provider.yamlfile to the stack with your other environment files and deploy the overcloud:openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/provider.yaml
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/provider.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.5.2. Declaring custom traits and resource classes for a role Copy linkLink copied to clipboard!
To declare custom traits and resource classes for a role, you must configure the CustomProviderInventories parameter in the role file.
Procedure
Generate a new roles data file named
roles_data_custom_traits.yamlthat includes the Controller and Compute roles, along with any other roles that you need for the overcloud:openstack overcloud roles \ generate -o /home/stack/templates/roles_data_roles_data_custom_traits.yaml \ Compute:Compute Controller
(undercloud)$ openstack overcloud roles \ generate -o /home/stack/templates/roles_data_roles_data_custom_traits.yaml \ Compute:Compute ControllerCopy to Clipboard Copied! Toggle word wrap Toggle overflow Use the following example configuration to configure the available custom resource classes and traits for the resource provider:
The following example declares custom resource classes and custom traits for the ComputeGpu role:
Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
CUSTOM_EXAMPLE_RESOURCE_CLASSwith the name of the resource class. Custom resource classes must begin with the prefix CUSTOM_ and contain only the letters A through Z, the numbers 0 through 9 and the underscore “_” character. -
<1>
totalis the number of availableCUSTOM_EXAMPLE_RESOURCE_CLASSfor this resource provider. -
<2>
reservedis the number of reservedCUSTOM_EXAMPLE_RESOURCE_CLASSfor this resource provider. -
<3>
min_unitis the minimum units of resources a single instance can consume. -
<4>
max_unitis the maximum units of resources a single instance can consume. -
<5>
step_sizeis the number of increments of consumption. -
<6>
allocation_ratioconfigures the overallocation of resources. If you setallocation_ratioto 1.0, then no overallocation is allowed. But ifallocation_rationis greater than 1.0, then the total available resource is more than the physically existing one.
-
Replace
- Save and close the role file.
Add the role file to the stack with your other environment files and deploy the overcloud:
openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/roles_data_roles_data_custom_traits.yaml
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/roles_data_roles_data_custom_traits.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.6. Creating and managing host aggregates Copy linkLink copied to clipboard!
As a cloud administrator, you can partition a Compute deployment into logical groups for performance or administrative purposes. Red Hat OpenStack Platform (RHOSP) provides the following mechanisms for partitioning logical groups:
- Host aggregate
A host aggregate is a grouping of Compute nodes into a logical unit based on attributes such as the hardware or performance characteristics. You can assign a Compute node to one or more host aggregates.
You can map flavors and images to host aggregates by setting metadata on the host aggregate, and then matching flavor extra specs or image metadata properties to the host aggregate metadata. The Compute scheduler can use this metadata to schedule instances when the required filters are enabled. Metadata that you specify in a host aggregate limits the use of that host to any instance that has the same metadata specified in its flavor or image.
You can configure weight multipliers for each host aggregate by setting the
xxx_weight_multiplierconfiguration option in the host aggregate metadata.You can use host aggregates to handle load balancing, enforce physical isolation or redundancy, group servers with common attributes, or separate classes of hardware.
When you create a host aggregate, you can specify a zone name. This name is presented to cloud users as an availability zone that they can select.
- Availability zones
An availability zone is the cloud user view of a host aggregate. A cloud user cannot view the Compute nodes in the availability zone, or view the metadata of the availability zone. The cloud user can only see the name of the availability zone.
You can assign each Compute node to only one availability zone. You can configure a default availability zone where instances will be scheduled when the cloud user does not specify a zone. You can direct cloud users to use availability zones that have specific capabilities.
7.6.1. Enabling scheduling on host aggregates Copy linkLink copied to clipboard!
To schedule instances on host aggregates that have specific attributes, update the configuration of the Compute scheduler to enable filtering based on the host aggregate metadata.
Procedure
- Open your Compute environment file.
Add the following values to the
NovaSchedulerEnabledFiltersparameter, if they are not already present:AggregateInstanceExtraSpecsFilter: Add this value to filter Compute nodes by host aggregate metadata that match flavor extra specs.NoteFor this filter to perform as expected, you must scope the flavor extra specs by prefixing the
extra_specskey with theaggregate_instance_extra_specs:namespace.AggregateImagePropertiesIsolation: Add this value to filter Compute nodes by host aggregate metadata that match image metadata properties.NoteTo filter host aggregate metadata by using image metadata properties, the host aggregate metadata key must match a valid image metadata property. For information about valid image metadata properties, see Image configuration parameters.
AvailabilityZoneFilter: Add this value to filter by availability zone when launching an instance.NoteInstead of using the
AvailabilityZoneFilterCompute scheduler service filter, you can use the Placement service to process availability zone requests. For more information, see Filtering by availability zone using the Placement service.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.6.2. Creating a host aggregate Copy linkLink copied to clipboard!
As a cloud administrator, you can create as many host aggregates as you require.
Procedure
To create a host aggregate, enter the following command:
openstack aggregate create <aggregate_name>
(overcloud)# openstack aggregate create <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<aggregate_name>with the name you want to assign to the host aggregate.Add metadata to the host aggregate:
openstack aggregate set \ --property <key=value> \ --property <key=value> \ <aggregate_name>
(overcloud)# openstack aggregate set \ --property <key=value> \ --property <key=value> \ <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<key=value>with the metadata key-value pair. If you are using theAggregateInstanceExtraSpecsFilterfilter, the key can be any arbitrary string, for example,ssd=true. If you are using theAggregateImagePropertiesIsolationfilter, the key must match a valid image metadata property. For more information about valid image metadata properties, see Image configuration parameters. -
Replace
<aggregate_name>with the name of the host aggregate.
-
Replace
Add the Compute nodes to the host aggregate:
openstack aggregate add host \ <aggregate_name> \ <host_name>
(overcloud)# openstack aggregate add host \ <aggregate_name> \ <host_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<aggregate_name>with the name of the host aggregate to add the Compute node to. -
Replace
<host_name>with the name of the Compute node to add to the host aggregate.
-
Replace
Create a flavor or image for the host aggregate:
Create a flavor:
openstack flavor create \ --ram <size_mb> \ --disk <size_gb> \ --vcpus <no_reserved_vcpus> \ host-agg-flavor
(overcloud)$ openstack flavor create \ --ram <size_mb> \ --disk <size_gb> \ --vcpus <no_reserved_vcpus> \ host-agg-flavorCopy to Clipboard Copied! Toggle word wrap Toggle overflow Create an image:
openstack image create host-agg-image
(overcloud)$ openstack image create host-agg-imageCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Set one or more key-value pairs on the flavor or image that match the key-value pairs on the host aggregate.
To set the key-value pairs on a flavor, use the scope
aggregate_instance_extra_specs:openstack flavor set \ --property aggregate_instance_extra_specs:ssd=true \ host-agg-flavor
(overcloud)# openstack flavor set \ --property aggregate_instance_extra_specs:ssd=true \ host-agg-flavorCopy to Clipboard Copied! Toggle word wrap Toggle overflow To set the key-value pairs on an image, use valid image metadata properties as the key:
openstack image set \ --property os_type=linux \ host-agg-image
(overcloud)# openstack image set \ --property os_type=linux \ host-agg-imageCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.6.3. Creating an availability zone Copy linkLink copied to clipboard!
As a cloud administrator, you can create an availability zone that cloud users can select when they create an instance.
Procedure
To create an availability zone, you can create a new availability zone host aggregate, or make an existing host aggregate an availability zone:
To create a new availability zone host aggregate, enter the following command:
openstack aggregate create \ --zone <availability_zone> \ <aggregate_name>
(overcloud)# openstack aggregate create \ --zone <availability_zone> \ <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<availability_zone>with the name you want to assign to the availability zone. -
Replace
<aggregate_name>with the name you want to assign to the host aggregate.
-
Replace
To make an existing host aggregate an availability zone, enter the following command:
openstack aggregate set --zone <availability_zone> \ <aggregate_name>
(overcloud)# openstack aggregate set --zone <availability_zone> \ <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<availability_zone>with the name you want to assign to the availability zone. -
Replace
<aggregate_name>with the name of the host aggregate.
-
Replace
Optional: Add metadata to the availability zone:
openstack aggregate set --property <key=value> \ <aggregate_name>
(overcloud)# openstack aggregate set --property <key=value> \ <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<key=value>with your metadata key-value pair. You can add as many key-value properties as required. -
Replace
<aggregate_name>with the name of the availability zone host aggregate.
-
Replace
Add Compute nodes to the availability zone host aggregate:
openstack aggregate add host <aggregate_name> \ <host_name>
(overcloud)# openstack aggregate add host <aggregate_name> \ <host_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<aggregate_name>with the name of the availability zone host aggregate to add the Compute node to. -
Replace
<host_name>with the name of the Compute node to add to the availability zone.
-
Replace
7.6.4. Deleting a host aggregate Copy linkLink copied to clipboard!
To delete a host aggregate, you first remove all the Compute nodes from the host aggregate.
Procedure
To view a list of all the Compute nodes assigned to the host aggregate, enter the following command:
openstack aggregate show <aggregate_name>
(overcloud)# openstack aggregate show <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow To remove all assigned Compute nodes from the host aggregate, enter the following command for each Compute node:
openstack aggregate remove host <aggregate_name> \ <host_name>
(overcloud)# openstack aggregate remove host <aggregate_name> \ <host_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<aggregate_name>with the name of the host aggregate to remove the Compute node from. -
Replace
<host_name>with the name of the Compute node to remove from the host aggregate.
-
Replace
After you remove all the Compute nodes from the host aggregate, enter the following command to delete the host aggregate:
openstack aggregate delete <aggregate_name>
(overcloud)# openstack aggregate delete <aggregate_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
7.6.5. Creating a project-isolated host aggregate Copy linkLink copied to clipboard!
You can create a host aggregate that is available only to specific projects. Only the projects that you assign to the host aggregate can launch instances on the host aggregate.
Project isolation uses the Placement service to filter host aggregates for each project. This process supersedes the functionality of the AggregateMultiTenancyIsolation filter. You therefore do not need to use the AggregateMultiTenancyIsolation filter.
Procedure
- Open your Compute environment file.
-
To schedule project instances on the project-isolated host aggregate, set the
NovaSchedulerLimitTenantsToPlacementAggregateparameter toTruein the Compute environment file. Optional: To ensure that only the projects that you assign to a host aggregate can create instances on your cloud, set the
NovaSchedulerPlacementAggregateRequiredForTenantsparameter toTrue.NoteNovaSchedulerPlacementAggregateRequiredForTenantsisFalseby default. When this parameter isFalse, projects that are not assigned to a host aggregate can create instances on any host aggregate.- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml \
(undercloud)$ openstack overcloud deploy --templates \ -e [your environment files] \ -e /home/stack/templates/<compute_environment_file>.yaml \Copy to Clipboard Copied! Toggle word wrap Toggle overflow - Create the host aggregate.
Retrieve the list of project IDs:
openstack project list
(overcloud)# openstack project listCopy to Clipboard Copied! Toggle word wrap Toggle overflow Use the
filter_tenant_id<suffix>metadata key to assign projects to the host aggregate:Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<ID0>,<ID1>, and all IDs up to<IDn>with unique values for each project filter that you want to create. -
Replace
<project_id0>,<project_id1>, and all project IDs up to<project_idn>with the ID of each project that you want to assign to the host aggregate. Replace
<aggregate_name>with the name of the project-isolated host aggregate.For example, use the following syntax to assign projects
78f1,9d3t, andaa29to the host aggregateproject-isolated-aggregate:openstack aggregate set \ --property filter_tenant_id0=78f1 \ --property filter_tenant_id1=9d3t \ --property filter_tenant_id2=aa29 \ project-isolated-aggregate
(overcloud)# openstack aggregate set \ --property filter_tenant_id0=78f1 \ --property filter_tenant_id1=9d3t \ --property filter_tenant_id2=aa29 \ project-isolated-aggregateCopy to Clipboard Copied! Toggle word wrap Toggle overflow TipYou can create a host aggregate that is available only to a single specific project by omitting the suffix from the
filter_tenant_idmetadata key:openstack aggregate set \ --property filter_tenant_id=78f1 \ single-project-isolated-aggregate
(overcloud)# openstack aggregate set \ --property filter_tenant_id=78f1 \ single-project-isolated-aggregateCopy to Clipboard Copied! Toggle word wrap Toggle overflow
-
Replace
Additional resources
- For more information on creating a host aggregate, see Creating and managing host aggregates.
Chapter 8. Configuring PCI passthrough Copy linkLink copied to clipboard!
You can use PCI passthrough to attach a physical PCI device, such as a graphics card or a network device, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Using PCI passthrough with routed provider networks
The Compute service does not support single networks that span multiple provider networks. When a network contains multiple physical networks, the Compute service only uses the first physical network. Therefore, if you are using routed provider networks you must use the same physical_network name across all the Compute nodes.
If you use routed provider networks with VLAN or flat networks, you must use the same physical_network name for all segments. You then create multiple segments for the network and map the segments to the appropriate subnets.
To enable your cloud users to create instances with PCI devices attached, you must complete the following:
- Designate Compute nodes for PCI passthrough.
- Configure the Compute nodes for PCI passthrough that have the required PCI devices.
- Deploy the overcloud.
- Create a flavor for launching instances with PCI devices attached.
Prerequisites
- The Compute nodes have the required PCI devices.
8.1. Designating Compute nodes for PCI passthrough Copy linkLink copied to clipboard!
To designate Compute nodes for instances with physical PCI devices attached, you must create a new role file to configure the PCI passthrough role, and configure the bare metal nodes with a PCI passthrough resource class to use to tag the Compute nodes for PCI passthrough.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, you must use the scale down procedure to unprovision the node, then use the scale up procedure to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
-
Log in to the undercloud as the
stackuser. Source the
stackrcfile:source ~/stackrc
[stack@director ~]$ source ~/stackrcCopy to Clipboard Copied! Toggle word wrap Toggle overflow Generate a new roles data file named
roles_data_pci_passthrough.yamlthat includes theController,Compute, andComputePCIroles, along with any other roles that you need for the overcloud:openstack overcloud roles \ generate -o /home/stack/templates/roles_data_pci_passthrough.yaml \ Compute:ComputePCI Compute Controller
(undercloud)$ openstack overcloud roles \ generate -o /home/stack/templates/roles_data_pci_passthrough.yaml \ Compute:ComputePCI Compute ControllerCopy to Clipboard Copied! Toggle word wrap Toggle overflow Open
roles_data_pci_passthrough.yamland edit or add the following parameters and sections:Expand Section/Parameter Current value New value Role comment
Role: ComputeRole: ComputePCIRole name
name: Computename: ComputePCIdescriptionBasic Compute Node rolePCI Passthrough Compute Node roleHostnameFormatDefault%stackname%-novacompute-%index%%stackname%-novacomputepci-%index%deprecated_nic_config_namecompute.yamlcompute-pci-passthrough.yaml-
Register the PCI passthrough Compute nodes for the overcloud by adding them to your node definition template,
node.jsonornode.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide. Inspect the node hardware:
openstack overcloud node introspect \ --all-manageable --provide
(undercloud)$ openstack overcloud node introspect \ --all-manageable --provideCopy to Clipboard Copied! Toggle word wrap Toggle overflow For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.
Tag each bare metal node that you want to designate for PCI passthrough with a custom PCI passthrough resource class:
openstack baremetal node set \ --resource-class baremetal.PCI-PASSTHROUGH <node>
(undercloud)$ openstack baremetal node set \ --resource-class baremetal.PCI-PASSTHROUGH <node>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<node>with the ID of the bare metal node.Add the
ComputePCIrole to your node definition file,overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes:Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1 1
- You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. If you do not define the network definitions by using the
network_configproperty, then the default network definitions are used.
For more information about the properties you can use to configure node attributes in your node definition file, see Bare metal node provisioning attributes. For an example node definition file, see Example node definition file.
Run the provisioning command to provision the new nodes for your role:
openstack overcloud node provision \ --stack <stack> \ [--network-config \]
(undercloud)$ openstack overcloud node provision \ --stack <stack> \ [--network-config \] --output /home/stack/templates/overcloud-baremetal-deployed.yaml \ /home/stack/templates/overcloud-baremetal-deploy.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow -
Replace
<stack>with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default isovercloud. -
Include the
--network-configoptional argument to provide the network definitions to thecli-overcloud-node-network-config.yamlAnsible playbook. If you do not define the network definitions by using thenetwork_configproperty, then the default network definitions are used.
-
Replace
Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from
availabletoactive:watch openstack baremetal node list
(undercloud)$ watch openstack baremetal node listCopy to Clipboard Copied! Toggle word wrap Toggle overflow If you did not run the provisioning command with the
--network-configoption, then configure the<Role>NetworkConfigTemplateparameters in yournetwork-environment.yamlfile to point to your NIC template files:parameter_defaults: ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2 ComputePCINetworkConfigTemplate: /home/stack/templates/nic-configs/<pci_passthrough_net_top>.j2 ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2
parameter_defaults: ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2 ComputePCINetworkConfigTemplate: /home/stack/templates/nic-configs/<pci_passthrough_net_top>.j2 ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<pci_passthrough_net_top>with the name of the file that contains the network topology of theComputePCIrole, for example,compute.yamlto use the default network topology.
8.2. Configuring a PCI passthrough Compute node Copy linkLink copied to clipboard!
To enable your cloud users to create instances with PCI devices attached, you must configure both the Compute nodes that have the PCI devices and the Controller nodes.
Procedure
-
Create an environment file to configure the Controller node on the overcloud for PCI passthrough, for example,
pci_passthrough_controller.yaml. Add
PciPassthroughFilterto theNovaSchedulerEnabledFiltersparameter inpci_passthrough_controller.yaml:Copy to Clipboard Copied! Toggle word wrap Toggle overflow To specify the PCI alias for the devices on the Controller node, add the following configuration to
pci_passthrough_controller.yaml:Copy to Clipboard Copied! Toggle word wrap Toggle overflow For more information about configuring the
device_typefield, see PCI passthrough device type field.NoteIf the
nova-apiservice is running in a role different from theControllerrole, replaceControllerExtraConfigwith the user role in the format<Role>ExtraConfig.Optional: To set a default NUMA affinity policy for PCI passthrough devices, add
numa_policyto thenova::pci::aliases:configuration from step 3:Copy to Clipboard Copied! Toggle word wrap Toggle overflow -
To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example,
pci_passthrough_compute.yaml. To specify the available PCIs for the devices on the Compute node, use the
vendor_idandproduct_idoptions to add all matching PCI devices to the pool of PCI devices available for passthrough to instances. For example, to add all Intel® Ethernet Controller X710 devices to the pool of PCI devices available for passthrough to instances, add the following configuration topci_passthrough_compute.yaml:Copy to Clipboard Copied! Toggle word wrap Toggle overflow For more information about how to configure
NovaPCIPassthrough, see Guidelines for configuringNovaPCIPassthrough.You must create a copy of the PCI alias on the Compute node for instance migration and resize operations. To specify the PCI alias for the devices on the PCI passthrough Compute node, add the following to
pci_passthrough_compute.yaml:Copy to Clipboard Copied! Toggle word wrap Toggle overflow NoteThe Compute node aliases must be identical to the aliases on the Controller node. Therefore, if you added
numa_affinitytonova::pci::aliasesinpci_passthrough_controller.yaml, then you must also add it tonova::pci::aliasesinpci_passthrough_compute.yaml.To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the
KernelArgsparameter topci_passthrough_compute.yaml:Use the following
KernalArgssettings to enable an Intel IOMMU:parameter_defaults: ... ComputePCIParameters: KernelArgs: "intel_iommu=on iommu=pt vfio-pci.ids=<pci_device_id> rd.driver.pre=vfio-pci"parameter_defaults: ... ComputePCIParameters: KernelArgs: "intel_iommu=on iommu=pt vfio-pci.ids=<pci_device_id> rd.driver.pre=vfio-pci"Copy to Clipboard Copied! Toggle word wrap Toggle overflow - Replace <pci_device_id> with the PCI device ID, for example, 10de:1eb8.
Use the following KernalArgs settings to enable an AMD IOMMU:
parameter_defaults: ... ComputePCIParameters: KernelArgs: "amd_iommu=on iommu=pt"parameter_defaults: ... ComputePCIParameters: KernelArgs: "amd_iommu=on iommu=pt"Copy to Clipboard Copied! Toggle word wrap Toggle overflow NoteWhen you first add the
KernelArgsparameter to the configuration of a role, the overcloud nodes are automatically rebooted. If required, you can disable the automatic rebooting of nodes and instead perform node reboots manually after each deployment. For more information, see Configuring manual node reboot to defineKernelArgs.
Add your custom environment files to the stack with your other environment files and deploy the overcloud:
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Create and configure the flavors that your cloud users can use to request the PCI devices. The following example requests two devices, each with a vendor ID of
8086and a product ID of1572, using the alias defined in step 7:openstack flavor set \ --property "pci_passthrough:alias"="a1:2" device_passthrough
(overcloud)$ openstack flavor set \ --property "pci_passthrough:alias"="a1:2" device_passthroughCopy to Clipboard Copied! Toggle word wrap Toggle overflow Optional: To override the default NUMA affinity policy for PCI passthrough devices, you can add the NUMA affinity policy property key to the flavor or the image:
To override the default NUMA affinity policy by using the flavor, add the
hw:pci_numa_affinity_policyproperty key:openstack flavor set \ --property "hw:pci_numa_affinity_policy"="required" \ device_passthrough
(overcloud)$ openstack flavor set \ --property "hw:pci_numa_affinity_policy"="required" \ device_passthroughCopy to Clipboard Copied! Toggle word wrap Toggle overflow For more information about the valid values for
hw:pci_numa_affinity_policy, see Flavor metadata.To override the default NUMA affinity policy by using the image, add the
hw_pci_numa_affinity_policyproperty key:openstack image set \ --property hw_pci_numa_affinity_policy=required \ device_passthrough_image
(overcloud)$ openstack image set \ --property hw_pci_numa_affinity_policy=required \ device_passthrough_imageCopy to Clipboard Copied! Toggle word wrap Toggle overflow NoteIf you set the NUMA affinity policy on both the image and the flavor then the property values must match. The flavor setting takes precedence over the image and default settings. Therefore, the configuration of the NUMA affinity policy on the image only takes effect if the property is not set on the flavor.
Verification
Create an instance with a PCI passthrough device:
openstack server create --flavor device_passthrough \ --image <image> --wait test-pci
$ openstack server create --flavor device_passthrough \ --image <image> --wait test-pciCopy to Clipboard Copied! Toggle word wrap Toggle overflow - Log in to the instance as a cloud user. For more information, see Connecting to an instance.
To verify that the PCI device is accessible from the instance, enter the following command from the instance:
lspci -nn | grep <device_name>
$ lspci -nn | grep <device_name>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
8.3. PCI passthrough device type field Copy linkLink copied to clipboard!
The Compute service categorizes PCI devices into one of three types, depending on the capabilities the devices report. The following lists the valid values that you can set the device_type field to:
type-PF- The device supports SR-IOV and is the parent or root device. Specify this device type to passthrough a device that supports SR-IOV in its entirety.
type-VF- The device is a child device of a device that supports SR-IOV.
type-PCI-
The device does not support SR-IOV. This is the default device type if the
device_typefield is not set.
You must configure the Compute and Controller nodes with the same device_type.
8.4. Guidelines for configuring NovaPCIPassthrough Copy linkLink copied to clipboard!
-
Do not use the
devnameparameter when configuring PCI passthrough, as the device name of a NIC can change. Instead, usevendor_idandproduct_idbecause they are more stable, or use theaddressof the NIC. -
To pass through a specific Physical Function (PF), you can use the
addressparameter because the PCI address is unique to each device. Alternatively, you can use theproduct_idparameter to pass through a PF, but you must also specify theaddressof the PF if you have multiple PFs of the same type. -
To pass through all the Virtual Functions (VFs) specify only the
product_idandvendor_idof the VFs that you want to use for PCI passthrough. You must also specify theaddressof the VF if you are using SRIOV for NIC partitioning and you are running OVS on a VF. -
To pass through only the VFs for a PF but not the PF itself, you can use the
addressparameter to specify the PCI address of the PF andproduct_idto specify the product ID of the VF.
Configuring the address parameter
The address parameter specifies the PCI address of the device. You can set the value of the address parameter by using either a string or a dictionary mapping.

- String format
  If you specify the address by using a string, you can include wildcards (*), as shown in the following example:

  NovaPCIPassthrough:
    - address: "*:0a:00.*"
      physical_network: physnet1

- Dictionary format
  If you specify the address by using the dictionary format, you can include regular expression syntax, as shown in the following example:
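The following values are an illustrative sketch; the domain, bus, slot, and function expressions are placeholders that you replace with patterns for your own devices:

  NovaPCIPassthrough:
    - address:
        domain: ".*"
        bus: "02"
        slot: "01"
        function: "[0-2]"
      physical_network: physnet1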
The Compute service restricts the configuration of address fields to the following maximum values:
- domain - 0xFFFF
- bus - 0xFF
- slot - 0x1F
- function - 0x7
The Compute service supports PCI devices with a 16-bit address domain. The Compute service ignores PCI devices with a 32-bit address domain.
Chapter 9. Configuring VDPA Compute nodes to enable instances that use VDPA ports
VIRTIO data path acceleration (VDPA) provides wirespeed data transfer over VIRTIO. A VDPA device provides a VIRTIO abstraction over an SR-IOV virtual function (VF), which enables VFs to be consumed without a vendor-specific driver on the instance.
When you use a NIC as a VDPA interface it must be dedicated to the VDPA interface. You cannot use the NIC for other connections because you must configure the NIC’s physical function (PF) in switchdev mode and manage the PF by using hardware offloaded OVS.
To enable your cloud users to create instances that use VDPA ports, complete the following tasks:
- Optional: Designate Compute nodes for VDPA.
- Configure the Compute nodes that have the required VDPA devices and drivers.
- Deploy the overcloud.
If the VDPA hardware is limited, you can also configure a host aggregate to optimize scheduling on the VDPA Compute nodes. To schedule only instances that request VDPA on the VDPA Compute nodes, create a host aggregate of the Compute nodes that have the VDPA hardware, and configure the Compute scheduler to place only VDPA instances on the host aggregate. For more information, see Filtering by isolating host aggregates and Creating and managing host aggregates.
Prerequisites
- Your Compute nodes have the required VDPA devices and drivers.
- Your Compute nodes have Mellanox NICs.
- Your overcloud is configured for OVS hardware offload. For more information, see Configuring OVS hardware offload.
- Your overcloud is configured to use ML2/OVN.
9.1. Designating Compute nodes for VDPA
To designate Compute nodes for instances that request a VIRTIO data path acceleration (VDPA) interface, create a new role file to configure the VDPA role, and configure the bare-metal nodes with a VDPA resource class to tag the Compute nodes for VDPA.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, scale down the overcloud to unprovision the node, then scale up the overcloud to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
- Log in to the undercloud host as the stack user.
- Source the stackrc undercloud credentials file:

  [stack@director ~]$ source ~/stackrc

- Generate a new roles data file named roles_data_vdpa.yaml that includes the Controller, Compute, and ComputeVdpa roles:

  (undercloud)$ openstack overcloud roles \
    generate -o /home/stack/templates/roles_data_vdpa.yaml \
    ComputeVdpa Compute Controller

- Update the roles_data_vdpa.yaml file for the VDPA role:
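  The exact role settings depend on your environment. The following sketch shows the kind of values you define for the ComputeVdpa role; the description, hostname format, and NIC configuration name are illustrative assumptions:

  - name: ComputeVdpa
    description: |
      VDPA Compute Node role
    HostnameFormatDefault: '%stackname%-computevdpa-%index%'
    deprecated_nic_config_name: compute-vdpa.yaml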
- Register the VDPA Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.
- Inspect the node hardware:

  (undercloud)$ openstack overcloud node introspect \
    --all-manageable --provide

  For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.
- Tag each bare-metal node that you want to designate for VDPA with a custom VDPA resource class:

  (undercloud)$ openstack baremetal node set \
    --resource-class baremetal.VDPA <node>

  Replace <node> with the name or UUID of the bare-metal node.
- Add the ComputeVdpa role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes:
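  The following node definition is an illustrative sketch; the node counts and the template path are assumptions that you replace with values for your deployment:

  - name: Controller
    count: 3
  - name: Compute
    count: 3
  - name: ComputeVdpa
    count: 1
    defaults:
      resource_class: baremetal.VDPA
      network_config:
        template: /home/stack/templates/nic-config/<role_topology_file>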
  Replace <role_topology_file> with the name of the topology file to use for the ComputeVdpa role, for example, vdpa_net_top.j2. You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. To use the default network definition settings, do not include network_config in the role definition.

  For more information about the properties that you can use to configure node attributes in your node definition file, see Bare-metal node provisioning attributes. For an example node definition file, see Example node definition file.
- Open your network interface template, vdpa_net_top.j2, and add the following configuration to specify your VDPA-supported network interfaces as a member of the OVS bridge:
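  The exact interface configuration depends on your NIC layout. The following os-net-config snippet is a sketch only; the bridge name, interface name, and VF count are assumptions, and your Mellanox NICs might require additional driver-specific options:

  - type: ovs_bridge
    name: br-tenant
    members:
      - type: sriov_pf
        name: enp6s0f0
        numvfs: 8
        use_dhcp: false
        link_mode: switchdev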
- Provision the new nodes for your role:

  (undercloud)$ openstack overcloud node provision \
    [--stack <stack>] \
    [--network-config \]
    --output <deployment_file> \
    /home/stack/templates/overcloud-baremetal-deploy.yaml
  - Optional: Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. Defaults to overcloud.
  - Optional: Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you have not defined the network definitions in the node definition file by using the network_config property, then the default network definitions are used.
  - Replace <deployment_file> with the name of the heat environment file to generate for inclusion in the deployment command, for example, /home/stack/templates/overcloud-baremetal-deployed.yaml.
- Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:

  (undercloud)$ watch openstack baremetal node list

- If you ran the provisioning command without the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:

  parameter_defaults:
    ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
    ComputeVdpaNetworkConfigTemplate: /home/stack/templates/nic-configs/<role_topology_file>
    ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2

  Replace <role_topology_file> with the name of the file that contains the network topology of the ComputeVdpa role, for example, vdpa_net_top.j2. Set to compute.j2 to use the default network topology.
9.2. Configuring a VDPA Compute node
To enable your cloud users to create instances that use VIRTIO data path acceleration (VDPA) ports, configure the Compute nodes that have the VDPA devices.
Procedure
- Create a new Compute environment file for configuring VDPA Compute nodes, for example, vdpa_compute.yaml.
- Add PciPassthroughFilter and NUMATopologyFilter to the NovaSchedulerEnabledFilters parameter in vdpa_compute.yaml:

  parameter_defaults:
    NovaSchedulerEnabledFilters: ['AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

- Add the NovaPCIPassthrough parameter to vdpa_compute.yaml to specify the available PCIs for the VDPA devices on the Compute node. For example, to add NVIDIA® ConnectX®-6 Dx devices to the pool of PCI devices that are available for passthrough to instances, add the following configuration to vdpa_compute.yaml:
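  The following configuration is a sketch that uses the Mellanox/NVIDIA vendor ID 15b3; the product ID, PCI addresses, and physical network name are placeholders that you must confirm with lspci and your network configuration:

  parameter_defaults:
    ComputeVdpaParameters:
      NovaPCIPassthrough:
        - vendor_id: "15b3"
          product_id: "101d"
          address: "06:00.0"
          physical_network: "tenant"
        - vendor_id: "15b3"
          product_id: "101d"
          address: "06:00.1"
          physical_network: "tenant"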
  For more information about how to configure NovaPCIPassthrough, see Guidelines for configuring NovaPCIPassthrough.
- Enable the input-output memory management unit (IOMMU) in each Compute node BIOS by adding the KernelArgs parameter to vdpa_compute.yaml. For example, use the following KernelArgs settings to enable an Intel Corporation IOMMU:

  parameter_defaults:
    ...
    ComputeVdpaParameters:
      ...
      KernelArgs: "intel_iommu=on iommu=pt"

  To enable an AMD IOMMU, set KernelArgs to "amd_iommu=on iommu=pt".

  Note: When you first add the KernelArgs parameter to the configuration of a role, the overcloud nodes automatically reboot during overcloud deployment. If required, you can disable the automatic rebooting of nodes and instead perform node reboots manually after each overcloud deployment. For more information, see Configuring manual node reboot to define KernelArgs.
- Open your network environment file, and add the following configuration to define the physical network:
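  This sketch assumes the standard director Networking service parameters; replace the placeholders as described in the list that follows:

  parameter_defaults:
    NeutronBridgeMappings:
      - <bridge_map_1>
      - <bridge_map_n>
    NeutronTunnelTypes: '<tunnel_types>'
    NeutronNetworkType: '<network_types>'
    NeutronNetworkVLANRanges:
      - <network_vlan_range_1>
      - <network_vlan_range_n>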
  - Replace <bridge_map_1>, and all bridge mappings up to <bridge_map_n>, with the logical to physical bridge mappings that you want to use for the VDPA bridge. For example, tenant:br-tenant.
  - Replace <tunnel_types> with a comma-separated list of the tunnel types for the project network. For example, geneve.
  - Replace <network_types> with a comma-separated list of the project network types for the Networking service (neutron). The first type that you specify is used until all available networks are exhausted, then the next type is used. For example, geneve,vlan.
  - Replace <network_vlan_range_1>, and all physical network and VLAN ranges up to <network_vlan_range_n>, with the ML2 and OVN VLAN mapping ranges that you want to support. For example, datacentre:1:1000,tenant:100:299.
- Add your custom environment files to the stack with your other environment files and deploy the overcloud:
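  The following command is a sketch that assumes the file names used earlier in this chapter; include any other environment files that your deployment requires:

  (undercloud)$ openstack overcloud deploy --templates \
    -r /home/stack/templates/roles_data_vdpa.yaml \
    -e /home/stack/templates/overcloud-baremetal-deployed.yaml \
    -e [your environment files] \
    -e /home/stack/templates/vdpa_compute.yaml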
Verification
- Create an instance with a VDPA device. For more information, see Creating an instance with a VDPA interface in the Creating and managing instances guide.
- Log in to the instance as a cloud user. For more information, see Connecting to an instance in the Creating and managing instances guide.
Verify that the VDPA device is accessible from the instance:
  $ openstack port show vdpa-port
Chapter 10. Configuring virtual GPUs for instances
To support GPU-based rendering on your instances, you can define and manage virtual GPU (vGPU) resources according to your available physical GPU devices and your hypervisor type. You can use this configuration to divide the rendering workloads between all your physical GPU devices more effectively, and to have more control over scheduling your vGPU-enabled instances.
To enable vGPU in the Compute (nova) service, create flavors that your cloud users can use to create Red Hat Enterprise Linux (RHEL) instances with vGPU devices. Each instance can then support GPU workloads with virtual GPU devices that correspond to the physical GPU devices.
The Compute service tracks the number of vGPU devices that are available for each GPU profile you define on each host. The Compute service schedules instances to these hosts based on the flavor, attaches the devices, and monitors usage on an ongoing basis. When an instance is deleted, the Compute service adds the vGPU devices back to the available pool.
Red Hat enables the use of NVIDIA vGPU in RHOSP without the requirement for support exceptions. However, Red Hat does not provide technical support for the NVIDIA vGPU drivers. The NVIDIA vGPU drivers are shipped and supported by NVIDIA. You require an NVIDIA Certified Support Services subscription to obtain NVIDIA Enterprise Support for NVIDIA vGPU software. For issues that result from the use of NVIDIA vGPUs where you are unable to reproduce the issue on a supported component, the following support policies apply:
- When Red Hat does not suspect that the third-party component is involved in the issue, the normal Scope of Support and Red Hat SLA apply.
- When Red Hat suspects that the third-party component is involved in the issue, the customer will be directed to NVIDIA in line with the Red Hat third party support and certification policies. For more information, see the Knowledge Base article Obtaining Support from NVIDIA.
10.1. Supported configurations and limitations
Supported GPU cards
For a list of supported NVIDIA GPU cards, see Virtual GPU Software Supported Products on the NVIDIA website.
Limitations when using vGPU devices
- Each instance can use only one vGPU resource.
- Live migration of vGPU instances between hosts is not supported.
- Evacuation of vGPU instances is not supported.
- If you need to reboot the Compute node that hosts the vGPU instances, the vGPUs are not automatically reassigned to the recreated instances. You must either cold migrate the instances before you reboot the Compute node, or manually allocate each vGPU to the correct instance after reboot. To manually allocate each vGPU, you must retrieve the mdev UUID from the instance XML for each vGPU instance that runs on the Compute node before you reboot. You can use one of the following commands to discover the mdev UUID for each instance:

  For a RHEL version 8.4 Compute node:

  $ sudo podman exec -it nova_libvirt virsh dumpxml <instance_name> | grep mdev

  For a RHEL version 9.2 Compute node:

  $ sudo podman exec -it nova_virtqemud virsh dumpxml <instance_name> | grep mdev

  Replace <instance_name> with the libvirt instance name, OS-EXT-SRV-ATTR:instance_name, returned in a /servers request to the Compute API.
- Suspending a vGPU-enabled instance is not supported due to a libvirt limitation. Instead, you can snapshot or shelve the instance.
- By default, vGPU types on Compute hosts are not exposed to API users. To expose the vGPU types on Compute hosts to API users, you must configure resource provider traits and create flavors that require the traits. For more information, see Creating a custom vGPU resource provider trait. Alternatively, if you only have one vGPU type, you can grant access by adding the hosts to a host aggregate. For more information, see Creating and managing host aggregates.
- If you use NVIDIA accelerator hardware, you must comply with the NVIDIA licensing requirements. For example, NVIDIA vGPU GRID requires a licensing server. For more information about the NVIDIA licensing requirements, see NVIDIA License Server Release Notes on the NVIDIA website.
10.2. Configuring vGPU on the Compute nodes
To enable your cloud users to create instances that use a virtual GPU (vGPU), you must configure the Compute nodes that have the physical GPUs:
- Designate Compute nodes for vGPU.
- Configure the Compute node for vGPU.
- Deploy the overcloud.
- Optional: Create custom traits for vGPU types.
- Optional: Create a custom GPU instance image.
- Create a vGPU flavor for launching instances that have vGPU.
If the GPU hardware is limited, you can also configure a host aggregate to optimize scheduling on the vGPU Compute nodes. To schedule only instances that request vGPUs on the vGPU Compute nodes, create a host aggregate of the vGPU Compute nodes, and configure the Compute scheduler to place only vGPU instances on the host aggregate. For more information, see Creating and managing host aggregates and Filtering by isolating host aggregates.
To use an NVIDIA GRID vGPU, you must comply with the NVIDIA GRID licensing requirements and you must have the URL of your self-hosted license server. For more information, see the NVIDIA License Server Release Notes web page.
10.2.1. Prerequisites
- You have downloaded the NVIDIA GRID host driver RPM package that corresponds to your GPU device from the NVIDIA website. To determine which driver you need, see the NVIDIA Driver Downloads Portal. You must be a registered NVIDIA customer to download the drivers from the portal.
- You have built a custom overcloud image that has the NVIDIA GRID host driver installed.
10.2.2. Designating Compute nodes for vGPU
To designate Compute nodes for vGPU workloads, you must create a new role file to configure the vGPU role, and configure the bare metal nodes with a GPU resource class to tag the GPU-enabled Compute nodes.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, you must use the scale down procedure to unprovision the node, then use the scale up procedure to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

  [stack@director ~]$ source ~/stackrc

- Generate a new roles data file named roles_data_gpu.yaml that includes the Controller, Compute, and ComputeGpu roles, along with any other roles that you need for the overcloud:

  (undercloud)$ openstack overcloud roles \
    generate -o /home/stack/templates/roles_data_gpu.yaml \
    Compute:ComputeGpu Compute Controller

- Open roles_data_gpu.yaml and edit or add the following parameters and sections:

  Section/Parameter           | Current value           | New value
  Role comment                | Role: Compute           | Role: ComputeGpu
  Role name                   | name: Compute           | name: ComputeGpu
  description                 | Basic Compute Node role | GPU Compute Node role
  HostnameFormatDefault       | -compute-               | -computegpu-
  deprecated_nic_config_name  | compute.yaml            | compute-gpu.yaml
- Register the GPU-enabled Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.
- Inspect the node hardware:

  (undercloud)$ openstack overcloud node introspect --all-manageable \
    --provide

  For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.
- Tag each bare metal node that you want to designate for GPU workloads with a custom GPU resource class:

  (undercloud)$ openstack baremetal node set \
    --resource-class baremetal.GPU <node>

  Replace <node> with the ID of the baremetal node.
- Add the ComputeGpu role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes:
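  The following node definition is an illustrative sketch; the node counts and the template path are assumptions that you replace with values for your deployment. The callout 1 marks the network_config template that the note after the sketch describes:

  - name: Controller
    count: 3
  - name: Compute
    count: 3
  - name: ComputeGpu
    count: 1
    defaults:
      resource_class: baremetal.GPU
      network_config:
        template: /home/stack/templates/nic-config/<gpu_net_top>.j2  # 1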
  1. You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. If you do not define the network definitions by using the network_config property, then the default network definitions are used.
For more information about the properties you can use to configure node attributes in your node definition file, see Bare metal node provisioning attributes. For an example node definition file, see Example node definition file.
- Run the provisioning command to provision the new nodes for your role:

  (undercloud)$ openstack overcloud node provision \
    --stack <stack> \
    [--network-config \]
    --output /home/stack/templates/overcloud-baremetal-deployed.yaml \
    /home/stack/templates/overcloud-baremetal-deploy.yaml

  - Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
  - Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you do not define the network definitions by using the network_config property, then the default network definitions are used.
- Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:

  (undercloud)$ watch openstack baremetal node list

- If you did not run the provisioning command with the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:

  parameter_defaults:
    ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
    ComputeGpuNetworkConfigTemplate: /home/stack/templates/nic-configs/<gpu_net_top>.j2
    ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2

  Replace <gpu_net_top> with the name of the file that contains the network topology of the ComputeGpu role, for example, compute.yaml to use the default network topology.
10.2.3. Configuring the Compute node for vGPU and deploying the overcloud
You need to retrieve and assign the vGPU type that corresponds to the physical GPU device in your environment, and prepare the environment files to configure the Compute node for vGPU.
Procedure
- Install Red Hat Enterprise Linux and the NVIDIA GRID driver on a temporary Compute node and launch the node.
- Virtual GPUs are mediated devices, or mdev type devices. Retrieve the PCI address for each mdev device on each Compute node:

  $ ls /sys/class/mdev_bus/

  The PCI address is used as the device driver directory name, for example, 0000:84:00.0.
- Review the supported mdev types for each available pGPU device on each Compute node to discover the available vGPU types:

  $ ls /sys/class/mdev_bus/<mdev_device>/mdev_supported_types

  Replace <mdev_device> with the PCI address for the mdev device, for example, 0000:84:00.0. For example, a Compute node that has 4 pGPUs can support the same 11 vGPU types on each pGPU.
- Create a gpu.yaml file to specify the vGPU types that each GPU device supports:

  parameter_defaults:
    ComputeGpuExtraConfig:
      nova::compute::vgpu::enabled_vgpu_types:
        - nvidia-35
        - nvidia-36

- Optional: To configure more than one vGPU type, map the supported vGPU types to the pGPUs:
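  The following sketch shows the general shape of the mapping, using the NovaVGPUTypesDeviceAddressesMapping parameter from the example later in this step; the exact nesting within your environment file and the placeholder values are assumptions:

  parameter_defaults:
    ComputeGpuExtraConfig:
      nova::compute::vgpu::enabled_vgpu_types:
        - nvidia-35
        - nvidia-36
    NovaVGPUTypesDeviceAddressesMapping: {'vgpu_<vgpu_type>': ['<pci_address>', '<pci_address>']}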
  - Replace <vgpu_type> with the name of the vGPU type to create a label for the vGPU group, for example, vgpu_nvidia-35. Use a comma-separated list of vgpu_<vgpu_type> definitions to map additional vGPU types.
  - Replace <pci_address> with the PCI address of a pGPU device that supports the vGPU type, for example, 0000:84:00.0. Use a comma-separated list of <pci_address> definitions to map the vGPU group to additional pGPUs.

  Example:

  NovaVGPUTypesDeviceAddressesMapping: {'vgpu_nvidia-35': ['0000:84:00.0', '0000:85:00.0'],'vgpu_nvidia-36': ['0000:86:00.0']}

  - The nvidia-35 vGPU type is supported by the pGPUs that are at the PCI addresses 0000:84:00.0 and 0000:85:00.0.
  - The nvidia-36 vGPU type is supported only by the pGPU that is at the PCI address 0000:86:00.0.
- Save the updates to your Compute environment file.
Add your new role and environment files to the stack with your other environment files and deploy the overcloud:
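  The following deployment command is a sketch that assumes the file names used earlier in this section; include any other environment files that your deployment requires:

  (undercloud)$ openstack overcloud deploy --templates \
    -r /home/stack/templates/roles_data_gpu.yaml \
    -e /home/stack/templates/overcloud-baremetal-deployed.yaml \
    -e [your environment files] \
    -e /home/stack/templates/gpu.yaml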
10.3. Creating a custom vGPU resource provider trait
You can create custom resource provider traits for each vGPU type that your RHOSP environment supports. You can then create flavors that your cloud users can use to launch instances on hosts that have those custom traits. Custom traits are defined in uppercase letters, and must begin with the prefix CUSTOM_. For more information on resource provider traits, see Filtering by resource provider traits.
Procedure
- Create a new trait:

  (overcloud)$ openstack --os-placement-api-version 1.6 trait \
    create CUSTOM_<TRAIT_NAME>

  Replace <TRAIT_NAME> with the name of the trait. The name can contain only the letters A through Z, the numbers 0 through 9 and the underscore "_" character.
- Collect the existing resource provider traits of each host:

  (overcloud)$ existing_traits=$(openstack --os-placement-api-version 1.6 resource provider trait list -f value <host_uuid> | sed 's/^/--trait /')

- Check the existing resource provider traits for the traits you require a host or host aggregate to have:

  (overcloud)$ echo $existing_traits

- If the traits you require are not already added to the resource provider, then add the existing traits and your required traits to the resource providers for each host:

  (overcloud)$ openstack --os-placement-api-version 1.6 \
    resource provider trait set $existing_traits \
    --trait CUSTOM_<TRAIT_NAME> \
    <host_uuid>

  Replace <TRAIT_NAME> with the name of the trait that you want to add to the resource provider. You can use the --trait option more than once to add additional traits, as required.

  Note: This command performs a full replacement of the traits for the resource provider. Therefore, you must retrieve the list of existing resource provider traits on the host and set them again to prevent them from being removed.
10.4. Creating a custom GPU instance image
To enable your cloud users to create instances that use a virtual GPU (vGPU), you can create a custom vGPU-enabled image for launching instances. Use the following procedure to create a custom vGPU-enabled instance image with the NVIDIA GRID guest driver and license file.
Prerequisites
- You have configured and deployed the overcloud with GPU-enabled Compute nodes.
Procedure
- Log in to the undercloud as the stack user.
- Source the overcloudrc credential file:

  $ source ~/overcloudrc

- Create an instance with the hardware and software profile that your vGPU instances require:

  (overcloud)$ openstack server create --flavor <flavor> \
    --image <image> temp_vgpu_instance

  - Replace <flavor> with the name or ID of the flavor that has the hardware profile that your vGPU instances require. For information about creating a vGPU flavor, see Creating a vGPU flavor for instances.
  - Replace <image> with the name or ID of the image that has the software profile that your vGPU instances require. For information about downloading RHEL cloud images, see Creating RHEL KVM or RHOSP-compatible images in Creating and managing images.
- Log in to the instance as a cloud-user.
- Create the gridd.conf NVIDIA GRID license file on the instance, following the NVIDIA guidance: Licensing an NVIDIA vGPU on Linux by Using a Configuration File.
- Install the GPU driver on the instance. For more information about installing an NVIDIA driver, see Installing the NVIDIA vGPU Software Graphics Driver on Linux.

  Note: Use the hw_video_model image property to define the GPU driver type. You can choose none if you want to disable the emulated GPUs for your vGPU instances. For more information about supported drivers, see Image configuration parameters.
- Create an image snapshot of the instance:

  (overcloud)$ openstack server image create \
    --name vgpu_image temp_vgpu_instance

- Optional: Delete the instance.
10.5. Creating a vGPU flavor for instances
To enable your cloud users to create instances for GPU workloads, you can create a GPU flavor that can be used to launch vGPU instances, and assign the vGPU resource to that flavor.
Prerequisites
- You have configured and deployed the overcloud with GPU-designated Compute nodes.
Procedure
- Create an NVIDIA GPU flavor, for example:

  (overcloud)$ openstack flavor create --vcpus 6 \
    --ram 8192 --disk 100 m1.small-gpu

- Assign a vGPU resource to the flavor:

  (overcloud)$ openstack flavor set m1.small-gpu \
    --property "resources:VGPU=1"

  Note: You can assign only one vGPU for each instance.
- Optional: To customize the flavor for a specific vGPU type, add a required trait to the flavor:

  (overcloud)$ openstack flavor set m1.small-gpu \
    --property trait:CUSTOM_NVIDIA_11=required

  For information on how to create custom resource provider traits for each vGPU type, see Creating a custom vGPU resource provider trait.
10.6. Launching a vGPU instance
You can create a GPU-enabled instance for GPU workloads.
Procedure
- Create an instance using a GPU flavor and image, for example:

  (overcloud)$ openstack server create --flavor m1.small-gpu \
    --image vgpu_image --security-group web --nic net-id=internal0 \
    --key-name lambda vgpu-instance

- Log in to the instance as a cloud-user.
To verify that the GPU is accessible from the instance, enter the following command from the instance:
  $ lspci -nn | grep <gpu_name>
10.7. Enabling PCI passthrough for a GPU device
You can use PCI passthrough to attach a physical PCI device, such as a graphics card, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Prerequisites
- The pciutils package is installed on the physical servers that have the PCI cards.
- The driver for the GPU device must be installed on the instance that the device is passed through to. Therefore, you need to have created a custom instance image that has the required GPU driver installed. For more information about how to create a custom instance image with the GPU driver installed, see Creating a custom GPU instance image.
Procedure
- To determine the vendor ID and product ID for each passthrough device type, enter the following command on the physical server that has the PCI cards:

  # lspci -nn | grep -i <gpu_name>

  For example, to determine the vendor and product ID for an NVIDIA GPU, enter the following command:

  # lspci -nn | grep -i nvidia
  3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
  d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)

- To determine if each PCI device has Single Root I/O Virtualization (SR-IOV) capabilities, enter the following command on the physical server that has the PCI cards:

  # lspci -v -s 3b:00.0
  3b:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
  ...
  Capabilities: [bcc] Single Root I/O Virtualization (SR-IOV)
  ...
- To configure the Controller node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_controller.yaml.
- Add PciPassthroughFilter to the NovaSchedulerEnabledFilters parameter in pci_passthru_controller.yaml:
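  The exact filter list depends on your deployment; the following sketch keeps the default filters shown earlier in this guide and appends PciPassthroughFilter:

  parameter_defaults:
    NovaSchedulerEnabledFilters: ['AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter']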
- To specify the PCI alias for the devices on the Controller node, add the following configuration to pci_passthru_controller.yaml:

  If the PCI device has SR-IOV capabilities:
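  A sketch, assuming alias configuration through the nova::pci::aliases hiera setting; the alias name t4 matches the flavor property used later in this procedure, and the IDs are from the Tesla T4 lspci output above:

  parameter_defaults:
    ControllerExtraConfig:
      nova::pci::aliases:
        - name: t4
          product_id: '1eb8'
          vendor_id: '10de'
          device_type: 'type-PF'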
  If the PCI device does not have SR-IOV capabilities:
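  A corresponding sketch without the device_type field, which then defaults to type-PCI:

  parameter_defaults:
    ControllerExtraConfig:
      nova::pci::aliases:
        - name: t4
          product_id: '1eb8'
          vendor_id: '10de'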
  For more information on configuring the device_type field, see PCI passthrough device type field.

  Note: If the nova-api service is running in a role other than the Controller, then replace ControllerExtraConfig with the user role, in the format <Role>ExtraConfig.
- To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_compute.yaml.
- To specify the available PCIs for the devices on the Compute node, add the following to pci_passthru_compute.yaml:

  parameter_defaults:
    NovaPCIPassthrough:
      - vendor_id: "10de"
        product_id: "1eb8"

- You must create a copy of the PCI alias on the Compute node for instance migration and resize operations. To specify the PCI alias for the devices on the Compute node, add the following to pci_passthru_compute.yaml:
  If the PCI device has SR-IOV capabilities:
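  A sketch that mirrors the Controller alias, assuming the same nova::pci::aliases setting under ComputeExtraConfig:

  parameter_defaults:
    ComputeExtraConfig:
      nova::pci::aliases:
        - name: t4
          product_id: '1eb8'
          vendor_id: '10de'
          device_type: 'type-PF'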
  If the PCI device does not have SR-IOV capabilities:
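  A corresponding sketch without the device_type field:

  parameter_defaults:
    ComputeExtraConfig:
      nova::pci::aliases:
        - name: t4
          product_id: '1eb8'
          vendor_id: '10de'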
  Note: The Compute node aliases must be identical to the aliases on the Controller node.
- To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthru_compute.yaml:

  parameter_defaults:
    ...
    ComputeParameters:
      KernelArgs: "intel_iommu=on iommu=pt"

  Note: When you first add the KernelArgs parameter to the configuration of a role, the overcloud nodes are automatically rebooted. If required, you can disable the automatic rebooting of nodes and instead perform node reboots manually after each overcloud deployment. For more information, see Configuring manual node reboot to define KernelArgs.
- Add your custom environment files to the stack with your other environment files and deploy the overcloud:
  (undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/pci_passthru_controller.yaml \
    -e /home/stack/templates/pci_passthru_compute.yaml

- Configure a flavor to request the PCI devices. The following example requests two devices, each with a vendor ID of 10de and a product ID of 1eb8:

  # openstack flavor set m1.large \
    --property "pci_passthrough:alias"="t4:2"
Verification
Create an instance with a PCI passthrough device:
  # openstack server create --flavor m1.large \
    --image <custom_gpu> --wait test-pci

  Replace <custom_gpu> with the name of your custom instance image that has the required GPU driver installed.
- Log in to the instance as a cloud user. For more information, see Connecting to an instance.
To verify that the GPU is accessible from the instance, enter the following command from the instance:
  $ lspci -nn | grep <gpu_name>
nvidia-smi
  $ nvidia-smi
Chapter 11. Adding metadata to instances
The Compute (nova) service uses metadata to pass configuration information to instances on launch. The instance can access the metadata by using a config drive or the metadata service.
- Config drive
- Config drives are special drives that you can attach to an instance when it boots. The config drive is presented to the instance as a read-only drive. The instance can mount this drive and read files from it to get information that is normally available through the metadata service.
- Metadata service
- The Compute service provides the metadata service as a REST API, which can be used to retrieve data specific to an instance. Instances access this service at 169.254.169.254 or at fe80::a9fe:a9fe.
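For example, from inside an instance you can retrieve the metadata with a simple HTTP request; the exact fields in the response depend on your deployment:

  $ curl http://169.254.169.254/openstack/latest/meta_data.json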
11.1. Types of instance metadata
Cloud users, cloud administrators, and the Compute service can pass metadata to instances:
- Cloud user provided data
- Cloud users can specify additional data to use when they launch an instance, such as a shell script that the instance runs on boot. The cloud user can pass data to instances by using the user data feature, and by passing key-value pairs as required properties when creating or updating an instance.
- Cloud administrator provided data
  The RHOSP administrator uses the vendordata feature to pass data to instances. The Compute service provides the vendordata modules StaticJSON and DynamicJSON to allow administrators to pass metadata to instances:

  - StaticJSON: (Default) Use for metadata that is the same for all instances.
  - DynamicJSON: Use for metadata that is different for each instance. This module makes a request to an external REST service to determine what metadata to add to an instance.

  Vendordata configuration is located in one of the following read-only files on the instance:

  - /openstack/{version}/vendor_data.json
  - /openstack/{version}/vendor_data2.json
- Compute service provided data
- The Compute service uses its internal implementation of the metadata service to pass information to the instance, such as the requested hostname for the instance, and the availability zone the instance is in. This happens by default and requires no configuration by the cloud user or administrator.
11.2. Adding a config drive to all instances
As an administrator, you can configure the Compute service to always create a config drive for instances, and populate the config drive with metadata that is specific to your deployment. For example, you might use a config drive for the following reasons:
- To pass a networking configuration when your deployment does not use DHCP to assign IP addresses to instances. You can pass the IP address configuration for the instance through the config drive, which the instance can mount and access before you configure the network settings for the instance.
- To pass data to an instance that is not known to the user starting the instance, for example, a cryptographic token to be used to register the instance with Active Directory post boot.
- To create a local cached disk read to manage the load of instance requests, which reduces the impact of instances accessing the metadata servers regularly to check in and build facts.
Any instance operating system that is capable of mounting an ISO 9660 or VFAT file system can use the config drive.
Procedure
- Open your Compute environment file.
- To always attach a config drive when launching an instance, set the following parameter to True:

  parameter_defaults:
    ComputeExtraConfig:
      nova::compute::force_config_drive: 'true'

- Optional: To change the format of the config drive from the default value of iso9660 to vfat, add the config_drive_format parameter to your configuration:

  parameter_defaults:
    ComputeExtraConfig:
      nova::compute::force_config_drive: 'true'
      nova::compute::config_drive_format: vfat

- Save the updates to your Compute environment file.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

  (undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/<compute_environment_file>.yaml
Verification
Create an instance:
  (overcloud)$ openstack server create --flavor m1.tiny \
    --image cirros test-config-drive-instance

- Log in to the instance.
- Mount the config drive:

  If the instance OS uses udev:

  # mkdir -p /mnt/config
  # mount /dev/disk/by-label/config-2 /mnt/config

  If the instance OS does not use udev, you need to first identify the block device that corresponds to the config drive:

  # blkid -t LABEL="config-2" -odevice
  /dev/vdb
  # mkdir -p /mnt/config
  # mount /dev/vdb /mnt/config
- Inspect the files in the mounted config drive directory, /mnt/config/openstack/{version}/, for your metadata.
11.3. Adding dynamic metadata to instances
You can configure your deployment to create instance-specific metadata, and make the metadata available to that instance through a JSON file.
You can use dynamic metadata on the undercloud to integrate director with a Red Hat Identity Management (IdM) server. An IdM server can be used as a certificate authority and manage the overcloud certificates when SSL/TLS is enabled on the overcloud. For more information, see Implementing TLS-e with Ansible in the Hardening Red Hat OpenStack Platform.
Procedure
- Open your Compute environment file.
- Add DynamicJSON to the vendordata provider module:

  parameter_defaults:
    ControllerExtraConfig:
      nova::vendordata::vendordata_providers:
        - DynamicJSON

- Specify the REST services to contact to generate the metadata. You can specify as many target REST services as required, for example:
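  This sketch assumes the nova::vendordata::vendordata_dynamic_targets hiera setting and uses placeholder service names and URLs in the name@url format; do not reuse a service name:

  parameter_defaults:
    ControllerExtraConfig:
      nova::vendordata::vendordata_providers:
        - DynamicJSON
      nova::vendordata::vendordata_dynamic_targets:
        - 'target1@http://127.0.0.1:125'
        - 'target2@http://127.0.0.1:126'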
  The Compute service generates the JSON file, vendordata2.json, to contain the metadata retrieved from the configured target services, and stores it in the config drive directory.

  Note: Do not use the same name for a target service more than once.
- Save the updates to your Compute environment file.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

  (undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/<compute_environment_file>.yaml
Chapter 12. Configuring manual node reboot to define KernelArgs
Overcloud nodes are automatically rebooted when the overcloud deployment includes setting the KernelArgs for the first time. Rebooting nodes can be an issue for existing workloads if you are adding KernelArgs to a deployment that is already in production. You can disable the automatic rebooting of nodes when updating a deployment, and instead perform node reboots manually after each overcloud deployment.
If you disable automatic reboot and then add new Compute nodes to your deployment, the new nodes will not be rebooted during their initial provisioning. This might cause deployment errors because the configuration of KernelArgs is applied only after a reboot.
12.1. Configuring manual node reboot to define KernelArgs
You can disable the automatic rebooting of nodes when you configure KernelArgs for the first time, and instead reboot the nodes manually.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

  [stack@director ~]$ source ~/stackrc

- Enable the KernelArgsDeferReboot role parameter in a custom environment file, for example, kernelargs_manual_reboot.yaml:

  parameter_defaults:
    <Role>Parameters:
      KernelArgsDeferReboot: True

- Add your custom environment file to the stack with your other environment files and deploy the overcloud:

  (undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/kernelargs_manual_reboot.yaml

- Retrieve a list of your Compute nodes to identify the host name of the node that you want to reboot:
  (undercloud)$ source ~/overcloudrc
  (overcloud)$ openstack compute service list

- Disable the Compute service on the Compute node you want to reboot, to prevent the Compute scheduler from assigning new instances to the node:

  (overcloud)$ openstack compute service set <node> nova-compute --disable

  Replace <node> with the host name of the node you want to disable the Compute service on.
- Retrieve a list of the instances hosted on the Compute node that you want to migrate:

  (overcloud)$ openstack server list --host <node_UUID> --all-projects

- Migrate the instances to another Compute node. For information on migrating instances, see Migrating virtual machine instances between Compute nodes.
- Log in to the node that you want to reboot.
- Reboot the node:

  [tripleo-admin@overcloud-compute-0 ~]$ sudo reboot

- Wait until the node boots.
- Re-enable the Compute node:

  (overcloud)$ openstack compute service set <node_UUID> nova-compute --enable

- Check that the Compute node is enabled:

  (overcloud)$ openstack compute service list
Chapter 13. Configuring instance security
As a cloud administrator, you can configure the following security features for the instances that run on your cloud:
- UEFI Secure Boot: You can create a UEFI Secure Boot flavor with the property key os:secure_boot enabled. Cloud users can use this flavor to create instances that are protected with UEFI Secure Boot. For more information, see UEFI Secure Boot.
- Emulated virtual Trusted Platform Module (vTPM): You can provide cloud users the ability to create instances that have emulated vTPM devices. For more information, see Configuring Compute nodes to provide emulated Trusted Platform Module (TPM) devices for instances.
- SEV: Use to enable your cloud users to create instances that use memory encryption. For more information, see Configuring AMD SEV Compute nodes to provide memory encryption for instances.
13.1. Securing connections to the VNC console of an instance
You can secure connections to the VNC console for an instance by configuring the allowed TLS ciphers and the minimum protocol version to enforce for incoming client connections to the VNC proxy service.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

  [stack@director ~]$ source ~/stackrc

- Open your Compute environment file.
- Configure the minimum protocol version to use for VNC console connections to instances:

  parameter_defaults:
    ...
    NovaVNCProxySSLMinimumVersion: <version>

  Replace <version> with the minimum allowed SSL/TLS protocol version. Set to one of the following valid values:

  - default: Uses the underlying system OpenSSL defaults.
  - tlsv1_1: Use if you have clients that do not support a later version.

    Note: TLS 1.0 and TLS 1.1 are deprecated in RHEL 8, and not supported in RHEL 9.
  - tlsv1_2: Use if you want to configure the SSL/TLS ciphers to use for VNC console connections to instances.
  - tlsv1_3: Use if you want to use the standard cipher library for TLSv1.3. Configuration of the NovaVNCProxySSLCiphers parameter is ignored.
- If you set the minimum allowed SSL/TLS protocol version to tlsv1_2, then configure the SSL/TLS ciphers to use for VNC console connections to instances:

  parameter_defaults:
    NovaVNCProxySSLCiphers: <ciphers>

  Replace <ciphers> with a colon-delimited list of the cipher suites to allow. Retrieve the list of available ciphers from openssl.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
  (undercloud)$ openstack overcloud deploy --templates \
    -e [your environment files] \
    -e /home/stack/templates/<compute_environment_file>.yaml
13.2. Configuring Compute nodes to provide emulated Trusted Platform Module (TPM) devices for instances
As a cloud administrator, you can provide cloud users the ability to create instances that have emulated virtual Trusted Platform Module (vTPM) devices.
To enable your cloud users to create instances that have vTPM devices, you must perform the following tasks:
- Enable support for instances with vTPM devices and deploy the overcloud.
- Create a flavor or image for launching instances with vTPM devices.
Prerequisites
- The Key Manager service (barbican) is included in your RHOSP deployment to store vTPM keys. For information on managing secrets with the Key Manager service, see Managing secrets with the Key Manager service.
Limitations of instances with vTPM devices
- You cannot live migrate or evacuate instances that have vTPM devices.
- You cannot rescue or shelve instances that have vTPM devices.
- The instance must have the Q35 machine type.
13.2.1. Enabling support for instances with vTPM devices
To enable your cloud users to create instances that have vTPM devices, you must configure the overcloud to enable vTPM devices for instances.
Procedure
- Open your Compute environment file.
- Enable support for vTPM devices:

  parameter_defaults:
    ComputeParameters:
      ...
      NovaEnableVTPM: True

- Save the updates to your Compute environment file.
- Add your Compute environment file to the stack with your other environment files and deploy the overcloud:

  (undercloud)$ openstack overcloud deploy --templates \
    -e /home/stack/templates/overcloud-baremetal-deployed.yaml \
    -e /home/stack/templates/node-info.yaml \
    -e [your environment files] \
    -e /home/stack/templates/<compute_environment_file>.yaml
13.2.2. Creating an image for vTPM devices
When the overcloud is enabled to create instances that have vTPM devices, you can create a vTPM device instance image that your cloud users can use to launch instances that have vTPM devices.
If both the flavor and the image specify a TPM device model and the two values do not match, scheduling fails.
Procedure
Create a new image for vTPM devices:
(overcloud)$ openstack image create ... \
  --property hw_tpm_version=2.0 vtpm-image

Note: TPM version 1.2 is not supported.

Optional: Specify the TPM model to use:
(overcloud)$ openstack image set \
  --property hw_tpm_model=<tpm_model> \
  vtpm-image

Replace <tpm_model> with the model of TPM device to use. Set to one of the following valid values:
- tpm-tis: (Default) TPM Interface Specification.
- tpm-crb: Command-Response Buffer.

Note: The Compute service ignores the configuration of the hw_tpm_model property if the hw_tpm_version property is not set.
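For example, you can set both properties when you create the image. The source file, disk format, and image name shown here are placeholders only; adjust them for your environment:

(overcloud)$ openstack image create \
  --disk-format qcow2 --container-format bare \
  --file <image_file>.qcow2 \
  --property hw_tpm_version=2.0 \
  --property hw_tpm_model=tpm-crb \
  vtpm-crb-image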
Verification
Create an instance by using the vTPM image:
(overcloud)$ openstack server create --flavor m1.small \
  --image vtpm-image vtpm-instance

- Log in to the instance as a cloud user.
To verify that the instance has access to a vTPM device, enter the following command from the instance:
$ dmesg | grep -i tpm
13.2.3. Creating a flavor for vTPM devices
When the overcloud is enabled to create instances that have vTPM devices, you can create one or more vTPM device flavors that your cloud users can use to launch instances that have vTPM devices.
A vTPM device flavor is necessary only when the hw_tpm_model and hw_tpm_version properties are not set on an image. If both the flavor and the image specify a TPM device model and the two values do not match, scheduling fails.
Procedure
Create a flavor for vTPM devices:
(overcloud)$ openstack flavor create --vcpus 1 --ram 512 --disk 2 \
  --property hw:tpm_version=2.0 \
  vtpm-flavor

Note: TPM version 1.2 is not supported.

Optional: Specify the TPM model to use:
(overcloud)$ openstack flavor set \
  --property hw:tpm_model=<tpm_model> \
  vtpm-flavor

Replace <tpm_model> with the model of TPM device to use. Set to one of the following valid values:

- tpm-tis: (Default) TPM Interface Specification.
- tpm-crb: Command-Response Buffer. Compatible only with TPM version 2.0.

Note: The Compute service ignores the configuration of the hw:tpm_model property if the hw:tpm_version property is not set.
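For example, you can set both properties when you create the flavor. The flavor name and sizing shown here are placeholders only; adjust them for your workloads:

(overcloud)$ openstack flavor create --vcpus 2 --ram 2048 --disk 20 \
  --property hw:tpm_version=2.0 \
  --property hw:tpm_model=tpm-crb \
  vtpm-crb-flavor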
Verification
Create an instance by using the vTPM flavor:
(overcloud)$ openstack server create --flavor vtpm-flavor \
  --image rhel-image vtpm-instance

- Log in to the instance as a cloud user.
To verify that the instance has access to a vTPM device, enter the following command from the instance:
$ dmesg | grep -i tpm
Chapter 14. Database cleaning
The Compute service includes an administrative tool, nova-manage, that you can use to perform deployment, upgrade, clean-up, and maintenance-related tasks, such as applying database schemas, performing online data migrations during an upgrade, and managing and cleaning up the database.
Director automates the following database management tasks on the overcloud by using cron:
- Archives deleted instance records by moving the deleted rows from the production tables to shadow tables.
- Purges deleted rows from the shadow tables after archiving is complete.
14.1. Configuring database management
The cron jobs use default settings to perform database management tasks. By default, the database archiving cron jobs run daily at 00:01, and the database purging cron jobs run daily at 05:00, both with a jitter between 0 and 3600 seconds. You can modify these settings as required by using heat parameters.
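For example, a minimal sketch of how you might move the archive run to a different time and reduce the scheduling jitter, assuming the director heat parameters NovaCronArchiveDeleteRowsHour, NovaCronArchiveDeleteRowsMinute, and NovaCronArchiveDeleteRowsMaxDelay are available in your release; verify the parameter names against your deployed heat templates before use:

parameter_defaults:
  ...
  NovaCronArchiveDeleteRowsHour: '23'
  NovaCronArchiveDeleteRowsMinute: '30'
  NovaCronArchiveDeleteRowsMaxDelay: '600'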
Procedure
- Open your Compute environment file.
Add the heat parameter that controls the cron job that you want to add or modify. For example, to purge the shadow tables immediately after they are archived, set the following parameter to "True":
parameter_defaults:
  ...
  NovaCronArchiveDeleteRowsPurge: True

For a complete list of the heat parameters to manage database cron jobs, see Configuration options for the Compute service automated database management.
- Save the updates to your Compute environment file.
Add your Compute environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/<compute_environment_file>.yaml
14.2. Configuration options for the Compute service automated database management
Use the following heat parameters to enable and modify the automated cron jobs that manage the database.
| Parameter | Description |
|---|---|
| | Set this parameter to "True" to archive deleted instance records from all cells. Default: |
| | Use this parameter to archive deleted instance records based on their age in days. Default: |
| | Use this parameter to configure the file for logging deleted instance records. Default: |
| | Use this parameter to configure the hour at which to run the cron command to move deleted instance records to another table. Default: |
| | Use this parameter to configure the maximum delay, in seconds, before moving deleted instance records to another table. Default: |
| | Use this parameter to configure the maximum number of deleted instance records that can be moved to another table. Default: |
| | Use this parameter to configure the minute past the hour at which to run the cron command to move deleted instance records to another table. Default: |
| | Use this parameter to configure on which day of the month to run the cron command to move deleted instance records to another table. Default: |
| | Use this parameter to configure in which month to run the cron command to move deleted instance records to another table. Default: |
| | Set this parameter to "True" to purge shadow tables immediately after scheduled archiving. Default: |
| | Set this parameter to "True" to continue to move deleted instance records to another table until all records are moved. Default: |
| | Use this parameter to configure the user that owns the crontab that archives deleted instance records and that has access to the log file the crontab uses. Default: |
| | Use this parameter to configure on which day of the week to run the cron command to move deleted instance records to another table. Default: |
| | Use this parameter to purge shadow tables based on their age in days. Default: |
| | Set this parameter to "True" to purge shadow tables from all cells. Default: |
| | Use this parameter to configure the file for logging purged shadow tables. Default: |
| | Use this parameter to configure the hour at which to run the cron command to purge shadow tables. Default: |
| | Use this parameter to configure the maximum delay, in seconds, before purging shadow tables. Default: |
| | Use this parameter to configure the minute past the hour at which to run the cron command to purge shadow tables. Default: |
| | Use this parameter to configure in which month to run the cron command to purge the shadow tables. Default: |
| | Use this parameter to configure on which day of the month to run the cron command to purge the shadow tables. Default: |
| | Use this parameter to configure the user that owns the crontab that purges the shadow tables and that has access to the log file the crontab uses. Default: |
| | Use this parameter to enable verbose logging in the log file for purged shadow tables. Default: |
| | Use this parameter to configure on which day of the week to run the cron command to purge the shadow tables. Default: |
Chapter 15. Migrating virtual machine instances between Compute nodes
You sometimes need to migrate instances from one Compute node to another Compute node in the overcloud, to perform maintenance, rebalance the workload, or replace a failed or failing node.
- Compute node maintenance
- If you need to temporarily take a Compute node out of service, for example to perform hardware maintenance or repair, kernel upgrades, or software updates, you can migrate the instances running on the Compute node to another Compute node.
- Failing Compute node
- If a Compute node is about to fail and you need to service it or replace it, you can migrate instances from the failing Compute node to a healthy Compute node.
- Failed Compute nodes
- If a Compute node has already failed, you can evacuate the instances. You can rebuild instances from the original image on another Compute node, using the same name, UUID, network addresses, and any other allocated resources the instance had before the Compute node failed.
- Workload rebalancing
- You can migrate one or more instances to another Compute node to rebalance the workload. For example, you can consolidate instances on a Compute node to conserve power, migrate instances to a Compute node that is physically closer to other networked resources to reduce latency, or distribute instances across Compute nodes to avoid hot spots and increase resiliency.
Director configures all Compute nodes to provide secure migration. All Compute nodes also require a shared SSH key to provide the users of each host with access to other Compute nodes during the migration process. Director creates this key using the OS::TripleO::Services::NovaCompute composable service. This composable service is one of the main services included on all Compute roles by default. For more information, see Composable services and custom roles in the Customizing your Red Hat OpenStack Platform deployment guide.
If you have a functioning Compute node, and you want to make a copy of an instance for backup purposes, or to copy the instance to a different environment, follow the procedure in Importing virtual machines into the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.
15.1. Migration types
Red Hat OpenStack Platform (RHOSP) supports the following types of migration.
Cold migration
Cold migration, or non-live migration, involves shutting down a running instance before migrating it from the source Compute node to the destination Compute node.
Cold migration involves some downtime for the instance. The migrated instance maintains access to the same volumes and IP addresses.
Cold migration requires that both the source and destination Compute nodes are running.
Live migration
Live migration involves moving the instance from the source Compute node to the destination Compute node without shutting it down, and while maintaining state consistency.
Live migrating an instance involves little or no perceptible downtime. However, live migration does impact performance for the duration of the migration operation. Therefore, instances should be taken out of the critical path while being migrated.
Live migration impacts the performance of the workload being moved. Red Hat does not provide support for increased packet loss, network latency, memory latency, or a reduction in network bandwidth, memory bandwidth, storage IO, or CPU performance during live migration.
Live migration requires that both the source and destination Compute nodes are running.
In some cases, instances cannot use live migration. For more information, see Migration constraints.
Evacuation
If you need to migrate instances because the source Compute node has already failed, you can evacuate the instances.
15.2. Migration constraints
Migration constraints typically arise with block migration, configuration disks, or when one or more instances access physical hardware on the Compute node.
CPU constraints
The source and destination Compute nodes must have the same CPU architecture. For example, Red Hat does not support migrating an instance from a ppc64le CPU to an x86_64 CPU.
Migration between different CPU models is not supported. In some cases, the CPU of the source and destination Compute node must match exactly, such as instances that use CPU host passthrough. In all cases, the CPU features of the destination node must be a superset of the CPU features on the source node.
Memory constraints
The destination Compute node must have sufficient available RAM. Memory oversubscription can cause migration to fail.
Block migration constraints
Migrating instances that use disks that are stored locally on a Compute node takes significantly longer than migrating volume-backed instances that use shared storage, such as Red Hat Ceph Storage. This latency arises because OpenStack Compute (nova) migrates local disks block-by-block between the Compute nodes over the control plane network by default. By contrast, volume-backed instances that use shared storage, such as Red Hat Ceph Storage, do not have to migrate the volumes, because each Compute node already has access to the shared storage.
Network congestion in the control plane network caused by migrating local disks or instances that consume large amounts of RAM might impact the performance of other systems that use the control plane network, such as RabbitMQ.
Read-only drive migration constraints
Migrating a drive is supported only if the drive has both read and write capabilities. For example, OpenStack Compute (nova) cannot migrate a CD-ROM drive or a read-only config drive. However, OpenStack Compute (nova) can migrate a drive with both read and write capabilities, including a config drive with a drive format such as vfat.
Live migration constraints
In some cases, live migrating instances involves additional constraints.
Live migration impacts the performance of the workload being moved. Red Hat does not provide support for increased packet loss, network latency, memory latency or a reduction in network bandwidth, memory bandwidth, storage IO, or CPU performance during live migration.
- No new operations during migration
- To achieve state consistency between the copies of the instance on the source and destination nodes, RHOSP must prevent new operations during live migration. Otherwise, live migration might take a long time or potentially never end if writes to memory occur faster than live migration can replicate the state of the memory.
- CPU pinning with NUMA
- The NovaSchedulerEnabledFilters parameter in the Compute configuration must include the values AggregateInstanceExtraSpecsFilter and NUMATopologyFilter, as shown in the example environment snippet at the end of this section.
- Multi-cell clouds
- In a multi-cell cloud, you can live migrate instances to a different host in the same cell, but not across cells.
- Floating instances
- When live migrating floating instances, if the configuration of NovaComputeCpuSharedSet on the destination Compute node is different from the configuration of NovaComputeCpuSharedSet on the source Compute node, the instances will not be allocated to the CPUs configured for shared (unpinned) instances on the destination Compute node. Therefore, if you need to live migrate floating instances, you must configure all the Compute nodes with the same CPU mappings for dedicated (pinned) and shared (unpinned) instances, or use a host aggregate for the shared instances.
- Destination Compute node capacity
- The destination Compute node must have sufficient capacity to host the instance that you want to migrate.
- SR-IOV live migration
- Instances with SR-IOV-based network interfaces can be live migrated. Live migrating instances with direct mode SR-IOV network interfaces incurs network downtime. This is because the direct mode interfaces need to be detached and re-attached during the migration.
- Live migration on ML2/OVS deployments
- During the live migration process, when the virtual machine is unpaused in the destination host, the metadata service might not be available because the metadata server proxy has not yet spawned. This unavailability is brief. The service becomes available momentarily and the live migration succeeds.
- Constraints that preclude live migration
- You cannot live migrate an instance that uses the following features.
- PCI passthrough
- QEMU/KVM hypervisors support attaching PCI devices on the Compute node to an instance. Use PCI passthrough to give an instance exclusive access to PCI devices, which appear and behave as if they are physically attached to the operating system of the instance. However, because PCI passthrough involves direct access to the physical devices, QEMU/KVM does not support live migration of instances using PCI passthrough.
- Port resource requests
You cannot live migrate an instance that uses a port that has resource requests, such as a guaranteed minimum bandwidth QoS policy. Use the following command to check if a port has resource requests:
$ openstack port show <port_name/port_id>
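Example environment snippet for the CPU pinning with NUMA constraint described earlier in this section. This is a minimal sketch; keep any filters that your deployment already enables and ensure that the two required filters are included:

parameter_defaults:
  NovaSchedulerEnabledFilters:
    # Retain the filters that are already enabled in your deployment,
    # and ensure that these two filters are present:
    - AggregateInstanceExtraSpecsFilter
    - NUMATopologyFilter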
15.3. Preparing to migrate
Before you migrate one or more instances, you need to determine the Compute node names and the IDs of the instances to migrate.
Procedure
Identify the source Compute node host name and the destination Compute node host name:
(undercloud)$ source ~/overcloudrc
(overcloud)$ openstack compute service list

List the instances on the source Compute node and locate the ID of the instance or instances that you want to migrate:
(overcloud)$ openstack server list --host <source> --all-projects

Replace <source> with the name or ID of the source Compute node.

Optional: If you are migrating instances from a source Compute node to perform maintenance on the node, you must disable the node to prevent the scheduler from assigning new instances to the node during maintenance:
(overcloud)$ openstack compute service set <source> nova-compute --disable

Replace <source> with the host name of the source Compute node.
You are now ready to perform the migration. Follow the required procedure detailed in Cold migrating an instance or Live migrating an instance.
15.4. Cold migrating an instance
Cold migrating an instance involves stopping the instance and moving it to another Compute node. Cold migration facilitates migration scenarios that live migration cannot, such as migrating instances that use PCI passthrough. The scheduler automatically selects the destination Compute node. For more information, see Migration constraints.
Procedure
To cold migrate an instance, enter the following command to power off and move the instance:
(overcloud)$ openstack server migrate <instance> --wait

- Replace <instance> with the name or ID of the instance to migrate.
- Specify the --block-migration flag if migrating a locally stored volume.
- Wait for migration to complete. While you wait for the instance migration to complete, you can check the migration status. For more information, see Checking migration status.
Check the status of the instance:
(overcloud)$ openstack server list --all-projects

A status of "VERIFY_RESIZE" indicates you need to confirm or revert the migration:
If the migration worked as expected, confirm it:
(overcloud)$ openstack server resize --confirm <instance>

Replace <instance> with the name or ID of the instance to migrate. A status of "ACTIVE" indicates that the instance is ready to use.

If the migration did not work as expected, revert it:
(overcloud)$ openstack server resize --revert <instance>

Replace <instance> with the name or ID of the instance.
Restart the instance:
(overcloud)$ openstack server start <instance>

Replace <instance> with the name or ID of the instance.

Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:
(overcloud)$ openstack compute service set <source> nova-compute --enable

Replace <source> with the host name of the source Compute node.
15.5. Live migrating an instance
Live migration moves an instance from a source Compute node to a destination Compute node with a minimal amount of downtime. Live migration might not be appropriate for all instances. For more information, see Migration constraints.
Procedure
To live migrate an instance, specify the instance and the destination Compute node:
(overcloud)$ openstack server migrate <instance> --live-migration [--host <dest>] --wait

- Replace <instance> with the name or ID of the instance.
- Replace <dest> with the name or ID of the destination Compute node.

Note: The openstack server migrate command covers migrating instances with shared storage, which is the default. Specify the --block-migration flag to migrate a locally stored volume:

(overcloud)$ openstack server migrate <instance> --live-migration [--host <dest>] --wait --block-migration
Confirm that the instance is migrating. The instance status shows "MIGRATING" while the live migration is in progress.

- Wait for migration to complete. While you wait for the instance migration to complete, you can check the migration status. For more information, see Checking migration status.
Check the status of the instance to confirm if the migration was successful:
(overcloud)$ openstack server list --host <dest> --all-projects

Replace <dest> with the name or ID of the destination Compute node.

Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:
(overcloud)$ openstack compute service set <source> nova-compute --enable

Replace <source> with the host name of the source Compute node.
15.6. Checking migration status
Migration involves several state transitions before migration is complete. During a healthy migration, the migration state typically transitions as follows:
- Queued: The Compute service has accepted the request to migrate an instance, and migration is pending.
- Preparing: The Compute service is preparing to migrate the instance.
- Running: The Compute service is migrating the instance.
- Post-migrating: The Compute service has built the instance on the destination Compute node and is releasing resources on the source Compute node.
- Completed: The Compute service has completed migrating the instance and finished releasing resources on the source Compute node.
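If you want to watch these state transitions as they happen, one way is to poll the migration list from a host where your overcloud credentials are sourced and the watch utility is available, for example:

$ watch -n 10 openstack server migration list --server <instance>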
Procedure
Retrieve the list of migration IDs for the instance:
$ openstack server migration list --server <instance>

Replace <instance> with the name or ID of the instance.

Show the status of the migration:
$ openstack server migration show <instance> <migration_id>

- Replace <instance> with the name or ID of the instance.
- Replace <migration_id> with the ID of the migration.

The openstack server migration show command returns details about the migration, including its current status and memory copy progress.

Tip: The Compute service measures progress of the migration by the number of remaining memory bytes to copy. If this number does not decrease over time, the migration might be unable to complete, and the Compute service might abort it.
Sometimes instance migration can take a long time or encounter errors. For more information, see Troubleshooting migration.
15.7. Evacuating an instance
If you want to move an instance from a failed or shut-down Compute node to a new host in the same environment, you can evacuate it.
The evacuate process destroys the original instance and rebuilds it on another Compute node using the original image, instance name, UUID, network addresses, and any other resources the original instance had allocated to it.
If the instance uses shared storage, the instance root disk is not rebuilt during the evacuate process, as the disk remains accessible by the destination Compute node. If the instance does not use shared storage, then the instance root disk is also rebuilt on the destination Compute node.
- You can only perform an evacuation when the Compute node is fenced, and the API reports that the state of the Compute node is "down" or "forced-down". If the Compute node is not reported as "down" or "forced-down", the evacuate command fails.
- To perform an evacuation, you must be a cloud administrator.
15.7.1. Evacuating an instance
To evacuate all instances on a host, you must evacuate them one at a time.
Procedure
Confirm that the instance is not running:
(overcloud)$ openstack server list --host <node> --all-projects

Replace <node> with the name or UUID of the Compute node that hosts the instance.
Check the instance task state:
(overcloud)$ openstack server show <instance>

Replace <instance> with the name or UUID of the instance that you want to evacuate.

Note: If the instance task state is not "NONE", the evacuation might fail.
Confirm that the host Compute node is fenced or shut down:
(overcloud)$ openstack baremetal node show <node>

Replace <node> with the name or UUID of the Compute node that hosts the instance to evacuate. To perform an evacuation, the Compute node must have a status of down or forced-down.
Disable the Compute node:
(overcloud)$ openstack compute service set \
  <node> nova-compute --disable --disable-reason <disable_host_reason>

- Replace <node> with the name of the Compute node to evacuate the instance from.
- Replace <disable_host_reason> with details about why you disabled the Compute node.
Evacuate the instance:
(overcloud)$ openstack server evacuate [--host <dest>] \
  [--password <password>] <instance>

- Optional: Replace <dest> with the name of the Compute node to evacuate the instance to. If you do not specify the destination Compute node, the Compute scheduler selects one for you. You can find possible Compute nodes by using the following command:

  (overcloud)[stack@director ~]$ openstack hypervisor list

- Optional: Replace <password> with the administrative password required to access the evacuated instance. If a password is not specified, a random password is generated and output when the evacuation is complete.

  Note: The password is changed only when ephemeral instance disks are stored on the local hypervisor disk. The password is not changed if the instance is hosted on shared storage or has a Block Storage volume attached, and no error message is displayed to inform you that the password was not changed.

- Replace <instance> with the name or ID of the instance to evacuate.

Note: If the evacuation fails and the task state of the instance is not "NONE", contact Red Hat Support for help to recover the instance.
Optional: Enable the Compute node when it is recovered:
(overcloud)$ openstack compute service set \
  <node> nova-compute --enable

Replace <node> with the name of the Compute node to enable.
15.8. Troubleshooting migration
The following issues can arise during instance migration:
- The migration process encounters errors.
- The migration process never ends.
- Performance of the instance degrades after migration.
15.8.1. Errors during migration
The following issues can send the migration operation into an error state:
- Running a cluster with different versions of Red Hat OpenStack Platform (RHOSP).
- Specifying an instance ID that cannot be found.
- The instance you are trying to migrate is in an error state.
- The Compute service is shutting down.
- A race condition occurs.
- Live migration enters a failed state.
When live migration enters a failed state, it is typically followed by an error state. The following common issues can cause a failed state:
- A destination Compute host is not available.
- A scheduler exception occurs.
- The rebuild process fails due to insufficient computing resources.
- A server group check fails.
- The instance on the source Compute node gets deleted before migration to the destination Compute node is complete.
15.8.2. Never-ending live migration
Live migration can fail to complete, which leaves migration in a perpetual running state. A common reason for a live migration that never completes is that client requests to the instance running on the source Compute node create changes that occur faster than the Compute service can replicate them to the destination Compute node.
Use one of the following methods to address this situation:
- Abort the live migration.
- Force the live migration to complete.
Aborting live migration
If the instance state changes faster than the migration procedure can copy it to the destination node, and you do not want to temporarily suspend the instance operations, you can abort the live migration.
Procedure
Retrieve the list of migrations for the instance:
$ openstack server migration list --server <instance>

Replace <instance> with the name or ID of the instance.

Abort the live migration:

$ openstack server migration abort <instance> <migration_id>

- Replace <instance> with the name or ID of the instance.
- Replace <migration_id> with the ID of the migration.
Forcing live migration to complete
If the instance state changes faster than the migration procedure can copy it to the destination node, and you want to temporarily suspend the instance operations to force migration to complete, you can force the live migration procedure to complete.
Forcing live migration to complete might lead to perceptible downtime.
Procedure
Retrieve the list of migrations for the instance:
$ openstack server migration list --server <instance>

Replace <instance> with the name or ID of the instance.

Force the live migration to complete:

$ openstack server migration force complete <instance> <migration_id>

- Replace <instance> with the name or ID of the instance.
- Replace <migration_id> with the ID of the migration.
15.8.3. Instance performance degrades after migration
For instances that use a NUMA topology, the source and destination Compute nodes must have the same NUMA topology and configuration. The NUMA topology of the destination Compute node must have sufficient resources available. If the NUMA configuration between the source and destination Compute nodes is not the same, it is possible that live migration succeeds while the instance performance degrades. For example, if the source Compute node maps NIC 1 to NUMA node 0, but the destination Compute node maps NIC 1 to NUMA node 5, after migration the instance might route network traffic from a first CPU across the bus to a second CPU in NUMA node 5 to route traffic to NIC 1. The instance might behave as expected, but with degraded performance. Similarly, if NUMA node 0 on the source Compute node has sufficient available CPU and RAM, but NUMA node 0 on the destination Compute node already has instances using some of the resources, the instance might run correctly but suffer performance degradation. For more information, see Migration constraints.
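As a general diagnostic suggestion rather than a step from this procedure, you can compare the NUMA topologies of the source and destination Compute nodes before you migrate, for example by running the following command on each node:

$ lscpu | grep -i numa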
For instances that use a NUMA topology, the source and destination Compute nodes must have the same NUMA topology and configuration. The NUMA topology of the destination Compute node must have sufficient resources available. If the NUMA configuration between the source and destination Compute nodes is not the same, it is possible that live migration succeeds while the instance performance degrades. For example, if the source Compute node maps NIC 1 to NUMA node 0, but the destination Compute node maps NIC 1 to NUMA node 5, after migration the instance might route network traffic from a first CPU across the bus to a second CPU with NUMA node 5 to route traffic to NIC 1. This can result in expected behavior, but degraded performance. Similarly, if NUMA node 0 on the source Compute node has sufficient available CPU and RAM, but NUMA node 0 on the destination Compute node already has instances using some of the resources, the instance might run correctly but suffer performance degradation. For more information, see Migration constraints.