
Chapter 3. Red Hat OpenStack deployment best practices


Review the following best practices when you plan and prepare to deploy OpenStack. You can apply one or more of these practices in your environment.

3.1. Red Hat OpenStack deployment preparation

Before you deploy Red Hat OpenStack Platform (RHOSP), review the following list of deployment preparation tasks. You can apply one or more of the deployment preparation tasks in your environment:

Set a subnet range for introspection that accommodates the maximum number of overcloud nodes on which you perform introspection at a time
When you use director to deploy and configure RHOSP, use CIDR notation for the control plane network to accommodate all overcloud nodes that you add now or in the future.
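
For example, the following undercloud.conf settings size the control plane subnet and the introspection range. The option names are standard undercloud.conf settings; the addresses are placeholder values that you must adapt to your environment:

[ctlplane-subnet]
# Size the CIDR for all current and future overcloud nodes
cidr = 192.168.24.0/24
dhcp_start = 192.168.24.5
dhcp_end = 192.168.24.55
# Range used only during introspection
inspection_iprange = 192.168.24.100,192.168.24.150
gateway = 192.168.24.1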
Set the root password on your overcloud image to allow console access to the overcloud image
Use the console to troubleshoot failed deployments when networking is configured incorrectly. Adhere to the information security policies of your organization for password management when you implement this recommendation.
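
As a minimal sketch, you can set the root password on the overcloud image with virt-customize before you upload the image; the password value and image directory are placeholders:

$ virt-customize -a overcloud-full.qcow2 --root-password password:<password>
$ openstack overcloud image upload --update-existing --image-path /home/stack/images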
Use scheduler hints to assign hardware to a role
  • Use scheduler hints to assign hardware to a role, such as Controller, Compute, CephStorage, and others. Scheduler hints make it easier to identify deployment issues that affect only a specific piece of hardware. See the example after this list.
  • The nova-scheduler, which is a single process, can become overloaded when it schedules a large number of nodes. Scheduler hints that implement tag matching reduce this load. As a result, nova-scheduler encounters fewer scheduling errors during the deployment and the deployment takes less time.
  • Do not use profile tagging when you use scheduler hints.
  • In performance testing, use identical hardware for specific roles to reduce variability in testing and performance results.
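
The following example is a minimal sketch of this approach: it tags a node with a capability and matches that capability to the Controller role through scheduler hints. The node UUID is a placeholder.

$ openstack baremetal node set <node_uuid> \
    --property capabilities='node:controller-0,boot_option:local'

Then match the capability in an environment file:

parameter_defaults:
  ControllerSchedulerHints:
    'capabilities:node': 'controller-%index%'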
Set the World Wide Name (WWN) as the root disk hint for each node to prevent nodes from using the wrong disk during deployment and booting
When nodes contain multiple disks, use the introspection data to set the WWN as the root disk hint for each node. This prevents the node from using the wrong disk during deployment and booting. For more information, see Defining the Root Disk for Multi-Disk Clusters in the Director Installation and Usage guide.
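
For example, after you identify the WWN from the introspection data, set it as the root device hint for the node; the node UUID and WWN values are placeholders:

$ openstack baremetal node set <node_uuid> \
    --property root_device='{"wwn": "0x4000cca77fc4dba1"}'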
Enable the Bare Metal service (ironic) automated cleaning on nodes that have more than one disk

Use the Bare Metal service automated cleaning to erase metadata on nodes that have more than one disk and are likely to have multiple boot loaders. Nodes with multiple boot loaders on their disks can become inconsistent with the boot disk, which leads to node deployment failure when the node attempts to pull the metadata from the wrong URL.

To enable the Bare Metal service automated cleaning, on the undercloud node, edit the undercloud.conf file and add the following line:

clean_nodes = true
Limit the number of nodes for Bare Metal (ironic) introspection

If you perform introspection on all nodes at the same time, failures might occur due to network constraints. Perform introspection on up to 50 nodes at a time.

Ensure that the dhcp_start and dhcp_end range in the undercloud.conf file is large enough for the number of nodes that you expect to have in the environment.

If there are not enough available IP addresses, the size of the range limits the number of simultaneous introspection operations. After introspection completes, wait a few minutes for the introspection DHCP leases to expire before you issue more IP addresses.
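
As an illustrative sketch, and not a documented procedure, the following pipeline introspects registered nodes in batches of 50, assuming that all listed nodes are in the manageable state:

$ openstack baremetal node list -f value -c UUID | \
    xargs -n 50 openstack overcloud node introspect --provide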

Prepare Ceph for different types of configurations

The following list is a set of recommendations for different types of configurations:

  • All-flash OSD configuration

    Each OSD requires additional CPU resources according to the IOPS capacity of the device type, so Ceph IOPS are CPU-limited at a lower number of OSDs. This is especially true for NVMe SSDs, which can have two orders of magnitude higher IOPS capacity than traditional HDDs. For SATA/SAS SSDs, expect one order of magnitude greater random IOPS per OSD than HDDs, but only about a two to four times increase in sequential IOPS. It is possible to supply fewer CPU resources to Ceph than its OSD devices need, so plan CPU capacity accordingly.

  • Hyper Converged Infrastructure (HCI)

    Reserve at least half of your CPU capacity, memory, and network bandwidth for the OpenStack Compute (nova) guests. Ensure that you have enough CPU capacity and memory to support both OpenStack Compute (nova) guests and Ceph Storage. Observe memory consumption because Ceph Storage memory consumption is not elastic. On a multi-socket system, limit Ceph CPU consumption by NUMA-pinning Ceph to a single socket, for example, with the numactl -N 0 -p 0 command. Do not hard-pin Ceph memory consumption to a single socket. See the reservation sketch after this list.

  • Latency-sensitive applications such as NFV

    If possible, place Ceph on the same CPU socket as the network card that Ceph uses, and limit the network card interrupts to that CPU socket. Run the latency-sensitive network application on a different NUMA socket and network card.

    If you use dual boot loaders, use disk-by-path for the OSD map. This gives consistent deployments across reboots, unlike device names, which can change. The following snippet is an example of the CephAnsibleDisksConfig parameter for a disk-by-path mapping.

    CephAnsibleDisksConfig:
      osd_scenario: non-collocated
      devices:
        - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:2:0:0
        - /dev/disk/by-path/pci-0000:03:00.0-scsi-0:2:1:0
      dedicated_devices:
        - /dev/nvme0n1
        - /dev/nvme0n1
      journal_size: 512
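
To illustrate the HCI reservation guidance in this list, the following environment file snippet reserves host memory for Ceph and the host, and lowers the CPU allocation ratio for a hyperconverged Compute role. The values shown are examples only; derive the correct values from your own hardware and workload.

parameter_defaults:
  ComputeHCIParameters:
    # Memory (MB) withheld from nova guests for Ceph and host processes
    NovaReservedHostMemory: 75000
  ComputeHCIExtraConfig:
    # Lower than the default ratio to leave CPU capacity for Ceph
    nova::cpu_allocation_ratio: 8.2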

3.2. Red Hat OpenStack deployment configuration

Review the following list of recommendations for your Red Hat OpenStack Platform (RHOSP) deployment configuration:

Validate the heat templates with a small scale deployment
Deploy a small environment that consists of at least three Controller nodes, one Compute node, and three Ceph Storage nodes. You can use this configuration to ensure that all of your heat templates are correct.
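
For example, you can pin the node counts for the validation deployment in an environment file; this is a minimal sketch that assumes the default role names:

parameter_defaults:
  ControllerCount: 3
  ComputeCount: 1
  CephStorageCount: 3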
Disable telemetry notifications on the undercloud

You can disable telemetry notifications on the undercloud for the following OpenStack services to decrease the RabbitMQ queue:

  • Compute (nova)
  • Networking (neutron)
  • Orchestration (heat)
  • Identity (keystone)

To disable the notifications, include the /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml environment file in your deployment. This file sets the notification driver to noop.
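
For example, include the environment file when you deploy:

$ openstack overcloud deploy \
  --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml \
  ...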

Limit the number of nodes that are provisioned at the same time

Fifty is the typical number of servers that fit within an average enterprise-level rack; therefore, you can deploy an average of one rack of nodes at one time.

To minimize the debugging necessary to diagnose issues with the deployment, deploy no more than 50 nodes at one time. However, if you want to deploy a higher number of nodes, Red Hat has successfully tested up to 100 nodes simultaneously.

To scale Compute nodes in batches, use the openstack overcloud deploy command with the --limit option. This can save time and lower resource consumption on the undercloud.

Note

The --limit option is in Technology Preview.

Use the --tags option with a comma-separated list of tags from the config-download playbook to run the deployment with a specific set of config-download tasks.
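
For example, as a sketch that assumes valid config-download tags for your deployment:

$ openstack overcloud deploy \
  --templates \
  --config-download-only \
  --tags <tag1>,<tag2>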

Disable unused NICs

If the overcloud has any unused NICs during the deployment, you must define the unused interfaces in the NIC configuration templates and set the interfaces to use_dhcp: false and defroute: false.

If you do not define unused interfaces, there might be routing issues and IP allocation problems during introspection and scaling operations. By default, the NICs set BOOTPROTO=dhcp, which means the unused overcloud NICs consume IP addresses that are needed for the PXE provisioning. This can reduce the pool of available IP addresses for your nodes.
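
The following snippet is a minimal example of the corresponding os-net-config entry in a NIC configuration template, assuming that nic4 is the unused interface:

- type: interface
  name: nic4
  use_dhcp: false
  defroute: false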

Power off unused Bare Metal Provisioning (ironic) nodes
Ensure that you power off any unused Bare Metal Provisioning (ironic) nodes in maintenance mode. Red Hat has identified cases where nodes from previous deployments are left in maintenance mode in a powered-on state. This can occur with Bare Metal automated cleaning, where a node that fails cleaning is set to maintenance mode. Bare Metal Provisioning does not track the power state of nodes in maintenance mode and incorrectly reports the power state as off. This can cause problems with ongoing deployments. When you redeploy after a failed deployment, power off all unused nodes by using the power management device of each node.
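
For example, to list nodes in maintenance mode and power off an unused node; the node UUID is a placeholder:

$ openstack baremetal node list --maintenance
$ openstack baremetal node power off <node_uuid>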

3.3. Tuning the undercloud

Review this section when you plan to scale your Red Hat OpenStack Platform (RHOSP) deployment and tune your default undercloud settings.

If you use the Telemetry service (ceilometer), improve the performance of the service
Because the Telemetry service is CPU-intensive, telemetry is not enabled by default in RHOSP 16.1. If you want to use the Telemetry service, you can improve its performance.
Separate the provisioning and configuration processes
  • To create only the stack and associated RHOSP resources, you can run the deployment command with the --stack-only option.
  • Red Hat recommends separating the stack and config-download steps when deploying more than 100 nodes.

Include any environment files that are required for your overcloud:

$ openstack overcloud deploy \
  --templates \
  -e <environment-file1.yaml> \
  -e <environment-file2.yaml> \
  ...
  --stack-only
  • After you have provisioned the stack, you can enable SSH access for the tripleo-admin user from the undercloud to the overcloud. The config-download process uses the tripleo-admin user to perform the Ansible based configuration:

    $ openstack overcloud admin authorize
  • To disable the overcloud stack creation and run only the config-download workflow to apply the software configuration, you can run the deployment command with the --config-download-only option. Include any environment files that are required for your overcloud:

    $ openstack overcloud deploy \
     --templates \
     -e <environment-file1.yaml> \
     -e <environment-file2.yaml> \
      ...
     --config-download-only
  • To limit the config-download playbook execution to a specific node or set of nodes, you can use the --limit option.
  • The --limit option can be used to separate nodes into different roles, to limit the number of nodes to deploy, or to separate nodes with a specific hardware type. For scale-up operations, when you want to apply software configuration on the new nodes only, use the --limit option with the --config-download-only option.

    $ openstack overcloud deploy \
    --templates \
    -e <environment-file1.yaml> \
    -e <environment-file2.yaml> \
    ...
    --config-download-only --config-download-timeout <timeout_minutes> --limit <Undercloud>,<Controller>,<Compute-1>,<Compute-2>

    If you use the --limit option, always include <Controller> and <Undercloud> in the list. Tasks that use the external_deploy_steps interface, for example all Ceph configurations, are executed only if <Undercloud> is included in the options list. All external_deploy_steps tasks run on the undercloud.

    For example, if you run a scale-up task to add a compute node that requires a connection to Ceph and you do not include <Undercloud> in the list, the Ceph configuration and cephx key files are missing, and the task fails. Do not use the --skip-tags external_deploy_steps option or the task fails.
