Planning a large-scale RHOSO deployment


Red Hat OpenStack Services on OpenShift 18.0

Hardware requirements and recommendations for large deployments

OpenStack Documentation Team

Abstract

This guide contains requirements and recommendations for deploying Red Hat OpenStack Services on OpenShift (RHOSO) at scale.

Providing feedback on Red Hat documentation

We appreciate your input on our documentation. Tell us how we can make it better.

Use the Create Issue form to provide feedback on the documentation for Red Hat OpenStack Services on OpenShift (RHOSO) or earlier releases of Red Hat OpenStack Platform (RHOSP). When you create an issue for RHOSO or RHOSP documents, the issue is recorded in the RHOSO Jira project, where you can track the progress of your feedback.

To complete the Create Issue form, ensure that you are logged in to Jira. If you do not have a Red Hat Jira account, you can create an account at https://issues.redhat.com.

  1. Click the following link to open a Create Issue page: Create Issue
  2. Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue. Do not modify any other fields in the form.
  3. Click Create.

Chapter 1. Requirements for RHOSO at scale

Deploy Red Hat OpenStack Services on OpenShift (RHOSO) on a Red Hat OpenShift Container Platform (RHOCP) cluster.

RHOSO consists of a control plane and a data plane. The control plane is the set of RHOSO services that are installed on RHOCP. The data plane is one or more node sets, which the control plane deploys.

A node set is a group of RHOSO Compute nodes or RHOSO Networker nodes that share common node properties. A node set is comparable to composable roles in previous releases.
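You define a node set with an OpenStackDataPlaneNodeSet custom resource. The following is a minimal sketch, not a complete definition: the node set name, node entries, and ansibleUser value are placeholders, and the nodeTemplate fields follow the examples shown later in this guide:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: compute-rack-1           # placeholder node set name
spec:
  nodeTemplate:
    ansible:
      ansibleUser: cloud-admin   # placeholder user for data plane nodes
  nodes:
    compute-0:
      hostName: compute-0        # one entry per Compute node in the set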

Red Hat has tested RHOSO 18 with a deployment of up to 250 bare-metal nodes with various node sets.

For information on RHOSO deployment steps, see Deploying Red Hat OpenStack Services on OpenShift.

1.1. Control plane requirements for RHOSO at scale

Ensure that your Red Hat OpenShift Container Platform (RHOCP) cluster meets the minimum tested requirements for hosting Red Hat OpenStack Services on OpenShift (RHOSO).

Node count: A 3-node RHOCP cluster at version 4.16

CPUs: 40 cores, 80 threads in total

Disk:

  • 500 GB root disk (1x SSD, or 2x 7200 RPM hard drives in RAID 1)
  • 500 GB dedicated disk for Swift (1x SSD or 1x NVMe)
  • Optional: 500 GB disk for image caching (1x SSD, or 2x 7200 RPM hard drives in RAID 1)

Memory: 384 GB

Network: 10 Gbps or higher

1.2. Data plane requirements for RHOSO at scale

Ensure that your Compute nodes meet the tested requirements before you deploy a Red Hat OpenStack Services on OpenShift (RHOSO) data plane.

Compute nodes in a node set: Up to 50 nodes

Total Compute node count: Up to 250 nodes

CPUs: At least 2 sockets, each with 12 cores (24 threads)

Disk: At least a 500 GB root disk (1x SSD, or 2x 7200 RPM hard drives in RAID 1)

Memory: At least 128 GB (64 GB per NUMA node). 2 GB is reserved for the host by default. With Distributed Virtual Routing (DVR), increase the reserved RAM to 5 GB.

Network interfaces: 2x 10 Gbps or faster

1.3. Red Hat Ceph Storage node system requirements

For information about system requirements for Red Hat Ceph Storage, see the Red Hat Ceph Storage documentation.

Chapter 2. RHOSO deployment best practices

The deployment of Red Hat OpenStack Services on OpenShift (RHOSO) can be a network-intensive activity. Take steps to reduce the chances of network saturation and unnecessary troubleshooting.

2.1. RHOSO deployment preparation

Prepare to deploy Red Hat OpenStack Services on OpenShift at scale by reviewing the networking requirements for deployment and for operation.

Dedicate separate NICs for RHOCP and RHOSO networks

You must have a minimum of two NICs for each control plane worker node, as illustrated in the sketch after this list:

  • One NIC is dedicated to OpenShift, facilitating communication between OpenShift components within the cluster network.
  • The other NIC is designated for OpenStack, enabling connectivity between OpenStack services on worker nodes and the isolated networks in the RHOSO data plane.
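For illustration, one way to attach isolated network VLANs to the dedicated OpenStack NIC on a worker node is with an NMState NodeNetworkConfigurationPolicy resource. This is a minimal sketch only; the node name, interface name enp7s0, VLAN ID, and IP address are placeholders for your environment:

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: osp-enp7s0-worker-0
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0   # apply to one worker node (placeholder)
  desiredState:
    interfaces:
    - name: enp7s0.20                  # VLAN interface for an isolated RHOSO network
      type: vlan
      state: up
      vlan:
        base-iface: enp7s0             # the NIC dedicated to OpenStack traffic
        id: 20
      ipv4:
        enabled: true
        dhcp: false
        address:
        - ip: 172.17.0.10              # placeholder address on the isolated network
          prefix-length: 24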
Limit the number of nodes for Bare Metal Provisioning (ironic) introspection

When you run introspection on many nodes at once, the introspection process can fail. If this occurs, perform introspection on no more than 50 nodes at a time.

Ensure that the provisioning network has enough IPs allocated for the number of nodes that you expect to have in the environment.

Enable Jumbo Frames for networks with heavy traffic

A standard frame has a maximum MTU of 1500 bytes. Jumbo frames can be as large as 9000 bytes. Jumbo frames can reduce CPU overhead on high throughput network connections because fewer datagrams must be processed per gigabyte of transferred data.

Enable jumbo frames only for networks that have a network switch that supports them. Networks that are known to have better performance with jumbo frames include the following (see the sketch after this list):

  • Tenant network
  • Storage network
  • Management network
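For illustration, in RHOSO you can set the MTU for an isolated network in the NetConfig CR. The following is a minimal sketch; the network name, VLAN ID, and CIDR are placeholders, and the field layout assumes the default NetConfig schema:

apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: storage
    mtu: 9000              # enable jumbo frames on the storage network
    subnets:
    - name: subnet1
      cidr: 172.18.0.0/24  # placeholder CIDR
      vlan: 21             # placeholder VLAN ID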

2.2. RHOSO deployment configurations

Ensure the successful operation of Red Hat OpenStack Services on OpenShift (RHOSO) by understanding the constraints, behaviors, and networking properties of RHOSO.

Validate your custom resources (CRs) with a small scale deployment
Deploy a small environment with a control plane hosted on a three-node RHOCP cluster, one data plane node, and three Red Hat Ceph Storage nodes. Use this configuration to ensure your CR configurations are correct.
Limit the number of data plane nodes that are provisioned at the same time
You can typically fit 50 servers in an average enterprise-level rack. Using this assumption, deploy one rack at a time, using one node set per rack, as shown in the sketch that follows. Red Hat has successfully tested a deployment with 5 node sets totaling 250 nodes. When you limit the number of nodes in a single node set, you minimize the debugging necessary to diagnose potential deployment issues.
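For illustration, you can roll out one rack at a time by listing a single node set in each OpenStackDataPlaneDeployment CR. A minimal sketch, with placeholder names:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: deploy-rack-1    # one deployment per rack (placeholder name)
spec:
  nodeSets:
  - compute-rack-1       # the node set for this rack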
Power off unused Bare Metal Provisioning (ironic) nodes

When Bare Metal Provisioning automated cleaning is enabled and a node fails cleaning, the node is set to maintenance mode. Nodes in maintenance mode can remain powered on and be incorrectly reported by Bare Metal Provisioning (ironic) as powered off. This can cause problems with ongoing deployments.

If you are redeploying after a failed deployment, ensure that you power off all unused nodes by using the power management device of each node.

Improve instance distribution across Compute

The Compute scheduler updates its record of Compute node resources only after scheduled instances are confirmed on the Compute node. To help prevent the uneven distribution of instances across Compute nodes, perform the following actions:

  • Set the value of the [filter_scheduler] shuffle_best_same_weighed_hosts parameter to true.
  • To ensure that a Compute node is not overloaded with instances, set max_instances_per_host to the maximum number of instances that any Compute node can spawn, and ensure that the NumInstancesFilter parameter is enabled. When a Compute node reaches this instance count, the scheduler no longer selects it for further instance scheduling. See the sketch after this list.
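The following is a minimal sketch of these settings applied through a customServiceConfig hook in the nova/template section of your OpenStackControlPlane CR. The schedulerServiceTemplate path is assumed from the default nova template layout, and the max_instances_per_host value of 50 is only an example:

spec:
    nova:
        template:
            schedulerServiceTemplate:
                customServiceConfig: |
                    [filter_scheduler]
                    # NumInstancesFilter must be listed in enabled_filters
                    shuffle_best_same_weighed_hosts = true
                    max_instances_per_host = 50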

Additionally, set configurations for the Networking service (neutron) to improve performance at scale.

The ovsdb-server sends probes at specific intervals to the following clients:

  • neutron
  • ovn-controller
  • ovn-metadata-agent

If ovsdb-server does not get a reply from one of these clients before a timeout is reached, it disconnects from the client and forces a reconnect. A client can be slow to respond after the initial connection, while it loads a copy of the database into memory. If the timeout is too low, ovsdb-server can disconnect the client during this process; when the client reconnects, the loading starts over and repeats continuously. If the maximum timeout interval does not work, disable the probe by setting the interval value to 0.

If the client-side probe intervals are disabled, the clients use TCP keepalive messages to monitor their connections to ovsdb-server.

The following settings are tested and validated to improve performance and stability on a large-scale RHOSO environment.

OVN Southbound server-side inactivity probe

Increase the probe interval to 180000 ms in the OpenStackControlPlane CR file:

spec:
    ovn:
        template:
            ovnDBCluster:
                ovndbcluster-sb:
                    dbType: SB
                    inactivityProbe: 180000
OVN Northbound server-side inactivity probe

Increase the probe interval to 60000 ms in the OpenStackControlPlane CR file:

spec:
    ovn:
        template:
            ovnDBCluster:
                ovndbcluster-nb:
                    dbType: NB
                    inactivityProbe: 60000
OVN controller remote probe interval on Compute nodes

Increase the probe interval to 180000 ms by using the edpm_ovn_remote_probe_interval variable in your node set CR file:

spec:
    nodeTemplate:
        ansible:
            ansibleUser: root
            ansibleVars:
                edpm_ovn_remote_probe_interval: 180000
Networking service client-side probe interval

Increase the probe interval to 180000 ms using the customServiceConfig hook in the neutron/template section of your OpenStackControlPlane CR file:

spec:
    neutron:
        template:
            customServiceConfig: |
                [ovn]
                ovsdb_probe_interval = 180000
Networking service api_workers
Increase the default number of separate API worker processes to 16 or more, based on the load on the neutron-server.
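A minimal sketch, using the same customServiceConfig hook in the neutron/template section of your OpenStackControlPlane CR; api_workers is a [DEFAULT] option, and 16 is a starting point to tune against the observed load:

spec:
    neutron:
        template:
            customServiceConfig: |
                [DEFAULT]
                api_workers = 16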
Networking service agent_down_time

Set agent_down_time to the maximum permissible number of 2147483 for very large clusters. Use the customServiceConfig hook in the neutron/template section of your OpenStackControlPlane CR file:

spec:
    neutron:
        template:
            customServiceConfig: |
                [DEFAULT]
                agent_down_time = 2147483
OVN metadata client-side probe interval on Compute nodes

Increase the probe interval to 180000 ms by using the edpm_neutron_metadata_agent_ovn_ovsdb_probe_interval variable in your node set CR file:

spec:
    nodeTemplate:
        ansible:
            ansibleUser: root
            ansibleVars:
                edpm_neutron_metadata_agent_ovn_ovsdb_probe_interval: 180000

2.3. Control plane resource tuning for RHOSO at scale

When you scale your Red Hat OpenStack Services on OpenShift (RHOSO) deployment, consider tuning your custom resources (CRs) to allocate more resources.

Procedure

  1. Edit the rabbitmq-cell1 section of the OpenStackControlPlane manifest file and configure resources to the following values:

    • persistence storage: 20Gi
    • replicas: 3
    • cpu: 8
    • memory: 20Gi

      Example

            rabbitmq-cell1:
              persistence:
                storage: 20Gi
              rabbitmq: {}
              replicas: 3
              resources:
                limits:
                  cpu: "8"
                  memory: 20Gi
                requests:
                  cpu: "8"
                  memory: 20Gi
  2. Edit the galera section of the OpenStackControlPlane manifest file and configure resources to the following values:

    • replicas: 3
    • storageRequest: 20G

      Example

        galera:
          enabled: true
          templates:
            openstack:
              logToDisk: false
              replicas: 3
              secret: osp-secret
              storageClass: lvms-vg1
              storageRequest: 20G
              tls: {}
            openstack-cell1:
              logToDisk: false
              replicas: 3
              secret: osp-secret
              storageClass: lvms-vg1
              storageRequest: 20G
              tls: {}
  3. Edit the nova, neutron, keystone, and glance sections of the OpenStackControlPlane manifest file and configure those services to have three replicas, as in the sketch after this procedure.
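The nesting of the replicas field differs per service. A minimal sketch, assuming the default template layout for each service; verify the exact paths against your OpenStackControlPlane CR:

spec:
  keystone:
    template:
      replicas: 3
  neutron:
    template:
      replicas: 3
  glance:
    template:
      glanceAPIs:
        default:          # default glanceAPI instance (assumed name)
          replicas: 3
  nova:
    template:
      apiServiceTemplate:
        replicas: 3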

Legal Notice

Copyright © 2025 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.