Chapter 2. Recommended specifications
2.1. Undercloud
For best performance, install the undercloud node on a physical server. However, if you use a virtualized undercloud node, ensure that the virtual machine has resources similar to those of the physical machine described in the following table.
Counts | 1
CPUs | 12 cores, 24 threads
Disk | 500 GB root disk (1x SSD or 2x 7200 RPM hard drives; RAID 1); 500 GB disk for Object Storage (swift) (1x SSD or 2x 7200 RPM hard drives; RAID 1)
Memory | 64 GB
Network | 10 Gbps network interfaces
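If you use a virtualized undercloud node, you can compare the virtual machine resources against the baseline in the preceding table. The following Python sketch is only an illustration of that comparison; the threshold values are copied from the table, and the vm dictionary is a hypothetical description of your virtual machine, not output from any OpenStack tool.

    # Illustrative check that a virtualized undercloud node meets the baseline
    # in the preceding table. The "vm" dictionary is a hypothetical input that
    # you fill in from your hypervisor; it is not produced by any OpenStack tool.
    UNDERCLOUD_MINIMUM = {
        "cpu_threads": 24,     # 12 cores, 24 threads
        "root_disk_gb": 500,   # 500 GB root disk
        "swift_disk_gb": 500,  # 500 GB disk for Object Storage (swift)
        "memory_gb": 64,       # 64 GB RAM
        "network_gbps": 10,    # 10 Gbps network interfaces
    }

    def undersized_resources(vm):
        """Return the resources that fall below the recommended baseline."""
        return [key for key, minimum in UNDERCLOUD_MINIMUM.items()
                if vm.get(key, 0) < minimum]

    # Example: a virtual machine sized exactly at the recommendation passes.
    vm = {"cpu_threads": 24, "root_disk_gb": 500, "swift_disk_gb": 500,
          "memory_gb": 64, "network_gbps": 10}
    shortfalls = undersized_resources(vm)
    print("OK" if not shortfalls else "Undersized: %s" % shortfalls)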
2.2. Overcloud Controller nodes
Each control plane service must run on exactly 3 nodes. Typically, all control plane services are deployed across the same 3 Controller nodes.
Scaling controller services
To increase the resources available for controller services, you can scale these services to additional nodes. For example, you can deploy the database or messaging controller services on dedicated nodes to reduce the load on the Controller nodes.
To scale controller services, use composable roles to define the set of services that you want to scale. When you use composable roles, each service must run on exactly 3 additional dedicated nodes, and the total number of nodes in the control plane must be odd to maintain Pacemaker quorum. An illustrative check of this rule follows the node list below.
The control plane in this example consists of the following 9 nodes:
- 3 Controller nodes
- 3 Database nodes
- 3 Messaging nodes
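The layout rule described above can be expressed as a simple check: each scaled controller service runs on exactly 3 dedicated nodes, and the total node count stays odd to maintain Pacemaker quorum. The following Python sketch only illustrates that rule; the role names are taken from the example layout above, and the function is not part of the director tooling.

    # Illustrative check of the control plane layout rule: each scaled
    # controller service runs on exactly 3 dedicated nodes, and the total
    # node count is odd so that Pacemaker quorum is maintained.
    def control_plane_problems(layout):
        problems = []
        for role, count in layout.items():
            if count != 3:
                problems.append("%s: expected 3 nodes, got %d" % (role, count))
        total = sum(layout.values())
        if total % 2 == 0:
            problems.append("total node count %d is even; quorum needs an odd count" % total)
        return problems

    # Example layout from this section: 3 Controller + 3 Database + 3 Messaging = 9 nodes.
    layout = {"Controller": 3, "Database": 3, "Messaging": 3}
    print(control_plane_problems(layout) or "Layout satisfies the scaling rule")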
For more information, see Composable services and custom roles in Advanced Overcloud Customization.
For questions about scaling controller services with composable roles, contact Red Hat Global Professional Services.
Storage considerations
Include sufficient storage when you plan Controller nodes in your overcloud deployment. The OpenStack Telemetry Metrics (gnocchi) and OpenStack Image (glance) services are I/O intensive. Use Ceph Storage for the Telemetry and Image services because the overcloud moves the I/O load for these services to the Ceph OSD servers.
If your deployment does not include Ceph Storage, use a dedicated disk or node for Object Storage (swift) that the Telemetry Metrics (gnocchi) and Image (glance) services can use. If you use Object Storage on Controller nodes, use an NVMe device that is separate from the root disk to reduce disk utilization during object storage operations.
Controller node specifications when Object Storage (swift) does not run on the Controller nodes:
Counts | 3 Controller nodes with controller services contained within the Controller role. Optionally, to scale controller services on dedicated nodes, use composable services. For more information, see Composable services and custom roles in Advanced Overcloud Customization.
CPUs | 2 sockets, each with 12 cores and 24 threads
Disk | 500 GB root disk (1x SSD or 2x 7200 RPM hard drives; RAID 1)
Memory | 128 GB
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds.
Controller node specifications when Object Storage (swift) runs on the Controller nodes:
Counts | 3 Controller nodes with controller services contained within the Controller role. Optionally, to scale controller services on dedicated nodes, use composable services. For more information, see Composable services and custom roles in Advanced Overcloud Customization.
CPUs | 2 sockets, each with 12 cores and 24 threads
Disk | 500 GB root disk (1x SSD or 2x 7200 RPM hard drives; RAID 1); 500 GB disk for Object Storage (swift) (1x SSD or 2x 7200 RPM hard drives; RAID 1)
Memory | 128 GB
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds.
2.3. Overcloud Compute nodes
Counts | Red Hat has tested a scale of 300 Compute nodes.
CPUs | 2 sockets, each with 12 cores and 24 threads
Disk | 500 GB root disk (1x SSD or 2x 7200 RPM hard drives; RAID 1); 500 GB disk for the Image service (glance) image cache (1x SSD or 2x 7200 RPM hard drives; RAID 1)
Memory | 128 GB (64 GB per NUMA node). By default, 2 GB of RAM is reserved for the host. With Distributed Virtual Routing (DVR), increase the reserved RAM to 5 GB. An illustrative calculation follows this table.
Network | 25 Gbps network interfaces or 10 Gbps network interfaces. If you use 10 Gbps network interfaces, use network bonding to create two bonds.
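The Memory row above reserves part of each Compute node's RAM for the host. The following Python sketch only illustrates the arithmetic implied by the values in the table (128 GB total, 64 GB per NUMA node, 2 GB reserved by default, 5 GB with DVR); it does not set any Compute service configuration.

    # Illustrative arithmetic for the Compute node Memory row above.
    TOTAL_MEMORY_GB = 128
    NUMA_NODES = 2                 # 128 GB total = 64 GB per NUMA node
    RESERVED_HOST_GB_DEFAULT = 2   # reserved for the host by default
    RESERVED_HOST_GB_DVR = 5       # increase the reservation to 5 GB with DVR

    per_numa_node_gb = TOTAL_MEMORY_GB // NUMA_NODES
    print("Per NUMA node: %d GB" % per_numa_node_gb)                     # 64
    print("Left for instances (default): %d GB"
          % (TOTAL_MEMORY_GB - RESERVED_HOST_GB_DEFAULT))                # 126
    print("Left for instances (with DVR): %d GB"
          % (TOTAL_MEMORY_GB - RESERVED_HOST_GB_DVR))                    # 123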
2.4. Red Hat Ceph Storage nodes
Counts | A minimum of 5 nodes with three-way replication is required. For all-flash configurations, a minimum of 3 nodes with two-way replication is required.
CPUs | 1 Intel Broadwell CPU core per OSD to support storage I/O requirements. For light I/O workloads, you might not need Ceph to run at the full speed of your block devices. For example, some NFV applications rely on Ceph for data durability, high availability, and low latency, but throughput is not a primary target, so it is acceptable to provision slightly less CPU power.
Memory | Allow 5 GB of RAM per OSD. This memory is required for caching OSD data and metadata to optimize performance, not just for the OSD process itself. For hyper-converged infrastructure (HCI) environments, calculate the required memory in conjunction with the Compute node specifications. An illustrative sizing calculation follows this table.
Network | Ensure that the network capacity in MB/s is higher than the total MB/s capacity of the Ceph devices to support workloads that use a large I/O transfer size. Use a cluster network to lower write latency by shifting inter-OSD traffic onto a separate set of physical network ports. To do this in Red Hat OpenStack Platform, configure separate VLANs for the networks and assign the VLANs to separate physical network interfaces.
Disk | Solid-state drive (SSD) journaling reduces I/O contention on hard disk drives (HDD), which increases write IOPS, but it has no effect on read IOPS. If you use SATA/SAS SSD journals, you typically need an SSD:HDD ratio of 1:5. If you use NVMe SSD journals, you can typically use an SSD:HDD ratio of 1:10, or even 1:15 for read-mostly workloads. However, if this ratio is too high, the failure of a single journal SSD affects a larger number of OSDs.
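The CPU, Memory, and Disk rows above translate into simple per-node sizing arithmetic: 1 CPU core and 5 GB of RAM per OSD, and one journal SSD per 5 HDDs for SATA/SAS journals (per 10 HDDs for NVMe journals, or up to 15 for read-mostly workloads). The following Python sketch only illustrates that arithmetic; the function name and its inputs are hypothetical, and the sketch is not part of Ceph or director tooling.

    # Illustrative per-node sizing arithmetic using the ratios from the table above.
    CPU_CORES_PER_OSD = 1        # 1 CPU core per OSD
    RAM_GB_PER_OSD = 5           # 5 GB of RAM per OSD (data and metadata caching)
    HDDS_PER_SATA_SAS_SSD = 5    # SSD:HDD journal ratio of 1:5 for SATA/SAS SSDs
    HDDS_PER_NVME_SSD = 10       # 1:10 for NVMe journals (up to 1:15 if read-mostly)

    def size_ceph_node(osd_count, nvme_journals=False):
        """Return minimum CPU cores, RAM, and journal SSD count for one node."""
        hdds_per_ssd = HDDS_PER_NVME_SSD if nvme_journals else HDDS_PER_SATA_SAS_SSD
        journal_ssds = -(-osd_count // hdds_per_ssd)   # ceiling division
        return {
            "cpu_cores": osd_count * CPU_CORES_PER_OSD,
            "ram_gb": osd_count * RAM_GB_PER_OSD,
            "journal_ssds": journal_ssds,
        }

    # Example: a node with 12 OSDs on HDDs and SATA/SAS SSD journals.
    print(size_ceph_node(12))   # {'cpu_cores': 12, 'ram_gb': 60, 'journal_ssds': 3}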
For more information, see Deploying an overcloud with containerized Red Hat Ceph.
For more information on changing the storage replication number, see Pool, PG, and CRUSH Configuration Reference in the Red Hat Ceph Storage Configuration Guide.