Chapter 1. Configuring a Hyperconverged Infrastructure environment
This section describes how to deploy a Hyperconverged Infrastructure (HCI) environment. An HCI environment contains data plane nodes that host both Ceph Storage and the Compute service.
Create an HCI environment by completing the following high-level tasks:
- Configuring the data plane node networking.
- Installing Red Hat Ceph Storage on the data plane nodes.
- Configuring Red Hat OpenStack Services on OpenShift (RHOSO) to use the Red Hat Ceph Storage cluster.
1.1. Data plane node services list
Create an OpenStackDataPlaneNodeSet CR to configure data plane nodes. The openstack-operator reconciles the OpenStackDataPlaneNodeSet CR when an OpenStackDataPlaneDeployment CR is created.
These CRs have a service list similar to the following example:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - run-os
    - ovn
    - libvirt
    - nova
Only the services in the services list are configured.
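For reference, the following is a minimal sketch of an OpenStackDataPlaneDeployment CR that triggers reconciliation of a node set; the name, namespace, and node set name are example values:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm-deployment # example name
  namespace: openstack            # example namespace
spec:
  nodeSets:
    - openstack-edpm              # name of the OpenStackDataPlaneNodeSet CR to deploy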
Red Hat Ceph Storage must be deployed on the data plane nodes after the Storage network and NTP are configured, but before the Compute service is configured. This means that you must edit the services list and make other changes to the CR. Throughout this section, you edit the services list to complete the configuration of the HCI environment.
1.2. Configuring the data plane node networks
You must configure the data plane node networks to accommodate the Red Hat Ceph Storage networking requirements.
Prerequisites
- Control plane deployment is complete but has not yet been modified to use Ceph Storage.
- The data plane nodes have been provisioned with an operating system.
- The data plane nodes are accessible through an SSH key that Ansible can use.
- The data plane nodes have disks available to be used as Ceph OSDs.
- There are a minimum of three available data plane nodes. Ceph Storage clusters must have a minimum of three nodes to ensure redundancy.
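For example, you can spot-check Ansible SSH access and the available disks on one node before you begin; the user name, key path, and IP address below are example values from this chapter, not requirements:
$ ssh -i ~/.ssh/<ansible_ssh_key> cloud-admin@192.168.122.100 lsblk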
Procedure
- Create an OpenStackDataPlaneNodeSet CRD file to represent the data plane nodes.
Note: Do not create the CR in Red Hat OpenShift yet.
- Add the ceph-hci-pre service to the list before the configure-os service and remove all other service listings after run-os.
The following is an example of the edited list:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - download-cache
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
Note: Keep a record of the services that you remove from the list. You add them back to the list later.
- (Optional) The ceph-hci-pre service prepares EDPM nodes to host Red Hat Ceph Storage services after network configuration by using the edpm_ceph_hci_pre edpm-ansible role. By default, the edpm_ceph_hci_pre_enabled_services parameter of this role contains only RBD, RGW, and NFS services. If other services, such as the Dashboard, are deployed with HCI nodes, they must be added to the edpm_ceph_hci_pre_enabled_services parameter list, as shown in the sketch after this step. For more information about this role, see edpm_ceph_hci_pre role.
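The following sketch shows how such an addition might look under ansibleVars in the OpenStackDataPlaneNodeSet CR. The ceph_dashboard entry is a hypothetical example; check the edpm_ceph_hci_pre role documentation for the exact service names that it expects:
nodeTemplate:
  ansible:
    ansibleVars:
      edpm_ceph_hci_pre_enabled_services:
        - ceph_mon
        - ceph_mgr
        - ceph_osd
        - ceph_rgw
        - ceph_nfs
        - ceph_rgw_frontend
        - ceph_nfs_frontend
        - ceph_dashboard # hypothetical extra entry; use the name the role expects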
- Configure the Red Hat Ceph Storage cluster_network for storage management traffic between OSDs. Modify the CR to set edpm-ansible variables so that the edpm_network_config role configures a storage management network, which Ceph uses as the cluster_network.
The following example has three nodes. It assumes that the storage management network range is 172.20.0.0/24 and that it is on VLAN 23. The storage management (storage_mgmt) settings are the additions for the cluster_network:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
  namespace: openstack
spec:
  env:
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  networkAttachments:
    - ctlplane
  nodeTemplate:
    ansible:
      ansiblePort: 22
      ansibleUser: cloud-admin
      ansibleVars:
        edpm_ceph_hci_pre_enabled_services:
          - ceph_mon
          - ceph_mgr
          - ceph_osd
          - ceph_rgw
          - ceph_nfs
          - ceph_rgw_frontend
          - ceph_nfs_frontend
        edpm_fips_mode: check
        edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }}
        edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }}
        edpm_network_config_hide_sensitive_logs: false
        edpm_network_config_os_net_config_mappings:
          edpm-compute-0:
            nic1: 52:54:00:1e:af:6b
            nic2: 52:54:00:d9:cb:f4
          edpm-compute-1:
            nic1: 52:54:00:f2:bc:af
            nic2: 52:54:00:f1:c7:dd
          edpm-compute-2:
            nic1: 52:54:00:dd:33:14
            nic2: 52:54:00:50:fb:c3
        edpm_network_config_template: |
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic2
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
            {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask: {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
            {% endfor %}
        edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_selinux_mode: enforcing
        edpm_sshd_allowed_ranges:
          - 192.168.122.0/24
          - 192.168.111.0/24
        edpm_sshd_configure_firewall: true
        enable_debug: false
        gather_facts: false
        image_tag: current-podified
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
        service_net_map:
          nova_api_network: internalapi
          nova_libvirt_network: internalapi
        storage_mgmt_cidr: "24"
        storage_mgmt_host_routes: []
        storage_mgmt_mtu: 9000
        storage_mgmt_vlan_id: 23
        storage_mtu: 9000
        timesync_ntp_servers:
          - hostname: pool.ntp.org
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
    managementNetwork: ctlplane
    networks:
      - defaultRoute: true
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
  nodes:
    edpm-compute-0:
      ansible:
        host: 192.168.122.100
      hostName: compute-0
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.100
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
    edpm-compute-1:
      ansible:
        host: 192.168.122.101
      hostName: compute-1
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.101
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
    edpm-compute-2:
      ansible:
        host: 192.168.122.102
      hostName: compute-2
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.102
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
  preProvisioned: true
  services:
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
Note: It is not necessary to add the storage management network to the networkAttachments key.
- Apply the CR:
$ oc apply -f <dataplane_cr_file>
Replace <dataplane_cr_file> with the name of your file.
Note: Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.
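You can confirm that the node set resource exists and is waiting for a deployment with a command such as the following; the openstack namespace is an example value:
$ oc get openstackdataplanenodeset -n openstack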
- Create an OpenStackDataPlaneDeployment CRD, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CRD file defined above, to have Ansible configure the services on the data plane nodes.
- To confirm the network is configured, complete the following steps:
- SSH into a data plane node.
- Use the ip a command to display the configured networks.
- Confirm that the storage networks are in the list of configured networks.
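For example, with the example ranges used in this chapter (172.18.0.0/24 for the Storage network and 172.20.0.0/24 for the Storage Management network), a quick filter can confirm that addresses from both ranges are present; adjust the ranges to match your environment:
$ ip -o addr show | grep -E '172\.18\.|172\.20\.'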
1.2.1. Red Hat Ceph Storage MTU settings
The example in this procedure changes the MTU of the storage and storage_mgmt networks from 1500 to 9000. An MTU of 9000 is known as a jumbo frame. Although it is not mandatory to increase the MTU, jumbo frames are used to improve storage performance. If jumbo frames are used, all network switch ports in the data path must be configured to support jumbo frames. You must also change the MTU for services that use the Storage network and run on OpenShift.
To change the MTU for the OpenShift services that connect to the data plane nodes, update the Node Network Configuration Policy (NNCP) for the base interface and the VLAN interface. It is not necessary to update the Network Attachment Definition (NAD) if the main NAD interface already has the required MTU. If the MTU of the underlying interface is set to 9000 and it is not specified for the VLAN interface on top of it, the VLAN interface defaults to the value of the underlying interface.
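The following is a minimal sketch of an NNCP that sets an MTU of 9000 on a base interface and on the VLAN 23 (storage management) interface on top of it; the policy name, node selector, and interface names are example values for illustration:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: osp-storage-mgmt-mtu           # example policy name
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0    # example node
  desiredState:
    interfaces:
      - name: enp6s0                    # example base interface
        type: ethernet
        state: up
        mtu: 9000
      - name: enp6s0.23                 # example VLAN interface for storage management
        type: vlan
        state: up
        mtu: 9000
        vlan:
          base-iface: enp6s0
          id: 23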
If the MTU values are not consistent, issues can occur on the application layer that can cause the Red Hat Ceph Storage cluster to fail to reach quorum or to fail to support authentication with the CephX protocol. If you change the MTU and observe these types of problems, verify that all hosts that use the jumbo-frame network can communicate at the chosen MTU value with the ping command. In the following example, 8972 bytes of payload plus 28 bytes of ICMP and IP headers equal the 9000-byte MTU:
$ ping -M do -s 8972 172.20.0.100
1.3. Configuring and deploying Red Hat Ceph Storage on data plane nodes
Use the cephadm utility to configure and deploy Red Hat Ceph Storage for an HCI environment.
1.3.1. The cephadm utility
Use the cephadm utility to configure and deploy Red Hat Ceph Storage on the data plane nodes. The cephadm package must be deployed on at least one data plane node before proceeding; edpm-ansible does not deploy Red Hat Ceph Storage.
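For example, assuming that the Red Hat Ceph Storage tools repository is already enabled on the node (the repository setup is covered in the Red Hat Ceph Storage documentation), you can install the package with dnf and verify it:
$ sudo dnf install -y cephadm
$ cephadm version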
For additional information and procedures for deploying Red Hat Ceph Storage, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide.
1.3.2. Configuring and deploying Red Hat Ceph Storage
Configure and deploy Red Hat Ceph Storage by editing the configuration file and using the cephadm utility.
Procedure
- Edit the Red Hat Ceph Storage configuration file.
- Add the Storage and Storage Management network ranges. Red Hat Ceph Storage uses the Storage network as the Red Hat Ceph Storage public_network and the Storage Management network as the cluster_network.
The following example is for a configuration file entry where the Storage network range is 172.18.0.0/24 and the Storage Management network range is 172.20.0.0/24:
[global]
public_network = 172.18.0.0/24
cluster_network = 172.20.0.0/24
- Add collocation boundaries between the Compute service and the Ceph OSD services. Set boundaries between the collocated Compute service and Ceph OSD services to reduce CPU and memory contention.
The following is an example of a Ceph configuration file entry with these boundaries set:
[osd]
osd_memory_target_autotune = true
osd_numa_auto_affinity = true

[mgr]
mgr/cephadm/autotune_memory_target_ratio = 0.2
In this example, the osd_memory_target_autotune parameter is set to true so that the OSD daemons adjust their memory consumption based on the osd_memory_target option. The autotune_memory_target_ratio defaults to 0.7, which means that 70 percent of the total RAM in the system is the starting point from which any memory consumed by non-autotuned Ceph daemons is subtracted. The remaining memory is divided between the OSDs, assuming all OSDs have osd_memory_target_autotune set to true. For HCI deployments, you can set mgr/cephadm/autotune_memory_target_ratio to 0.2 so that more memory is available for the Compute service.
For additional information about service collocation, see Collocating services in an HCI environment for NUMA nodes.
Note: If you need to adjust these values after the deployment, use the ceph config set osd <key> <value> command.
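For example, the following illustration adjusts one of the backfill options from inside the cephadm shell; the key and value shown are illustrative only:
$ cephadm shell -- ceph config set osd osd_max_backfills 1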
- Deploy Ceph Storage with the edited configuration file on a data plane node:
$ cephadm bootstrap --config <config_file> --mon-ip <data_plane_node_ip> --skip-monitoring-stack
- Replace <config_file> with the name of your Ceph configuration file.
- Replace <data_plane_node_ip> with the Storage network IP address of the data plane node on which Red Hat Ceph Storage will be installed.
Note: The --skip-monitoring-stack option is used in the cephadm bootstrap command to skip the deployment of monitoring services. This ensures that the Red Hat Ceph Storage deployment completes successfully if monitoring services have been previously deployed as part of any other preceding process. If monitoring services have not been deployed, see the Red Hat Ceph Storage documentation for information and procedures on enabling monitoring services.
- After the Red Hat Ceph Storage cluster is bootstrapped on the first EDPM node, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide to add the other EDPM nodes to the Ceph cluster.
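A minimal sketch of the typical flow is shown below; the host name, Storage network IP address, and root SSH user are example assumptions about your environment, and the Red Hat Ceph Storage Installation Guide remains the authoritative procedure. The cluster public key created by the bootstrap is distributed to the new node, and then the host is added to the orchestrator:
$ ssh-copy-id -f -i /etc/ceph/ceph.pub root@compute-1
$ cephadm shell -- ceph orch host add compute-1 172.18.0.101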
1.3.2.1. Collocating services in an HCI environment for NUMA nodes
A two-NUMA node system can host a latency-sensitive Compute service workload on one NUMA node and a Ceph OSD workload on the other NUMA node. To configure Ceph OSDs to use a specific NUMA node that is not being used by the Compute service, use either of the following Ceph OSD configurations:
- osd_numa_node sets affinity to a NUMA node (-1 for none).
- osd_numa_auto_affinity automatically sets affinity to the NUMA node where storage and network match.
If there are network interfaces on both NUMA nodes and the disk controllers are on NUMA node 0, do the following:
- Use a network interface on NUMA node 0 for the storage network.
- Host the Ceph OSD workload on NUMA node 0.
- Host the Compute service workload on NUMA node 1 and have it use the network interfaces on NUMA node 1.
Set osd_numa_auto_affinity to true, as in the initial Ceph configuration file. Alternatively, set osd_numa_node directly to 0 and clear the osd_numa_auto_affinity parameter so that it defaults to false, as in the sketch below.
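A minimal sketch of that alternative, expressed as a Ceph configuration file entry, follows; it assumes, as in the scenario above, that the disk controllers and the chosen storage network interface are on NUMA node 0:
[osd]
osd_numa_node = 0
osd_numa_auto_affinity = false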
When an OSD goes offline and the hyperconverged cluster backfills, the backfill process can be slowed down. In exchange for a slower recovery, the backfill activity has less of an impact on the collocated Compute service (nova) workload. Red Hat Ceph Storage has the following defaults to control the rate of backfill activity:
- osd_recovery_op_priority = 3
- osd_max_backfills = 1
- osd_recovery_max_active_hdd = 3
- osd_recovery_max_active_ssd = 10
1.3.3. Confirming Red Hat Ceph Storage deployment
Confirm Red Hat Ceph Storage is deployed before proceeding.
Procedure
- Connect to a data plane node by using SSH.
- View the status of the Red Hat Ceph Storage cluster:
$ cephadm shell -- ceph -s
1.3.4. Confirming Red Hat Ceph Storage tuning
Ensure that Red Hat Ceph Storage is properly tuned before proceeding.
Procedure
- Connect to a data plane node by using SSH.
- Verify overall Red Hat Ceph Storage tuning with the following commands:
$ ceph config dump | grep numa
$ ceph config dump | grep autotune
$ ceph config dump | grep mgr
- Verify the tuning of an OSD with the following commands:
$ ceph config get <osd_number> osd_memory_target
$ ceph config get <osd_number> osd_memory_target_autotune
$ ceph config get <osd_number> osd_numa_auto_affinity
Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.
- Verify the default backfill values of an OSD with the following commands:
$ ceph config get <osd_number> osd_recovery_op_priority
$ ceph config get <osd_number> osd_max_backfills
$ ceph config get <osd_number> osd_recovery_max_active_hdd
$ ceph config get <osd_number> osd_recovery_max_active_ssd
Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.
1.4. Configuring the data plane to use the collocated Red Hat Ceph Storage server
Although the Red Hat Ceph Storage cluster is physically collocated with the Compute services on the data plane nodes, it is treated as logically separate. Red Hat Ceph Storage must be configured as the storage solution before the data plane nodes can use it.
Prerequisites
- Complete the procedures in Integrating Red Hat Ceph Storage.
Procedure
- Edit the OpenStackDataPlaneNodeSet CR.
- To define the cephx key and configuration file for the Compute service (nova), use the extraMounts parameter.
The following is an example of using the extraMounts parameter for this purpose:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  nodeTemplate:
    extraMounts:
      - extraVolType: Ceph
        volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
        mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
- Locate the services list in the CR.
- Edit the services list to restore all of the services that you removed in Configuring the data plane node networks. Restoring the full services list allows the remaining jobs to run and complete the configuration of the HCI environment.
The following is an example of the full services list with the additional services included:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ceph-client
    - ovn
    - neutron-metadata
    - libvirt
    - nova-custom-ceph
Note: In addition to restoring the default service list, the ceph-client service is added after the run-os service. The ceph-client service configures EDPM nodes as clients of a Red Hat Ceph Storage server. This service distributes the files necessary for the clients to connect to the Red Hat Ceph Storage server.
- Create a ConfigMap to set the reserved_host_memory_mb parameter to a value appropriate for your configuration.
The following is an example of a ConfigMap used for this purpose:
apiVersion: v1
kind: ConfigMap
metadata:
  name: reserved-memory-nova
data:
  04-reserved-memory-nova.conf: |
    [DEFAULT]
    reserved_host_memory_mb=75000
Note: The value for the reserved_host_memory_mb parameter can be set so that the Compute service scheduler does not give memory to a virtual machine that a Ceph OSD on the same server needs. The example reserves 5 GB per OSD for 10 OSDs per host, in addition to the default reserved memory for the hypervisor. In an IOPS-optimized cluster, you can improve performance by reserving more memory for each OSD. The 5 GB value is provided as a starting point that you can tune further if necessary.
- Add reserved-memory-nova to the configMaps list by editing the OpenStackDataPlaneService/nova-custom-ceph file:
kind: OpenStackDataPlaneService
<...>
spec:
  configMaps:
    - ceph-nova
    - reserved-memory-nova
- Apply the CR changes:
$ oc apply -f <dataplane_cr_file>
Replace <dataplane_cr_file> with the name of your file.
Note: Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.
- Create an OpenStackDataPlaneDeployment CRD, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CRD file defined above, to have Ansible configure the services on the data plane nodes.
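After you create the OpenStackDataPlaneDeployment CR, you can watch its status while the Ansible jobs run; the openstack namespace is an example value:
$ oc get openstackdataplanedeployment -n openstack -w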