Chapter 1. Configuring a Hyperconverged Infrastructure environment
This section describes how to deploy a Hyperconverged Infrastructure (HCI) environment. An HCI environment contains data plane nodes that host both Ceph Storage and the Compute service.
Create an HCI environment by completing the following high-level tasks:
- Configuring the data plane node networking.
- Installing Red Hat Ceph Storage on the data plane nodes.
- Configuring Red Hat OpenStack Services on OpenShift (RHOSO) to use the Red Hat Ceph Storage cluster.
1.1. Data plane node services list
Create an OpenStackDataPlaneNodeSet CR to configure data plane nodes. The openstack-operator reconciles the OpenStackDataPlaneNodeSet CR when an OpenStackDataPlaneDeployment CR is created.
These CRs have a service list similar to the following example:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - run-os
    - ovn
    - libvirt
    - nova
Only the services in the services list are configured.
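For reference, the following is a minimal sketch of an OpenStackDataPlaneDeployment CR that triggers reconciliation of a node set; the name, namespace, and node set name are example values:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm-deployment # example name
  namespace: openstack            # example namespace
spec:
  nodeSets:
    - openstack-edpm              # name of the OpenStackDataPlaneNodeSet CR to deploy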
Red Hat Ceph Storage must be deployed on the data plane nodes after the Storage network and NTP are configured, but before the Compute service is configured. This means that you must edit the services list and make other changes to the CR. Throughout this section, you edit the services list to complete the configuration of the HCI environment.
1.2. Configuring the data plane node networks
You must configure the data plane node networks to accommodate the Red Hat Ceph Storage networking requirements.
Prerequisites
- Control plane deployment is complete but has not yet been modified to use Ceph Storage.
- The data plane nodes have been provisioned with an operating system.
- The data plane nodes are accessible through an SSH key that Ansible can use.
- The data plane nodes have disks available to be used as Ceph OSDs.
- There are a minimum of three available data plane nodes. Ceph Storage clusters must have a minimum of three nodes to ensure redundancy.
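For example, you can spot-check Ansible SSH access and the available disks on one node before you begin; the user name, key path, and IP address below are example values from this chapter, not requirements:
$ ssh -i ~/.ssh/<ansible_ssh_key> cloud-admin@192.168.122.100 lsblk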
Procedure
- Create an OpenStackDataPlaneNodeSet CRD file to represent the data plane nodes.
Note: Do not create the CR in Red Hat OpenShift yet.
- Add the ceph-hci-pre service to the list before the configure-os service and remove all other service listings after run-os.
The following is an example of the edited list:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - download-cache
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
Note: Keep a record of the services that you remove from the list. You add them back to the list later.
- (Optional) The ceph-hci-pre service prepares EDPM nodes to host Red Hat Ceph Storage services after network configuration by using the edpm_ceph_hci_pre edpm-ansible role. By default, the edpm_ceph_hci_pre_enabled_services parameter of this role contains only RBD, RGW, and NFS services. If other services, such as the Dashboard, are deployed with HCI nodes, they must be added to the edpm_ceph_hci_pre_enabled_services parameter list, as shown in the sketch after this step. For more information about this role, see edpm_ceph_hci_pre role.
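The following sketch shows how such an addition might look under ansibleVars in the OpenStackDataPlaneNodeSet CR. The ceph_dashboard entry is a hypothetical example; check the edpm_ceph_hci_pre role documentation for the exact service names that it expects:
nodeTemplate:
  ansible:
    ansibleVars:
      edpm_ceph_hci_pre_enabled_services:
        - ceph_mon
        - ceph_mgr
        - ceph_osd
        - ceph_rgw
        - ceph_nfs
        - ceph_rgw_frontend
        - ceph_nfs_frontend
        - ceph_dashboard # hypothetical extra entry; use the name the role expects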
- Configure the Red Hat Ceph Storage cluster_network for storage management traffic between OSDs. Modify the CR to set edpm-ansible variables so that the edpm_network_config role configures a storage management network, which Ceph uses as the cluster_network.
The following example has three nodes. It assumes that the storage management network range is 172.20.0.0/24 and that it is on VLAN 23. The storage management (storage_mgmt) settings are the additions for the cluster_network:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
  namespace: openstack
spec:
  env:
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  networkAttachments:
    - ctlplane
  nodeTemplate:
    ansible:
      ansiblePort: 22
      ansibleUser: cloud-admin
      ansibleVars:
        edpm_ceph_hci_pre_enabled_services:
          - ceph_mon
          - ceph_mgr
          - ceph_osd
          - ceph_rgw
          - ceph_nfs
          - ceph_rgw_frontend
          - ceph_nfs_frontend
        edpm_fips_mode: check
        edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }}
        edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }}
        edpm_network_config_hide_sensitive_logs: false
        edpm_network_config_os_net_config_mappings:
          edpm-compute-0:
            nic1: 52:54:00:1e:af:6b
            nic2: 52:54:00:d9:cb:f4
          edpm-compute-1:
            nic1: 52:54:00:f2:bc:af
            nic2: 52:54:00:f1:c7:dd
          edpm-compute-2:
            nic1: 52:54:00:dd:33:14
            nic2: 52:54:00:50:fb:c3
        edpm_network_config_template: |
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic2
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
            {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask: {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
            {% endfor %}
        edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_selinux_mode: enforcing
        edpm_sshd_allowed_ranges:
          - 192.168.122.0/24
          - 192.168.111.0/24
        edpm_sshd_configure_firewall: true
        enable_debug: false
        gather_facts: false
        image_tag: current-podified
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
        service_net_map:
          nova_api_network: internalapi
          nova_libvirt_network: internalapi
        storage_mgmt_cidr: "24"
        storage_mgmt_host_routes: []
        storage_mgmt_mtu: 9000
        storage_mgmt_vlan_id: 23
        storage_mtu: 9000
        timesync_ntp_servers:
          - hostname: pool.ntp.org
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
    managementNetwork: ctlplane
    networks:
      - defaultRoute: true
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
  nodes:
    edpm-compute-0:
      ansible:
        host: 192.168.122.100
      hostName: compute-0
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.100
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
    edpm-compute-1:
      ansible:
        host: 192.168.122.101
      hostName: compute-1
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.101
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
    edpm-compute-2:
      ansible:
        host: 192.168.122.102
      hostName: compute-2
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.102
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: storagemgmt
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
  preProvisioned: true
  services:
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
Note: It is not necessary to add the storage management network to the networkAttachments key.
- Apply the CR:
$ oc apply -f <dataplane_cr_file>
Replace <dataplane_cr_file> with the name of your file.
Note: Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.
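You can confirm that the node set resource exists and is waiting for a deployment with a command such as the following; the openstack namespace is an example value:
$ oc get openstackdataplanenodeset -n openstack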
- Create an OpenStackDataPlaneDeployment CRD, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CRD file defined above, to have Ansible configure the services on the data plane nodes.
- To confirm the network is configured, complete the following steps:
- SSH into a data plane node.
- Use the ip a command to display the configured networks.
- Confirm that the storage networks are in the list of configured networks.
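For example, with the example ranges used in this chapter (172.18.0.0/24 for the Storage network and 172.20.0.0/24 for the Storage Management network), a quick filter can confirm that addresses from both ranges are present; adjust the ranges to match your environment:
$ ip -o addr show | grep -E '172\.18\.|172\.20\.'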
1.2.1. Red Hat Ceph Storage MTU settings
The example in this procedure changes the MTU of the storage and storage_mgmt networks from 1500 to 9000. An MTU of 9000 is known as a jumbo frame. Although it is not mandatory to increase the MTU, jumbo frames are used to improve storage performance. If jumbo frames are used, all network switch ports in the data path must be configured to support jumbo frames. You must also change the MTU for services that use the Storage network and run on OpenShift.
To change the MTU for the OpenShift services that connect to the data plane nodes, update the Node Network Configuration Policy (NNCP) for the base interface and the VLAN interface. It is not necessary to update the Network Attachment Definition (NAD) if the main NAD interface already has the required MTU. If the MTU of the underlying interface is set to 9000 and it is not specified for the VLAN interface on top of it, the VLAN interface defaults to the value of the underlying interface.
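The following is a minimal sketch of an NNCP that sets an MTU of 9000 on a base interface and on the VLAN 23 (storage management) interface on top of it; the policy name, node selector, and interface names are example values for illustration:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: osp-storage-mgmt-mtu           # example policy name
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0    # example node
  desiredState:
    interfaces:
      - name: enp6s0                    # example base interface
        type: ethernet
        state: up
        mtu: 9000
      - name: enp6s0.23                 # example VLAN interface for storage management
        type: vlan
        state: up
        mtu: 9000
        vlan:
          base-iface: enp6s0
          id: 23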
If the MTU values are not consistent, issues can occur on the application layer that can cause the Red Hat Ceph Storage cluster to fail to reach quorum or to fail to support authentication with the CephX protocol. If you change the MTU and observe these types of problems, verify that all hosts that use the jumbo-frame network can communicate at the chosen MTU value with the ping command. In the following example, 8972 bytes of payload plus 28 bytes of ICMP and IP headers equal the 9000-byte MTU:
$ ping -M do -s 8972 172.20.0.100
1.3. Configuring and deploying Red Hat Ceph Storage on data plane nodes
Use the cephadm utility to configure and deploy Red Hat Ceph Storage for an HCI environment.
1.3.1. The cephadm utility
Use the cephadm utility to configure and deploy Red Hat Ceph Storage on the data plane nodes. The cephadm package must be deployed on at least one data plane node before proceeding; edpm-ansible does not deploy Red Hat Ceph Storage.
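For example, assuming that the Red Hat Ceph Storage tools repository is already enabled on the node (the repository setup is covered in the Red Hat Ceph Storage documentation), you can install the package with dnf and verify it:
$ sudo dnf install -y cephadm
$ cephadm version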
For additional information and procedures for deploying Red Hat Ceph Storage, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide.
1.3.2. Configuring and deploying Red Hat Ceph Storage
Configure and deploy Red Hat Ceph Storage by editing the configuration file and using the cephadm utility.
Procedure
- Edit the Red Hat Ceph Storage configuration file.
- Add the Storage and Storage Management network ranges. Red Hat Ceph Storage uses the Storage network as the Red Hat Ceph Storage public_network and the Storage Management network as the cluster_network.
The following example is for a configuration file entry where the Storage network range is 172.18.0.0/24 and the Storage Management network range is 172.20.0.0/24:
[global]
public_network = 172.18.0.0/24
cluster_network = 172.20.0.0/24
- Add collocation boundaries between the Compute service and the Ceph OSD services. Set boundaries between the collocated Compute service and Ceph OSD services to reduce CPU and memory contention.
The following is an example of a Ceph configuration file entry with these boundaries set:
[osd]
osd_memory_target_autotune = true
osd_numa_auto_affinity = true

[mgr]
mgr/cephadm/autotune_memory_target_ratio = 0.2
In this example, the osd_memory_target_autotune parameter is set to true so that the OSD daemons adjust their memory consumption based on the osd_memory_target option. The autotune_memory_target_ratio defaults to 0.7, which means that 70 percent of the total RAM in the system is the starting point from which any memory consumed by non-autotuned Ceph daemons is subtracted. The remaining memory is divided between the OSDs, assuming all OSDs have osd_memory_target_autotune set to true. For HCI deployments, you can set mgr/cephadm/autotune_memory_target_ratio to 0.2 so that more memory is available for the Compute service.
For additional information about service collocation, see Collocating services in an HCI environment for NUMA nodes.
Note: If you need to adjust these values after the deployment, use the ceph config set osd <key> <value> command.
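For example, the following illustration adjusts one of the backfill options from inside the cephadm shell; the key and value shown are illustrative only:
$ cephadm shell -- ceph config set osd osd_max_backfills 1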
- Deploy Ceph Storage with the edited configuration file on a data plane node:
$ cephadm bootstrap --config <config_file> --mon-ip <data_plane_node_ip> --skip-monitoring-stack
- Replace <config_file> with the name of your Ceph configuration file.
- Replace <data_plane_node_ip> with the Storage network IP address of the data plane node on which Red Hat Ceph Storage will be installed.
Note: The --skip-monitoring-stack option is used in the cephadm bootstrap command to skip the deployment of monitoring services. This ensures that the Red Hat Ceph Storage deployment completes successfully if monitoring services have been previously deployed as part of any other preceding process. If monitoring services have not been deployed, see the Red Hat Ceph Storage documentation for information and procedures on enabling monitoring services.
- After the Red Hat Ceph Storage cluster is bootstrapped on the first EDPM node, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide to add the other EDPM nodes to the Ceph cluster.
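A minimal sketch of the typical flow is shown below; the host name, Storage network IP address, and root SSH user are example assumptions about your environment, and the Red Hat Ceph Storage Installation Guide remains the authoritative procedure. The cluster public key created by the bootstrap is distributed to the new node, and then the host is added to the orchestrator:
$ ssh-copy-id -f -i /etc/ceph/ceph.pub root@compute-1
$ cephadm shell -- ceph orch host add compute-1 172.18.0.101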
1.3.2.1. Collocating services in an HCI environment for NUMA nodes
A two-NUMA node system can host a latency-sensitive Compute service workload on one NUMA node and a Ceph OSD workload on the other NUMA node. To configure Ceph OSDs to use a specific NUMA node that is not being used by the Compute service, use either of the following Ceph OSD configurations:
- osd_numa_node sets affinity to a NUMA node (-1 for none).
- osd_numa_auto_affinity automatically sets affinity to the NUMA node where storage and network match.
If there are network interfaces on both NUMA nodes and the disk controllers are on NUMA node 0, do the following:
- Use a network interface on NUMA node 0 for the storage network.
- Host the Ceph OSD workload on NUMA node 0.
- Host the Compute service workload on NUMA node 1 and have it use the network interfaces on NUMA node 1.
Set osd_numa_auto_affinity to true, as in the initial Ceph configuration file. Alternatively, set osd_numa_node directly to 0 and clear the osd_numa_auto_affinity parameter so that it defaults to false, as in the sketch below.
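A minimal sketch of that alternative, expressed as a Ceph configuration file entry, follows; it assumes, as in the scenario above, that the disk controllers and the chosen storage network interface are on NUMA node 0:
[osd]
osd_numa_node = 0
osd_numa_auto_affinity = false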
When an OSD goes offline and the hyperconverged cluster backfills, the backfill process can be slowed down. In exchange for a slower recovery, the backfill activity has less of an impact on the collocated Compute service (nova) workload. Red Hat Ceph Storage has the following defaults to control the rate of backfill activity:
- osd_recovery_op_priority = 3
- osd_max_backfills = 1
- osd_recovery_max_active_hdd = 3
- osd_recovery_max_active_ssd = 10
1.3.3. Confirming Red Hat Ceph Storage deployment
Confirm Red Hat Ceph Storage is deployed before proceeding.
Procedure
- Connect to a data plane node by using SSH.
- View the status of the Red Hat Ceph Storage cluster:
$ cephadm shell -- ceph -s
1.3.4. Confirming Red Hat Ceph Storage tuning
Ensure that Red Hat Ceph Storage is properly tuned before proceeding.
Procedure
- Connect to a data plane node by using SSH.
- Verify overall Red Hat Ceph Storage tuning with the following commands:
$ ceph config dump | grep numa
$ ceph config dump | grep autotune
$ ceph config dump | grep mgr
- Verify the tuning of an OSD with the following commands:
$ ceph config get <osd_number> osd_memory_target
$ ceph config get <osd_number> osd_memory_target_autotune
$ ceph config get <osd_number> osd_numa_auto_affinity
Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.
- Verify the default backfill values of an OSD with the following commands:
$ ceph config get <osd_number> osd_recovery_op_priority
$ ceph config get <osd_number> osd_max_backfills
$ ceph config get <osd_number> osd_recovery_max_active_hdd
$ ceph config get <osd_number> osd_recovery_max_active_ssd
Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.
1.4. Configuring the data plane to use the collocated Red Hat Ceph Storage server
Although the Red Hat Ceph Storage cluster is physically collocated with the Compute services on the data plane nodes, it is treated as logically separate. Red Hat Ceph Storage must be configured as the storage solution before the data plane nodes can use it.
Prerequisites
- Complete the procedures in Integrating Red Hat Ceph Storage.
Procedure
- Edit the OpenStackDataPlaneNodeSet CR.
- To define the cephx key and configuration file for the Compute service (nova), use the extraMounts parameter.
The following is an example of using the extraMounts parameter for this purpose:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  nodeTemplate:
    extraMounts:
      - extraVolType: Ceph
        volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
        mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
- Locate the services list in the CR.
- Edit the services list to restore all of the services that you removed in Configuring the data plane node networks. Restoring the full services list allows the remaining jobs to run and complete the configuration of the HCI environment.
The following is an example of the full services list with the additional services included:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - ceph-hci-pre
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ceph-client
    - ovn
    - neutron-metadata
    - libvirt
    - nova-custom-ceph
Note: In addition to restoring the default service list, the ceph-client service is added after the run-os service. The ceph-client service configures EDPM nodes as clients of a Red Hat Ceph Storage server. This service distributes the files necessary for the clients to connect to the Red Hat Ceph Storage server.
- Create a ConfigMap to set the reserved_host_memory_mb parameter to a value appropriate for your configuration.
The following is an example of a ConfigMap used for this purpose:
apiVersion: v1
kind: ConfigMap
metadata:
  name: reserved-memory-nova
data:
  04-reserved-memory-nova.conf: |
    [DEFAULT]
    reserved_host_memory_mb=75000
Note: The value for the reserved_host_memory_mb parameter can be set so that the Compute service scheduler does not give memory to a virtual machine that a Ceph OSD on the same server needs. The example reserves 5 GB per OSD for 10 OSDs per host, in addition to the default reserved memory for the hypervisor. In an IOPS-optimized cluster, you can improve performance by reserving more memory for each OSD. The 5 GB value is provided as a starting point that you can tune further if necessary.
- Add reserved-memory-nova to the configMaps list by editing the OpenStackDataPlaneService/nova-custom-ceph file:
kind: OpenStackDataPlaneService
<...>
spec:
  configMaps:
    - ceph-nova
    - reserved-memory-nova
- Apply the CR changes:
$ oc apply -f <dataplane_cr_file>
Replace <dataplane_cr_file> with the name of your file.
Note: Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.
- Create an OpenStackDataPlaneDeployment CRD, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CRD file defined above, to have Ansible configure the services on the data plane nodes.
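After you create the OpenStackDataPlaneDeployment CR, you can watch its status while the Ansible jobs run; the openstack namespace is an example value:
$ oc get openstackdataplanedeployment -n openstack -w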