
Chapter 3. Configuring a Hyperconverged Infrastructure environment

This section describes how to deploy a Hyperconverged Infrastructure (HCI) environment. An HCI environment contains data plane nodes that host both Ceph Storage and the Compute service.

Create an HCI environment by completing the following high-level tasks:

  1. Configuring the data plane node networking.
  2. Installing Red Hat Ceph Storage on the data plane nodes.
  3. Configuring Red Hat OpenStack Services on OpenShift (RHOSO) to use the Red Hat Ceph Storage cluster.

3.1. Data plane node services list

Create an OpenStackDataPlaneNodeSet CR to configure data plane nodes. The dataplane-operator reconciles the OpenStackDataPlaneNodeSet CR when an OpenStackDataPlaneDeployment CR is created.

These CRs have a service list similar to the following example:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  services:
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - run-os
    - ovn
    - libvirt
    - nova

Only the services in the services list are configured.

Red Hat Ceph Storage must be deployed on the data plane node after the Storage network and NTP are configured but before the Compute service is configured. This means you must edit the services list and make other changes to the CR. Throughout this section, you edit the services list to complete the configuration of the HCI environment.

3.2. Configuring the data plane node networks

You must configure the data plane node networks to accommodate the Red Hat Ceph Storage networking requirements.

Prerequisites

  • Control plane deployment is complete but has not yet been modified to use Ceph Storage.
  • The data plane nodes have been provisioned with an operating system.
  • The data plane nodes are accessible through an SSH key that Ansible can use.
  • The data plane nodes have disks available to be used as Ceph OSDs.
  • There are a minimum of three available data plane nodes. Ceph Storage clusters must have a minimum of three nodes to ensure redundancy.

Procedure

  1. Create an OpenStackDataPlaneNodeSet CRD file to represent the data plane nodes.

    Note

    Do not create the CR in Red Hat OpenShift yet.

  2. Add the ceph-hci-pre service to the list before the configure-os service and remove all other service listings after run-os.

    The following is an example of the edited list:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      services:
        - download-cache
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - ceph-hci-pre
        - configure-os
        - ssh-known-hosts
        - run-os
        - reboot-os
    Note

    Note the services that you remove from the list. You add them back to the list later.

  3. (Optional) The ceph-hci-pre service prepares EDPM nodes to host Red Hat Ceph Storage services after network configuration by using the edpm_ceph_hci_pre edpm-ansible role. By default, the edpm_ceph_hci_pre_enabled_services parameter of this role contains only RBD, RGW, and NFS services. If other services, such as the Dashboard, are deployed with HCI nodes, they must be added to the edpm_ceph_hci_pre_enabled_services parameter list, as shown in the sketch after this step. For more information about this role, see edpm_ceph_hci_pre role.
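
    For example, to also open the ports for the Ceph Dashboard and its monitoring stack, you could extend the list as in the following sketch. The ceph_dashboard, ceph_grafana, ceph_prometheus, and ceph_alertmanager entries are assumed names for illustration only; confirm the exact service names against the edpm_ceph_hci_pre role defaults for your release:

    edpm_ceph_hci_pre_enabled_services:
    - ceph_mon
    - ceph_mgr
    - ceph_osd
    - ceph_rgw
    - ceph_nfs
    - ceph_rgw_frontend
    - ceph_nfs_frontend
    # Assumed Dashboard-related entries; verify the names before use.
    - ceph_dashboard
    - ceph_grafana
    - ceph_prometheus
    - ceph_alertmanager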
  4. Configure the Red Hat Ceph Storage cluster_network for storage management traffic between OSDs. Modify the CR to set edpm-ansible variables so that the edpm_network_config role configures a storage management network, which Ceph uses as the cluster_network.

    The following example has three nodes. It assumes that the storage management network range is 172.20.0.0/24 and that it is on VLAN 23. The storage_mgmt_* variables and the storagemgmt network entries are the additions for the cluster_network:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm
      namespace: openstack
    spec:
      env:
      - name: ANSIBLE_FORCE_COLOR
        value: "True"
      networkAttachments:
      - ctlplane
      nodeTemplate:
        ansible:
          ansiblePort: 22
          ansibleUser: cloud-admin
          ansibleVars:
            edpm_ceph_hci_pre_enabled_services:
            - ceph_mon
            - ceph_mgr
            - ceph_osd
            - ceph_rgw
            - ceph_nfs
            - ceph_rgw_frontend
            - ceph_nfs_frontend
            edpm_fips_mode: check
            edpm_iscsid_image: "{{ registry_url }}/openstack-iscsid:{{ image_tag }}"
            edpm_logrotate_crond_image: "{{ registry_url }}/openstack-cron:{{ image_tag }}"
            edpm_network_config_hide_sensitive_logs: false
            edpm_network_config_os_net_config_mappings:
              edpm-compute-0:
                nic1: 52:54:00:1e:af:6b
                nic2: 52:54:00:d9:cb:f4
              edpm-compute-1:
                nic1: 52:54:00:f2:bc:af
                nic2: 52:54:00:f1:c7:dd
              edpm-compute-2:
                nic1: 52:54:00:dd:33:14
                nic2: 52:54:00:50:fb:c3
            edpm_network_config_template: |
              ---
              {% set mtu_list = [ctlplane_mtu] %}
              {% for network in nodeset_networks %}
              {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
              {%- endfor %}
              {% set min_viable_mtu = mtu_list | max %}
              network_config:
              - type: ovs_bridge
                name: {{ neutron_physical_bridge_name }}
                mtu: {{ min_viable_mtu }}
                use_dhcp: false
                dns_servers: {{ ctlplane_dns_nameservers }}
                domain: {{ dns_search_domains }}
                addresses:
                - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
                routes: {{ ctlplane_host_routes }}
                members:
                - type: interface
                  name: nic2
                  mtu: {{ min_viable_mtu }}
                  # force the MAC address of the bridge to this interface
                  primary: true
              {% for network in nodeset_networks %}
                - type: vlan
                  mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
                  vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
                  addresses:
                  - ip_netmask:
                      {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
                  routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
              {% endfor %}
            edpm_neutron_metadata_agent_image: "{{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}"
            edpm_nodes_validation_validate_controllers_icmp: false
            edpm_nodes_validation_validate_gateway_icmp: false
            edpm_selinux_mode: enforcing
            edpm_sshd_allowed_ranges:
            - 192.168.122.0/24
            - 192.168.111.0/24
            edpm_sshd_configure_firewall: true
            enable_debug: false
            gather_facts: false
            image_tag: current-podified
            neutron_physical_bridge_name: br-ex
            neutron_public_interface_name: eth0
            service_net_map:
              nova_api_network: internalapi
              nova_libvirt_network: internalapi
            storage_mgmt_cidr: "24"
            storage_mgmt_host_routes: []
            storage_mgmt_mtu: 9000
            storage_mgmt_vlan_id: 23
            storage_mtu: 9000
            timesync_ntp_servers:
            - hostname: pool.ntp.org
        ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
        managementNetwork: ctlplane
        networks:
        - defaultRoute: true
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
      nodes:
        edpm-compute-0:
          ansible:
            host: 192.168.122.100
          hostName: compute-0
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.100
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
        edpm-compute-1:
          ansible:
            host: 192.168.122.101
          hostName: compute-1
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.101
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
        edpm-compute-2:
          ansible:
            host: 192.168.122.102
          hostName: compute-2
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.102
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
      preProvisioned: true
      services:
      - bootstrap
      - configure-network
      - validate-network
      - install-os
      - ceph-hci-pre
      - configure-os
      - ssh-known-hosts
      - run-os
      - reboot-os
    Note

    It is not necessary to add the storage management network to the networkAttachments key.

  5. Apply the CR:

    $ oc apply -f <dataplane_cr_file>
    • Replace <dataplane_cr_file> with the name of your file.

      Note

      Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.

  6. Create an OpenStackDataPlaneDeployment CR, as described in Deploying the data plane, that references the OpenStackDataPlaneNodeSet CR defined above so that Ansible configures the services on the data plane nodes.
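
    The following is a minimal sketch of such a CR. The metadata.name value is an example, and the nodeSets entry assumes that the OpenStackDataPlaneNodeSet defined earlier in this procedure is named openstack-edpm:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: data-plane-deploy
      namespace: openstack
    spec:
      nodeSets:
      - openstack-edpm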
  7. To confirm the network is configured, complete the following steps:

    1. SSH into a data plane node.
    2. Use the ip a command to display configured networks.
    3. Confirm the storage networks are in the list of configured networks.

3.2.1. Red Hat Ceph Storage MTU settings

The example in this procedure changes the MTU of the storage and storage_mgmt networks from 1500 to 9000. An MTU of 9000 is known as a jumbo frame. Increasing the MTU is not mandatory, but jumbo frames are used to improve storage performance. If jumbo frames are used, all network switch ports in the data path must be configured to support jumbo frames. You must also change the MTU for the services that use the Storage network and run on OpenShift.

To change the MTU for the OpenShift services connecting to the data plane nodes, update the Node Network Configuration Policy (NNCP) for the base interface and the VLAN interface. It is not necessary to update the Network Attachment Definition (NAD) if the main NAD interface already has the desired MTU. If the MTU of the underlying interface is set to 9000, and it is not specified for the VLAN interface on top of it, then it will default to the value from the underlying interface.
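
The following NodeNetworkConfigurationPolicy sketch sets an MTU of 9000 on a base interface and on the storage VLAN interface on top of it. The policy name, the interface name enp6s0, the VLAN ID 21, and the node selector are assumptions for illustration; adapt them to your environment:

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: osp-enp6s0-worker-0
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    # Base interface carries jumbo frames end to end.
    - name: enp6s0
      type: ethernet
      state: up
      mtu: 9000
    # Storage VLAN interface; without an explicit mtu it inherits the base interface value.
    - name: enp6s0.21
      type: vlan
      state: up
      mtu: 9000
      vlan:
        base-iface: enp6s0
        id: 21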

If the MTU values are not consistent, issues can occur at the application layer that can prevent the Red Hat Ceph Storage cluster from reaching quorum or from supporting authentication with the CephX protocol. If the MTU is changed and you observe these types of problems, verify that all hosts that use the jumbo frame network can communicate at the chosen MTU value with the ping command. For example, the following command sends an 8972-byte payload, which together with the 28 bytes of IP and ICMP headers totals 9000 bytes:

$ ping -M do -s 8972 172.20.0.100

3.3. Configuring and deploying Red Hat Ceph Storage on data plane nodes

Use the cephadm utility to configure and deploy Red Hat Ceph Storage for an HCI environment.

3.3.1. The cephadm utility

Use the cephadm utility to configure and deploy Red Hat Ceph Storage on the data plane nodes. The cephadm package must be deployed on at least one data plane node before proceeding; edpm-ansible does not deploy Red Hat Ceph Storage.
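
For example, after you enable the Red Hat Ceph Storage Tools repository on the node (see the linked installation guide for the repository name for your version), you can install the package with dnf:

$ sudo dnf install -y cephadm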

For additional information and procedures for deploying Red Hat Ceph Storage, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide.

3.3.2. Configuring and deploying Red Hat Ceph Storage

Configure and deploy Red Hat Ceph Storage by editing the configuration file and using the cephadm utility.

Procedure

  1. Edit the Red Hat Ceph Storage configuration file.
  2. Add the Storage and Storage Management network ranges. Red Hat Ceph Storage uses the Storage network as the Red Hat Ceph Storage public_network and the Storage Management network as the cluster_network.

    The following example is for a configuration file entry where the Storage network range is 172.18.0.0/24 and the Storage Management network range is 172.20.0.0/24:

    [global]
    public_network = 172.18.0.0/24
    cluster_network = 172.20.0.0/24
  3. Add collocation boundaries between the Compute service and Ceph OSD services. Boundaries should be set between collocated Compute service and Ceph OSD services to reduce CPU and memory contention.

    The following is an example for a Ceph configuration file entry with these boundaries set:

    [osd]
    osd_memory_target_autotune = true
    osd_numa_auto_affinity = true
    [mgr]
    mgr/cephadm/autotune_memory_target_ratio = 0.2

    In this example, the osd_memory_target_autotune parameter is set to true so that the OSD daemons adjust their memory consumption based on the osd_memory_target option. The autotune_memory_target_ratio defaults to 0.7, which means that 70 percent of the total RAM in the system is the starting point from which any memory consumed by non-autotuned Ceph daemons is subtracted. The remaining memory is divided between the OSDs, assuming that all OSDs have osd_memory_target_autotune set to true. For HCI deployments, you can set mgr/cephadm/autotune_memory_target_ratio to 0.2 so that more memory is available for the Compute service. For example, on a hypothetical node with 256 GB of RAM and 10 OSDs, a ratio of 0.2 gives the autotuned Ceph daemons a combined target of roughly 51 GB, or about 5 GB per OSD, and leaves the rest for the hypervisor and the Compute service workloads.

    For additional information about service collocation, see Collocating services in an HCI environment for NUMA nodes.

    Note

    If these values need to be adjusted after the deployment, use the ceph config set osd <key> <value> command.
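
    For example, the following sketch re-applies the NUMA affinity setting to all OSDs on a running cluster:

    $ cephadm shell -- ceph config set osd osd_numa_auto_affinity true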

  4. Deploy Ceph Storage with the edited configuration file on a data plane node:

    $ cephadm bootstrap --config <config_file> --mon-ip <data_plane_node_ip>

    • Replace <config_file> with the name of your Ceph configuration file.
    • Replace <data_plane_node_ip> with the Storage network IP address of the data plane node on which Red Hat Ceph Storage will be installed.
  5. After the Red Hat Ceph Storage cluster is bootstrapped on the first EDPM node, see Red Hat Ceph Storage installation in the Red Hat Ceph Storage Installation Guide to add the other EDPM nodes to the Ceph cluster.
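
    For example, after the bootstrap completes, the remaining EDPM nodes are typically added by distributing the cluster public SSH key to each node and then registering the node with the orchestrator. The placeholders stand for the host name and Storage network IP address of each additional node:

    $ ssh-copy-id -f -i /etc/ceph/ceph.pub root@<other_node>
    $ cephadm shell -- ceph orch host add <other_node> <storage_network_ip>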

3.3.2.1. Collocating services in an HCI environment for NUMA nodes

A two-NUMA node system can host a latency-sensitive Compute service workload on one NUMA node and a Ceph OSD workload on the other NUMA node. To configure Ceph OSDs to use a specific NUMA node that is not used by the Compute service, use either of the following Ceph OSD configurations:

  • osd_numa_node sets affinity to a NUMA node (-1 for none).
  • osd_numa_auto_affinity automatically sets affinity to the NUMA node where storage and network match.

If there are network interfaces on both NUMA nodes and the disk controllers are on NUMA node 0, do the following:

  1. Use a network interface on NUMA node 0 for the storage network.
  2. Host the Ceph OSD workload on NUMA node 0.
  3. Host the Compute service workload on NUMA node 1 and have it use the network interfaces on NUMA node 1.

Set osd_numa_auto_affinity to true, as in the initial Ceph configuration file. Alternatively, set osd_numa_node directly to 0 and clear the osd_numa_auto_affinity parameter so that it defaults to false, as in the sketch that follows.
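
The following sketch shows the alternative [osd] section for the initial Ceph configuration file, pinning the OSDs to NUMA node 0 instead of relying on automatic affinity:

[osd]
osd_memory_target_autotune = true
osd_numa_node = 0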

When a hyperconverged cluster backfills as a result of an OSD going offline, the backfill process can be slowed down. In exchange for a slower recovery, the backfill activity has less of an impact on the collocated Compute service (nova) workload. Red Hat Ceph Storage has the following defaults to control the rate of backfill activity.

  • osd_recovery_op_priority = 3
  • osd_max_backfills = 1
  • osd_recovery_max_active_hdd = 3
  • osd_recovery_max_active_ssd = 10

3.3.3. Confirming Red Hat Ceph Storage deployment

Confirm Red Hat Ceph Storage is deployed before proceeding.

Procedure

  1. Connect to a data plane node by using SSH.
  2. View the status of the Red Hat Ceph Storage cluster:

    $ cephadm shell -- ceph -s

3.3.4. Confirming Red Hat Ceph Storage tuning

Ensure that Red Hat Ceph Storage is properly tuned before proceeding.

Procedure

  1. Connect to a data plane node by using SSH.
  2. Verify overall Red Hat Ceph Storage tuning with the following commands:

    $ ceph config dump | grep numa
    $ ceph config dump | grep autotune
    $ ceph config dump | grep mgr
  3. Verify the tuning of an OSD with the following commands:

    $ ceph config get <osd_number> osd_memory_target
    $ ceph config get <osd_number> osd_memory_target_autotune
    $ ceph config get <osd_number> osd_numa_auto_affinity
    • Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.
  4. Verify the default backfill values of an OSD with the following commands:

    $ ceph config get <osd_number> osd_recovery_op_priority
    $ ceph config get <osd_number> osd_max_backfills
    $ ceph config get <osd_number> osd_recovery_max_active_hdd
    $ ceph config get <osd_number> osd_recovery_max_active_ssd
    • Replace <osd_number> with the number of an OSD. For example, to refer to OSD 11, use osd.11.

3.4. Configuring the data plane to use the collocated Red Hat Ceph Storage server

Although the Red Hat Ceph Storage cluster is physically collocated with the Compute services on the data plane nodes, it is treated as logically separated. Red Hat Ceph Storage must be configured as the storage solution before the data plane nodes can use it.

Prerequisites

  • The Red Hat Ceph Storage cluster is deployed on the data plane nodes and confirmed to be healthy, as described in the previous sections.

Procedure

  1. Edit the OpenStackDataPlaneNodeSet CR.
  2. To define the cephx key and configuration file for the Compute service (nova), use the extraMounts parameter.

    The following is an example of using the extraMounts parameter for this purpose:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      nodeTemplate:
        extraMounts:
        - extraVolType: Ceph
          volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
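
    The Secret referenced by secretName must contain the Ceph configuration file and the cephx keyring for the client. The following is a sketch of creating it, assuming both files were copied from the Ceph cluster into the current directory and that the keyring file is named ceph.client.openstack.keyring:

    $ oc create secret generic ceph-conf-files \
        --from-file=ceph.conf \
        --from-file=ceph.client.openstack.keyring \
        -n openstack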
  3. Locate the services list in the CR.
  4. Edit the services list to restore all of the services removed in Configuring the data plane node networks. Restoring the full services list allows the remaining jobs to be run that complete the configuration of the HCI environment.

    The following is an example of a full services list with the additional services in bold:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      services:
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - ceph-hci-pre
        - configure-os
        - ssh-known-hosts
        - run-os
        - reboot-os
        - install-certs
        - ceph-client
        - ovn
        - neutron-metadata
        - libvirt
        - nova-custom-ceph
    Note

    In addition to restoring the default service list, the ceph-client service is added after the run-os service. The ceph-client service configures EDPM nodes as clients of a Red Hat Ceph Storage server. This service distributes the files necessary for the clients to connect to the Red Hat Ceph Storage server.

  5. Create a ConfigMap to set the reserved_host_memory_mb parameter to a value appropriate for your configuration.

    The following is an example of a ConfigMap used for this purpose:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: reserved-memory-nova
    data:
      04-reserved-memory-nova.conf: |
        [DEFAULT]
        reserved_host_memory_mb=75000
    Note

    Set the value of the reserved_host_memory_mb parameter so that the Compute service scheduler does not allocate memory to a virtual machine that a Ceph OSD on the same server needs. The example reserves 5 GB per OSD for 10 OSDs per host, in addition to the default reserved memory for the hypervisor. In an IOPS-optimized cluster, you can improve performance by reserving more memory for each OSD. The 5 GB value is provided as a starting point that you can tune further if necessary.
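
    To create the ConfigMap on the control plane, apply the file that contains it. The file name below is an example, and the openstack namespace matches the rest of this example:

    $ oc apply -f reserved-memory-nova.yaml -n openstack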

  6. Add reserved-memory-nova to the configMaps list by editing the OpenStackDataPlaneService/nova-custom-ceph file:

    kind: OpenStackDataPlaneService
    <...>
    spec:
      configMaps:
      - ceph-nova
      - reserved-memory-nova
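
    For example, you can make this change in place with oc edit, assuming the service is named nova-custom-ceph and is in the openstack namespace:

    $ oc edit openstackdataplaneservice/nova-custom-ceph -n openstack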
  7. Apply the CR changes.

    $ oc apply -f <dataplane_cr_file>
    • Replace <dataplane_cr_file> with the name of your file.

      Note

      Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.

  8. Create an OpenStackDataPlaneDeployment CR, as described in Deploying the data plane, that references the OpenStackDataPlaneNodeSet CR defined above so that Ansible configures the services on the data plane nodes.