Chapter 4. Configuring resource isolation on hyperconverged nodes
Colocating Ceph OSD and Compute services on hyperconverged nodes risks resource contention, because neither service is aware of the other's presence on the same host. Resource contention can result in degradation of service, which offsets the benefits of hyperconvergence.
The following sections detail how resource isolation is configured for both Ceph and Compute services to prevent contention.
4.1. Reserving CPU and memory resources for Compute
The director provides a default plan environment file for configuring resource constraints on hyperconverged nodes during deployment. This plan environment file instructs the OpenStack Workflow to complete the following processes:
- Retrieve the hardware introspection data collected during Inspecting the Hardware of Nodes in the Director Installation and Usage guide.
- Calculate the optimal CPU and memory allocation for the Compute workload on hyperconverged nodes, based on that data.
- Autogenerate the parameters required to configure those constraints and reserve CPU and memory resources for Compute. These parameters are defined under the hci_profile_config section of the plan-environment-derived-params.yaml file.
The average_guest_memory_size_in_mb and average_guest_cpu_utilization_percentage parameters in each workload profile are used to calculate values for the reserved_host_memory and cpu_allocation_ratio settings of Compute.
You can override the autogenerated Compute settings by adding the following parameters to your Compute environment file:
Autogenerated nova.conf parameter | Compute environment file override | Description |
---|---|---|
reserved_host_memory | parameter_defaults: ComputeHCIParameters: NovaReservedHostMemory: 181000 | Sets how much RAM should be reserved for the Ceph OSD services and per-guest instance overhead on hyperconverged nodes. |
cpu_allocation_ratio | parameter_defaults: ComputeHCIExtraConfig: nova::cpu_allocation_ratio: 8.2 | Sets the ratio that the Compute scheduler should use when choosing which Compute node to deploy an instance on. |
These overrides are applied to all nodes that use the ComputeHCI role, namely, all hyperconverged nodes. For more information about manually determining optimal values for NovaReservedHostMemory and nova::cpu_allocation_ratio, see Compute CPU and Memory Calculator.
You can use the following script to calculate suitable baseline NovaReservedHostMemory and cpu_allocation_ratio values for your hyperconverged nodes.
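As an illustration, the following minimal sketch (not the supported calculator itself) derives both values from per-node hardware figures and the workload profile inputs. The per-OSD reservations (5 GB of RAM and one core per OSD), the 0.5 GB per-guest overhead, and the example hardware figures are assumptions chosen for demonstration; adjust them to match your environment.

    #!/usr/bin/env python3
    # Minimal illustrative sketch, not the supported calculator: derives baseline
    # NovaReservedHostMemory and cpu_allocation_ratio values for one
    # hyperconverged node. All constants below are assumptions for demonstration.

    # Per-node hardware (replace with your introspection data).
    total_memory_gb = 256                       # physical RAM on the node
    total_vcpus = 56                            # logical CPUs on the node
    num_osds = 8                                # Ceph OSDs colocated on the node

    # Workload profile inputs (match the hci_profile_config parameters).
    average_guest_memory_size_in_mb = 8192
    average_guest_cpu_utilization_percentage = 10

    # Assumed per-OSD reservations and per-guest hypervisor overhead.
    mem_per_osd_gb = 5.0
    cores_per_osd = 1.0
    mem_overhead_per_guest_gb = 0.5

    # Memory: reserve RAM for the OSDs plus overhead for the guests that fit
    # into the remaining memory.
    ceph_mem_gb = mem_per_osd_gb * num_osds
    guest_size_gb = average_guest_memory_size_in_mb / 1024.0
    num_guests = int((total_memory_gb - ceph_mem_gb) /
                     (guest_size_gb + mem_overhead_per_guest_gb))
    nova_reserved_host_memory_mb = int(
        1024 * (ceph_mem_gb + num_guests * mem_overhead_per_guest_gb))

    # CPU: oversubscribe the cores left after the OSD reservation according to
    # the expected average guest CPU utilization.
    non_ceph_vcpus = total_vcpus - cores_per_osd * num_osds
    guest_vcpus = non_ceph_vcpus / (average_guest_cpu_utilization_percentage / 100.0)
    cpu_allocation_ratio = guest_vcpus / total_vcpus

    print("NovaReservedHostMemory: %d" % nova_reserved_host_memory_mb)
    print("nova::cpu_allocation_ratio: %.2f" % cpu_allocation_ratio)

With the example inputs, the sketch reserves roughly 53 GB of host RAM and suggests a CPU allocation ratio of about 8.6, in the same range as the override examples shown above.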
4.2. Reserving CPU and memory resources for Ceph
The following procedure details how to reserve CPU and memory resources for Ceph.
Procedure
1. Set the parameter is_hci to "true" in /home/stack/templates/storage-container-config.yaml:

       parameter_defaults:
         CephAnsibleExtraConfig:
           is_hci: true

   This allows ceph-ansible to reserve memory resources for Ceph, and to reduce memory growth by the Ceph OSDs, by automatically adjusting the osd_memory_target parameter setting for an HCI deployment, as illustrated in the sketch at the end of this section.

   Warning: Red Hat does not recommend directly overriding the ceph_osd_docker_memory_limit parameter.

   Note: As of ceph-ansible 3.2, ceph_osd_docker_memory_limit is set automatically to the maximum memory of the host, as discovered by Ansible, regardless of whether the FileStore or BlueStore back end is used.

2. (Optional) By default, ceph-ansible reserves one vCPU for each Ceph OSD. If more than one CPU per Ceph OSD is required, add the following configuration to /home/stack/templates/storage-container-config.yaml, setting ceph_osd_docker_cpu_limit to the desired CPU limit:

       parameter_defaults:
         CephAnsibleExtraConfig:
           ceph_osd_docker_cpu_limit: 2
For more information on how to tune CPU resources based on your hardware and workload, see Red Hat Ceph Storage Hardware Selection Guide.
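To get a sense of how the is_hci setting from step 1 changes per-OSD memory consumption, the following sketch shows one way such an adjustment can be computed: a conservative fraction of total host memory is divided among the colocated OSDs, leaving the remainder for Compute guests. The safety factors (0.2 for HCI, 0.7 otherwise), the function name, and the example figures are assumptions for illustration; the defaults shipped with ceph-ansible govern the actual osd_memory_target value.

    # Illustrative sketch of the kind of adjustment is_hci enables: divide a
    # conservative fraction of total host memory among the colocated OSDs.
    # The 0.2 (HCI) and 0.7 (dedicated storage) safety factors are assumptions
    # for demonstration, not the authoritative ceph-ansible defaults.

    def osd_memory_target_bytes(total_memory_mb, num_osds, is_hci=True):
        """Return a per-OSD memory target that leaves most RAM for guests on HCI nodes."""
        safety_factor = 0.2 if is_hci else 0.7
        return int((total_memory_mb * 1024 * 1024 * safety_factor) / num_osds)

    # Example: a 256 GB hyperconverged node running 8 OSDs.
    print(osd_memory_target_bytes(total_memory_mb=262144, num_osds=8))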
4.3. Reducing Ceph backfill and recovery operations
When a Ceph OSD is removed, Ceph uses backfill and recovery operations to rebalance the cluster. Ceph does this to keep multiple copies of data according to the placement group policy. These operations use system resources. If a Ceph cluster is under load, its performance drops as it diverts resources to backfill and recovery.
To mitigate this performance effect during OSD removal, you can reduce the priority of backfill and recovery operations. The trade-off is that there are fewer data replicas for a longer time, which puts the data at slightly greater risk.
The parameters detailed in the following table are used to configure the priority of backfill and recovery operations.
Parameter | Description | Default value |
---|---|---|
osd_recovery_op_priority | Sets the priority for recovery operations, relative to the OSD client OP priority. | 3 |
osd_recovery_max_active | Sets the number of active recovery requests per OSD, at one time. More requests accelerate recovery, but the requests place an increased load on the cluster. Set this to 1 if you want to reduce latency. | 3 |
osd_max_backfills | Sets the maximum number of backfills allowed to or from a single OSD. | 1 |
To change this default configuration, add an environment file named ceph-backfill-recovery.yaml to ~/templates that contains the following:

    parameter_defaults:
      CephConfigOverrides:
        osd_recovery_op_priority: ${priority_value}
        osd_recovery_max_active: ${no_active_recovery_requests}
        osd_max_backfills: ${max_no_backfills}
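For reference, the following sketch fills in the placeholders with example values (a reduced recovery priority, a single active recovery request, and one backfill per OSD) and writes the resulting ceph-backfill-recovery.yaml file. The chosen numbers are illustrative only and should be tuned for your cluster.

    # Minimal sketch: render ceph-backfill-recovery.yaml with example values in
    # place of the placeholders above. The values chosen here are illustrative.
    TEMPLATE = """parameter_defaults:
      CephConfigOverrides:
        osd_recovery_op_priority: {priority_value}
        osd_recovery_max_active: {no_active_recovery_requests}
        osd_max_backfills: {max_no_backfills}
    """

    with open("ceph-backfill-recovery.yaml", "w") as f:
        f.write(TEMPLATE.format(priority_value=1,
                                no_active_recovery_requests=1,
                                max_no_backfills=1))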