Chapter 8. CRUSH Weights
The CRUSH algorithm assigns a weight value per device with the objective of approximating a uniform probability distribution for I/O requests. As a best practice, we recommend creating pools with devices of the same type and size, and assigning the same relative weight. Since this is not always practical, you may incorporate devices of different size and use a relative weight so that Ceph will distribute more data to larger drives and less data to smaller drives.
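As a rough illustration of relative weights, the following sketch computes the share of data each device would hold if placement were exactly proportional to weight. It is an idealization of what CRUSH approximates, not the CRUSH algorithm itself; the function name and the OSD names are hypothetical.

```python
def expected_data_share(weights):
    """Return each device's idealized share of data, proportional to its weight.

    `weights` maps a device name to its relative weight. This is not CRUSH;
    it only shows the proportion that CRUSH tries to approximate.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: w / total for name, w in weights.items()}


# Hypothetical pool: two 1TB drives and one 2TB drive.
shares = expected_data_share({"osd.0": 1.0, "osd.1": 1.0, "osd.2": 2.0})
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Under this assumption, the 2TB drive would be expected to hold about half of the data and each 1TB drive about a quarter.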
To adjust an OSD’s CRUSH weight in the CRUSH map of a running cluster, execute the following:
ceph osd crush reweight {name} {weight}
Where:
name
- Description: The full name of the OSD.
- Type: String
- Required: Yes
- Example: osd.0
weight
- Description: The CRUSH weight for the OSD.
- Type: Double
- Required: Yes
- Example: 2.0
You can also set weights when running osd crush add or osd crush set (move).
CRUSH buckets reflect the sum of the weights of the buckets or the devices they contain. For example, a rack containing two hosts with two OSDs each might have a weight of 4.0 and each host a weight of 2.0, the sum of the weights of its OSDs, where the weight per OSD is 1.0. Generally, we recommend using 1.0 as the measure of 1TB of data.
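The summing rule can be sketched in a few lines. This is a minimal illustration, assuming the 1.0-per-1TB convention; the helper names and the rack layout are hypothetical and do not correspond to any Ceph API.

```python
def crush_weight_for_size(size_tb):
    """Map capacity in TB to a CRUSH weight using the 1.0-per-1TB convention."""
    return float(size_tb)


def bucket_weight(node):
    """Return the weight of a bucket as the sum of what it contains.

    A node is either a number (a device weight) or a dict mapping
    child names to child nodes (a bucket).
    """
    if isinstance(node, dict):
        return sum(bucket_weight(child) for child in node.values())
    return float(node)


# Hypothetical rack: two hosts with two 1TB OSDs each.
rack = {
    "host-a": {"osd.0": crush_weight_for_size(1), "osd.1": crush_weight_for_size(1)},
    "host-b": {"osd.2": crush_weight_for_size(1), "osd.3": crush_weight_for_size(1)},
}
print(bucket_weight(rack["host-a"]))  # weight of one host
print(bucket_weight(rack))            # weight of the rack
```

With these inputs each host sums to 2.0 and the rack to 4.0, matching the example in the text.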
Introducing devices of different size and performance characteristics in the same pool can lead to variance in data distribution and performance.
CRUSH weight is a persistent setting, and it affects how CRUSH assigns data to OSDs. Ceph also provides temporary reweight settings for cases where the cluster gets out of balance. For example, whereas a Ceph Block Device will shard a block device image into a series of smaller objects and stripe them across the cluster, using librados to store data without normalizing the size of objects can lead to imbalanced clusters (e.g., storing 100 1MB objects and 10 4MB objects will make a few OSDs have more data than the others).
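The effect of mixed object sizes can be illustrated with a small simulation. This is only a sketch: it places objects with a plain hash of the object name as a stand-in for CRUSH, and the cluster size and object names are hypothetical.

```python
import hashlib

NUM_OSDS = 10  # hypothetical cluster size


def place(object_name, num_osds=NUM_OSDS):
    """Pick an OSD index from a hash of the object name (a stand-in, not CRUSH)."""
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_osds


# The workload from the text: 100 objects of 1MB and 10 objects of 4MB.
objects = [(f"small-{i}", 1) for i in range(100)]
objects += [(f"large-{i}", 4) for i in range(10)]

stored_mb = {osd: 0 for osd in range(NUM_OSDS)}
for name, size_mb in objects:
    stored_mb[place(name)] += size_mb

for osd, mb in sorted(stored_mb.items()):
    print(f"osd.{osd}: {mb} MB")
```

Even when object counts are spread fairly evenly, the OSDs that happen to receive the 4MB objects tend to end up holding more data than the rest.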
You can temporarily increase or decrease the weight of particular OSDs by executing:
ceph osd reweight {id} {weight}
Where:
- id is the OSD number.
- weight is a value in the range 0.0-1.0.
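To guard against out-of-range values before running the command, a small wrapper can validate its inputs first. The sketch below is hypothetical helper code, not part of Ceph; it only builds the argument list for the command shown above.

```python
def osd_reweight_args(osd_id, weight):
    """Build the argument list for `ceph osd reweight {id} {weight}`.

    Raises ValueError if the OSD number is negative or the weight is
    outside the 0.0-1.0 range.
    """
    if osd_id < 0:
        raise ValueError("OSD number must be non-negative")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    return ["ceph", "osd", "reweight", str(osd_id), str(weight)]


# Hypothetical example: temporarily lower the weight of OSD 3 to 0.8.
args = osd_reweight_args(3, 0.8)
print(" ".join(args))
```

On a host with the ceph command-line tool installed, the resulting list could be passed to `subprocess.run`.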
You can also temporarily reweight OSDs by utilization.
ceph osd reweight-by-utilization {threshold}
Where:
- threshold is a percentage of utilization; OSDs whose load exceeds it will receive a lower weight. The default value is 120, reflecting 120%. Any value of 100 or higher is a valid threshold.
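The role of the threshold can be sketched as follows. This assumes the threshold is applied relative to the average utilization across OSDs, so that 120 selects OSDs that are more than 20% above average. The function name and utilization figures are hypothetical, and the sketch only identifies candidate OSDs; it does not reproduce how Ceph computes the new weights.

```python
def overloaded_osds(utilization, threshold=120):
    """Return the OSDs whose utilization exceeds threshold% of the average.

    `utilization` maps an OSD number to its used fraction (0.0-1.0).
    Assumption: the threshold is applied relative to the average
    utilization across all listed OSDs.
    """
    if threshold < 100:
        raise ValueError("threshold must be 100 or higher")
    if not utilization:
        return []
    average = sum(utilization.values()) / len(utilization)
    cutoff = average * threshold / 100.0
    return sorted(osd for osd, used in utilization.items() if used > cutoff)


# Hypothetical utilization figures for a three-OSD cluster.
usage = {0: 0.50, 1: 0.50, 2: 0.80}
print(overloaded_osds(usage))  # average is about 0.60, cutoff about 0.72
```

With these figures only OSD 2 is above the cutoff, so it is the only candidate for a lower weight.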
Restarting the cluster will wipe out osd reweight and osd reweight-by-utilization settings, but osd crush reweight settings are persistent.