Chapter 5. Management of managers using the Ceph Orchestrator

As a storage administrator, you can use the Ceph Orchestrator to deploy additional manager daemons. Cephadm automatically installs a manager daemon on the bootstrap node during the bootstrapping process.

In general, you should set up a Ceph Manager on each of the hosts running the Ceph Monitor daemon to achieve the same level of availability.

By default, whichever ceph-mgr instance comes up first is made active by the Ceph Monitors, and others are standby managers. There is no requirement that there should be a quorum among the ceph-mgr daemons.

If the active daemon fails to send a beacon to the monitors for longer than the mon mgr beacon grace period, then it is replaced by a standby.

If you want to pre-empt failover, you can explicitly mark a ceph-mgr daemon as failed with the ceph mgr fail MANAGER_NAME command.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to all the nodes.
Hosts are added to the cluster.

5.1. Deploying the manager daemons using the Ceph Orchestrator

The Ceph Orchestrator deploys two Manager daemons by default. You can deploy additional manager daemons using the placement specification in the command line interface. To deploy a different number of Manager daemons, specify a different number. If you do not specify the hosts where the Manager daemons should be deployed, the Ceph Orchestrator randomly selects the hosts and deploys the Manager daemons to them.

Note

Ensure that each deployment has at least three Ceph Managers.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
You can deploy manager daemons in two different ways:

Method 1

Deploy manager daemons using a placement specification on a specific set of hosts:

Note

Red Hat recommends that you use the --placement option to deploy on specific hosts.

Syntax

ceph orch apply mgr --placement="HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"

Example

[ceph: root@host01 /]# ceph orch apply mgr --placement="host01 host02 host03"

Method 2

Deploy manager daemons randomly on the hosts in the storage cluster:
Syntax
```
ceph orch apply mgr NUMBER_OF_DAEMONS
```
Example
```
[ceph: root@host01 /]# ceph orch apply mgr 3
```

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls
```

List the hosts, daemons, and processes:

Syntax

ceph orch ps --daemon_type=DAEMON_NAME

Example

[ceph: root@host01 /]# ceph orch ps --daemon_type=mgr

5.2. Removing the manager daemons using the Ceph Orchestrator

To remove manager daemons from a host, redeploy the daemons on other hosts.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to all the nodes.
Hosts are added to the cluster.
At least one manager daemon deployed on the hosts.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
Run the ceph orch apply command to redeploy the required manager daemons:
Syntax
```
ceph orch apply mgr "NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_3"
```
If you want to remove manager daemons from host02, then you can redeploy the manager daemons on other hosts.
Example
```
[ceph: root@host01 /]# ceph orch apply mgr "2 host01 host03"
```

Verification

List the hosts, daemons, and processes:

Syntax

ceph orch ps --daemon_type=DAEMON_NAME

Example

[ceph: root@host01 /]# ceph orch ps --daemon_type=mgr

5.3. Using the Ceph Manager modules

Use the ceph mgr module ls command to see the available modules and the modules that are presently enabled.

Enable or disable modules with the ceph mgr module enable MODULE and ceph mgr module disable MODULE commands, respectively.

If a module is enabled, then the active ceph-mgr daemon loads and executes it. In the case of modules that provide a service, such as an HTTP server, the module might publish its address when it is loaded. To see the addresses of such modules, run the ceph mgr services command.

Some modules might also implement a special standby mode that runs on the standby ceph-mgr daemons as well as the active daemon. This enables modules that provide services to redirect their clients to the active daemon if a client tries to connect to a standby.

The following example enables the dashboard module:

[ceph: root@host01 /]# ceph mgr module enable dashboard

[ceph: root@host01 /]# ceph mgr module ls
MODULE
balancer              on (always on)
crash                 on (always on)
devicehealth          on (always on)
orchestrator          on (always on)
pg_autoscaler         on (always on)
progress              on (always on)
rbd_support           on (always on)
status                on (always on)
telemetry             on (always on)
volumes               on (always on)
cephadm               on
dashboard             on
iostat                on
nfs                   on
prometheus            on
restful               on
alerts                -
diskprediction_local  -
influx                -
insights              -
k8sevents             -
localpool             -
mds_autoscaler        -
mirroring             -
osd_perf_query        -
osd_support           -
rgw                   -
rook                  -
selftest              -
snap_schedule         -
stats                 -
telegraf              -
test_orchestrator     -
zabbix                -

[ceph: root@host01 /]# ceph mgr services
{
        "dashboard": "http://myserver.com:7789/",
        "restful": "https://myserver.com:8789/"
}

The first time the cluster starts, it uses the mgr_initial_modules setting to override which modules to enable. However, this setting is ignored through the rest of the lifetime of the cluster: only use it for bootstrapping. For example, before starting your monitor daemons for the first time, you might add a section like this to your ceph.conf file:

[mon]
    mgr initial modules = dashboard balancer

Where a module implements command line hooks, the commands are accessible as ordinary Ceph commands. Ceph automatically incorporates module commands into the standard CLI interface and routes them appropriately to the module:

[ceph: root@host01 /]# ceph <command | help>

You can use the following configuration parameters with the above command:

Table 5.1. Configuration parameters
Configuration	Description	Type	Default
`mgr module path`	Path to load modules from.	String	`"<library dir>/mgr"`
`mgr data`	Path to load daemon data (such as keyring) from.	String	`"/var/lib/ceph/mgr/$cluster-$id"`
`mgr tick period`	Number of seconds between manager beacons to monitors and other periodic checks.	Integer	`5`
`mon mgr beacon grace`	Number of seconds after the last beacon before a manager is considered failed.	Integer	`30`
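
As a rough illustration of how the two timing parameters in Table 5.1 relate, the following sketch computes how many consecutive beacon intervals fit into the grace period before the monitors consider a manager failed. The function name missed_beacons is hypothetical, and the values are the defaults from the table rather than values read from a live cluster.

```shell
# Hypothetical helper: number of beacon intervals that fit in the grace period.
# $1 = mgr tick period (seconds), $2 = mon mgr beacon grace (seconds)
missed_beacons() {
    echo $(( $2 / $1 ))
}

# Defaults from Table 5.1: a beacon every 5 seconds, 30 seconds of grace.
missed_beacons 5 30   # prints 6
```

With the default values, a standby takes over only after roughly six beacons in a row are missed, so a single delayed beacon should not trigger a failover.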

5.4. Using the Ceph Manager balancer module

The balancer is a module for Ceph Manager (ceph-mgr) that optimizes the placement of placement groups (PGs) across OSDs in order to achieve a balanced distribution, either automatically or in a supervised fashion.

The balancer module cannot be disabled. It can only be turned off to customize the configuration.

Modes

The following are the supported balancer modes:

crush-compat

The CRUSH compat mode uses the compat weight-set feature, introduced in Ceph Luminous, to manage an alternative set of weights for devices in the CRUSH hierarchy. The normal weights should remain set to the size of the device to reflect the target amount of data that you want to store on the device. The balancer then optimizes the weight-set values, adjusting them up or down in small increments in order to achieve a distribution that matches the target distribution as closely as possible. Because PG placement is a pseudorandom process, there is a natural amount of variation in the placement; by optimizing the weights, the balancer counter-acts that natural variation.

This mode is fully backwards compatible with older clients. When an OSDMap and CRUSH map are shared with older clients, the balancer presents the optimized weights as the real weights.

The primary restriction of this mode is that the balancer cannot handle multiple CRUSH hierarchies with different placement rules if the subtrees of the hierarchy share any OSDs. Because this configuration makes managing space utilization on the shared OSDs difficult, it is generally not recommended. As such, this restriction is normally not an issue.

upmap

Starting with Luminous, the OSDMap can store explicit mappings for individual OSDs as exceptions to the normal CRUSH placement calculation. These upmap entries provide fine-grained control over the PG mapping. This balancer mode optimizes the placement of individual PGs in order to achieve a balanced distribution. In most cases, this distribution is "perfect", with an equal number of PGs on each OSD, +/-1 PG, because the PGs might not divide evenly.

Important

To allow use of this feature, you must tell the cluster that it only needs to support luminous or later clients with the following command:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client luminous

This command fails if any pre-luminous clients or daemons are connected to the monitors.

Due to a known issue, kernel CephFS clients report themselves as jewel clients. To work around this issue, use the --yes-i-really-mean-it flag:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it

You can check what client versions are in use with:

[ceph: root@host01 /]# ceph features

read

The OSDMap can store explicit mappings for individual primary OSDs as exceptions to the normal CRUSH placement calculation. These pg-upmap-primary entries provide fine-grained control over primary PG mappings. This mode optimizes the placement of individual primary PGs in order to achieve balanced reads, or primary PGs, in a cluster. In read mode, upmap behavior is not exercised, so this mode is best for use cases in which only read balancing is desired.

Important

To allow use of this feature, you must tell the cluster that it only needs to support Reef or later clients with the following command:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client reef

This command fails if any pre-Reef clients or daemons are connected to the monitors.

To work around this issue, use the --yes-i-really-mean-it flag:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client reef --yes-i-really-mean-it

You can check what client versions are in use with:

[ceph: root@host01 /]# ceph features

upmap-read

This balancer mode combines the optimization benefits of both the upmap and read modes. As in read mode, upmap-read makes use of pg-upmap-primary.

Use upmap-read for achieving the upmap mode’s offering of balanced PG distribution as well as the read mode’s offering of balanced reads.

Important

To allow use of this feature, you must tell the cluster that it only needs to support Reef or later clients with the following command:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client reef

This command fails if any pre-Reef clients or daemons are connected to the monitors.

To work around this issue, use the --yes-i-really-mean-it flag:

[ceph: root@host01 /]# ceph osd set-require-min-compat-client reef --yes-i-really-mean-it

You can check what client versions are in use with:

[ceph: root@host01 /]# ceph features
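
The client requirements of the four modes described above can be summarized in a small lookup. This is only a sketch: required_min_client is a hypothetical helper name, not a Ceph command, and the value "none" is used here simply to mean that the mode description states no minimum client release.

```shell
# Hypothetical helper: print the minimum client release that a balancer
# mode requires, according to the mode descriptions in this section.
required_min_client() {
    case "$1" in
        crush-compat)    echo "none" ;;      # backwards compatible with older clients
        upmap)           echo "luminous" ;;
        read|upmap-read) echo "reef" ;;
        *)               echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

required_min_client upmap-read   # prints reef
```

A result of luminous or reef corresponds to the release name that you would pass to the ceph osd set-require-min-compat-client command before switching to that mode.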

5.4.1. Balancing a Red Hat Ceph cluster using capacity balancer

Balance a Red Hat Ceph storage cluster using the capacity balancer.

Prerequisites

A running Red Hat Ceph Storage cluster.

Procedure

Ensure that the balancer module is enabled:

Example

[ceph: root@host01 /]# ceph mgr module enable balancer

Turn on the balancer module:
Example
```
[ceph: root@host01 /]# ceph balancer on
```

To change the mode, use one of the following commands. The default mode is upmap:

Example

[ceph: root@host01 /]# ceph balancer mode crush-compat

Example

[ceph: root@host01 /]# ceph balancer mode upmap

Check the current status of the balancer.
Example
```
[ceph: root@host01 /]# ceph balancer status
```

Automatic balancing

To use automatic balancing, turn on the capacity balancer with the ceph balancer on command. By default, turning on the balancer module enables automatic balancing in upmap mode.

Example

[ceph: root@host01 /]# ceph balancer on

You can change the balancer mode from upmap to crush-compat mode. The crush-compat mode is backward compatible with older clients and makes small changes to the data distribution to ensure that OSDs are equally utilized.

After the balancer is on, you can verify the balancer status.

Example

[ceph: root@host01 /]# ceph balancer on
[ceph: root@host01 /]# ceph balancer mode crush-compat
[ceph: root@host01 /]# ceph balancer status
{
    "active": true,
    "last_optimize_duration": "0:00:00.001174",
    "last_optimize_started": "Fri Nov 22 11:09:18 2024",
    "mode": "crush-compat",
    "no_optimization_needed": false,
    "optimize_result": "Unable to find further optimization, change balancer mode and retry might help",
    "plans": []
}

To turn off the balancer module, use the ceph balancer off command.

Example

[ceph: root@host01 /]# ceph balancer off
[ceph: root@host01 /]# ceph balancer status
{
    "active": false,
    "last_optimize_duration": "",
    "last_optimize_started": "",
    "mode": "crush-compat",
    "no_optimization_needed": false,
    "optimize_result": "",
    "plans": []
}

Throttling

No adjustments are made to the PG distribution if the cluster is degraded, for example, if an OSD has failed and the system has not yet healed itself.

When the cluster is healthy, the balancer throttles its changes such that the percentage of PGs that are misplaced, or need to be moved, is below a threshold of 5% by default. This percentage can be adjusted using the target_max_misplaced_ratio setting.

Syntax

ceph config set mgr target_max_misplaced_ratio THRESHOLD_PERCENTAGE

The following example increases the threshold to 7%.

Example

[ceph: root@host01 /]# ceph config set mgr target_max_misplaced_ratio .07
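
To get a feel for what the threshold means in practice, the following sketch estimates the largest number of PGs that can be misplaced at one time for a given total PG count. The helper name max_misplaced_pgs and the PG count of 1024 are illustrative assumptions, and the calculation only mirrors the percentage described above, not the balancer's internal logic.

```shell
# Hypothetical helper: whole number of PGs allowed to be misplaced at once.
# $1 = total number of PGs, $2 = target_max_misplaced_ratio (for example 0.05)
max_misplaced_pgs() {
    awk -v n="$1" -v r="$2" 'BEGIN { printf "%d\n", int(n * r) }'
}

max_misplaced_pgs 1024 0.05   # default threshold of 5%, prints 51
max_misplaced_pgs 1024 0.07   # raised threshold of 7%, prints 71
```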

For automatic balancing:

Set the number of seconds to sleep in between runs of the automatic balancer:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/sleep_interval 60

Set the time of day to begin automatic balancing in HHMM format:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/begin_time 0000

Set the time of day to finish automatic balancing in HHMM format:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/end_time 2359

Restrict automatic balancing to this day of the week or later. This uses the same conventions as crontab: 0 is Sunday, 1 is Monday, and so on:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/begin_weekday 0

Restrict automatic balancing to this day of the week or earlier. This uses the same conventions as crontab: 0 is Sunday, 1 is Monday, and so on:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/end_weekday 6

Define the pool IDs to which the automatic balancing is limited. The default is an empty string, meaning that all pools are balanced. You can get the numeric pool IDs with the ceph osd pool ls detail command:

Example

[ceph: root@host01 /]# ceph config set mgr mgr/balancer/pool_ids 1,2,3
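
When you choose begin_time and end_time values, it can help to check which times of day fall inside the resulting window. The following sketch is one way to do that. The helper in_balancer_window is hypothetical and is not the balancer's own logic; it assumes that the end time is exclusive and that a begin time later than the end time describes a window that wraps past midnight.

```shell
# Hypothetical check: does the HHMM time $1 fall inside the window that
# starts at $2 and stops at $3? Succeeds (exit 0) when it does.
in_balancer_window() {
    awk -v now="$1" -v start="$2" -v stop="$3" 'BEGIN {
        now += 0; start += 0; stop += 0   # treat values such as 0130 as decimal
        if (start <= stop) inside = (now >= start && now < stop)
        else               inside = (now >= start || now < stop)   # wraps midnight
        exit !inside
    }'
}

in_balancer_window 0130 0000 0600 && echo "inside window"
in_balancer_window 1200 2200 0600 || echo "outside window"
```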

Supervised optimization

The balancer operation is broken into a few distinct phases:

Building a plan.
Evaluating the quality of the data distribution, either for the current PG distribution, or the PG distribution that would result after executing a plan.
Executing the plan.
- To evaluate and score the current distribution:
  Example
  [ceph: root@host01 /]# ceph balancer eval
- To evaluate the distribution for a single pool:
  Syntax
  ceph balancer eval POOL_NAME
  Example
  [ceph: root@host01 /]# ceph balancer eval rbd
- To see greater detail for the evaluation:
  Example
  [ceph: root@host01 /]# ceph balancer eval-verbose ...
- To generate a plan using the currently configured mode:
  Syntax
  ceph balancer optimize PLAN_NAME
  Replace PLAN_NAME with a custom plan name.
  Example
  [ceph: root@host01 /]# ceph balancer optimize rbd_123
- To see the contents of a plan:
  Syntax
  ceph balancer show PLAN_NAME
  Example
  [ceph: root@host01 /]# ceph balancer show rbd_123
- To discard old plans:
  Syntax
  ceph balancer rm PLAN_NAME
  Example
  [ceph: root@host01 /]# ceph balancer rm rbd_123
- To see currently recorded plans use the status command:
  [ceph: root@host01 /]# ceph balancer status
- To calculate the quality of the distribution that would result after executing a plan:
  Syntax
  ceph balancer eval PLAN_NAME
  Example
  [ceph: root@host01 /]# ceph balancer eval rbd_123
- To execute the plan:
  Syntax
  ceph balancer execute PLAN_NAME
  Example
  [ceph: root@host01 /]# ceph balancer execute rbd_123
  Note
  Only execute the plan if it is expected to improve the distribution. After execution, the plan is discarded.
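
The decision in the note above, whether a plan is expected to improve the distribution, can be sketched as a comparison of two scores: the one reported by ceph balancer eval for the current distribution and the one reported for the plan. The helper plan_improves is hypothetical, the scores shown are made-up illustrative values, and the sketch assumes that a lower score indicates a better distribution.

```shell
# Hypothetical helper: succeed when the plan's score is strictly lower than
# the current score. Assumes that lower scores mean a better distribution.
# $1 = score of the current distribution, $2 = score of the plan
plan_improves() {
    awk -v c="$1" -v p="$2" 'BEGIN { exit !(p < c) }'
}

# Illustrative values only; substitute the scores that your cluster reports.
if plan_improves 0.024 0.011; then
    echo "plan improves the distribution; consider executing it"
else
    echo "plan does not improve the distribution; discard it"
fi
```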

5.4.2. Balancing a Red Hat Ceph cluster using read balancer

Balance primary placement groups (PGs) in a cluster. The balancer can optimize the allocation of placement groups across OSDs to achieve a balanced distribution. The balancer can operate either automatically (online) or in a supervised fashion (offline).

5.4.2.1. Online optimization

Balance primary PGs in a cluster by using the balancer module.

Prerequisites

Before you begin, make sure that you have a running Red Hat Ceph Storage cluster.

Procedure

Enable the balancer module.

Syntax

[ceph: root@host01 /]# ceph mgr module enable balancer

Turn on the balancer module.
Syntax
```
[ceph: root@host01 /]# ceph balancer on
```
Update the min_compat_client on the cluster. To use online optimization, you must indicate on the cluster that only Reef or later clients need to be supported.
Syntax
```
ceph osd set-require-min-compat-client reef
```
Note
This command fails if any pre-Reef clients or daemons are connected to the monitors. To work around this issue, use the --yes-i-really-mean-it flag:
Syntax
[ceph: root@host01 /]# ceph osd set-require-min-compat-client reef --yes-i-really-mean-it

You can check what client versions are in use with the ceph features command.
Syntax
[ceph: root@host01 /]# ceph features
Change the mode to upmap-read or read for read balancing.
The default mode is upmap. Enable one of these modes by running one of the following commands:
Syntax
```
ceph balancer mode upmap-read
ceph balancer mode read
```

Check the current status of the balancer.

Syntax

ceph balancer status

Example

[ceph: root@host01 /]# ceph balancer status
{
"active": true,
"last_optimize_duration": "0:00:00.013640",
"last_optimize_started": "Mon Nov 22 14:47:57 2024",
"mode": "upmap-read",
"no_optimization_needed": true,
"optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
"plans": []
}

5.4.2.2. Offline optimization (Technology Preview)

Important

Technology Preview features are not supported with Red Hat production service level agreements (SLAs), might not be functionally complete, and Red Hat does not recommend using them for production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. See the support scope for Red Hat Technology Preview features for more details.

If you have unbalanced primary OSDs, you can update them with an offline optimizer that is built into the osdmaptool.

Red Hat recommends that you run the capacity balancer before running the read balancer to ensure optimal results.

Follow the steps in the procedure to balance a cluster using the read balancer:

Prerequisites

Before you begin, make sure that you have the following prerequisites in place:

A running and capacity balanced Red Hat Ceph Storage cluster.
Run the capacity balancer to balance the capacity on each OSD before running the read balancer to ensure optimal results. Use the following steps to balance the capacity:
1. Get the latest copy of your osdmap.
  [ceph: root@host01 /]# ceph osd getmap -o map
2. Run the upmap balancer.
  [ceph: root@host01 /]# osdmaptool map --upmap out.txt
3. The file out.txt contains the proposed solution.
  The commands in this file are normal Ceph CLI commands that are run to apply the changes to the cluster.
  Run the following command if there are any recommendations in the out.txt file.
  [ceph: root@host01 /]# source out.txt

Procedure

Check the read_balance_score, available for each pool:

[ceph: root@host01 /]# ceph osd pool ls detail

If the read_balance_score is considerably above 1, your pool has unbalanced primary OSDs.

For a homogeneous cluster, the optimal score is ceil(number of PGs / number of OSDs) / (number of PGs / number of OSDs). For example, if you have a pool with 32 PGs and 10 OSDs, then (number of PGs / number of OSDs) = 32/10 = 3.2. So, if all the devices are identical, the optimal score is the ceiling of 3.2 divided by 3.2, that is, 4/3.2 = 1.25. If you have another pool in the same system with 64 PGs, the optimal score is 7/6.4 = 1.09375.
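
The arithmetic above can be reproduced with a short helper. This is a minimal sketch that assumes a homogeneous cluster, as the paragraph does; the function name optimal_read_score is hypothetical and is not part of the Ceph CLI.

```shell
# Hypothetical helper: optimal read_balance_score for a homogeneous cluster,
# computed as ceil(PGs / OSDs) / (PGs / OSDs).
# $1 = number of PGs in the pool, $2 = number of OSDs
optimal_read_score() {
    awk -v p="$1" -v o="$2" 'BEGIN {
        x = p / o
        c = (x == int(x)) ? x : int(x) + 1   # ceiling for a positive value
        printf "%.5f\n", c / x
    }'
}

optimal_read_score 32 10   # prints 1.25000
optimal_read_score 64 10   # prints 1.09375
```

Comparing a pool's reported read_balance_score with this value gives a sense of how far the pool is from the best score that its PG and OSD counts allow.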

Example output:

ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 17 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 3.00
pool 2 'cephfs.a.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 55 lfor 0/0/25 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 1.50
pool 3 'cephfs.a.data' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 27 lfor 0/0/25 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.31

Get the latest copy of your osdmap:
```
[ceph: root@host01 /]# ceph osd getmap -o om
```
Example output:
```
got osdmap epoch 56
```

Run the optimizer:

The file out.txt contains the proposed solution.

[ceph: root@host01 /]# osdmaptool om --read out.txt --read-pool POOL_NAME [--vstart]

Example output:

osdmaptool om --read out.txt --read-pool cephfs.a.meta
./bin/osdmaptool: osdmap file 'om'
writing upmap command output to: out.txt
---------- BEFORE ------------
 osd.0 | primary affinity: 1 | number of prims: 4
 osd.1 | primary affinity: 1 | number of prims: 8
 osd.2 | primary affinity: 1 | number of prims: 4

read_balance_score of 'cephfs.a.meta': 1.5

---------- AFTER ------------
 osd.0 | primary affinity: 1 | number of prims: 5
 osd.1 | primary affinity: 1 | number of prims: 6
 osd.2 | primary affinity: 1 | number of prims: 5

read_balance_score of 'cephfs.a.meta': 1.13


num changes: 2
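
The BEFORE and AFTER scores in the example output are consistent with dividing the largest per-OSD primary count by the average primary count. The following sketch reproduces them under that assumption, which applies only when all OSDs have the same primary affinity. The helper name read_score_from_prims is hypothetical, and the calculation is inferred from the example output rather than taken from the osdmaptool implementation.

```shell
# Hypothetical helper: estimate a read balance score from per-OSD primary
# counts, assuming equal primary affinity on every OSD.
# Arguments: one primary count per OSD
read_score_from_prims() {
    printf '%s\n' "$@" | awk '
        { sum += $1; if ($1 > max) max = $1 }
        END { printf "%.3f\n", max / (sum / NR) }'
}

read_score_from_prims 4 8 4   # BEFORE counts, prints 1.500
read_score_from_prims 5 6 5   # AFTER counts, prints 1.125
```

The second value, 1.125, matches the 1.13 that the tool reports after rounding to two decimal places.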

$ osdmaptool om --read out.txt --read-pool cephfs.a.meta
./bin/osdmaptool: osdmap file 'om'
writing upmap command output to: out.txt
---------- BEFORE ------------
 osd.0 | primary affinity: 1 | number of prims: 4
 osd.1 | primary affinity: 1 | number of prims: 8
 osd.2 | primary affinity: 1 | number of prims: 4

read_balance_score of 'cephfs.a.meta': 1.5

---------- AFTER ------------
 osd.0 | primary affinity: 1 | number of prims: 5
 osd.1 | primary affinity: 1 | number of prims: 6
 osd.2 | primary affinity: 1 | number of prims: 5

read_balance_score of 'cephfs.a.meta': 1.13


num changes: 2

Copy to Clipboard

Toggle word wrap
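The scores in this example are consistent with reading read_balance_score as the primary count of the busiest OSD divided by the mean primary count across the OSDs. That reading is an assumption that fits this output, where every primary affinity is 1; it is not a statement of how Ceph computes the score in general. The helper name max_over_mean is hypothetical:

```shell
# Ratio of the largest primary count to the mean primary count.
# Assumption: this matches read_balance_score only when every OSD has
# primary affinity 1, as in the example output.
max_over_mean() {
    printf '%s\n' "$@" | awk '
        { sum += $1; if ($1 > max) max = $1 }
        END { printf "%.3f\n", max / (sum / NR) }'
}

max_over_mean 4 8 4   # BEFORE: primaries on osd.0, osd.1, osd.2
max_over_mean 5 6 5   # AFTER
```

The two results, 1.500 and 1.125, agree with the reported scores of 1.5 and 1.13 after rounding.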

The file out.txt contains the proposed solution.

The commands in the file are normal Ceph CLI commands that you run to apply the changes to the cluster. If you are working in a vstart cluster, you can pass the --vstart parameter so that the CLI commands are formatted with the ./bin/ prefix.

```
[ceph: root@host01 /]# source out.txt
```

Example output:

```
$ cat out.txt
ceph osd pg-upmap-primary 2.3 0
ceph osd pg-upmap-primary 2.4 2

$ source out.txt
change primary for pg 2.3 to osd.0
change primary for pg 2.4 to osd.2
```
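Because source runs every line of out.txt in the current shell, it can be worth checking the file before you apply it. A minimal sketch, assuming the file holds only commands of the form shown in the example output; the function name check_upmap_file and the pattern are illustrative, so widen the pattern if your file legitimately contains other commands:

```shell
# Succeeds only if the file is readable and every non-empty line looks like
#   [./bin/]ceph osd pg-upmap-primary <pool>.<pg> <osd>
# The pattern is an assumption based on the example output.
check_upmap_file() {
    [ -r "$1" ] || return 1
    pattern='^(\./bin/)?ceph osd pg-upmap-primary [0-9]+\.[0-9a-f]+ [0-9]+$'
    # Print the lines that do NOT match, then fail if any of them is non-empty.
    ! grep -Ev "$pattern" "$1" | grep -q .
}

# Usage: check_upmap_file out.txt && source out.txt
```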

Note

The first time you run the ceph osd pg-upmap-primary command, you might get an error such as the following:

```
Error EPERM: min_compat_client luminous < reef, which is required for pg-upmap-primary. Try 'ceph osd set-require-min-compat-client reef' before using the new interface
```

In this case, run the recommended command, ceph osd set-require-min-compat-client reef, to adjust your cluster’s min-compat-client setting.

Note

Consider rechecking the scores and rerunning the balancer if the number of placement groups (PGs) changes or if any OSDs are added to or removed from the cluster, because these operations can considerably change the effect of the read balancer on a pool.
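To recheck the scores, you can read the read_balance_score field from the pool listing. The following sketch assumes that ceph osd pool ls detail prints pool lines like the ones shown earlier in this section, with the pool name in single quotes and a read_balance_score key followed by its value; the function name pool_read_scores is hypothetical:

```shell
# Print "<pool name> <read_balance_score>" for every pool that reports a score.
pool_read_scores() {
    ceph osd pool ls detail | awk '
        $1 == "pool" {
            for (i = 4; i < NF; i++)
                if ($i == "read_balance_score") {
                    name = $3
                    gsub(/\047/, "", name)   # strip the quotes around the name
                    print name, $(i + 1)
                }
        }'
}

# Usage: pool_read_scores
```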

5.4.2.3. Supervised optimization

The read balancer can also be used with supervised optimization. If using supervised optimization, use the information detailed in Balancing a Red Hat Ceph cluster using capacity balancer. Set the mode to either upmap-read or read.

5.5. Using the Ceph Manager alerts module

You can use the Ceph Manager alerts module to send simple alert messages about the Red Hat Ceph Storage cluster’s health by email.

Note

This module is not intended to be a robust monitoring solution. Because it runs as part of the Ceph cluster itself, it is fundamentally limited: a failure of the ceph-mgr daemon prevents alerts from being sent. This module can, however, be useful for standalone clusters in environments that have no existing monitoring infrastructure.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to the Ceph Monitor node.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

Enable the alerts module:

Example

```
[ceph: root@host01 /]# ceph mgr module enable alerts
```

Ensure the alerts module is enabled:

Example

```
[ceph: root@host01 /]# ceph mgr module ls | more
{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator",
        "pg_autoscaler",
        "progress",
        "rbd_support",
        "status",
        "telemetry",
        "volumes"
    ],
    "enabled_modules": [
        "alerts",
        "cephadm",
        "dashboard",
        "iostat",
        "nfs",
        "prometheus",
        "restful"
    ]
```

Configure the Simple Mail Transfer Protocol (SMTP):

Syntax

```
ceph config set mgr mgr/alerts/smtp_host SMTP_SERVER
ceph config set mgr mgr/alerts/smtp_destination RECEIVER_EMAIL_ADDRESS
ceph config set mgr mgr/alerts/smtp_sender SENDER_EMAIL_ADDRESS
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_host smtp.example.com
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_destination example@example.com
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_sender example2@example.com
```

Optional: By default, the alerts module uses SSL and port 465. To change the port, set the smtp_port parameter:
Syntax
```
ceph config set mgr mgr/alerts/smtp_port PORT_NUMBER
```
Example
```
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_port 587
```
Do not set the smtp_ssl parameter while configuring alerts.

Authenticate to the SMTP server:

Syntax

```
ceph config set mgr mgr/alerts/smtp_user USERNAME
ceph config set mgr mgr/alerts/smtp_password PASSWORD
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_user admin1234
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_password admin1234
```

Optional: By default, the SMTP From name is Ceph. To change it, set the smtp_from_name parameter:

Syntax

```
ceph config set mgr mgr/alerts/smtp_from_name CLUSTER_NAME
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/smtp_from_name 'Ceph Cluster Test'
```

Optional: By default, the alerts module checks the storage cluster’s health every minute, and sends a message when there is a change in the cluster health status. To change the frequency, set the interval parameter:
Syntax
```
ceph config set mgr mgr/alerts/interval INTERVAL
```
Example
```
[ceph: root@host01 /]# ceph config set mgr mgr/alerts/interval "5m"
```
In this example, the interval is set to 5 minutes.
Optional: Send an alert immediately:
Example
```
[ceph: root@host01 /]# ceph alerts send
```
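Before relying on the alerts, you might want to review the values that are set. A minimal sketch that reads back the options used in this procedure with ceph config get; the function name show_alerts_config is hypothetical. The password option is left out so that it is not printed to the terminal:

```shell
# Print each alerts option from this procedure together with its current value.
show_alerts_config() {
    for key in smtp_host smtp_port smtp_destination smtp_sender \
               smtp_user smtp_from_name interval; do
        printf '%s = %s\n' "$key" "$(ceph config get mgr "mgr/alerts/$key")"
    done
}

# Usage: show_alerts_config
```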

5.6. Using the Ceph manager crash module

Using the Ceph manager crash module, you can collect information about daemon crashdumps and store it in the Red Hat Ceph Storage cluster for further analysis.

By default, daemon crashdumps are written to /var/lib/ceph/crash. You can change this location with the crash dir option. Crash directories are named by the time and date of the crash and a randomly generated UUID. Each directory contains a metadata file, meta, and a recent log file, with a crash_id that matches the directory name.
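Since a crash ID combines a timestamp and a UUID, the two parts can be separated with shell parameter expansion. The sketch below assumes the TIMESTAMP_UUID form used by the crash IDs in this section; the function names are hypothetical:

```shell
# Text before the first underscore: the timestamp.
crash_timestamp() { printf '%s\n' "${1%%_*}"; }
# Text after the first underscore: the UUID.
crash_uuid() { printf '%s\n' "${1#*_}"; }

id=2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
crash_timestamp "$id"   # 2022-05-24T19:58:42.549073Z
crash_uuid "$id"        # b2382865-ea89-4be2-b46f-9a59af7b7a2d
```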

You can use ceph-crash.service to submit these crashes automatically and persist them in the Ceph Monitors. The ceph-crash.service watches the crashdump directory and uploads new crash dumps with ceph crash post.

The RECENT_CRASH health message is one of the most common health messages in a Ceph cluster. This health message means that one or more Ceph daemons have crashed recently, and the crash has not yet been archived or acknowledged by the administrator. This might indicate a software bug, a hardware problem such as a failing disk, or some other problem. The option mgr/crash/warn_recent_interval controls the time period that counts as recent, which is two weeks by default. You can disable the warnings by running the following command:

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/crash/warn_recent_interval 0
```

The option mgr/crash/retain_interval controls the period for which you want to retain the crash reports before they are automatically purged. The default for this option is one year.
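Both intervals are plain numbers. Assuming, as in upstream Ceph, that these options are expressed in seconds, shell arithmetic gives the value for a period in days. The helper days_to_seconds is hypothetical, and the seven-day setting is only an illustration:

```shell
# Convert a number of days to seconds.
days_to_seconds() { echo $(( $1 * 24 * 60 * 60 )); }

days_to_seconds 14    # 1209600, the two-week default for warn_recent_interval
days_to_seconds 365   # 31536000, the one-year default for retain_interval

# Example: treat only crashes from the last 7 days as recent
# ceph config set mgr mgr/crash/warn_recent_interval "$(days_to_seconds 7)"
```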

Prerequisites

A running Red Hat Ceph Storage cluster.

Procedure

Ensure the crash module is enabled:

Example

```
[ceph: root@host01 /]# ceph mgr module ls | more
{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator_cli",
        "progress",
        "rbd_support",
        "status",
        "volumes"
    ],
    "enabled_modules": [
        "dashboard",
        "pg_autoscaler",
        "prometheus"
    ]
```

Save a crash dump. The metadata file is a JSON blob stored in the crash directory as meta. You can invoke the ceph command with the -i - option, which reads from stdin:
Example
```
[ceph: root@host01 /]# ceph crash post -i meta
```
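To submit several crash directories by hand, the same command can be run in a loop. This is only a sketch of the idea behind ceph-crash.service, not its implementation; the function name post_crash_dirs is hypothetical, and the directory argument should point at your crash directory:

```shell
# Post the meta file of every crash directory directly under the given path.
post_crash_dirs() {
    dir=${1:-/var/lib/ceph/crash}
    for meta in "$dir"/*/meta; do
        [ -f "$meta" ] || continue   # skip when nothing matches the glob
        ceph crash post -i "$meta"
    done
}

# Usage: post_crash_dirs /var/lib/ceph/crash
```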
List the timestamp or the UUID crash IDs for all the new and archived crash info:
Example
```
[ceph: root@host01 /]# ceph crash ls
```
List the timestamp or the UUID crash IDs for all the new crash information:
Example
```
[ceph: root@host01 /]# ceph crash ls-new
```

List the summary of saved crash information grouped by age:

Example

```
[ceph: root@host01 /]# ceph crash stat
8 crashes recorded
8 older than 1 days old:
2022-05-20T08:30:14.533316Z_4ea88673-8db6-4959-a8c6-0eea22d305c2
2022-05-20T08:30:14.590789Z_30a8bb92-2147-4e0f-a58b-a12c2c73d4f5
2022-05-20T08:34:42.278648Z_6a91a778-bce6-4ef3-a3fb-84c4276c8297
2022-05-20T08:34:42.801268Z_e5f25c74-c381-46b1-bee3-63d891f9fc2d
2022-05-20T08:34:42.803141Z_96adfc59-be3a-4a38-9981-e71ad3d55e47
2022-05-20T08:34:42.830416Z_e45ed474-550c-44b3-b9bb-283e3f4cc1fe
2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
2022-05-24T19:58:44.315282Z_1847afbc-f8a9-45da-94e8-5aef0738954e
```

View the details of the saved crash:

Syntax

```
ceph crash info CRASH_ID
```

Example

```
[ceph: root@host01 /]# ceph crash info 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
{
    "assert_condition": "session_map.sessions.empty()",
    "assert_file": "/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc",
    "assert_func": "virtual Monitor::~Monitor()",
    "assert_line": 287,
    "assert_msg": "/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc: In function 'virtual Monitor::~Monitor()' thread 7f67a1aeb700 time 2022-05-24T19:58:42.545485+0000\n/builddir/build/BUILD/ceph-16.1.0-486-g324d7073/src/mon/Monitor.cc: 287: FAILED ceph_assert(session_map.sessions.empty())\n",
    "assert_thread_name": "ceph-mon",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12b30) [0x7f679678bb30]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f6798c8d37b]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x276544) [0x7f6798c8d544]",
        "(Monitor::~Monitor()+0xe30) [0x561152ed3c80]",
        "(Monitor::~Monitor()+0xd) [0x561152ed3cdd]",
        "main()",
        "__libc_start_main()",
        "_start()"
    ],
    "ceph_version": "16.2.8-65.el8cp",
    "crash_id": "2022-07-06T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d",
    "entity_name": "mon.ceph-adm4",
    "os_id": "rhel",
    "os_name": "Red Hat Enterprise Linux",
    "os_version": "8.5 (Ootpa)",
    "os_version_id": "8.5",
    "process_name": "ceph-mon",
    "stack_sig": "957c21d558d0cba4cee9e8aaf9227b3b1b09738b8a4d2c9f4dc26d9233b0d511",
    "timestamp": "2022-07-06T19:58:42.549073Z",
    "utsname_hostname": "host02",
    "utsname_machine": "x86_64",
    "utsname_release": "4.18.0-240.15.1.el8_3.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Wed Jul 06 03:12:15 EDT 2022"
}
```

Remove saved crashes that are older than KEEP days, where KEEP must be an integer:
Syntax
```
ceph crash prune KEEP
```
Example
```
[ceph: root@host01 /]# ceph crash prune 60
```
Archive a crash report so that it is no longer considered for the RECENT_CRASH health check and does not appear in the crash ls-new output. It still appears in the crash ls output.
Syntax
```
ceph crash archive CRASH_ID
```
Example
```
[ceph: root@host01 /]# ceph crash archive 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
```

Archive all crash reports:

Example

```
[ceph: root@host01 /]# ceph crash archive-all
```
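If you want to archive only the crashes of one daemon rather than all of them, you can filter the ceph crash ls-new listing. The sketch below assumes that the listing has a header line followed by rows whose first column is the crash ID and whose second column is the entity name; check this against the output of your version. The function name archive_crashes_for is hypothetical:

```shell
# Archive every new crash that was reported by the given entity, for example mon.a.
archive_crashes_for() {
    entity=$1
    ceph crash ls-new |
        awk -v e="$entity" 'NR > 1 && $2 == e { print $1 }' |
        while read -r id; do
            # Redirect stdin so the command cannot consume the ID list.
            ceph crash archive "$id" < /dev/null
        done
}

# Usage: archive_crashes_for mon.ceph-adm4
```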

Remove the crash dump:

Syntax

```
ceph crash rm CRASH_ID
```

Example

```
[ceph: root@host01 /]# ceph crash rm 2022-05-24T19:58:42.549073Z_b2382865-ea89-4be2-b46f-9a59af7b7a2d
```

5.7. Telemetry module

The telemetry module sends data about the storage cluster to help understand how Ceph is used and what problems are encountered during operations. The data is visualized on the public dashboard to view the summary statistics on how many clusters are reporting, their total capacity and OSD count, and version distribution trends.

Channels

The telemetry report is broken down into different channels, each with a different type of information. After the telemetry is enabled, you can turn on or turn off the individual channels.

The following are the five different channels:

basic - The default is on. This channel provides the basic information about the clusters, which includes the following information:
- The capacity of the cluster.
- The number of monitors, managers, OSDs, MDSs, object gateways, or other daemons.
- The software version that is currently being used.
- The number and types of RADOS pools and Ceph File Systems.
- The names of configuration options that are changed from their default (but not their values).
crash - The default is on. This channel provides information about the daemon crashes, which includes the following information:
- The type of daemon.
- The version of the daemon.
- The operating system, the OS distribution, and the kernel version.
- The stack trace that identifies where in the Ceph code the crash occurred.
device - The default is on. This channel provides information about the device metrics, which includes anonymized SMART metrics.
ident - The default is off. This channel provides the user-provided identifying information about the cluster such as cluster description, and contact email address.
perf - The default is off. This channel provides the various performance metrics of the cluster, which can be used for the following:
- Reveal overall cluster health.
- Identify workload patterns.
- Troubleshoot issues with latency, throttling, memory management, and other similar issues.
- Monitor cluster performance by daemon.

The data that is reported does not contain any sensitive data such as pool names, object names, object contents, hostnames, or device serial numbers.

It contains counters and statistics on how the cluster is deployed, Ceph version, host distribution, and other parameters that help the project to gain a better understanding of the way Ceph is used.

Data is secure and is sent to https://telemetry.ceph.com.

Enable telemetry

Before enabling channels, ensure that the telemetry is on.

Enable telemetry:
```
ceph telemetry on
```

Enable and disable channels

Enable or disable individual channels:

```
ceph telemetry enable channel basic
ceph telemetry enable channel crash
ceph telemetry enable channel device
ceph telemetry enable channel ident
ceph telemetry enable channel perf

ceph telemetry disable channel basic
ceph telemetry disable channel crash
ceph telemetry disable channel device
ceph telemetry disable channel ident
ceph telemetry disable channel perf
```

Enable or disable multiple channels:

```
ceph telemetry enable channel basic crash device ident perf
ceph telemetry disable channel basic crash device ident perf
```

Enable or disable all channels together:

```
ceph telemetry enable channel all
ceph telemetry disable channel all
```

Sample report

To review the data reported at any time, generate a sample report:
```
ceph telemetry show
```
If telemetry is off, preview the sample report:
```
ceph telemetry preview
```
It takes longer to generate a sample report for storage clusters with hundreds of OSDs or more.
To protect your privacy, device reports are generated separately, and data such as hostname and device serial number are anonymized. The device telemetry is sent to a different endpoint and does not associate the device data with a particular cluster. To see the device report, run the following command:
```
ceph telemetry show-device
```
If telemetry is off, preview the sample device report:
```
ceph telemetry preview-device
```
Get a single output of both the reports with telemetry on:
```
ceph telemetry show-all
```
Get a single output of both the reports with telemetry off:
```
ceph telemetry preview-all
```
Generate a sample report by channel:
Syntax
```
ceph telemetry show CHANNEL_NAME
```
Generate a preview of the sample report by channel:
Syntax
```
ceph telemetry preview CHANNEL_NAME
```
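To review what each channel would send before you enable it, the per-channel preview can be run in a loop. A minimal sketch that uses the channel names listed earlier in this section; the function name preview_channels is hypothetical:

```shell
# Print a labelled preview of the sample report for every channel.
preview_channels() {
    for channel in basic crash device ident perf; do
        echo "== $channel =="
        ceph telemetry preview "$channel"
    done
}

# Usage: preview_channels | less
```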

Collections

Collections are different aspects of data that is collected within a channel.

List the collections:
```
ceph telemetry collection ls
```
See the difference between the collections that you are enrolled in, and the new, available collections:
```
ceph telemetry diff
```

Enroll in the most recent collections:

Syntax

```
ceph telemetry on
ceph telemetry enable channel CHANNEL_NAME
```

Interval

The module compiles and sends a new report every 24 hours by default.

Adjust the interval:

Syntax

```
ceph config set mgr mgr/telemetry/interval INTERVAL
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/interval 72
```

In the example, the report is generated every three days (72 hours).

Status

View the current configuration:
```
ceph telemetry status
```

Manually sending telemetry

Send telemetry data on an ad hoc basis:
```
ceph telemetry send
```
If telemetry is disabled, add --license sharing-1-0 to the ceph telemetry send command.

Sending telemetry through a proxy

If the cluster cannot connect directly to the configured telemetry endpoint, you can configure an HTTP or HTTPS proxy server:

Syntax

```
ceph config set mgr mgr/telemetry/proxy PROXY_URL
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/proxy https://10.0.0.1:8080
```

You can include a user name and password in the proxy URL:

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/proxy https://ceph:telemetry@10.0.0.1:8080
```

Contact and description

Optional: Add a contact and description to the report:

Syntax

```
ceph config set mgr mgr/telemetry/contact 'CONTACT_NAME'
ceph config set mgr mgr/telemetry/description 'DESCRIPTION'
ceph config set mgr mgr/telemetry/channel_ident true
```

Example

```
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/contact 'John Doe <john.doe@example.com>'
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/description 'My first Ceph cluster'
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/channel_ident true
```

If the ident flag is enabled, its details are not displayed in the leaderboard.

Leaderboard

Participate in a leaderboard on the public dashboard:
Example
```
[ceph: root@host01 /]# ceph config set mgr mgr/telemetry/leaderboard true
```
The leaderboard displays basic information about the storage cluster. This board includes the total storage capacity and the number of OSDs.

Disable telemetry

Disable telemetry any time:
Example
```
ceph telemetry off
```
