Red Hat Ceph Storage 8: File System Guide

Chapter 2. The Ceph File System Metadata Server

As a storage administrator, you can learn about the different states of the Ceph File System (CephFS) Metadata Server (MDS), CephFS MDS ranking mechanics, how to configure the MDS standby daemon, and cache size limits. Knowing these concepts helps you configure the MDS daemons for a storage environment.

Prerequisites

A running and healthy Red Hat Ceph Storage cluster.
Installation of the Ceph Metadata Server daemons (ceph-mds). See the Management of MDS service using the Ceph Orchestrator section in the Red Hat Ceph Storage File System Guide for details on configuring MDS daemons.

2.1. Metadata Server daemon states

The Metadata Server (MDS) daemons operate in two states:

Active: manages metadata for files and directories stored on the Ceph File System.
Standby: serves as a backup, and becomes active when an active MDS daemon becomes unresponsive.

By default, a Ceph File System uses only one active MDS daemon. However, systems with many clients benefit from multiple active MDS daemons.

You can configure the file system to use multiple active MDS daemons so that you can scale metadata performance for larger workloads. The active MDS daemons dynamically share the metadata workload when metadata load patterns change. Note that systems with multiple active MDS daemons still require standby MDS daemons to remain highly available.

What Happens When the Active MDS Daemon Fails

When the active MDS becomes unresponsive, a Ceph Monitor daemon waits a number of seconds equal to the value specified in the mds_beacon_grace option. If the active MDS is still unresponsive after the specified time period has passed, the Ceph Monitor marks the MDS daemon as laggy. One of the standby daemons becomes active, depending on the configuration.

Note

To change the value of mds_beacon_grace, add this option to the Ceph configuration file and specify the new value.
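
The grace-period check described above can be sketched as a small shell function. This is a simplified model, not the Ceph Monitor's implementation: the function name `mds_beacon_status` and the use of whole-second timestamps are assumptions made for illustration.

```shell
# Simplified model of the grace-period check; not the Ceph Monitor's code.
# Args: time of the last beacon, current time (both in seconds),
#       and the mds_beacon_grace value in seconds.
# Prints "laggy" once more than the grace period has passed without a
# beacon, otherwise "ok".
mds_beacon_status() {
    last_beacon=$1
    now=$2
    grace=$3
    if [ $(( now - last_beacon )) -gt "$grace" ]; then
        echo "laggy"
    else
        echo "ok"
    fi
}
```

For example, `mds_beacon_status 100 120 15` models a daemon whose last beacon arrived 20 seconds ago with a grace period of 15 seconds, and prints `laggy`. The value 15 is only an illustrative grace period.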

2.2. Metadata Server ranks

Each Ceph File System (CephFS) has a number of ranks, one by default. Rank numbering starts at zero.

Ranks define how the metadata workload is shared between multiple Metadata Server (MDS) daemons. The number of ranks is the maximum number of MDS daemons that can be active at one time. Each MDS daemon handles a subset of the CephFS metadata that is assigned to that rank.

Each MDS daemon initially starts without a rank. The Ceph Monitor assigns a rank to the daemon. The MDS daemon can only hold one rank at a time. Daemons only lose ranks when they are stopped.

The max_mds setting controls how many ranks will be created.

The actual number of ranks in the CephFS is only increased if a spare daemon is available to accept the new rank.
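
As a rough illustration of the two statements above, the number of ranks that actually exist can be modelled as the smaller of max_mds and the number of daemons able to hold a rank. The function name `effective_ranks` is hypothetical, and the model deliberately ignores standby requirements and failure handling.

```shell
# Simplified model: a rank is only created when a daemon can hold it.
# Args: the max_mds setting, and the number of running MDS daemons
#       that are available to hold a rank.
# Prints the number of ranks that can actually be in use.
effective_ranks() {
    max_mds=$1
    daemons=$2
    if [ "$daemons" -lt "$max_mds" ]; then
        echo "$daemons"
    else
        echo "$max_mds"
    fi
}
```

With `max_mds` set to 3 but only two daemons available, `effective_ranks 3 2` prints `2`: the third rank is not created until a spare daemon appears.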

Rank States

Ranks can be:

Up - A rank that is assigned to the MDS daemon.
Failed - A rank that is not associated with any MDS daemon.
Damaged - A rank that is damaged; its metadata is corrupted or missing. Damaged ranks are not assigned to any MDS daemons until the operator fixes the problem, and uses the ceph mds repaired command on the damaged rank.

2.3. Metadata Server cache size limits

You can limit the size of the Ceph File System (CephFS) Metadata Server (MDS) cache by:

A memory limit: Use the mds_cache_memory_limit option. Red Hat recommends a value between 8 GB and 64 GB for mds_cache_memory_limit. Setting more cache can cause issues with recovery. This limit is approximately 66% of the desired maximum memory use of the MDS.
Note
The default value for mds_cache_memory_limit is 4 GB. Since the default value is outside the recommended range, Red Hat recommends setting the value within the mentioned range.
Important
Red Hat recommends using memory limits instead of inode count limits.
Inode count: Use the mds_cache_size option. By default, limiting the MDS cache by inode count is disabled.

In addition, you can specify a cache reservation by using the mds_cache_reservation option for MDS operations. The cache reservation is limited as a percentage of the memory or inode limit and is set to 5% by default. The intent of this parameter is to have the MDS maintain an extra reserve of memory for its cache for new metadata operations to use. As a consequence, the MDS should in general operate below its memory limit because it will recall old state from clients to drop unused metadata in its cache.

The mds_cache_reservation option replaces the mds_health_cache_threshold option in all situations, except when MDS nodes send a health alert to the Ceph Monitors indicating the cache is too large. By default, mds_health_cache_threshold is 150% of the maximum cache size.

Be aware that the cache limit is not a hard limit. Potential bugs in the CephFS client or MDS or misbehaving applications might cause the MDS to exceed its cache size. The mds_health_cache_threshold option configures the storage cluster health warning message, so that operators can investigate why the MDS cannot shrink its cache.
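
The percentages in this section can be turned into concrete numbers with a little shell arithmetic. The sketch below is illustrative only: the function names are hypothetical, sizes are treated as GiB (1024^3 bytes) as an assumption, and integer division rounds down.

```shell
# Args: desired maximum memory use of one MDS, in GiB (assumed unit).
# Prints a candidate mds_cache_memory_limit in bytes, about 66% of the input.
cache_limit_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 * 66 / 100 ))
}

# Args: cache limit in bytes.
# Prints the size of the default 5% cache reservation in bytes.
cache_reservation_bytes() {
    echo $(( $1 * 5 / 100 ))
}

# Args: current cache size and cache limit, both in bytes.
# Prints "warn" when the cache exceeds 150% of the limit (the default
# mds_health_cache_threshold described above), otherwise "ok".
cache_health() {
    if [ $(( $1 * 100 )) -gt $(( $2 * 150 )) ]; then
        echo "warn"
    else
        echo "ok"
    fi
}
```

For an MDS that should use at most 16 GiB of memory, `cache_limit_bytes 16` prints `11338713661`, which falls inside the recommended range. A value computed this way could then be applied with `ceph config set mds mds_cache_memory_limit 11338713661`.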

Additional Resources

See the Metadata Server daemon configuration reference section in the Red Hat Ceph Storage File System Guide for more information.

2.4. File system affinity

You can configure a Ceph File System (CephFS) to prefer a particular Ceph Metadata Server (MDS) over another Ceph MDS. For example, you might have MDS daemons running on newer, faster hardware that you want to prefer over a standby MDS running on older, possibly slower hardware. You can specify this preference by setting the mds_join_fs option, which enforces this file system affinity. Ceph Monitors give preference to MDS standby daemons whose mds_join_fs value equals the name of the file system with the failed rank. The standby-replay daemons are selected before choosing another standby daemon. If no standby daemon exists with the mds_join_fs option, then the Ceph Monitors choose an ordinary standby for replacement, or any other available standby as a last resort. The Ceph Monitors periodically examine Ceph File Systems to see if a standby with a stronger affinity is available to replace a Ceph MDS that has a lower affinity.

Additional Resources

See the Configuring file system affinity section in the Red Hat Ceph Storage File System Guide for details.

2.5. Management of MDS service using the Ceph Orchestrator

As a storage administrator, you can use the Ceph Orchestrator with Cephadm in the backend to deploy the MDS service. By default, a Ceph File System (CephFS) uses only one active MDS daemon. However, systems with many clients benefit from multiple active MDS daemons.

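This section covers the following administrative tasks:

Deploying the MDS service using the command line interface.
Deploying the MDS service using the service specification.
Removing the MDS service using the Ceph Orchestrator.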

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to all the nodes.
Hosts are added to the cluster.
All manager, monitor, and OSD daemons are deployed.

2.5.1. Deploying the MDS service using the command line interface

Using the Ceph Orchestrator, you can deploy the Metadata Server (MDS) service using the placement specification in the command line interface. Ceph File System (CephFS) requires one or more MDS daemons.

Note

Ensure you have at least two pools, one for Ceph file system (CephFS) data and one for CephFS metadata.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager, monitor, and OSD daemons are deployed.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
There are two ways of deploying MDS daemons using placement specification:

Method 1

Use ceph fs volume to create the MDS daemons. This creates the CephFS volume and pools associated with the CephFS, and also starts the MDS service on the hosts.

Syntax

ceph fs volume create FILESYSTEM_NAME --placement="NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"

Note

By default, replicated pools are created for this command.

Example

[ceph: root@host01 /]# ceph fs volume create test --placement="2 host01 host02"

Method 2

Create the pools, CephFS, and then deploy MDS service using placement specification:
1. Create the pools for CephFS:
  Syntax
  ceph osd pool create DATA_POOL [PG_NUM]
  ceph osd pool create METADATA_POOL [PG_NUM]
  Example
  [ceph: root@host01 /]# ceph osd pool create cephfs_data 64
  [ceph: root@host01 /]# ceph osd pool create cephfs_metadata 64
  Typically, the metadata pool can start with a conservative number of placement groups (PGs) because it generally has far fewer objects than the data pool. You can increase the number of PGs if needed. The pool sizes range from 64 PGs to 512 PGs. Size the data pool in proportion to the number and sizes of files you expect in the file system.
  Important
  For the metadata pool, consider using:
  A higher replication level, because any data loss in this pool can make the whole file system inaccessible.
  Storage with lower latency, such as solid-state drive (SSD) disks, because this directly affects the observed latency of file system operations on clients.
2. Create the file system for the data pools and metadata pools:
  Syntax
  ceph fs new FILESYSTEM_NAME METADATA_POOL DATA_POOL
  Example
  [ceph: root@host01 /]# ceph fs new test cephfs_metadata cephfs_data
3. Deploy the MDS service using the ceph orch apply command:
  Syntax
  ceph orch apply mds FILESYSTEM_NAME --placement="NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"
  Example
  [ceph: root@host01 /]# ceph orch apply mds test --placement="2 host01 host02"

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls
```

Check the CephFS status:

Example

[ceph: root@host01 /]# ceph fs ls
[ceph: root@host01 /]# ceph fs status

List the hosts, daemons, and processes:

Syntax

ceph orch ps --daemon_type=DAEMON_NAME

Example

[ceph: root@host01 /]# ceph orch ps --daemon_type=mds

Additional Resources

See the Red Hat Ceph Storage File System Guide for more information about creating the Ceph File System (CephFS).
For information on setting the pool values, see Setting number of placement groups in a pool.

2.5.2. Deploying the MDS service using the service specification

Using the Ceph Orchestrator, you can deploy the MDS service using the service specification.

Note

Ensure you have at least two pools, one for the Ceph File System (CephFS) data and one for the CephFS metadata.

Prerequisites

A running Red Hat Ceph Storage cluster.
Hosts are added to the cluster.
All manager, monitor, and OSD daemons are deployed.

Procedure

Create the mds.yaml file:
Example
```
[root@host01 ~]# touch mds.yaml
```

Edit the mds.yaml file to include the following details:

Syntax

service_type: mds
service_id: FILESYSTEM_NAME
placement:
  hosts:
  - HOST_NAME_1
  - HOST_NAME_2
  - HOST_NAME_3

Example

service_type: mds
service_id: fs_name
placement:
  hosts:
  - host01
  - host02
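
The specification file can also be generated rather than edited by hand. The following is a minimal sketch, assuming the same layout as the example above; the function name `mds_spec` is hypothetical and not part of any Ceph tooling.

```shell
# Args: file system name, followed by one or more host names.
# Prints an MDS service specification in the layout shown above.
mds_spec() {
    fs_name=$1
    shift
    printf 'service_type: mds\n'
    printf 'service_id: %s\n' "$fs_name"
    printf 'placement:\n'
    printf '  hosts:\n'
    # One list entry per remaining argument.
    for host in "$@"; do
        printf '  - %s\n' "$host"
    done
}
```

Running `mds_spec fs_name host01 host02 > mds.yaml` would write a file with the same content as the example.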

Mount the YAML file under a directory in the container:

Example

[root@host01 ~]# cephadm shell --mount mds.yaml:/var/lib/ceph/mds/mds.yaml

Navigate to the directory:

Example

[ceph: root@host01 /]# cd /var/lib/ceph/mds/

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
Navigate to the following directory:
Example
```
[ceph: root@host01 /]# cd /var/lib/ceph/mds/
```

Deploy MDS service using service specification:

Syntax

ceph orch apply -i FILE_NAME.yaml

Example

[ceph: root@host01 mds]# ceph orch apply -i mds.yaml

Once the MDS service is deployed and functional, create the CephFS:

Syntax

ceph fs new CEPHFS_NAME METADATA_POOL DATA_POOL

Example

[ceph: root@host01 /]# ceph fs new test metadata_pool data_pool

Verification

List the service:
Example
```
[ceph: root@host01 /]# ceph orch ls
```

List the hosts, daemons, and processes:

Syntax

ceph orch ps --daemon_type=DAEMON_NAME

Example

[ceph: root@host01 /]# ceph orch ps --daemon_type=mds

Additional Resources

See the Red Hat Ceph Storage File System Guide for more information about creating the Ceph File System (CephFS).

2.5.3. Removing the MDS service using the Ceph Orchestrator

You can remove the service using the ceph orch rm command. Alternatively, you can remove the file system and the associated pools.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to all the nodes.
Hosts are added to the cluster.
At least one MDS daemon deployed on the hosts.

Procedure

There are two ways of removing MDS daemons from the cluster:

Method 1

Remove the CephFS volume, associated pools, and the services:
1. Log into the Cephadm shell:
  Example
  [root@host01 ~]# cephadm shell
2. Set the configuration parameter mon_allow_pool_delete to true:
  Example
  [ceph: root@host01 /]# ceph config set mon mon_allow_pool_delete true
3. Remove the file system:
  Syntax
  ceph fs volume rm FILESYSTEM_NAME --yes-i-really-mean-it
  Example
  [ceph: root@host01 /]# ceph fs volume rm cephfs-new --yes-i-really-mean-it
  This command removes the file system and its data and metadata pools. It also tries to remove the MDS using the enabled ceph-mgr Orchestrator module.

Method 2

Use the ceph orch rm command to remove the MDS service from the entire cluster:
1. List the service:
  Example
  [ceph: root@host01 /]# ceph orch ls
2. Remove the service:
  Syntax
  ceph orch rm SERVICE_NAME
  Example
  [ceph: root@host01 /]# ceph orch rm mds.test

Verification

List the hosts, daemons, and processes:
Syntax
```
ceph orch ps
```
Example
```
[ceph: root@host01 /]# ceph orch ps
```

Additional Resources

See the Deploying the MDS service using the command line interface section in the Red Hat Ceph Storage Operations Guide for more information.
See the Deploying the MDS service using the service specification section in the Red Hat Ceph Storage Operations Guide for more information.

2.6. Configuring file system affinity

Set the Ceph File System (CephFS) affinity for a particular Ceph Metadata Server (MDS).

Prerequisites

A healthy and running Ceph File System.
Root-level access to a Ceph Monitor node.

Procedure

Check the current state of a Ceph File System:

Example

[root@mon ~]# ceph fs dump
dumped fsmap epoch 399
...
Filesystem 'cephfs01' (27)
...
e399
max_mds 1
in      0
up      {0=20384}
failed
damaged
stopped
...
[mds.a{0:20384} state up:active seq 239 addr [v2:127.0.0.1:6854/966242805,v1:127.0.0.1:6855/966242805]]

Standby daemons:

[mds.b{-1:10420} state up:standby seq 2 addr [v2:127.0.0.1:6856/2745199145,v1:127.0.0.1:6857/2745199145]]

Set the file system affinity:

Syntax

ceph config set STANDBY_DAEMON mds_join_fs FILE_SYSTEM_NAME

Example

[root@mon ~]# ceph config set mds.b mds_join_fs cephfs01

After a Ceph MDS failover event, the file system favors the standby daemon for which the affinity is set.

Example

[root@mon ~]# ceph fs dump
dumped fsmap epoch 405
e405
...
Filesystem 'cephfs01' (27)
...
max_mds 1
in      0
up      {0=10420}
failed
damaged
stopped
...
[mds.b{0:10420} state up:active seq 274 join_fscid=27 addr [v2:127.0.0.1:6856/2745199145,v1:127.0.0.1:6857/2745199145]]



Standby daemons:

[mds.a{-1:10720} state up:standby seq 2 addr [v2:127.0.0.1:6854/1340357658,v1:127.0.0.1:6855/1340357658]]

The mds.b daemon now has join_fscid=27 in the file system dump output.
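
To pick the affinity information out of a longer dump, the daemon lines can be filtered with sed. This is a sketch under the assumption that daemon lines look like the example output above; the layout may differ between releases, and the function name `list_join_fscid` is hypothetical.

```shell
# Reads `ceph fs dump` text on stdin and prints "DAEMON FSCID" for every
# daemon line that carries a join_fscid field. Lines without the field
# produce no output. The line layout is assumed from the example output.
list_join_fscid() {
    sed -n 's/^\[\(mds\.[^{]*\){.*join_fscid=\([0-9][0-9]*\).*/\1 \2/p'
}
```

On a node with access to the cluster, `ceph fs dump | list_join_fscid` would print `mds.b 27` for the example output shown in this procedure.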

Important

If a file system is in a degraded or undersized state, then no failover will occur to enforce the file system affinity.

Additional Resources

See the File system affinity section in the Red Hat Ceph Storage File System Guide for more details.

2.7. Configuring multiple active Metadata Server daemons

Configure multiple active Metadata Server (MDS) daemons to scale metadata performance for large systems.

Important

Do not convert all standby MDS daemons to active ones. A Ceph File System (CephFS) requires at least one standby MDS daemon to remain highly available.

Prerequisites

Ceph administration capabilities on the MDS node.
Root-level access to a Ceph Monitor node.

Procedure

Set the max_mds parameter to the desired number of active MDS daemons:
Syntax
```
ceph fs set NAME max_mds NUMBER
```
Example
```
[root@mon ~]# ceph fs set cephfs max_mds 2
```
This example increases the number of active MDS daemons to two in the CephFS called cephfs.
Note
Ceph only increases the actual number of ranks in the CephFS if a spare MDS daemon is available to take the new rank.

Verify the number of active MDS daemons:

Syntax

ceph fs status NAME

Example

[root@mon ~]# ceph fs status cephfs
cephfs - 0 clients
======
+------+--------+-------+---------------+-------+-------+--------+--------+
| RANK | STATE  |  MDS  |    ACTIVITY   |  DNS  |  INOS |  DIRS  |  CAPS  |
+------+--------+-------+---------------+-------+-------+--------+--------+
|  0   | active | node1 | Reqs:    0 /s |   10  |   12  |   12   |   0    |
|  1   | active | node2 | Reqs:    0 /s |   10  |   12  |   12   |   0    |
+------+--------+-------+---------------+-------+-------+--------+--------+
+-----------------+----------+-------+-------+
|       POOL      |   TYPE   |  USED | AVAIL |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 4638  | 26.7G |
|   cephfs_data   |   data   |    0  | 26.7G |
+-----------------+----------+-------+-------+

+-------------+
| STANDBY MDS |
+-------------+
|    node3    |
+-------------+

Additional Resources

See the Metadata Server daemon states section in the Red Hat Ceph Storage File System Guide for more details.
See the Decreasing the number of active Metadata Server daemons section in the Red Hat Ceph Storage File System Guide for more details.
See the Managing Ceph users section in the Red Hat Ceph Storage Administration Guide for more details.

2.8. Configuring the number of standby daemons

Each Ceph File System (CephFS) can specify the required number of standby daemons to be considered healthy. This number also includes the standby-replay daemon waiting for a rank failure.

Prerequisites

Root-level access to a Ceph Monitor node.

Procedure

Set the expected number of standby daemons for a particular CephFS:
Syntax
```
ceph fs set FS_NAME standby_count_wanted NUMBER
```
Note
Setting the NUMBER to zero disables the daemon health check.
Example
```
[root@mon ~]# ceph fs set cephfs standby_count_wanted 2
```
This example sets the expected standby daemon count to two.
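
The effect of this setting can be pictured with a small model. The sketch below is an illustration of the rule described in this section, not the check that Ceph itself performs; the function name `standby_check` and its output strings are hypothetical.

```shell
# Simplified model of the standby_count_wanted check.
# Args: number of standby daemons currently available,
#       and the standby_count_wanted value.
# Prints "disabled" when the wanted count is zero, "insufficient" when
# fewer standbys are available than wanted, otherwise "ok".
standby_check() {
    available=$1
    wanted=$2
    if [ "$wanted" -eq 0 ]; then
        echo "disabled"
    elif [ "$available" -lt "$wanted" ]; then
        echo "insufficient"
    else
        echo "ok"
    fi
}
```

With the setting from the example, a file system that has only one standby daemon available is modelled by `standby_check 1 2`, which prints `insufficient`.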

2.9. Configuring the standby-replay Metadata Server

Configure each Ceph File System (CephFS) by adding a standby-replay Metadata Server (MDS) daemon. Doing this reduces failover time if the active MDS becomes unavailable.

This specific standby-replay daemon follows the active MDS’s metadata journal. The standby-replay daemon is only used by the active MDS of the same rank, and is not available to other ranks.

Important

If using standby-replay, then every active MDS must have a standby-replay daemon.

Prerequisites

Root-level access to a Ceph Monitor node.

Procedure

Set the standby-replay for a particular CephFS:
Syntax
```
ceph fs set FS_NAME allow_standby_replay 1
```
Example
```
[root@mon ~]# ceph fs set cephfs allow_standby_replay 1
```
In this example, the Boolean value is 1, which enables the standby-replay daemons to be assigned to the active Ceph MDS daemons.

Additional Resources

See the Using the ceph mds fail command section in the Red Hat Ceph Storage File System Guide for details.

2.10. Ephemeral pinning policies

An ephemeral pin is a static partition of subtrees, and can be set with a policy using extended attributes. A policy can automatically set ephemeral pins on directories. When an ephemeral pin is set on a directory, the directory is automatically assigned to a particular rank so that pinned directories are uniformly distributed across all Ceph MDS ranks. The rank is determined by a consistent hash of the directory’s inode number. Ephemeral pins do not persist when the directory’s inode is dropped from the file system cache. When failing over a Ceph Metadata Server (MDS), the ephemeral pin is recorded in its journal so the Ceph MDS standby server does not lose this information. There are two types of policies for using ephemeral pins:

Note

The attr and jq packages must be installed as a prerequisite for the ephemeral pinning policies.

Distributed: This policy enforces that all of a directory’s immediate children must be ephemerally pinned. For example, use a distributed policy to spread a user’s home directory across the entire Ceph File System cluster. Enable this policy by setting the ceph.dir.pin.distributed extended attribute.

Syntax

setfattr -n ceph.dir.pin.distributed -v 1 DIRECTORY_PATH

Example

[root@host01 mount]# setfattr -n ceph.dir.pin.distributed -v 1 dir1/

Random: This policy enforces a chance that any descendant subdirectory might be ephemerally pinned. You can customize the percentage of directories that can be ephemerally pinned. Enable this policy by setting the ceph.dir.pin.random extended attribute to a percentage expressed as a decimal. Red Hat recommends setting this percentage to a value smaller than 1% (0.01). Having too many subtree partitions can cause slow performance. You can set the maximum percentage by setting the mds_export_ephemeral_random_max Ceph MDS configuration option. The parameters mds_export_ephemeral_distributed and mds_export_ephemeral_random are already enabled.

Syntax

setfattr -n ceph.dir.pin.random -v PERCENTAGE_IN_DECIMAL DIRECTORY_PATH

Example

[root@host01 mount]# setfattr -n ceph.dir.pin.random -v 0.01 dir1/

After enabling pinning, you can verify by running either of the following commands:

Syntax

getfattr -n ceph.dir.pin.random DIRECTORY_PATH
getfattr -n ceph.dir.pin.distributed DIRECTORY_PATH

Example

[root@host01 mount]# getfattr -n ceph.dir.pin.distributed dir1/
# file: dir1/
ceph.dir.pin.distributed="1"

[root@host01 mount]# getfattr -n ceph.dir.pin.random dir1/
# file: dir1/
ceph.dir.pin.random="0.01"

Example

[ceph: root@host01 /]# ceph tell mds.a get subtrees | jq '.[] | [.dir.path, .auth_first, .export_pin]'

If the directory is pinned, the value of export_pin is 0 if it is pinned to rank 0, 1 if it is pinned to rank 1, and so on. If the directory is not pinned, the value is -1.

To remove a partitioning policy, remove the extended attributes or set the value to 0.

Syntax

setfattr -n ceph.dir.pin.distributed -v 0 DIRECTORY_PATH

Example

[root@host01 mount]# setfattr -n ceph.dir.pin.distributed -v 0 dir1/

You can verify by running the following command:

Syntax

getfattr -n ceph.dir.pin.distributed DIRECTORY_PATH

Example

[root@host01 mount]# getfattr -n ceph.dir.pin.distributed dir1/

For export pins, remove the extended attribute or set the extended attribute to -1.

Syntax

setfattr -n ceph.dir.pin -v -1 DIRECTORY_PATH

Example

[root@host01 mount]# setfattr -n ceph.dir.pin -v -1 dir1/

Additional Resources

See the Manually pinning directory trees to a particular rank section in the Red Hat Ceph Storage File System Guide for details on manually setting pins.

2.11. Manually pinning directory trees to a particular rank

Sometimes it might be desirable to override the dynamic balancer with explicit mappings of metadata to a particular Ceph Metadata Server (MDS) rank. You can do this manually to evenly spread the load of an application or to limit the impact of users' metadata requests on the Ceph File System cluster. A manual pin is also known as an export pin, and you set it with the ceph.dir.pin extended attribute.

A directory’s export pin is inherited from its closest parent directory, but can be overridden by setting an export pin on that directory. Setting an export pin on a directory affects all of its sub-directories, for example:

[root@client ~]# mkdir -p a/b
[root@client ~]# setfattr -n ceph.dir.pin -v 1 a/
[root@client ~]# setfattr -n ceph.dir.pin -v 0 a/b

After the mkdir command, directories a/ and a/b both start without an export pin set.
After the first setfattr command, directories a/ and a/b are pinned to rank 1.
After the second setfattr command, directory a/b is pinned to rank 0, and directory a/ and the rest of its sub-directories are still pinned to rank 1.
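
The inheritance rule can be sketched as a lookup that walks up the path until it finds a directory with an explicit pin. This is an illustrative model only, not how the MDS resolves pins internally; the function name `effective_pin` and the "PATH RANK" input format are assumptions, and paths containing spaces are not handled.

```shell
# Args: $1 is a newline-separated list of "PATH RANK" pairs (explicit pins),
#       $2 is the directory to resolve.
# Prints the rank of the closest pinned ancestor (or of the directory
# itself), or -1 when no pin applies.
effective_pin() {
    pins=$1
    dir=${2%/}                      # drop one trailing slash, if present
    while :; do
        rank=$(printf '%s\n' "$pins" | awk -v d="$dir" '$1 == d { print $2; exit }')
        if [ -n "$rank" ]; then
            printf '%s\n' "$rank"
            return 0
        fi
        case $dir in
            */*) dir=${dir%/*} ;;   # move up to the parent directory
            *)   break ;;           # no parent left to check
        esac
    done
    printf '%s\n' "-1"
}
```

With the pins from the example (`a` pinned to rank 1 and `a/b` pinned to rank 0), resolving `a/b/c` prints `0`, resolving `a/x` prints `1`, and a directory outside `a/` prints `-1`.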

Prerequisites

A running Red Hat Ceph Storage cluster.
A running Ceph File System.
Root-level access to the CephFS client.
Installation of the attr package.

Procedure

Set the export pin on a directory:

Syntax

setfattr -n ceph.dir.pin -v RANK PATH_TO_DIRECTORY

Example

[root@client ~]# setfattr -n ceph.dir.pin -v 2 cephfs/home

Additional Resources

See the Ephemeral pinning policies section in the Red Hat Ceph Storage File System Guide for details on automatically setting pins.

2.12. Decreasing the number of active Metadata Server daemons

Decrease the number of active Ceph File System (CephFS) Metadata Server (MDS) daemons.

Prerequisites

The rank that you will remove must be active first, meaning that you must have the same number of MDS daemons as specified by the max_mds parameter.
Root-level access to a Ceph Monitor node.

Procedure

Verify that the number of active MDS daemons matches the number specified by the max_mds parameter:

Syntax

ceph fs status NAME

Example

ceph fs status cephfs
cephfs - 0 clients

+------+--------+-------+---------------+-------+-------+--------+--------+
| RANK | STATE  |  MDS  |    ACTIVITY   |  DNS  |  INOS |  DIRS  |  CAPS  |
+------+--------+-------+---------------+-------+-------+--------+--------+
|  0   | active | node1 | Reqs:    0 /s |   10  |   12  |   12   |   0    |
|  1   | active | node2 | Reqs:    0 /s |   10  |   12  |   12   |   0    |
+------+--------+-------+---------------+-------+-------+--------+--------+
+-----------------+----------+-------+-------+
|       POOL      |   TYPE   |  USED | AVAIL |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 4638  | 26.7G |
|   cephfs_data   |   data   |    0  | 26.7G |
+-----------------+----------+-------+-------+

+-------------+
| Standby MDS |
+-------------+
|    node3    |
+-------------+

On a node with administration capabilities, change the max_mds parameter to the desired number of active MDS daemons:
Syntax
```
ceph fs set NAME max_mds NUMBER
```
Example
```
ceph fs set cephfs max_mds 1
```
Wait for the storage cluster to stabilize to the new max_mds value by watching the Ceph File System status.

Verify the number of active MDS daemons:

Syntax

ceph fs status NAME

Example

ceph fs status cephfs
cephfs - 0 clients

+------+--------+-------+---------------+-------+-------+--------+--------+
| RANK | STATE  |  MDS  |    ACTIVITY   |  DNS  |  INOS |  DIRS  |  CAPS  |
+------+--------+-------+---------------+-------+-------+--------+--------+
|  0   | active | node1 | Reqs:    0 /s |   10  |   12  |   12   |   0    |
+------+--------+-------+---------------+-------+-------+--------+--------+
+-----------------+----------+-------+-------+
|       POOL      |   TYPE   |  USED | AVAIL |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 4638  | 26.7G |
|   cephfs_data   |   data   |    0  | 26.7G |
+-----------------+----------+-------+-------+

+-------------+
| Standby MDS |
+-------------+
|    node3    |
|    node2    |
+-------------+
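
The step of waiting for the cluster to settle at the new max_mds value can be scripted by counting the active ranks in the status output. The following is a minimal sketch that assumes the boxed table layout shown in the examples above; other releases may print the status in a different layout. The function names count_active_ranks and wait_for_active, the 5-second poll interval, and the default of 30 attempts are all hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch: poll "ceph fs status" until the expected number of ranks is active.
# Assumes the boxed table layout shown in the examples above.

# Reads "ceph fs status" output on stdin and prints the number of rows
# whose STATE column (third '|'-separated field) is exactly "active".
count_active_ranks() {
    awk -F'|' 'NF >= 3 {
        state = $3
        gsub(/[[:space:]]/, "", state)
        if (state == "active") n++
    } END { print n + 0 }'
}

# wait_for_active FS_NAME WANTED_COUNT [ATTEMPTS]
# Returns 0 once the active rank count matches, 1 if attempts run out.
wait_for_active() {
    local fs_name="$1" want="$2" tries="${3:-30}" have
    while [ "$tries" -gt 0 ]; do
        have="$(ceph fs status "$fs_name" | count_active_ranks)"
        if [ "$have" -eq "$want" ]; then
            return 0
        fi
        sleep 5
        tries=$((tries - 1))
    done
    return 1
}
```

After running ceph fs set cephfs max_mds 1, a call such as wait_for_active cephfs 1 returns once only rank 0 remains active.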

Additional Resources

See the Metadata Server daemon states section in the Red Hat Ceph Storage File System Guide.
See the Configuring multiple active Metadata Server daemons section in the Red Hat Ceph Storage File System Guide.

See the Red Hat Ceph Storage Installation Guide for details on installing a Red Hat Ceph Storage cluster.

2.13. Viewing metrics for Ceph metadata server clients

You can use the command-line interface to view the metrics for the Ceph metadata server (MDS). CephFS uses Perf Counters to track metrics. You can view the metrics using the counter dump command.

Prerequisites

A running Red Hat Ceph Storage cluster.

Procedure

Get the name of the MDS service:

Syntax

[ceph: root@mds-host01 /]# ceph orch ps | grep mds

Check the MDS per-client metrics:

Syntax

[ceph: root@mds-host01 /]# ceph tell MDS_SERVICE_NAME counter dump

Example

ceph tell mds.cephfs.ceph2-hk-n-0mfqao-node4.isztbk counter dump

[
  {
    "key": "mds_client_metrics",
    "value": [
      {
        "labels": {
          "fs_name": "cephfs",
          "id": "24379"
        },
        "counters": {
          "num_clients": 4
        }
      }
    ]
  },
  {
    "key": "mds_client_metrics-cephfs",
    "value": [
      {
        "labels": {
          "client": "client.24413",
          "rank": "0"
        },
        "counters": {
          "cap_hits": 56,
          "cap_miss": 9,
          "avg_read_latency": 0E-9,
          "avg_write_latency": 0E-9,
          "avg_metadata_latency": 0E-9,
          "dentry_lease_hits": 2,
          "dentry_lease_miss": 12,
          "opened_files": 0,
          "opened_inodes": 9,
          "pinned_icaps": 4,
          "total_inodes": 9,
          "total_read_ops": 0,
          "total_read_size": 0,
          "total_write_ops": 0,
          "total_write_size": 0
        }
      },
      {
        "labels": {
          "client": "client.24502",
          "rank": "0"
        },
        "counters": {
          "cap_hits": 921403,
          "cap_miss": 102382,
          "avg_read_latency": 0E-9,
          "avg_write_latency": 0E-9,
          "avg_metadata_latency": 0E-9,
          "dentry_lease_hits": 17117,
          "dentry_lease_miss": 204710,
          "opened_files": 0,
          "opened_inodes": 9,
          "pinned_icaps": 7,
          "total_inodes": 9,
          "total_read_ops": 0,
          "total_read_size": 0,
          "total_write_ops": 1,
          "total_write_size": 132
        }
      },
      {
        "labels": {
          "client": "client.24508",
          "rank": "0"
        },
        "counters": {
          "cap_hits": 928694,
          "cap_miss": 103183,
          "avg_read_latency": 0E-9,
          "avg_write_latency": 0E-9,
          "avg_metadata_latency": 0E-9,
          "dentry_lease_hits": 17217,
          "dentry_lease_miss": 206348,
          "opened_files": 0,
          "opened_inodes": 9,
          "pinned_icaps": 7,
          "total_inodes": 9,
          "total_read_ops": 0,
          "total_read_size": 0,
          "total_write_ops": 1,
          "total_write_size": 132
        }
      },
      {
        "labels": {
          "client": "client.24520",
          "rank": "0"
        },
        "counters": {
          "cap_hits": 56,
          "cap_miss": 9,
          "avg_read_latency": 0E-9,
          "avg_write_latency": 0E-9,
          "avg_metadata_latency": 0E-9,
          "dentry_lease_hits": 2,
          "dentry_lease_miss": 12,
          "opened_files": 0,
          "opened_inodes": 9,
          "pinned_icaps": 4,
          "total_inodes": 9,
          "total_read_ops": 0,
          "total_read_size": 0,
          "total_write_ops": 0,
          "total_write_size": 0
        }
      }
    ]
  }
]

Client metrics description

CephFS exports client metrics as Labeled Perf Counters, which you can use to monitor client performance. CephFS exports the following client metrics:

NAME	TYPE	DESCRIPTION
cap_hits	Gauge	Percentage of file capability hits over the total number of caps.
cap_miss	Gauge	Percentage of file capability misses over the total number of caps.
avg_read_latency	Gauge	Mean value of the read latencies.
avg_write_latency	Gauge	Mean value of the write latencies.
avg_metadata_latency	Gauge	Mean value of the metadata latencies.
dentry_lease_hits	Gauge	Percentage of dentry lease hits handed out over the total number of dentry lease requests.
dentry_lease_miss	Gauge	Percentage of dentry lease misses handed out over the total number of dentry lease requests.
opened_files	Gauge	Number of opened files.
opened_inodes	Gauge	Number of opened inodes.
pinned_icaps	Gauge	Number of pinned inode caps.
total_inodes	Gauge	Total number of inodes.
total_read_ops	Gauge	Total number of read operations generated by all processes.
total_read_size	Gauge	Number of bytes read in input/output operations generated by all processes.
total_write_ops	Gauge	Total number of write operations generated by all processes.
total_write_size	Gauge	Number of bytes written in input/output operations generated by all processes.
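
The values of cap_hits and cap_miss in the example output above look like raw counts rather than percentages. If they are read that way, which is an assumption of this sketch and not something the table states, a per-client hit ratio can be derived from the two counters. The function names hit_ratio and client_cap_counts are hypothetical, and client_cap_counts assumes that the jq utility is installed.

```shell
#!/usr/bin/env bash
# Sketch: derive a capability hit ratio from the cap_hits and cap_miss
# counters, assuming both are raw counts as the example output suggests.

# hit_ratio HITS MISSES
# Prints hits / (hits + misses) as a percentage with two decimals,
# or "n/a" when both counters are zero.
hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "n/a"; exit }
        printf "%.2f\n", (h / total) * 100
    }'
}

# client_cap_counts FS_NAME
# Reads "counter dump" JSON on stdin and prints one line per client:
# "<client> <cap_hits> <cap_miss>". Requires jq (an assumption).
client_cap_counts() {
    jq -r --arg key "mds_client_metrics-$1" '
        .[] | select(.key == $key) | .value[]
        | "\(.labels.client) \(.counters.cap_hits) \(.counters.cap_miss)"'
}
```

For client.24413 in the example output, hit_ratio 56 9 prints 86.15. To cover every client, the output of ceph tell MDS_SERVICE_NAME counter dump can be piped into client_cap_counts cephfs and each resulting line passed to hit_ratio.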
