Chapter 9. Ceph performance counters
As a storage administrator, you can gather performance metrics of the Red Hat Ceph Storage cluster. The Ceph performance counters are a collection of internal infrastructure metrics. The collection, aggregation, and graphing of this metric data can be done by an assortment of tools and can be useful for performance analytics.
9.1. Access to Ceph performance counters
The performance counters are available through a socket interface for the Ceph Monitors and the OSDs. The socket file for each respective daemon is located under /var/run/ceph, by default. The performance counters are grouped together into collection names. These collection names represent a subsystem or an instance of a subsystem.
Here is the full list of the Monitor, OSD, and RADOS Gateway collection name categories with a brief description of each:
Monitor Collection Name Categories
- Cluster Metrics - Displays information about the storage cluster: Monitors, OSDs, Pools, and PGs
- Level Database Metrics - Displays information about the back-end KeyValueStore database
- Monitor Metrics - Displays general monitor information
- Paxos Metrics - Displays information on cluster quorum management
- Throttle Metrics - Displays the statistics on how the monitor is throttling
OSD Collection Name Categories
- Write Back Throttle Metrics - Displays the statistics on how the write back throttle is tracking unflushed IO
- Level Database Metrics - Displays information about the back-end KeyValueStore database
- Objecter Metrics - Displays information on various object-based operations
- Read and Write Operations Metrics - Displays information on various read and write operations
- Recovery State Metrics - Displays latencies on various recovery states
- OSD Throttle Metrics - Displays the statistics on how the OSD is throttling
RADOS Gateway Collection Name Categories
- Object Gateway Client Metrics - Displays statistics on GET and PUT requests
- Objecter Metrics - Displays information on various object-based operations
- Object Gateway Throttle Metrics - Displays the statistics on how the OSD is throttling
9.2. Display the Ceph performance counters
The ceph daemon DAEMON_NAME perf schema command outputs the available metrics. Each metric has an associated bit field value type.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
Procedure
To view the metric’s schema:
Syntax
ceph daemon DAEMON_NAME perf schema
Note: You must run the ceph daemon command from the node running the daemon.
Executing the ceph daemon DAEMON_NAME perf schema command from the Monitor node:
Example
[ceph: root@host01 /]# ceph daemon mon.host01 perf schema
Executing the ceph daemon DAEMON_NAME perf schema command from the OSD node:
Example
[ceph: root@host01 /]# ceph daemon osd.11 perf schema
Bit | Meaning |
---|---|
1 | Floating point value |
2 | Integer value |
4 | Average (sum and count) |
8 | Counter |
Each value has bit 1 or bit 2 set to indicate the type, either a floating point or an integer value. When bit 4 is set, there are two values to read, a sum and a count. When bit 8 is set, the average for the previous interval is the sum delta, since the previous read, divided by the count delta. Alternatively, dividing the values outright provides the lifetime average value. Typically these are used to measure latencies: the count is the number of requests and the sum is the sum of request latencies.
Some bit values are combined, for example 5, 6, and 10. A bit value of 5 is a combination of bit 1 and bit 4, which means the average is a floating point value. A bit value of 6 is a combination of bit 2 and bit 4, which means the average is an integer value. A bit value of 10 is a combination of bit 2 and bit 8, which means the counter value is an integer value.
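The bit combinations described above can be checked with a few lines of Python. This is a minimal sketch, not part of Ceph: the constant and function names are illustrative assumptions, and the numbers passed to interval_average are made up.

```python
# Bit meanings as described in the text above; the names are illustrative.
FLOAT = 1      # value is a floating point number
INTEGER = 2    # value is an integer
AVERAGE = 4    # two values to read: a sum and a count
COUNTER = 8    # counter; interval averages come from deltas between reads


def describe_bit_field(value):
    """Return the names of the flags that are set in a bit field value."""
    names = {FLOAT: "float", INTEGER: "integer", AVERAGE: "average", COUNTER: "counter"}
    return [name for bit, name in names.items() if value & bit]


def interval_average(prev_sum, prev_count, cur_sum, cur_count):
    """Average over the interval between two reads: sum delta / count delta."""
    count_delta = cur_count - prev_count
    if count_delta == 0:
        return 0.0  # no new samples were recorded in this interval
    return (cur_sum - prev_sum) / count_delta


print(describe_bit_field(5))   # ['float', 'average']
print(describe_bit_field(6))   # ['integer', 'average']
print(describe_bit_field(10))  # ['integer', 'counter']
print(interval_average(1.0, 10, 4.0, 20))
```

Dividing cur_sum by cur_count directly, instead of using the deltas, gives the lifetime average mentioned above.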
Additional Resources
- See the Average count and sum section in the Red Hat Ceph Storage Administration Guide for more details.
9.3. Viewing the performance counters for users and buckets
The Ceph Object Gateway uses the performance counters to track metrics. You can see a cluster-wide view of the usage data over time, including PUT operations for objects in a bucket, through the Ceph Exporter port, which is usually 9926.
To track the operation metrics by users, set rgw_user_counters_cache to true. To track the operation metrics by buckets, set rgw_bucket_counters_cache to true.
You can use both rgw_user_counters_cache_size and rgw_bucket_counters_cache_size to set the number of entries in each cache.
Counters are evicted from a cache once the number of counters in the cache is greater than the cache size configuration variable. The counters that are evicted are the least recently used (LRU).
For example, if the number of buckets exceeds rgw_bucket_counters_cache_size by 1 and the counters with the label bucket1 were the least recently updated, the counters for bucket1 are evicted from the cache. If S3 operations tracked by the operation metrics are done on bucket1 after eviction, all the metrics in the cache for bucket1 start at 0.
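The eviction behavior described above can be illustrated with a toy LRU cache. This is a sketch of the general technique only, not the Ceph Object Gateway implementation; the CounterCache class, its record_op method, and the cache size of 2 are assumptions made for the example.

```python
from collections import OrderedDict


class CounterCache:
    """Toy LRU cache of per-label counters (not the Ceph Object Gateway code)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()  # label -> operation count, oldest first

    def record_op(self, label):
        # A missing label starts at 0, which is also the case after eviction.
        count = self.entries.pop(label, 0) + 1
        self.entries[label] = count  # re-insert as the most recently used
        # Evict the least recently used labels while the cache is over its size.
        while len(self.entries) > self.max_size:
            self.entries.popitem(last=False)
        return count


cache = CounterCache(max_size=2)
cache.record_op("bucket1")
cache.record_op("bucket1")
cache.record_op("bucket2")
cache.record_op("bucket3")         # three labels exceed the size, bucket1 is evicted
print("bucket1" in cache.entries)  # False
print(cache.record_op("bucket1"))  # 1, counting restarts after eviction
```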
Cache sizing can depend on several factors, which include the following:
- Number of users in the cluster.
- Number of buckets in the cluster.
- Memory usage of the Ceph Object Gateway.
- Disk and memory usage of Prometheus.
To help calculate the Ceph Object Gateway’s memory usage of a cache, note that each cache entry, encompassing all the operation metrics, is 1360 bytes. This value is an estimate and subject to change if metrics are added or removed from the operation metrics list.
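As a rough sizing aid, the estimate above can be turned into a small calculation. This is a sketch under stated assumptions: the cache sizes of 10,000 entries are hypothetical values chosen for illustration, and the 1360-byte figure is the estimate from the text, which can change.

```python
ENTRY_BYTES = 1360  # estimated size of one cache entry, from the text above


def cache_memory_bytes(cache_size):
    """Estimated memory used by one counters cache when it is full."""
    return cache_size * ENTRY_BYTES


# Hypothetical cache sizes, chosen only for illustration.
user_cache_size = 10_000
bucket_cache_size = 10_000
total = cache_memory_bytes(user_cache_size) + cache_memory_bytes(bucket_cache_size)
print(f"{total} bytes, about {total / (1024 * 1024):.1f} MiB")
```

The result covers only the counter caches in the Ceph Object Gateway, not the disk and memory that Prometheus uses to store the scraped series.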
Since the operation metrics are labeled as performance counters, they live in memory. If the Ceph Object Gateway is restarted or crashes, all counters in the Ceph Object Gateway, whether in a cache or not, are lost.
Prerequisites
- A running Red Hat Ceph Storage cluster with Ceph Object Gateway installed.
- Monitoring stack enabled, which includes Prometheus and ceph-exporter.
Procedure
Set the performance counters for users and buckets.
Set the performance counters for the users.
Example
[ceph: root@host01 /]# ceph config set client.rgw.rgw.1.host05 rgw_user_counters_cache true
Set the performance counters for the buckets.
Example
[ceph: root@host01 /]# ceph config set client.rgw.rgw.1.host05 rgw_bucket_counters_cache true
Restart the Ceph Object Gateway service.
Example
[ceph: root@host01 /]# ceph orch restart rgw.rgw.1
- Create users. For more information, see User Management.
Create buckets and upload objects into the bucket.
Configure s3cmd.
Example
[root@host01 ~]# s3cmd --configure
Create the S3 bucket.
Syntax
s3cmd mb s3://NAME_OF_THE_BUCKET_FOR_S3
Example
[root@host01 ~]# s3cmd mb s3://bucket
Bucket 's3://bucket/' created
Create a file, add your data to it, and upload it to the S3 bucket.
Syntax
s3cmd put FILE_NAME s3://NAME_OF_THE_BUCKET_ON_S3
Example
[root@host01 ~]# s3cmd put test.txt s3://bucket
upload: 'test.txt' -> 's3://bucket/test.txt' [1 of 1]
21 of 21 100% in 1s 16.75 B/s done
Verify that the objects are uploaded.
Example
[root@host01 ~]# s3cmd ls s3://bucket
View the performance counter dump.
Syntax
ceph daemon DAEMON_ID counter dump
Verify that the metrics are running on the local host.
Syntax
http://RGW_IP_ADDRESS:CEPH-EXPORTER_PORT/
Example for per bucket perf counter:
# HELP ceph_rgw_op_per_bucket_put_obj_ops Puts
# TYPE ceph_rgw_op_per_bucket_put_obj_ops counter
ceph_rgw_op_per_bucket_put_obj_ops{bucket="test-bkt1",instance_id="ceph-ck-perf-ej61qj-node5"} 10
Example for per user perf counter:
# HELP ceph_rgw_op_per_user_put_obj_ops Puts
# TYPE ceph_rgw_op_per_user_put_obj_ops counter
ceph_rgw_op_per_user_put_obj_ops{instance_id="ceph-ck-perf-ej61qj-node5",user="ckulal"} 10
Verify that the same metrics are available on Prometheus.
Syntax
http://RGW_IP_ADDRESS:PROMETHEUS_PORT/
Example
https://10.0.210.100:9283/
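To check the labeled counters from a script rather than a browser, the text served by the exporter can be parsed line by line. The following is a minimal sketch that works on the per bucket sample shown in the procedure. The function name and the simplified regular expressions are assumptions for illustration, and the parser does not handle escaped quotes or samples without labels.

```python
import re

# Sample text in the format shown in the per bucket example above.
SAMPLE = '''# HELP ceph_rgw_op_per_bucket_put_obj_ops Puts
# TYPE ceph_rgw_op_per_bucket_put_obj_ops counter
ceph_rgw_op_per_bucket_put_obj_ops{bucket="test-bkt1",instance_id="ceph-ck-perf-ej61qj-node5"} 10
'''

# Simplified patterns: metric name, a label block in braces, then the value.
SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Yield (metric name, labels dict, value) for each labeled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group(2)))
            yield match.group(1), labels, float(match.group(3))


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels["bucket"], value)
```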
9.4. Dump the Ceph performance counters
The ceph daemon DAEMON_NAME perf dump command outputs the current values and groups the metrics under the collection name for each subsystem.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
Procedure
To view the current metric data:
Syntax
ceph daemon DAEMON_NAME perf dump
Note: You must run the ceph daemon command from the node running the daemon.
Executing the ceph daemon DAEMON_NAME perf dump command from the Monitor node:
Example
[ceph: root@host01 /]# ceph daemon mon.host01 perf dump
Executing the ceph daemon DAEMON_NAME perf dump command from the OSD node:
Example
[ceph: root@host01 /]# ceph daemon osd.11 perf dump
Additional Resources
- To view a short description of each Monitor metric available, please see the Ceph monitor metrics table.
9.5. Average count and sum
All latency numbers have a bit field value of 5. This field contains floating point values for the average count and sum. The avgcount is the number of operations within this range and the sum is the total latency in seconds. Dividing the sum by the avgcount gives you an idea of the latency per operation.
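The division described above can be sketched in Python against JSON shaped like perf dump output, where metrics are grouped under a collection name. The collection name, metric name, and numbers below are made up for illustration and do not come from a real cluster.

```python
import json

# Hypothetical excerpt shaped like perf dump output: collection -> metric -> values.
# The collection name, metric name, and numbers are made up for illustration.
SAMPLE_DUMP = json.loads("""
{
  "example_collection": {
    "example_latency": {"avgcount": 200, "sum": 0.5}
  }
}
""")


def latency_per_operation(metric):
    """Divide sum (seconds) by avgcount (operations); None if no operations."""
    if metric["avgcount"] == 0:
        return None
    return metric["sum"] / metric["avgcount"]


for collection, metrics in SAMPLE_DUMP.items():
    for name, metric in metrics.items():
        # Only metrics that carry an avgcount and sum pair are latency-style values.
        if isinstance(metric, dict) and "avgcount" in metric:
            print(collection, name, latency_per_operation(metric))
```

Returning None when avgcount is 0 avoids a division by zero for metrics that have not recorded any operations yet.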
Additional Resources
- To view a short description of each OSD metric available, please see the Ceph OSD table.
9.6. Ceph Monitor metrics
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Number of monitors |
| 2 | Number of monitors in quorum | |
| 2 | Total number of OSD | |
| 2 | Number of OSDs that are up | |
| 2 | Number of OSDs that are in cluster | |
| 2 | Current epoch of OSD map | |
| 2 | Total capacity of cluster in bytes | |
| 2 | Number of used bytes on cluster | |
| 2 | Number of available bytes on cluster | |
| 2 | Number of pools | |
| 2 | Total number of placement groups | |
| 2 | Number of placement groups in active+clean state | |
| 2 | Number of placement groups in active state | |
| 2 | Number of placement groups in peering state | |
| 2 | Total number of objects on cluster | |
| 2 | Number of degraded (missing replicas) objects | |
| 2 | Number of misplaced (wrong location in the cluster) objects | |
| 2 | Number of unfound objects | |
| 2 | Total number of bytes of all objects | |
| 2 | Number of MDSs that are up | |
| 2 | Number of MDS that are in cluster | |
| 2 | Number of failed MDS | |
| 2 | Current epoch of MDS map |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Gets |
| 10 | Transactions | |
| 10 | Compactions | |
| 10 | Compactions by range | |
| 10 | Mergings of ranges in compaction queue | |
| 2 | Length of compaction queue |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Current number of opened monitor sessions |
| 10 | Number of created monitor sessions | |
| 10 | Number of remove_session calls in monitor | |
| 10 | Number of trimmed monitor sessions | |
| 10 | Number of elections monitor took part in | |
| 10 | Number of elections started by monitor | |
| 10 | Number of elections won by monitor | |
| 10 | Number of elections lost by monitor |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Starts in leader role |
| 10 | Starts in peon role | |
| 10 | Restarts | |
| 10 | Refreshes | |
| 5 | Refresh latency | |
| 10 | Started and handled begins | |
| 6 | Keys in transaction on begin | |
| 6 | Data in transaction on begin | |
| 5 | Latency of begin operation | |
| 10 | Commits | |
| 6 | Keys in transaction on commit | |
| 6 | Data in transaction on commit | |
| 5 | Commit latency | |
| 10 | Peon collects | |
| 6 | Keys in transaction on peon collect | |
| 6 | Data in transaction on peon collect | |
| 5 | Peon collect latency | |
| 10 | Uncommitted values in started and handled collects | |
| 10 | Collect timeouts | |
| 10 | Accept timeouts | |
| 10 | Lease acknowledgement timeouts | |
| 10 | Lease timeouts | |
| 10 | Store a shared state on disk | |
| 6 | Keys in transaction in stored state | |
| 6 | Data in transaction in stored state | |
| 5 | Storing state latency | |
| 10 | Sharings of state | |
| 6 | Keys in shared state | |
| 6 | Data in shared state | |
| 10 | New proposal number queries | |
| 5 | New proposal number getting latency |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Currently available throttle |
| 10 | Max value for throttle | |
| 10 | Gets | |
| 10 | Got data | |
| 10 | Get blocked during get_or_fail | |
| 10 | Successful get during get_or_fail | |
| 10 | Takes | |
| 10 | Taken data | |
| 10 | Puts | |
| 10 | Put data | |
| 5 | Waiting latency |
9.7. Ceph OSD metrics
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Dirty data |
| 2 | Written data | |
| 2 | Dirty operations | |
| 2 | Written operations | |
| 2 | Entries waiting for write | |
| 2 | Written entries |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Gets |
| 10 | Transactions | |
| 10 | Compactions | |
| 10 | Compactions by range | |
| 10 | Mergings of ranges in compaction queue | |
| 2 | Length of compaction queue |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Active operations |
| 2 | Laggy operations | |
| 10 | Sent operations | |
| 10 | Sent data | |
| 10 | Resent operations | |
| 10 | Commit callbacks | |
| 10 | Operation commits | |
| 10 | Operation | |
| 10 | Read operations | |
| 10 | Write operations | |
| 10 | Read-modify-write operations | |
| 10 | PG operation | |
| 10 | Stat operations | |
| 10 | Create object operations | |
| 10 | Read operations | |
| 10 | Write operations | |
| 10 | Write full object operations | |
| 10 | Append operation | |
| 10 | Set object to zero operations | |
| 10 | Truncate object operations | |
| 10 | Delete object operations | |
| 10 | Map extent operations | |
| 10 | Sparse read operations | |
| 10 | Clone range operations | |
| 10 | Get xattr operations | |
| 10 | Set xattr operations | |
| 10 | Xattr comparison operations | |
| 10 | Remove xattr operations | |
| 10 | Reset xattr operations | |
| 10 | TMAP update operations | |
| 10 | TMAP put operations | |
| 10 | TMAP get operations | |
| 10 | Call (execute) operations | |
| 10 | Watch by object operations | |
| 10 | Notify about object operations | |
| 10 | Extended attribute comparison in multi operations | |
| 10 | Other operations | |
| 2 | Active lingering operations | |
| 10 | Sent lingering operations | |
| 10 | Resent lingering operations | |
| 10 | Sent pings to lingering operations | |
| 2 | Active pool operations | |
| 10 | Sent pool operations | |
| 10 | Resent pool operations | |
| 2 | Active get pool stat operations | |
| 10 | Pool stat operations sent | |
| 10 | Resent pool stats | |
| 2 | Statfs operations | |
| 10 | Sent FS stats | |
| 10 | Resent FS stats | |
| 2 | Active commands | |
| 10 | Sent commands | |
| 10 | Resent commands | |
| 2 | OSD map epoch | |
| 10 | Full OSD maps received | |
| 10 | Incremental OSD maps received | |
| 2 | Open sessions | |
| 10 | Sessions opened | |
| 10 | Sessions closed | |
| 2 | Laggy OSD sessions |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Replication operations currently being processed (primary) |
| 10 | Client operations total write size | |
| 10 | Client operations total read size | |
| 5 | Latency of client operations (including queue time) | |
| 5 | Latency of client operations (excluding queue time) | |
| 10 | Client read operations | |
| 10 | Client data read | |
| 5 | Latency of read operation (including queue time) | |
| 5 | Latency of read operation (excluding queue time) | |
| 10 | Client write operations | |
| 10 | Client data written | |
| 5 | Client write operation readable/applied latency | |
| 5 | Latency of write operation (including queue time) | |
| 5 | Latency of write operation (excluding queue time) | |
| 10 | Client read-modify-write operations | |
| 10 | Client read-modify-write operations write in | |
| 10 | Client read-modify-write operations read out | |
| 5 | Client read-modify-write operation readable/applied latency | |
| 5 | Latency of read-modify-write operation (including queue time) | |
| 5 | Latency of read-modify-write operation (excluding queue time) | |
| 10 | Suboperations | |
| 10 | Suboperations total size | |
| 5 | Suboperations latency | |
| 10 | Replicated writes | |
| 10 | Replicated written data size | |
| 5 | Replicated writes latency | |
| 10 | Suboperations pull requests | |
| 5 | Suboperations pull latency | |
| 10 | Suboperations push messages | |
| 10 | Suboperations pushed size | |
| 5 | Suboperations push latency | |
| 10 | Pull requests sent | |
| 10 | Push messages sent | |
| 10 | Pushed size | |
| 10 | Inbound push messages | |
| 10 | Inbound pushed size | |
| 10 | Started recovery operations | |
| 2 | CPU load | |
| 2 | Total allocated buffer size | |
| 2 | Placement groups | |
| 2 | Placement groups for which this osd is primary | |
| 2 | Placement groups for which this osd is replica | |
| 2 | Placement groups ready to be deleted from this osd | |
| 2 | Heartbeat (ping) peers we send to | |
| 2 | Heartbeat (ping) peers we recv from | |
| 10 | OSD map messages | |
| 10 | OSD map epochs | |
| 10 | OSD map duplicates | |
| 2 | OSD size | |
| 2 | Used space | |
| 2 | Available space | |
| 10 | Rados 'copy-from' operations | |
| 10 | Tier promotions | |
| 10 | Tier flushes | |
| 10 | Failed tier flushes | |
| 10 | Tier flush attempts | |
| 10 | Failed tier flush attempts | |
| 10 | Tier evictions | |
| 10 | Tier whiteouts | |
| 10 | Dirty tier flag set | |
| 10 | Dirty tier flag cleaned | |
| 10 | Tier delays (agent waiting) | |
| 10 | Tier proxy reads | |
| 10 | Tiering agent wake up | |
| 10 | Objects skipped by agent | |
| 10 | Tiering agent flushes | |
| 10 | Tiering agent evictions | |
| 10 | Object context cache hits | |
| 10 | Object context cache lookups | |
| 2 | Number of clients blocklisted |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 5 | Initial recovery state latency |
| 5 | Started recovery state latency | |
| 5 | Reset recovery state latency | |
| 5 | Start recovery state latency | |
| 5 | Primary recovery state latency | |
| 5 | Peering recovery state latency | |
| 5 | Backfilling recovery state latency | |
| 5 | Wait remote backfill reserved recovery state latency | |
| 5 | Wait local backfill reserved recovery state latency | |
| 5 | Notbackfilling recovery state latency | |
| 5 | Repnotrecovering recovery state latency | |
| 5 | Rep wait recovery reserved recovery state latency | |
| 5 | Rep wait backfill reserved recovery state latency | |
| 5 | RepRecovering recovery state latency | |
| 5 | Activating recovery state latency | |
| 5 | Wait local recovery reserved recovery state latency | |
| 5 | Wait remote recovery reserved recovery state latency | |
| 5 | Recovering recovery state latency | |
| 5 | Recovered recovery state latency | |
| 5 | Clean recovery state latency | |
| 5 | Active recovery state latency | |
| 5 | Replicaactive recovery state latency | |
| 5 | Stray recovery state latency | |
| 5 | Getinfo recovery state latency | |
| 5 | Getlog recovery state latency | |
| 5 | Waitactingchange recovery state latency | |
| 5 | Incomplete recovery state latency | |
| 5 | Getmissing recovery state latency | |
| 5 | Waitupthru recovery state latency |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Currently available throttle |
| 10 | Max value for throttle | |
| 10 | Gets | |
| 10 | Got data | |
| 10 | Get blocked during get_or_fail | |
| 10 | Successful get during get_or_fail | |
| 10 | Takes | |
| 10 | Taken data | |
| 10 | Puts | |
| 10 | Put data | |
| 5 | Waiting latency |
9.8. Ceph Object Gateway metrics
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Requests |
| 10 | Aborted requests | |
| 10 | Copy objects | |
| 10 | Size of copy objects | |
| 10 | Copy object latency | |
| 10 | Delete objects | |
| 10 | Size of delete objects | |
| 10 | Delete object latency | |
| 10 | Delete Buckets | |
| 10 | Delete bucket latency | |
| 10 | Gets | |
| 10 | Size of gets | |
| 5 | Get latency | |
| 10 | List objects | |
| 10 | List object latency | |
| 10 | List buckets | |
| 10 | List buckets latency | |
| 10 | Puts | |
| 10 | Size of puts | |
| 5 | Put latency | |
| 2 | Queue length | |
| 2 | Active requests queue | |
| 10 | Cache hits | |
| 10 | Cache miss | |
| 10 | Keystone token cache hits | |
| 10 | Keystone token cache miss |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 2 | Active operations |
| 2 | Laggy operations | |
| 10 | Sent operations | |
| 10 | Sent data | |
| 10 | Resent operations | |
| 10 | Commit callbacks | |
| 10 | Operation commits | |
| 10 | Operation | |
| 10 | Read operations | |
| 10 | Write operations | |
| 10 | Read-modify-write operations | |
| 10 | PG operation | |
| 10 | Stat operations | |
| 10 | Create object operations | |
| 10 | Read operations | |
| 10 | Write operations | |
| 10 | Write full object operations | |
| 10 | Append operation | |
| 10 | Set object to zero operations | |
| 10 | Truncate object operations | |
| 10 | Delete object operations | |
| 10 | Map extent operations | |
| 10 | Sparse read operations | |
| 10 | Clone range operations | |
| 10 | Get xattr operations | |
| 10 | Set xattr operations | |
| 10 | Xattr comparison operations | |
| 10 | Remove xattr operations | |
| 10 | Reset xattr operations | |
| 10 | TMAP update operations | |
| 10 | TMAP put operations | |
| 10 | TMAP get operations | |
| 10 | Call (execute) operations | |
| 10 | Watch by object operations | |
| 10 | Notify about object operations | |
| 10 | Extended attribute comparison in multi operations | |
| 10 | Other operations | |
| 2 | Active lingering operations | |
| 10 | Sent lingering operations | |
| 10 | Resent lingering operations | |
| 10 | Sent pings to lingering operations | |
| 2 | Active pool operations | |
| 10 | Sent pool operations | |
| 10 | Resent pool operations | |
| 2 | Active get pool stat operations | |
| 10 | Pool stat operations sent | |
| 10 | Resent pool stats | |
| 2 | Statfs operations | |
| 10 | Sent FS stats | |
| 10 | Resent FS stats | |
| 2 | Active commands | |
| 10 | Sent commands | |
| 10 | Resent commands | |
| 2 | OSD map epoch | |
| 10 | Full OSD maps received | |
| 10 | Incremental OSD maps received | |
| 2 | Open sessions | |
| 10 | Sessions opened | |
| 10 | Sessions closed | |
| 2 | Laggy OSD sessions |
Collection Name | Metric Name | Bit Field Value | Short Description |
---|---|---|---|
|
| 10 | Currently available throttle |
| 10 | Max value for throttle | |
| 10 | Gets | |
| 10 | Got data | |
| 10 | Get blocked during get_or_fail | |
| 10 | Successful get during get_or_fail | |
| 10 | Takes | |
| 10 | Taken data | |
| 10 | Puts | |
| 10 | Put data | |
| 5 | Waiting latency |