Chapter 9. Ceph performance benchmark
As a storage administrator, you can benchmark performance of the Red Hat Ceph Storage cluster. The purpose of this section is to give Ceph administrators a basic understanding of Ceph’s native benchmarking tools. These tools will provide some insight into how the Ceph storage cluster is performing. This is not the definitive guide to Ceph performance benchmarking, nor is it a guide on how to tune Ceph accordingly.
9.1. Performance baseline
The OSD (including the journal), the disks, and the network throughput should each have a performance baseline to compare against. You can identify potential tuning opportunities by comparing the baseline performance data with the data from Ceph’s native tools. Red Hat Enterprise Linux has many built-in tools, along with a plethora of open source community tools, available to help accomplish these tasks.
Additional Resources
- For more details about some of the available tools, see this Knowledgebase article.
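The comparison described above can be reduced to a single ratio. The following is a minimal sketch, not a Red Hat tool: percent_of_baseline is a hypothetical helper name, and the two throughput values are illustrative placeholders rather than measurements.

```shell
#!/bin/sh
# Hypothetical helper: express a Ceph-level result as a percentage of the
# raw baseline measured for the same component (both in the same unit).
percent_of_baseline() {
    # $1 = baseline value, $2 = value reported by a Ceph benchmark
    awk -v base="$1" -v measured="$2" \
        'BEGIN { printf "%.1f\n", measured / base * 100 }'
}

# Illustrative numbers only: a 500 MB/s raw baseline and a 350 MB/s Ceph result.
percent_of_baseline 500 350
```

A result far below 100 for a given component suggests where to start looking for tuning opportunities.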
9.2. Benchmarking Ceph performance
Ceph includes the rados bench command to do performance benchmarking on a RADOS storage cluster. The command will execute a write test and two types of read tests. The --no-cleanup option is important to use when testing both read and write performance. By default, the rados bench command will delete the objects it has written to the storage pool. Leaving behind these objects allows the two read tests to measure sequential and random read performance.
Before running these performance tests, flush pending writes to disk and then drop all the file system caches by running the following. Running sync first matters because dirty pages cannot be dropped until they have been written out:
Example
[ceph: root@host01 /]# sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
Procedure
Create a new storage pool:
Example
[ceph: root@host01 /]# ceph osd pool create testbench 100 100
Execute a write test for 10 seconds to the newly created storage pool. Use the --no-cleanup option so that the written objects remain available for the read tests.
Execute a sequential read test for 10 seconds to the storage pool.
Execute a random read test for 10 seconds to the storage pool.
To increase the number of concurrent reads and writes, use the -t option, which defaults to 16 threads. The -b parameter adjusts the size of the object being written. The default object size is 4 MB, and a safe maximum object size is 16 MB. Red Hat recommends running multiple copies of these benchmark tests against different pools. Doing this shows how performance changes with multiple clients.
Add the --run-name LABEL option to control the names of the objects that are written during the benchmark test. Multiple rados bench commands can be run simultaneously by changing the --run-name label for each running command instance. This allows different clients to access different objects and prevents the I/O errors that can occur when multiple clients try to access the same object. The --run-name option is also useful when trying to simulate a real-world workload.
Remove the data created by the rados bench command:
Example
[ceph: root@host01 /]# rados -p testbench cleanup
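The write and read steps of this procedure can be sketched as the command sequence below. This is an illustration under stated assumptions, not the exact commands from the original examples: rados_bench_plan is a hypothetical helper that only prints the commands, and the pool name, duration, thread count, object size, and run name are sample values. The -b value is given in bytes.

```shell
#!/bin/sh
# Sketch: print (do not run) a rados bench sequence for one client.
rados_bench_plan() {
    pool=$1        # target pool, for example testbench
    secs=$2        # duration of each test in seconds
    threads=$3     # concurrent operations (-t, default 16)
    obj_bytes=$4   # object size in bytes (-b, default 4 MB = 4194304)
    run_name=$5    # label for the objects written (--run-name)

    # Write test; --no-cleanup keeps the objects for the read tests.
    echo "rados bench -p $pool $secs write -t $threads -b $obj_bytes --run-name $run_name --no-cleanup"
    # Sequential and random read tests reuse the objects written above.
    echo "rados bench -p $pool $secs seq -t $threads --run-name $run_name"
    echo "rados bench -p $pool $secs rand -t $threads --run-name $run_name"
}

rados_bench_plan testbench 10 16 4194304 client1
```

To simulate several clients, the same plan could be generated once per client with a different run name for each. When a run name is used for the write test, the cleanup command most likely needs the same --run-name value to find those objects; check rados --help on your release to confirm.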
9.3. Benchmarking Ceph block performance
Ceph includes the rbd bench-write command to test sequential writes to the block device, measuring throughput and latency. The default I/O size is 4096 bytes, the default number of I/O threads is 16, and the default total number of bytes to write is 1 GB. These defaults can be modified by the --io-size, --io-threads, and --io-total options, respectively.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
Procedure
Run the write performance test against the block device.
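As a rough sketch of how the three options relate, the snippet below computes how many writes the defaults imply and prints a command line with the defaults spelled out. The image name image01 and the pool name testbench are assumptions for illustration, rbd_bench_write_cmd is a hypothetical helper that only prints the command, and 1 GB is taken here as 2^30 bytes.

```shell
#!/bin/sh
# Number of writes implied by the defaults: --io-total divided by --io-size.
io_total=$((1024 * 1024 * 1024))   # 1 GB, assumed to mean 2^30 bytes
io_size=4096                       # default I/O size in bytes
echo "$((io_total / io_size)) writes of $io_size bytes"

# Hypothetical helper: print (do not run) the benchmark command.
rbd_bench_write_cmd() {
    # $1 = image, $2 = pool, $3 = I/O size, $4 = I/O threads, $5 = total bytes
    echo "rbd bench-write $1 --pool=$2 --io-size $3 --io-threads $4 --io-total $5"
}

rbd_bench_write_cmd image01 testbench "$io_size" 16 "$io_total"
```

Newer Ceph releases may expose the same test as rbd bench --io-type write; check rbd help on your release before relying on either form.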
9.4. Benchmarking CephFS performance
You can use the FIO tool to benchmark Ceph File System (CephFS) performance. This tool can also be used to benchmark Ceph Block Device.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
- FIO tool installed on the nodes. See the KCS How to install the Flexible I/O Tester (fio) performance benchmarking tool for more details.
- Block Device or the Ceph File System mounted on the node.
Procedure
Navigate to the node or the application where the Block Device or the CephFS is mounted:
Example
[root@host01 ~]# cd /mnt/ceph-block-device
[root@host01 ~]# cd /mnt/ceph-file-system
Run the FIO command. Start the bs value at 4k and repeat in power-of-2 increments (4k, 8k, 16k, 32k … 128k … 512k, 1m, 2m, 4m) and with different iodepth settings. You should also run tests at your expected workload operation size.
Example for 4K tests with different iodepth values
fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=4k --iodepth=32 --size=5G --runtime=60 --group_reporting=1
Example for 8K tests with different iodepth values
fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=8k --iodepth=32 --size=5G --runtime=60 --group_reporting=1
Note: For more information on the usage of the fio command, see the fio man page.
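The block size and iodepth sweep described in this procedure can be generated rather than typed by hand. The following is a minimal sketch: block_sizes and fio_sweep are hypothetical helper names, the iodepth values 1, 8, and 32 are sample choices, and the script only prints the fio commands so that they can be reviewed before being run from the mounted directory.

```shell
#!/bin/sh
# Print the block sizes from 4k to 4m in power-of-2 steps.
block_sizes() {
    kib=4
    sizes=""
    while [ "$kib" -le 4096 ]; do
        if [ "$kib" -lt 1024 ]; then
            sizes="${sizes:+$sizes }${kib}k"
        else
            sizes="${sizes:+$sizes }$((kib / 1024))m"
        fi
        kib=$((kib * 2))
    done
    echo "$sizes"
}

# Print one fio command per block size and per iodepth given as arguments.
fio_sweep() {
    for bs in $(block_sizes); do
        for depth in "$@"; do
            echo "fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=$bs --iodepth=$depth --size=5G --runtime=60 --group_reporting=1"
        done
    done
}

fio_sweep 1 8 32
```

Piping the output to sh would run the tests one after another; keeping the print step separate makes it easy to drop or add block sizes that match your expected workload.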
9.5. Benchmarking Ceph Object Gateway performance
You can use the s3cmd tool to benchmark Ceph Object Gateway performance. Use get and put requests to determine the performance.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the node.
- s3cmd installed on the nodes.
Procedure
Upload a file and measure the speed. The time command measures the duration of the upload.
Syntax
time s3cmd put PATH_OF_SOURCE_FILE PATH_OF_DESTINATION_FILE
Example
time s3cmd put /path-to-local-file s3://bucket-name/remote/file
Replace /path-to-local-file with the file you want to upload and s3://bucket-name/remote/file with the destination in your S3 bucket.
Download a file and measure the speed. The time command measures the duration of the download.
Syntax
time s3cmd get PATH_OF_SOURCE_FILE DESTINATION_PATH
Example
time s3cmd get s3://bucket-name/remote/file /path-to-local-destination
Replace s3://bucket-name/remote/file with the S3 object you want to download and /path-to-local-destination with the local directory where you want to save the file.
List all the objects in the specified bucket and measure response time.
Syntax
time s3cmd ls s3://BUCKET_NAME
Example
time s3cmd ls s3://bucket-name
Analyze the output to calculate upload/download speed and measure response time based on the duration reported by the time command.
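The speed calculation in the last step is the file size divided by the elapsed time. A minimal sketch follows, assuming you take the file size in bytes and the real (wall-clock) seconds reported by the time command; throughput_mib_s is a hypothetical helper name and the sample values are illustrative, not measured results.

```shell
#!/bin/sh
# Hypothetical helper: convert a transfer into MiB per second.
throughput_mib_s() {
    # $1 = file size in bytes, $2 = elapsed seconds from the time command
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes / secs / 1048576 }'
}

# Illustrative values: a 1 GiB file transferred in 8 seconds.
throughput_mib_s 1073741824 8.0
```

The same helper applies to both the put and the get measurements. For the ls request, the elapsed time itself is the response time, so no conversion is needed.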