Chapter 2. Recommended etcd practices


The following documentation provides information about recommended performance and scalability practices for etcd.

2.3. Validating the hardware for etcd

To validate the hardware for etcd before or after you create the OpenShift Container Platform cluster, you can use fio.

Prerequisites

  • Container runtimes such as Podman or Docker are installed on the machine that you are testing.
  • Data is written to the /var/lib/etcd path.

Procedure

  • Run fio and analyze the results:

    • If you use Podman, run this command:

      $ sudo podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
      Copy to Clipboard Toggle word wrap
    • If you use Docker, run this command:

      $ sudo docker run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
      Copy to Clipboard Toggle word wrap

The output reports whether the disk is fast enough to host etcd by comparing the 99th percentile of the fsync metric captured from the run to see if it is less than 10 ms. A few of the most important etcd metrics that might affected by I/O performance are as follows:

  • etcd_disk_wal_fsync_duration_seconds_bucket metric reports the etcd’s WAL fsync duration
  • etcd_disk_backend_commit_duration_seconds_bucket metric reports the etcd backend commit latency duration
  • etcd_server_leader_changes_seen_total metric reports the leader changes

Because etcd replicates the requests among all the members, its performance strongly depends on network input/output (I/O) latency. High network latencies result in etcd heartbeats taking longer than the election timeout, which results in leader elections that are disruptive to the cluster. A key metric to monitor on a deployed OpenShift Container Platform cluster is the 99th percentile of etcd network peer latency on each etcd cluster member. Use Prometheus to track the metric.

The histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[2m])) metric reports the round trip time for etcd to finish replicating the client requests between the members. Ensure that it is less than 50 ms.

Back to top
Red Hat logoGithubredditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust. Explore our recent updates.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

Theme

© 2025 Red Hat