Chapter 3. Requirements for deploying Red Hat Ceph Storage stretch cluster with arbiter
Red Hat Ceph Storage is an open-source enterprise platform that provides unified software-defined storage on standard, economical servers and disks. With block, object, and file storage combined into one platform, Red Hat Ceph Storage efficiently and automatically manages all your data, so you can focus on the applications and workloads that use it.
This section provides a basic overview of the Red Hat Ceph Storage deployment. For more complex deployments, refer to the official Red Hat Ceph Storage 5 documentation.
Important: Only flash media is supported, because the cluster runs with min_size=1 when degraded. Use stretch mode only with all-flash OSDs. Using all-flash OSDs minimizes the time needed to recover once connectivity is restored, thus minimizing the potential for data loss.
Note: Erasure-coded pools cannot be used with stretch mode.
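As an optional sanity check once the cluster is running (later in this procedure), you can confirm that the cluster is all-flash and that no pool uses erasure coding. This is only a sketch; the exact pool names depend on what you have created at that point:
$ ceph osd crush class ls      # an all-flash cluster should report only "ssd" (or "nvme")
$ ceph osd pool ls detail      # every pool should be listed as 'replicated', none as 'erasure'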
3.1. Hardware requirements
For information on minimum hardware requirements for deploying Red Hat Ceph Storage, see Minimum hardware recommendations for containerized Ceph.
Table 3.1. Node roles and datacenter placement

| Node name | Datacenter | Ceph components |
|---|---|---|
| ceph1 | DC1 | OSD+MON+MGR |
| ceph2 | DC1 | OSD+MON |
| ceph3 | DC1 | OSD+MDS+RGW |
| ceph4 | DC2 | OSD+MON+MGR |
| ceph5 | DC2 | OSD+MON |
| ceph6 | DC2 | OSD+MDS+RGW |
| ceph7 | DC3 | MON |
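Before deploying, it can be useful to compare each node against the minimum hardware recommendations linked above. A minimal, optional check you can run per node (the actual thresholds come from the linked recommendations, not from this guide):
$ lscpu | grep '^CPU(s):'      # CPU core count
$ free -h | grep Mem           # installed memory
$ lsblk -d -o NAME,SIZE,ROTA   # disks; ROTA=0 indicates flash media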
3.2. Software requirements
Use the latest software version of Red Hat Ceph Storage 5.
For more information on the supported operating system versions for Red Hat Ceph Storage, see the knowledgebase article Red Hat Ceph Storage: Supported configurations.
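For example, to confirm the operating system release on a node before proceeding:
$ cat /etc/redhat-release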
3.3. Network configuration requirements
The recommended Red Hat Ceph Storage configuration is as follows:
- You must have two separate networks, one public network and one private network.
- You must have three different datacenters that support VLANs and subnets for the Ceph private and public networks across all datacenters.
  Note: You can use different subnets for each of the datacenters.
- The latency between the two datacenters running the Red Hat Ceph Storage Object Storage Devices (OSDs) cannot exceed 10 ms RTT. For the arbiter datacenter, this was tested with values as high as 100 ms RTT to the other two OSD datacenters. A simple RTT check is shown after the example configuration below.
Here is an example of a basic network configuration that we have used in this guide:
- DC1: Ceph public/private network: 10.0.40.0/24
- DC2: Ceph public/private network: 10.0.40.0/24
- DC3: Ceph public/private network: 10.0.40.0/24
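To verify the latency requirement, you can measure the RTT between the OSD datacenters with a simple ping from a node in DC1 to a node in DC2. The target below is a placeholder for the address of a node in the other datacenter:
$ ping -c 10 <dc2_node_ip>     # the "rtt min/avg/max/mdev" summary must show an average below 10 ms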
For more information on the required network environment, see Ceph network configuration.
3.4. Node pre-deployment requirements
Before installing the Red Hat Ceph Storage cluster, perform the following steps to fulfill all of the requirements.
Register all the nodes to the Red Hat Network or Red Hat Satellite and subscribe to a valid pool:
subscription-manager register
subscription-manager subscribe --pool=8a8XXXXXX9e0
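Optionally, confirm on each node that the registration and subscription succeeded:
subscription-manager status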
Enable access for all the nodes in the Ceph cluster to the following repositories:
- rhel-8-for-x86_64-baseos-rpms
- rhel-8-for-x86_64-appstream-rpms
subscription-manager repos --disable="*" --enable="rhel-8-for-x86_64-baseos-rpms" --enable="rhel-8-for-x86_64-appstream-rpms"
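You can verify which repositories are now enabled, for example:
subscription-manager repos --list-enabled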
Update the operating system RPMs to the latest version and reboot if needed:
dnf update -y
reboot
Select a node from the cluster to be your bootstrap node. ceph1 is our bootstrap node in this example going forward.
Only on the bootstrap node ceph1, enable the ansible-2.9-for-rhel-8-x86_64-rpms and rhceph-5-tools-for-rhel-8-x86_64-rpms repositories:
subscription-manager repos --enable="ansible-2.9-for-rhel-8-x86_64-rpms" --enable="rhceph-5-tools-for-rhel-8-x86_64-rpms"
Configure the hostname using the bare/short host name on all the hosts:
hostnamectl set-hostname <short_name>
Verify the hostname configuration for deploying Red Hat Ceph Storage with cephadm.
$ hostname
Example output:
ceph1
Modify the /etc/hosts file and add the FQDN entry for the 127.0.0.1 IP by setting the DOMAIN variable to your DNS domain name.
DOMAIN="example.domain.com"

cat <<EOF >/etc/hosts
127.0.0.1 $(hostname).${DOMAIN} $(hostname) localhost localhost.localdomain localhost4 localhost4.localdomain4
::1       $(hostname).${DOMAIN} $(hostname) localhost6 localhost6.localdomain6
EOF
Check the long host name with the FQDN using the hostname -f option.
$ hostname -f
Example output:
ceph1.example.domain.com
Note: To learn more about why these changes are required, see Fully Qualified Domain Names vs Bare Host Names.
Run the following steps on the bootstrap node. In our example, the bootstrap node is ceph1.
Install the cephadm-ansible RPM package:
$ sudo dnf install -y cephadm-ansible
Important: To run the Ansible playbooks, you must have passwordless SSH access to all the nodes that are configured in the Red Hat Ceph Storage cluster. Ensure that the configured user (for example, deployment-user) has root privileges to invoke the sudo command without needing a password.
To use a custom key, configure the selected user's (for example, deployment-user's) SSH config file to specify the id/key that will be used for connecting to the nodes via SSH:
cat <<EOF > ~/.ssh/config
Host ceph*
   User deployment-user
   IdentityFile ~/.ssh/ceph.pem
EOF
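If passwordless SSH is not yet in place for deployment-user, the following sketch sets it up using the key path referenced in the SSH config above; the host names match the nodes used in this guide, and the commands assume the deployment-user account already exists on every node:
ssh-keygen -t ed25519 -f ~/.ssh/ceph.pem -N ""                 # create the key pair without a passphrase
for host in ceph1 ceph2 ceph3 ceph4 ceph5 ceph6 ceph7; do
  ssh-copy-id -i ~/.ssh/ceph.pem.pub deployment-user@${host}   # install the public key on each node
done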
Build the Ansible inventory:
cat <<EOF > /usr/share/cephadm-ansible/inventory
ceph1
ceph2
ceph3
ceph4
ceph5
ceph6
ceph7

[admin]
ceph1
EOF
Note: Hosts configured as part of the [admin] group in the inventory file are tagged as _admin by cephadm, so they receive the admin Ceph keyring during the bootstrap process.
Verify that Ansible can access all nodes using the ping module before running the pre-flight playbook.
$ ansible -i /usr/share/cephadm-ansible/inventory -m ping all -b
Example output:
ceph6 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph4 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph3 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph2 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph5 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph1 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
ceph7 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "ping": "pong"
}
Run the following Ansible playbook:
$ ansible-playbook -i /usr/share/cephadm-ansible/inventory /usr/share/cephadm-ansible/cephadm-preflight.yml --extra-vars "ceph_origin=rhcs"
The preflight Ansible playbook configures the Red Hat Ceph Storage dnf repository and prepares the storage cluster for bootstrapping. It also installs podman, lvm2, chronyd, and cephadm. The default location for cephadm-ansible and cephadm-preflight.yml is /usr/share/cephadm-ansible.
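As an optional spot check, you can confirm on any node that the preflight playbook installed the expected components:
$ podman --version
$ cephadm version
$ systemctl is-active chronyd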
3.5. Cluster bootstrapping and service deployment with Cephadm
The cephadm utility installs and starts a single Ceph Monitor daemon and a Ceph Manager daemon for a new Red Hat Ceph Storage cluster on the local node where the cephadm bootstrap command is run.
For additional information on the bootstrapping process, see Bootstrapping a new storage cluster.
Procedure
Create a JSON file to authenticate against the container registry as follows:
$ cat <<EOF > /root/registry.json
{
 "url":"registry.redhat.io",
 "username":"User",
 "password":"Pass"
}
EOF
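Optionally, you can confirm that the credentials are valid before bootstrapping by logging in to the registry manually with podman (installed by the preflight playbook); you will be prompted for the password:
$ podman login registry.redhat.io --username User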
Create a cluster-spec.yaml file that adds the nodes to the Red Hat Ceph Storage cluster and also sets specific labels for where the services should run, following table 3.1.
cat <<EOF > /root/cluster-spec.yaml
service_type: host
addr: 10.0.40.78  ## <XXX.XXX.XXX.XXX>
hostname: ceph1   ## <ceph-hostname-1>
location:
  root: default
  datacenter: DC1
labels:
  - osd
  - mon
  - mgr
---
service_type: host
addr: 10.0.40.35
hostname: ceph2
location:
  datacenter: DC1
labels:
  - osd
  - mon
---
service_type: host
addr: 10.0.40.24
hostname: ceph3
location:
  datacenter: DC1
labels:
  - osd
  - mds
  - rgw
---
service_type: host
addr: 10.0.40.185
hostname: ceph4
location:
  root: default
  datacenter: DC2
labels:
  - osd
  - mon
  - mgr
---
service_type: host
addr: 10.0.40.88
hostname: ceph5
location:
  datacenter: DC2
labels:
  - osd
  - mon
---
service_type: host
addr: 10.0.40.66
hostname: ceph6
location:
  datacenter: DC2
labels:
  - osd
  - mds
  - rgw
---
service_type: host
addr: 10.0.40.221
hostname: ceph7
labels:
  - mon
---
service_type: mon
placement:
  label: "mon"
---
service_type: mds
service_id: fs_name
placement:
  label: "mds"
---
service_type: mgr
service_name: mgr
placement:
  label: "mgr"
---
service_type: osd
service_id: all-available-devices
service_name: osd.all-available-devices
placement:
  label: "osd"
spec:
  data_devices:
    all: true
---
service_type: rgw
service_id: objectgw
service_name: rgw.objectgw
placement:
  count: 2
  label: "rgw"
spec:
  rgw_frontend_port: 8080
EOF
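Because the spec file is YAML, a quick optional syntax check before bootstrapping can save a failed run. This sketch assumes python3 with PyYAML is available on the bootstrap node:
$ python3 -c 'import yaml; list(yaml.safe_load_all(open("/root/cluster-spec.yaml"))); print("cluster-spec.yaml parses cleanly")'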
Retrieve the IP of the NIC configured on the Red Hat Ceph Storage public network from the bootstrap node. After substituting 10.0.40.0 with the subnet that you have defined in your Ceph public network, execute the following command.
$ ip a | grep 10.0.40
Example output:
10.0.40.78
Run the cephadm bootstrap command as the root user on the node that will be the initial Monitor node in the cluster. The IP_ADDRESS option is the node's IP address that you are using to run the cephadm bootstrap command.
Note: If you have configured a different user instead of root for passwordless SSH access, then use the --ssh-user= flag with the cephadm bootstrap command.
$ cephadm bootstrap --ssh-user=deployment-user --mon-ip 10.0.40.78 --apply-spec /root/cluster-spec.yaml --registry-json /root/registry.json
Important: If the local node uses fully-qualified domain names (FQDN), then add the --allow-fqdn-hostname option to cephadm bootstrap on the command line.
Once the bootstrap finishes, you will see the following output from the previous cephadm bootstrap command:
You can access the Ceph CLI with:

        sudo /usr/sbin/cephadm shell --fsid dd77f050-9afe-11ec-a56c-029f8148ea14 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

        ceph telemetry on

For more information see:

        https://docs.ceph.com/docs/pacific/mgr/telemetry/
Verify the status of the Red Hat Ceph Storage cluster deployment using the Ceph CLI client from ceph1:
$ ceph -s
Example output:
  cluster:
    id:     3a801754-e01f-11ec-b7ab-005056838602
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum ceph1,ceph2,ceph4,ceph5,ceph7 (age 4m)
    mgr: ceph1.khuuot(active, since 5m), standbys: ceph4.zotfsp
    osd: 12 osds: 12 up (since 3m), 12 in (since 4m)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    pools:   5 pools, 107 pgs
    objects: 191 objects, 5.3 KiB
    usage:   105 MiB used, 600 GiB / 600 GiB avail
    pgs:     105 active+clean
Note: It may take several minutes for all the services to start.
It is normal to get a global recovery event while you do not have any OSDs configured.
You can use ceph orch ps and ceph orch ls to further check the status of the services.
Verify that all the nodes are part of the cephadm cluster.
$ ceph orch host ls
Example output:
HOST   ADDR          LABELS              STATUS
ceph1  10.0.40.78    _admin osd mon mgr
ceph2  10.0.40.35    osd mon
ceph3  10.0.40.24    osd mds rgw
ceph4  10.0.40.185   osd mon mgr
ceph5  10.0.40.88    osd mon
ceph6  10.0.40.66    osd mds rgw
ceph7  10.0.40.221   mon
Note: You can run Ceph commands directly from the host because ceph1 was configured in the cephadm-ansible inventory as part of the [admin] group. The Ceph admin keys were copied to the host during the cephadm bootstrap process.
Check the current placement of the Ceph Monitor services on the datacenters.
$ ceph orch ps | grep mon | awk '{print $1 " " $2}'
Example output:
mon.ceph1  ceph1
mon.ceph2  ceph2
mon.ceph4  ceph4
mon.ceph5  ceph5
mon.ceph7  ceph7
Check the current placement of the Ceph manager services on the datacenters.
$ ceph orch ps | grep mgr | awk '{print $1 " " $2}'
Example output:
mgr.ceph2.ycgwyz  ceph2
mgr.ceph5.kremtt  ceph5
Check the Ceph OSD crush map layout to ensure that each host has one OSD configured and that its status is UP. Also, double-check that each node is under the correct datacenter bucket, as specified in table 3.1.
$ ceph osd tree
Example output:
ID   CLASS  WEIGHT   TYPE NAME               STATUS  REWEIGHT  PRI-AFF
 -1         0.87900  root default
-16         0.43950      datacenter DC1
-11         0.14650          host ceph1
  2    ssd  0.14650              osd.2           up   1.00000  1.00000
 -3         0.14650          host ceph2
  3    ssd  0.14650              osd.3           up   1.00000  1.00000
-13         0.14650          host ceph3
  4    ssd  0.14650              osd.4           up   1.00000  1.00000
-17         0.43950      datacenter DC2
 -5         0.14650          host ceph4
  0    ssd  0.14650              osd.0           up   1.00000  1.00000
 -9         0.14650          host ceph5
  1    ssd  0.14650              osd.1           up   1.00000  1.00000
 -7         0.14650          host ceph6
  5    ssd  0.14650              osd.5           up   1.00000  1.00000
Create and enable a new RBD block pool.
$ ceph osd pool create rbdpool 32 32
$ ceph osd pool application enable rbdpool rbd
Note: The number 32 at the end of the create command is the number of PGs assigned to this pool. The number of PGs can vary depending on several factors, such as the number of OSDs in the cluster and the expected percentage of the pool that will be used. You can use the following calculator to determine the number of PGs needed: Ceph Placement Groups (PGs) per Pool Calculator.
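As an alternative to sizing PGs by hand, you can let the PG autoscaler manage the pool; a hedged example of enabling it for rbdpool and reviewing the autoscaler's view:
$ ceph osd pool set rbdpool pg_autoscale_mode on
$ ceph osd pool autoscale-status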
Verify that the RBD pool has been created.
$ ceph osd lspools | grep rbdpool
Example output:
3 rbdpool
Verify that the MDS services are active and that one service is located in each datacenter.
$ ceph orch ps | grep mds
Example output:
mds.cephfs.ceph3.cjpbqo  ceph3  running (17m)  117s ago  17m  16.1M  -  16.2.9
mds.cephfs.ceph6.lqmgqt  ceph6  running (17m)  117s ago  17m  16.1M  -  16.2.9
Create the CephFS volume.
$ ceph fs volume create cephfs
Note: The ceph fs volume create command also creates the needed data and metadata CephFS pools. For more information, see Configuring and Mounting Ceph File Systems.
Check the Ceph status to verify how the MDS daemons have been deployed. Ensure that the state is active, where ceph6 is the primary MDS for this filesystem and ceph3 is the secondary MDS.
$ ceph fs status
Example output:
cephfs - 0 clients
======
RANK  STATE            MDS              ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephfs.ceph6.ggjywj  Reqs:    0 /s    10     13     12      0
       POOL           TYPE     USED  AVAIL
cephfs.cephfs.meta  metadata  96.0k   284G
cephfs.cephfs.data    data       0    284G
    STANDBY MDS
cephfs.ceph3.ogcqkl
Verify that RGW services are active.
$ ceph orch ps | grep rgw
Example output:
rgw.objectgw.ceph3.kkmxgb  ceph3  *:8080  running (7m)  3m ago  7m  52.7M  -  16.2.9
rgw.objectgw.ceph6.xmnpah  ceph6  *:8080  running (7m)  3m ago  7m  53.3M  -  16.2.9
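Optionally, you can confirm that each gateway answers on the port defined in the spec (8080); the host names below are the RGW nodes used in this guide:
$ curl -s -o /dev/null -w "%{http_code}\n" http://ceph3:8080    # expect an HTTP response code such as 200
$ curl -s -o /dev/null -w "%{http_code}\n" http://ceph6:8080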