Chapter 3. Administering Ceph Clusters That Run in Containers
This chapter describes basic administration tasks to perform on Ceph clusters that run in containers, such as starting, stopping, and restarting Ceph daemons, viewing their log files, adding and removing Ceph OSDs, replacing OSD drives, and purging clusters deployed by Ansible.
3.1. Starting, Stopping, and Restarting Ceph Daemons That Run in Containers
Use the systemctl command to start, stop, or restart Ceph daemons that run in containers.
Procedure
To start, stop, or restart a Ceph daemon running in a container, run a systemctl command as root composed in the following format:
systemctl action ceph-daemon@ID
Where:
- action is the action to perform; start, stop, or restart
- daemon is the daemon; osd, mon, mds, or rgw
- ID is either:
  - The short host name where the ceph-mon, ceph-mds, or ceph-rgw daemons are running
  - The ID of the ceph-osd daemon if it was deployed with the osd_scenario parameter set to lvm
  - The device name that the ceph-osd daemon uses if it was deployed with the osd_scenario parameter set to collocated or non-collocated
For example, to restart a ceph-osd daemon with the ID osd01:
# systemctl restart ceph-osd@osd01
To start a ceph-mon daemon that runs on the ceph-monitor01 host:
# systemctl start ceph-mon@ceph-monitor01
To stop a ceph-rgw daemon that runs on the ceph-rgw01 host:
# systemctl stop ceph-radosgw@ceph-rgw01
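If an OSD was deployed with the osd_scenario parameter set to collocated or non-collocated, address the unit by the device name instead of an ID. For example, assuming an OSD that uses the /dev/sdb device (the device name here is illustrative):
# systemctl restart ceph-osd@sdb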
Verify that the action was completed successfully.
systemctl status ceph-daemon@ID
For example:
# systemctl status ceph-mon@ceph-monitor01
Additional Resources
- The Running Ceph as a systemd Service section in the Administration Guide for Red Hat Ceph Storage 3.
3.2. Viewing Log Files of Ceph Daemons That Run in Containers
Use the journald
daemon from the container host to view a log file of a Ceph daemon from a container.
Procedure
To view the entire Ceph log file, run a journalctl command as root composed in the following format:
journalctl -u ceph-daemon@ID
Where:
- daemon is the Ceph daemon; osd, mon, or rgw
- ID is either:
  - The short host name where the ceph-mon, ceph-mds, or ceph-rgw daemons are running
  - The ID of the ceph-osd daemon if it was deployed with the osd_scenario parameter set to lvm
  - The device name that the ceph-osd daemon uses if it was deployed with the osd_scenario parameter set to collocated or non-collocated
For example, to view the entire log for the ceph-osd daemon with the ID osd01:
# journalctl -u ceph-osd@osd01
To show only the recent journal entries, use the -f option.
journalctl -fu ceph-daemon@ID
For example, to view only recent journal entries for the ceph-mon daemon that runs on the ceph-monitor01 host:
# journalctl -fu ceph-mon@ceph-monitor01
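You can also limit the output to a time window with the standard journalctl --since and --until options. For example, to view the last hour of entries for the ceph-osd daemon with the ID osd01:
# journalctl -u ceph-osd@osd01 --since "1 hour ago"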
You can also use the sosreport
utility to view the journald
logs. For more details about SOS reports, see the What is a sosreport and how to create one in Red Hat Enterprise Linux 4.6 and later? solution on the Red Hat Customer Portal.
Additional Resources
- The journalctl(1) manual page
3.3. Adding a Ceph OSD using the command-line interface
Here is the high-level workflow for manually adding an OSD to a Red Hat Ceph Storage cluster:
- Install the ceph-osd package and create a new OSD instance
- Prepare and mount the OSD data and journal drives
- Add the new OSD node to the CRUSH map
- Update the owner and group permissions
- Enable and start the ceph-osd daemon
The ceph-disk command is deprecated. The ceph-volume command is now the preferred method for deploying OSDs from the command-line interface. Currently, the ceph-volume command only supports the lvm plugin. Red Hat will provide examples throughout this guide using both commands as a reference, allowing time for storage administrators to convert any custom scripts that rely on ceph-disk to ceph-volume instead.
See the Red Hat Ceph Storage Administration Guide for more information on using the ceph-volume command.
For custom storage cluster names, use the --cluster $CLUSTER_NAME option with the ceph and ceph-osd commands.
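For example, a storage cluster with the hypothetical custom name test-cluster could be queried as follows; the cluster name is an assumption for illustration only:
[root@osd ~]# ceph --cluster test-cluster health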
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Review the Requirements for Installing Red Hat Ceph Storage chapter in the Installation Guide for Red Hat Enterprise Linux or Ubuntu.
- Having root access to the new nodes.
Procedure
Enable the Red Hat Ceph Storage 3 OSD software repository.
Red Hat Enterprise Linux
[root@osd ~]# subscription-manager repos --enable=rhel-7-server-rhceph-3-osd-els-rpms
- Create the /etc/ceph/ directory on the new OSD node.
- On the new OSD node, copy the Ceph administration keyring and configuration files from one of the Ceph Monitor nodes (see the sketch below).
Install the ceph-osd package on the new Ceph OSD node:
Red Hat Enterprise Linux
[root@osd ~]# yum install ceph-osd
Decide if you want to collocate a journal or use a dedicated journal for the new OSDs.
Note: The --filestore option is required.
For OSDs with a collocated journal:
Syntax
[root@osd ~]# docker exec $CONTAINER_ID ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/$DEVICE_NAME
Example
[root@osd ~]# docker exec ceph-osd-osd1 ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/sda
For OSDs with a dedicated journal:
Syntax
[root@osd ~]# docker exec $CONTAINER_ID ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/$DEVICE_NAME /dev/$JOURNAL_DEVICE_NAME
or
[root@osd ~]# docker exec $CONTAINER_ID ceph-volume lvm prepare --filestore --data /dev/$DEVICE_NAME --journal /dev/$JOURNAL_DEVICE_NAME
Examples
[root@osd ~]# docker exec ceph-osd-osd1 ceph-disk --setuser ceph --setgroup ceph prepare --filestore /dev/sda /dev/sdb
[root@osd ~]# docker exec ceph-osd-osd1 ceph-volume lvm prepare --filestore --data /dev/vg00/lvol1 --journal /dev/sdb
Set the noup option:
[root@osd ~]# ceph osd set noup
Activate the new OSD:
Syntax
[root@osd ~]# docker exec $CONTAINER_ID ceph-disk activate /dev/$DEVICE_NAME
or
[root@osd ~]# docker exec $CONTAINER_ID ceph-volume lvm activate --filestore $OSD_ID $OSD_FSID
Example
[root@osd ~]# docker exec ceph-osd-osd1 ceph-disk activate /dev/sda
[root@osd ~]# docker exec ceph-osd-osd1 ceph-volume lvm activate --filestore 0 6cc43680-4f6e-4feb-92ff-9c7ba204120e
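If you do not know the OSD ID and OSD FSID values that ceph-volume lvm activate expects, you can list them from inside the container; this check reuses the ceph-osd-osd1 container name from the examples above:
[root@osd ~]# docker exec ceph-osd-osd1 ceph-volume lvm list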
Add the OSD to the CRUSH map:
Syntax
ceph osd crush add $OSD_ID $WEIGHT [$BUCKET_TYPE=$BUCKET_NAME ...]
Example
[root@osd ~]# ceph osd crush add 4 1 host=node4
Note: If you specify more than one bucket, the command places the OSD into the most specific bucket out of those you specified, and it moves the bucket underneath any other buckets you specified.
Note: You can also edit the CRUSH map manually. See the Editing a CRUSH map section in the Storage Strategies guide for Red Hat Ceph Storage 3.
Important: If you specify only the root bucket, then the OSD attaches directly to the root, but the CRUSH rules expect OSDs to be inside of the host bucket.
Unset the noup option:
[root@osd ~]# ceph osd unset noup
Update the owner and group permissions for the newly created directories:
Syntax
chown -R $OWNER:$GROUP $PATH_TO_DIRECTORY
Example
[root@osd ~]# chown -R ceph:ceph /var/lib/ceph
[root@osd ~]# chown -R ceph:ceph /var/log/ceph
[root@osd ~]# chown -R ceph:ceph /var/run/ceph
[root@osd ~]# chown -R ceph:ceph /etc/ceph
If you use clusters with custom names, then add the following line to the appropriate file:
Red Hat Enterprise Linux
[root@osd ~]# echo "CLUSTER=$CLUSTER_NAME" >> /etc/sysconfig/ceph
Replace $CLUSTER_NAME with the custom cluster name.
To ensure that the new OSD is up and ready to receive data, enable and start the OSD service:
Syntax
systemctl enable ceph-osd@$OSD_ID
systemctl start ceph-osd@$OSD_ID
Example
[root@osd ~]# systemctl enable ceph-osd@4
[root@osd ~]# systemctl start ceph-osd@4
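To confirm that the new OSD is up and placed in the expected location in the CRUSH hierarchy, you can inspect the OSD tree; this verification is a suggested check rather than part of the original procedure:
[root@osd ~]# ceph osd tree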
3.4. Removing a Ceph OSD using the command-line interface
Removing an OSD from a storage cluster involves updating the cluster map, removing its authentication key, removing the OSD from the OSD map, and removing the OSD from the ceph.conf
file. If the node has multiple drives, you might need to remove an OSD for each drive by repeating this procedure.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Enough available OSDs so that the storage cluster is not at its near full ratio.
- Having root access to the OSD node.
Procedure
Disable and stop the OSD service:
Syntax
systemctl disable ceph-osd@$DEVICE_NAME
systemctl stop ceph-osd@$DEVICE_NAME
Example
[root@osd ~]# systemctl disable ceph-osd@sdb
[root@osd ~]# systemctl stop ceph-osd@sdb
Once the OSD is stopped, it is down.
Remove the OSD from the storage cluster:
Syntax
ceph osd out $DEVICE_NAME
Example
[root@osd ~]# ceph osd out sdb
Important: Once the OSD is out, Ceph will start rebalancing and copying data to other OSDs in the storage cluster. Red Hat recommends waiting until the storage cluster becomes active+clean before proceeding to the next step. To observe the data migration, run the following command:
[root@monitor ~]# ceph -w
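If you prefer a one-shot status check over the continuously updating output of ceph -w, you can query the cluster state instead; this is an optional alternative:
[root@monitor ~]# ceph -s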
Remove the OSD from the CRUSH map so that it no longer receives data.
Syntax
ceph osd crush remove $OSD_NAME
Example
[root@osd ~]# ceph osd crush remove osd.4
Note: You can also decompile the CRUSH map, remove the OSD from the device list, remove the device as an item in the host bucket, or remove the host bucket. If it is in the CRUSH map and you intend to remove the host, recompile the map and set it. See the Storage Strategies Guide for details.
Remove the OSD authentication key:
Syntax
ceph auth del osd.$DEVICE_NAME
Example
[root@osd ~]# ceph auth del osd.sdb
Remove the OSD:
Syntax
ceph osd rm $DEVICE_NAME
Example
[root@osd ~]# ceph osd rm sdb
Edit the storage cluster’s configuration file, by default /etc/ceph/ceph.conf, and remove the OSD entry, if it exists:
Example
[osd.4]
host = $HOST_NAME
- Remove the reference to the OSD in the /etc/fstab file, if the OSD was added manually.
Copy the updated configuration file to the /etc/ceph/ directory of all other nodes in the storage cluster.
Syntax
scp /etc/ceph/$CLUSTER_NAME.conf $USER_NAME@$HOST_NAME:/etc/ceph/
Example
[root@osd ~]# scp /etc/ceph/ceph.conf root@node4:/etc/ceph/
3.5. Replacing an OSD drive while retaining the OSD ID
When replacing a failed OSD drive, you can keep the original OSD ID and CRUSH map entry.
The ceph-volume lvm command defaults to BlueStore for OSDs. To use FileStore OSDs, use the --filestore, --data, and --journal options.
See the Preparing the OSD Data and Journal Drives section for more details.
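For reference, a FileStore-style replacement under this workflow might look like the following sketch; the container name, OSD ID, data device, and journal device are carried over or assumed from examples elsewhere in this chapter, so adjust them to your environment:
docker exec ceph-osd-osd1 ceph-volume lvm create --filestore --osd-id 1 --data /dev/sdb --journal /dev/sdc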
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A failed disk.
Procedure
Destroy the OSD:
ceph osd destroy $OSD_ID --yes-i-really-mean-it
Example
$ ceph osd destroy 1 --yes-i-really-mean-it
Optionally, if the replacement disk was used previously, then you need to zap the disk:
docker exec $CONTAINER_ID ceph-volume lvm zap $DEVICE
Example
$ docker exec ceph-osd-osd1 ceph-volume lvm zap /dev/sdb
Create the new OSD with the existing OSD ID:
docker exec $CONTAINER_ID ceph-volume lvm create --osd-id $OSD_ID --data $DEVICE
Example
$ docker exec ceph-osd-osd1 ceph-volume lvm create --osd-id 1 --data /dev/sdb
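To verify that the replacement OSD reuses the original ID and CRUSH position, you can inspect the OSD tree; this is a suggested check, not part of the original procedure:
$ ceph osd tree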
3.6. Purging Clusters Deployed by Ansible
If you no longer want to use a Ceph cluster, use the purge-docker-cluster.yml
playbook to purge the cluster. Purging a cluster is also useful when the installation process failed and you want to start over.
After purging a Ceph cluster, all data on the OSDs is lost.
Prerequisites
- Ensure that the /var/log/ansible.log file is writable (see the example below).
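A quick way to create the log file and make it writable if it does not already exist; $ANSIBLE_USER is a placeholder for the account that runs the playbooks and is an assumption for illustration:
[root@admin ~]# touch /var/log/ansible.log
[root@admin ~]# chown $ANSIBLE_USER:$ANSIBLE_USER /var/log/ansible.log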
Procedure
Use the following commands from the Ansible administration node.
As the root user, navigate to the /usr/share/ceph-ansible/ directory:
[root@admin ~]# cd /usr/share/ceph-ansible
Copy the purge-docker-cluster.yml playbook from the /usr/share/ceph-ansible/infrastructure-playbooks/ directory to the current directory:
[root@admin ceph-ansible]# cp infrastructure-playbooks/purge-docker-cluster.yml .
As the Ansible user, use the purge-docker-cluster.yml playbook to purge the Ceph cluster.
To remove all packages, containers, configuration files, and all the data created by the ceph-ansible playbook:
[user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml
To specify a different inventory file than the default one (/etc/ansible/hosts), use the -i parameter:
ansible-playbook purge-docker-cluster.yml -i inventory-file
Replace inventory-file with the path to the inventory file.
For example:
[user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml -i ~/ansible/hosts
To skip the removal of the Ceph container image, use the --skip-tags="remove_img" option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags="remove_img" purge-docker-cluster.yml
To skip the removal of the packages that were installed during the installation, use the --skip-tags="with_pkg" option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags="with_pkg" purge-docker-cluster.yml