Chapter 2. Storage classes
The OpenShift Data Foundation operator installs a default storage class depending on the platform in use. This default storage class is owned and controlled by the operator and cannot be deleted or modified. However, you can create custom storage classes to use other storage resources or to provide different behavior to applications.
Custom storage classes are not supported for external mode OpenShift Data Foundation clusters.
2.1. Creating storage classes and pools
You can create a storage class using an existing pool or you can create a new pool for the storage class while creating it.
Prerequisites
- Ensure that you are logged into the OpenShift Container Platform web console and that the OpenShift Data Foundation cluster is in Ready state.
Procedure
- Click Storage → StorageClasses.
- Click Create Storage Class.
- Enter the storage class Name and Description.
Reclaim Policy is set to Delete as the default option. Use this setting.
If you change the reclaim policy to Retain in the storage class, the persistent volume (PV) remains in Released state even after the persistent volume claim (PVC) is deleted.
Volume binding mode is set to WaitForFirstConsumer as the default option.
If you choose the Immediate option, the PV is created immediately when the PVC is created.
- Select RBD or CephFS Provisioner as the plugin for provisioning the persistent volumes.
- Choose a Storage system for your workloads.
Select an existing Storage Pool from the list or create a new pool.
Note: The 2-way replication data protection policy is only supported for the non-default RBD pool. 2-way replication can be used by creating an additional pool. To learn about data availability and integrity considerations for replica 2 pools, see the Knowledgebase Customer Solution Article.
- Create new pool
- Click Create New Pool.
- Enter Pool name.
- Choose 2-way-Replication or 3-way-Replication as the Data Protection Policy.
Select Enable compression if you need to compress the data.
Enabling compression can impact application performance and might prove ineffective when data to be written is already compressed or encrypted. Data written before enabling compression will not be compressed.
- Click Create to create the new storage pool.
- Click Finish after the pool is created.
- Optional: Select Enable Encryption checkbox.
- Click Create to create the storage class.
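The console wizard generates the storage class for you, but it can help to know roughly what the result looks like. The following is a minimal sketch of an RBD-backed custom storage class on an internal-mode cluster; the storage class name and pool name are hypothetical, and the exact parameters and secret names generated in your cluster may differ:

$ cat <<EOF | oc apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: custom-rbd-sc                  # hypothetical storage class name
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage         # assumes the default internal-mode namespace
  pool: custom-rbd-pool                # hypothetical pool created in the wizard
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF

In practice, prefer creating the storage class through the console as described above and inspecting the result with oc get storageclass <name> -o yaml rather than hand-writing it.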
2.2. Storage class with single replica
You can create a storage class with a single replica to be used by your applications. This avoids redundant data copies and allows resiliency management at the application level.
Enabling this feature creates a single replica pool without data replication, increasing the risk of data loss, data corruption, and potential system instability if your application does not have its own replication. If any OSDs are lost, this feature requires very disruptive steps to recover. All applications can lose their data, and must be recreated in case of a failed OSD.
Prerequisites
- Make sure to use one new storage device or OSD per failure domain.
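To check how many failure domains your storage nodes span before adding the new devices, you can list the labeled storage nodes together with their zone. This is a quick sanity check, assuming a zone-based failure domain such as the AWS example used later in this section; on other platforms the failure domain label may be a rack or host label instead:

$ oc get nodes -l cluster.ocs.openshift.io/openshift-storage -L topology.kubernetes.io/zone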
Procedure
Enable the single replica feature using the following command:
$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/managedResources/cephNonResilientPools/enable", "value": true }]'

Verify storagecluster is in Ready state:

$ oc get storagecluster

Example output:

NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   10m   Ready              2024-02-05T13:56:15Z   4.17.0

New cephblockpools are created for each failure domain. Verify cephblockpools are in Ready state:

$ oc get cephblockpools

Example output:

NAME                                          PHASE
ocs-storagecluster-cephblockpool              Ready
ocs-storagecluster-cephblockpool-us-east-1a   Ready
ocs-storagecluster-cephblockpool-us-east-1b   Ready
ocs-storagecluster-cephblockpool-us-east-1c   Ready

Verify new storage classes have been created:

$ oc get storageclass

Example output:

NAME                                        PROVISIONER                             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2 (default)                               kubernetes.io/aws-ebs                   Delete          WaitForFirstConsumer   true                   104m
gp2-csi                                     ebs.csi.aws.com                         Delete          WaitForFirstConsumer   true                   104m
gp3-csi                                     ebs.csi.aws.com                         Delete          WaitForFirstConsumer   true                   104m
ocs-storagecluster-ceph-non-resilient-rbd   openshift-storage.rbd.csi.ceph.com      Delete          WaitForFirstConsumer   true                   46m
ocs-storagecluster-ceph-rbd                 openshift-storage.rbd.csi.ceph.com      Delete          Immediate              true                   52m
ocs-storagecluster-cephfs                   openshift-storage.cephfs.csi.ceph.com   Delete          Immediate              true                   52m
openshift-storage.noobaa.io                 openshift-storage.noobaa.io/obc         Delete          Immediate              false                  50m

After the storage class with single replica is enabled, three new OSDs are deployed on the new devices: 3 osd-prepare pods and 3 additional OSD pods appear. Verify that the new OSD pods are in Running state:

$ oc get pods | grep osd

Example output:

rook-ceph-osd-0-6dc76777bc-snhnm                              2/2     Running     0          9m50s
rook-ceph-osd-1-768bdfdc4-h5n7k                               2/2     Running     0          9m48s
rook-ceph-osd-2-69878645c4-bkdlq                              2/2     Running     0          9m37s
rook-ceph-osd-3-64c44d7d76-zfxq9                              2/2     Running     0          5m23s
rook-ceph-osd-4-654445b78f-nsgjb                              2/2     Running     0          5m23s
rook-ceph-osd-5-5775949f57-vz6jp                              2/2     Running     0          5m22s
rook-ceph-osd-prepare-ocs-deviceset-gp2-0-data-0x6t87-59swf   0/1     Completed   0          10m
rook-ceph-osd-prepare-ocs-deviceset-gp2-1-data-0klwr7-bk45t   0/1     Completed   0          10m
rook-ceph-osd-prepare-ocs-deviceset-gp2-2-data-0mk2cz-jx7zv   0/1     Completed   0          10m
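Applications opt in to single replica storage by requesting the new non-resilient storage class in their PVCs. A minimal sketch, using a hypothetical PVC name and namespace:

$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: replica-1-pvc       # hypothetical PVC name
  namespace: my-app         # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-non-resilient-rbd
EOF

Because the storage class uses the WaitForFirstConsumer volume binding mode, the PV is created only when a pod that mounts the PVC is scheduled, and it is placed in that pod's failure domain.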
Disabling single replica
Disabling replica-1 is not a tested and supported scenario. However, if it is necessary to disable and clean up replica-1, perform the following steps:
Delete workloads using the non-resilient (replica-1) storageClass.

$ oc delete deployment replica-1-workload
deployment.apps "replica-1-workload" deleted

Delete all the PVCs using the non-resilient storageClass.

$ oc get pvc --all-namespaces -o custom-columns="NAMESPACE:.metadata.namespace,NAME:.metadata.name,STORAGECLASS:.spec.storageClassName" | grep "ocs-storagecluster-ceph-non-resilient-rbd"
$ oc delete pvc replica-1-pvc
persistentvolumeclaim "replica-1-pvc" deleted

Set the non-resilient pools enable flag to false in the storageCluster CR.

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/managedResources/cephNonResilientPools/enable", "value": false }]'

Remove the non-resilient-rbd storageClass from the storageconsumer CR spec.storageClasses.

$ oc get storageconsumer internal -n openshift-storage -o json | \
  jq 'del(.spec.storageClasses[] | select(.name == "ocs-storagecluster-ceph-non-resilient-rbd"))' | \
  oc replace -f -

Delete the non-resilient-rbd storageClass.

$ oc delete sc ocs-storagecluster-ceph-non-resilient-rbd

Delete the replica-1 cephBlockPools.

$ oc get cephblockpool
NAME                                         PHASE   TYPE         FAILUREDOMAIN   AGE
builtin-mgr                                  Ready   Replicated   zone            136m
ocs-storagecluster-cephblockpool             Ready   Replicated   zone            136m
ocs-storagecluster-cephblockpool-us-east-1   Ready   Replicated   zone            50m
ocs-storagecluster-cephblockpool-us-east-2   Ready   Replicated   zone            50m
ocs-storagecluster-cephblockpool-us-east-3   Ready   Replicated   zone            50m

$ oc delete cephblockpool ocs-storagecluster-cephblockpool-us-east-1 ocs-storagecluster-cephblockpool-us-east-2 ocs-storagecluster-cephblockpool-us-east-3
cephblockpool.ceph.rook.io "ocs-storagecluster-cephblockpool-us-east-1" deleted
cephblockpool.ceph.rook.io "ocs-storagecluster-cephblockpool-us-east-2" deleted
cephblockpool.ceph.rook.io "ocs-storagecluster-cephblockpool-us-east-3" deleted

Remove the replica-1 OSDs.
Identify the replica-1 OSDs with the class name of the failure domain.

sh-5.1$ ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 1.50000 - 1.5 TiB 4.8 GiB 4.4 GiB 32 KiB 395 MiB 1.5 TiB 0.31 1.00 - root default
-5 1.50000 - 1.5 TiB 4.8 GiB 4.4 GiB 32 KiB 395 MiB 1.5 TiB 0.31 1.00 - region us-east
-14 0.50000 - 512 GiB 1.7 GiB 1.5 GiB 12 KiB 176 MiB 510 GiB 0.32 1.03 - zone us-east-1
-13 0.25000 - 256 GiB 1.6 GiB 1.5 GiB 11 KiB 75 MiB 254 GiB 0.61 1.93 - host ocs-deviceset-2-data-0kp6fl
2 ssd 0.25000 1.00000 256 GiB 1.6 GiB 1.5 GiB 11 KiB 75 MiB 254 GiB 0.61 1.93 353 up osd.2
-46 0.25000 - 256 GiB 104 MiB 2.7 MiB 1 KiB 101 MiB 256 GiB 0.04 0.13 - host us-east-1-data-0c2xxv
3 us-east-1 0.25000 1.00000 256 GiB 104 MiB 2.7 MiB 1 KiB 101 MiB 256 GiB 0.04 0.13 0 up osd.3
-4 0.50000 - 512 GiB 1.6 GiB 1.5 GiB 8 KiB 101 MiB 510 GiB 0.31 0.98 - zone us-east-2
-3 0.25000 - 256 GiB 1.6 GiB 1.5 GiB 7 KiB 74 MiB 254 GiB 0.61 1.93 - host ocs-deviceset-1-data-0tllsb
0 ssd 0.25000 1.00000 256 GiB 1.6 GiB 1.5 GiB 7 KiB 74 MiB 254 GiB 0.61 1.93 353 up osd.0
-51 0.25000 - 256 GiB 29 MiB 2.7 MiB 1 KiB 26 MiB 256 GiB 0.01 0.04 - host us-east-2-data-068z76
5 us-east-2 0.25000 1.00000 256 GiB 29 MiB 2.7 MiB 1 KiB 26 MiB 256 GiB 0.01 0.04 0 up osd.5
-10 0.50000 - 512 GiB 1.6 GiB 1.5 GiB 12 KiB 118 MiB 510 GiB 0.31 0.99 - zone us-east-3
-9 0.25000 - 256 GiB 1.6 GiB 1.5 GiB 11 KiB 92 MiB 254 GiB 0.61 1.95 - host ocs-deviceset-0-data-06fhxp
1 ssd 0.25000 1.00000 256 GiB 1.6 GiB 1.5 GiB 11 KiB 92 MiB 254 GiB 0.61 1.95 353 up osd.1
-41 0.25000 - 256 GiB 29 MiB 2.7 MiB 1 KiB 26 MiB 256 GiB 0.01 0.04 - host us-east-3-data-04wvc8
4 us-east-3 0.25000 1.00000 256 GiB 29 MiB 2.7 MiB 1 KiB 26 MiB 256 GiB 0.01 0.04 0 up osd.4
TOTAL 1.5 TiB 4.8 GiB 4.4 GiB 36 KiB 395 MiB 1.5 TiB 0.31
MIN/MAX VAR: 0.04/1.95 STDDEV: 0.29

In this example, OSD IDs 3, 5, and 4 are the replica-1 OSDs that need to be removed.
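If the tree is large, you can filter the replica-1 OSD IDs directly, because their device class is the failure domain name rather than a real device class such as ssd. The following one-liner is a convenience sketch that assumes failure domain names beginning with us-east-, as in the example above; adjust the pattern to your own failure domain names:

sh-5.1$ ceph osd df tree | awk '$2 ~ /^us-east-/ {print $1}'

In this example it would print 3, 5, and 4.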
Force remove the identified replica-1 OSDs.
Perform the removal of the OSDs by following the appropriate procedure for your environment.
Repeat the following steps for all OSD IDs that you want to remove.
$ osd_id_to_remove=3
$ oc scale -n openshift-storage deployment rook-ceph-osd-${osd_id_to_remove} --replicas=0
deployment.apps/rook-ceph-osd-3 scaled
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} FORCE_OSD_REMOVAL=true | oc create -n openshift-storage -f -
job.batch/ocs-osd-removal-job created
$ oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
2025-06-20 10:33:13.153093 I | cephosd: completed removal of OSD 3
$ oc delete -n openshift-storage job ocs-osd-removal-job
job.batch "ocs-osd-removal-job" deleted

Check the OSD tree.
sh-5.1$ ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 0.75000 - 768 GiB 5.0 GiB 4.8 GiB 29 KiB 227 MiB 763 GiB 0.65 1.00 - root default
-5 0.75000 - 768 GiB 5.0 GiB 4.8 GiB 29 KiB 227 MiB 763 GiB 0.65 1.00 - region us-east
-14 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 11 KiB 63 MiB 254 GiB 0.65 0.99 - zone us-east-1
-13 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 11 KiB 63 MiB 254 GiB 0.65 0.99 - host ocs-deviceset-2-data-0kp6fl
2 ssd 0.25000 1.00000 256 GiB 1.7 GiB 1.6 GiB 11 KiB 63 MiB 254 GiB 0.65 0.99 353 up osd.2
-4 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 7 KiB 80 MiB 254 GiB 0.65 1.00 - zone us-east-2
-3 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 7 KiB 80 MiB 254 GiB 0.65 1.00 - host ocs-deviceset-1-data-0tllsb
0 ssd 0.25000 1.00000 256 GiB 1.7 GiB 1.6 GiB 7 KiB 80 MiB 254 GiB 0.65 1.00 353 up osd.0
-10 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 11 KiB 84 MiB 254 GiB 0.65 1.00 - zone us-east-3
-9 0.25000 - 256 GiB 1.7 GiB 1.6 GiB 11 KiB 84 MiB 254 GiB 0.65 1.00 - host ocs-deviceset-0-data-06fhxp
1 ssd 0.25000 1.00000 256 GiB 1.7 GiB 1.6 GiB 11 KiB 84 MiB 254 GiB 0.65 1.00 353 up osd.1
TOTAL 768 GiB 5.0 GiB 4.8 GiB 31 KiB 227 MiB 763 GiB 0.65
MIN/MAX VAR: 0.99/1.00 STDDEV: 0.00
2.2.1. Recovering after OSD loss from single replica
When using replica 1, a storage class with a single replica, data loss is guaranteed when an OSD is lost. Lost OSDs go into a failing state. Use the following steps to recover after OSD loss.
Procedure
Follow these recovery steps to get your applications running again after data loss from replica 1. You first need to identify the domain where the failing OSD is.
If you know which failure domain the failing OSD is in, run the following command to get the exact replica1-pool-name required for the next steps. If you do not know where the failing OSD is, skip to step 2.

$ oc get cephblockpools

Example output:

NAME                                          PHASE
ocs-storagecluster-cephblockpool              Ready
ocs-storagecluster-cephblockpool-us-south-1   Ready
ocs-storagecluster-cephblockpool-us-south-2   Ready
ocs-storagecluster-cephblockpool-us-south-3   Ready

Copy the corresponding failure domain name for use in the next steps, then skip to step 4.
Find the OSD pod that is in Error state or CrashLoopBackOff state to find the failing OSD:

$ oc get pods -n openshift-storage -l app=rook-ceph-osd | grep 'CrashLoopBackOff\|Error'

Identify the replica-1 pool that had the failed OSD.
Identify the node where the failed OSD was running:

failed_osd_id=0 # replace with the ID of the failed OSD

Identify the failureDomainLabel for the node where the failed OSD was running:

failure_domain_label=$(oc get storageclass ocs-storagecluster-ceph-non-resilient-rbd -o yaml | grep domainLabel | head -1 | awk -F':' '{print $2}')
failure_domain_value=$(oc get pods $failed_osd_id -oyaml | grep topology-location-zone | awk '{print $2}')

The output shows the replica-1 pool name whose OSD is failing, for example:

replica1-pool-name = "ocs-storagecluster-cephblockpool-$failure_domain_value"

where $failure_domain_value is the failureDomainName.
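As a small sanity check (not part of the official procedure), you can print the composed pool name and compare it against the output of oc get cephblockpools before deleting anything:

echo "ocs-storagecluster-cephblockpool-${failure_domain_value}"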
Delete the replica-1 pool.
Connect to the toolbox pod:
toolbox=$(oc get pod -l app=rook-ceph-tools -n openshift-storage -o jsonpath='{.items[*].metadata.name}')
oc rsh $toolbox -n openshift-storage

Delete the replica-1 pool. Note that you have to enter the replica-1 pool name twice in the command, for example:

ceph osd pool rm <replica1-pool-name> <replica1-pool-name> --yes-i-really-really-mean-it

Replace <replica1-pool-name> with the name of the replica-1 pool identified earlier.
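Optionally, while still inside the toolbox pod, confirm that the pool is gone before moving on; if the pool was deleted, the following returns no output:

ceph osd pool ls | grep <replica1-pool-name>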
- Purge the failing OSD by following the steps in section "Replacing operational or failed storage devices" based on your platform in the Replacing devices guide.
Restart the rook-ceph operator:
$ oc delete pod -l app=rook-ceph-operator -n openshift-storage

- Recreate any affected applications in that availability zone to start using the new pool with the same name.
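How you recreate an affected application depends on how it was deployed. As an illustrative sketch only, with a hypothetical deployment, PVC, namespace, and manifest file, and assuming the old PVC must be recreated because its data was lost:

$ oc delete pvc replica-1-pvc -n my-app                        # hypothetical PVC and namespace
$ oc apply -f replica-1-pvc.yaml -n my-app                     # recreate the PVC against the same storage class
$ oc rollout restart deployment/replica-1-workload -n my-app   # hypothetical deployment name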