Deploying OpenShift Container Storage using bare metal infrastructure


Red Hat OpenShift Container Storage 4.8

How to install and set up your bare metal environment

Red Hat Storage Documentation Team

Abstract

Read this document for instructions on installing Red Hat OpenShift Container Storage 4.8 to use local storage on bare metal infrastructure.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.

Providing feedback on Red Hat documentation

We appreciate your input on our documentation. Do let us know how we can make it better. To give feedback:

  • For simple comments on specific passages:

    1. Make sure you are viewing the documentation in the Multi-page HTML format. In addition, ensure you see the Feedback button in the upper right corner of the document.
    2. Use your mouse cursor to highlight the part of text that you want to comment on.
    3. Click the Add Feedback pop-up that appears below the highlighted text.
    4. Follow the displayed instructions.
  • For submitting more complex feedback, create a Bugzilla ticket:

    1. Go to the Bugzilla website.
    2. In the Component section, choose documentation.
    3. Fill in the Description field with your suggestion for improvement. Include a link to the relevant part(s) of documentation.
    4. Click Submit Bug.

Preface

Red Hat OpenShift Container Storage 4.8 supports deployment on existing Red Hat OpenShift Container Platform (RHOCP) bare metal clusters in connected or disconnected environments along with out-of-the-box support for proxy environments.

Both internal and external Openshift Container Storage clusters are supported on bare metal. See Planning your deployment and Preparing to deploy OpenShift Container Storage for more information about deployment requirements.

To deploy OpenShift Container Storage, follow the appropriate deployment process for your environment:

Chapter 1. Preparing to deploy OpenShift Container Storage

When you deploy OpenShift Container Storage on OpenShift Container Platform using local storage devices, it provides you with the option to create internal cluster resources. This results in the internal provisioning of the base services, which helps to make additional storage classes available to applications.

Before you begin the deployment of Red Hat OpenShift Container Storage using local storage, ensure that your resource requirements are met. For more information, see requirements for installing OpenShift Container Storage using local storage devices.

After you have addressed the above, follow the sections in the order given:

1.1. Requirements for installing OpenShift Container Storage using local storage devices

Node requirements

The cluster must consist of at least three OpenShift Container Platform worker nodes with locally attached-storage devices on each of them.

  • Each of the three selected nodes must have at least one raw block device available to be used by OpenShift Container Storage.
  • The devices you use must be empty; the disks must not include physical volumes (PVs), volume groups (VGs), or logical volumes (LVs) remaining on the disk.

For more information, see the Resource requirements section in the Planning guide.

Arbiter stretch cluster requirements [Technology Preview]

In this case, a single cluster is stretched across two zones with a third zone as the location for the arbiter. This is a technology preview feature that is currently intended for deployment in the OpenShift Container Platform on-premises.

For detailed requirements and instructions, see Configuring OpenShift Container Storage for Metro-DR stretch cluster.

Note

You cannot enable both flexible scaling and arbiter as they have conflicting scaling logic. With flexing scaling, you can add one node at a time to the OpenShift Container Storage cluster. Whereas, in an arbiter cluster, you need to add at least one node in each of the 2 data zones.

Compact mode requirements

OpenShift Container Storage can be installed on a three-node OpenShift compact bare metal cluster, where all the workloads run on three strong master nodes. There are no worker or storage nodes.

To configure OpenShift Container Platform in compact mode, see Configuring a three-node cluster and Delivering a Three-node Architecture for Edge Deployments.

Minimum starting node requirements [Technology Preview]

An OpenShift Container Storage cluster is deployed with minimum configuration when the standard deployment resource requirement is not met.

For more information, see Resource requirements section in the Planning guide.

1.2. Enabling file system access for containers on Red Hat Enterprise Linux based nodes

Deploying OpenShift Container Storage on an OpenShift Container Platform with worker nodes on a Red Hat Enterprise Linux base in a user provisioned infrastructure (UPI) does not automatically provide container access to the underlying Ceph file system.

Note

Skip this section for hosts based on Red Hat Enterprise Linux CoreOS (RHCOS).

Procedure

  1. Log in to the Red Hat Enterprise Linux based node and open a terminal.
  2. For each node in your cluster:

    1. Verify that the node has access to the rhel-7-server-extras-rpms repository.

      # subscription-manager repos --list-enabled | grep rhel-7-server

      If you do not see both rhel-7-server-rpms and rhel-7-server-extras-rpms in the output, or if there is no output, run the following commands to enable each repository.

      # subscription-manager repos --enable=rhel-7-server-rpms
      # subscription-manager repos --enable=rhel-7-server-extras-rpms
    2. Install the required packages.

      # yum install -y policycoreutils container-selinux
    3. Persistently enable container use of the Ceph file system in SELinux.

      # setsebool -P container_use_cephfs on

1.3. Enabling key value backend path and policy in Vault

Prerequisites

  • Administrator access to Vault.
  • Choose a unique path name as the backend path that follows the naming convention since it cannot be changed later.

Procedure

  1. Enable the Key/Value (KV) backend path in Vault.

    For Vault KV secret engine API, version 1:

    $ vault secrets enable -path=ocs kv

    For Vault KV secret engine API, version 2:

    $ vault secrets enable -path=ocs kv-v2
  2. Create a policy to restrict users to perform a write or delete operation on the secret using the following commands:

    echo '
    path "ocs/*" {
      capabilities = ["create", "read", "update", "delete", "list"]
    }
    path "sys/mounts" {
    capabilities = ["read"]
    }'| vault policy write ocs -
  3. Create a token matching the above policy:

    $ vault token create -policy=ocs -format json

Chapter 2. Deploying OpenShift Container Storage using local storage devices

To deploy OpenShift Container Storage using local storage devices on a bare metal infrastructure, follow the below sections:

2.1. Installing Local Storage Operator

You can install the Local Storage Operator using the Red Hat OpenShift Container Platform Operator Hub.

Prerequisite

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin and Operator installation permissions.

Procedure

  1. Log in to the OpenShift Web Console.
  2. Click Operators → OperatorHub.
  3. Type local storage in the Filter by keyword…​ box to search for Local Storage operator from the list of operators and click on it.
  4. Click Install.
  5. Set the following options on the Install Operator page:

    1. Channel as stable-4.8.
    2. Installation Mode as A specific namespace on the cluster.
    3. Installed Namespace as Operator recommended namespace openshift-local-storage.
    4. Approval Strategy as Automatic.
  6. Click Install.

Verification step

  • Verify that the Local Storage Operator shows the Status as Succeeded.

2.2. Installing Red Hat OpenShift Container Storage Operator

You can install Red Hat OpenShift Container Storage Operator using the Red Hat OpenShift Container Platform Operator Hub.

Prerequisites

  • Access to an OpenShift Container Platform cluster using an account with cluster-admin and operator installation permissions.
  • You have at least three worker nodes in the Red Hat OpenShift Container Platform cluster.
  • You have satisfied any additional requirements required. For more information, see Planning your deployment.
Note
  • When you need to override the cluster-wide default node selector for OpenShift Container Storage, you can use the following command to specify a blank node selector for the openshift-storage namespace (create openshift-storage namespace in this case):

    $ oc annotate namespace openshift-storage openshift.io/node-selector=
  • Taint a node as infra to ensure only Red Hat OpenShift Container Storage resources are scheduled on that node. This helps you save on subscription costs. For more information, see How to use dedicated worker nodes for Red Hat OpenShift Container Storage chapter in Managing and Allocating Storage Resources guide.

Procedure

  1. Log in to OpenShift Web Console.
  2. Click OperatorsOperatorHub.
  3. Search for OpenShift Container Storage from the list of operators and click on it.
  4. Click Install.
  5. Set the following options on the Install Operator page:

    1. Channel as stable-4.8.
    2. Installation Mode as A specific namespace on the cluster.
    3. Installed Namespace as Operator recommended namespace openshift-storage. If Namespace openshift-storage does not exist, it will be created during the operator installation.
    4. Approval Strategy as Automatic or Manual.
    5. Click Install.

      If you select Automatic updates, the Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your operator without any intervention.

      If you select Manual updates, the OLM creates an update request. As a cluster administrator, you must then manually approve that update request to have the operator updated to the new version.

Verification step

  • Verify that the OpenShift Container Storage Operator shows a green tick indicating successful installation.

2.3. Creating Multus networks

OpenShift Container Platform uses the Multus CNI plug-in to allow chaining of CNI plug-ins. During cluster installation, you can configure your default pod network. The default network handles all ordinary network traffic for the cluster. You can define an additional network based on the available CNI plug-ins and attach one or more of these networks to your pods. To attach additional network interfaces to a pod, you must create configurations that define how the interfaces are attached. You can specify each interface by using a NetworkAttachmentDefinition custom resource (CR). A CNI configuration inside each of the NetworkAttachmentDefinition defines how that interface is created.

OpenShift Container Storage uses the CNI plug-in called macvlan. Creating a macvlan-based additional network allows pods on a host to communicate with other hosts and pods on those hosts by using a physical network interface. Each pod that is attached to a macvlan-based additional network is provided a unique MAC address.

2.3.1. Creating network attachment definitions

To utilize Multus, an already working cluster with the correct networking configuration is required. For more information, see Recommended network configuration and requirements for a Multus configuration. The NetworkAttachmentDefinition (NAD) created now is later available to be selected during the storage cluster installation. This is the reason they must be created before the storage cluster.

As detailed in the Planning Guide, the Multus networks you create depend on the number of available network interfaces you have for OpenShift Container Storage traffic. It is possible to separate all of the storage traffic onto one of two interfaces (one interface used for default OpenShift SDN) or to further segregate storage traffic into client storage traffic (public) and storage replication traffic (private or cluster).

The following is an example NetworkAttachmentDefinition for all storage traffic, public and cluster, on the same interface. It requires one additional interface on all schedulable nodes (OpenShift default SDN on separate network interface).

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ocs-public-cluster
  namespace: openshift-storage
spec:
  config: '{
  	"cniVersion": "0.3.1",
  	"type": "macvlan",
  	"master": "ens2",
  	"mode": "bridge",
  	"ipam": {
    	    "type": "whereabouts",
    	    "range": "192.168.1.0/24"
  	}
  }'
Note

All network interface names must be the same on all nodes attached to Multus network (that is, ens2 for ocs-public-cluster).

The following is an example NetworkAttachmentDefinitions for storage traffic on separate Multus networks, public for client storage traffic and cluster for replication traffic. It requires two additional interfaces on OpenShift nodes hosting OSD pods and one additional interface on all other schedulable nodes (OpenShift default SDN on separate network interface).

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ocs-public
  namespace: openshift-storage
spec:
  config: '{
  	"cniVersion": "0.3.1",
  	"type": "macvlan",
  	"master": "ens2",
  	"mode": "bridge",
  	"ipam": {
    	    "type": "whereabouts",
    	    "range": "192.168.1.0/24"
  	}
  }'

Example NetworkAttachmentDefinition:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ocs-cluster
  namespace: openshift-storage
spec:
  config: '{
  	"cniVersion": "0.3.1",
  	"type": "macvlan",
  	"master": "ens3",
  	"mode": "bridge",
  	"ipam": {
    	    "type": "whereabouts",
    	    "range": "192.168.2.0/24"
  	}
  }'
Note

All network interface names must be the same on all nodes attached to Multus networks (that is, ens2 for ocs-public and ens3 for ocs-cluster).

2.4. Creating OpenShift Container Storage cluster on bare metal

Prerequisites

Procedure

  1. Log into the OpenShift Web Console.
  2. Click Operators → Installed Operators to view all the installed operators.
  3. Ensure that the Project selected is openshift-storage.
  4. Click OpenShift Container StorageCreate Instance link of Storage Cluster.
  5. Select mode as Internal-Attached devices.

    You are prompted to install the Local Storage Operator if it is not already installed. Click Install and follow the procedure described in Installing Local Storage Operator.

    You can create a dedicated storage class to consume storage by filtering set of storage volumes.

  6. Discover disks by using one of the following options:

    • All nodes to discover disks in all the nodes.
    • Select nodes to discover disks from a subset of available nodes.

      Important

      For using arbiter mode, do not select the All nodes option. Instead, use the Select nodes option to select the labeled nodes with attached storage devices from the two data-center zones.

      If the nodes to be selected are tainted and not discovered in the wizard, follow the steps in the Red Hat Knowledgebase Solution as a workaround to add tolerations for Local Storage Operator resources.

      If the nodes selected do not match the OpenShift Container Storage cluster requirement of an aggregated 30 CPUs and 72 GiB of RAM, a minimal cluster is deployed. For minimum starting node requirements, see Resource requirements section in the Planning guide.

  7. Click Next.
  8. Create Storage class

    1. Enter the Local Volume Set Name.
    2. Enter the Storage Class Name. By default, the volume set name appears for the storage class name. You can also change the name.
    3. The nodes selected for disk discovery in the previous step are displayed in the Filter Disks By section.

      Select one of the following:

      • Disks on all nodes to select all the nodes for which you discovered the devices.
      • Disks on selected nodes to select a subset of the nodes for which you discovered the devices. Spread the worker nodes across three different physical nodes, racks or failure domains for high availability.

        Important

        The flexible scaling feature gets enabled on creating a storage cluster with 3 or more nodes spread across fewer than the minimum requirement of 3 availability zones. This feature is available only for the new deployments of OpenShift Container Storage 4.7 clusters and does not support the upgraded clusters. For information about flexible scaling, see Scaling Storage Guide

    4. From the available list of Disk Type, select SSD/NVMe.
    5. Expand the Advanced section and set the following options:

      Volume Mode

      Block is selected by default.

      Device Type

      Select disk types. By default, Disk and Part are selected.

      Disk Size

      Minimum and maximum available size of the device that needs to be included.

      Note

      You must set a minimum size of 100GB for the device.

      Maximum Disk Limit

      This indicates the maximum number of PVs that can be created on a node. If this field is left empty, then PVs are created for all the available disks on the matching nodes.

    6. Click Next. A pop-up to confirm creation of the new storage class is displayed.
    7. Click Yes to continue.
  9. Set Capacity and nodes

    1. Select Storage Class. By default, the new storage class created in the previous step is selected.

      • Selected Nodes shows the nodes selected in the previous step. This list takes a few minutes to reflect the disks that were discovered in the previous step.
    2. Click Next.
  10. Optional: Set Security and network configuration.

    1. Select the Enable encryption checkbox to encrypt block and file storage.
    2. Select any one or both Encryption level:

      • Cluster-wide encryption to encrypt the entire cluster (block and file).
      • Storage class encryption to create encrypted persistent volume (block only) using encryption enabled storage class.
    3. Select the Connect to an external key management service checkbox. This is optional for cluster-wide encryption.

      1. Key Management Service Provider is set to Vault by default.
      2. Enter Vault Service Name, host Address of Vault server (https://<hostname or ip>), Port number and Token.
      3. Expand Advanced Settings to enter additional settings and certificate details based on your Vault configuration:

        1. Enter the Key Value secret path in the Backend Path that is dedicated and unique to OpenShift Container Storage.
        2. Optional: Enter TLS Server Name and Vault Enterprise Namespace.
        3. Provide CA Certificate, Client Certificate and Client Private Key by uploading the respective PEM encoded certificate file.
        4. Click Save.
  11. Select Default (SDN) if you are using a single network or Custom (Multus) Network if you plan on using multiple network interfaces.

    1. Select a Public Network Interface from drop down.
    2. Select a Cluster Network Interface from drop down.

      Note

      If you are using only one additional network interface, select the single NetworkAttachementDefinition (for example, ocs-public-cluster) for the Public Network Interface and leave the Cluster Network Interface blank.

  12. Click Next.
  13. Review the configuration details. To modify any configuration settings, click Back and you are redirected to the previous configuration page.
  14. Click Create.
  15. Edit the configmap if Vault Key/Value (KV) secret engine API, version 2 is used for cluster-wide encryption with Key Management System (KMS).

    1. On the OpenShift Web Console, click Workloads → ConfigMaps.
    2. To view the KMS connection details, click ocs-kms-connection-details.
    3. Edit the configmap.

      1. Click Action menu (⋮) → Edit ConfigMap.
      2. Set the VAULT_BACKEND parameter to v2.

        kind: ConfigMap
        apiVersion: v1
        metadata:
          name: ocs-kms-connection-details
        [...]
        data:
          KMS_PROVIDER: vault
          KMS_SERVICE_NAME: vault
        [...]
          VAULT_BACKEND: v2
        [...]
      3. Click Save.

Verification steps

  • To verify that the final Status of the installed storage cluster shows as Phase: Ready with a green tick mark.

    • Click OperatorsInstalled OperatorsStorage Cluster link to view the storage cluster installation status.
    • Alternatively, when you are on the Operator Details tab, you can click on the Storage Cluster tab to view the status.
  • To verify if flexible scaling is enabled on your storage cluster, perform the following steps (for arbiter mode, flexible scaling is disabled):

    1. Click ocs-storagecluster in the Storage Cluster tab.
    2. In the YAML tab, search for the keys flexibleScaling in spec section and failureDomain in status section. If flexible scaling is true and failureDomain is set to host, the flexible scaling feature is enabled.

      spec:
      flexibleScaling: true
      […]
      status:
      failureDomain: host
  • To verify that all components for OpenShift Container Storage are successfully installed, see Verifying your OpenShift Container Storage installation.
  • To verify that OpenShift Container Storage is successfully installed, see Verifying your OpenShift Container Storage installation.

Additional resources

  • To expand the capacity of the initial cluster, see the Scaling Storage guide.

Chapter 3. Verifying OpenShift Container Storage deployment

To verify that OpenShift Container storage for internal mode has deployed successfully, follow the below sections:

3.1. Verifying the state of the pods

To verify that the pods of OpenShift Containers Storage are in running state, follow the below procedure:

Procedure

  1. Log in to OpenShift Web Console.
  2. Click Workloads → Pods from the left pane of the OpenShift Web Console.
  3. Select openshift-storage from the Project drop down list.

    For more information on the expected number of pods for each component and how it varies depending on the number of nodes, see Table 3.1, “Pods corresponding to OpenShift Container storage cluster”.

  4. Click on the Running and Completed tabs to verify that the pods are running and in a completed state:

    Table 3.1. Pods corresponding to OpenShift Container storage cluster
    ComponentCorresponding pods

    OpenShift Container Storage Operator

    • ocs-operator-* (1 pod on any worker node)
    • ocs-metrics-exporter-*

    Rook-ceph Operator

    rook-ceph-operator-*

    (1 pod on any worker node)

    Multicloud Object Gateway

    • noobaa-operator-* (1 pod on any worker node)
    • noobaa-core-* (1 pod on any storage node)
    • noobaa-db-pg-* (1 pod on any storage node)
    • noobaa-endpoint-* (1 pod on any storage node)

    MON

    rook-ceph-mon-*

    (3 pods distributed across storage nodes)

    MGR

    rook-ceph-mgr-*

    (1 pod on any storage node)

    MDS

    rook-ceph-mds-ocs-storagecluster-cephfilesystem-*

    (2 pods distributed across storage nodes)

    RGW

    rook-ceph-rgw-ocs-storagecluster-cephobjectstore-* (1 pod on any storage node)

    CSI

    • cephfs

      • csi-cephfsplugin-* (1 pod on each worker node)
      • csi-cephfsplugin-provisioner-* (2 pods distributed across worker nodes)
    • rbd

      • csi-rbdplugin-* (1 pod on each worker node)
      • csi-rbdplugin-provisioner-* (2 pods distributed across worker nodes)

    rook-ceph-crashcollector

    rook-ceph-crashcollector-*

    (1 pod on each storage node)

    OSD

    • rook-ceph-osd-* (1 pod for each device)
    • rook-ceph-osd-prepare-ocs-deviceset-* (1 pod for each device)

3.2. Verifying the OpenShift Container Storage cluster is healthy

To verify that the cluster of OpenShift Container Storage is healthy, follow the steps in the procedure.

Procedure

  1. Click Storage → Overview and click the Block and File tab.
  2. In the Status card, verify that Storage Cluster and Data Resiliency has a green tick mark.
  3. In the Details card, verify that the cluster information is displayed.

For more information on the health of the OpenShift Container Storage clusters using the Block and File dashboard, see Monitoring OpenShift Container Storage.

3.3. Verifying the Multicloud Object Gateway is healthy

To verify that the OpenShift Container Storage Multicloud Object Gateway is healthy, follow the steps in the procedure.

Procedure

  1. Click Storage → Overview from the OpenShift Web Console and click the Object tab.
  2. In the Status card, verify that both Object Service and Data Resiliency are in Ready state (green tick).
  3. In the Details card, verify that the Multicloud Object Gateway information is displayed.

For more information on the health of the OpenShift Container Storage cluster using the object service dashboard, see Monitoring OpenShift Container Storage.

3.4. Verifying that the OpenShift Container Storage specific storage classes exist

To verify the storage classes exists in the cluster, follow the steps in the procedure.

Procedure

  1. Click Storage → Storage Classes from the OpenShift Web Console.
  2. Verify that the following storage classes are created with the OpenShift Container Storage cluster creation:

    • ocs-storagecluster-ceph-rbd
    • ocs-storagecluster-cephfs
    • openshift-storage.noobaa.io
    • ocs-storagecluster-ceph-rgw

3.5. Verifying the Multus networking

To determine if Multus is working in the cluster, verify the Multus networking.

Procedure

  1. Based on your network configuration choices, the OpenShift Container Storage operator does one of the following:

    • If only a single NetworkAttachmentDefinition (for example, ocs-public-cluster) is selected for the Public Network Interface, the traffic between the application pods and the OpenShift Container Storage cluster happens on this network. Additionally, the cluster is also self configured to use this network for the replication and rebalancing traffic between OSDs.
    • If both NetworkAttachmentDefinitions (for example, ocs-public and ocs-cluster) are selected for the Public Network Interface and the Cluster Network Interface respectively during the storage cluster installation, then client storage traffic is on the public network and cluster network is for the replication and rebalancing traffic between OSDs.
  2. To verify the network configuration is correct, follow the steps:

    1. In the OpenShift console, click Installed OperatorsStorage Clusterocs-storagecluster.
    2. In the YAML tab, search for network in the spec section and ensure the configuration is correct for your network interface choices. This example is for separating the client storage traffic from the storage replication traffic.

      Sample output:

      [..]
      spec:
          [..]
          network:
          provider: multus
          selectors:
            cluster: openshift-storage/ocs-cluster
            public: openshift-storage/ocs-public
          [..]
  3. To verify the network configuration is correct using the command line interface, run the following commands:

    $ oc get storagecluster ocs-storagecluster \
    -n openshift-storage \
    -o=jsonpath='{.spec.network}{"\n"}'

    Sample output:

    {"provider":"multus","selectors":{"cluster":"openshift-storage/ocs-cluster","public":"openshift-storage/ocs-public"}}
  4. Confirm the OSD pods are using correct network:

    1. In the openshift-storage namespace use one of the OSD pods to verify the pod has connectivity to the correct networks. This example is for separating the client storage traffic from the storage replication traffic.

      Note

      Only the OSD pods connects to both Multus public and cluster networks if both are created. All other OCS pods connect to the Multus public network.

      $ oc get -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-osd | grep 'osd-0') -o=jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}'

      Sample output:

      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.129.2.30"
          ],
          "default": true,
          "dns": {}
      },{
          "name": "openshift-storage/ocs-cluster",
          "interface": "net1",
          "ips": [
              "192.168.2.1"
          ],
          "mac": "e2:04:c6:81:52:f1",
          "dns": {}
      },{
          "name": "openshift-storage/ocs-public",
          "interface": "net2",
          "ips": [
              "192.168.1.1"
          ],
          "mac": "ee:a0:b6:a4:07:94",
          "dns": {}
      }]
  5. To confirm the OSD pods are using correct network using the command line interface, run the following command (requires the jq utility):

    $ oc get -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-osd | grep 'osd-0') -o=jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}' | jq -r '.[].name'

    Sample output:

    openshift-sdn
    openshift-storage/ocs-cluster
    openshift-storage/ocs-public

Chapter 4. Uninstalling OpenShift Container Storage

To uninstall and remove OpenShift Container storage, follow the below sections:

4.1. Uninstalling OpenShift Container Storage in Internal mode

Use the steps in this section to uninstall OpenShift Container Storage.

Uninstall Annotations

Annotations on the Storage Cluster are used to change the behavior of the uninstall process. To define the uninstall behavior, the following two annotations have been introduced in the storage cluster:

  • uninstall.ocs.openshift.io/cleanup-policy: delete
  • uninstall.ocs.openshift.io/mode: graceful

The below table provides information on the different values that can used with these annotations:

Table 4.1. uninstall.ocs.openshift.io uninstall annotations descriptions
AnnotationValueDefaultBehavior

cleanup-policy

delete

Yes

Rook cleans up the physical drives and the DataDirHostPath

cleanup-policy

retain

No

Rook does not clean up the physical drives and the DataDirHostPath

mode

graceful

Yes

Rook and NooBaa pauses the uninstall process until the PVCs and the OBCs are removed by the administrator/user

mode

forced

No

Rook and NooBaa proceeds with uninstall even if PVCs/OBCs provisioned using Rook and NooBaa exist respectively.

You can change the cleanup policy or the uninstall mode by editing the value of the annotation by using the following commands:

$ oc annotate storagecluster -n openshift-storage ocs-storagecluster uninstall.ocs.openshift.io/cleanup-policy="retain" --overwrite
storagecluster.ocs.openshift.io/ocs-storagecluster annotated
$ oc annotate storagecluster -n openshift-storage ocs-storagecluster uninstall.ocs.openshift.io/mode="forced" --overwrite
storagecluster.ocs.openshift.io/ocs-storagecluster annotated

Prerequisites

  • Ensure that the OpenShift Container Storage cluster is in a healthy state. The uninstall process can fail when some of the pods are not terminated successfully due to insufficient resources or nodes. In case the cluster is in an unhealthy state, contact Red Hat Customer Support before uninstalling OpenShift Container Storage.
  • Ensure that applications are not consuming persistent volume claims (PVCs) or object bucket claims (OBCs) using the storage classes provided by OpenShift Container Storage.
  • If any custom resources (such as custom storage classes, cephblockpools) were created by the admin, they must be deleted by the admin after removing the resources which consumed them.

Procedure

  1. Delete the volume snapshots that are using OpenShift Container Storage.

    1. List the volume snapshots from all the namespaces.

      $ oc get volumesnapshot --all-namespaces
    2. From the output of the previous command, identify and delete the volume snapshots that are using OpenShift Container Storage.

      $ oc delete volumesnapshot <VOLUME-SNAPSHOT-NAME> -n <NAMESPACE>
  2. Delete PVCs and OBCs that are using OpenShift Container Storage.

    In the default uninstall mode (graceful), the uninstaller waits till all the PVCs and OBCs that use OpenShift Container Storage are deleted.

    If you wish to delete the Storage Cluster without deleting the PVCs beforehand, you may set the uninstall mode annotation to forced and skip this step. Doing this results in orphan PVCs and OBCs in the system.

    1. Delete OpenShift Container Platform monitoring stack PVCs using OpenShift Container Storage.

      For more information, see Section 4.2, “Removing monitoring stack from OpenShift Container Storage”.

    2. Delete OpenShift Container Platform Registry PVCs using OpenShift Container Storage.

      For more information, see Section 4.3, “Removing OpenShift Container Platform registry from OpenShift Container Storage”.

    3. Delete OpenShift Container Platform logging PVCs using OpenShift Container Storage.

      For more information, see Section 4.4, “Removing the cluster logging operator from OpenShift Container Storage”.

    4. Delete other PVCs and OBCs provisioned using OpenShift Container Storage.

      • Following script is sample script to identify the PVCs and OBCs provisioned using OpenShift Container Storage. The script ignores the PVCs that are used internally by Openshift Container Storage.

        #!/bin/bash
        
        RBD_PROVISIONER="openshift-storage.rbd.csi.ceph.com"
        CEPHFS_PROVISIONER="openshift-storage.cephfs.csi.ceph.com"
        NOOBAA_PROVISIONER="openshift-storage.noobaa.io/obc"
        RGW_PROVISIONER="openshift-storage.ceph.rook.io/bucket"
        
        NOOBAA_DB_PVC="noobaa-db"
        NOOBAA_BACKINGSTORE_PVC="noobaa-default-backing-store-noobaa-pvc"
        
        # Find all the OCS StorageClasses
        OCS_STORAGECLASSES=$(oc get storageclasses | grep -e "$RBD_PROVISIONER" -e "$CEPHFS_PROVISIONER" -e "$NOOBAA_PROVISIONER" -e "$RGW_PROVISIONER" | awk '{print $1}')
        
        # List PVCs in each of the StorageClasses
        for SC in $OCS_STORAGECLASSES
        do
                echo "======================================================================"
                echo "$SC StorageClass PVCs and OBCs"
                echo "======================================================================"
                oc get pvc  --all-namespaces --no-headers 2>/dev/null | grep $SC | grep -v -e "$NOOBAA_DB_PVC" -e "$NOOBAA_BACKINGSTORE_PVC"
                oc get obc  --all-namespaces --no-headers 2>/dev/null | grep $SC
                echo
        done
        Note

        Omit RGW_PROVISIONER for cloud platforms.

      • Delete the OBCs.

        $ oc delete obc <obc name> -n <project name>
      • Delete the PVCs.

        $ oc delete pvc <pvc name> -n <project-name>
        Note

        Ensure that you have removed any custom backing stores, bucket classes, etc., created in the cluster.

  3. Delete the Storage Cluster object and wait for the removal of the associated resources.

    $ oc delete -n openshift-storage storagecluster --all --wait=true
  4. Check for cleanup pods if the uninstall.ocs.openshift.io/cleanup-policy was set to delete(default) and ensure that their status is Completed.

    $ oc get pods -n openshift-storage | grep -i cleanup
    NAME                                READY   STATUS      RESTARTS   AGE
    cluster-cleanup-job-<xx>        	0/1     Completed   0          8m35s
    cluster-cleanup-job-<yy>     		0/1     Completed   0          8m35s
    cluster-cleanup-job-<zz>     		0/1     Completed   0          8m35s
  5. Confirm that the directory /var/lib/rook is now empty. This directory will be empty only if the uninstall.ocs.openshift.io/cleanup-policy annotation was set to delete(default).

    $ for i in $(oc get node -l cluster.ocs.openshift.io/openshift-storage= -o jsonpath='{ .items[*].metadata.name }'); do oc debug node/${i} -- chroot /host  ls -l /var/lib/rook; done
  6. If encryption was enabled at the time of install, remove dm-crypt managed device-mapper mapping from OSD devices on all the OpenShift Container Storage nodes.

    1. Create a debug pod and chroot to the host on the storage node.

      $ oc debug node/<node name>
      $ chroot /host
    2. Get Device names and make note of the OpenShift Container Storage devices.

      $ dmsetup ls
      ocs-deviceset-0-data-0-57snx-block-dmcrypt (253:1)
    3. Remove the mapped device.

      $ cryptsetup luksClose --debug --verbose ocs-deviceset-0-data-0-57snx-block-dmcrypt
      Note

      If the above command gets stuck due to insufficient privileges, run the following commands:

      • Press CTRL+Z to exit the above command.
      • Find PID of the process which was stuck.

        $ ps -ef | grep crypt
      • Terminate the process using kill command.

        $ kill -9 <PID>
      • Verify that the device name is removed.

        $ dmsetup ls
  7. Delete the namespace and wait till the deletion is complete. You need to switch to another project if openshift-storage is the active project.

    For example:

    $ oc project default
    $ oc delete project openshift-storage --wait=true --timeout=5m

    The project is deleted if the following command returns a NotFound error.

    $ oc get project openshift-storage
    Note

    While uninstalling OpenShift Container Storage, if namespace is not deleted completely and remains in Terminating state, perform the steps in Troubleshooting and deleting remaining resources during Uninstall to identify objects that are blocking the namespace from being terminated.

  8. Delete the local storage operator configurations if you have deployed OpenShift Container Storage using local storage devices. See Removing local storage operator configurations.
  9. Unlabel the storage nodes.

    $ oc label nodes  --all cluster.ocs.openshift.io/openshift-storage-
    $ oc label nodes  --all topology.rook.io/rack-
  10. Remove the OpenShift Container Storage taint if the nodes were tainted.

    $ oc adm taint nodes --all node.ocs.openshift.io/storage-
  11. Confirm all PVs provisioned using OpenShift Container Storage are deleted. If there is any PV left in the Released state, delete it.

    $ oc get pv
    $ oc delete pv <pv name>
  12. Delete the Multicloud Object Gateway storageclass.

    $ oc delete storageclass openshift-storage.noobaa.io --wait=true --timeout=5m
  13. Remove CustomResourceDefinitions.

    $ oc delete crd backingstores.noobaa.io bucketclasses.noobaa.io cephblockpools.ceph.rook.io cephclusters.ceph.rook.io cephfilesystems.ceph.rook.io cephnfses.ceph.rook.io cephobjectstores.ceph.rook.io cephobjectstoreusers.ceph.rook.io noobaas.noobaa.io ocsinitializations.ocs.openshift.io storageclusters.ocs.openshift.io cephclients.ceph.rook.io cephobjectrealms.ceph.rook.io cephobjectzonegroups.ceph.rook.io cephobjectzones.ceph.rook.io cephrbdmirrors.ceph.rook.io --wait=true --timeout=5m
  14. Optional: To ensure that the vault keys are deleted permanently you need to manually delete the metadata associated with the vault key.

    Note

    Execute this step only if Vault Key/Value (KV) secret engine API, version 2 is used for cluster-wide encryption with Key Management System (KMS) since the vault keys are marked as deleted and not permanently deleted during the uninstallation of OpenShift Container Storage. You can always restore it later if required.

    1. List the keys in the vault.

      $ vault kv list <backend_path>
      <backend_path>

      Is the path in the vault where the encryption keys are stored.

      For example:

      $ vault kv list kv-v2

      Example output:

      Keys
      -----
      NOOBAA_ROOT_SECRET_PATH/
      rook-ceph-osd-encryption-key-ocs-deviceset-thin-0-data-0m27q8
      rook-ceph-osd-encryption-key-ocs-deviceset-thin-1-data-0sq227
      rook-ceph-osd-encryption-key-ocs-deviceset-thin-2-data-0xzszb
    2. List the metadata associated with the vault key.

      $ vault kv get kv-v2/<key>

      For the Multicloud Object Gateway (MCG) key:

      $ vault kv get kv-v2/NOOBAA_ROOT_SECRET_PATH/<key>
      <key>

      Is the encryption key.

      For Example:

      $ vault kv get kv-v2/rook-ceph-osd-encryption-key-ocs-deviceset-thin-0-data-0m27q8

      Example output:

      ====== Metadata ======
      Key              Value
      ---              -----
      created_time     2021-06-23T10:06:30.650103555Z
      deletion_time    2021-06-23T11:46:35.045328495Z
      destroyed        false
      version          1
    3. Delete the metadata.

      $ vault kv metadata delete kv-v2/<key>

      For the MCG key:

      $ vault kv metadata delete kv-v2/NOOBAA_ROOT_SECRET_PATH/<key>
      <key>

      Is the encryption key.

      For Example:

      $ vault kv metadata delete kv-v2/rook-ceph-osd-encryption-key-ocs-deviceset-thin-0-data-0m27q8

      Example output:

      Success! Data deleted (if it existed) at: kv-v2/metadata/rook-ceph-osd-encryption-key-ocs-deviceset-thin-0-data-0m27q8
    4. Repeat these steps to delete the metadata associated with all the vault keys.
  15. To ensure that OpenShift Container Storage is uninstalled completely, on the OpenShift Container Platform Web Console,

    1. Click Storage.
    2. Verify that Overview no longer appears under Storage.

4.1.1. Removing local storage operator configurations

Use the instructions in this section only if you have deployed OpenShift Container Storage using local storage devices.

Note

For OpenShift Container Storage deployments using only localvolume resources, refer step 8.

Procedure

  1. Identify the LocalVolumeSet and the corresponding StorageClassName being used by OpenShift Container Storage.
  2. Set the variable SC to the StorageClass providing the LocalVolumeSet.

    $ export SC="<StorageClassName>"
  3. Delete the LocalVolumeSet.

    $ oc delete localvolumesets.local.storage.openshift.io <name-of-volumeset> -n openshift-local-storage
  4. Delete the local storage PVs for the given StorageClassName.

    $ oc get pv | grep $SC | awk '{print $1}'| xargs oc delete pv
  5. Delete the StorageClassName.

    $ oc delete sc $SC
  6. Delete the symlinks created by the LocalVolumeSet.

    [[ ! -z $SC ]] && for i in $(oc get node -l cluster.ocs.openshift.io/openshift-storage= -o jsonpath='{ .items[*].metadata.name }'); do oc debug node/${i} -- chroot /host rm -rfv /mnt/local-storage/${SC}/; done
  7. Delete LocalVolumeDiscovery.

    $ oc delete localvolumediscovery.local.storage.openshift.io/auto-discover-devices -n openshift-local-storage
  8. Removing LocalVolume resources (if any).

    Use the following steps to remove the LocalVolume resources used to provision PVs in the current or previous OpenShift Container Storage version. Also, ensure that these resources are not being used by other tenants in the cluster.

    For each of the local volumes, perform the following:

    1. Identify the LocalVolume and the corresponding StorageClassName being used by OpenShift Container Storage.
    2. Set the variable LV to the name of the LocalVolume and variable SC to the name of the StorageClass

      For example:

      $ LV=local-block
      $ SC=localblock
    3. Delete the local volume resource.

      $ oc delete localvolume -n local-storage --wait=true $LV
    4. Delete the remaining PVs and StorageClasses if they exist.

      $ oc delete pv -l storage.openshift.com/local-volume-owner-name=${LV} --wait --timeout=5m
      $ oc delete storageclass $SC --wait --timeout=5m
    5. Clean up the artifacts from the storage nodes for that resource.

      $ [[ ! -z $SC ]] && for i in $(oc get node -l cluster.ocs.openshift.io/openshift-storage= -o jsonpath='{ .items[*].metadata.name }'); do oc debug node/${i} -- chroot /host rm -rfv /mnt/local-storage/${SC}/; done

      Example output:

      Starting pod/node-xxx-debug ...
      To use host binaries, run `chroot /host`
      removed '/mnt/local-storage/localblock/nvme2n1'
      removed directory '/mnt/local-storage/localblock'
      
      Removing debug pod ...
      Starting pod/node-yyy-debug ...
      To use host binaries, run `chroot /host`
      removed '/mnt/local-storage/localblock/nvme2n1'
      removed directory '/mnt/local-storage/localblock'
      
      Removing debug pod ...
      Starting pod/node-zzz-debug ...
      To use host binaries, run `chroot /host`
      removed '/mnt/local-storage/localblock/nvme2n1'
      removed directory '/mnt/local-storage/localblock'
      
      Removing debug pod ...

4.2. Removing monitoring stack from OpenShift Container Storage

Use this section to clean up the monitoring stack from OpenShift Container Storage.

The PVCs that are created as a part of configuring the monitoring stack are in the openshift-monitoring namespace.

Prerequisites

Procedure

  1. List the pods and PVCs that are currently running in the openshift-monitoring namespace.

    $ oc get pod,pvc -n openshift-monitoring
    NAME                           READY   STATUS    RESTARTS   AGE
    pod/alertmanager-main-0         3/3     Running   0          8d
    pod/alertmanager-main-1         3/3     Running   0          8d
    pod/alertmanager-main-2         3/3     Running   0          8d
    pod/cluster-monitoring-
    operator-84457656d-pkrxm        1/1     Running   0          8d
    pod/grafana-79ccf6689f-2ll28    2/2     Running   0          8d
    pod/kube-state-metrics-
    7d86fb966-rvd9w                 3/3     Running   0          8d
    pod/node-exporter-25894         2/2     Running   0          8d
    pod/node-exporter-4dsd7         2/2     Running   0          8d
    pod/node-exporter-6p4zc         2/2     Running   0          8d
    pod/node-exporter-jbjvg         2/2     Running   0          8d
    pod/node-exporter-jj4t5         2/2     Running   0          6d18h
    pod/node-exporter-k856s         2/2     Running   0          6d18h
    pod/node-exporter-rf8gn         2/2     Running   0          8d
    pod/node-exporter-rmb5m         2/2     Running   0          6d18h
    pod/node-exporter-zj7kx         2/2     Running   0          8d
    pod/openshift-state-metrics-
    59dbd4f654-4clng                3/3     Running   0          8d
    pod/prometheus-adapter-
    5df5865596-k8dzn                1/1     Running   0          7d23h
    pod/prometheus-adapter-
    5df5865596-n2gj9                1/1     Running   0          7d23h
    pod/prometheus-k8s-0            6/6     Running   1          8d
    pod/prometheus-k8s-1            6/6     Running   1          8d
    pod/prometheus-operator-
    55cfb858c9-c4zd9                1/1     Running   0          6d21h
    pod/telemeter-client-
    78fc8fc97d-2rgfp                3/3     Running   0          8d
    
    NAME                                                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
    persistentvolumeclaim/my-alertmanager-claim-alertmanager-main-0   Bound    pvc-0d519c4f-15a5-11ea-baa0-026d231574aa   40Gi       RWO            ocs-storagecluster-ceph-rbd   8d
    persistentvolumeclaim/my-alertmanager-claim-alertmanager-main-1   Bound    pvc-0d5a9825-15a5-11ea-baa0-026d231574aa   40Gi       RWO            ocs-storagecluster-ceph-rbd   8d
    persistentvolumeclaim/my-alertmanager-claim-alertmanager-main-2   Bound    pvc-0d6413dc-15a5-11ea-baa0-026d231574aa   40Gi       RWO            ocs-storagecluster-ceph-rbd   8d
    persistentvolumeclaim/my-prometheus-claim-prometheus-k8s-0        Bound    pvc-0b7c19b0-15a5-11ea-baa0-026d231574aa   40Gi       RWO            ocs-storagecluster-ceph-rbd   8d
    persistentvolumeclaim/my-prometheus-claim-prometheus-k8s-1        Bound    pvc-0b8aed3f-15a5-11ea-baa0-026d231574aa   40Gi       RWO            ocs-storagecluster-ceph-rbd   8d
  2. Edit the monitoring configmap.

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
  3. Remove any config sections that reference the OpenShift Container Storage storage classes as shown in the following example and save it.

    Before editing

    .
    .
    .
    apiVersion: v1
    data:
      config.yaml: |
        alertmanagerMain:
          volumeClaimTemplate:
            metadata:
              name: my-alertmanager-claim
            spec:
              resources:
                requests:
                  storage: 40Gi
              storageClassName: ocs-storagecluster-ceph-rbd
        prometheusK8s:
          volumeClaimTemplate:
            metadata:
              name: my-prometheus-claim
            spec:
              resources:
                requests:
                  storage: 40Gi
              storageClassName: ocs-storagecluster-ceph-rbd
    kind: ConfigMap
    metadata:
      creationTimestamp: "2019-12-02T07:47:29Z"
      name: cluster-monitoring-config
      namespace: openshift-monitoring
      resourceVersion: "22110"
      selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
      uid: fd6d988b-14d7-11ea-84ff-066035b9efa8
    .
    .
    .

    After editing

    .
    .
    .
    apiVersion: v1
    data:
      config.yaml: |
    kind: ConfigMap
    metadata:
      creationTimestamp: "2019-11-21T13:07:05Z"
      name: cluster-monitoring-config
      namespace: openshift-monitoring
      resourceVersion: "404352"
      selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
      uid: d12c796a-0c5f-11ea-9832-063cd735b81c
    .
    .
    .

    In this example, alertmanagerMain and prometheusK8s monitoring components are using the OpenShift Container Storage PVCs.

  4. Delete relevant PVCs. Make sure you delete all the PVCs that are consuming the storage classes.

    $ oc delete -n openshift-monitoring pvc <pvc-name> --wait=true --timeout=5m

4.3. Removing OpenShift Container Platform registry from OpenShift Container Storage

Use this section to clean up OpenShift Container Platform registry from OpenShift Container Storage. If you want to configure an alternative storage, see image registry

The PVCs that are created as a part of configuring OpenShift Container Platform registry are in the openshift-image-registry namespace.

Prerequisites

  • The image registry should have been configured to use an OpenShift Container Storage PVC.

Procedure

  1. Edit the configs.imageregistry.operator.openshift.io object and remove the content in the storage section.

    $ oc edit configs.imageregistry.operator.openshift.io

    Before editing

    .
    .
    .
    storage:
      pvc:
        claim: registry-cephfs-rwx-pvc
    .
    .
    .

    After editing

    .
    .
    .
    storage:
      emptyDir: {}
    .
    .
    .

    In this example, the PVC is called registry-cephfs-rwx-pvc, which is now safe to delete.

  2. Delete the PVC.

    $ oc delete pvc <pvc-name> -n openshift-image-registry --wait=true --timeout=5m

4.4. Removing the cluster logging operator from OpenShift Container Storage

To clean the cluster logging operator from the OpenShift Container Storage, follow the steps in the procedure.

The PVCs created as a part of configuring cluster logging operator are in the openshift-logging namespace.

Prerequisites

  • The cluster logging instance must be configured to use OpenShift Container Storage PVCs.

Procedure

  1. Remove the ClusterLogging instance in the namespace.

    $ oc delete clusterlogging instance -n openshift-logging --wait=true --timeout=5m

    The PVCs in the openshift-logging namespace are now safe to delete.

  2. Delete PVCs.

    $ oc delete pvc <pvc-name> -n openshift-logging --wait=true --timeout=5m
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.