Storage


Red Hat build of MicroShift 4.17

Configuring and managing cluster storage

Red Hat OpenShift Documentation Team

Abstract

This document provides information about using storage for MicroShift.

Chapter 1. Storage overview

MicroShift supports dynamic, ephemeral, and persistent storage, both for on-premises and cloud providers. You can manage container storage for persistent and non-persistent data in a MicroShift node.

1.1. Dynamic storage with the LVMS plugin

Dynamic provisioning lets you create storage volumes on demand, eliminating the need for pre-provisioned storage. MicroShift enables dynamic storage provisioning that is ready for immediate use with the logical volume manager storage (LVMS) Container Storage Interface (CSI) provider.

1.2. Ephemeral storage

Pods and containers are ephemeral or transient in nature and designed for stateless applications. Ephemeral storage allows administrators and developers to better manage the local storage for some of their operations.

1.3. Persistent storage

Persistent storage in MicroShift enables stateful applications to retain data beyond the lifecycle of individual pods. You can use persistent volumes (PVs) to provision storage and persistent volume claims (PVCs) to request storage for your applications.

1.4. Dynamic storage provisioning

Dynamic provisioning lets you create storage volumes on demand, eliminating the need for pre-provisioned storage.

MicroShift enables dynamic storage provisioning that is ready for immediate use with the logical volume manager storage (LVMS) Container Storage Interface (CSI) provider. The LVMS plugin is the Red Hat downstream version of TopoLVM, a CSI plugin for managing logical volume management (LVM) logical volumes (LVs) for Kubernetes.

LVMS provisions new LVM logical volumes for container workloads with appropriately configured persistent volume claims (PVCs). Each PVC references a storage class that represents an LVM Volume Group (VG) on the host node. LVs are only provisioned for scheduled pods.

2.1. LVMS system requirements

Using LVMS in MicroShift requires the following system specifications.

2.1.1. Volume group name

If you did not configure LVMS in an lvmd.yaml file placed in the /etc/microshift/ directory, MicroShift attempts to assign a default volume group (VG) dynamically by running the vgs command.

  • MicroShift assigns a default VG when only one VG is found.
  • If more than one VG is present, the VG named microshift is assigned as the default.
  • If a VG named microshift does not exist, LVMS is not deployed.

If there are no volume groups on the MicroShift host, LVMS is disabled.
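
The selection rules above can be summarized in a short Python sketch. The function name `select_default_vg` and its list input are illustrative assumptions, not MicroShift code. The sketch only mirrors the documented behavior for the case where no lvmd.yaml file is present.

```python
from typing import Optional, Sequence


def select_default_vg(vg_names: Sequence[str]) -> Optional[str]:
    """Return the default volume group name, or None when LVMS is not deployed.

    Hypothetical helper: `vg_names` stands in for the names that the
    `vgs` command reports on the host.
    """
    if not vg_names:
        # No volume groups on the host: LVMS is disabled.
        return None
    if len(vg_names) == 1:
        # Exactly one VG: it becomes the default, whatever its name.
        return vg_names[0]
    # More than one VG: only a VG named "microshift" is accepted.
    return "microshift" if "microshift" in vg_names else None


print(select_default_vg(["rhel"]))                # rhel
print(select_default_vg(["rhel", "microshift"]))  # microshift
print(select_default_vg(["rhel", "data"]))        # None
```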

If you want to use a specific VG, LVMS must be configured to select that VG. You can change the default name of the VG in the configuration file. For details, read the "Configuring the LVMS" section of this document.

After MicroShift starts, you can update the lvmd.yaml file to include or remove VGs. To implement changes, you must restart MicroShift. If the lvmd.yaml file is deleted, MicroShift attempts to find a default VG again.

2.1.2. Volume size increments

LVMS provisions storage in increments of 1 gigabyte (GB). Storage requests are rounded up to the nearest GB. When the capacity of a VG is less than 1 GB, the PersistentVolumeClaim registers a ProvisioningFailed event, for example:

Example output

Warning  ProvisioningFailed    3s (x2 over 5s)  topolvm.cybozu.com_topolvm-controller-858c78d96c-xttzp_0fa83aef-2070-4ae2-bcb9-163f818dcd9f failed to provision volume with
StorageClass "topolvm-provisioner": rpc error: code = ResourceExhausted desc = no enough space left on VG: free=(BYTES_INT), requested=(BYTES_INT)
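
The rounding behavior described above can be illustrated with a minimal Python sketch. The helper names are hypothetical, and the sketch assumes that 1 GB means 10**9 bytes. If the provisioner uses a binary unit, pass 2**30 as the increment instead.

```python
GB = 10**9  # assumption: decimal gigabyte; swap in 2**30 if the increment is binary


def round_up_request(requested_bytes: int, increment: int = GB) -> int:
    """Round a storage request up to the next whole increment."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    # Integer ceiling division avoids floating-point error on large sizes.
    return -(-requested_bytes // increment) * increment


def fits_in_vg(requested_bytes: int, vg_free_bytes: int, increment: int = GB) -> bool:
    """Return True if the rounded-up request fits in the free VG space."""
    return round_up_request(requested_bytes, increment) <= vg_free_bytes


print(round_up_request(1_500_000_000) // GB)  # 2
print(fits_in_vg(1, 500_000_000))             # False
```

The second call shows why a VG with less than one increment of free space cannot satisfy even a very small request.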

You can reduce the use of runtime resources such as RAM, CPU, and storage by removing or disabling the following storage components:

  • You can configure MicroShift to disable the built-in logical volume manager storage (LVMS) Container Storage Interface (CSI) provider.
  • You can configure MicroShift to disable the Container Storage Interface (CSI) snapshot capabilities.
  • You can uninstall the installed CSI implementations using oc commands.

Important

Automated uninstallation is not supported as this can cause orphaning of the provisioned volumes. Without the LVMS CSI driver, the node does not detect the underlying storage interface and cannot perform provisioning and deprovisioning or mounting and unmounting operations.

Note

You can configure MicroShift to disable the CSI provider and CSI snapshot capabilities only before installing and running MicroShift. After MicroShift is installed and running, you must update the configuration file and uninstall the components.

Use the following procedure to disable installation of the CSI snapshot implementation pods.

Important

This procedure is for users who are defining the configuration file before installing and running MicroShift. If MicroShift is already started, the CSI snapshot implementation is running, and you must remove it manually by following the uninstallation instructions.

Note

MicroShift does not delete CSI snapshot implementation pods. You must configure MicroShift to disable installation of the CSI snapshot implementation pods during the startup process.

Procedure

  1. Disable installation of the CSI snapshot controller by entering the optionalCsiComponents value under the storage section of the MicroShift configuration file in /etc/microshift/config.yaml:

    # ...
      storage: {} # (1)
    # ...

    (1) Accepted values are:
    • Not defining optionalCsiComponents.
    • Specifying optionalCsiComponents field with an empty value ([]) or a single empty string element ([""]).
    • Specifying optionalCsiComponents with one of the accepted values which are snapshot-controller, snapshot-webhook, or none. none is mutually exclusive with all other values.

      Note

      If the optionalCsiComponents value is empty or null, MicroShift defaults to deploying snapshot-controller and snapshot-webhook.

  2. After the optionalCsiComponents field is specified with a supported value in the config.yaml, start MicroShift by running the following command:

    $ sudo systemctl start microshift

    Note

    MicroShift does not redeploy the disabled components after a restart.
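
The accepted values for optionalCsiComponents can be expressed as a small validation sketch in Python. The function name and the error messages are hypothetical, and MicroShift performs its own validation at startup. The sketch only encodes the rules listed in the procedure above.

```python
from typing import List, Optional, Sequence

ACCEPTED = {"snapshot-controller", "snapshot-webhook", "none"}
DEFAULT_COMPONENTS = ["snapshot-controller", "snapshot-webhook"]


def resolve_optional_csi_components(value: Optional[Sequence[str]]) -> List[str]:
    """Return the components that would be deployed for a given setting.

    Hypothetical helper: an undefined, empty ([]), or single-empty-string
    ([""]) value falls back to deploying both snapshot components.
    """
    if value is None or list(value) in ([], [""]):
        return list(DEFAULT_COMPONENTS)
    components = list(value)
    unknown = [c for c in components if c not in ACCEPTED]
    if unknown:
        raise ValueError(f"unsupported values: {unknown}")
    if "none" in components and len(components) > 1:
        # "none" is mutually exclusive with all other values.
        raise ValueError("'none' cannot be combined with other values")
    return [] if components == ["none"] else components


print(resolve_optional_csi_components(None))      # ['snapshot-controller', 'snapshot-webhook']
print(resolve_optional_csi_components(["none"]))  # []
```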

Use the following procedure to disable installation of the CSI driver implementation pods. MicroShift does not delete CSI driver implementation pods. You must configure MicroShift to disable installation of the CSI driver implementation pods during the startup process.

Important

This procedure is for defining the configuration file before installing and running MicroShift. If MicroShift is already started, then the CSI driver implementation is running. You must manually remove it by following the uninstallation instructions.

Procedure

  1. Disable installation of the CSI driver by entering the driver value under the storage section of the MicroShift configuration file in /etc/microshift/config.yaml:

    # ...
      storage:
       driver:
       - "none" # (1)
    # ...

    (1) Valid values are none or lvms.

    Note

    By default, the driver value is empty or null and LVMS is deployed.

  2. Start MicroShift after the driver field is specified with a supported value in the /etc/microshift/config.yaml file by running the following command:

    $ sudo systemctl enable --now microshift

    Note

    MicroShift does not redeploy the disabled components after a restart operation.

2.5. Uninstalling the CSI snapshot implementation

To uninstall the installed CSI snapshot implementation, use the following procedure.

Prerequisites

  • MicroShift is installed and running.
  • The CSI snapshot implementation is deployed on the MicroShift node.

Procedure

  1. Uninstall the CSI snapshot implementation by running the following command:

    $ oc delete -n kube-system deployment.apps/snapshot-controller deployment.apps/snapshot-webhook

    Example output

    deployment.apps "snapshot-controller" deleted
    deployment.apps "snapshot-webhook" deleted

2.6. Uninstalling the CSI driver implementation

To uninstall the installed CSI driver implementation, use the following procedure.

Prerequisites

  • MicroShift is installed and running.
  • The CSI driver implementation is deployed on the MicroShift node.

Procedure

  1. Delete the lvmclusters object by running the following command:

    $ oc delete -n openshift-storage lvmclusters.lvm.topolvm.io/lvms

    Example output

    lvmcluster.lvm.topolvm.io "lvms" deleted

  2. Delete the lvms-operator by running the following command:

    $ oc delete -n openshift-storage deployment.apps/lvms-operator

    Example output

    deployment.apps "lvms-operator" deleted

  3. Delete the topolvm-provisioner StorageClass by running the following command:

    $ oc delete storageclasses.storage.k8s.io/topolvm-provisioner

    Example output

    storageclass.storage.k8s.io "topolvm-provisioner" deleted

2.7. LVMS deployment

LVMS is automatically deployed onto the node in the openshift-storage namespace after MicroShift starts.

LVMS uses StorageCapacity tracking to ensure that pods with an LVMS PVC are not scheduled if the requested storage is greater than the free storage of the volume group. For more information about StorageCapacity tracking, read Storage Capacity.

The following limitations apply to the size of the devices that you can use to provision storage with LVMS:

  • The total storage size that you can provision is limited by the size of the underlying Logical Volume Manager (LVM) thin pool and the over-provisioning factor.
  • The size of the logical volume depends on the size of the Physical Extent (PE) and the Logical Extent (LE).

    • You can define the size of PE and LE during the physical and logical device creation.
    • The default PE and LE size is 4 MB.
    • If the size of the PE is increased, the maximum size of the LVM is determined by the kernel limits and your disk space.
    • The size limit for Red Hat Enterprise Linux (RHEL) 9 using the default PE and LE size is 8 EB.
    • The following are the minimum storage sizes that you can request for each file system type:

      • block: 8 MiB
      • xfs: 300 MiB
      • ext4: 32 MiB
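
As a quick illustration of the minimum request sizes listed above, the following Python sketch checks a request against the per-type minimum. The dictionary and function names are hypothetical. Only the three size values come from this document.

```python
MIB = 1024 * 1024

# Minimum request sizes per file system type, as listed above.
MIN_REQUEST_BYTES = {
    "block": 8 * MIB,
    "xfs": 300 * MIB,
    "ext4": 32 * MIB,
}


def meets_minimum(fs_type: str, requested_bytes: int) -> bool:
    """Return True if the request is at least the documented minimum."""
    if fs_type not in MIN_REQUEST_BYTES:
        raise ValueError(f"unknown file system type: {fs_type}")
    return requested_bytes >= MIN_REQUEST_BYTES[fs_type]


print(meets_minimum("xfs", 100 * MIB))   # False
print(meets_minimum("ext4", 100 * MIB))  # True
```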

2.9. Creating an LVMS configuration file

When MicroShift runs, it uses LVMS configuration from /etc/microshift/lvmd.yaml, if provided. You must place any configuration files that you create into the /etc/microshift/ directory.

Procedure

  • To create the lvmd.yaml configuration file, run the following command:

    $ sudo cp /etc/microshift/lvmd.yaml.default /etc/microshift/lvmd.yaml

2.10. Basic LVMS configuration example

MicroShift supports passing through your LVM configuration and allows you to specify custom volume groups, thin volume provisioning parameters, and reserved unallocated volume group space. You can edit the LVMS configuration file you created at any time. You must restart MicroShift to deploy configuration changes after editing the file.

Note

If you need to take volume snapshots, you must use thin provisioning in your lvmd.yaml file. If you do not need to take volume snapshots, you can use thick volumes.

The following lvmd.yaml example file shows a basic LVMS configuration:

LVMS configuration example

socket-name: # (1)
device-classes: # (2)
  - name: "default" # (3)
    volume-group: "VGNAMEHERE" # (4)
    spare-gb: 0 # (5)
    default: # (6)

(1) String. The UNIX domain socket endpoint of gRPC. Defaults to '/run/lvmd/lvmd.socket'.
(2) A list of maps for the settings for each device-class.
(3) String. The name of the device-class.
(4) String. The group where the device-class creates the logical volumes.
(5) Unsigned 64-bit integer. Storage capacity in GB to be left unallocated in the volume group. Defaults to 0.
(6) Boolean. Indicates that the device-class is used by default. Defaults to false. At least one value must be entered in the YAML file values when this is set to true.

Important

A race condition prevents LVMS from accurately tracking the allocated space and preserving the spare-gb for a device class when multiple PVCs are created simultaneously. Use separate volume groups and device classes to protect the storage of highly dynamic workloads from each other.

2.11. Using the LVMS

The LVMS StorageClass is deployed as the default StorageClass. Any PersistentVolumeClaim object without a .spec.storageClassName defined automatically has a PersistentVolume provisioned from the default StorageClass. Use the following procedure to provision and mount a logical volume to a pod.

Procedure

  • To provision and mount a logical volume to a pod, run the following command:

    $ cat <<EOF | oc apply -f -
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: my-lv-pvc
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1G
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
    spec:
      containers:
      - name: nginx
        image: nginx
        command: ["/usr/bin/sh", "-c"]
        args: ["sleep 1h"]
        volumeMounts:
        - mountPath: /mnt
          name: my-volume
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
              - ALL
          runAsNonRoot: true
          seccompProfile:
            type: RuntimeDefault
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-lv-pvc
    EOF

2.11.1. Device classes

You can create custom device classes by adding a device-classes array to your logical volume manager storage (LVMS) configuration. Add the array to the /etc/microshift/lvmd.yaml configuration file. A single device class must be set as the default. You must restart MicroShift for configuration changes to take effect.

Warning

Removing a device class while there are still persistent volumes or VolumeSnapshotContent objects connected to that device class breaks both thick and thin provisioning.

You can define multiple device classes in the device-classes array. These classes can be a mix of thick and thin volume configurations.

Example of a mixed device-class array

socket-name: /run/topolvm/lvmd.sock
device-classes:
  - name: ssd
    volume-group: ssd-vg
    spare-gb: 0 # (1)
    default: true
  - name: hdd
    volume-group: hdd-vg
    spare-gb: 0
  - name: thin
    spare-gb: 0
    thin-pool:
      name: thin
      overprovision-ratio: 10
    type: thin
    volume-group: ssd
  - name: striped
    volume-group: multi-pv-vg
    spare-gb: 0
    stripe: 2
    stripe-size: "64"
    lvcreate-options: # (2)

(1) When you set the spare capacity to anything other than 0, more space can be allocated than expected.
(2) Extra arguments to pass to the lvcreate command, such as --type=<type>. Neither MicroShift nor the LVMS verifies lvcreate-options values. These optional values are passed as is to the lvcreate command. Ensure that the options specified here are correct.
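
Because a single device class must be set as the default, a quick check before restarting MicroShift can catch a misconfigured array. The following Python sketch is illustrative only. It assumes that the device-classes array has already been loaded into a list of dictionaries (for example, with a YAML parser), and the helper name is hypothetical.

```python
from typing import Dict, List


def default_device_class(device_classes: List[Dict]) -> str:
    """Return the name of the default device class.

    Raises ValueError unless exactly one class has default set to true.
    """
    defaults = [dc["name"] for dc in device_classes if dc.get("default", False)]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly one default device class, found {defaults}")
    return defaults[0]


# A trimmed copy of the mixed device-class array shown above.
classes = [
    {"name": "ssd", "volume-group": "ssd-vg", "spare-gb": 0, "default": True},
    {"name": "hdd", "volume-group": "hdd-vg", "spare-gb": 0},
    {"name": "thin", "volume-group": "ssd", "spare-gb": 0, "type": "thin"},
]

print(default_device_class(classes))                            # ssd
print([dc["name"] for dc in classes if dc.get("type") == "thin"])  # ['thin']
```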

Chapter 3. Using ephemeral storage

Ephemeral storage is unstructured and temporary. It is often used with immutable applications. This guide discusses how ephemeral storage works for MicroShift.

3.1. Overview

Pods and containers are ephemeral or transient in nature and designed for stateless applications. Ephemeral storage allows administrators and developers to better manage the local storage for some of their operations.

In addition to persistent storage, pods and containers can require ephemeral or transient local storage for their operation. The lifetime of this ephemeral storage does not extend beyond the life of the individual pod, and this ephemeral storage cannot be shared across pods.

Pods use ephemeral local storage for scratch space, caching, and logs. Issues related to the lack of local storage accounting and isolation include the following:

  • Pods cannot detect how much local storage is available to them.
  • Pods cannot request guaranteed local storage.
  • Local storage is a best-effort resource.
  • Pods can be evicted due to other pods filling the local storage, after which new pods are not admitted until sufficient storage is reclaimed.

Unlike persistent volumes, ephemeral storage is unstructured and the space is shared between all pods running on the node, other uses by the system, and Red Hat build of MicroShift. The ephemeral storage framework allows pods to specify their transient local storage needs. It also allows Red Hat build of MicroShift to protect the node against excessive use of local storage.

While the ephemeral storage framework allows administrators and developers to better manage local storage, I/O throughput and latency are not directly affected.

3.2. Types of ephemeral storage

Ephemeral local storage is always made available in the primary partition. There are two basic ways of creating the primary partition: root and runtime.

3.2.1. Root

This partition holds the kubelet root directory, /var/lib/kubelet/ by default, and /var/log/ directory. This partition can be shared between user pods, the OS, and Kubernetes system daemons. This partition can be consumed by pods through EmptyDir volumes, container logs, image layers, and container-writable layers. Kubelet manages shared access and isolation of this partition. This partition is ephemeral, and applications cannot expect any performance SLAs, such as disk IOPS, from this partition.

3.2.2. Runtime

This is an optional partition that runtimes can use for overlay file systems. Red Hat build of MicroShift attempts to identify and provide shared access along with isolation to this partition. Container image layers and writable layers are stored here. If the runtime partition exists, the root partition does not hold any image layer or other writable storage.

3.3. Ephemeral storage management

Cluster administrators can manage ephemeral storage within a project by setting quotas that define the limit ranges and number of requests for ephemeral storage across all pods in a non-terminal state. Developers can also set requests and limits on this compute resource at the pod and container level.

You can manage local ephemeral storage by specifying requests and limits. Each container in a pod can specify the following:

  • spec.containers[].resources.limits.ephemeral-storage
  • spec.containers[].resources.requests.ephemeral-storage

3.3.1. Ephemeral storage limits and requests units

Limits and requests for ephemeral storage are measured in byte quantities. You can express storage as a plain integer or as a fixed-point number using one of these suffixes: E, P, T, G, M, k. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki.

For example, the following quantities all represent approximately the same value: 128974848, 129e6, 129M, and 123Mi.

Important

The suffixes for each byte quantity are case-sensitive. Be sure to use the correct case. Use an uppercase "M", as in "400M", to set the request at 400 megabytes. Use "400Mi" to request 400 mebibytes. If you specify "400m" of ephemeral storage, the storage request is only 0.4 bytes.
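
To make the effect of each suffix concrete, the following Python sketch converts a quantity string to bytes. It is a simplified illustration written for this section, not the parser that Kubernetes uses, and the function name `parse_quantity` is hypothetical. It handles only the suffixes listed above plus the lowercase "m" (milli) suffix from the note.

```python
from decimal import Decimal

DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9,
                    "T": 10**12, "P": 10**15, "E": 10**18}
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
                   "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity such as '129M' or '123Mi' to a number of bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[:-2]) * factor
    if text.endswith("m"):
        # Lowercase "m" means milli: one thousandth of a byte.
        return Decimal(text[:-1]) / 1000
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[:-1]) * factor
    # Plain integers and exponent notation such as "129e6".
    return Decimal(text)


print(parse_quantity("123Mi") == 128974848)              # True
print(parse_quantity("129M") == parse_quantity("129e6"))  # True
print(parse_quantity("400m"))                             # 0.4
```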

The following example configuration file shows a pod with two containers:

  • Each container requests 2GiB of local ephemeral storage.
  • Each container has a limit of 4GiB of local ephemeral storage.
  • At the pod level, kubelet works out an overall pod storage limit by adding up the limits of all the containers in that pod.

    • In this case, the total storage usage at the pod level is the sum of the disk usage from all containers plus the pod’s emptyDir volumes.
    • Therefore, the pod has a request of 4GiB of local ephemeral storage, and a limit of 8GiB of local ephemeral storage.

Example ephemeral storage configuration with quotas and limits

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        ephemeral-storage: "2Gi" # (1)
      limits:
        ephemeral-storage: "4Gi" # (2)
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  - name: log-aggregator
    image: images.my-company.example/log-aggregator:v6
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: ephemeral
      mountPath: "/tmp"
  volumes:
    - name: ephemeral
      emptyDir: {}

(1) Container request for local ephemeral storage.
(2) Container limit for local ephemeral storage.

The settings in the pod spec affect when kubelet evicts pods. At the container level, because the first container sets a resource limit, kubelet eviction manager measures the disk usage of this container and evicts the pod if the storage usage of the container exceeds its limit (4GiB). The kubelet eviction manager also marks the pod for eviction if the total usage exceeds the overall pod storage limit (8GiB).

Note

This policy is strictly for emptyDir volumes and is not applied to persistent storage. You can specify the priorityClass of pods to exempt the pod from eviction.
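
The pod-level arithmetic and the two eviction conditions described above can be sketched in Python as follows. This is an illustration of the documented rules only, not kubelet code. The function names, the dictionary layout, and the usage figures are assumptions made for the example.

```python
from typing import Dict, List, Tuple

GIB = 2**30

# The two containers from the example pod: 2Gi request, 4Gi limit each.
containers = [
    {"name": "app", "request": 2 * GIB, "limit": 4 * GIB},
    {"name": "log-aggregator", "request": 2 * GIB, "limit": 4 * GIB},
]


def pod_totals(containers: List[Dict]) -> Tuple[int, int]:
    """Sum container requests and limits to get the pod-level values."""
    return (sum(c["request"] for c in containers),
            sum(c["limit"] for c in containers))


def should_evict(containers: List[Dict], usage: Dict[str, int],
                 empty_dir_usage: int) -> bool:
    """Apply the container-level check, then the pod-level check."""
    for c in containers:
        if usage.get(c["name"], 0) > c["limit"]:
            return True  # a single container exceeded its own limit
    _, pod_limit = pod_totals(containers)
    total = sum(usage.values()) + empty_dir_usage
    return total > pod_limit  # containers plus emptyDir volumes exceeded the pod limit


request, limit = pod_totals(containers)
print(request // GIB, limit // GIB)  # 4 8
print(should_evict(containers, {"app": 3 * GIB, "log-aggregator": 3 * GIB}, 3 * GIB))  # True
```

In the last call, no container exceeds its own 4GiB limit, but the combined 9GiB of usage exceeds the 8GiB pod limit.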

3.4. Monitoring ephemeral storage

You can use /bin/df as a tool to monitor ephemeral storage usage on the volume where ephemeral container data is located, which is /var/lib/kubelet and /var/lib/containers. The available space for only /var/lib/kubelet is shown when you use the df command if /var/lib/containers is placed on a separate disk by the cluster administrator.

Procedure

  • To show the human-readable values of used and available space in /var/lib, enter the following command:

    $ df -h /var/lib

    The output shows the ephemeral storage usage in /var/lib:

    Example output

    Filesystem  Size  Used Avail Use% Mounted on
    /dev/disk/by-partuuid/4cd1448a-01    69G   32G   34G  49% /
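
If you prefer to collect the same figures from a script, the Python standard library exposes them through `shutil.disk_usage`. The following is a minimal sketch. The function name `usage_summary` is hypothetical, and the percentage is computed as used divided by total, which can differ slightly from the Use% column that df prints.

```python
import shutil
from typing import Dict


def usage_summary(path: str = "/var/lib") -> Dict[str, int]:
    """Return total, used, and free bytes for the file system holding `path`."""
    usage = shutil.disk_usage(path)
    percent = round(usage.used / usage.total * 100) if usage.total else 0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_used": percent,
    }
```

Calling `usage_summary()` on the MicroShift host reports on the file system that holds /var/lib, the same location that the df command above inspects.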

Chapter 4. Generic ephemeral volumes

To understand temporary storage that persists only for the duration of a pod, review the properties of ephemeral volumes in MicroShift.

4.1. Overview of generic ephemeral volumes

To manage scratch data by using standard storage drivers, use generic ephemeral volumes. These volumes provide per-pod directories similar to emptyDir volumes but work with any driver that supports persistent volumes and dynamic provisioning, so that you can leverage existing storage infrastructure for temporary needs.

Generic ephemeral volumes are specified inline in the pod specification and follow the lifecycle of a pod. They are created and deleted along with the pod.

Generic ephemeral volumes have the following features:

  • Storage can be local or network-attached.
  • Volumes can have a fixed size that pods cannot exceed.
  • Volumes might have some initial data, depending on the driver and parameters.
  • Typical operations on volumes are supported, assuming that the driver supports them, including snapshotting, cloning, resizing, and storage capacity tracking.

Note

Generic ephemeral volumes do not support offline snapshotting and resizing.

4.2. Lifecycle and persistent volume claims

To bind the lifecycle of storage resources to a specific pod, configure persistent volume claim (PVC) parameters within the volume source of the pod. This setup ensures that the ephemeral volume controller creates the PVC in the same namespace upon pod creation and automatically deletes the PVC when the pod is removed.

Labels, annotations, and the whole set of fields for persistent volume claims (PVCs) are supported.

The ephemeral volume controller creates a PVC object from the template shown in the Creating generic ephemeral volumes procedure.

Volume binding and provisioning can be triggered in one of two ways:

  • Immediately, if the storage class uses immediate volume binding.

    With immediate binding, the scheduler is forced to select a node that has access to the volume after it is available.

  • When the pod is tentatively scheduled onto a node (WaitForFirstConsumer volume binding mode).

    This volume binding option is recommended for generic ephemeral volumes because then the scheduler can choose a suitable node for the pod.

In terms of resource ownership, a pod that has generic ephemeral storage is the owner of the PVCs that provide that ephemeral storage. When the pod is deleted, the Kubernetes garbage collector deletes the PVC, which then usually triggers deletion of the volume because the default reclaim policy of storage classes is to delete volumes. You can create quasi-ephemeral local storage by using a storage class with a reclaim policy of retain. The storage outlives the pod, and in this case, you must ensure that volume clean-up happens separately. While these PVCs exist, they can be used like any other PVC. In particular, they can be referenced as data sources in volume cloning or snapshotting. The PVC object also holds the current status of the volume.

4.3. Security

You can enable the generic ephemeral volume feature so that if a user can create pods, they can also create persistent volume claims (PVCs) indirectly.

The generic ephemeral volume feature works even if these users do not have permission to create PVCs directly. Cluster administrators must be aware of this. If this does not fit your security model, use an admission webhook that rejects objects such as pods that have a generic ephemeral volume.

The normal namespace quota for PVCs still applies, so even if users are allowed to use this new mechanism, they cannot use it to circumvent other policies.

4.4. Persistent volume claim naming

To avoid resource conflicts, review the naming convention for automatically created persistent volume claims (PVCs). Because the system generates names by combining the pod name and volume name with a hyphen, you must ensure manually created resources do not inadvertently match this pattern.

For example, pod-a with volume scratch and pod with volume a-scratch both end up with the same PVC name, pod-a-scratch.

Such conflicts are detected, and a PVC is only used for an ephemeral volume if it was created for the pod. This check is based on the ownership relationship. An existing PVC is not overwritten or modified, but this does not resolve the conflict. Without the right PVC, a pod cannot start.

Important

Be careful when naming pods and volumes inside the same namespace so that naming conflicts do not occur.
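
The naming collision from the example above is easy to reproduce. The following Python sketch uses a hypothetical helper, `ephemeral_pvc_name`, that joins the pod name and the volume name with a hyphen as described in this section.

```python
def ephemeral_pvc_name(pod_name: str, volume_name: str) -> str:
    """Build the PVC name from the pod name and the volume name."""
    return f"{pod_name}-{volume_name}"


first = ephemeral_pvc_name("pod-a", "scratch")
second = ephemeral_pvc_name("pod", "a-scratch")

print(first)            # pod-a-scratch
print(first == second)  # True
```

Because both combinations produce the same name, only the pod that owns the existing PVC can use it, and the other pod cannot start.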

4.5. Creating generic ephemeral volumes

Configure temporary storage by creating generic ephemeral volumes that use drivers that support dynamic provisioning.

Procedure

  1. Create the pod object definition and save it to a file.
  2. Include the generic ephemeral volume information in the file.

    my-example-pod-with-generic-vols.yaml

    kind: Pod
    apiVersion: v1
    metadata:
      name: my-app
    spec:
      containers:
        - name: my-frontend
          image: busybox:1.28
          volumeMounts:
          - mountPath: "/mnt/storage"
            name: data
          command: [ "sleep", "1000000" ]
      volumes:
        - name: data
          ephemeral:
            volumeClaimTemplate:
              metadata:
                labels:
                  type: my-app-ephvol
              spec:
                accessModes: [ "ReadWriteOnce" ]
                storageClassName: "topolvm-provisioner"
                resources:
                  requests:
                    storage: 1Gi

    • volumes.name: Specifies the name for the generic ephemeral volume.

Chapter 5. Using persistent storage

Managing storage is a distinct problem from managing compute resources. MicroShift uses the Kubernetes persistent volume (PV) framework to allow node administrators to provision persistent storage for a node. Developers can use persistent volume claims (PVCs) to request PV resources without having specific knowledge of the underlying storage infrastructure.

You can use security context constraints (SCCs) to control permissions for the pods in your node. These permissions determine the actions that a pod can perform and what resources it can access. You can use SCCs to define a set of conditions that a pod must run with to be accepted into the system.

For more information, see "Managing security context constraints".

Important

Only RWO volume mounts are supported. Pods might be blocked by SCCs if they do not operate within the SCC contexts.

5.3. Persistent storage overview

Stateful applications deployed in containers require persistent storage. MicroShift uses a pre-provisioned storage framework called persistent volumes (PV) to allow node administrators to provision persistent storage. The data inside these volumes can exist beyond the lifecycle of an individual pod. Developers can use persistent volume claims (PVCs) to request storage.

PVCs are specific to a namespace, and are created and used by developers as a means to use a PV. PV resources on their own are not scoped to any single namespace; they can be shared across the entire Red Hat build of MicroShift node and claimed from any namespace. After a PV is bound to a PVC, that PV cannot be bound to additional PVCs. This has the effect of scoping a bound PV to a single namespace.

PVs are defined by a PersistentVolume API object, which represents a piece of existing storage in the cluster that was either statically provisioned by the cluster administrator or dynamically provisioned using a StorageClass object. It is a resource in the cluster just like a node is a cluster resource.

PVs are volume plugins like Volumes but have a lifecycle that is independent of any individual pod that uses the PV. PV objects capture the details of the implementation of the storage, be that LVM, the host filesystem such as hostpath, or raw block devices.

Important

High availability of storage in the infrastructure is left to the underlying storage provider.

Like PersistentVolumes, PersistentVolumeClaims (PVCs) are API objects. A PVC represents a request for storage by a developer. It is similar to a pod in that pods consume node resources and PVCs consume PV resources. For example, pods can request specific levels of resources, such as CPU and memory, while PVCs can request specific storage capacity and access modes. Access modes supported by OpenShift Container Platform are also definable in Red Hat build of MicroShift. However, because Red Hat build of MicroShift does not support multi-node deployments, only ReadWriteOnce (RWO) is pertinent.

5.5. Lifecycle of a volume and claim

PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs has the following lifecycle.

5.5.1. Provision storage

In response to requests from a developer defined in a PVC, a cluster administrator configures one or more dynamic provisioners that provision storage and a matching PV.

5.5.2. Bind claims

When you create a PVC, you request a specific amount of storage, specify the required access mode, and create a storage class to describe and classify the storage. The control loop in the master watches for new PVCs and binds the new PVC to an appropriate PV. If an appropriate PV does not exist, a provisioner for the storage class creates one.

The size of the bound PV might exceed the size that your PVC requested. This is especially true with manually provisioned PVs. To minimize the excess, Red Hat build of MicroShift binds to the smallest PV that matches all other criteria.

Claims remain unbound indefinitely if a matching volume does not exist or cannot be created with any available provisioner servicing a storage class. Claims are bound as matching volumes become available. For example, a cluster with many manually provisioned 50Gi volumes would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.

5.5.3. Use pods and claimed PVs

Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a pod. For those volumes that support multiple access modes, you must specify which mode applies when you use the claim as a volume in a pod.

Once you have a claim and that claim is bound, the bound PV belongs to you for as long as you need it. You can schedule pods and access claimed PVs by including persistentVolumeClaim in the pod’s volumes block.

Note

If you attach persistent volumes that have high file counts to pods, those pods can fail or can take a long time to start. For more information, see When using Persistent Volumes with high file counts in OpenShift, why do pods fail to start or take an excessive amount of time to achieve "Ready" state?.

5.5.4. Release a persistent volume

When you are finished with a volume, you can delete the PVC object from the API, which allows reclamation of the resource. The volume is considered released when the claim is deleted, but it is not yet available for another claim. The previous claimant’s data remains on the volume and must be handled according to policy.

5.5.5. Reclaim policy for persistent volumes

The reclaim policy of a persistent volume tells the cluster what to do with the volume after it is released. A volume’s reclaim policy can be Retain, Recycle, or Delete.

  • Retain reclaim policy allows manual reclamation of the resource for those volume plugins that support it.
  • Recycle reclaim policy recycles the volume back into the pool of unbound persistent volumes once it is released from its claim.

Important

The Recycle reclaim policy is deprecated in Red Hat build of MicroShift 4. Dynamic provisioning is recommended for equivalent and better functionality.

  • Delete reclaim policy deletes both the PersistentVolume object from Red Hat build of MicroShift and the associated storage asset in external infrastructure, such as Amazon Elastic Block Store (Amazon EBS) or VMware vSphere.

Note

Dynamically provisioned volumes are always deleted.

5.5.6. Reclaiming a persistent volume manually

When a persistent volume claim (PVC) is deleted, the underlying logical volume is handled according to the reclaimPolicy.

Procedure

To manually reclaim the PV as a cluster administrator:

  1. Delete the PV by running the following command:

    $ oc delete pv <pv_name>

    The associated storage asset in the external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume, still exists after the PV is deleted.

  2. Clean up the data on the associated storage asset.
  3. Delete the associated storage asset. Alternatively, to reuse the same storage asset, create a new PV with the storage asset definition.

The reclaimed PV is now available for use by another PVC.

You can change the reclaim policy of a persistent volume.

Procedure

  1. List the persistent volumes in your cluster:

    $ oc get pv

    Example output

    NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM             STORAGECLASS     REASON    AGE
    pvc-b6efd8da-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound     default/claim1    manual                     10s
    pvc-b95650f8-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound     default/claim2    manual                     6s
    pvc-bb3ca71d-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound     default/claim3    manual                     3s

  2. Choose one of your persistent volumes and change its reclaim policy:

    $ oc patch pv <your-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  3. Verify that your chosen persistent volume has the right policy:

    $ oc get pv

    Example output

    NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM             STORAGECLASS     REASON    AGE
    pvc-b6efd8da-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound     default/claim1    manual                     10s
    pvc-b95650f8-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound     default/claim2    manual                     6s
    pvc-bb3ca71d-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Retain          Bound     default/claim3    manual                     3s

    In the preceding output, the volume bound to claim default/claim3 now has a Retain reclaim policy. The volume will not be automatically deleted when a user deletes claim default/claim3.

5.6. Persistent volumes

Each PV contains a spec and a status, which are the specification and status of the volume, for example:

PersistentVolume object definition example

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  ...
status:
  ...

where:

metadata.name
Specifies the name of the persistent volume.
spec.capacity.storage
Specifies the amount of storage available to the volume.
spec.accessModes
Specifies the access mode, defining the read-write and mount permissions.
spec.persistentVolumeReclaimPolicy
Specifies the reclaim policy, indicating how the resource should be handled once it is released.

5.6.1. Capacity

Generally, a persistent volume (PV) has a specific storage capacity. This is set by using the capacity attribute of the PV.

Currently, storage capacity is the only resource that can be set or requested. Future attributes may include IOPS, throughput, and so on.

5.6.2. Supported access modes

LVMS is the only CSI plugin Red Hat build of MicroShift supports. The hostPath and LVs built in to OpenShift Container Platform also support RWO.

5.6.3. Phase

Volumes can be found in one of the following phases:

Table 5.1. Volume phases

Phase       Description
Available   A free resource not yet bound to a claim.
Bound       The volume is bound to a claim.
Released    The claim was deleted, but the resource is not yet reclaimed by the cluster.
Failed      The volume has failed its automatic reclamation.

You can view the name of the PVC that is bound to the PV by running the following command:

$ oc get pv <pv_name>

5.6.3.1. Mount options

You can specify mount options while mounting a PV by using the attribute mountOptions.

For example:

Mount options example

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: topolvm-provisioner
mountOptions:
  - uid=1500
  - gid=1500
parameters:
  csi.storage.k8s.io/fstype: xfs
provisioner: topolvm.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Note

The mountOptions parameter values are not validated. Incorrect values cause the mount to fail and an event to be logged to the PVC.

To enable concurrent access for pods on a single node, configure the ReadWriteOnce (RWO) access mode for your Persistent Volume Claims (PVCs). This setting allows multiple workloads on the same node to read from and write to the same Persistent Volume (PV) simultaneously.

Sometimes pods on the same node cannot read from or write to the same PV. This happens when the pods do not have the same SELinux context.

Persistent volumes can be mounted, and later claimed by PVCs, with the RWO access mode.

5.8. Checking the pods for mismatch

To ensure workload consistency, check the pods running on MicroShift for mismatches. Identifying these discrepancies helps verify that your running workloads match the expected configuration.

Procedure

  1. List the mount point within the first pod by running the following command:

    $ oc get pod <pod_name_a> -o jsonpath='{.spec.containers[*].volumeMounts[*].mountPath}'
    • Replace <pod_name_a> with the name of the first pod.

      Example output

      /files /var/run/secrets/kubernetes.io/serviceaccount

  2. List the mount point within the second pod by running the following command:

    $ oc get pod <pod_name_b> -o jsonpath='{.spec.containers[*].volumeMounts[*].mountPath}'
    • Replace <pod_name_b> with the name of the second pod.

      Example output

      /files /var/run/secrets/kubernetes.io/serviceaccount

  3. Check the context and permissions inside the first pod by running the following command:

    $ oc rsh <pod_name_a> ls -lZah <pvc_mountpoint>
    • Replace <pod_name_a> with the name of the first pod.
    • Replace <pvc_mountpoint> with the mount point within the first pod.

      Example output

      total 12K
      dr-xr-xr-x.   1 root root system_u:object_r:container_file_t:s0:c398,c806   40 Feb 17 13:36 .
      dr-xr-xr-x.   1 root root system_u:object_r:container_file_t:s0:c398,c806   40 Feb 17 13:36 ..
      [...]

  4. Check the context and permissions inside the second pod by running the following command:

    $ oc rsh <pod_name_b> ls -lZah <pvc_mountpoint>
    • Replace <pod_name_b> with the name of the second pod.
    • Replace <pvc_mountpoint> with the mount point within the second pod.

      Example output

      total 12K
      dr-xr-xr-x.   1 root root system_u:object_r:container_file_t:s0:c15,c25   40 Feb 17 13:34 .
      dr-xr-xr-x.   1 root root system_u:object_r:container_file_t:s0:c15,c25   40 Feb 17 13:34 ..
      [...]

  5. Compare the two outputs to check whether the SELinux contexts differ.

5.9. Updating the pods which have mismatch

To resolve configuration discrepancies, update the SELinux context of the pods that display a mismatch status. This process ensures that your running workloads align with the expected configuration, maintaining consistency across your cluster.

Procedure

  1. When there is a mismatch of the SELinux context, create a new security context constraint (SCC) and assign it to both pods. To create an SCC, see "Creating security context constraints".
  2. Update the SELinux context as shown in the following example:

    Example output

     [...]
     securityContext:privileged
          seLinuxOptions:MustRunAs
            level: "s0:cXX,cYY"
      [...]
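As a minimal sketch of what a matching SELinux context can look like at the pod level, the following manifest sets a multi-category security (MCS) level in the pod securityContext. The pod name, image, claim name, and the s0:c10,c26 level are assumptions for illustration only; use a level that your SCC permits, and apply the same level to both pods.

```yaml
# Hypothetical pod; repeat the same seLinuxOptions.level in the second pod.
apiVersion: v1
kind: Pod
metadata:
  name: pod-a                                  # assumed name
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c10,c26"                      # assumed level; must be identical in both pods
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
      volumeMounts:
        - name: shared-data
          mountPath: /files
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: myclaim                     # assumed PVC name
```

When both pods carry the same level, files written by one pod to the shared volume get the same SELinux label that the other pod runs with.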

5.10. Verifying pods after resolving a mismatch

To confirm that the mismatch is resolved, verify the security context constraint (SCC) and the SELinux label of the pods. Checking these settings ensures that your workloads are functioning with the correct security configurations.

Procedure

  1. Verify that the same SCC is assigned to the first pod by running the following command:

    $ oc describe pod <pod_name_a> |grep -i scc
    • Replace <pod_name_a> with the name of the first pod.

      Example output

      openshift.io/scc: restricted

  2. Verify that the same SCC is assigned to the second pod by running the following command:

    $ oc describe pod <pod_name_b> |grep -i scc
    • Replace <pod_name_b> with the name of the second pod.

      Example output

      openshift.io/scc: restricted

  3. Verify that the same SELinux label is applied to the first pod by running the following command:

    $ oc exec <pod_name_a> -- ls -laZ <pvc_mountpoint>
    • Replace <pod_name_a> with the name of the first pod.
    • Replace <pvc_mountpoint> with the mount point within the first pod.

      Example output

      total 4
      drwxrwsrwx. 2 root       1000670000 system_u:object_r:container_file_t:s0:c10,c26 19 Aug 29 18:17 .
      dr-xr-xr-x. 1 root       root       system_u:object_r:container_file_t:s0:c10,c26 61 Aug 29 18:16 ..
      -rw-rw-rw-. 1 1000670000 1000670000 system_u:object_r:container_file_t:s0:c10,c26 29 Aug 29 18:17 test1
      [...]

  4. Verify that the same SELinux label is applied to the second pod by running the following command:

    $ oc exec <pod_name_b> -- ls -laZ <pvc_mountpoint>
    • Replace <pod_name_b> with the name of the second pod.
    • Replace <pvc_mountpoint> with the mount point within the second pod.

      Example output

      total 4
      drwxrwsrwx. 2 root       1000670000 system_u:object_r:container_file_t:s0:c10,c26 19 Aug 29 18:17 .
      dr-xr-xr-x. 1 root       root       system_u:object_r:container_file_t:s0:c10,c26 61 Aug 29 18:16 ..
      -rw-rw-rw-. 1 1000670000 1000670000 system_u:object_r:container_file_t:s0:c10,c26 29 Aug 29 18:17 test1
      [...]

5.11. Persistent volume claims

To define storage requirements for your workloads, review the structure of a PersistentVolumeClaim (PVC). This object includes a spec field to configure the request and a status field to monitor the current state of the claim.

PersistentVolumeClaim object definition example

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: gold
status:
# ...

where:

metadata.name
Specifies the name of the PVC.
spec.accessModes
Specifies the access mode, defining the read/write and mount permissions.
spec.resources.requests.storage
Specifies the amount of storage requested by the PVC.
spec.storageClassName
Specifies the name of the StorageClass required by the claim.

5.11.1. Storage classes

To request specific storage capabilities, define the StorageClass name in the storageClassName attribute of your PersistentVolumeClaim (PVC). This setting ensures the claim binds only to matching PersistentVolumes (PVs) or triggers dynamic provisioning if the cluster administrator has configured on-demand creation.

The cluster administrator can also set a default storage class for all PVCs. When a default storage class is configured, a PVC must explicitly set the StorageClass annotation or storageClassName to "" to be bound to a PV without a storage class.

Note

If more than one storage class is marked as default, a PVC can only be created if the storageClassName is explicitly specified. Therefore, only set one storage class as the default.
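As an illustration of opting out of the default storage class, the following sketch shows a PVC that sets storageClassName to an empty string so that it can bind only to a PV that has no storage class. The claim name and requested size are assumed values.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim-no-class      # hypothetical name
spec:
  storageClassName: ""        # empty string: do not use the default storage class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi            # assumed size
```

Omitting the storageClassName field entirely is different from setting it to "": an omitted field lets the default storage class apply.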

5.11.2. Claims as volumes

To enable pods to access storage resources, configure persistent volume claims (PVCs) as volumes. A claim must exist in the same namespace as the pod that uses it. The cluster finds the claim in the pod's namespace, uses it to locate the backing PersistentVolume (PV), and mounts the volume to the host and into the pod.

Claims use the same conventions as volumes when requesting storage with specific access modes.

Claims, like pods, can request specific quantities of a resource. In this case, the request is for storage. The same resource model applies to volumes and claims.

Mount volume to the host and into the pod example

kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: dockerfile/nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
# ...

where:

volumeMounts.mountPath
Specifies the path to mount the volume inside the pod. Do not mount to the container root, /, or any path that is the same in the host and the container. Doing so can corrupt your host system, for example the host /dev/pts files, if the container is sufficiently privileged. It is safe to mount the host by using /host.
volumeMounts.name
Specifies the name of the volume to mount.
persistentVolumeClaim.claimName
Specifies the name of the PVC to use. The PVC must exist in the same namespace as the pod.

5.11.3. Setting PVC viewing permissions

To view persistent volume claim (PVC) usage statistics and track resource consumption, you must have the necessary privileges.

Procedure

  • If you have admin privileges, log on to MicroShift as an admin.
  • If you do not have admin privileges, complete the following steps:

    • Create cluster roles for the user by running the following command:

      $ oc create clusterrole routes-view --verb=get,list --resource=routes
    • Add the routes-view cluster role for the user by running the following command:

      $ oc adm policy add-cluster-role-to-user routes-view <user_name>
    • Replace <user_name> with the user name.
    • Add the cluster-monitoring-view cluster role for the user by running the following command:

      $ oc adm policy add-cluster-role-to-user cluster-monitoring-view <user_name>
    • Replace <user_name> with the user name.

5.11.4. Viewing PVC usage statistics

To monitor storage consumption, view the usage statistics for Persistent Volume Claims (PVCs). By accessing these metrics, you can track resource use and ensure that your workloads have sufficient capacity.

Important

The PVC usage statistics command is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Procedure

  • To view statistics across a cluster, run the following command:

    $ oc adm top pvc -A

    Example command output

    NAMESPACE     NAME         USAGE(%)
    namespace-1   data-etcd-1  3.82%
    namespace-1   data-etcd-0  3.81%
    namespace-1   data-etcd-2  3.81%
    namespace-2   mypvc-fs-gp3 0.00%
    default       mypvc-fs     98.36%

  • To view PVC usage statistics for a specified namespace, run the following command:

    $ oc adm top pvc -n <namespace_name>
    • Where <namespace_name> is the name of the specified namespace.

      Example command output

      NAMESPACE     NAME        USAGE(%)
      namespace-1   data-etcd-2 3.81%
      namespace-1   data-etcd-0 3.81%
      namespace-1   data-etcd-1 3.82%

      In this example, the specified namespace is namespace-1.

  • To view usage statistics for a specified PVC and for a specified namespace, run the following command:

    $ oc adm top pvc <pvc_name> -n <namespace_name>
    • Where <pvc_name> is the name of specified PVC.
    • Where <namespace_name> is the name of the specified namespace.

      Example command output

      NAMESPACE   NAME        USAGE(%)
      namespace-1 data-etcd-0 3.81%

      In this example, the specified namespace is namespace-1 and the specified PVC is data-etcd-0.

5.12. Reduce pod timeouts by using fsGroup

To reduce pod timeouts when using a storage volume with many files, configure the fsGroup field. By specifying this field, you can manage how file ownership and permissions are applied, preventing delays caused by the default recursive permission changes on large volumes.

If a storage volume contains many files (~1,000,000 or greater), you may experience pod timeouts.

This can occur because, by default, Red Hat build of MicroShift recursively changes ownership and permissions for the contents of each volume to match the fsGroup specified in the securityContext of the pod when that volume is mounted. For volumes with many files, checking and changing ownership and permissions can be time consuming, slowing pod startup. You can use the fsGroupChangePolicy field inside a securityContext to control the way that Red Hat build of MicroShift checks and manages ownership and permissions for a volume.

fsGroupChangePolicy defines behavior for changing ownership and permission of the volume before being exposed inside a pod. This field only applies to volume types that support fsGroup-controlled ownership and permissions. This field has two possible values:

  • OnRootMismatch: Only change permissions and ownership if the permissions and ownership of the root directory do not match the expected permissions of the volume. This can help shorten the time it takes to change ownership and permission of a volume to reduce pod timeouts.
  • Always: Always change permission and ownership of the volume when a volume is mounted.

fsGroupChangePolicy example

securityContext:
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000
  fsGroupChangePolicy: "OnRootMismatch"
  ...

where:

fsGroupChangePolicy
OnRootMismatch skips the recursive permission change when the root directory already has the expected ownership and permissions, which helps to avoid pod timeout problems.

Note

The fsGroupChangePolicy field has no effect on ephemeral volume types, such as secret, configMap, and emptyDir.
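To show where these fields sit in a complete manifest, the following sketch places fsGroup and fsGroupChangePolicy in the pod-level securityContext of a pod that mounts a PVC. The pod, volume, and claim names are reused from the earlier examples, and the image is a placeholder; treat all of them as assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  securityContext:                             # pod-level security context
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    fsGroupChangePolicy: "OnRootMismatch"      # skip the recursive change when the root already matches
  containers:
    - name: myfrontend
      image: registry.example.com/app:latest   # placeholder image
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim                     # assumed PVC name
```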

Chapter 6. Expanding persistent volumes

To increase available storage capacity, expand persistent volumes in MicroShift. You can resize existing volumes to accommodate growing application data requirements without recreating storage resources.

6.1. Expanding CSI volumes

You can use the Container Storage Interface (CSI) to expand storage volumes after they have already been created.

CSI volume expansion does not support the following:

  • Recovering from failure when expanding volumes
  • Shrinking

Prerequisites

  • The underlying CSI driver supports resize.
  • Dynamic provisioning is used.
  • The controlling StorageClass object has allowVolumeExpansion set to true. For more information, see "Enabling volume expansion support."

Procedure

  1. For the persistent volume claim (PVC), set .spec.resources.requests.storage to the desired new size.
  2. Watch the status.conditions field of the PVC to see if the resize has completed. Red Hat build of MicroShift adds the Resizing condition to the PVC during expansion, which is removed after expansion completes.
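As an illustration of the two steps above, the following sketch shows what a PVC might look like partway through an expansion, for example in the output of oc get pvc myclaim -o yaml. The claim name, storage class name, and sizes are assumptions. You edit only the spec; the status block is written by the cluster.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim                           # assumed claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi                       # step 1: raised from an assumed original 8Gi
  storageClassName: topolvm-provisioner   # assumed class with allowVolumeExpansion: true
status:
  capacity:
    storage: 8Gi                          # old size while the resize is still in progress
  conditions:
    - type: Resizing                      # step 2: removed after the expansion completes
      status: "True"
```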

6.3. Expanding local volumes

You can manually expand persistent volumes (PVs) and persistent volume claims (PVCs) created by using the local storage operator (LSO).

Procedure

  1. Expand the underlying devices. Ensure that appropriate capacity is available on these devices.
  2. Update the corresponding PV objects to match the new device sizes by editing the .spec.capacity field of the PV.
  3. For the storage class that is used for binding the PVC to PV, set the allowVolumeExpansion field to true.
  4. For the PVC, set the .spec.resources.requests.storage field to match the new size.

    Kubelet automatically expands the underlying file system on the volume, if necessary, and updates the status field of the PVC to reflect the new size.

Expanding PVCs based on volume types that need file system resizing, such as GCE Persistent Disk volumes (gcePD), AWS Elastic Block Store (EBS), and Cinder, is a two-step process. First, expand the volume objects in the cloud provider. Second, expand the file system on the node.

Expanding the file system on the node only happens when a new pod is started with the volume.

Prerequisites

  • The controlling StorageClass object must have allowVolumeExpansion set to true.

Procedure

  1. Edit the PVC and request a new size by editing spec.resources.requests.storage. For example, the following manifest expands the ebs PVC to 8Gi:

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: ebs
    spec:
      storageClassName: "storageClassWithFlagSet"
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
    • requests.storage: Specifies the size for the PVC.
  2. After the cloud provider object has finished resizing, the PVC is set to FileSystemResizePending. Check the condition by entering the following command:

    $ oc describe pvc <pvc_name>
  3. When the cloud provider object has finished resizing, the PersistentVolume object reflects the newly requested size in PersistentVolume.Spec.Capacity. At this point, you can create or re-create a pod from the PVC to finish the file system resizing. Once the pod is running, the newly requested size is available and the FileSystemResizePending condition is removed from the PVC.

To retry a failed or pending resize request, update the spec.resources.requests.storage field in the persistent volume claim (PVC). You must specify a value larger than the original volume size to successfully retrigger the operation.

Entering a smaller resize value in the .spec.resources.requests.storage field for the existing PVC does not work.

Procedure

  1. Mark the persistent volume (PV) that is bound to the PVC with the Retain reclaim policy. Change the persistentVolumeReclaimPolicy field to Retain.
  2. Delete the PVC.
  3. Manually edit the PV and delete the claimRef entry from the PV specs to ensure that the newly created PVC can bind to the PV marked Retain. This marks the PV as Available.
  4. Re-create the PVC in a smaller size, or a size that can be allocated by the underlying storage provider.
  5. Restore the reclaim policy on the PV.
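The PV fields that steps 1 and 3 touch can be sketched as follows. The PV name, claim name, and namespace are assumed values reused from earlier examples. A real claimRef entry contains additional fields, such as the UID of the deleted PVC, that are omitted here.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001                            # assumed PV name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # step 1: keep the volume when the PVC is deleted
  claimRef:                               # step 3: delete this whole entry from the PV
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: myclaim                         # assumed name of the deleted PVC
    namespace: default                    # assumed namespace
```

After the claimRef entry is removed, the PV no longer points at the deleted claim, so a newly created PVC can bind to it.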

Chapter 7. Working with volume snapshots

MicroShift administrators can use volume snapshots to help protect against data loss by using the supported MicroShift logical volume manager storage (LVMS) Container Storage Interface (CSI) provider. Familiarity with persistent volumes is required.

A snapshot represents the state of the storage volume in a node at a particular point in time. Volume snapshots can also be used to provision new volumes. Snapshots are created as read-only logical volumes (LVs) located on the same device as the original data.

A MicroShift administrator can complete the following tasks using CSI volume snapshots:

  • Create a snapshot of an existing persistent volume claim (PVC).
  • Back up a volume snapshot to a secure location.
  • Restore a volume snapshot as a different PVC.
  • Delete an existing volume snapshot.

Important

Only the logical volume manager storage (LVMS) plugin CSI driver is supported by MicroShift.

7.1. About LVM thin volumes

To enable advanced storage capabilities, such as volume snapshots and volume cloning, complete specific configuration steps. Preparing your environment ensures that the necessary components are active and ready to support these features for your workloads.

The following list describes the configuration steps:

  • Configure both the logical volume manager storage (LVMS) provider and the node.
  • Provision a logical volume manager (LVM) thin-pool on the RHEL for Edge host.
  • Attach LVM thin-pools to a volume group.

Important

To create Container Storage Interface (CSI) snapshots, you must configure thin volumes on the RHEL for Edge host. The CSI does not support volume shrinking.

Important

When using thin provisioning, you must monitor the storage pool and add more capacity as the available physical space runs out. You can configure the storage pool to auto expand when there is available space within the volume group (VG). See "Creating a thin logical volume".

For LVMS to manage thin logical volumes (LVs), a thin-pool device-class array must be specified in the etc/lvmd.yaml configuration file. Multiple thin-pool device classes are permitted.

If additional storage pools are configured with device classes, then additional storage classes must also exist to expose the storage pools to users and workloads. To enable dynamic provisioning on a thin-pool, a StorageClass resource must be present on the node. The StorageClass resource specifies the source device-class array in the topolvm.io/device-class parameter.

Example lvmd.yaml file that specifies a single device class for a thin-pool

socket-name:
device-classes:
  - name: thin
    default: true
    spare-gb: 0
    thin-pool:
      name: thin
      overprovision-ratio: 1
    type: thin
    volume-group: ssd

where:

socket-name
Specifies the UNIX domain socket endpoint of gRPC. Defaults to /run/lvmd/lvmd.socket. Takes a string value.
device-classes
Specifies a list of maps for the settings for each device-class.
device-classes.name
Specifies the unique name of the device-class. Takes a string value.
device-classes.spare-gb
Specifies storage capacity in GB to be left unallocated in the volume group. Defaults to 0. Takes an unsigned 64-bit integer.
thin-pool.overprovision-ratio
Specifies a float factor by which you can provision additional storage based on the available storage in the thin pool. For example, if this field is set to 10, you can provision up to 10 times the amount of available storage in the thin pool. To disable over-provisioning, set this field to 1.
type
Specifies the provisioning type of the device class. Thin provisioning is required to create volume snapshots.
volume-group
Specifies the group where the device-class creates the logical volumes. Takes a string value.

Important

When multiple PVCs are created simultaneously, a race condition prevents LVMS from accurately tracking the allocated space and preserving the storage capacity for a device class. Use separate volume groups and device classes to protect the storage of highly dynamic workloads from each other.

7.1.1. Storage classes

To configure the workload layer interface for device class selection, review the supported storage class parameters in MicroShift. By understanding these parameters, you can define how storage is provisioned and managed for your specific workload requirements.

The following storage class parameters are supported in MicroShift:

  • The csi.storage.k8s.io/fstype parameter selects the file system type. Both xfs and ext4 file system types are supported.
  • The topolvm.io/device-class parameter is the name of the device class. If a device class is not provided, the default device class is assumed.

Multiple storage classes can refer to the same device class. You can provide varying sets of parameters for the same backing device class, such as xfs and ext4 variants.

Example MicroShift default storage class resource

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: topolvm-provisioner
parameters:
  "csi.storage.k8s.io/fstype": "xfs"
provisioner: topolvm.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion:
# ...

where:

storageclass.kubernetes.io/is-default-class
Marks this storage class as the default. If a PVC does not specify a storage class, this class is assumed. There can only be one default storage class in a MicroShift node. Having no value assigned to this annotation is also supported.
csi.storage.k8s.io/fstype
Specifies what file system to provision on the volume. Options are "xfs" and "ext4".
provisioner
Specifies what provisioner should manage this class.
volumeBindingMode
Specifies whether to provision the volume before a client pod is present or immediately. Options are WaitForFirstConsumer and Immediate. WaitForFirstConsumer is recommended to ensure that storage is only provisioned for pods that can be scheduled.
allowVolumeExpansion
Specifies if PVCs provisioned from the StorageClass permit expansion. The MicroShift LVMS CSI plugin does support volume expansion, but if this value is set to false, expansion is blocked.
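
Because multiple storage classes can refer to the same device class, you might generate xfs and ext4 variants from one template. The following is a minimal sketch under stated assumptions: the helper render_storage_class, the class names topolvm-xfs and topolvm-ext4, and the device class name ssd-thin are hypothetical. Only the two parameters listed in this section are used.

```shell
#!/bin/sh
# Hypothetical helper: print a StorageClass manifest for a given name,
# file system type, and device class.
render_storage_class() {
    name=$1; fstype=$2; device_class=$3
    cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ${name}
parameters:
  "csi.storage.k8s.io/fstype": "${fstype}"
  "topolvm.io/device-class": "${device_class}"
provisioner: topolvm.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
}

# Two classes backed by the same (hypothetical) device class "ssd-thin".
render_storage_class topolvm-xfs  xfs  ssd-thin
echo "---"
render_storage_class topolvm-ext4 ext4 ssd-thin
```

To create the classes on a node, you can pipe the output to oc apply -f -.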

7.2. Volume snapshot classes

To enable dynamic snapshotting in LVMS, ensure that at least one VolumeSnapshotClass configuration file is present on the node. This resource defines the Container Storage Interface (CSI) parameters required to create and manage volume snapshots.

Important

You must enable thin logical volumes to take logical volume snapshots.

Example VolumeSnapshotClass configuration file

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: topolvm-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: topolvm.io
deletionPolicy: Delete

where:

snapshot.storage.kubernetes.io/is-default-class
Specifies the VolumeSnapshotClass configuration file to use when a VolumeSnapshot does not specify one. A VolumeSnapshot is a user's request for a snapshot of a volume.
driver
Identifies the snapshot provisioner that manages user requests for volume snapshots of this class.
deletionPolicy
Specifies whether the VolumeSnapshotContent objects and the backing snapshots are kept or deleted when a bound VolumeSnapshot is deleted. Valid values are Retain or Delete.

7.3. About volume snapshots

You can use volume snapshots with logical volume manager (LVM) thin volumes to help protect against data loss from applications running in a MicroShift node. MicroShift only supports the logical volume manager storage (LVMS) Container Storage Interface (CSI) provider.

Note

LVMS only supports the volumeBindingMode of the storage class being set to WaitForFirstConsumer. This setting means the storage volume is not provisioned until a pod is ready to mount it.

Example workload that deploys a single pod and PVC

$ oc apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim-thin
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topolvm-provisioner-thin
---
apiVersion: v1
kind: Pod
metadata:
  name: base
spec:
  containers:
  - command:
    - nginx
    - -g
    - 'daemon off;'
    image: registry.redhat.io/rhel8/nginx-122@sha256:908ebb0dec0d669caaf4145a8a21e04fdf9ebffbba5fd4562ce5ab388bf41ab2
    name: test-container
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    volumeMounts:
    - mountPath: /vol
      name: test-vol
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - name: test-vol
    persistentVolumeClaim:
      claimName: test-claim-thin
EOF

7.3.1. Creating a volume snapshot

To preserve the data on a PersistentVolumeClaim (PVC) at a specific point in time, create a volume snapshot. By using a volume snapshot, you can restore the volume to its previous state or provision new volumes with the saved data.

To create a snapshot of a MicroShift storage volume, you must first configure RHEL for Edge and the node.

In the following example procedure, the pod that the source volume is mounted to is deleted. Deleting the pod prevents data from being written to the volume during snapshot creation. Ensuring that no data is being written during a snapshot is crucial to creating a viable snapshot.

Prerequisites

  • User has root access to a MicroShift node.
  • MicroShift is running.
  • A device class defines an LVM thin-pool.
  • A VolumeSnapshotClass specifies driver: topolvm.io.
  • Any workload attached to the source PVC is paused or deleted. This helps avoid data corruption.
Important

All writes to the volume must be halted while you are creating the snapshot. If you do not halt writes, your data might be corrupted.

Procedure

  1. Prevent data from being written to the volume during snapshotting by using one of the two following steps:

    1. Delete the pod to ensure that no data is written to the volume during snapshotting by running the following command:

      $ oc delete pod my-pod
    2. Scale the replica count to zero on a pod that is managed with a replication controller. Setting the count to zero prevents the instant creation of a new pod when one is deleted.
  2. After all writes to the volume are halted, run a command similar to the example that follows. Insert your own configuration details.

    Example snapshot configuration

    # oc apply -f - <<EOF
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: <snapshot_name>
    spec:
      volumeSnapshotClassName: topolvm-snapclass
      source:
        persistentVolumeClaimName: test-claim-thin
    EOF

    where:

    kind
    Specifies the type of object to create, in this case VolumeSnapshot.
    metadata.name
    Specifies the name of the snapshot.
    volumeSnapshotClassName
    Specifies the name of the VolumeSnapshotClass object to use.
    persistentVolumeClaimName
    Specifies the source of the snapshot. The source is either persistentVolumeClaimName or volumeSnapshotContentName. In this example, a snapshot is created from a PVC named test-claim-thin.
  3. Wait for the storage driver to finish creating the snapshot by running the following command:

    $ oc wait volumesnapshot/<snapshot_name> --for=jsonpath='{.status.readyToUse}'=true
  4. When the volumeSnapshot object is in a ReadyToUse state, you can restore the state as a volume for future PVCs. Restart the pod or scale the replica count back up to the desired number.
  5. After you have created the volume snapshot, you can remount the source PVC to a new pod.

    Important

    Volume snapshots are located on the same devices as the original data. To use the volume snapshots as backups, move the snapshots to a secure location.
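
The procedure above can be sketched as one shell function for a workload that is managed by a scalable resource. This is an illustration under stated assumptions, not a supported tool: the function name snapshot_pvc, the workload name deployment/my-app, and the 120 second timeout are hypothetical. The snapshot class topolvm-snapclass comes from the earlier example.

```shell
#!/bin/sh
# Sketch: snapshot a PVC while its workload is scaled to zero.
# Usage: snapshot_pvc <workload> <replicas> <pvc_name> <snapshot_name>
#   <workload> is a scalable resource, for example deployment/my-app
#   (hypothetical name). <replicas> is the count to restore afterwards.
snapshot_pvc() {
    workload=$1; replicas=$2; pvc=$3; snap=$4

    # Step 1: halt writes by scaling the workload to zero replicas.
    # Scaling returns before the pods are gone; a real script would also
    # wait for pod termination before continuing.
    oc scale "$workload" --replicas=0 || return 1

    # Step 2: request the snapshot.
    oc apply -f - <<EOF || return 1
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snap}
spec:
  volumeSnapshotClassName: topolvm-snapclass
  source:
    persistentVolumeClaimName: ${pvc}
EOF

    # Step 3: wait until the storage driver reports the snapshot as ready.
    oc wait "volumesnapshot/${snap}" \
        --for=jsonpath='{.status.readyToUse}'=true --timeout=120s
    status=$?

    # Step 4: restore the original replica count, even if the wait failed.
    oc scale "$workload" --replicas="$replicas"
    return $status
}
```

For example, snapshot_pvc deployment/my-app 1 test-claim-thin my-snap scales the workload down, creates the snapshot, and scales the workload back to one replica.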

7.3.2. Backing up a volume snapshot

Snapshots of data from applications running on a MicroShift node are created as read-only logical volumes (LVs) located on the same devices as the original data. You must manually mount local volumes before they can be copied as persistent volumes (PVs) and used as backup copies. To use a snapshot of a MicroShift storage volume as a backup, find it on the local host and then move it to a secure location.

Prerequisites

  • You have root access to the host machine.
  • You have an existing volume snapshot.

Procedure

  1. Get the name of the volume snapshot by running the following command:

    $ oc get volumesnapshot -n <namespace> <snapshot_name> -o 'jsonpath={.status.boundVolumeSnapshotContentName}'
    • Replace <namespace> and <snapshot_name> with the namespace and snapshot name you used.
  2. Get the unique identity of the volume created on the storage backend by using the following command and inserting the name retrieved in the previous step:

    $ oc get volumesnapshotcontent snapcontent-<retrieved_volume_identity> -o 'jsonpath={.status.snapshotHandle}'
    • Replace <retrieved_volume_identity> with the volume identity.
  3. Display the snapshot that has the unique identity you retrieved in the previous step, to confirm which one you want to back up, by running the following command:

    $ sudo lvdisplay <retrieved_volume_identity>
    • Replace <retrieved_volume_identity> with the volume identity.

      Example output

      --- Logical volume ---
      LV Path                /dev/rhel/732e45ff-f220-49ce-859e-87ccca26b14c
      LV Name                732e45ff-f220-49ce-859e-87ccca26b14c
      VG Name                rhel
      LV UUID                6Ojwc0-YTfp-nKJ3-F9FO-PvMR-Ic7b-LzNGSx
      LV Write Access        read only
      LV Creation host, time rhel-92.lab.local, 2023-08-07 14:45:26 -0500
      LV Pool name           thinpool
      LV Thin origin name    a2d2dcdc-747e-4572-8c83-56cd873d3b07
      LV Status              available
      # open                 0
      LV Size                1.00 GiB
      Mapped size            1.04%
      Current LE             256
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           253:11

  4. Create a directory to use for mounting the LV by running the following command:

    $ sudo mkdir /mnt/snapshot
  5. Mount the LV using the device name for the retrieved snapshot handle by running the following command:

    $ sudo mount /dev/<retrieved_snapshot_handle> /mnt/snapshot
    • Replace <retrieved_snapshot_handle> with the device name.
  6. Copy the files from the mounted location and store them in a secure location by running the following command:

    $ sudo cp -r /mnt/snapshot <destination>
    • Replace <destination> with the path to the secure location.
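
The backup steps above can be chained in a single function. Treat the following as a sketch under stated assumptions rather than a supported script: the function name backup_snapshot is hypothetical, the snapshot handle is assumed to equal the LV name (as in the lvdisplay example output), and the bound content name is read from status.boundVolumeSnapshotContentName, the field defined by the snapshot.storage.k8s.io/v1 API. The read-only mount and the final unmount are additions to the listed steps.

```shell
#!/bin/sh
# Sketch: copy the contents of one volume snapshot to a destination directory.
# Usage: backup_snapshot <namespace> <snapshot_name> <destination>
backup_snapshot() {
    ns=$1; snap=$2; dest=$3
    mnt=/mnt/snapshot

    # Step 1: name of the VolumeSnapshotContent bound to the snapshot.
    content=$(oc get volumesnapshot -n "$ns" "$snap" \
        -o 'jsonpath={.status.boundVolumeSnapshotContentName}') || return 1

    # Step 2: identity of the snapshot on the storage back end.
    handle=$(oc get volumesnapshotcontent "$content" \
        -o 'jsonpath={.status.snapshotHandle}') || return 1

    # Step 3: resolve the handle to an LV path.
    # Assumption: the snapshot handle is the LV name.
    lv_path=$(sudo lvs --noheadings -o lv_path \
        --select "lv_name=$handle" | tr -d ' ')
    [ -n "$lv_path" ] || { echo "no LV found for $handle" >&2; return 1; }

    # Steps 4 to 6: mount the snapshot read-only, copy it, then unmount.
    sudo mkdir -p "$mnt" &&
    sudo mount -o ro "$lv_path" "$mnt" &&
    sudo cp -r "$mnt" "$dest"
    status=$?
    sudo umount "$mnt"
    return $status
}
```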

7.3.3. Restoring a volume snapshot

To recover data from a point-in-time copy, restore a volume snapshot to a new PersistentVolumeClaim (PVC). This process ensures that data from the source volume is preserved and you can verify the integrity of the restored content on the new claim.

The following workflow demonstrates snapshot restoration. In this example, the verification steps are also given to ensure that data written to a source persistent volume claim (PVC) is preserved and restored on a new PVC.

Important

A snapshot must be restored to a PVC of exactly the same size as the source volume of the snapshot. You can resize the PVC after the snapshot is restored successfully if a larger PVC is needed.

Procedure

  • Restore a snapshot by specifying the VolumeSnapshot object as the data source in a persistent volume claim by entering the following command:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: snapshot-restore
    spec:
      accessModes:
      - ReadWriteOnce
      dataSource:
        apiGroup: snapshot.storage.k8s.io
        kind: VolumeSnapshot
        name: my-snap
      resources:
        requests:
          storage: 1Gi
      storageClassName: topolvm-provisioner-thin
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: base
    spec:
      containers:
      - command:
          - nginx
          - -g
          - 'daemon off;'
        image: registry.redhat.io/rhel8/nginx-122@sha256:908ebb0dec0d669caaf4145a8a21e04fdf9ebffbba5fd4562ce5ab388bf41ab2
        name: test-container
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - mountPath: /vol
          name: test-vol
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      volumes:
      - name: test-vol
        persistentVolumeClaim:
          claimName: snapshot-restore
    EOF

Verification

  1. Wait for the pod to reach the Ready state:

    $ oc wait --for=condition=Ready pod/base
  2. When the new pod is ready, verify that the data from your application is correct in the snapshot.
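
Because the restored PVC must match the size of the source volume, it can help to template the claim and pass the size in. The following sketch assumes a hypothetical helper named render_restore_pvc. It also assumes that the storage driver populates status.restoreSize on the VolumeSnapshot object, which is a field of the snapshot API but is not confirmed for LVMS in this document.

```shell
#!/bin/sh
# Hypothetical helper: print a PVC manifest that restores a snapshot.
# Usage: render_restore_pvc <pvc_name> <snapshot_name> <size>
render_restore_pvc() {
    pvc=$1; snap=$2; size=$3
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc}
spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: ${snap}
  resources:
    requests:
      storage: ${size}
  storageClassName: topolvm-provisioner-thin
EOF
}

# Typical use on a node (not run here): read the size, then apply.
#   size=$(oc get volumesnapshot my-snap -o 'jsonpath={.status.restoreSize}')
#   render_restore_pvc snapshot-restore my-snap "$size" | oc apply -f -
render_restore_pvc snapshot-restore my-snap 1Gi
```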

7.3.4. Deleting a volume snapshot

You can configure how Red Hat build of MicroShift deletes volume snapshots.

Procedure

  1. Specify the deletion policy that you require in the VolumeSnapshotClass object, as shown in the following example:

    Example volumesnapshotclass.yaml file

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: csi-hostpath-snap
    driver: hostpath.csi.k8s.io
    deletionPolicy: Delete
    # ...

    • deletionPolicy: When deleting the volume snapshot, if the Delete value is set, the underlying snapshot is deleted along with the VolumeSnapshotContent object. If the Retain value is set, both the underlying snapshot and VolumeSnapshotContent object remain.

      Note

      If the Retain value is set and the VolumeSnapshot object is deleted without deleting the corresponding VolumeSnapshotContent object, the content remains. The snapshot itself is also retained in the storage back end.

  2. Delete the volume snapshot by entering the following command:

    $ oc delete volumesnapshot <volumesnapshot_name>
    • Replace <volumesnapshot_name> with the name of the volume snapshot you want to delete.

      Example output

      volumesnapshot.snapshot.storage.k8s.io "mysnapshot" deleted

  3. If the deletion policy is set to Retain, delete the volume snapshot content by entering the following command:

    $ oc delete volumesnapshotcontent <volumesnapshotcontent_name>
    • Replace <volumesnapshotcontent_name> with the content you want to delete.
  4. Optional: If the VolumeSnapshot object is not successfully deleted, enter the following command to remove any finalizers for the leftover resource so that the delete operation can continue:

    Important

    Only remove the finalizers if you are confident that there are no existing references from either persistent volume claims or volume snapshot contents to the VolumeSnapshot object. Even with the --force option, the delete operation does not delete snapshot objects until all finalizers are removed.

    $ oc patch -n $PROJECT volumesnapshot/$NAME --type=merge -p '{"metadata": {"finalizers":null}}'

    Example output

    volumesnapshotclass.snapshot.storage.k8s.io "csi-ocs-rbd-snapclass" deleted

    The finalizers are removed and the volume snapshot is deleted.
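
Steps 2 and 3 of this procedure depend on the deletion policy, so a script can read the policy before it deletes anything. The following is a minimal sketch: the function name delete_snapshot is hypothetical, and it assumes that the policy is visible in spec.deletionPolicy of the bound VolumeSnapshotContent object, as defined by the snapshot.storage.k8s.io/v1 API. It does not remove finalizers.

```shell
#!/bin/sh
# Sketch: delete a volume snapshot and, when the policy is Retain,
# also delete the leftover VolumeSnapshotContent object.
# Usage: delete_snapshot <namespace> <snapshot_name>
delete_snapshot() {
    ns=$1; snap=$2

    # Look up the bound content object before the snapshot is gone.
    content=$(oc get volumesnapshot -n "$ns" "$snap" \
        -o 'jsonpath={.status.boundVolumeSnapshotContentName}') || return 1
    policy=$(oc get volumesnapshotcontent "$content" \
        -o 'jsonpath={.spec.deletionPolicy}') || return 1

    # Step 2: delete the volume snapshot.
    oc delete volumesnapshot -n "$ns" "$snap" || return 1

    # Step 3: with Retain, the content object remains after the snapshot
    # is deleted, so remove it explicitly.
    if [ "$policy" = "Retain" ]; then
        oc delete volumesnapshotcontent "$content"
    fi
}
```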

7.4. About LVM volume cloning

You can use the logical volume manager storage (LVMS) for persistent volume claim (PVC) cloning of the logical volume manager (LVM) thin volumes. A clone is a duplicate of an existing volume that can be used like any other volume.

When you provision the clone, an exact duplicate of the original volume is created if the data source references a source PVC in the same namespace. After a cloned PVC is created, the cloned PVC is considered a new object and completely separate from the source PVC. The clone represents a snapshot of the data from the source at that moment in time.

Note

Cloning is only possible when the source and destination PVCs are in the same namespace. To create PVC clones, you must configure thin volumes on the RHEL for Edge host.
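
A clone is requested by naming the source PVC as the data source of a new PVC. The following sketch prints such a claim. The helper name render_clone_pvc and the clone name test-claim-clone are hypothetical, and the source claim test-claim-thin and the storage class topolvm-provisioner-thin are reused from the earlier snapshot examples. No namespace is set, so both claims are assumed to be in the current namespace.

```shell
#!/bin/sh
# Hypothetical helper: print a PVC manifest that clones an existing PVC.
# Usage: render_clone_pvc <clone_name> <source_pvc> <size>
render_clone_pvc() {
    clone=$1; source_pvc=$2; size=$3
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${clone}
spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    kind: PersistentVolumeClaim
    name: ${source_pvc}
  resources:
    requests:
      storage: ${size}
  storageClassName: topolvm-provisioner-thin
EOF
}

# Print a clone request for the 1Gi claim used in the snapshot examples.
render_clone_pvc test-claim-clone test-claim-thin 1Gi
```

To create the clone on a node, you can pipe the output to oc apply -f -.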

To update existing objects to their latest version without recreating them, use storage version migration in MicroShift. By creating a StorageVersionMigration custom resource (CR), you request the Kube Storage Version Migrator embedded controller to handle the transition automatically.

Either you or a controller can create a StorageVersionMigration custom resource (CR) that requests a migration through the Migrator Controller.

To update stored data to the latest Kubernetes storage version, perform a storage migration.

The procedure shows an example of converting existing objects on the v1beta1 version to the current version, such as v1beta2, to ensure compatibility with the cluster APIs.

Procedure

  • Either you or any controller that has support for the StorageVersionMigration API must trigger a migration request. Use the following example request for reference:

    Example request

    apiVersion: migration.k8s.io/v1alpha1
    kind: StorageVersionMigration
    metadata:
      name: snapshot-v1
    spec:
      resource:
        group: snapshot.storage.k8s.io
        resource: volumesnapshotclasses
        version: v1

    where:

    resource.resource
    Specifies the plural name of the resource.
    resource.version
    Specifies the version to update to.

Verification

  • To monitor the progress of the update, review the status of the StorageVersionMigration custom resource (CR).
Note

A migration fails when you misname a group or resource. Incompatible versions between the previous and latest versions can also cause a migration to fail.
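
One way to review the status of the custom resource is to poll its conditions. The following sketch assumes that the StorageVersionMigration status reports conditions of type Running, Succeeded, or Failed, as in the upstream migration.k8s.io/v1alpha1 API. The function names and the polling interval are hypothetical.

```shell
#!/bin/sh
# Print the types of all conditions that are currently True.
migration_state() {
    oc get storageversionmigration "$1" \
        -o 'jsonpath={.status.conditions[?(@.status=="True")].type}'
}

# Poll until the migration succeeds, fails, or the attempts run out.
# Usage: wait_for_migration <name> [attempts]
wait_for_migration() {
    name=$1; tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        state=$(migration_state "$name")
        case "$state" in
            *Succeeded*) echo "succeeded"; return 0 ;;
            *Failed*)    echo "failed"; return 1 ;;
        esac
        tries=$((tries - 1))
        sleep 2
    done
    echo "timed out"
    return 2
}
```

For the example request above, wait_for_migration snapshot-v1 polls the snapshot-v1 resource up to 30 times.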

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, LLC. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.