Chapter 2. Integrating Red Hat Ceph Storage


You can configure Red Hat OpenStack Services on OpenShift (RHOSO) to integrate with an external Red Hat Ceph Storage cluster. This configuration connects the following services to a Red Hat Ceph Storage cluster:

  • Block Storage service (cinder)
  • Image service (glance)
  • Object Storage service (swift)
  • Compute service (nova)
  • Shared File Systems service (manila)

To configure Red Hat Ceph Storage as the back end for RHOSO storage, complete the following tasks:

  1. Verify that Red Hat Ceph Storage is deployed and all the required services are running.
  2. Create the Red Hat Ceph Storage pools on the Red Hat Ceph Storage cluster.
  3. Create a Red Hat Ceph Storage secret in the openstack namespace to provide RHOSO services with access to the Red Hat Ceph Storage cluster.
  4. Obtain the Ceph File System Identifier.
  5. Configure the OpenStackControlPlane CR to use the Red Hat Ceph Storage cluster as the back end.
  6. Configure the OpenStackDataPlane CR to use the Red Hat Ceph Storage cluster as the back end.

Prerequisites

  • Access to a Red Hat Ceph Storage cluster.
  • The RHOSO control plane is installed on an operational RHOSO cluster.

2.1. Creating Red Hat Ceph Storage pools

Create pools on the Red Hat Ceph Storage cluster for each RHOSO service that uses the cluster.

Procedure

  1. Create pools for the Compute service (vms), the Block Storage service (volumes), and the Image service (images):

    $ for P in vms volumes images; do
      cephadm shell -- ceph osd pool create $P;
      cephadm shell -- ceph osd pool application enable $P rbd;
    done
    Note

    When you create the pools, set the appropriate placement group (PG) number, as described in Placement Groups in the Red Hat Ceph Storage Storage Strategies Guide.
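
    For example, to create a pool with an explicit PG number rather than the default, pass the number to the create command. The value 128 in the following sketch is an illustrative assumption; choose a PG number that is appropriate for your cluster:

    $ cephadm shell -- ceph osd pool create volumes 128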

  2. Optional: Create the cephfs volume if the Shared File Systems service (manila) is enabled in the control plane. This automatically enables the CephFS Metadata Server (MDS) and creates the necessary data and metadata pools on the Ceph cluster:

    $ cephadm shell -- ceph fs volume create cephfs
  3. Optional: Deploy an NFS service on the Red Hat Ceph Storage cluster to use CephFS with NFS:

    $ cephadm shell -- ceph nfs cluster create cephfs \
    --ingress --virtual-ip=<vip> \
    --ingress-mode=haproxy-protocol
     • Replace <vip> with the IP address assigned to the NFS service. The NFS service should be isolated on a network that can be shared with all Red Hat OpenStack users. See NFS cluster and export management for more information about customizing the NFS service.

      Important

      When you deploy an NFS service for the Shared File Systems service, do not select a custom port to expose NFS. Only the default NFS port of 2049 is supported. You must enable the Red Hat Ceph Storage ingress service and set the ingress-mode to haproxy-protocol. Otherwise, you cannot use IP-based access rules with the Shared File Systems service. For security in production environments, Red Hat does not recommend providing access to 0.0.0.0/0 on shares to mount them on client machines.
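
      After the NFS cluster is created, you can confirm its configuration and ingress settings. The following quick check assumes the cluster name cephfs used in the example above:

      $ cephadm shell -- ceph nfs cluster info cephfs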

  4. Create a cephx key for RHOSO to use to access pools:

    $ cephadm shell -- \
        ceph auth add client.openstack \
          mgr 'allow *' \
          mon 'allow r' \
          osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=images'
    Important

    If the Shared File Systems service is enabled in the control plane, replace osd caps with the following:

    osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=images, allow rwx pool=cephfs.cephfs.data'

  5. Export the cephx key:

    $ cephadm shell -- ceph auth get client.openstack > /etc/ceph/ceph.client.openstack.keyring
  6. Export the configuration file:

    $ cephadm shell -- ceph config generate-minimal-conf > /etc/ceph/ceph.conf
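
    Optionally, confirm the pools and the cephx key before you continue. The following quick check assumes the pool and user names created in this procedure:

    $ cephadm shell -- ceph osd lspools
    $ cephadm shell -- ceph auth get client.openstack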

2.2. Creating a Red Hat Ceph Storage secret

Create a secret so that services can access the Red Hat Ceph Storage cluster.

Procedure

  1. Transfer the cephx key and configuration file created in the Creating Red Hat Ceph Storage pools procedure to a host that can create resources in the openstack namespace.
  2. Base64 encode these files and store them in KEY and CONF environment variables:

    $ KEY=$(cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0)
    $ CONF=$(cat /etc/ceph/ceph.conf | base64 -w 0)
  3. Create a YAML file to create the Secret resource.
  4. Using the environment variables, add the Secret configuration to the YAML file:

    apiVersion: v1
    data:
      ceph.client.openstack.keyring: $KEY
      ceph.conf: $CONF
    kind: Secret
    metadata:
      name: ceph-conf-files
      namespace: openstack
    type: Opaque
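
    The $KEY and $CONF values are shell variables, so they are only substituted if the shell writes the file. One way to do this, shown as a sketch in which the file name ceph_secret.yaml is an assumption, is to generate the file with an unquoted heredoc:

    $ cat <<EOF > ceph_secret.yaml
    apiVersion: v1
    data:
      ceph.client.openstack.keyring: $KEY
      ceph.conf: $CONF
    kind: Secret
    metadata:
      name: ceph-conf-files
      namespace: openstack
    type: Opaque
    EOF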
  5. Save the YAML file.
  6. Create the Secret resource:

    $ oc create -f <secret_configuration_file>
    • Replace <secret_configuration_file> with the name of the YAML file you created.
Note

The examples in this section use openstack as the name of the Red Hat Ceph Storage user. The file name in the Secret resource must match this user name.

For example, if the file name used for the username openstack2 is /etc/ceph/ceph.client.openstack2.keyring, then the secret data line should be ceph.client.openstack2.keyring: $KEY.

2.3. Obtaining the Red Hat Ceph Storage File System Identifier

The Red Hat Ceph Storage File System Identifier (FSID) is a unique identifier for the cluster. The FSID is used in configuration and verification of cluster interoperability with RHOSO.

Procedure

  • Extract the FSID from the Red Hat Ceph Storage secret:

    $ FSID=$(oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep fsid | sed -e 's/fsid = //')
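
    You can confirm that the variable contains a UUID-formatted value. The FSID in the following example output is illustrative:

    $ echo $FSID
    63bdd226-fbe6-5f31-956e-7028e99f1ee1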

2.4. Configuring the control plane to use the Red Hat Ceph Storage cluster

You must configure the OpenStackControlPlane CR to use the Red Hat Ceph Storage cluster. Configuration includes the following tasks:

  1. Confirming the Red Hat Ceph Storage cluster and the associated services have the correct network configuration.
  2. Configuring the control plane to use the Red Hat Ceph Storage secret.
  3. Configuring the Image service (glance) to use the Red Hat Ceph Storage cluster.
  4. Configuring the Block Storage service (cinder) to use the Red Hat Ceph Storage cluster.
  5. Optional: Configuring the Shared File Systems service (manila) to use native CephFS or CephFS-NFS with the Red Hat Ceph Storage cluster.
Note

This example does not include configuring Block Storage backup service (cinder-backup) with Red Hat Ceph Storage.

Procedure

  1. Check the storage interface defined in your NodeNetworkConfigurationPolicy (nncp) custom resource to confirm that the Storage network has the same network configuration as the public_network of the Red Hat Ceph Storage cluster. This configuration is required to enable access to the Red Hat Ceph Storage cluster through the Storage network.

    It is not necessary for RHOSO to access the cluster_network of the Red Hat Ceph Storage cluster.

    Note

    If it does not impact workload performance, the Storage network can use routed (L3) connectivity to reach a different external Red Hat Ceph Storage cluster public_network, as long as the appropriate routes are added to the Storage network so that it can reach the external Red Hat Ceph Storage cluster public_network.

  2. Check the networkAttachments for the default Image service instance in the OpenStackControlPlane CR to confirm that the default Image service is configured to access the Storage network:

    glance:
        enabled: true
        template:
          databaseInstance: openstack
          storage:
            storageRequest: 10G
          glanceAPIs:
            default:
              replicas: 3
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                    spec:
                      type: LoadBalancer
              networkAttachments:
              - storage
  3. Confirm the Block Storage service is configured to access the Storage network through MetalLB.
  4. Optional: Confirm the Shared File Systems service is configured to access the Storage network through ManilaShare.
  5. Confirm the Compute service (nova) is configured to access the Storage network.
  6. Confirm the Red Hat Ceph Storage configuration file, /etc/ceph/ceph.conf, contains the IP addresses of the Red Hat Ceph Storage cluster monitors. These IP addresses must be within the Storage network IP address range.
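
    For example, you can read the monitor addresses directly from the Secret, using the same pattern used earlier to extract the FSID:

    $ oc get secret ceph-conf-files -o json | jq -r '.data."ceph.conf"' | base64 -d | grep -E 'mon[_ ]host'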
  7. Open your openstack_control_plane.yaml file to edit the OpenStackControlPlane CR.
  8. Add the extraMounts parameter to define the services that require access to the Red Hat Ceph Storage secret.

    The following is an example of using the extraMounts parameter for this purpose. Only include ManilaShare in the propagation list if you are using the Shared File Systems service (manila):

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    spec:
      extraMounts:
        - name: v1
          region: r1
          extraVol:
            - propagation:
              - CinderVolume
              - GlanceAPI
              - ManilaShare
              extraVolType: Ceph
              volumes:
              - name: ceph
                projected:
                  sources:
                  - secret:
                      name: <ceph-conf-files>
              mounts:
              - name: ceph
                mountPath: "/etc/ceph"
                readOnly: true
  9. Add the customServiceConfig parameter to the glance template to configure the Image service to use the Red Hat Ceph Storage cluster:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    metadata:
      name: openstack
    spec:
      glance:
        template:
          customServiceConfig: |
            [DEFAULT]
            enabled_backends = default_backend:rbd
            [glance_store]
            default_backend = default_backend
            [default_backend]
            rbd_store_ceph_conf = /etc/ceph/ceph.conf
            store_description = "RBD backend"
            rbd_store_pool = images
            rbd_store_user = openstack
          databaseInstance: openstack
          databaseAccount: glance
          secret: osp-secret
          storage:
            storageRequest: 10G
      extraMounts:
        - name: v1
          region: r1
          extraVol:
            - propagation:
              - GlanceAPI
              extraVolType: Ceph
              volumes:
              - name: ceph
                secret:
                  secretName: ceph-conf-files
              mounts:
              - name: ceph
                mountPath: "/etc/ceph"
                readOnly: true

    When you use Red Hat Ceph Storage as a back end for the Image service, image-conversion is enabled by default. For more information, see Planning storage and shared file systems in Planning your deployment.

  10. Add the customServiceConfig parameter to the cinder template to configure the Block Storage service to use the Red Hat Ceph Storage cluster. For information about using Block Storage backups, see Configuring the Block Storage backup service.

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    spec:
      extraMounts:
        ...
      cinder:
        template:
          cinderVolumes:
            ceph:
              customServiceConfig: |
                [DEFAULT]
                enabled_backends=ceph
                [ceph]
                volume_backend_name=ceph
                volume_driver=cinder.volume.drivers.rbd.RBDDriver
                rbd_ceph_conf=/etc/ceph/ceph.conf
                rbd_user=openstack
                rbd_pool=volumes
                rbd_flatten_volume_from_snapshot=False
                rbd_secret_uuid=$FSID 1
    1
    Replace $FSID with the actual FSID of your Red Hat Ceph Storage cluster. The FSID itself does not need to be considered secret. For more information, see Obtaining the Red Hat Ceph Storage File System Identifier.
  11. Optional: Add the customServiceConfig parameter to the manila template to configure the Shared File Systems service to use native CephFS or CephFS-NFS with the Red Hat Ceph Storage cluster. For more information, see Configuring the Shared File Systems service (manila).

    The following example exposes native CephFS:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    spec:
      extraMounts:
        ...
      manila:
        template:
          manilaAPI:
            customServiceConfig: |
              [DEFAULT]
              enabled_share_protocols=cephfs
          manilaShares:
            share1:
              customServiceConfig: |
                [DEFAULT]
                enabled_share_backends=cephfs
                [cephfs]
                driver_handles_share_servers=False
                share_backend_name=cephfs
                share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
                cephfs_conf_path=/etc/ceph/ceph.conf
                cephfs_auth_id=openstack
                cephfs_cluster_name=ceph
                cephfs_volume_mode=0755
                cephfs_protocol_helper_type=CEPHFS

    The following example exposes CephFS with NFS:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    spec:
      extraMounts:
        ...
      manila:
        template:
          manilaAPI:
            customServiceConfig: |
              [DEFAULT]
              enabled_share_protocols=nfs
          manilaShares:
            share1:
              customServiceConfig: |
                [DEFAULT]
                enabled_share_backends=cephfsnfs
                [cephfsnfs]
                driver_handles_share_servers=False
                share_backend_name=cephfsnfs
                share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
                cephfs_conf_path=/etc/ceph/ceph.conf
                cephfs_auth_id=openstack
                cephfs_cluster_name=ceph
                cephfs_volume_mode=0755
                cephfs_protocol_helper_type=NFS
                cephfs_nfs_cluster_id=cephfs
  12. Apply the updates to the OpenStackControlPlane CR:

    $ oc apply -f openstack_control_plane.yaml
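
    You can watch for the control plane to finish reconciling before you configure the data plane. The following check assumes the OpenStackControlPlane CR is named openstack, as in the examples above:

    $ oc get -n openstack openstackcontrolplane
    $ oc wait -n openstack openstackcontrolplane/openstack --for=condition=Ready --timeout=600s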

2.5. Configuring the data plane to use the Red Hat Ceph Storage cluster

Configure the data plane to use the Red Hat Ceph Storage cluster.

Procedure

  1. Create a ConfigMap that adds a Compute service (nova) configuration file to the /etc/nova/nova.conf.d/ directory inside the nova_compute container. This additional content directs the Compute service to use Red Hat Ceph Storage RBD.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ceph-nova
    data:
     03-ceph-nova.conf: | 1
      [libvirt]
      images_type=rbd
      images_rbd_pool=vms
      images_rbd_ceph_conf=/etc/ceph/ceph.conf
      images_rbd_glance_store_name=default_backend
      images_rbd_glance_copy_poll_interval=15
      images_rbd_glance_copy_timeout=600
      rbd_user=openstack
      rbd_secret_uuid=$FSID 2
    1
    This file name must follow the naming convention of ##-<name>-nova.conf. Files are evaluated by the Compute service alphabetically. A filename that starts with 01 will be evaluated by the Compute service before a filename that starts with 02. When the same configuration option occurs in multiple files, the last one read wins.
    2
    The $FSID value should contain the actual FSID as described in the Obtaining the Ceph FSID section. The FSID itself does not need to be considered secret.
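
    The literal string $FSID in the file is only substituted if the shell writes the file, in the same way as the Secret earlier. The following sketch, in which the file name ceph_nova_configmap.yaml is an assumption, generates the ConfigMap with an unquoted heredoc so that the FSID variable exported earlier is expanded. You can append the rendered content to the ceph-nova.yaml file that is applied later in this procedure:

    $ cat <<EOF > ceph_nova_configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ceph-nova
    data:
      03-ceph-nova.conf: |
        [libvirt]
        images_type=rbd
        images_rbd_pool=vms
        images_rbd_ceph_conf=/etc/ceph/ceph.conf
        images_rbd_glance_store_name=default_backend
        images_rbd_glance_copy_poll_interval=15
        images_rbd_glance_copy_timeout=600
        rbd_user=openstack
        rbd_secret_uuid=$FSID
    EOF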
  2. Create a custom version of the default nova service that uses the new ConfigMap, which in this case is named ceph-nova.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: nova-custom-ceph 1
    spec:
      label: dataplane-deployment-nova-custom-ceph
      caCerts: combined-ca-bundle
      edpmServiceType: nova
      dataSources:
       - configMapRef:
           name: ceph-nova
       - secretRef:
           name: nova-cell1-compute-config
       - secretRef:
           name: nova-migration-ssh-key
      playbook: osp.edpm.nova
    1
    The custom service is named nova-custom-ceph. It cannot be named nova because nova is an unchangeable default service. Any custom service that has the same name as a default service name will be overwritten during reconciliation.
  3. Apply the ConfigMap and custom service changes:

    $ oc create -f ceph-nova.yaml
  4. Update the OpenStackDataPlaneNodeSet services list to replace the nova service with the new custom service (in this case called nova-custom-ceph), add the ceph-client service, and use the extraMounts parameter to define access to the Ceph Storage secret.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      roles:
        edpm-compute:
          ...
          services:
            - configure-network
            - validate-network
            - install-os
            - configure-os
            - run-os
            - ceph-client
            - ovn
            - libvirt
            - nova-custom-ceph
            - telemetry
    
      nodeTemplate:
        extraMounts:
        - extraVolType: Ceph
          volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
    Note

    The ceph-client service must be added before the libvirt and nova-custom-ceph services. The ceph-client service configures EDPM nodes as clients of a Red Hat Ceph Storage server by distributing the Red Hat Ceph Storage client files.

  5. Save the changes to the services list.
  6. Create an OpenStackDataPlaneDeployment CR:

    $ oc create -f <dataplanedeployment_cr_file>
    • Replace <dataplanedeployment_cr_file> with the name of your file.
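
    You can monitor the deployment and the Ansible jobs that it creates while it runs:

    $ oc get -n openstack OpenStackDataPlaneDeployment
    $ oc get -n openstack jobs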

Result

When the nova-custom-ceph service Ansible job runs, the job copies overrides from the ConfigMaps to the Compute service hosts. The job also runs virsh secret-* commands so that the libvirt service can retrieve the cephx secret by FSID.

  • Run the following command on an EDPM node after the job completes to confirm the job results:

    $ podman exec libvirt_virtsecretd virsh secret-get-value $FSID

2.6. Configuring an external Ceph Object Gateway back end

You can configure an external Ceph Object Gateway (RGW) to act as an Object Storage service (swift) back end by completing the following high-level tasks:

  1. Configure RGW to verify users and their roles in the Identity service (keystone) so that users can authenticate with the external RGW service.
  2. Deploy and configure an RGW service to handle object storage requests.

You use the openstack client tool to configure the Object Storage service.

2.6.1. Configuring RGW authentication

You must configure RGW to verify users and their roles in the Identity service (keystone) so that users can authenticate with the external RGW service.

Prerequisites

  • You have deployed an operational OpenStack control plane.

Procedure

  1. Create the Object Storage service on the control plane:

    $ openstack service create --name swift --description "OpenStack Object Storage" object-store
  2. Create a user called swift:

    $ openstack user create --project service --password <swift_password> swift
    • Replace <swift_password> with the password to assign to the swift user.
  3. Create roles for the swift user:

    $ openstack role create swiftoperator
    $ openstack role create ResellerAdmin
  4. Add the swift user to the member and admin roles for the service project:

    $ openstack role add --user swift --project service member
    $ openstack role add --user swift --project service admin
  5. Export the RGW endpoint IP addresses to variables and create control plane endpoints:

    $ export RGW_ENDPOINT_STORAGE=<rgw_endpoint_ip_address_storage>
    $ export RGW_ENDPOINT_EXTERNAL=<rgw_endpoint_ip_address_external>
    $ openstack endpoint create --region regionOne object-store public http://$RGW_ENDPOINT_EXTERNAL:8080/swift/v1/AUTH_%\(tenant_id\)s;
    $ openstack endpoint create --region regionOne object-store internal http://$RGW_ENDPOINT_STORAGE:8080/swift/v1/AUTH_%\(tenant_id\)s;
    • Replace <rgw_endpoint_ip_address_storage> with the IP address of the RGW endpoint on the storage network. This is how internal services will access RGW.
    • Replace <rgw_endpoint_ip_address_external> with the IP address of the RGW endpoint on the external network. This is how cloud users will write objects to RGW.

      Note

      Both endpoint IP addresses are virtual IP addresses, owned by haproxy and keepalived, that are used to reach the RGW back ends that are deployed in the Red Hat Ceph Storage cluster in the Configuring and deploying the RGW service procedure.

  6. Add the swiftoperator role to the control plane admin group:

    $ openstack role add --project admin --user admin swiftoperator
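
    After you complete these steps, you can optionally confirm the service, endpoints, and role assignments. The following quick check uses the openstack client and the names from the examples above:

    $ openstack service show object-store
    $ openstack endpoint list --service object-store
    $ openstack role assignment list --user swift --project service --names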

2.6.2. Configuring and deploying the RGW service

Configure and deploy an RGW service to handle object storage requests.

Procedure

  1. Log in to a Red Hat Ceph Storage Controller node.
  2. Create a file called /tmp/rgw_spec.yaml and add the RGW deployment parameters:

    service_type: rgw
    service_id: rgw
    service_name: rgw.rgw
    placement:
      hosts:
        - <host_1>
        - <host_2>
        ...
        - <host_n>
    networks:
    - <storage_network>
    spec:
      rgw_frontend_port: 8082
      rgw_realm: default
      rgw_zone: default
    ---
    service_type: ingress
    service_id: rgw.default
    service_name: ingress.rgw.default
    placement:
      count: 1
    spec:
      backend_service: rgw.rgw
      frontend_port: 8080
      monitor_port: 8999
      virtual_ips_list:
      - <storage_network_vip>
      - <external_network_vip>
      virtual_interface_networks:
      - <storage_network>
    • Replace <host_1>, <host_2>, …, <host_n> with the name of the Ceph nodes where the RGW instances are deployed.
    • Replace <storage_network> with the network range used to resolve the interfaces where radosgw processes are bound.
     • Replace <storage_network_vip> with the virtual IP (VIP) used as the haproxy front end. This is the same address configured as the internal Object Storage service endpoint ($RGW_ENDPOINT_STORAGE) in the Configuring RGW authentication procedure.
    • Optional: Replace <external_network_vip> with an additional VIP on an external network to use as the haproxy front end. This address is used to connect to RGW from an external network.
  3. Save the file.
  4. Enter the cephadm shell and mount the rgw_spec.yaml file.

    $ cephadm shell -m /tmp/rgw_spec.yaml
  5. Add RGW related configuration to the cluster:

    $ ceph config set global rgw_keystone_url "https://<keystone_endpoint>"
    $ ceph config set global rgw_keystone_verify_ssl false
    $ ceph config set global rgw_keystone_api_version 3
    $ ceph config set global rgw_keystone_accepted_roles "member, Member, admin"
    $ ceph config set global rgw_keystone_accepted_admin_roles "ResellerAdmin, swiftoperator"
    $ ceph config set global rgw_keystone_admin_domain default
    $ ceph config set global rgw_keystone_admin_project service
    $ ceph config set global rgw_keystone_admin_user swift
    $ ceph config set global rgw_keystone_admin_password "<swift_password>"
    $ ceph config set global rgw_keystone_implicit_tenants true
    $ ceph config set global rgw_s3_auth_use_keystone true
    $ ceph config set global rgw_swift_versioning_enabled true
    $ ceph config set global rgw_swift_enforce_content_length true
    $ ceph config set global rgw_swift_account_in_url true
    $ ceph config set global rgw_trust_forwarded_https true
    $ ceph config set global rgw_max_attr_name_len 128
    $ ceph config set global rgw_max_attrs_num_in_req 90
    $ ceph config set global rgw_max_attr_size 1024
     • Replace <keystone_endpoint> with the Identity service internal endpoint. The EDPM nodes are able to resolve the internal endpoint but not the public one. Do not omit the URI scheme from the URL; it must be either http:// or https://.
     • Replace <swift_password> with the password assigned to the swift user in the Configuring RGW authentication procedure.
  6. Deploy the RGW configuration using the Orchestrator:

    $ ceph orch apply -i /mnt/rgw_spec.yaml
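
    After the Orchestrator applies the specification, you can confirm that the RGW and ingress services are running. The following quick check is run from within the cephadm shell:

    $ ceph orch ls rgw
    $ ceph orch ls ingress
    $ ceph orch ps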

2.7. Configuring RGW with TLS for an external Red Hat Ceph Storage cluster

Configure RGW with TLS so the control plane services can resolve the external Red Hat Ceph Storage cluster host names.

This procedure configures Ceph RGW to emulate the Object Storage service (swift). It creates a DNS zone and certificate so that a URL such as https://rgw-external.ceph.local:8080 is registered as an Identity service (keystone) endpoint. This enables Red Hat OpenStack Services on OpenShift (RHOSO) clients to resolve the host and trust the certificate.

Because a RHOSO pod needs to securely access an HTTPS endpoint hosted outside of Red Hat OpenShift Container Platform (RHOCP), this process is used to create a DNS domain and certificate for that endpoint.

During this procedure, a DNSData domain is created, ceph.local in the examples, so that pods can map host names to IP addresses for services that are not hosted on RHOCP. DNS forwarding is then configured for the domain with the CoreDNS service. Lastly, a certificate is created using the RHOSO public root certificate authority.

You must copy the certificate and key file created in RHOCP to the nodes hosting RGW so they can become part of the Ceph Orchestrator RGW specification.

Procedure

  1. Create a DNSData custom resource (CR) for the external Ceph cluster.

    Note

    Creating a DNSData CR creates a new dnsmasq pod that is able to read and resolve the DNS information in the associated DNSData CR.

    The following is an example of a DNSData CR:

    apiVersion: network.openstack.org/v1beta1
    kind: DNSData
    metadata:
      labels:
        component: ceph-storage
        service: ceph
      name: ceph-storage
      namespace: openstack
    spec:
      dnsDataLabelSelectorValue: dnsdata
      hosts:
        - hostnames:
          - ceph-rgw-internal-vip.ceph.local
          ip: 172.18.0.2
        - hostnames:
          - ceph-rgw-external-vip.ceph.local
          ip: 10.10.10.2
    Note

    In this example, it is assumed that the host at the IP address 172.18.0.2 hosts the Ceph RGW endpoint for access on the private storage network. The host is included in the CR so that DNS A and PTR records are created, which enables the host to be reached by using the host name ceph-rgw-internal-vip.ceph.local.

    It is also assumed that the host at the IP address 10.10.10.2 hosts the Ceph RGW endpoint for access on the external network. The host is included in the CR so that DNS A and PTR records are created, which enables the host to be reached by using the host name ceph-rgw-external-vip.ceph.local.

    The list of hosts in this example is not a definitive list of required hosts. It is provided for demonstration purposes. Substitute the appropriate hosts for your environment.

  2. Apply the CR to your environment:

    $ oc apply -f <ceph_dns_yaml>
    • Replace <ceph_dns_yaml> with the name of the DNSData CR file.
  3. Update the CoreDNS CR with a forwarder to the dnsmasq for requests to the ceph.local domain. For more information about DNS forwarding, see Using DNS forwarding in the RHOCP Networking guide.
  4. List the openstack domain DNS cluster IP:

    $ oc get svc dnsmasq-dns

    The following is an example output for this command:

    $ oc get svc dnsmasq-dns
    dnsmasq-dns     LoadBalancer   10.217.5.130   192.168.122.80    53:30185/UDP     160m
  5. Record the forwarding information from the command output.
  6. List the CoreDNS CR:

    $ oc -n openshift-dns describe dns.operator/default
  7. Edit the CoreDNS CR and update it with the forwarding information.

    The following is an example of a CoreDNS CR updated with forwarding information:

    apiVersion: operator.openshift.io/v1
    kind: DNS
    metadata:
      creationTimestamp: "2024-03-25T02:49:24Z"
      finalizers:
      - dns.operator.openshift.io/dns-controller
      generation: 3
      name: default
      resourceVersion: "164142"
      uid: 860b0e61-a48a-470e-8684-3b23118e6083
    spec:
      cache:
        negativeTTL: 0s
        positiveTTL: 0s
      logLevel: Normal
      nodePlacement: {}
      operatorLogLevel: Normal
      servers:
      - forwardPlugin:
          policy: Random
          upstreams:
          - 10.217.5.130:53
        name: ceph
        zones:
        - ceph.local
      upstreamResolvers:
        policy: Sequential
        upstreams:
        - port: 53
          type: SystemResolvConf

    The following is what has been added to the CR:

    ....
       servers:
      - forwardPlugin:
         policy: Random
         upstreams:
         - 10.217.5.130:53 1
        name: ceph
        zones:
        - ceph.local
    ....
    1
    The forwarding information recorded from the oc get svc dnsmasq-dns command.
  8. Create a Certificate CR with the host names from the DNSData CR.

    The following is an example of a Certificate CR:

    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: cert-ceph-rgw
      namespace: openstack
    spec:
      duration: 43800h0m0s
      issuerRef: {'group': 'cert-manager.io', 'kind': 'Issuer', 'name': 'rootca-public'}
      secretName: cert-ceph-rgw
      dnsNames:
        - ceph-rgw-internal-vip.ceph.local
        - ceph-rgw-external-vip.ceph.local
    Note

    The certificate issuerRef is set to the root certificate authority (CA) of RHOSO. This CA is automatically created when the control plane is deployed. The default name of the CA is rootca-public. The RHOSO pods trust this new certificate because the root CA is used.

  9. Apply the CR to your environment:

    $ oc apply -f <ceph_cert_yaml>
    • Replace <ceph_cert_yaml> with the name of the Certificate CR file.
  10. Extract the certificate and key data from the secret created when the Certificate CR was applied:

    $ oc get secret <ceph_cert_secret_name> -o yaml
    • Replace <ceph_cert_secret_name> with the name used in the secretName field of your Certificate CR.

      Note

      This command outputs YAML with a data section that looks like the following:

      [stack@osp-storage-04 ~]$ oc get secret cert-ceph-rgw -o yaml
      apiVersion: v1
      data:
        ca.crt: <CA>
        tls.crt: <b64cert>
        tls.key: <b64key>
      kind: Secret

      The <b64cert> and <b64key> values are the base64-encoded certificate and key strings that you must use in the next step.

  11. Extract and base64 decode the certificate and key information obtained in the previous step and save a concatenation of them in the Ceph Object Gateway service specification.
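
    One way to produce the concatenated certificate and key data is shown in the following sketch. The secret name cert-ceph-rgw matches the Certificate CR above; the output file name /tmp/rgw-cert.pem is an assumption:

    $ oc get secret cert-ceph-rgw -n openstack -o jsonpath='{.data.tls\.crt}' | base64 -d > /tmp/rgw-cert.pem
    $ oc get secret cert-ceph-rgw -n openstack -o jsonpath='{.data.tls\.key}' | base64 -d >> /tmp/rgw-cert.pem

    Paste the contents of /tmp/rgw-cert.pem into the rgw_frontend_ssl_certificate and ssl_cert fields shown in the following examples.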

    The rgw section of the specification file looks like the following:

      service_type: rgw
      service_id: rgw
      service_name: rgw.rgw
      placement:
        hosts:
        - host1
        - host2
      networks:
        - 172.18.0.0/24
      spec:
        rgw_frontend_port: 8082
        rgw_realm: default
        rgw_zone: default
        ssl: true
        rgw_frontend_ssl_certificate: |
          -----BEGIN CERTIFICATE-----
          MIIDkzCCAfugAwIBAgIRAKNgGd++xV9cBOrwDAeEdQUwDQYJKoZIhvcNAQELBQAw
          <redacted>
          -----BEGIN RSA PRIVATE KEY-----
          MIIEpQIBAAKCAQEAyTL1XRJDcSuaBLpqasAuLsGU2LQdMxuEdw3tE5voKUNnWgjB
          <redacted>
          -----END RSA PRIVATE KEY-----

    The ingress section of the specification file looks like the following:

      service_type: ingress
      service_id: rgw.default
      service_name: ingress.rgw.default
      placement:
        count: 1
      spec:
        backend_service: rgw.rgw
        frontend_port: 8080
        monitor_port: 8999
        virtual_interface_networks:
        - 172.18.0.0/24
        virtual_ip: 172.18.0.2/24
        ssl_cert: |
          -----BEGIN CERTIFICATE-----
          MIIDkzCCAfugAwIBAgIRAKNgGd++xV9cBOrwDAeEdQUwDQYJKoZIhvcNAQELBQAw
          <redacted>
          -----BEGIN RSA PRIVATE KEY-----
          MIIEpQIBAAKCAQEAyTL1XRJDcSuaBLpqasAuLsGU2LQdMxuEdw3tE5voKUNnWgjB
          <redacted>
          -----END RSA PRIVATE KEY-----

    In the preceding examples, the rgw_frontend_ssl_certificate and ssl_cert values contain the base64-decoded <b64cert> and <b64key> data from the previous step, concatenated with no blank lines in between.

  12. Use the procedure Deploying the Ceph Object Gateway using the service specification to deploy Ceph RGW with SSL.
  13. Connect to the openstackclient pod.
  14. Verify that the forwarding information has been successfully updated.

    $ curl --trace - <host_name>
    • Replace <host_name> with the name of the external host previously added to the DNSData CR.

      Note

      The following is an example output from this command where the openstackclient pod successfully resolved the host name, and no SSL verification errors were encountered.

      sh-5.1$ curl https://rgw-external-vip.ceph.local:8080
      <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
      sh-5.1$

2.8. Enabling deferred deletion for volumes or images with dependencies

When you use Ceph RBD as a back end for the Block Storage service (cinder) or the Image service (glance), you can enable deferred deletion in the Ceph RBD Clone v2 API.

With deferred deletion, you can delete a volume from the Block Storage service or an image from the Image service, even if Ceph RBD volumes or snapshots depend on them, for example, COW clones created in different storage pools by the Block Storage service or the Compute service (nova). The volume is deleted from the Block Storage service or the image is deleted from the Image service, but it is still stored in a trash area in Ceph RBD for dependencies. The volume or image is only deleted from Ceph RBD when there are no dependencies.

Limitations

  • When you enable Clone v2 deferred deletion in existing environments, the feature only applies to new volumes or images.

Procedure

  1. Verify which Ceph version the clients in your Ceph Storage cluster are running:

    $ cephadm shell -- ceph osd get-require-min-compat-client

    Example output:

    luminous
  2. To set the cluster to use the Clone v2 API and the deferred deletion feature by default, set the minimum required client version to mimic. After you apply this setting, only clients that run Ceph version 13.2.x (Mimic) or later can connect to the cluster and access images with dependencies:

    $ cephadm shell -- ceph osd set-require-min-compat-client mimic
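
    You can re-run the query to confirm the new minimum client version. Based on the setting applied above, the expected output is:

    $ cephadm shell -- ceph osd get-require-min-compat-client
    mimic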
  3. Schedule an interval for trash purge in minutes by using the m suffix:

    $ rbd trash purge schedule add --pool <pool> <30m>
    • Replace <pool> with the name of the associated storage pool, for example, volumes in the Block Storage service.
    • Replace <30m> with the interval in minutes that you want to specify for trash purge.
  4. Verify a trash purge schedule has been set for the pool:

    $ rbd trash purge schedule list --pool <pool>

2.9. Troubleshooting Red Hat Ceph Storage RBD integration

The Compute (nova), Block Storage (cinder), and Image (glance) services can integrate with Red Hat Ceph Storage RBD to use it as a storage back end. If this integration does not work as expected, you can perform an incremental troubleshooting procedure to progressively eliminate possible causes.

The following example shows how to troubleshoot an Image service integration. You can adapt the same steps to troubleshoot Compute and Block Storage service integrations.

Note

If you discover the cause of your issue before completing this procedure, it is not necessary to do any subsequent steps. You can exit this procedure and resolve the issue.

Procedure

  1. Determine if any parts of the control plane are not properly deployed by assessing whether the Ready condition is not True:

    $ oc get -n openstack OpenStackControlPlane \
      -o jsonpath="{range .items[0].status.conditions[?(@.status!='True')]}{.type} is {.status} due to {.message}{'\n'}{end}"
    1. If you identify a service that is not properly deployed, check the status of the service.

      The following example checks the status of the Compute service:

      $ oc get -n openstack Nova/nova \
        -o jsonpath="{range .status.conditions[?(@.status!='True')]}{.type} is {.status} due to {.message}{'\n'}{end}"
      Note

      You can check the status of all deployed services with the command oc get pods -n openstack and the logs of a specific service with the command oc logs -n openstack <service_pod_name>. Replace <service_pod_name> with the name of the service pod you want to check.

    2. If you identify an operator that is not properly deployed, check the status of the operator:

      $ oc get pods -n openstack-operators -lopenstack.org/operator-name
      Note

      Check the operator logs with the command oc logs -n openstack-operators -lopenstack.org/operator-name=<operator_name>.

  2. Check the Status of the data plane deployment:

    $ oc get -n openstack OpenStackDataPlaneDeployment
    1. If the Status of the data plane deployment is False, check the logs of the associated Ansible job:

      $ oc logs -n openstack job/<ansible_job_name>

      Replace <ansible_job_name> with the name of the associated job. The job name is listed in the Message field of the output of the oc get -n openstack OpenStackDataPlaneDeployment command.

  3. Check the Status of the data plane node set deployment:

    $ oc get -n openstack OpenStackDataPlaneNodeSet
    1. If the Status of the data plane node set deployment is False, check the logs of the associated Ansible job:

      $ oc logs -n openstack job/<ansible_job_name>
      • Replace <ansible_job_name> with the name of the associated job. It is listed in the Message field of the output of the oc get -n openstack OpenStackDataPlaneNodeSet command.
  4. If any pods are in the CrashLoopBackOff state, you can duplicate them for troubleshooting purposes with the oc debug command:

    $ oc debug <pod_name>

    Replace <pod_name> with the name of the pod to duplicate.

    Tip

    You can also use the oc debug command in the following object debugging activities:

     • To run /bin/sh on a container other than the first one (the default behavior of the command is to use the first container), use the command form oc debug --container <container_name> <pod_name>. This is useful for pods like the API pod, where the first container tails a file and the second container is the one you want to debug. If you use this command form, you must first run oc get pods | grep <search_string> to find the name of the pod.
    • To route traffic to the pod during the debug process, use the command form oc debug <pod_name> --keep-labels=true.
     • To debug any resource that creates pods, such as Deployments, StatefulSets, and Nodes, use the command form oc debug <resource_type>/<resource_name>. For example, to debug a StatefulSet, run oc debug StatefulSet/cinder-scheduler.
  5. Connect to the pod and confirm that the ceph.client.openstack.keyring and ceph.conf files are present in the /etc/ceph directory.

    Note

    If the pod is in a CrashLoopBackOff state, use the oc debug command as described in the previous step to duplicate the pod and route traffic to it.

    $ oc rsh <pod_name>
    • Replace <pod_name> with the name of the applicable pod.

      Tip

      If the Ceph configuration files are missing, check the extraMounts parameter in your OpenStackControlPlane CR.

  6. Confirm the pod has a network connection to the Red Hat Ceph Storage cluster by connecting to the IP and port of a Ceph Monitor from the pod. The IP and port information is located in /etc/ceph/ceph.conf.

    The following is an example of this process:

    $ oc get pods | grep glance | grep external-api-0
    glance-06f7a-default-external-api-0                               3/3     Running     0              2d3h
    $ oc debug --container glance-api glance-06f7a-default-external-api-0
    Starting pod/glance-06f7a-default-external-api-0-debug-p24v9, command was: /usr/bin/dumb-init --single-child -- /bin/bash -c /usr/local/bin/kolla_set_configs && /usr/local/bin/kolla_start
    Pod IP: 192.168.25.50
    If you don't see a command prompt, try pressing enter.
    sh-5.1# cat /etc/ceph/ceph.conf
    # Ansible managed
    
    [global]
    
    fsid = 63bdd226-fbe6-5f31-956e-7028e99f1ee1
    mon host = [v2:192.168.122.100:3300/0,v1:192.168.122.100:6789/0],[v2:192.168.122.102:3300/0,v1:192.168.122.102:6789/0],[v2:192.168.122.101:3300/0,v1:192.168.122.101:6789/0]
    
    
    [client.libvirt]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-guest-$pid.log
    
    sh-5.1# python3
    Python 3.9.19 (main, Jul 18 2024, 00:00:00)
    [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket
    >>> s = socket.socket()
    >>> ip="192.168.122.100"
    >>> port=3300
    >>> s.connect((ip,port))
    >>>
    Tip

    Troubleshoot the network connection between the cluster and pod if you cannot connect to a Ceph Monitor. The previous example uses a Python socket to connect to the IP and port of the Red Hat Ceph Storage cluster from the ceph.conf file.

    There are two potential outcomes from the execution of the s.connect((ip,port)) function:

     • If the command executes successfully and returns no value at all, the network connection between the pod and the cluster is functioning correctly.
     • If the command takes a long time to execute and returns an error similar to the following example, the network connection between the pod and the cluster is not functioning correctly and must be investigated further:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TimeoutError: [Errno 110] Connection timed out
  7. Examine the cephx key as shown in the following example:

    bash-5.1$ cat /etc/ceph/ceph.client.openstack.keyring
    [client.openstack]
       key = "<redacted>"
       caps mgr = allow *
       caps mon = profile rbd
       caps osd = profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=backups, profile rbd pool=images
    bash-5.1$
  8. List the contents of a pool from the caps osd parameter as shown in the following example:

    $ /usr/bin/rbd --conf /etc/ceph/ceph.conf \
    --keyring /etc/ceph/ceph.client.openstack.keyring \
    --cluster ceph --id openstack \
    ls -l -p <pool_name> | wc -l
    • Replace <pool_name> with the name of the required Red Hat Ceph Storage pool.

      Tip

      If this command completes and returns a number, the cephx key provides adequate permissions to connect to, and read information from, the Red Hat Ceph Storage cluster.

      If this command does not complete but network connectivity to the cluster was confirmed, work with the Ceph administrator to obtain the correct cephx keyring.

      Additionally, it is possible there is an MTU mismatch on the Storage network. If the network is using jumbo frames (an MTU value of 9000), all switch ports between servers using the interface must be updated to support jumbo frames. If this change is not made on the switch, problems can occur at the Ceph application layer. Verify all hosts using the network can communicate at the desired MTU with a command such as ping -M do -s 8972 <ip_address>.

  9. Send test data to the images pool on the Ceph cluster.

    The following is an example of performing this task:

    # DATA=$(date | md5sum | cut -c-12)
    # POOL=images
    # RBD="/usr/bin/rbd --conf /etc/ceph/ceph.conf --keyring /etc/ceph/ceph.client.openstack.keyring --cluster ceph --id openstack"
    # $RBD create --size 1024 $POOL/$DATA
    Tip

    It is possible to have permission to read data from the cluster but not to write data to it, even if write permission was granted in the cephx keyring. If write permissions have been granted but you cannot write data to the cluster, the cluster might be overloaded and unable to accept new data.

    In the case that this example is based on, the rbd command did not complete successfully and was canceled. It was subsequently confirmed that the cluster itself did not have the resources to write new data. The issue was resolved on the cluster itself; there was nothing incorrect in the client configuration.

2.10. Customizing and managing Red Hat Ceph Storage

Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 supports Red Hat Ceph Storage 7. For information about customizing and managing Red Hat Ceph Storage 7, see the Red Hat Ceph Storage 7 documentation.
