Deploying an overcloud with containerized Red Hat Ceph
Configuring the director to deploy and use a containerized Red Hat Ceph cluster
Abstract
Chapter 1. Introduction
Red Hat OpenStack Platform director creates a cloud environment called the overcloud. The director provides the ability to configure extra features for an overcloud, including integration with Red Hat Ceph Storage (both Ceph Storage clusters created with the director or existing Ceph Storage clusters).
This guide contains instructions for deploying a containerized Red Hat Ceph Storage cluster with your overcloud. Director uses Ansible playbooks provided through the ceph-ansible package to deploy a containerized Ceph cluster. The director also manages the configuration and scaling operations of the cluster.
For more information about containerized services in OpenStack, see Configuring a basic overcloud with the CLI tools in the Director Installation and Usage guide.
1.1. Introduction to Ceph Storage
Red Hat Ceph Storage is a distributed data object store designed to provide excellent performance, reliability, and scalability. Distributed object stores are the future of storage, because they accommodate unstructured data, and because clients can use modern object interfaces and legacy interfaces simultaneously. At the core of every Ceph deployment is the Ceph Storage cluster, which consists of several types of daemons, but primarily, these two:
- Ceph OSD (Object Storage Daemon)
- Ceph OSDs store data on behalf of Ceph clients. Additionally, Ceph OSDs utilize the CPU and memory of Ceph nodes to perform data replication, rebalancing, recovery, monitoring and reporting functions.
- Ceph Monitor
- A Ceph monitor maintains a master copy of the Ceph storage cluster map with the current state of the storage cluster.
For more information about Red Hat Ceph Storage, see the Red Hat Ceph Storage Architecture Guide.
1.2. Requirements
This guide contains information supplementary to the Director Installation and Usage guide.
Before you deploy a containerized Ceph Storage cluster with your overcloud, your environment must contain the following configuration:
- An undercloud host with the Red Hat OpenStack Platform director installed. See Installing director.
- Any additional hardware recommended for Red Hat Ceph Storage. For more information about recommended hardware, see the Red Hat Ceph Storage Hardware Guide.
The Ceph Monitor service installs on the overcloud Controller nodes, so you must provide adequate resources to avoid performance issues. Ensure that the Controller nodes in your environment have at least 16 GB of RAM and solid-state drive (SSD) storage for the Ceph monitor data. For a medium to large Ceph installation, provide at least 500 GB of space for the Ceph monitor data. This space is necessary to accommodate levelDB growth if the cluster becomes unstable.
If you use the Red Hat OpenStack Platform director to create Ceph Storage nodes, note the following requirements.
1.2.1. Ceph Storage node requirements
Ceph Storage nodes are responsible for providing object storage in a Red Hat OpenStack Platform environment.
- Placement Groups (PGs)
- Ceph uses placement groups to facilitate dynamic and efficient object tracking at scale. In the case of OSD failure or cluster rebalancing, Ceph can move or replicate a placement group and its contents, which means a Ceph cluster can re-balance and recover efficiently. The default placement group count that director creates is not always optimal so it is important to calculate the correct placement group count according to your requirements. You can use the placement group calculator to calculate the correct count: Placement Groups (PGs) per Pool Calculator
- Processor
- 64-bit x86 processor with support for the Intel 64 or AMD64 CPU extensions.
- Memory
- Red Hat typically recommends a baseline of 16 GB of RAM per OSD host, with an additional 2 GB of RAM per OSD daemon.
- Disk layout
Sizing is dependent on your storage requirements. Red Hat recommends that your Ceph Storage node configuration includes three or more disks in a layout similar to the following example:

- /dev/sda - The root disk. The director copies the main overcloud image to the disk. Ensure that the disk has a minimum of 40 GB of available disk space.
- /dev/sdb - The journal disk. This disk divides into partitions for Ceph OSD journals. For example, /dev/sdb1, /dev/sdb2, and /dev/sdb3. The journal disk is usually a solid state drive (SSD) to aid with system performance.
- /dev/sdc and onward - The OSD disks. Use as many disks as necessary for your storage requirements.

Note: Red Hat OpenStack Platform director uses ceph-ansible, which does not support installing the OSD on the root disk of Ceph Storage nodes. This means that you need at least two disks for a supported Ceph Storage node.
- Network Interface Cards
- A minimum of one 1 Gbps Network Interface Card, although Red Hat recommends that you use at least two NICs in a production environment. Use additional network interface cards for bonded interfaces or to delegate tagged VLAN traffic. Red Hat recommends that you use a 10 Gbps interface for storage nodes, especially if you want to create an OpenStack Platform environment that serves a high volume of traffic.
- Power management
- Each Ceph Storage node requires a supported power management interface, such as Intelligent Platform Management Interface (IPMI) functionality, on the motherboard of the server.
1.3. Additional resources
The /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml environment file instructs the director to use playbooks derived from the ceph-ansible project. These playbooks are installed in /usr/share/ceph-ansible/ of the undercloud. In particular, the following file contains all the default settings that the playbooks apply:
- /usr/share/ceph-ansible/group_vars/all.yml.sample
While ceph-ansible uses playbooks to deploy containerized Ceph Storage, do not edit these files to customize your deployment. Instead, use heat environment files to override the defaults set by these playbooks. If you edit the ceph-ansible playbooks directly, your deployment will fail.
For more information about the playbook collection, see the project documentation (http://docs.ceph.com/ceph-ansible/master/).
Alternatively, for information about the default settings applied by director for containerized Ceph Storage, see the heat templates in /usr/share/openstack-tripleo-heat-templates/deployment/ceph-ansible.
Reading these templates requires a deeper understanding of how environment files and heat templates work in director. See Understanding Heat Templates and Environment Files for reference.
Lastly, for more information about containerized services in OpenStack, see Configuring a basic overcloud with the CLI tools in the Director Installation and Usage guide.
Chapter 2. Preparing overcloud nodes
All nodes in this scenario are bare metal systems using IPMI for power management. These nodes do not require an operating system because the director copies a Red Hat Enterprise Linux 8 image to each node. Additionally, the Ceph Storage services on these nodes are containerized. The director communicates to each node through the Provisioning network during the introspection and provisioning processes. All nodes connect to this network through the native VLAN.
2.1. Cleaning Ceph Storage node disks
The Ceph Storage OSDs and journal partitions require GPT disk labels. This means the additional disks on Ceph Storage require conversion to GPT before installing the Ceph OSD services. You must delete all metadata from the disks to allow the director to set GPT labels on them.
You can configure the director to delete all disk metadata by default by adding the following setting to your /home/stack/undercloud.conf file:
clean_nodes=true
With this option, the Bare Metal Provisioning service runs an additional step to boot the nodes and clean the disks each time the node is set to available. This process adds an additional power cycle after the first introspection and before each deployment. The Bare Metal Provisioning service uses the wipefs --force --all command to perform the clean.
After setting this option, run the openstack undercloud install command to execute this configuration change.
The wipefs --force --all command deletes all data and metadata on the disk, but does not perform a secure erase. A secure erase takes much longer.
2.2. Registering nodes
Import a node inventory file (instackenv.json) in JSON format to the director so that the director can communicate with the nodes. This inventory file contains hardware and power management details that the director can use to register nodes:
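For example, a minimal instackenv.json describing one node might look like the following sketch (the node name, MAC address, IPMI credentials, and IP address are placeholders; field names follow the standard inventory schema, and the pm_type value depends on the power driver your environment uses):

```json
{
  "nodes": [
    {
      "name": "ceph-storage-0",
      "arch": "x86_64",
      "cpu": "4",
      "memory": "6144",
      "disk": "40",
      "mac": ["bb:bb:bb:bb:bb:bb"],
      "pm_type": "ipmi",
      "pm_user": "admin",
      "pm_password": "<password>",
      "pm_addr": "192.168.24.205"
    }
  ]
}
```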
Procedure
- After you create the inventory file, save the file to the home directory of the stack user (/home/stack/instackenv.json).
- Initialize the stack user, then import the instackenv.json inventory file into the director:

  $ source ~/stackrc
  $ openstack overcloud node import ~/instackenv.json

  The openstack overcloud node import command imports the inventory file and registers each node with the director.
- Assign the kernel and ramdisk images to each node:

  $ openstack overcloud node configure <node>

The nodes are now registered and configured in the director.
2.3. Manually tagging nodes into profiles
After you register each node, you must inspect the hardware and tag the node into a specific profile. Use profile tags to match your nodes to flavors, and then assign flavors to deployment roles.
To inspect and tag new nodes, complete the following steps:
- Trigger hardware introspection to retrieve the hardware attributes of each node:

  $ openstack overcloud node introspect --all-manageable --provide

  The --all-manageable option introspects only the nodes that are in a managed state. In this example, all nodes are in a managed state. The --provide option resets all nodes to an active state after introspection.

  Important: Ensure that this process completes successfully. This process usually takes 15 minutes for bare metal nodes.

- Retrieve a list of your nodes to identify their UUIDs:

  $ openstack baremetal node list

- Add a profile option to the properties/capabilities parameter for each node to manually tag a node to a specific profile. The addition of the profile option tags the nodes into each respective profile. For example, a typical deployment contains three profiles: control, compute, and ceph-storage. Tag nodes for each profile by setting the corresponding profile value in the capabilities of each node.

  Note: As an alternative to manual tagging, use the Automated Health Check (AHC) Tools to automatically tag larger numbers of nodes based on benchmarking data.

  Tip: You can also configure a new custom profile to tag a node for the Ceph MON and Ceph MDS services. See Chapter 3, Deploying Ceph services on dedicated nodes.
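The tagging commands might look like the following sketch, which follows the same openstack baremetal node set pattern used for the ceph-mon profile in Chapter 3 (the node UUIDs are placeholders):

```shell
$ openstack baremetal node set <control_node_uuid> --property capabilities="profile:control,boot_option:local"
$ openstack baremetal node set <compute_node_uuid> --property capabilities="profile:compute,boot_option:local"
$ openstack baremetal node set <ceph_node_uuid> --property capabilities="profile:ceph-storage,boot_option:local"
```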
2.4. Defining the root disk for multi-disk clusters
Director must identify the root disk during provisioning in the case of nodes with multiple disks. For example, most Ceph Storage nodes use multiple disks. By default, the director writes the overcloud image to the root disk during the provisioning process.
There are several properties that you can define to help the director identify the root disk:
- model (String): Device identifier.
- vendor (String): Device vendor.
- serial (String): Disk serial number.
- hctl (String): Host:Channel:Target:Lun for SCSI.
- size (Integer): Size of the device in GB.
- wwn (String): Unique storage identifier.
- wwn_with_extension (String): Unique storage identifier with the vendor extension appended.
- wwn_vendor_extension (String): Unique vendor storage identifier.
- rotational (Boolean): True for a rotational device (HDD), otherwise false (SSD).
- name (String): The name of the device, for example: /dev/sdb1.
Use the name property only for devices with persistent names. Do not use name to set the root disk for any other devices because this value can change when the node boots.
Complete the following steps to specify the root device using its serial number.
Procedure
- Check the disk information from the hardware introspection of each node. Run the following command to display the disk information of a node:

  (undercloud) $ openstack baremetal introspection data save 1a4e30da-b6dc-499d-ba87-0bd8a3819bc0 | jq ".inventory.disks"

  For example, the data for one node might show three disks.
- Run the openstack baremetal node set --property root_device= command to set the root disk for a node. Include the most appropriate hardware attribute value to define the root disk:

  (undercloud) $ openstack baremetal node set --property root_device='{"serial": "<serial_number>"}' <node-uuid>

  For example, to set the root device to disk 2, which has the serial number 61866da04f380d001ea4e13c12e36ad6, run the following command:

  (undercloud) $ openstack baremetal node set --property root_device='{"serial": "61866da04f380d001ea4e13c12e36ad6"}' 1a4e30da-b6dc-499d-ba87-0bd8a3819bc0
Ensure that you configure the BIOS of each node to include booting from the root disk that you choose. Configure the boot order to boot from the network first, then to boot from the root disk.
Director identifies the specific disk to use as the root disk. When you run the openstack overcloud deploy command, director provisions and writes the overcloud image to the root disk.
2.5. Using the overcloud-minimal image to avoid using a Red Hat subscription entitlement
By default, director writes the QCOW2 overcloud-full image to the root disk during the provisioning process. The overcloud-full image uses a valid Red Hat subscription. However, you can also use the overcloud-minimal image, for example, to provision a bare OS where you do not want to run any other OpenStack services and consume your subscription entitlements.
A common use case for this occurs when you want to provision nodes with only Ceph daemons. For this and similar use cases, you can use the overcloud-minimal image option to avoid reaching the limit of your paid Red Hat subscriptions. For information about how to obtain the overcloud-minimal image, see Obtaining images for overcloud nodes.
A Red Hat OpenStack Platform subscription contains Open vSwitch (OVS), but core services, such as OVS, are not available when you use the overcloud-minimal image. OVS is not required to deploy Ceph Storage nodes. Instead of using 'ovs_bond' to define bonds, use 'linux_bond'. For more information about linux_bond, see Linux bonding options.
Procedure
- To configure director to use the overcloud-minimal image, create an environment file that contains the following image definition:

  parameter_defaults:
    <roleName>Image: overcloud-minimal

  Replace <roleName> with the name of the role and append Image to the name of the role. The following example shows an overcloud-minimal image definition for Ceph Storage nodes:

  parameter_defaults:
    CephStorageImage: overcloud-minimal

- Pass the environment file to the openstack overcloud deploy command.
The overcloud-minimal image supports only standard Linux bridges and not OVS because OVS is an OpenStack service that requires a Red Hat OpenStack Platform subscription entitlement.
Chapter 3. Deploying Ceph services on dedicated nodes
By default, the director deploys the Ceph MON and Ceph MDS services on the Controller nodes. This is suitable for small deployments. However, for larger deployments, Red Hat recommends that you deploy the Ceph MON and Ceph MDS services on dedicated nodes to improve the performance of your Ceph cluster. Create a custom role for services that you want to isolate on dedicated nodes.
For more information about custom roles, see Creating a New Role in the Advanced Overcloud Customization guide.
The director uses the following file as a default reference for all overcloud roles:
- /usr/share/openstack-tripleo-heat-templates/roles_data.yaml
3.1. Creating a custom roles file
To create a custom role file, complete the following steps:
Procedure
- Make a copy of the roles_data.yaml file in /home/stack/templates/ so that you can add custom roles:

  $ cp /usr/share/openstack-tripleo-heat-templates/roles_data.yaml /home/stack/templates/roles_data_custom.yaml

- Include the new custom role file in the openstack overcloud deploy command.
3.2. Creating a custom role and flavor for the Ceph MON service
Complete the following steps to create a custom role CephMon and flavor ceph-mon for the Ceph MON role. You must already have a copy of the default roles data file as described in Chapter 3, Deploying Ceph services on dedicated nodes.
Procedure
- Open the /home/stack/templates/roles_data_custom.yaml file.
- Remove the service entry for the Ceph MON service (namely, OS::TripleO::Services::CephMon) from the Controller role.
- Add the OS::TripleO::Services::CephClient service to the Controller role.
- At the end of the roles_data_custom.yaml file, add a custom CephMon role that contains the Ceph MON service and all the other required node services.
- Run the openstack flavor create command to define a new flavor named ceph-mon for the CephMon role:

  $ openstack flavor create --id auto --ram 6144 --disk 40 --vcpus 4 ceph-mon

  Note: For details about this command, run openstack flavor create --help.
- Map this flavor to a new profile, also named ceph-mon:

  $ openstack flavor set --property "cpu_arch"="x86_64" --property "capabilities:boot_option"="local" --property "capabilities:profile"="ceph-mon" ceph-mon

  Note: For details about this command, run openstack flavor set --help.
- Tag nodes into the new ceph-mon profile:

  $ openstack baremetal node set UUID --property capabilities="profile:ceph-mon,boot_option:local"

- Add the following configuration to the node-info.yaml file to associate the ceph-mon flavor with the CephMon role:

  parameter_defaults:
    OvercloudCephMonFlavor: ceph-mon
    CephMonCount: 3
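The custom CephMon role entry added at the end of roles_data_custom.yaml might be sketched as follows (an illustrative fragment only; the full ServicesDefault list must include all of the required node services, copied from an existing role in your roles_data_custom.yaml file):

```yaml
- name: CephMon
  description: Dedicated node role for the Ceph MON service
  ServicesDefault:
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CephMon
    # ...all other required node services, copied from an
    # existing role in this file
```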
For more information about tagging nodes, see Section 2.3, “Manually tagging nodes into profiles”. For more information about custom role profiles, see Tagging Nodes Into Profiles.
3.3. Creating a custom role and flavor for the Ceph MDS service
Complete the following steps to create a custom role CephMDS and flavor ceph-mds for the Ceph MDS role. You must already have a copy of the default roles data file as described in Chapter 3, Deploying Ceph services on dedicated nodes.
Procedure
- Open the /home/stack/templates/roles_data_custom.yaml file.
- Remove the service entry for the Ceph MDS service (namely, OS::TripleO::Services::CephMds) from the Controller role. Comment out this line; in the next step, you add this service to the new custom role.
- At the end of the roles_data_custom.yaml file, add a custom CephMDS role that contains the Ceph MDS service and all the other required node services.

  Note: The Ceph MDS service requires the admin keyring, which you can set with either the Ceph MON or Ceph Client service. If you deploy Ceph MDS on a dedicated node without the Ceph MON service, you must also include the Ceph Client service in the new CephMDS role.

- Run the openstack flavor create command to define a new flavor named ceph-mds for this role:

  $ openstack flavor create --id auto --ram 6144 --disk 40 --vcpus 4 ceph-mds

  Note: For details about this command, run openstack flavor create --help.
- Map the new ceph-mds flavor to a new profile, also named ceph-mds:

  $ openstack flavor set --property "cpu_arch"="x86_64" --property "capabilities:boot_option"="local" --property "capabilities:profile"="ceph-mds" ceph-mds

  Note: For details about this command, run openstack flavor set --help.
- Tag nodes into the new ceph-mds profile:

  $ openstack baremetal node set UUID --property capabilities="profile:ceph-mds,boot_option:local"
For more information about tagging nodes, see Section 2.3, “Manually tagging nodes into profiles”. For more information about custom role profiles, see Tagging Nodes Into Profiles.
Chapter 4. Customizing the Storage service
The heat template collection provided by the director already contains the necessary templates and environment files to enable a basic Ceph Storage configuration.
The director uses the /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml environment file to create a Ceph cluster and integrate it with your overcloud during deployment. This cluster features containerized Ceph Storage nodes. For more information about containerized services in OpenStack, see Configuring a basic overcloud with the CLI tools in the Director Installation and Usage guide.
The Red Hat OpenStack director also applies basic, default settings to the deployed Ceph cluster. You must also define any additional configuration in a custom environment file:
Procedure
- Create the file storage-config.yaml in /home/stack/templates/. In this example, the ~/templates/storage-config.yaml file contains most of the overcloud-related custom settings for your environment. Parameters that you include in the custom environment file override the corresponding default settings from the /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml file.
- Add a parameter_defaults section to ~/templates/storage-config.yaml. This section contains custom settings for your overcloud. For example, to set vxlan as the network type of the networking service (neutron), add the following snippet to your custom environment file:

  parameter_defaults:
    NeutronNetworkType: vxlan

- If necessary, set the following options under parameter_defaults according to your requirements:

  Option                    | Description                                                                     | Default value
  CinderEnableIscsiBackend  | Enables the iSCSI back end                                                      | false
  CinderEnableRbdBackend    | Enables the Ceph Storage back end                                               | true
  CinderBackupBackend       | Sets ceph or swift as the back end for volume backups. For more information,    | ceph
                            | see Section 4.2.1, “Configuring the Backup Service to use Ceph”.                |
  NovaEnableRbdBackend      | Enables Ceph Storage for Nova ephemeral storage                                 | true
  GlanceBackend             | Defines which back end the Image service should use: rbd (Ceph), swift, or file | rbd
  GnocchiBackend            | Defines which back end the Telemetry service should use: rbd (Ceph), swift,     | rbd
                            | or file                                                                         |

  Note: You can omit an option from ~/templates/storage-config.yaml if you intend to use the default setting.
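Pulling these options together, a ~/templates/storage-config.yaml file that states each setting explicitly might look like the following sketch (the values mirror the defaults in the table above; adjust each one to your requirements):

```yaml
parameter_defaults:
  NeutronNetworkType: vxlan
  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  NovaEnableRbdBackend: true
  GlanceBackend: rbd
  GnocchiBackend: rbd
```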
The contents of your custom environment file change depending on the settings that you apply in the following sections. See Appendix A, Sample environment file: creating a Ceph Storage cluster for a completed example.
The following subsections contain information about overriding the common default storage service settings that the director applies.
4.1. Enabling the Ceph Metadata Server
The Ceph Metadata Server (MDS) runs the ceph-mds daemon, which manages metadata related to files stored on CephFS. CephFS can be consumed through NFS. For more information about using CephFS through NFS, see File System Guide and Deploying the Shared File Systems service with CephFS through NFS.
Red Hat supports deploying Ceph MDS only with the CephFS through NFS back end for the Shared File Systems service.
Procedure
To enable the Ceph Metadata Server, invoke the following environment file when you create your overcloud:
- /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-mds.yaml
For more information, see Section 7.2, “Initiating overcloud deployment”. For more information about the Ceph Metadata Server, see Configuring Metadata Server Daemons.
By default, the Ceph Metadata Server is deployed on the Controller node. You can deploy the Ceph Metadata Server on its own dedicated node. For more information, see Section 3.3, “Creating a custom role and flavor for the Ceph MDS service”.
4.2. Enabling the Ceph Object Gateway
The Ceph Object Gateway (RGW) provides applications with an interface to object storage capabilities within a Ceph Storage cluster. When you deploy RGW, you can replace the default Object Storage service (swift) with Ceph. For more information, see Object Gateway Configuration and Administration Guide.
Procedure
To enable RGW in your deployment, invoke the following environment file when you create the overcloud:
- /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml
For more information, see Section 7.2, “Initiating overcloud deployment”.
By default, Ceph Storage allows 250 placement groups per OSD. When you enable RGW, Ceph Storage creates six additional pools that are required by RGW. The new pools are:
- .rgw.root
- default.rgw.control
- default.rgw.meta
- default.rgw.log
- default.rgw.buckets.index
- default.rgw.buckets.data
In your deployment, default is replaced with the name of the zone to which the pools belong.
Therefore, when you enable RGW, be sure to set the default pg_num using the CephPoolDefaultPgNum parameter to account for the new pools. For more information about how to calculate the number of placement groups for Ceph pools, see Section 5.4, “Assigning custom attributes to different Ceph pools”.
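As a rough illustration of the rule of thumb behind the PG calculator (total PGs ≈ (OSD count × 100) / replica count, rounded up to the next power of two; the calculator itself also weights the result per pool, so use it for real sizing):

```shell
# Rule-of-thumb PG estimate for a cluster with 9 OSDs and 3 replicas.
osds=9
replicas=3
raw=$(( osds * 100 / replicas ))   # 300
# Round up to the next power of two.
pg=1
while [ "$pg" -lt "$raw" ]; do pg=$(( pg * 2 )); done
echo "$pg"   # 512
```

Set the result with the CephPoolDefaultPgNum parameter in your custom environment file.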
The Ceph Object Gateway is a direct replacement for the default Object Storage service. As such, all other services that normally use swift can seamlessly start using the Ceph Object Gateway instead without further configuration.
4.2.1. Configuring the Backup Service to use Ceph
The Block Storage Backup service (cinder-backup) is disabled by default. To enable the Block Storage Backup service, complete the following steps:
Procedure
Invoke the following environment file when you create your overcloud:
- /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml
4.3. Configuring multiple bonded interfaces for Ceph nodes
Use a bonded interface to combine multiple NICs and add redundancy to a network connection. If you have enough NICs on your Ceph nodes, you can create multiple bonded interfaces on each node to expand redundancy capability.
You can then use a bonded interface for each network connection that the node requires. This provides both redundancy and a dedicated connection for each network.
The simplest implementation of bonded interfaces involves the use of two bonds, one for each storage network used by the Ceph nodes. These networks are the following:
- Front-end storage network (StorageNet) - The Ceph client uses this network to interact with the corresponding Ceph cluster.
- Back-end storage network (StorageMgmtNet) - The Ceph cluster uses this network to balance data in accordance with the placement group policy of the cluster. For more information, see Placement Groups (PG) in the Red Hat Ceph Architecture Guide.
To configure multiple bonded interfaces, you must create a new network interface template, as the director does not provide any sample templates that you can use to deploy multiple bonded NICs. However, the director does provide a template that deploys a single bonded interface. This template is /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml. You can define an additional bonded interface for your additional NICs in this template.
For more information about creating custom interface templates, see Creating Custom Interface Templates in the Advanced Overcloud Customization guide.
The following snippet contains the default definition for the single bonded interface defined in the /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml file:
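The structure of that definition can be sketched as follows, reconstructed from the numbered callouts below (parameter names follow the standard tripleo network templates; verify them against the template file itself):

```yaml
- type: ovs_bridge                  # 1: a single OVS bridge named br-bond
  name: br-bond
  members:
    - type: ovs_bond                # 2: the bonded interface, also OVS
      name: bond1                   # 3: the default bond name
      ovs_options:
        get_param: BondInterfaceOvsOptions   # 4: bonding module directives
      members:                      # 5: NICs bonded by bond1
        - type: interface
          name: nic2
          primary: true
        - type: interface
          name: nic3
    - type: vlan                    # 6: front-end storage VLAN
      device: bond1                 # 7: the VLAN uses bond1
      vlan_id:
        get_param: StorageNetworkVlanID
      addresses:
        - ip_netmask:
            get_param: StorageIpSubnet
    - type: vlan                    # 6: back-end storage VLAN
      device: bond1                 # 7: the VLAN uses bond1
      vlan_id:
        get_param: StorageMgmtNetworkVlanID
      addresses:
        - ip_netmask:
            get_param: StorageMgmtIpSubnet
```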
1. A single bridge named br-bond holds the bond defined in this template. This line defines the bridge type, namely OVS.
2. The first member of the br-bond bridge is the bonded interface itself, named bond1. This line defines the bond type of bond1, which is also OVS.
3. The default bond is named bond1.
4. The ovs_options entry instructs director to use a specific set of bonding module directives. Those directives are passed through the BondInterfaceOvsOptions parameter, which you can also configure in this file. For more information about configuring bonding module directives, see Section 4.3.1, “Configuring bonding module directives”.
5. The members section of the bond defines which network interfaces are bonded by bond1. In this example, the bonded interface uses nic2 (set as the primary interface) and nic3.
6. The br-bond bridge has two other members: a VLAN for each of the front-end (StorageNetwork) and back-end (StorageMgmtNetwork) storage networks.
7. The device parameter defines which device a VLAN should use. In this example, both VLANs use the bonded interface, bond1.
With at least two more NICs, you can define an additional bridge and bonded interface. Then, you can move one of the VLANs to the new bonded interface, which increases throughput and reliability for both storage network connections.
When you customize the /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml file for this purpose, Red Hat recommends that you use Linux bonds (type: linux_bond) instead of the default OVS bond (type: ovs_bond). This bond type is more suitable for enterprise production deployments.
The following edited snippet defines an additional OVS bridge (br-bond2) which houses a new Linux bond named bond2. The bond2 interface uses two additional NICs, nic4 and nic5, and is used solely for back-end storage network traffic:
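The edited snippet itself can be sketched as follows, using standard tripleo NIC-config conventions. The VLAN ID and subnet parameter names (StorageMgmtNetworkVlanID, StorageMgmtIpSubnet) are assumptions based on common tripleo templates; check them against your template before use:

```yaml
- type: ovs_bridge
  name: br-bond2
  members:
  - type: linux_bond
    name: bond2
    bonding_options: {get_param: BondInterfaceOvsOptions}
    members:
    - type: interface
      name: nic4
      primary: true
    - type: interface
      name: nic5
  - type: vlan
    device: bond2
    vlan_id: {get_param: StorageMgmtNetworkVlanID}
    addresses:
    - ip_netmask: {get_param: StorageMgmtIpSubnet}
```

With this layout, the back-end StorageMgmtNetwork VLAN moves to bond2, leaving the front-end StorageNetwork VLAN on the original bond.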
1. As bond1 and bond2 are both Linux bonds (instead of OVS bonds), they use bonding_options instead of ovs_options to set bonding directives. For more information, see Section 4.3.1, “Configuring bonding module directives”.
For the full contents of this customized template, see Appendix B, Sample custom interface template: multiple bonded interfaces.
4.3.1. Configuring bonding module directives
After you add and configure the bonded interfaces, use the BondInterfaceOvsOptions parameter to set the directives that you want each bonded interface to use. You can find this information in the parameters: section of the /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml file. The following snippet shows the default definition of this parameter (namely, empty):
Define the options you need in the default: line. For example, to use 802.3ad (mode 4) and a LACP rate of 1 (fast), use 'mode=4 lacp_rate=1':
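A sketch of the edited parameter definition (the description text and surrounding template content are assumptions; only the default value matters here):

```yaml
parameters:
  BondInterfaceOvsOptions:
    default: 'mode=4 lacp_rate=1'
    description: The bonding options string for the bond interface.
    type: string
```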
For more information about other supported bonding options, see Open vSwitch Bonding Options in the Advanced Overcloud Optimization guide. For the full contents of the customized /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml template, see Appendix B, Sample custom interface template: multiple bonded interfaces.
Chapter 5. Customizing the Ceph Storage cluster
Director deploys containerized Red Hat Ceph Storage using a default configuration. You can customize Ceph Storage by overriding the default settings.
Prerequisites
To deploy containerized Ceph Storage you must include the /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml file during overcloud deployment. This environment file defines the following resources:
- CephAnsibleDisksConfig - This resource maps the Ceph Storage node disk layout. For more information, see Section 5.3, “Mapping the Ceph Storage node disk layout”.
- CephConfigOverrides - This resource applies all other custom settings to your Ceph Storage cluster.
Use these resources to override any defaults that the director sets for containerized Ceph Storage.
Procedure
1. Enable the Red Hat Ceph Storage 4 Tools repository:

   $ sudo subscription-manager repos --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

2. Install the ceph-ansible package on the undercloud:

   $ sudo dnf install ceph-ansible

3. To customize your Ceph Storage cluster, define custom parameters in a new environment file, for example, /home/stack/templates/ceph-config.yaml. You can apply Ceph Storage cluster settings with the following syntax in the parameter_defaults section of your environment file:

   parameter_defaults:
     CephConfigOverrides:
       section:
         KEY: VALUE

   Note: You can apply the CephConfigOverrides parameter to the [global] section of the ceph.conf file, as well as to any other section, such as [osd], [mon], and [client]. If you specify a section, the key:value data goes into the specified section. If you do not specify a section, the data goes into the [global] section by default. For information about Ceph Storage configuration, customization, and supported parameters, see the Red Hat Ceph Storage Configuration Guide.

4. Replace KEY and VALUE with the Ceph cluster settings that you want to apply. For example, in the global section, max_open_files is the KEY and 131072 is the corresponding VALUE:

   parameter_defaults:
     CephConfigOverrides:
       global:
         max_open_files: 131072
       osd:
         osd_scrub_during_recovery: false

   This configuration results in the following settings defined in the configuration file of your Ceph cluster:

   [global]
   max_open_files = 131072
   [osd]
   osd_scrub_during_recovery = false
5.1. Setting ceph-ansible group variables
The ceph-ansible tool is a playbook used to install and manage Ceph Storage clusters.
The ceph-ansible tool has a group_vars directory that defines configuration options and the default settings for those options. Use the group_vars directory to set Ceph Storage parameters.
For information about the group_vars directory, see Installing a Red Hat Ceph Storage cluster in the Installation Guide.
To change the variable defaults in director, use the CephAnsibleExtraConfig parameter to pass the new values in heat environment files. For example, to set the ceph-ansible group variable journal_size to 40960, create an environment file with the following journal_size definition:
parameter_defaults:
  CephAnsibleExtraConfig:
    journal_size: 40960
Change ceph-ansible group variables with the override parameters; do not edit group variables directly in the /usr/share/ceph-ansible directory on the undercloud.
5.2. Ceph containers for Red Hat OpenStack Platform with Ceph Storage
A Ceph container is required to configure Red Hat OpenStack Platform (RHOSP) to use Ceph, even with an external Ceph cluster. To be compatible with Red Hat Enterprise Linux 8, RHOSP 16.0 requires Red Hat Ceph Storage 4. The Ceph Storage 4 container is hosted at registry.redhat.io, a registry which requires authentication.
You can use the heat environment parameter ContainerImageRegistryCredentials to authenticate at registry.redhat.io, as described in Container image preparation parameters.
5.3. Mapping the Ceph Storage node disk layout
When you deploy containerized Ceph Storage, you must map the disk layout and specify dedicated block devices for the Ceph OSD service. You can perform this mapping in the environment file that you created earlier to define your custom Ceph parameters: /home/stack/templates/ceph-config.yaml.
Use the CephAnsibleDisksConfig resource in parameter_defaults to map your disk layout. This resource uses the following variables:
| Variable | Required? | Default value (if unset) | Description |
|---|---|---|---|
| osd_scenario | Yes | lvm | Sets the provisioning scenario that ceph-ansible uses to prepare OSDs. NOTE: The default value is lvm. |
| devices | Yes | NONE. Variable must be set. | A list of block devices that you want to use for OSDs on the node. |
| dedicated_devices | Yes (only if osd_scenario is non-collocated) | devices | A list of block devices that maps each entry in the devices list to a dedicated journaling block device. |
| dmcrypt | No | false | Sets whether data stored on OSDs is encrypted (true) or not (false). |
| osd_objectstore | No | bluestore | Sets the storage back end used by Ceph. NOTE: The default value is bluestore. |
5.3.1. Using BlueStore
To specify the block devices that you want to use as Ceph OSDs, use a variation of the following snippet:
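A minimal sketch of such a CephAnsibleDisksConfig definition, consistent with the example described below (device names are illustrative):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
      - /dev/nvme0n1
```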
Because /dev/nvme0n1 is in a higher performing device class, the example parameter defaults produce three OSDs that run on /dev/sdb, /dev/sdc, and /dev/sdd. The three OSDs use /dev/nvme0n1 as a BlueStore WAL device. The ceph-volume tool does this by using the batch subcommand. The same setup is duplicated for each Ceph storage node and assumes uniform hardware. If the BlueStore WAL data resides on the same disks as the OSDs, then change the parameter defaults:
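A sketch of the same-disk variant, in which no separate high-performance WAL device is listed and each OSD keeps its BlueStore WAL data on its own disk (device names are illustrative):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
```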
5.3.2. Referring to devices with persistent names
In some nodes, disk paths, such as /dev/sdb and /dev/sdc, may not point to the same block device during reboots. If this is the case with your CephStorage nodes, specify each disk with the /dev/disk/by-path/ symlink to ensure that the block device mapping is consistent throughout deployments:
Because you must set the list of OSD devices prior to overcloud deployment, it may not be possible to identify and set the PCI path of disk devices. In this case, gather the /dev/disk/by-path/ symlink data for block devices during introspection.
In the following example, the first command downloads the introspection data from the undercloud Object Storage service (swift) for the server b08-h03-r620-hci and saves the data in a file called b08-h03-r620-hci.json. The second command greps for “by-path”. The output of this command contains the unique /dev/disk/by-path values that you can use to identify disks.
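A sketch of these two commands, run from the undercloud as the stack user (shown for illustration; they require a live undercloud and the jq tool):

```shell
$ openstack baremetal introspection data save b08-h03-r620-hci | jq . > b08-h03-r620-hci.json
$ grep "by-path" b08-h03-r620-hci.json
```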
For more information about naming conventions for storage devices, see Overview of persistent naming attributes in the Managing storage devices guide.
For details about each journaling scenario and disk mapping for containerized Ceph Storage, see the OSD Scenarios section of the project documentation for ceph-ansible.
5.4. Assigning custom attributes to different Ceph pools
By default, Ceph pools created with director have the same number of placement groups (pg_num and pgp_num) and sizes. You can use either method in Chapter 5, Customizing the Ceph Storage cluster to override these settings globally; that is, doing so applies the same values for all pools.
You can also apply different attributes to each Ceph pool. To do so, use the CephPools parameter:
parameter_defaults:
  CephPools:
    - name: POOL
      pg_num: 128
      application: rbd
Replace POOL with the name of the pool that you want to configure, and set pg_num to the number of placement groups for that pool. This overrides the default pg_num for the specified pool.
If you use the CephPools parameter, you must also specify the application type. The application type for Compute, Block Storage, and Image Storage should be rbd, as shown in the examples, but depending on what the pool is used for, you might need to specify a different application type. For example, the application type for the gnocchi metrics pool is openstack_gnocchi. For more information, see Enable Application in the Storage Strategies Guide .
If you do not use the CephPools parameter, director sets the appropriate application type automatically, but only for the default pool list.
You can also create new custom pools through the CephPools parameter. For example, to add a pool called custompool:
parameter_defaults:
  CephPools:
    - name: custompool
      pg_num: 128
      application: rbd
This creates a new custom pool in addition to the default pools.
For typical pool configurations of common Ceph use cases, see the Ceph Placement Groups (PGs) per Pool Calculator. This calculator is normally used to generate the commands for manually configuring your Ceph pools. In this deployment, the director will configure the pools based on your specifications.
Red Hat Ceph Storage 3 (Luminous) introduced a hard limit on the maximum number of PGs an OSD can have, which is 200 by default. Do not override this parameter beyond 200. If there is a problem because the Ceph PG number exceeds the maximum, adjust the pg_num per pool to address the problem, not the mon_max_pg_per_osd.
5.5. Mapping the disk layout to non-homogeneous Ceph Storage nodes
By default, all nodes of a role that host Ceph OSDs (indicated by the OS::TripleO::Services::CephOSD service in roles_data.yaml), for example CephStorage or ComputeHCI nodes, use the global devices and dedicated_devices lists set in Section 5.3, “Mapping the Ceph Storage node disk layout”. This assumes that all of these servers have homogeneous hardware. If a subset of these servers do not have homogeneous hardware, then director needs to be aware that each of these servers has different devices and dedicated_devices lists. This is known as a node-specific disk configuration.
To pass a node-specific disk configuration to director, you must pass a heat environment file, such as node-spec-overrides.yaml, to the openstack overcloud deploy command and the file content must identify each server by a machine unique UUID and a list of local variables to override the global variables.
You can extract the machine unique UUID for each individual server or from the Ironic database.
To locate the UUID for an individual server, log in to the server and enter the following command:
dmidecode -s system-uuid
To extract the UUID from the Ironic database, enter the following command on the undercloud:
openstack baremetal introspection data save NODE-ID | jq .extra.system.product.uuid
If the undercloud.conf does not have inspection_extras = true before undercloud installation or upgrade and introspection, then the machine unique UUID is not in the Ironic database.
The machine unique UUID is not the Ironic UUID.
A valid node-spec-overrides.yaml file might look like the following:
parameter_defaults:
  NodeDataLookup: {"32E87B4C-C4A7-418E-865B-191684A6883B": {"devices": ["/dev/sdc"]}}
All lines after the first two lines must be valid JSON. An easy way to verify that the JSON is valid is to use the jq command:
- Remove the first two lines (parameter_defaults: and NodeDataLookup:) from the file temporarily.
- Enter cat node-spec-overrides.yaml | jq .
As the node-spec-overrides.yaml file grows, jq might also be used to ensure that the embedded JSON is valid. For example, because the devices and dedicated_devices list must be the same length, use the following command to verify that they are the same length before you start the deployment.
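A self-contained sketch of such a length check (assumes jq is installed; the sample file created here is a hypothetical stand-in for your real node-spec-overrides.yaml):

```shell
# Create a sample node-spec file: two YAML header lines, then JSON.
cat > /tmp/node-spec-overrides.yaml <<'EOF'
parameter_defaults:
  NodeDataLookup:
{"32E87B4C-C4A7-418E-865B-191684A6883B": {"devices": ["/dev/sda", "/dev/sdb"], "dedicated_devices": ["/dev/sdx", "/dev/sdx"]}}
EOF
# Strip the first two lines, then check that every node's devices and
# dedicated_devices lists have the same length; prints "true" if they all match.
sed 1,2d /tmp/node-spec-overrides.yaml | jq '[.[] | (.devices | length) == (.dedicated_devices | length)] | all'
```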
In the above example, the node-spec-c05-h17-h21-h25-6048r.yaml has three servers in rack c05 in which slots h17, h21, and h25 are missing disks. A more complicated example is included at the end of this section.
After the JSON has been validated, add back the two lines (parameter_defaults: and NodeDataLookup:) to make it a valid environment YAML file again, and include it in the deployment with -e.
In the example below, the updated heat environment file uses NodeDataLookup for the Ceph deployment. All of the servers have a devices list containing 35 disks, except for one server that is missing a disk. This environment file overrides the default devices list for only that single node, and gives the node the list of 34 disks that it must use instead of the global list.
5.6. Increasing the restart delay for large Ceph clusters
During deployment, Ceph services such as OSDs and Monitors, are restarted and the deployment does not continue until the service is running again. Ansible waits 15 seconds (the delay) and checks 5 times for the service to start (the retries). If the service does not restart, the deployment stops so the operator can intervene.
Depending on the size of the Ceph cluster, you may need to increase the retry or delay values. The exact names of these parameters and their defaults are as follows:
health_mon_check_retries: 5
health_mon_check_delay: 15
health_osd_check_retries: 5
health_osd_check_delay: 15
Procedure
Update the CephAnsibleExtraConfig parameter to change the default delay and retry values. This example makes the cluster check 30 times and wait 40 seconds between each check for the Ceph OSDs, and check 20 times and wait 10 seconds between each check for the Ceph MONs.
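A sketch of such an override, reconstructed from the values described above (parameter names follow the defaults listed earlier in this section):

```yaml
parameter_defaults:
  CephAnsibleExtraConfig:
    health_osd_check_delay: 40
    health_osd_check_retries: 30
    health_mon_check_delay: 10
    health_mon_check_retries: 20
```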
- To incorporate the changes, pass the updated yaml file with -e using openstack overcloud deploy.
5.7. Overriding Ansible environment variables
The Red Hat OpenStack Platform Workflow service (mistral) uses Ansible to configure Ceph Storage, but you can customize the Ansible environment by using Ansible environment variables.
Procedure
To override an ANSIBLE_* environment variable, use the CephAnsibleEnvironmentVariables heat template parameter.
This example configuration increases the number of forks and SSH retries:
parameter_defaults:
  CephAnsibleEnvironmentVariables:
    ANSIBLE_SSH_RETRIES: '6'
    DEFAULT_FORKS: '35'
For more information about Ansible environment variables, see Ansible Configuration Settings.
For more information about how to customize your Ceph Storage cluster, see Customizing the Ceph Storage cluster.
Chapter 6. Deploying second-tier Ceph storage on OpenStack
Using OpenStack director, you can deploy different Red Hat Ceph Storage performance tiers by adding new Ceph nodes dedicated to a specific tier in a Ceph cluster.
For example, you can add new object storage daemon (OSD) nodes with SSD drives to an existing Ceph cluster to create a Block Storage (cinder) backend exclusively for storing data on these nodes. A user creating a new Block Storage volume can then choose the desired performance tier: either HDDs or the new SSDs.
This type of deployment requires Red Hat OpenStack Platform director to pass a customized CRUSH map to ceph-ansible. The CRUSH map allows you to split OSD nodes based on disk performance, but you can also use this feature for mapping physical infrastructure layout.
The following sections demonstrate how to deploy four nodes where two of the nodes use SSDs and the other two use HDDs. The example is kept simple to communicate a repeatable pattern. However, to be supported, a production deployment should use more nodes and more OSDs, as described in the Red Hat Ceph Storage hardware selection guide.
6.1. Create a CRUSH map
Use the CRUSH map to put OSD nodes into a CRUSH root. A default root is created automatically, and all OSD nodes are included in it.
Inside a given root, you define the physical topology, rack, rooms, and other specifications, and then add the OSD nodes to the desired location in the hierarchy (or bucket). By default, no physical topology is defined; a flat design is assumed as if all nodes are in the same rack.
For more information about creating a custom CRUSH map, see Crush Administration in the Storage Strategies Guide.
6.2. Mapping the OSDs
Complete the following step to map the OSDs.
Procedure
Declare the OSDs/journal mapping:
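A hypothetical sketch of such an OSDs/journal mapping, assuming two data disks per node with journals on a shared dedicated device (device names and the non-collocated scenario are assumptions; adjust to your hardware):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    osd_scenario: non-collocated
    devices:
      - /dev/sda
      - /dev/sdb
    dedicated_devices:
      - /dev/sdx
      - /dev/sdx
```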
6.3. Setting the replication factor
Complete the following step to set the replication factor.
This is normally supported only for full SSD deployment. See Red Hat Ceph Storage: Supported configurations.
Procedure
Set the default replication factor to two. This example splits four nodes into two different roots.
parameter_defaults:
  CephPoolDefaultSize: 2
If you upgrade a deployment that uses gnocchi as the backend, you might encounter deployment timeout. To prevent this timeout, use the following CephPool definition to customize the gnocchi pool:
parameter_defaults:
  CephPools:
    - {"name": metrics, "pg_num": 128, "pgp_num": 128, "size": 1}
6.4. Defining the CRUSH hierarchy
Director provides the data for the CRUSH hierarchy, and ceph-ansible applies that data by passing the CRUSH mapping through the Ansible inventory file. Unless you keep the default root, you must specify the location of the root for each node.
For example if node lab-ceph01 (provisioning IP 172.16.0.26) is placed in rack1 inside the fast_root, the Ansible inventory should resemble the following:
172.16.0.26:
  osd_crush_location: {host: lab-ceph01, rack: rack1, root: fast_root}
When you use director to deploy Ceph, you don’t actually write the Ansible inventory; it is generated for you. Therefore, you must use NodeDataLookup to append the data.
NodeDataLookup works by specifying the system product UUID stored on the motherboard of the systems. The Bare Metal service (ironic) also stores this information after the introspection phase.
To create a CRUSH map that supports second-tier storage, complete the following steps:
Procedure
Run the following commands to retrieve the UUIDs of the four nodes:
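A sketch of these commands, following the introspection pattern shown earlier in this guide (repeat for each Ironic node name; requires a live undercloud):

```shell
$ openstack baremetal introspection data save overcloud-ceph01 | jq .extra.system.product.uuid
$ openstack baremetal introspection data save overcloud-ceph02 | jq .extra.system.product.uuid
$ openstack baremetal introspection data save overcloud-ceph03 | jq .extra.system.product.uuid
$ openstack baremetal introspection data save overcloud-ceph04 | jq .extra.system.product.uuid
```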
Note: In the example, overcloud-ceph0[1-4] are the Ironic node names; they are deployed as lab-ceph0[1-4] (through HostnameMap.yaml).
Specify the node placement as follows:
| Root | Rack | Node |
|---|---|---|
| standard_root | rack1_std | overcloud-ceph01 (lab-ceph01) |
| standard_root | rack2_std | overcloud-ceph02 (lab-ceph02) |
| fast_root | rack1_fast | overcloud-ceph03 (lab-ceph03) |
| fast_root | rack2_fast | overcloud-ceph04 (lab-ceph04) |
Note: You cannot have two buckets with the same name. Even if lab-ceph01 and lab-ceph03 are in the same physical rack, you cannot have two buckets called rack1. Therefore, we named them rack1_std and rack1_fast.
Note: This example demonstrates how to create a specific root called "standard_root" to illustrate multiple custom roots. However, you could have kept the HDD OSD nodes in the default root.
Use the following NodeDataLookup syntax:

NodeDataLookup: {"SYSTEM_UUID": {"osd_crush_location": {"root": "$MY_ROOT", "rack": "$MY_RACK", "host": "$OVERCLOUD_NODE_HOSTNAME"}}}

Note: You must specify the system UUID and then the CRUSH hierarchy from top to bottom. Also, the host parameter must point to the node's overcloud host name, not the Bare Metal service (ironic) node name. To match the example configuration, enter the following:

parameter_defaults:
  NodeDataLookup: {"32C2BC31-F6BB-49AA-971A-377EFDFDB111": {"osd_crush_location": {"root": "standard_root", "rack": "rack1_std", "host": "lab-ceph01"}},
                   "76B4C69C-6915-4D30-AFFD-D16DB74F64ED": {"osd_crush_location": {"root": "standard_root", "rack": "rack2_std", "host": "lab-ceph02"}},
                   "FECF7B20-5984-469F-872C-732E3FEF99BF": {"osd_crush_location": {"root": "fast_root", "rack": "rack1_fast", "host": "lab-ceph03"}},
                   "5FFEFA5F-69E4-4A88-B9EA-62811C61C8B3": {"osd_crush_location": {"root": "fast_root", "rack": "rack2_fast", "host": "lab-ceph04"}}}

Enable CRUSH map management at the ceph-ansible level:
parameter_defaults:
  CephAnsibleExtraConfig:
    create_crush_tree: true

Use scheduler hints to ensure the Bare Metal service node UUIDs correctly map to the hostnames:
parameter_defaults:
  CephStorageCount: 4
  OvercloudCephStorageFlavor: ceph-storage
  CephStorageSchedulerHints:
    'capabilities:node': 'ceph-%index%'

Tag the Bare Metal service nodes with the corresponding hint:
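A sketch of the tagging command (the node identifier placeholder and the boot_option capability are assumptions based on common director usage; repeat with ceph-1, ceph-2, and so on for each node):

```shell
$ openstack baremetal node set --property capabilities='node:ceph-0,boot_option:local' <ironic-node-uuid>
```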
Note: For more information about predictive placement, see Assigning Specific Node IDs in the Advanced Overcloud Customization guide.
6.5. Defining CRUSH map rules
Rules define how the data is written on a cluster. After the CRUSH map node placement is complete, define the CRUSH rules.
Procedure
Use the following syntax to define the CRUSH rules:
Note: Setting the default parameter to true means that this rule is used when you create a new pool without specifying any rule. There can be only one default rule.
In the following example, rule standard points to the OSD nodes hosted on the standard_root with one replica per rack. Rule fast points to the OSD nodes hosted on the fast_root with one replica per rack:
Note: You must set crush_rule_config to true.
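A sketch of such a rule definition through CephAnsibleExtraConfig, matching the standard and fast rules described above (the exact crush_rules variable layout follows ceph-ansible conventions and should be checked against your ceph-ansible version):

```yaml
parameter_defaults:
  CephAnsibleExtraConfig:
    crush_rule_config: true
    crush_rules:
      - name: standard
        root: standard_root
        type: rack
        default: true
      - name: fast
        root: fast_root
        type: rack
        default: false
```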
6.6. Configuring OSP pools
Ceph pools are configured with CRUSH rules that define how to store data. This example configures all built-in OSP pools to use the standard_root (the standard rule) and creates a new pool that uses fast_root (the fast rule).
Procedure
Use the following syntax to define or change a pool property:
- name: $POOL_NAME
  pg_num: $PG_COUNT
  rule_name: $RULE_NAME
  application: rbd

List all OSP pools and set the appropriate rule (standard, in this case), and create a new pool called tier2 that uses the fast rule. This pool will be used by Block Storage (cinder).
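A sketch of such a pool definition (pool names follow the built-in OSP pools for Block Storage, Compute, and Image Storage; pg_num values are illustrative and should come from your PG calculation):

```yaml
parameter_defaults:
  CephPools:
    - name: volumes
      pg_num: 128
      rule_name: standard
      application: rbd
    - name: vms
      pg_num: 128
      rule_name: standard
      application: rbd
    - name: images
      pg_num: 128
      rule_name: standard
      application: rbd
    - name: tier2
      pg_num: 128
      rule_name: fast
      application: rbd
```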
6.7. Configuring Block Storage to use the new pool
Add the Ceph pool to the cinder.conf file to enable Block Storage (cinder) to consume it:
Procedure
Update cinder.conf as follows:

parameter_defaults:
  CinderRbdExtraPools:
    - tier2
6.8. Verifying customized CRUSH map
After the openstack overcloud deploy command creates or updates the overcloud, complete the following step to verify that the customized CRUSH map was correctly applied.
Be careful if you move a host from one root to another.
Procedure
Connect to a Ceph monitor node and run the following command:
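For example, you might inspect the OSD tree and confirm that the custom roots (standard_root and fast_root in this example) and their rack buckets appear with the expected hosts under them (a sketch; the exact output depends on your cluster):

```shell
$ sudo ceph osd tree
```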
Chapter 7. Creating the overcloud
When your custom environment files are ready, you can specify the flavors and nodes that each role uses and then execute the deployment. The following subsections explain both steps in greater detail.
7.1. Assigning nodes and flavors to roles
Planning an overcloud deployment involves specifying how many nodes and which flavors to assign to each role. Like all Heat template parameters, these role specifications are declared in the parameter_defaults section of your environment file (in this case, ~/templates/storage-config.yaml).
For this purpose, use the following parameters:
| Heat Template Parameter | Description |
|---|---|
| ControllerCount | The number of Controller nodes to scale out |
| OvercloudControlFlavor | The flavor to use for Controller nodes |
| ComputeCount | The number of Compute nodes to scale out |
| OvercloudComputeFlavor | The flavor to use for Compute nodes |
| CephStorageCount | The number of Ceph Storage (OSD) nodes to scale out |
| OvercloudCephStorageFlavor | The flavor to use for Ceph Storage (OSD) nodes |
| CephMonCount | The number of dedicated Ceph MON nodes to scale out |
| OvercloudCephMonFlavor | The flavor to use for dedicated Ceph MON nodes |
| CephMdsCount | The number of dedicated Ceph MDS nodes to scale out |
| OvercloudCephMdsFlavor | The flavor to use for dedicated Ceph MDS nodes |
The CephMonCount, CephMdsCount, OvercloudCephMonFlavor, and OvercloudCephMdsFlavor parameters (along with the ceph-mon and ceph-mds flavors) will only be valid if you created a custom CephMON and CephMds role, as described in Chapter 3, Deploying Ceph services on dedicated nodes.
For example, to configure the overcloud to deploy three nodes for each role (Controller, Compute, Ceph-Storage, and CephMon), add the following to your parameter_defaults:
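A sketch of such a parameter_defaults section (the flavor names shown are assumptions based on common director flavor naming; adjust them to the flavors you created):

```yaml
parameter_defaults:
  ControllerCount: 3
  OvercloudControlFlavor: control
  ComputeCount: 3
  OvercloudComputeFlavor: compute
  CephStorageCount: 3
  OvercloudCephStorageFlavor: ceph-storage
  CephMonCount: 3
  OvercloudCephMonFlavor: ceph-mon
```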
See Creating the Overcloud with the CLI Tools from the Director Installation and Usage guide for a more complete list of Heat template parameters.
7.2. Initiating overcloud deployment
During undercloud installation, set generate_service_certificate=false in the undercloud.conf file. Otherwise, you must inject a trust anchor when you deploy the overcloud, as described in Enabling SSL/TLS on Overcloud Public Endpoints in the Advanced Overcloud Customization guide.
- Note
- If you want to add Ceph Dashboard during your overcloud deployment, see Chapter 8, Adding the Red Hat Ceph Storage Dashboard to an overcloud deployment.
The creation of the overcloud requires additional arguments for the openstack overcloud deploy command. For example:
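A sketch of such a command, assembled from the option list that follows (paths match the option descriptions; adjust them to your environment):

```shell
$ openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data_custom.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
  -e /home/stack/templates/storage-config.yaml \
  -e /home/stack/templates/ceph-config.yaml \
  --ntp-server pool.ntp.org
```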
The above command uses the following options:
- --templates - Creates the overcloud from the default heat template collection, /usr/share/openstack-tripleo-heat-templates/.
- -r /home/stack/templates/roles_data_custom.yaml - Specifies the customized roles definition file from Chapter 3, Deploying Ceph services on dedicated nodes, which adds custom roles for either Ceph MON or Ceph MDS services. These roles allow either service to be installed on dedicated nodes.
- -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml - Sets the director to create a Ceph cluster. In particular, this environment file deploys a Ceph cluster with containerized Ceph Storage nodes.
- -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml - Enables the Ceph Object Gateway, as described in Section 4.2, “Enabling the Ceph Object Gateway”.
- -e /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml - Enables the Block Storage Backup service (cinder-backup), as described in Section 4.2.1, “Configuring the Backup Service to use Ceph”.
- -e /home/stack/templates/storage-config.yaml - Adds the environment file containing your custom Ceph Storage configuration.
- -e /home/stack/templates/ceph-config.yaml - Adds the environment file containing your custom Ceph cluster settings, as described in Chapter 5, Customizing the Ceph Storage cluster.
- --ntp-server pool.ntp.org - Sets the NTP server.
You can also use an answers file to invoke all your templates and environment files. For example, you can use the following command to deploy an identical overcloud:
$ openstack overcloud deploy -r /home/stack/templates/roles_data_custom.yaml \
  --answers-file /home/stack/templates/answers.yaml --ntp-server pool.ntp.org
In this case, the answers file /home/stack/templates/answers.yaml contains:
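A sketch of what such an answers file might contain (the environment file list mirrors the deploy options described earlier in this section):

```yaml
templates: /usr/share/openstack-tripleo-heat-templates/
environments:
  - /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml
  - /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml
  - /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml
  - /home/stack/templates/storage-config.yaml
  - /home/stack/templates/ceph-config.yaml
```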
See Including environment files in an overcloud deployment for more details.
For a full list of options, run:
$ openstack help overcloud deploy
For more information, see Configuring a basic overcloud with the CLI tools in the Director Installation and Usage guide.
The overcloud creation process begins and the director provisions your nodes. This process takes some time to complete. To view the status of the Overcloud creation, open a separate terminal as the stack user and run:
$ source ~/stackrc
$ openstack stack list --nested
Chapter 8. Adding the Red Hat Ceph Storage Dashboard to an overcloud deployment
Red Hat Ceph Storage Dashboard is disabled by default, but you can now enable it in your overcloud with the Red Hat OpenStack Platform director. The Ceph Dashboard is a built-in, web-based Ceph management and monitoring application that administers various aspects and objects in your cluster. Red Hat Ceph Storage Dashboard comprises the Ceph Dashboard manager module, which provides the user interface and embeds Grafana, the front end of the platform; Prometheus as a monitoring plugin; and Alertmanager and Node Exporters, which are deployed throughout the cluster and send alerts and export cluster data to the Dashboard.
- Note
- This feature is supported with Ceph Storage 4.1 or later. For more information about how to determine the version of Ceph Storage installed on your system, see Red Hat Ceph Storage releases and corresponding Ceph package versions.
- Note
- The Red Hat Ceph Storage Dashboard is always colocated on the same nodes as the other Ceph manager components.
- Note
- If you want to add Ceph Dashboard during your initial overcloud deployment, complete the procedures in this chapter before you deploy your initial overcloud in Section 7.2, “Initiating overcloud deployment”.
The following diagram shows the architecture of Ceph Dashboard on Red Hat OpenStack Platform:
For more information about the Dashboard and its features and limitations, see Dashboard features in the Red Hat Ceph Storage Dashboard Guide.
TLS everywhere with Ceph Dashboard
The dashboard front end is fully integrated with the TLS everywhere framework. You can enable TLS everywhere provided that you have the required environment files and include them in the overcloud deploy command. This triggers the certificate request for both Grafana and the Ceph Dashboard, and the generated certificate and key files are passed to ceph-ansible during the overcloud deployment. For instructions and more information about how to enable TLS for the Dashboard as well as for other OpenStack services, see the following locations in the Advanced Overcloud Customization guide:
- Enabling SSL/TLS on Overcloud Public Endpoints.
- Enabling SSL/TLS on Internal and Public Endpoints with Identity Management.
- Note
- The port to reach the Ceph Dashboard remains the same even in the TLS-everywhere context.
8.1. Including the necessary containers for the Ceph Dashboard
Before you can add the Ceph Dashboard templates to your overcloud, you must include the necessary containers by using the containers-prepare-parameter.yaml file. To generate the containers-prepare-parameter.yaml file to prepare your container images, complete the following steps:
Procedure
-
Log in to your undercloud host as the
stack user. Generate the default container image preparation file:
$ openstack tripleo container image prepare default \
  --local-push-destination \
  --output-env-file containers-prepare-parameter.yaml
Edit the
containers-prepare-parameter.yaml file and make the modifications to suit your requirements. The following example containers-prepare-parameter.yaml file contains the image locations and tags related to the Dashboard services, including Grafana, Prometheus, Alertmanager, and Node Exporter. Edit the values depending on your specific scenario:
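The example file itself was lost in this copy of the guide. The following is a reconstructed sketch of the Dashboard-related entries in a ContainerImagePrepare list; the image names, namespaces, and tags shown here are illustrative and vary by release, so verify them against the file that the prepare command generates:

```yaml
parameter_defaults:
  ContainerImagePrepare:
    - push_destination: true
      set:
        # Dashboard-related images (values are examples; check your generated file)
        ceph_alertmanager_image: ose-prometheus-alertmanager
        ceph_alertmanager_namespace: registry.redhat.io/openshift4
        ceph_alertmanager_tag: v4.1
        ceph_grafana_image: rhceph-4-dashboard-rhel8
        ceph_grafana_namespace: registry.redhat.io/rhceph
        ceph_grafana_tag: 4
        ceph_node_exporter_image: ose-prometheus-node-exporter
        ceph_node_exporter_namespace: registry.redhat.io/openshift4
        ceph_node_exporter_tag: v4.1
        ceph_prometheus_image: ose-prometheus
        ceph_prometheus_namespace: registry.redhat.io/openshift4
        ceph_prometheus_tag: v4.1
```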
For more information about registry and image configuration with the containers-prepare-parameter.yaml file, see Container image preparation parameters in the Transitioning to Containerized Services guide.
8.2. Deploying Ceph Dashboard
- Note
- The Ceph Dashboard admin user role is set to read-only mode by default. To change the Ceph Dashboard admin default mode, see Section 8.3, “Changing the default permissions”.
Procedure
-
Log in to the undercloud node as the
stack user. Include the following environment files, with all environment files that are part of your existing deployment, in the
openstack overcloud deploy command:
$ openstack overcloud deploy \
  --templates \
  -e <existing_overcloud_environment_files> \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-dashboard.yaml
Replace <existing_overcloud_environment_files> with the list of environment files that are part of your existing deployment.
- Result
- The resulting deployment comprises an external stack with the grafana, prometheus, alertmanager, and node-exporter containers. The Ceph Dashboard manager module is the back end for this stack and embeds the Grafana layouts to provide Ceph-cluster-specific metrics to end users.
8.3. Changing the default permissions
The Ceph Dashboard admin user role is set to read-only mode by default for safe monitoring of the Ceph cluster. To permit an admin user to have elevated privileges so that they can alter elements of the Ceph cluster with the Dashboard, you can use the CephDashboardAdminRO parameter to change the default admin permissions.
- Warning
- A user with full permissions might alter elements of your cluster that director configures. This can cause a conflict with director-configured options when you run a stack update. To avoid this problem, do not alter director-configured options with Ceph Dashboard, for example, the attributes of the Ceph pools that OpenStack uses.
Procedure
-
Log in to the undercloud as the
stack user. Create the following ceph_dashboard_admin.yaml environment file:
parameter_defaults:
  CephDashboardAdminRO: false
Run the overcloud deploy command to update the existing stack and include the environment file you created with all other environment files that are part of your existing deployment:
$ openstack overcloud deploy \
  --templates \
  -e <existing_overcloud_environment_files> \
  -e ceph_dashboard_admin.yaml
Replace <existing_overcloud_environment_files> with the list of environment files that are part of your existing deployment.
8.4. Accessing Ceph Dashboard
To test that Ceph Dashboard is running correctly, complete the following verification steps to access it and check that the data it displays from the Ceph cluster is correct.
Procedure
-
Log in to the undercloud node as the
stack user. Retrieve the dashboard admin login credentials:
[stack@undercloud ~]$ grep dashboard_admin_password /var/lib/mistral/overcloud/ceph-ansible/group_vars/all.yml
Retrieve the VIP address to access the Ceph Dashboard:
[stack@undercloud-0 ~]$ grep dashboard_frontend /var/lib/mistral/overcloud/ceph-ansible/group_vars/mgrs.yml
Use a web browser to point to the front-end VIP and access the Dashboard. Director configures and exposes the Dashboard on the provisioning network, so you can use the VIP that you retrieved in step 3 to access the dashboard directly on TCP port 8444. Ensure that the following conditions are met:
- The Web client host is layer 2 connected to the provisioning network.
- The provisioning network is properly routed or proxied, and it can be reached from the web client host.
If these conditions are not met, you can still open an SSH tunnel to reach the Dashboard VIP on the overcloud:
client_host$ ssh -L 8444:<dashboard vip>:8444 stack@<your undercloud>
Replace <dashboard vip> with the IP address of the control plane VIP that you retrieved in step 3.
Access the Dashboard by pointing your web browser to http://localhost:8444. The default user that
ceph-ansible creates is admin. You can retrieve the password in /var/lib/mistral/overcloud/ceph-ansible/group_vars/all.yml.
- Results
- You can access the Ceph Dashboard.
-
The numbers and graphs that the Dashboard displays reflect the same cluster status that the CLI command,
ceph -s, returns.
For more information about the Red Hat Ceph Storage Dashboard, see the Red Hat Ceph Storage Administration Guide.
Chapter 9. Post-deployment
The following subsections describe several post-deployment operations for managing the Ceph cluster.
9.1. Accessing the overcloud
The director generates a script to configure and help authenticate interactions with your overcloud from the undercloud. The director saves this file (overcloudrc) in your stack user’s home directory. Run the following command to use this file:
$ source ~/overcloudrc
This loads the necessary environment variables to interact with your overcloud from the undercloud CLI. To return to interacting with the undercloud, run the following command:
$ source ~/stackrc
9.2. Monitoring Ceph Storage nodes
After you create the overcloud, check the status of the Ceph Storage Cluster to ensure that it works correctly.
Procedure
Log in to a Controller node as the
heat-admin user:
$ nova list
$ ssh heat-admin@192.168.0.25
Check the health of the cluster:
$ sudo podman exec ceph-mon-<HOSTNAME> ceph health
If the cluster has no issues, the command reports back HEALTH_OK. This means the cluster is safe to use.
Log in to an overcloud node that runs the Ceph monitor service and check the status of all OSDs in the cluster:
$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd tree
Check the status of the Ceph Monitor quorum:
$ sudo podman exec ceph-mon-<HOSTNAME> ceph quorum_status
This shows the monitors participating in the quorum and which one is the leader.
Verify that all Ceph OSDs are running:
$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd stat
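The HEALTH_OK convention described above can be scripted as a simple gate before further work on the cluster. This is a sketch, not part of the product: ceph_stub is a stand-in so that the logic can be exercised without a cluster; on a real controller you would set CEPH to the podman invocation shown in the steps above.

```shell
#!/bin/bash
# Sketch: gate further work on the cluster reporting HEALTH_OK.
# ceph_stub stands in for the real command so the flow runs anywhere;
# on a controller, set CEPH="sudo podman exec ceph-mon-<HOSTNAME> ceph".
ceph_stub() { echo "HEALTH_OK"; }
CEPH=${CEPH:-ceph_stub}

health=$($CEPH health)
if [ "$health" = "HEALTH_OK" ]; then
  echo "cluster is safe to use"
else
  echo "cluster reports: $health - investigate before continuing" >&2
fi
```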
For more information on monitoring Ceph Storage clusters, see Monitoring in the Red Hat Ceph Storage Administration Guide.
Chapter 10. Rebooting the environment
A situation might occur where you need to reboot the environment, for example, when you need to modify the physical servers or you need to recover from a power outage. In this situation, it is important to ensure that your Ceph Storage nodes boot correctly.
Make sure to boot the nodes in the following order:
- Boot all Ceph Monitor nodes first - This ensures the Ceph Monitor service is active in your high availability cluster. By default, the Ceph Monitor service is installed on the Controller node. If the Ceph Monitor is separate from the Controller in a custom role, make sure this custom Ceph Monitor role is active.
- Boot all Ceph Storage nodes - This ensures the Ceph OSD cluster can connect to the active Ceph Monitor cluster on the Controller nodes.
10.1. Rebooting a Ceph Storage (OSD) cluster
Complete the following steps to reboot a cluster of Ceph Storage (OSD) nodes.
Procedure
Log in to a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set norebalance
- Select the first Ceph Storage node that you want to reboot and log in to the node.
Reboot the node:
$ sudo reboot
- Wait until the node boots.
Log in to the node and check the cluster status:
$ sudo podman exec -it ceph-mon-controller-0 ceph status
Check that the pgmap reports all pgs as normal (active+clean).
- Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.
When complete, log in to a Ceph MON or Controller node and re-enable cluster rebalancing:
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset norebalance
Perform a final status check to verify that the cluster reports HEALTH_OK:
$ sudo podman exec -it ceph-mon-controller-0 ceph status
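The rolling-reboot procedure above can be summarized as a loop. This sketch only prints each step (run is a dry-run helper), and the node and container names are examples; in a real environment you would replace run with actual execution and wait for the pgs to return to active+clean between nodes.

```shell
#!/bin/bash
# Dry-run sketch of the rolling Ceph Storage (OSD) node reboot.
# run() only prints the command; node and container names are examples.
run() { echo "+ $*"; }
CEPH="sudo podman exec -it ceph-mon-controller-0 ceph"
NODES="overcloud-cephstorage-0 overcloud-cephstorage-1"

run $CEPH osd set noout
run $CEPH osd set norebalance
for node in $NODES; do
  run ssh "heat-admin@$node" sudo reboot
  run $CEPH status   # wait here until pgmap shows all pgs active+clean
done
run $CEPH osd unset noout
run $CEPH osd unset norebalance
```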
If a situation occurs where all overcloud nodes boot at the same time, the Ceph OSD services might not start correctly on the Ceph Storage nodes. In this situation, reboot the Ceph Storage OSDs so they can connect to the Ceph Monitor service.
Verify that the Ceph Storage node cluster reports a HEALTH_OK status with the following command:
$ sudo ceph status
Chapter 11. Scaling the Ceph Storage cluster
11.1. Scaling up the Ceph Storage cluster
You can scale up the number of Ceph Storage nodes in your overcloud by re-running the deployment with the number of Ceph Storage nodes you need.
Before doing so, ensure that you have enough nodes for the updated deployment. These nodes must be registered with the director and tagged accordingly.
Registering New Ceph Storage Nodes
To register new Ceph storage nodes with the director, follow these steps:
Log in to the undercloud as the
stack user and initialize your director configuration:
$ source ~/stackrc
-
Define the hardware and power management details for the new nodes in a new node definition template; for example,
instackenv-scale.json. Import this file to the OpenStack director:
$ openstack overcloud node import ~/instackenv-scale.json
Importing the node definition template registers each node that it defines with the director.
Assign the kernel and ramdisk images to all nodes:
$ openstack overcloud node configure
For more information about registering new nodes, see Section 2.2, “Registering nodes”.
Manually Tagging New Nodes
After you register each node, you must inspect the hardware and tag the node into a specific profile. Use profile tags to match your nodes to flavors, and then assign flavors to deployment roles.
To inspect and tag new nodes, complete the following steps:
Trigger hardware introspection to retrieve the hardware attributes of each node:
$ openstack overcloud node introspect --all-manageable --provide
-
The
--all-manageable option introspects only the nodes that are in a managed state. In this example, all nodes are in a managed state. The
--provide option resets all nodes to an active state after introspection.
Important: Ensure that this process completes successfully. This process usually takes 15 minutes for bare metal nodes.
-
Retrieve a list of your nodes to identify their UUIDs:
$ openstack baremetal node list
Add a profile option to the
properties/capabilities parameter for each node to manually tag a node to a specific profile. The addition of the profile option tags the nodes into each respective profile.
Note: As an alternative to manual tagging, use the Automated Health Check (AHC) Tools to automatically tag larger numbers of nodes based on benchmarking data.
For example, the following commands tag three additional nodes with the
ceph-storage profile:
$ openstack baremetal node set 551d81f5-4df2-4e0f-93da-6c5de0b868f7 --property capabilities="profile:ceph-storage,boot_option:local"
$ openstack baremetal node set 5e735154-bd6b-42dd-9cc2-b6195c4196d7 --property capabilities="profile:ceph-storage,boot_option:local"
$ openstack baremetal node set 1a2b090c-299d-4c20-a25d-57dd21a7085b --property capabilities="profile:ceph-storage,boot_option:local"
If the nodes you just tagged and registered use multiple disks, you can set the director to use a specific root disk on each node. See Section 2.4, “Defining the root disk for multi-disk clusters” for instructions on how to do so.
Re-deploying the Overcloud with Additional Ceph Storage Nodes
After registering and tagging the new nodes, you can now scale up the number of Ceph Storage nodes by re-deploying the overcloud. When you do, set the CephStorageCount parameter in the parameter_defaults of your environment file (in this case, ~/templates/storage-config.yaml). In Section 7.1, “Assigning nodes and flavors to roles”, the overcloud is configured to deploy with 3 Ceph Storage nodes. To scale it up to 6 nodes instead, use:
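The example parameter setting was lost in this copy of the guide. Based on the text, the parameter_defaults entry in ~/templates/storage-config.yaml would be:

```yaml
parameter_defaults:
  CephStorageCount: 6
```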
Upon re-deployment with this setting, the overcloud should now have 6 Ceph Storage nodes instead of 3.
11.2. Scaling down and replacing Ceph Storage nodes
In some cases, you might need to scale down your Ceph cluster, or even replace a Ceph Storage node, for example, if a Ceph Storage node is faulty. In either situation, you must disable and rebalance any Ceph Storage node that you want to remove from the overcloud to avoid data loss.
This procedure uses steps from the Red Hat Ceph Storage Administration Guide to manually remove Ceph Storage nodes. For more in-depth information about manual removal of Ceph Storage nodes, see Starting, stopping, and restarting Ceph daemons that run in containers and Removing a Ceph OSD using the command-line interface.
Procedure
-
Log in to a Controller node as the
heat-admin user. The director’s stack user has an SSH key to access the heat-admin user.
List the OSD tree and find the OSDs for your node. For example, the node you want to remove might contain the following OSDs:
-2 0.09998 host overcloud-cephstorage-0
0  0.04999     osd.0   up  1.00000 1.00000
1  0.04999     osd.1   up  1.00000 1.00000
Disable the OSDs on the Ceph Storage node. In this case, the OSD IDs are 0 and 1.
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd out 0
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd out 1
The Ceph Storage cluster begins rebalancing. Wait for this process to complete. Follow the status by using the following command:
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph -w
After the Ceph cluster completes rebalancing, log in to the Ceph Storage node you are removing, in this case
overcloud-cephstorage-0, as the heat-admin user and stop the node.
[heat-admin@overcloud-cephstorage-0 ~]$ sudo systemctl disable ceph-osd@0
[heat-admin@overcloud-cephstorage-0 ~]$ sudo systemctl disable ceph-osd@1
Stop the OSDs.
[heat-admin@overcloud-cephstorage-0 ~]$ sudo systemctl stop ceph-osd@0
[heat-admin@overcloud-cephstorage-0 ~]$ sudo systemctl stop ceph-osd@1
While logged in to the Controller node, remove the OSDs from the CRUSH map so that they no longer receive data.
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd crush remove osd.0
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd crush remove osd.1
Remove the OSD authentication key.
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph auth del osd.0
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph auth del osd.1
Remove the OSD from the cluster.
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd rm 0
[heat-admin@overcloud-controller-0 ~]$ sudo podman exec ceph-mon-<HOSTNAME> ceph osd rm 1
Leave the node and return to the undercloud as the
stack user.
[heat-admin@overcloud-controller-0 ~]$ exit
[stack@director ~]$
Disable the Ceph Storage node so that director does not reprovision it.
[stack@director ~]$ openstack baremetal node list
[stack@director ~]$ openstack baremetal node maintenance set UUID
Removing a Ceph Storage node requires an update to the
overcloud stack in director with the local template files. First, identify the UUID of the overcloud stack:
$ openstack stack list
Identify the UUID of the Ceph Storage node that you want to delete:
$ openstack server list
Delete the node from the stack and update the plan accordingly:
$ openstack overcloud node delete --stack overcloud <NODE_UUID>
Important: If you passed any extra environment files when you created the overcloud, pass them again here by using the -e option to avoid making undesired changes to the overcloud. For more information, see Modifying the overcloud environment in the Director Installation and Usage guide.
-
Wait until the stack completes its update. Use the
heat stack-list --show-nested command to monitor the stack update.
Add new nodes to the director node pool and deploy them as Ceph Storage nodes. Use the
CephStorageCount parameter in the parameter_defaults of your environment file, in this case, ~/templates/storage-config.yaml, to define the total number of Ceph Storage nodes in the overcloud.
Note: For more information about how to define the number of nodes per role, see Section 7.1, “Assigning nodes and flavors to roles”.
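The example parameter setting was lost in this copy of the guide. For example, to keep three Ceph Storage nodes in the overcloud after the replacement (adjust the count for your deployment):

```yaml
parameter_defaults:
  CephStorageCount: 3
```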
After you update your environment file, redeploy the overcloud:
$ openstack overcloud deploy --templates -e <ENVIRONMENT_FILE>
Director provisions the new node and updates the entire stack with the details of the new node.
Log in to a Controller node as the
heat-admin user and check the status of the Ceph Storage node:
[heat-admin@overcloud-controller-0 ~]$ sudo ceph status
-
Confirm that the value in the
osdmap section matches the number of nodes that you want in your cluster. The Ceph Storage node that you removed is replaced with a new node.
11.3. Adding an OSD to a Ceph Storage node
This procedure demonstrates how to add an OSD to a node. For more information about Ceph OSDs, see Ceph OSDs in the Red Hat Ceph Storage Operations Guide.
Procedure
Notice the following heat template deploys Ceph Storage with three OSD devices:
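The example template was lost in this copy of the guide. The following is a reconstructed sketch; the device names, osd_scenario, and osd_objectstore values are illustrative, and your values come from Section 5.3:

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
    osd_scenario: lvm
    osd_objectstore: bluestore
```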
To add an OSD, update the node disk layout as described in Section 5.3, “Mapping the Ceph Storage node disk layout”. In this example, add /dev/sde to the template.
- Run openstack overcloud deploy to update the overcloud.
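The updated template was lost in this copy of the guide. With /dev/sde added, a CephAnsibleDisksConfig devices list would look like this (a reconstructed sketch; device names and OSD options are examples):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
      - /dev/sde
    osd_scenario: lvm
    osd_objectstore: bluestore
```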
This example assumes that all hosts with OSDs have a new device called /dev/sde. If you do not want all nodes to have the new device, update the heat template. For more information about how to define hosts with a differing devices list, see Section 5.5, “Mapping the disk layout to non-homogeneous Ceph Storage nodes”.
11.4. Removing an OSD from a Ceph Storage node
This procedure demonstrates how to remove an OSD from a node. It assumes the following about the environment:
-
A server (
ceph-storage0) has an OSD (ceph-osd@4) running on /dev/sde. -
The Ceph monitor service (
ceph-mon) is running on controller0.
- There are enough available OSDs to ensure the storage cluster is not at its near-full ratio.
For more information about Ceph OSDs, see Ceph OSDs in the Red Hat Ceph Storage Operations Guide.
Procedure
-
SSH into
ceph-storage0 and log in as root. Disable and stop the OSD service:
[root@ceph-storage0 ~]# systemctl disable ceph-osd@4
[root@ceph-storage0 ~]# systemctl stop ceph-osd@4
Disconnect from
ceph-storage0. -
SSH into
controller0 and log in as root. Identify the name of the Ceph monitor container:
[root@controller0 ~]# podman ps | grep ceph-mon
ceph-mon-controller0
Use the Ceph monitor container to mark the undesired OSD as
out:
[root@controller0 ~]# podman exec ceph-mon-controller0 ceph osd out 4
Note: This command causes Ceph to rebalance the storage cluster and copy data to other OSDs in the cluster. The cluster temporarily leaves the
active+clean state until rebalancing is complete.
Run the following command and wait for the storage cluster state to become active+clean:
[root@controller0 ~]# podman exec ceph-mon-controller0 ceph -w
Remove the OSD from the CRUSH map so that it no longer receives data:
[root@controller0 ~]# podman exec ceph-mon-controller0 ceph osd crush remove osd.4
Remove the OSD authentication key:
[root@controller0 ~]# podman exec ceph-mon-controller0 ceph auth del osd.4
Remove the OSD:
[root@controller0 ~]# podman exec ceph-mon-controller0 ceph osd rm 4
-
Disconnect from
controller0. -
SSH into the undercloud as the
stack user and locate the heat environment file in which you defined the CephAnsibleDisksConfig parameter. Notice the heat template contains four OSDs:
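The example template was lost in this copy of the guide. A reconstructed sketch of a four-OSD template follows; the device names and OSD options are examples, and in this procedure osd.4 corresponds to /dev/sde:

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
      - /dev/sde
    osd_scenario: lvm
    osd_objectstore: bluestore
```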
Modify the template to remove /dev/sde.
Run openstack overcloud deploy to update the overcloud.
Note: This example assumes that you removed the /dev/sde device from all hosts with OSDs. If you do not remove the same device from all nodes, update the heat template. For more information about how to define hosts with a differing devices list, see Section 5.5, “Mapping the disk layout to non-homogeneous Ceph Storage nodes”.
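The modified template was lost in this copy of the guide. After you remove /dev/sde, the devices list would contain only the remaining disks (a reconstructed sketch; device names and OSD options are examples):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
    osd_scenario: lvm
    osd_objectstore: bluestore
```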
Chapter 12. Replacing a failed disk
If one of the disks fails in your Ceph cluster, complete the following procedures to replace it:
- Determining if there is a device name change, see Section 12.1, “Determining if there is a device name change”.
- Ensuring that the OSD is down and destroyed, see Section 12.2, “Ensuring that the OSD is down and destroyed”.
- Removing the old disk from the system and installing the replacement disk, see Section 12.3, “Removing the old disk from the system and installing the replacement disk”.
- Verifying that the disk replacement is successful, see Section 12.4, “Verifying that the disk replacement is successful”.
12.1. Determining if there is a device name change
Before you replace the disk, determine if the replacement disk for the replacement OSD has a different name in the operating system than the device that you want to replace. If the replacement disk has a different name, you must update Ansible parameters for the devices list so that subsequent runs of ceph-ansible, including when director runs ceph-ansible, do not fail as a result of the change. For an example of the devices list that you must change when you use director, see Section 5.3, “Mapping the Ceph Storage node disk layout”.
If the device name changes and you use the following procedures to update your system outside of ceph-ansible or director, there is a risk that the configuration management tools are out of sync with the system that they manage until you update the system definition files and the configuration is reasserted without error.
Persistent naming of storage devices
Storage devices that the sd driver manages might not always have the same name across reboots. For example, a disk that is normally identified by /dev/sdc might be named /dev/sdb. It is also possible for the replacement disk, /dev/sdc, to appear in the operating system as /dev/sdd even if you want to use it as a replacement for /dev/sdc. To address this issue, use names that are persistent and match the following pattern: /dev/disk/by-*. For more information, see Persistent Naming in the Red Hat Enterprise Linux (RHEL) 7 Storage Administration Guide.
Depending on the naming method that you use to deploy Ceph, you might need to update the devices list after you replace the OSD. Use the following list of naming methods to determine if you must change the devices list:
- The major and minor number range method
If you used
sd and want to continue to use it, after you install the new disk, check if the name has changed. If the name did not change, for example, if the same name appears correctly as /dev/sdd, it is not necessary to change the name after you complete the disk replacement procedures.
Important: This naming method is not recommended because there is still a risk that the name becomes inconsistent over time. For more information, see Persistent Naming in the RHEL 7 Storage Administration Guide.
- The
by-pathmethod If you use this method, and you add a replacement disk in the same slot, then the path is consistent and no change is necessary.
ImportantAlthough this naming method is preferable to the major and minor number range method, use caution to ensure that the target numbers do not change. For example, use persistent binding and update the names if a host adapter is moved to a different PCI slot. In addition, there is the possibility that the SCSI host numbers can change if a HBA fails to probe, if drivers are loaded in a different order, or if a new HBA is installed on the system. The
by-pathnaming method also differs between RHEL7 and RHEL8. For more information, see:- Article [What is the difference between "by-path" links created in RHEL8 and RHEL7?] https://access.redhat.com/solutions/5171991
- Overview of persistent naming attributes in the RHEL 8 Managing file systems guide.
- The
by-uuidmethod -
If you use this method, you can use the
blkidutility to set the new disk to have the same UUID as the old disk. For more information, see Persistent Naming in the RHEL 7 Storage Administration Guide. - The
by-idmethod - If you use this method, you must change the devices list because this identifier is a property of the device and the device has been replaced.
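With the by-uuid method, the first step is to read the UUID that the devices list currently references so that it can be reapplied to the replacement disk. A minimal sketch of extracting the UUID field, using a simulated blkid output line (the device name and UUID below are illustrative):

```shell
# Simulated `blkid` output; on a real system, capture it with: line=$(blkid /dev/sdc)
line='/dev/sdc: UUID="0fb0de13-fc8e-44c8-99ea-911e343191d2" TYPE="xfs"'

# Pull out the value of the UUID="..." field
uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID="\([^"]*\)".*/\1/p')
echo "$uuid"
```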
When you add the new disk to the system, if you can modify the persistent naming attributes so that the device name is unchanged (see Persistent Naming in the RHEL 7 Storage Administration Guide), then you do not need to update the devices list and re-run ceph-ansible, or trigger director to re-run ceph-ansible, and you can proceed with the disk replacement procedures. However, you can re-run ceph-ansible to ensure that the change did not result in any inconsistencies.
12.2. Ensuring that the OSD is down and destroyed
On the server that hosts the Ceph Monitor, use the ceph command in the running monitor container to ensure that the OSD that you want to replace is down, and then destroy it.
Procedure
Identify the name of the running Ceph Monitor container and store it in an environment variable called MON:

MON=$(podman ps | grep ceph-mon | awk {'print $1'})

Alias the ceph command so that it executes within the running Ceph Monitor container:

alias ceph="podman exec $MON ceph"

Use the new alias to verify that the OSD that you want to replace is down:

[root@overcloud-controller-0 ~]# ceph osd tree | grep 27
27   hdd 0.04790         osd.27                         down  1.00000 1.00000

Destroy the OSD. The following example command destroys OSD 27:

[root@overcloud-controller-0 ~]# ceph osd destroy 27 --yes-i-really-mean-it
destroyed osd.27
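The container-ID lookup in the first step can be exercised on its own. A minimal sketch with a simulated podman ps line (the container ID and image name are illustrative; on a Ceph Monitor host you would run podman ps itself):

```shell
# Simulated `podman ps` output line for a ceph-mon container
ps_output='f4a0c3bd6ac7  registry.example.com/rhceph/rhceph-4-rhel8:latest  ceph-mon-controller-0'

# Same filter as the procedure: keep the ceph-mon line, take the first column
MON=$(printf '%s\n' "$ps_output" | grep ceph-mon | awk '{print $1}')
echo "$MON"
```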
12.3. Removing the old disk from the system and installing the replacement disk
On the container host with the OSD that you want to replace, remove the old disk from the system and install the replacement disk.
Prerequisites:
- Verify that the device ID has changed. For more information, see Section 12.1, “Determining if there is a device name change”.
The ceph-volume command is present in the Ceph container but is not installed on the overcloud node. Create an alias so that the ceph-volume command runs the ceph-volume binary inside the Ceph container. Then use the ceph-volume command to clean the new disk and add it as an OSD.
Procedure
Ensure that the failed OSD is not running:

systemctl stop ceph-osd@27

Identify the image ID of the Ceph container image and store it in an environment variable called IMG:

IMG=$(podman images | grep ceph | awk {'print $3'})

Alias the ceph-volume command so that it runs inside the $IMG Ceph container, with the ceph-volume entry point and the relevant directories:

alias ceph-volume="podman run --rm --privileged --net=host --ipc=host -v /run/lock/lvm:/run/lock/lvm:z -v /var/run/udev/:/var/run/udev/:z -v /dev:/dev -v /etc/ceph:/etc/ceph:z -v /var/lib/ceph/:/var/lib/ceph/:z -v /var/log/ceph/:/var/log/ceph/:z --entrypoint=ceph-volume $IMG --cluster ceph"

Verify that the aliased command runs successfully:

ceph-volume lvm list

Check that your new OSD device is not already part of LVM. Use the pvdisplay command to inspect the device, and ensure that the VG Name field is empty. Replace <NEW_DEVICE> with the /dev/* path of your new OSD device:

pvdisplay <NEW_DEVICE>

If the VG Name field is not empty, the device belongs to a volume group that you must remove.

If the device belongs to a volume group, use the lvdisplay command to check whether there is a logical volume in the volume group. Replace <VOLUME_GROUP> with the value of the VG Name field that you retrieved from the pvdisplay command:

[root@overcloud-computehci-2 ~]# lvdisplay | grep <VOLUME_GROUP>
  LV Path                /dev/ceph-0fb0de13-fc8e-44c8-99ea-911e343191d2/osd-data-a0810722-7673-43c7-8511-2fd9db1dbbc6
  VG Name                ceph-0fb0de13-fc8e-44c8-99ea-911e343191d2

If the LV Path field is not empty, the device contains a logical volume that you must remove.

If the new device is part of a logical volume or volume group, remove the logical volume, the volume group, and the device association as a physical volume within the LVM system.

- Replace <LV_PATH> with the value of the LV Path field.
- Replace <VOLUME_GROUP> with the value of the VG Name field.
- Replace <NEW_DEVICE> with the /dev/* path of your new OSD device.

[root@overcloud-computehci-2 ~]# lvremove --force <LV_PATH>
  Logical volume "osd-data-a0810722-7673-43c7-8511-2fd9db1dbbc6" successfully removed
[root@overcloud-computehci-2 ~]# vgremove --force <VOLUME_GROUP>
  Volume group "ceph-0fb0de13-fc8e-44c8-99ea-911e343191d2" successfully removed
[root@overcloud-computehci-2 ~]# pvremove <NEW_DEVICE>
  Labels on physical volume "/dev/sdj" successfully wiped.

Ensure that the new OSD device is clean. In the following example, the device is /dev/sdj:

ceph-volume lvm zap /dev/sdj

Create the new OSD with the existing OSD ID by using the new device, but pass --no-systemd so that ceph-volume does not attempt to start the OSD, which is not possible from within the container:

ceph-volume lvm create --osd-id 27 --data /dev/sdj --no-systemd

Start the OSD outside of the container:

systemctl start ceph-osd@27
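The VG Name inspection earlier in this procedure can be scripted. A minimal sketch using simulated pvdisplay output (the device and volume-group names are illustrative; on a real node you would capture the output with out=$(pvdisplay /dev/sdj)):

```shell
# Simulated `pvdisplay <NEW_DEVICE>` output for a disk that still belongs to a volume group
out='  --- Physical volume ---
  PV Name               /dev/sdj
  VG Name               ceph-0fb0de13-fc8e-44c8-99ea-911e343191d2'

# An empty VG Name means the device does not belong to any volume group
vg=$(printf '%s\n' "$out" | awk '/VG Name/ {print $3}')
if [ -n "$vg" ]; then
  echo "device belongs to volume group: $vg"
else
  echo "device is clean"
fi
```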
12.4. Verifying that the disk replacement is successful
To verify that the disk replacement is successful, complete the following steps on the undercloud.
Procedure
- Check whether the device name changed, and update the devices list according to the naming method that you used to deploy Ceph. For more information, see Section 12.1, “Determining if there is a device name change”.
- To ensure that the change did not introduce any inconsistencies, re-run the overcloud deploy command to perform a stack update.
In cases where you have hosts that have different device lists, you might have to define an exception. For example, you might use the following example heat environment file to deploy a node with three OSD devices.
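Such an environment file might look like the following sketch (the device names and BlueStore settings are illustrative, not taken from this guide):

```yaml
parameter_defaults:
  CephAnsibleDisksConfig:
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
```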
The CephAnsibleDisksConfig parameter applies to all nodes that host OSDs, so you cannot update the devices parameter with the new device list. Instead, you must define an exception for the new host that has a different device list. For more information about defining an exception, see Section 5.5, “Mapping the disk layout to non-homogeneous Ceph Storage nodes”.
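One way to express such an exception, sketched here on the assumption that the NodeDataLookup parameter is used and keyed by the machine UUID of the replaced host (the UUID and device paths below are illustrative):

```yaml
parameter_defaults:
  NodeDataLookup:
    # System UUID of the host whose disk was replaced (illustrative);
    # obtain it with `dmidecode -s system-uuid` on that node
    "32E87B4C-C4A7-418E-865B-191684A6883B":
      devices:
        - /dev/sdb
        - /dev/sdc
        - /dev/disk/by-id/wwn-0x5000c500a1b2c3d4
```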
Appendix A. Sample environment file: creating a Ceph Storage cluster
The following custom environment file uses many of the options described throughout Chapter 2, Preparing overcloud nodes. This sample does not include any commented-out options. For an overview on environment files, see Environment Files (from the Advanced Overcloud Customization guide).
/home/stack/templates/storage-config.yaml
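A condensed sketch of such an environment file, limited to the parameters that the callouts below describe (all values are illustrative, and the full sample contains more entries):

```yaml
parameter_defaults:                          # 1
  CinderBackupBackend: ceph                  # 2 (set to swift to back up to ceph-rgw)
  CephAnsibleDisksConfig:                    # 3
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
  ControllerCount: 3                         # 4
  OvercloudControlFlavor: control
  ComputeCount: 3
  OvercloudComputeFlavor: compute
  CephStorageCount: 3
  OvercloudCephStorageFlavor: ceph-storage
  NeutronNetworkType: vxlan                  # 5
```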
1. The parameter_defaults section modifies the default values for parameters in all templates. Most of the entries listed here are described in Chapter 4, Customizing the Storage service.
2. If you are deploying the Ceph Object Gateway, you can use Ceph Object Storage (ceph-rgw) as a backup target. To configure this, set CinderBackupBackend to swift. See Section 4.2, “Enabling the Ceph Object Gateway” for details.
3. The CephAnsibleDisksConfig section defines a custom disk layout for deployments using BlueStore.
4. For each role, the *Count parameters assign a number of nodes, while the Overcloud*Flavor parameters assign a flavor. For example, ControllerCount: 3 assigns 3 nodes to the Controller role, and OvercloudControlFlavor: control sets each of those roles to use the control flavor. See Section 7.1, “Assigning nodes and flavors to roles” for details.

   Note: The CephMonCount, CephMdsCount, OvercloudCephMonFlavor, and OvercloudCephMdsFlavor parameters (along with the ceph-mon and ceph-mds flavors) are only valid if you created custom CephMon and CephMds roles, as described in Chapter 3, Deploying Ceph services on dedicated nodes.
5. NeutronNetworkType sets the network type that the neutron service uses (in this case, vxlan).
Appendix B. Sample custom interface template: multiple bonded interfaces
The following template is a customized version of /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml. It features multiple bonded interfaces to isolate back-end and front-end storage network traffic, along with redundancy for both connections. For more information, see Section 4.3, “Configuring multiple bonded interfaces for Ceph nodes”. It also uses custom bonding options ('mode=4 lacp_rate=1'); see Section 4.3.1, “Configuring bonding module directives”.
/usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/ceph-storage.yaml (custom)
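A condensed sketch of the network_config fragment such a template might contain, assuming os-net-config syntax with two OVS bonds (the NIC names are illustrative and must match your own hardware):

```yaml
              # Front-end storage traffic (Storage network) on the first bond
              - type: ovs_bond
                name: bond1
                bonding_options: 'mode=4 lacp_rate=1'
                members:
                  - type: interface
                    name: nic2
                  - type: interface
                    name: nic3
              - type: vlan
                device: bond1
                vlan_id:
                  get_param: StorageNetworkVlanID
                addresses:
                  - ip_netmask:
                      get_param: StorageIpSubnet
              # Back-end replication traffic (Storage Management network) on a second bond
              - type: ovs_bond
                name: bond2
                bonding_options: 'mode=4 lacp_rate=1'
                members:
                  - type: interface
                    name: nic4
                  - type: interface
                    name: nic5
              - type: vlan
                device: bond2
                vlan_id:
                  get_param: StorageMgmtNetworkVlanID
                addresses:
                  - ip_netmask:
                      get_param: StorageMgmtIpSubnet
```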