Chapter 6. Migrating the Object Storage service to Red Hat OpenStack Services on OpenShift nodes
If you are using the Red Hat OpenStack Platform Object Storage service (swift) as an Object Storage service, you must migrate your Object Storage service to Red Hat OpenStack Services on OpenShift nodes.
If you are using the Object Storage API of the Ceph Object Gateway (RGW), you can skip this chapter and migrate your Red Hat Ceph Storage cluster. For more information, see Migrate the Red Hat Ceph Storage cluster. If you are not planning to migrate Ceph daemons from Controller nodes, you must perform the steps that are described in Deploying a Ceph ingress daemon and Create or update the Object Storage service endpoints.
The data migration happens replica by replica. For example, if you have 3 replicas, move them one at a time to ensure that the other 2 replicas are still operational, which enables you to continue to use the Object Storage service during the migration.
Data migration to the new deployment is a long-running process that executes mostly in the background. The Object Storage service replicators move data from old to new nodes, which might take a long time depending on the amount of storage used. To reduce downtime, you can use the old nodes if they are running and continue with adopting other services while waiting for the migration to complete. Performance might be degraded due to the amount of replication traffic in the network.
6.1. Migrating the Object Storage service data from RHOSP to RHOSO nodes
The Object Storage service (swift) migration involves the following steps:
- Add new nodes to the Object Storage service rings.
- Set weights of existing nodes to 0.
- Rebalance rings by moving one replica.
- Copy rings to old nodes and restart services.
- Check replication status and repeat the previous two steps until the old nodes are drained.
- Remove the old nodes from the rings.
Prerequisites
- Adopt the Object Storage service. For more information, see Adopting the Object Storage service.
- Ensure that all existing nodes can resolve the hostnames of the Red Hat OpenShift Container Platform (RHOCP) pods, for example, by using the external IP of the DNSMasq service as the nameserver in /etc/resolv.conf:
$ oc get service dnsmasq-dns -o jsonpath="{.status.loadBalancer.ingress[0].ip}" | $CONTROLLER1_SSH sudo tee /etc/resolv.conf
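To verify that name resolution works from an existing node, you can query one of the new storage pod hostnames over the same SSH alias. This is an optional check; the swift-storage-0.swift-storage.openstack.svc hostname shown here is the pod hostname used later in this chapter and might differ in your environment:
$CONTROLLER1_SSH "getent hosts swift-storage-0.swift-storage.openstack.svc"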
- Track the current status of the replication by using the swift-dispersion tool:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-populate'
The command might need a few minutes to complete. It creates 0-byte objects that are distributed across the Object Storage service deployment, and you can use the swift-dispersion-report tool afterward to show the current replication status:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report'
The output of the swift-dispersion-report command looks similar to the following representative example.
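The partition and copy counts are illustrative and depend on your ring configuration:
Queried 41 containers for dispersion reporting, 1s, 0 retries
100.00% of container copies found (123 of 123)
Sample represents 1.00% of the container partition space
Queried 41 objects for dispersion reporting, 2s, 0 retries
There were 41 partitions missing 0 copies.
100.00% of object copies found (123 of 123)
Sample represents 1.00% of the object partition space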
Procedure
Add new nodes by scaling up the SwiftStorage resource from 0 to 3:
$ oc patch openstackcontrolplane openstack --type=merge -p='{"spec":{"swift":{"template":{"swiftStorage":{"replicas": 3}}}}}'
This command creates three storage instances on the Red Hat OpenShift Container Platform (RHOCP) cluster that use Persistent Volume Claims.
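Optionally, confirm that the backing Persistent Volume Claims were created and bound. The claim names depend on your storage class and deployment, so the following grep pattern is only a sketch:
$ oc get pvc | grep swift-storage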
Wait until all three pods are running and the rings include the new devices:
$ oc wait pods --for condition=Ready -l component=swift-storage
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search --device pv'
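You can also print the full ring summary to confirm the replica count and the weights assigned to the new pv devices. This optional check follows the same swift-ring-tool pattern as the other commands in this procedure:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder'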
From the current rings, get the storage management IP addresses of the nodes to drain:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search _' | tail -n +2 | awk '{print $4}' | sort -u
The output looks similar to the following:
172.20.0.100
swift-storage-0.swift-storage.openstack.svc
swift-storage-1.swift-storage.openstack.svc
swift-storage-2.swift-storage.openstack.svc
Drain the old nodes. In the following example, the old node 172.20.0.100 is drained:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
swift-ring-tool get
swift-ring-tool drain 172.20.0.100
swift-ring-tool rebalance
swift-ring-tool push'
Depending on your deployment, you might have more nodes to include in the command, as in the following example.
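The following is a minimal sketch for draining several old nodes in one pass; the addresses 172.20.0.101 and 172.20.0.102 are hypothetical placeholders for your remaining node IP addresses:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
swift-ring-tool get
swift-ring-tool drain 172.20.0.100
swift-ring-tool drain 172.20.0.101
swift-ring-tool drain 172.20.0.102
swift-ring-tool rebalance
swift-ring-tool push'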
Copy and apply the updated rings to the original nodes. Run the ssh commands for your existing nodes that store the Object Storage service data:
$ oc extract --confirm cm/swift-ring-files
$CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
$CONTROLLER1_SSH "systemctl restart tripleo_swift_*"
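If more than one existing node stores Object Storage service data, repeat the tar and restart commands for each of them. The following is a minimal sketch that assumes a second Controller node reachable through a $CONTROLLER2_SSH alias defined in the same way as $CONTROLLER1_SSH:
$CONTROLLER2_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
$CONTROLLER2_SSH "systemctl restart tripleo_swift_*"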
Track the replication progress by using the swift-dispersion-report tool:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c "swift-ring-tool get && swift-dispersion-report"
The output shows less than 100% of copies found. Repeat the command until all container and object copies are found.
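While data is still moving, the report shows partial figures. The following excerpt is illustrative; the exact partition and copy counts depend on your ring and data:
Queried 41 objects for dispersion reporting, 5s, 0 retries
There were 5 partitions missing 1 copy.
95.93% of object copies found (118 of 123)
Sample represents 1.00% of the object partition space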
Move the next replica to the new nodes by rebalancing and distributing the rings.
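A minimal sketch for this step, following the same swift-ring-tool pattern as the drain and remove commands in this procedure:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
swift-ring-tool get
swift-ring-tool rebalance
swift-ring-tool push'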
Monitor the swift-dispersion-report output again, wait until all copies are found, and then repeat this step until all your replicas are moved to the new nodes.
Remove the old nodes from the rings:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
swift-ring-tool get
swift-ring-tool remove 172.20.0.100
swift-ring-tool rebalance
swift-ring-tool push'
Note: Even if all replicas are on the new nodes and the swift-dispersion-report command reports 100% of the copies found, there might still be data on the old nodes. The replicators remove this data, but it might take more time.
Verification
Check the disk usage of all disks in the cluster:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -d'
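A representative excerpt of the swift-recon -d output; the host count and percentages are illustrative:
===============================================================================
--> Starting reconnaissance on 6 hosts (object)
===============================================================================
[2024-01-01 00:00:00] Checking disk usage now
Disk usage: lowest: 0.1%, highest: 2.3%, avg: 1.2%
===============================================================================
As the migration progresses, usage on the old disks decreases while usage on the new persistent volumes grows.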
Confirm that there are no more *.db or *.data files in the /srv/node directory on the old nodes:
$CONTROLLER1_SSH "find /srv/node/ -type f -name '*.db' -o -name '*.data' | wc -l"
When the migration is complete and the replicators have removed the remaining data, this command returns 0.
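Optionally, watch the space that the replicators free on the old nodes. This sketch uses the same SSH alias as the previous command:
$CONTROLLER1_SSH "du -sh /srv/node/"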
6.2. Troubleshooting the Object Storage service migration
You can troubleshoot issues with the Object Storage service (swift) migration.
If the replication is not working and the swift-dispersion-report output is not back to 100% availability, check the replicator progress to help you debug:
$CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-server
Increasing replication activity in the output indicates that the migration is progressing.
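You can also filter for the object-replicator to see when each replication pass finishes; this sketch uses the same log file as the previous command:
$CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-replicator
Look for messages that report a completed replication pass and its duration.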
You can also check the ring consistency and replicator status:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -r --md5'
The output might show an md5 mismatch until approximately two minutes after pushing the new rings. After that, the output looks similar to the following representative example.
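The host count and statistics in this excerpt are illustrative:
===============================================================================
--> Starting reconnaissance on 6 hosts (object)
===============================================================================
[2024-01-01 00:05:00] Checking on replication
[replication_time] low: 2, high: 49, avg: 28.8, total: 172, Failed: 0.0%, no_result: 0, reported: 6
===============================================================================
[2024-01-01 00:05:00] Checking ring md5sums
6/6 hosts matched, 0 error[s] while checking hosts.
===============================================================================
The key line is the ring md5sums check: all hosts must match after the updated rings are distributed.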