Chapter 10. Multicloud Object Gateway bucket replication
Data replication from one Multicloud Object Gateway (MCG) bucket to another MCG bucket provides higher resiliency and better collaboration options. These buckets can be either data buckets or namespace buckets backed by any supported storage solution (AWS S3, Azure, and so on).
A replication policy is composed of a list of replication rules. Each rule defines the destination bucket, and can specify a filter based on an object key prefix. Configuring a complementing replication policy on the second bucket results in bidirectional replication.
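The rule structure can be pictured with a short Python sketch. It is only an illustration: MCG evaluates rules internally, the bucket names `bucket-a` and `bucket-b` and the helper `rules_matching` are hypothetical, and the sketch assumes that a missing or empty prefix matches every object key.

```python
# Illustrative model only: MCG evaluates replication rules internally.
def rules_matching(policy, object_key):
    """Return the rules whose optional prefix filter matches object_key."""
    matched = []
    for rule in policy:
        # Assumption: a missing or empty prefix matches every object key.
        prefix = rule.get("filter", {}).get("prefix", "")
        if object_key.startswith(prefix):
            matched.append(rule)
    return matched


# Hypothetical bucket names. Attaching complementing policies to both
# buckets is what the text above calls bidirectional replication.
policy_on_bucket_a = [
    {"rule_id": "a-to-b", "destination_bucket": "bucket-b", "filter": {"prefix": "repl"}},
]
policy_on_bucket_b = [
    {"rule_id": "b-to-a", "destination_bucket": "bucket-a", "filter": {"prefix": "repl"}},
]

for key in ("repl/report.csv", "tmp/scratch.bin"):
    targets = [r["destination_bucket"] for r in rules_matching(policy_on_bucket_a, key)]
    print(key, "->", targets)
```

Only keys that start with `repl` are selected for the destination bucket in this model; the second key matches no rule.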
Prerequisites
- A running OpenShift Data Foundation Platform.
- Download the Multicloud Object Gateway (MCG) command-line interface binary from the customer portal and make it executable.
  Note: Choose the correct product variant according to your architecture. Available platforms are Linux (x86_64), Windows, and Mac OS.
To replicate a bucket, see Replicating a bucket to another bucket.
To set a bucket class replication policy, see Setting a bucket class replication policy.
10.1. Replicating a bucket to another bucket
You can set the bucket replication policy in two ways: using the MCG command-line interface or using a YAML.
10.1.1. Replicating a bucket to another bucket using the MCG command-line interface
You can set a replication policy for a Multicloud Object Gateway (MCG) data bucket when you create the object bucket claim (OBC). You must define the replication policy parameter in a JSON file.
Procedure
From the MCG command-line interface, run the following command to create an OBC with a specific replication policy:
odf noobaa obc create <bucket-claim-name> -n openshift-storage --replication-policy /path/to/json-file.json
<bucket-claim-name> - Specify the name of the bucket claim.
/path/to/json-file.json - Is the path to a JSON file which defines the replication policy.
Example JSON file:
[{ "rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "repl"}}]
"prefix" - Is optional. It is the prefix of the object keys that should be replicated, and you can even leave it empty, for example, {"prefix": ""}.
For example:
odf noobaa obc create my-bucket-claim -n openshift-storage --replication-policy /path/to/json-file.json
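The JSON file can also be generated rather than written by hand. The following Python sketch is one possible way to do that; the helper names `render_policy` and `write_policy` are hypothetical, and the check that every rule carries `rule_id` and `destination_bucket` is an assumption based on the example above, not a documented validation rule.

```python
import json
from pathlib import Path


def render_policy(rules):
    """Serialize a list of replication rules in the list form shown above."""
    for rule in rules:
        # Assumption: rule_id and destination_bucket are needed in every
        # rule, because every example carries them. The filter is optional.
        if not rule.get("rule_id") or not rule.get("destination_bucket"):
            raise ValueError("each rule needs rule_id and destination_bucket")
    return json.dumps(rules, indent=2)


def write_policy(path, rules):
    """Write the policy to the file that is passed to --replication-policy."""
    Path(path).write_text(render_policy(rules) + "\n")


rules = [
    {"rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "repl"}},
]
print(render_policy(rules))
```

Calling `write_policy("/path/to/json-file.json", rules)` would then produce the file that the command above expects.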
10.1.2. Replicating a bucket to another bucket using a YAML
You can set a replication policy for a Multicloud Object Gateway (MCG) data bucket when you create the object bucket claim (OBC), or you can edit the YAML later. You must provide the policy as a JSON-compliant string that adheres to the format shown in the following procedure.
Procedure
Apply the following YAML:
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: <desired-bucket-claim>
  namespace: <desired-namespace>
spec:
  generateBucketName: <desired-bucket-name>
  storageClassName: openshift-storage.noobaa.io
  additionalConfig:
    replicationPolicy: {"rules": [{ "rule_id": "", "destination_bucket": "", "filter": {"prefix": ""}}]}
<desired-bucket-claim> - Specify the name of the bucket claim.
<desired-namespace> - Specify the namespace.
<desired-bucket-name> - Specify the prefix of the bucket name.
"rule_id" - Specify the ID number of the rule, for example, {"rule_id": "rule-1"}.
"destination_bucket" - Specify the name of the destination bucket, for example, {"destination_bucket": "first.bucket"}.
"prefix" - Is optional. It is the prefix of the object keys that should be replicated, and you can even leave it empty, for example, {"prefix": ""}.
Additional information
- For more information about OBCs, see Object Bucket Claim.
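Because the policy has to be a JSON-compliant string, it can help to let a script do the serialization. The following Python sketch builds the same OBC manifest as a dictionary and embeds the policy as a string. The function name `obc_manifest` and the claim and bucket names are hypothetical. The output is JSON, which `oc apply -f` accepts in place of YAML.

```python
import json


def obc_manifest(claim_name, namespace, bucket_prefix, rules):
    """Build an ObjectBucketClaim manifest with an embedded replication policy."""
    return {
        "apiVersion": "objectbucket.io/v1alpha1",
        "kind": "ObjectBucketClaim",
        "metadata": {"name": claim_name, "namespace": namespace},
        "spec": {
            "generateBucketName": bucket_prefix,
            "storageClassName": "openshift-storage.noobaa.io",
            "additionalConfig": {
                # The policy is serialized to a JSON string, following the
                # "JSON-compliant string" requirement stated above.
                "replicationPolicy": json.dumps({"rules": rules}),
            },
        },
    }


manifest = obc_manifest(
    "my-bucket-claim",   # hypothetical claim name
    "openshift-storage",
    "my-bucket",         # hypothetical bucket name prefix
    [{"rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "repl"}}],
)
print(json.dumps(manifest, indent=2))
```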
10.2. Setting a bucket class replication policy
You can set up a replication policy that automatically applies to all the buckets created under a certain bucket class. You can do this in two ways: using the MCG command-line interface or using a YAML.
10.2.1. Setting a bucket class replication policy using the MCG command-line interface
You can set a replication policy for a Multicloud Object Gateway (MCG) data bucket when you create the bucket class. You must define the replication-policy parameter in a JSON file. You can set a bucket class replication policy for the Placement and Namespace bucket classes.
Procedure
From the MCG command-line interface, run the following command:
odf noobaa -n openshift-storage bucketclass create placement-bucketclass <bucketclass-name> --backingstores <backingstores> --replication-policy=/path/to/json-file.json
<bucketclass-name> - Specify the name of the bucket class.
<backingstores> - Specify the name of a backingstore. You can pass many backingstores separated by commas.
/path/to/json-file.json - Is the path to a JSON file which defines the replication policy.
Example JSON file:
[{ "rule_id": "rule-1", "destination_bucket": "first.bucket", "filter": {"prefix": "repl"}}]
"prefix" - Is optional. It is the prefix of the object keys that should be replicated. You can leave it empty, for example, {"prefix": ""}.
For example:
odf noobaa -n openshift-storage bucketclass create placement-bucketclass bc --backingstores azure-blob-ns --replication-policy=/path/to/json-file.json
This example creates a placement bucket class with a specific replication policy defined in the JSON file.
10.2.2. Setting a bucket class replication policy using a YAML
You can set a replication policy for a Multicloud Object Gateway (MCG) data bucket when you create the bucket class, or you can edit its YAML later. You must provide the policy as a JSON-compliant string that adheres to the format shown in the following procedure.
Procedure
Apply the following YAML:
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    app: <desired-app-label>
  name: <desired-bucketclass-name>
  namespace: <desired-namespace>
spec:
  placementPolicy:
    tiers:
    - backingstores:
      - <backingstore>
      placement: Spread
  replicationPolicy: [{ "rule_id": "<rule id>", "destination_bucket": "first.bucket", "filter": {"prefix": "<object name prefix>"}}]
This YAML is an example that creates a placement bucket class. Each object bucket claim (OBC) object that is uploaded to the bucket is filtered based on the prefix and is replicated to first.bucket.
<desired-app-label> - Specify a label for the app.
<desired-bucketclass-name> - Specify the bucket class name.
<desired-namespace> - Specify the namespace in which the bucket class gets created.
<backingstore> - Specify the name of a backingstore. You can pass many backingstores.
"rule_id" - Specify the ID number of the rule, for example, {"rule_id": "rule-1"}.
"destination_bucket" - Specify the name of the destination bucket, for example, {"destination_bucket": "first.bucket"}.
"prefix" - Is optional. It is the prefix of the object keys that should be replicated. You can leave it empty, for example, {"prefix": ""}.
10.3. Enabling log based bucket replication
When creating a bucket replication policy, you can use logs so that recent data is replicated more quickly, while the default scan-based replication works on replicating the rest of the data.
This feature requires setting up bucket logs on AWS or Azure. For more information about setting up AWS logs, see Enabling Amazon S3 server access logging. The AWS logs bucket needs to be created in the same region as the source NamespaceStore AWS bucket.
This feature is only supported in buckets that are backed by a NamespaceStore. Buckets backed by BackingStores cannot utilize log-based replication.
10.3.1. Enabling log based bucket replication for new namespace buckets using OpenShift Web Console in Amazon Web Service environment
You can optimize replication by using the event logs of the Amazon Web Services (AWS) cloud environment. You enable log based bucket replication for new namespace buckets in the web console when you create the namespace buckets.
Prerequisites
- Ensure that object logging is enabled in AWS. For more information, see the “Using the S3 console” section in Enabling Amazon S3 server access logging.
- Administrator access to OpenShift Web Console.
Procedure
- In the OpenShift Web Console, navigate to Storage → Object Storage → Object Bucket Claims.
- Click Create ObjectBucketClaim.
- Enter the name of ObjectBucketName and select StorageClass and BucketClass.
- Select the Enable replication check box to enable replication.
- In the Replication policy section, select the Optimize replication using event logs checkbox.
- Enter the name of the bucket that will contain the logs under Event log Bucket.
  If the logs are not stored in the root of the bucket, provide the full path without s3://
- Enter a prefix to replicate only the objects whose name begins with the given prefix.
10.3.2. Enabling log based bucket replication for existing namespace buckets using YAML
You can enable log based bucket replication for existing buckets that were created using the command line interface or by applying a YAML, but not for buckets that were created using AWS S3 commands.
Procedure
Edit the YAML of the bucket’s OBC to enable log based bucket replication. Add the following under spec:
replicationPolicy: '{"rules":[{"rule_id":"<RULE ID>", "destination_bucket":"<DEST>", "filter": {"prefix": "<PREFIX>"}}], "log_replication_info": {"logs_location": {"logs_bucket": "<LOGS_BUCKET>"}}}'
It is also possible to add this to the YAML of an OBC before it is created.
rule_id - Specify an ID of your choice for identifying the rule.
destination_bucket - Specify the name of the target MCG bucket that the objects are copied to.
(optional) {"filter": {"prefix": <>}} - Specify a prefix string that you can set to filter the objects that are replicated.
log_replication_info - Specify an object that contains data related to log-based replication optimization. {"logs_location": {"logs_bucket": <>}} is set to the location of the AWS S3 server access logs.
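The single-line policy string is easy to mistype, so it can be produced programmatically. The following Python sketch assembles the fields described above; the function name `log_based_policy` and the bucket names are hypothetical, and the output is meant to be pasted as the value of replicationPolicy.

```python
import json


def log_based_policy(rule_id, destination_bucket, logs_bucket, prefix=None):
    """Return the replicationPolicy string for log-based replication on AWS."""
    rule = {"rule_id": rule_id, "destination_bucket": destination_bucket}
    if prefix is not None:
        # The filter is optional, so it is only emitted when a prefix is given.
        rule["filter"] = {"prefix": prefix}
    policy = {
        "rules": [rule],
        "log_replication_info": {"logs_location": {"logs_bucket": logs_bucket}},
    }
    return json.dumps(policy)


# Hypothetical bucket names for illustration.
print(log_based_policy("rule-1", "target-bucket", "access-logs-bucket", prefix="repl"))
```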
10.3.3. Enabling log based bucket replication in Microsoft Azure
Prerequisites
Refer to Microsoft Azure documentation and ensure that you have completed the following tasks in the Microsoft Azure portal:
- Ensure that you have created a new application and noted down the name, application (client) ID, and directory (tenant) ID.
  For information, see Register an application.
- Ensure that a new client secret is created and the application secret is noted down.
- Ensure that a new Log Analytics workspace is created and its name and workspace ID are noted down.
  For information, see Create a Log Analytics workspace.
- Ensure that the Reader role is assigned under Access control, members are selected, and the name of the application that you registered in the previous step is provided.
  For more information, see Assign Azure roles using the Azure portal.
- Ensure that a new storage account is created and the Access keys are noted down.
- In the Monitoring section of the storage account created, select a blob and in the Diagnostic settings screen, select only StorageWrite and StorageDelete, and in the destination details add the Log Analytics workspace that you created earlier.
  For more information, see Diagnostic settings in Azure Monitor.
- Ensure that two new containers for object source and object destination are created.
- Administrator access to OpenShift Web Console.
Procedure
- Create a secret with credentials to be used by the namespacestores.
  apiVersion: v1
  kind: Secret
  metadata:
    name: <namespacestore-secret-name>
  type: Opaque
  data:
    TenantID: <AZURE TENANT ID ENCODED IN BASE64>
    ApplicationID: <AZURE APPLICATION ID ENCODED IN BASE64>
    ApplicationSecret: <AZURE APPLICATION SECRET ENCODED IN BASE64>
    LogsAnalyticsWorkspaceID: <AZURE LOG ANALYTICS WORKSPACE ID ENCODED IN BASE64>
    AccountName: <AZURE ACCOUNT NAME ENCODED IN BASE64>
    AccountKey: <AZURE ACCOUNT KEY ENCODED IN BASE64>
- Create a NamespaceStore backed by a container created in Azure.
  For more information, see Adding a namespace bucket using the OpenShift Container Platform user interface.
- Create a new Namespace-Bucketclass and OBC that utilizes it.
- Check the object bucket name by looking in the YAML of target OBC, or by listing all S3 buckets, for example, s3 ls.
- Use the following template to apply an Azure replication policy on your source OBC by adding the following in its YAML, under .spec:
  replicationPolicy: '{"rules":[ {"rule_id":"ID goes here", "sync_deletions": "<true or false>", "destination_bucket":"object bucket name"} ], "log_replication_info":{"endpoint_type":"AZURE"}}'
  sync_deletions - Specify a boolean value, true or false.
  destination_bucket - Make sure to use the name of the object bucket, and not the claim. The name can be retrieved using the s3 ls command, or by looking for the value in an OBC’s YAML.
Verification steps
- Write objects to the source bucket.
- Wait until MCG replicates them.
- Delete the objects from the source bucket.
- Verify the objects were removed from the target bucket.
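The secret in the procedure above expects every value under data to be Base64-encoded. The following Python sketch shows one way to encode the values before pasting them into the YAML; the helper name `encode_secret_data` and the sample values are placeholders, not real credentials.

```python
import base64


def encode_secret_data(values):
    """Base64-encode each value for the data section of a Secret."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


# Placeholder values only; substitute the IDs and keys noted in the prerequisites.
data = encode_secret_data({
    "TenantID": "example-tenant-id",
    "AccountName": "examplestorageaccount",
})
for key, encoded in data.items():
    print(f"{key}: {encoded}")
```

The same function can be applied to the remaining fields (ApplicationID, ApplicationSecret, LogsAnalyticsWorkspaceID, and AccountKey).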
10.3.4. Enabling log-based bucket replication deletion
Prerequisites
- Administrator access to OpenShift Web Console.
- AWS Server Access Logging configured for the desired bucket.
Procedure
- In the OpenShift Web Console, navigate to Storage → Object Storage → Object Bucket Claims.
- Click Create new Object bucket claim.
- (Optional) In the Replication rules section, select the Sync deletion checkbox for each rule separately.
- Enter the name of the bucket that will contain the logs under Event log Bucket.
  If the logs are not stored in the root of the bucket, provide the full path without s3://
- Enter a prefix to replicate only the objects whose name begins with the given prefix.
10.4. Bucket logging for Multicloud Object Gateway
Bucket logging helps you to record the S3 operations that are performed against the Multicloud Object Gateway (MCG) bucket for compliance, auditing, and optimization purposes.
Bucket logging supports the following two options:
- Best-effort - Bucket logging is recorded using UDP on a best-effort basis.
- Guaranteed - Bucket logging with this option creates a PVC attached to the MCG pods and saves the logs to this PVC on a guaranteed basis, and then from the PVC to the log buckets. With this option, logging takes place twice for every S3 operation:
  - At the start of processing the request
  - At the end with the result of the S3 operation
10.4.1. Enabling bucket logging for Multicloud Object Gateway using the Best-effort option
Prerequisites
- OpenShift Container Platform with OpenShift Data Foundation operator installed.
- Access to MCG.
  For information, see Accessing the Multicloud Object Gateway with your applications.
Procedure
Create a data bucket where you can upload the objects.
odf nb bucket create data.bucket
Create a log bucket where you want to store the logs for bucket operations by using the following command:
odf nb bucket create log.bucket
Configure bucket logging on data bucket with log bucket in one of the following ways:
- Using the NooBaa API
  odf nb api bucket_api put_bucket_logging '{ "name": "data.bucket", "log_bucket": "log.bucket", "log_prefix": "data-bucket-logs" }'
- Using the S3 API
  alias s3api_alias='AWS_ACCESS_KEY_ID=$NOOBAA_ACCESS_KEY AWS_SECRET_ACCESS_KEY=$NOOBAA_SECRET_KEY aws --endpoint https://localhost:10443 --no-verify-ssl s3api'
  Create a file called setlogging.json in the following format:
  { "LoggingEnabled": { "TargetBucket": "<log-bucket-name>", "TargetPrefix": "<prefix/empty-string>" } }
  Run the following command:
  s3api_alias put-bucket-logging --endpoint <ep> --bucket <source-bucket> --bucket-logging-status file://setlogging.json --no-verify-ssl
Verify if the bucket logging is set for the data bucket in one of the following ways:
- Using the NooBaa API
  odf nb api bucket_api get_bucket_logging '{ "name": "data.bucket" }'
- Using the S3 API
  s3api_alias get-bucket-logging --no-verify-ssl --endpoint <ep> --bucket <source-bucket>
The S3 operations can take up to 24 hours to get recorded in the logs bucket. The following example shows the recorded logs and how to download them:
Example
s3_alias cp s3://logs.bucket/data-bucket-logs/logs.bucket.bucket_data-bucket-logs_1719230150.log - | tail -n 2
Jun 24 14:00:02 10-XXX-X-XXX.sts.openshift-storage.svc.cluster.local {"noobaa_bucket_logging":"true","op":"GET","bucket_owner":"operator@noobaa.io","source_bucket":"data.bucket","object_key":"/data.bucket?list-type=2&prefix=data-bucket-logs&delimiter=%2F&encoding-type=url","log_bucket":"logs.bucket","remote_ip":"100.XX.X.X","request_uri":"/data.bucket?list-type=2&prefix=data-bucket-logs&delimiter=%2F&encoding-type=url","request_id":"luv2XXXX-ctyg2k-12gs"}
Jun 24 14:00:06 10-XXX-X-XXX.s3.openshift-storage.svc.cluster.local {"noobaa_bucket_logging":"true","op":"PUT","bucket_owner":"operator@noobaa.io","source_bucket":"data.bucket","object_key":"/data.bucket/B69EC83F-0177-44D8-A8D1-4A10C5A5AB0F.file","log_bucket":"logs.bucket","remote_ip":"100.XX.X.X","request_uri":"/data.bucket/B69EC83F-0177-44D8-A8D1-4A10C5A5AB0F.file","request_id":"luv2XXXX-9syea5-x5z"}
(Optional) To disable bucket logging, use the following command:
odf nb api bucket_api delete_bucket_logging '{ "name": "data.bucket" }'
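Each recorded log line in the example consists of a timestamp and host name followed by a JSON payload. The following Python sketch splits a downloaded line into those two parts for auditing scripts. It assumes that the payload is the first JSON object on the line, as in the example. The helper name `parse_bucket_log_line` is hypothetical, and the sample is an abbreviated copy of the PUT record shown above.

```python
import json


def parse_bucket_log_line(line):
    """Split a bucket log line into its text prefix and JSON payload."""
    # Assumption: the payload starts at the first "{" on the line; everything
    # before it is the timestamp and the host name.
    start = line.index("{")
    return line[:start].strip(), json.loads(line[start:])


# Abbreviated copy of the PUT record from the example above.
sample = (
    'Jun 24 14:00:06 10-XXX-X-XXX.s3.openshift-storage.svc.cluster.local '
    '{"noobaa_bucket_logging":"true","op":"PUT","source_bucket":"data.bucket",'
    '"object_key":"/data.bucket/B69EC83F-0177-44D8-A8D1-4A10C5A5AB0F.file",'
    '"log_bucket":"logs.bucket"}'
)
header, record = parse_bucket_log_line(sample)
print(header)
print(record["op"], record["source_bucket"], record["object_key"])
```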
10.4.2. Enabling bucket logging using the Guaranteed option
Procedure
Enable Guaranteed bucket logging using the NooBaa CR in one of the following ways:
- Using the default CephFS storage class, update the NooBaa CR spec:
  bucketLogging:
    loggingType: guaranteed
- Using the RWX PVC that you created:
  Note: Make sure that the PVC supports RWX.
  bucketLogging:
    loggingType: guaranteed
    bucketLoggingPVC: <pvc-name>
10.5. Synchronizing versions in Multicloud Object Gateway bucket replication
Prerequisites
- A Multicloud Object Gateway (MCG) source bucket that is created from an object bucket claim (OBC), and any MCG target bucket. For example, you can create the two buckets from OBCs by using the MCG command line interface (CLI):
  $ odf noobaa obc create source-bucket --exact
  $ odf noobaa obc create target-bucket --exact
  where --exact is optional.
- Ensure that S3 client aliases with MCG credentials and endpoint are set up.
  $ NOOBAA_ACCESS_KEY=$(oc extract secret/noobaa-admin -n openshift-storage --keys=AWS_ACCESS_KEY_ID --to=- 2>/dev/null); \
    NOOBAA_SECRET_KEY=$(oc extract secret/noobaa-admin -n openshift-storage --keys=AWS_SECRET_ACCESS_KEY --to=- 2>/dev/null); \
    S3_ENDPOINT=https://$(oc get route s3 -n openshift-storage -o json | jq -r ".spec.host")
  $ alias common_s3='AWS_ACCESS_KEY_ID=$NOOBAA_ACCESS_KEY AWS_SECRET_ACCESS_KEY=$NOOBAA_SECRET_KEY aws --endpoint $S3_ENDPOINT --no-verify-ssl'; \
    alias s3_alias='common_s3 s3'; \
    alias s3api_alias='common_s3 s3api'
- Make sure to enable versioning on both the source and target bucket by using the put-bucket-versioning command in the AWS S3 client:
  $ s3api_alias put-bucket-versioning --bucket source-bucket --versioning-configuration Status=Enabled
  $ s3api_alias put-bucket-versioning --bucket target-bucket --versioning-configuration Status=Enabled
Procedure
Add a replication policy with sync_versions: true to the source bucket’s OBC by patching or editing it:
$ oc patch obc source-bucket -n openshift-storage --type=merge -p '{"spec": {"additionalConfig": {"replicationPolicy": "{\"rules\":[{\"rule_id\":\"replication-rule-a\",\"destination_bucket\":\"target-bucket\", \"sync_versions\": true}]}"}}}'
Normal bucket replication replicates only the latest version. Using sync_versions adds replication of the older versions as well, in their original order.
- Delete markers are replicated if configured and using log-based replication.
- Delete version IDs are not replicated to avoid human errors.
Verification steps
Verify the replication to the target bucket by adding a few versions under an object key and waiting for them to get replicated to the target bucket:
$ echo 'version_a' | s3_alias cp - s3://source-bucket/versioned_obj.txt
$ echo 'version_b' | s3_alias cp - s3://source-bucket/versioned_obj.txt
$ echo 'version_c' | s3_alias cp - s3://source-bucket/versioned_obj.txt
After a few minutes, compare the versions of the object on both the buckets:
$ s3api_alias list-object-versions --bucket source-bucket --prefix versioned_obj.txt | jq -r ".Versions[].ETag"
"aaabf266d38a8e995cef03c13ee9a7f1"
"181d8a23f59939c0cddec1692e05cdf3"
"ebc538cc6ffa04a39263f4b1be2f832f"
$ s3api_alias list-object-versions --bucket target-bucket --prefix versioned_obj.txt | jq -r ".Versions[].ETag"
"aaabf266d38a8e995cef03c13ee9a7f1"
"181d8a23f59939c0cddec1692e05cdf3"
"ebc538cc6ffa04a39263f4b1be2f832f"
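The comparison of the two listings can be automated. The following Python sketch is a possible approach: it takes the raw JSON output of list-object-versions for each bucket and checks that one object key has the same ETags in the same order on both sides. The helper names are hypothetical, and the sample listings are trimmed placeholders that keep only the Key and ETag fields.

```python
import json


def version_etags(list_object_versions_output, key):
    """Return the ETags of one object key, in listing order."""
    listing = json.loads(list_object_versions_output)
    return [v["ETag"] for v in listing.get("Versions", []) if v["Key"] == key]


def versions_in_sync(source_output, target_output, key):
    """True when both buckets list the same ETags in the same order."""
    return version_etags(source_output, key) == version_etags(target_output, key)


# Trimmed, hypothetical listings; real output carries more fields per version.
source = json.dumps({"Versions": [
    {"Key": "versioned_obj.txt", "ETag": '"aaa"'},
    {"Key": "versioned_obj.txt", "ETag": '"bbb"'},
]})
target_behind = json.dumps({"Versions": [
    {"Key": "versioned_obj.txt", "ETag": '"aaa"'},
]})
print(versions_in_sync(source, source, "versioned_obj.txt"))
print(versions_in_sync(source, target_behind, "versioned_obj.txt"))
```

A target that is still catching up, as in the second call, reports a mismatch until the remaining versions arrive.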
10.6. Obtaining metrics to reflect bucket replication state
To determine if the data is safe or available on the secondary site, the replication progress per bucket is provided using the following metrics:
- Total number of objects scanned in the last replication cycle per bucket - bucket_last_cycle_total_objects_num {bucket="bucket-name"}
- Number of objects successfully replicated in the last cycle per bucket - bucket_last_cycle_replicated_objects_num{bucket="bucket-name"}
- Number of objects that failed replication in the last cycle per bucket - bucket_last_cycle_error_objects_num {bucket="bucket-name"}
Procedure
Use the following commands to obtain the metrics:
JWT_TOKEN=$(oc get secret noobaa-metrics-auth-secret -n openshift-storage -o jsonpath="{.data.metrics_token}" | base64 -d)
oc -n openshift-storage exec noobaa-core-0 -- curl -k -H "Authorization: Bearer ${JWT_TOKEN}" localhost:8080/metrics/bg_workers
For example:
noobaa-operator$ oc -n openshift-storage exec -it noobaa-core-0 -- curl -k -H "Authorization: Bearer ${JWT_TOKEN}" http://localhost:8080/metrics/bg_workers | grep -E "bucket_name"| grep -v "NooBaa_replication_status"
Defaulted container "core" out of: core, noobaa-log-processor
NooBaa_bucket_last_cycle_total_objects_num{bucket_name="test-buck1"} 15
NooBaa_bucket_last_cycle_total_objects_num{bucket_name="test-buck2"} 25
NooBaa_bucket_last_cycle_replicated_objects_num{bucket_name="test-buck1"} 15
NooBaa_bucket_last_cycle_replicated_objects_num{bucket_name="test-buck2"} 25
NooBaa_bucket_last_cycle_error_objects_num{bucket_name="test-buck1"} 0
NooBaa_bucket_last_cycle_error_objects_num{bucket_name="test-buck2"} 0
Use the following syntax to view the metrics in the metrics section:
metric_name {bucket_name='bucket_name'}
For example:
- For NooBaa_bucket_last_cycle_total_objects_num:
  NooBaa_bucket_last_cycle_total_objects_num {bucket_name='bubck1'}
- For NooBaa_bucket_last_cycle_replicated_objects_num:
  NooBaa_bucket_last_cycle_replicated_objects_num {bucket_name='bubck1'}
- For NooBaa_bucket_last_cycle_error_objects_num:
  NooBaa_bucket_last_cycle_error_objects_num {bucket_name='bubck1'}
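The output of the curl command can be summarized per bucket with a short script. The following Python sketch parses lines in the format shown in the example output. The helper names are hypothetical, the pattern assumes integer counter values as in the example, and the health rule in `fully_replicated` (no errors and every scanned object replicated) is an assumption rather than a documented definition.

```python
import re

# Matches lines such as:
# NooBaa_bucket_last_cycle_error_objects_num{bucket_name="b"} 0
METRIC_LINE = re.compile(
    r'^NooBaa_bucket_last_cycle_(total|replicated|error)_objects_num'
    r'\{bucket_name="([^"]+)"\}\s+(\d+)\s*$'
)


def replication_summary(metrics_text):
    """Collect the last-cycle counters per bucket from the metrics output."""
    summary = {}
    for line in metrics_text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match:
            kind, bucket, value = match.groups()
            summary.setdefault(bucket, {})[kind] = int(value)
    return summary


def fully_replicated(counters):
    """Assumed health rule: no errors and every scanned object replicated."""
    return counters.get("error", 0) == 0 and counters.get("replicated") == counters.get("total")


sample = """\
Defaulted container "core" out of: core, noobaa-log-processor
NooBaa_bucket_last_cycle_total_objects_num{bucket_name="test-buck1"} 15
NooBaa_bucket_last_cycle_replicated_objects_num{bucket_name="test-buck1"} 15
NooBaa_bucket_last_cycle_error_objects_num{bucket_name="test-buck1"} 0
"""
for bucket, counters in replication_summary(sample).items():
    print(bucket, counters, fully_replicated(counters))
```

Lines that do not match the pattern, such as the "Defaulted container" notice, are skipped.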