Chapter 9. Administration
As a storage administrator, you can manage the Ceph Object Gateway using the radosgw-admin
command line interface (CLI) or using the Red Hat Ceph Storage Dashboard.
Not all of the Ceph Object Gateway features are available to the Red Hat Ceph Storage Dashboard.
Prerequisites
- A healthy running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway software.
9.1. Creating storage policies
The Ceph Object Gateway stores the client bucket and object data by identifying placement targets, and storing buckets and objects in the pools associated with a placement target. If you don’t configure placement targets and map them to pools in the instance’s zone configuration, the Ceph Object Gateway will use default targets and pools, for example, default-placement.
Storage policies give Ceph Object Gateway clients a way of accessing a storage strategy, that is, the ability to target a particular type of storage, such as SSDs, SAS drives, and SATA drives, as a way of ensuring, for example, durability, replication, and erasure coding. For details, see the Storage Strategies guide for Red Hat Ceph Storage 7.
To create a storage policy, use the following procedure:
Create a new pool .rgw.buckets.special with the desired storage strategy. For example, a pool customized with erasure-coding, a particular CRUSH ruleset, the number of replicas, and the pg_num and pgp_num count.
Get the zone group configuration and store it in a file:
Syntax
radosgw-admin zonegroup --rgw-zonegroup=ZONE_GROUP_NAME get > FILE_NAME.json
Example
[root@host01 ~]# radosgw-admin zonegroup --rgw-zonegroup=default get > zonegroup.json
Add a special-placement entry under placement_targets in the zonegroup.json file:
Example
{ "name": "default", "api_name": "", "is_master": "true", "endpoints": [], "hostnames": [], "master_zone": "", "zones": [{ "name": "default", "endpoints": [], "log_meta": "false", "log_data": "false", "bucket_index_max_shards": 5 }], "placement_targets": [{ "name": "default-placement", "tags": [] }, { "name": "special-placement", "tags": [] }], "default_placement": "default-placement" }
Set the zone group with the modified zonegroup.json file:
Example
[root@host01 ~]# radosgw-admin zonegroup set < zonegroup.json
Get the zone configuration and store it in a file, for example, zone.json:
Example
[root@host01 ~]# radosgw-admin zone get > zone.json
Edit the zone file and add the new placement policy key under placement_pools:
Example
{ "domain_root": ".rgw", "control_pool": ".rgw.control", "gc_pool": ".rgw.gc", "log_pool": ".log", "intent_log_pool": ".intent-log", "usage_log_pool": ".usage", "user_keys_pool": ".users", "user_email_pool": ".users.email", "user_swift_pool": ".users.swift", "user_uid_pool": ".users.uid", "system_key": { "access_key": "", "secret_key": "" }, "placement_pools": [{ "key": "default-placement", "val": { "index_pool": ".rgw.buckets.index", "data_pool": ".rgw.buckets", "data_extra_pool": ".rgw.buckets.extra" } }, { "key": "special-placement", "val": { "index_pool": ".rgw.buckets.index", "data_pool": ".rgw.buckets.special", "data_extra_pool": ".rgw.buckets.extra" } }] }
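The two JSON edits above can also be scripted rather than made by hand. The following is a minimal Python sketch, assuming the documents have the structure shown in the examples. The helper names add_placement_target and add_placement_pool are hypothetical and are not part of any Ceph tooling, and the in-memory dictionaries are trimmed stand-ins for the real files.

```python
import json


def add_placement_target(zonegroup: dict, name: str) -> dict:
    """Add a placement target to a zone group document unless it already exists."""
    targets = zonegroup.setdefault("placement_targets", [])
    if not any(target["name"] == name for target in targets):
        targets.append({"name": name, "tags": []})
    return zonegroup


def add_placement_pool(zone: dict, key: str, data_pool: str,
                       index_pool: str = ".rgw.buckets.index",
                       data_extra_pool: str = ".rgw.buckets.extra") -> dict:
    """Add a placement_pools entry to a zone document unless the key already exists."""
    pools = zone.setdefault("placement_pools", [])
    if not any(pool["key"] == key for pool in pools):
        pools.append({
            "key": key,
            "val": {
                "index_pool": index_pool,
                "data_pool": data_pool,
                "data_extra_pool": data_extra_pool,
            },
        })
    return zone


# Trimmed stand-ins for zonegroup.json and zone.json (assumed structure).
zonegroup = {
    "name": "default",
    "placement_targets": [{"name": "default-placement", "tags": []}],
    "default_placement": "default-placement",
}
zone = {
    "placement_pools": [{
        "key": "default-placement",
        "val": {
            "index_pool": ".rgw.buckets.index",
            "data_pool": ".rgw.buckets",
            "data_extra_pool": ".rgw.buckets.extra",
        },
    }],
}

add_placement_target(zonegroup, "special-placement")
add_placement_pool(zone, "special-placement", ".rgw.buckets.special")
print(json.dumps(zonegroup["placement_targets"], indent=2))
print(json.dumps(zone["placement_pools"], indent=2))
```

In practice you would load the files with json.load, apply the helpers, and write the result back with json.dump before running the zonegroup set and zone set commands.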
Set the new zone configuration:
Example
[root@host01 ~]# radosgw-admin zone set < zone.json
Update the zone group map:
Example
[root@host01 ~]# radosgw-admin period update --commit
The special-placement entry is listed as a placement_target.
To specify the storage policy when making a request:
Example
$ curl -i http://10.0.0.1/swift/v1/TestContainer/file.txt -X PUT -H "X-Storage-Policy: special-placement" -H "X-Auth-Token: AUTH_rgwtxxxxxx"
9.2. Creating indexless buckets
You can configure a placement target where created buckets do not use the bucket index to store the object index; that is, indexless buckets. Placement targets that do not use data replication or listing might implement indexless buckets. Indexless buckets provide a mechanism in which the placement target does not track objects in specific buckets. This removes the resource contention that occurs on every object write and reduces the number of round trips that Ceph Object Gateway needs to make to the Ceph storage cluster. This can have a positive effect on concurrent operations and small object write performance.
The bucket index does not reflect the correct state of the bucket, and listing these buckets does not correctly return their list of objects. This affects multiple features. Specifically, these buckets are not synced in a multi-zone environment because the bucket index is not used to store change information. Red Hat recommends not to use S3 object versioning on indexless buckets, because the bucket index is necessary for this feature.
Using indexless buckets removes the limit of the max number of objects in a single bucket.
Objects in indexless buckets cannot be listed from NFS.
Prerequisites
- A running and healthy Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway software.
- Root-level access to a Ceph Object Gateway node.
Procedure
Add a new placement target to the zonegroup:
Example
[ceph: root@host03 /]# radosgw-admin zonegroup placement add --rgw-zonegroup="default" \
    --placement-id="indexless-placement"
Add a new placement target to the zone:
Example
[ceph: root@host03 /]# radosgw-admin zone placement add --rgw-zone="default" \
    --placement-id="indexless-placement" \
    --data-pool="default.rgw.buckets.data" \
    --index-pool="default.rgw.buckets.index" \
    --data_extra_pool="default.rgw.buckets.non-ec" \
    --placement-index-type="indexless"
Set the zonegroup’s default placement to indexless-placement:
Example
[ceph: root@host03 /]# radosgw-admin zonegroup placement default --placement-id "indexless-placement"
In this example, the buckets created in the indexless-placement target will be indexless buckets.
Update and commit the period if the cluster is in a multi-site configuration:
Example
[ceph: root@host03 /]# radosgw-admin period update --commit
Restart the Ceph Object Gateways on all nodes in the storage cluster for the change to take effect:
Syntax
ceph orch restart SERVICE_TYPE
Example
[ceph: root@host03 /]# ceph orch restart rgw
9.3. Configure bucket index resharding
As a storage administrator, you can configure bucket index resharding in single-site and multi-site deployments to improve performance.
You can reshard a bucket index either manually offline or dynamically online.
9.3.1. Bucket index resharding
The Ceph Object Gateway stores bucket index data in the index pool, which defaults to .rgw.buckets.index. When the client puts many objects in a single bucket without setting quotas for the maximum number of objects per bucket, the index pool can suffer significant performance degradation.
- Bucket index resharding prevents performance bottlenecks when you add a high number of objects per bucket.
- You can configure bucket index resharding for new buckets or change the bucket index on the existing ones.
- Use the prime number nearest to the calculated shard count as the shard count. Bucket index shard counts that are prime numbers tend to distribute bucket index entries more evenly across shards.
Bucket index can be resharded manually or dynamically.
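The nearest-prime recommendation above can be illustrated with a short calculation. This is a minimal Python sketch, not a Ceph utility: the function names are hypothetical, and the tie-breaking rule (prefer the larger prime when two primes are equally close) is an assumption, because the text does not say which to choose.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for shard-count sized numbers."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def nearest_prime(n: int) -> int:
    """Return the prime closest to n; on a tie, prefer the larger prime (assumption)."""
    if n <= 2:
        return 2
    offset = 0
    while True:
        # Check the candidate above n first so that ties resolve upward.
        if is_prime(n + offset):
            return n + offset
        if is_prime(n - offset):
            return n - offset
        offset += 1


for calculated in (12, 24, 100):
    print(calculated, "->", nearest_prime(calculated))
```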
During dynamic bucket index resharding, the Ceph Object Gateway periodically checks all buckets and detects those that require resharding. If a bucket has grown larger than the value specified in the rgw_max_objs_per_shard parameter, the Ceph Object Gateway reshards the bucket dynamically in the background. The default value for rgw_max_objs_per_shard is 100k objects per shard. Resharding bucket index dynamically works as expected on the upgraded single-site configuration without any modification to the zone or the zone group. A single-site configuration can be any of the following:
- A default zone configuration with no realm.
- A non-default configuration with at least one realm.
- A multi-realm single-site configuration.
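The threshold check described above can be sketched as follows. This is an illustration only, assuming the gateway compares the average number of objects per index shard against rgw_max_objs_per_shard. The real check is internal to the Ceph Object Gateway, and the function name needs_resharding is hypothetical.

```python
# Documented default for rgw_max_objs_per_shard.
RGW_MAX_OBJS_PER_SHARD = 100_000


def needs_resharding(num_objects: int, num_shards: int,
                     max_objs_per_shard: int = RGW_MAX_OBJS_PER_SHARD) -> bool:
    """Return True when the bucket holds more objects than its shards are meant to carry."""
    shards = max(num_shards, 1)  # assumption: an unsharded index counts as one shard
    return num_objects > shards * max_objs_per_shard


print(needs_resharding(1_200_000, 11))
print(needs_resharding(1_000_000, 11))
```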
Versioned buckets may exhibit imbalanced indexes, especially if a small subset of objects are written and re-written. This issue may lead to large omaps on the versioned bucket when a large number of object uploads happen (around a million objects).
9.3.2. Recovering bucket index
Resharding a bucket that was created with bucket_index_max_shards = 0 removes the bucket’s metadata. However, you can restore the bucket indexes by recovering the affected buckets.
The /usr/bin/rgw-restore-bucket-index tool creates temporary files in the /tmp directory. These temporary files consume space based on the bucket objects count from the previous buckets. The previous buckets with more than 10M objects need more than 4GB of free space in the /tmp directory. If the storage space in /tmp is exhausted, the tool fails with the following message:
ln: failed to access '/tmp/rgwrbi-object-list.4053207': No such file or directory
The temporary objects are removed.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed at a minimum of two sites.
- The jq package installed.
Procedure
Perform either of the following two steps to recover the bucket indexes:
- Run the radosgw-admin object reindex --bucket BUCKET_NAME --object OBJECT_NAME command.
- Run the script /usr/bin/rgw-restore-bucket-index -b BUCKET_NAME -p DATA_POOL_NAME.
Example
[root@host01 ceph]# /usr/bin/rgw-restore-bucket-index -b bucket-large-1 -p local-zone.rgw.buckets.data
marker is d8a347a4-99b6-4312-a5c1-75b83904b3d4.41610.2
bucket_id is d8a347a4-99b6-4312-a5c1-75b83904b3d4.41610.2
number of bucket index shards is 5
data pool is local-zone.rgw.buckets.data
NOTICE: This tool is currently considered EXPERIMENTAL.
The list of objects that we will attempt to restore can be found in "/tmp/rgwrbi-object-list.49946".
Please review the object names in that file (either below or in another window/terminal) before proceeding.
Type "proceed!" to proceed, "view" to view object list, or "q" to quit: view
Viewing...
Type "proceed!" to proceed, "view" to view object list, or "q" to quit: proceed!
Proceeding...
NOTICE: Bucket stats are currently incorrect. They can be restored with the following command after 2 minutes:
radosgw-admin bucket list --bucket=bucket-large-1 --allow-unordered --max-entries=1073741824
Would you like to take the time to recalculate bucket stats now? [yes/no] yes
Done
real 2m16.530s
user 0m1.082s
sys 0m0.870s
The tool does not work for versioned buckets.
[root@host01 ~]# time rgw-restore-bucket-index --proceed serp-bu-ver-1 default.rgw.buckets.data
NOTICE: This tool is currently considered EXPERIMENTAL.
marker is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5
bucket_id is e871fb65-b87f-4c16-a7c3-064b66feb1c4.25076.5
Error: this bucket appears to be versioned, and this tool cannot work with versioned buckets.
The tool’s scope is limited to a single site only and not multi-site. That is, if you run the rgw-restore-bucket-index tool at site-1, it does not recover objects in site-2 and vice versa. On a multi-site, the recovery tool and the object re-index command should be executed at both sites for a bucket.
9.3.3. Limitations of bucket index resharding
Use the following limitations with caution. There are implications related to your hardware selections, so you should always discuss these requirements with your Red Hat account team.
- Maximum number of objects in one bucket before it needs resharding: Use a maximum of 102,400 objects per bucket index shard. To take full advantage of resharding and maximize parallelism, provide a sufficient number of OSDs in the Ceph Object Gateway bucket index pool. This parallelization scales with the number of Ceph Object Gateway instances, and replaces the in-order index shard enumeration with a number sequence. The default locking timeout is extended from 60 seconds to 90 seconds.
- Maximum number of objects when using sharding: Based on prior testing, the number of bucket index shards currently supported is 65,521. Red Hat quality assurance has NOT performed full scalability testing on bucket sharding.
- You can reshard a bucket three times before the other zones catch up: Resharding is not recommended until the older generations synchronize. Around four generations of the buckets from previous reshards are supported. Once the limit is reached, dynamic resharding does not reshard the bucket again until at least one of the old log generations is fully trimmed. Using the command radosgw-admin bucket reshard throws the following error:
Bucket _BUCKET_NAME_ already has too many log generations (4) from previous reshards that peer zones haven't finished syncing. Resharding is not recommended until the old generations sync, but you can force a reshard with `--yes-i-really-mean-it`.
9.3.4. Configuring bucket index resharding in simple deployments
To enable and configure bucket index resharding on all new buckets, use the rgw_override_bucket_index_max_shards parameter.
You can set the parameter to one of the following values:
- 0 to disable bucket index sharding, which is the default value.
- A value greater than 0 to enable bucket sharding and to set the maximum number of shards.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed at a minimum of two sites.
Procedure
Calculate the recommended number of shards:
number of objects expected in a bucket / 100,000
Note: The maximum number of bucket index shards currently supported is 65,521.
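The calculation above can be expressed as a small helper. This is a minimal Python sketch under stated assumptions: the function name recommended_shards is hypothetical, the result is rounded up to a whole shard, and a minimum of one shard is returned. The text only gives the division and the 65,521 limit.

```python
OBJECTS_PER_SHARD = 100_000   # divisor used in the formula above
MAX_SHARDS = 65_521           # maximum number of bucket index shards supported


def recommended_shards(expected_objects: int) -> int:
    """Divide the expected object count by 100,000, round up, and cap at the limit."""
    shards = (expected_objects + OBJECTS_PER_SHARD - 1) // OBJECTS_PER_SHARD
    return min(max(shards, 1), MAX_SHARDS)


print(recommended_shards(1_200_000))
```

For example, a bucket expected to hold 1,200,000 objects yields 12, the value used in the example that follows.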
Set the rgw_override_bucket_index_max_shards option accordingly:
Syntax
ceph config set client.rgw rgw_override_bucket_index_max_shards VALUE
Replace VALUE with the recommended number of shards calculated:
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_override_bucket_index_max_shards 12
- To configure bucket index resharding for all instances of the Ceph Object Gateway, set the rgw_override_bucket_index_max_shards parameter with the global option.
- To configure bucket index resharding only for a particular instance of the Ceph Object Gateway, add the rgw_override_bucket_index_max_shards parameter under the instance.
Restart the Ceph Object Gateways on all nodes in the cluster for the change to take effect:
Syntax
ceph orch restart SERVICE_TYPE
Example
[ceph: root@host01 /]# ceph orch restart rgw
Additional Resources
- See the Resharding bucket index dynamically
- See the Resharding bucket index manually
9.3.5. Configuring bucket index resharding in multi-site deployments
In multi-site deployments, each zone can have a different index_pool setting to manage failover. To configure a consistent shard count for zones in one zone group, set the bucket_index_max_shards parameter in the configuration for that zone group. The default value of the bucket_index_max_shards parameter is 11.
You can set the parameter to one of the following values:
- 0 to disable bucket index sharding.
- A value greater than 0 to enable bucket sharding and to set the maximum number of shards.
Mapping the index pool, for each zone, if applicable, to a CRUSH ruleset of SSD-based OSDs might also help with bucket index performance. See the Establishing performance domains section for more information.
To prevent sync issues in multi-site deployments, a bucket should not have more than three generation gaps.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed at a minimum of two sites.
Procedure
Calculate the recommended number of shards:
number of objects expected in a bucket / 100,000
Note: The maximum number of bucket index shards currently supported is 65,521.
Extract the zone group configuration to the zonegroup.json file:
Example
[ceph: root@host01 /]# radosgw-admin zonegroup get > zonegroup.json
In the zonegroup.json file, set the bucket_index_max_shards parameter for each named zone:
Syntax
bucket_index_max_shards = VALUE
Replace VALUE with the recommended number of shards calculated:
Example
bucket_index_max_shards = 12
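In the JSON file itself, the setting appears as a "bucket_index_max_shards" key inside each entry of the "zones" array, as in the zone group example earlier in this chapter. The following Python sketch shows one way to apply the same value to every zone. The function name set_max_shards and the second zone in the sample are hypothetical, and the sample document is a trimmed stand-in for a real zonegroup.json.

```python
import json


def set_max_shards(zonegroup: dict, value: int) -> dict:
    """Set bucket_index_max_shards on every zone listed in a zone group document."""
    for zone in zonegroup.get("zones", []):
        zone["bucket_index_max_shards"] = value
    return zonegroup


# Trimmed stand-in for zonegroup.json (assumed structure, hypothetical zone names).
sample = json.loads("""
{
  "name": "default",
  "zones": [
    {"name": "default", "bucket_index_max_shards": 5},
    {"name": "secondary", "bucket_index_max_shards": 5}
  ]
}
""")

set_max_shards(sample, 12)
print(json.dumps(sample, indent=2))
```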
Reset the zone group:
Example
[ceph: root@host01 /]# radosgw-admin zonegroup set < zonegroup.json
Update the period:
Example
[ceph: root@host01 /]# radosgw-admin period update --commit
Check if resharding is complete:
Syntax
radosgw-admin reshard status --bucket BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin reshard status --bucket data
Verification
Check the sync status of the storage cluster:
Example
[ceph: root@host01 /]# radosgw-admin sync status
9.3.6. Resharding bucket index dynamically
You can reshard the bucket index dynamically by adding the bucket to the resharding queue, where it is scheduled to be resharded. The reshard threads run in the background and execute the scheduled resharding, one at a time.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed at a minimum of two sites.
Procedure
Ensure that the rgw_dynamic_resharding parameter is set to true.
Example
[ceph: root@host01 /]# radosgw-admin period get
Optional: Customize Ceph configuration using the following command:
Syntax
ceph config set client.rgw OPTION VALUE
Replace OPTION with the following options:
- rgw_reshard_num_logs: The number of shards for the resharding log. The default value is 16.
- rgw_reshard_bucket_lock_duration: The duration of the lock on a bucket during resharding. The default value is 360 seconds.
- rgw_dynamic_resharding: Enables or disables dynamic resharding. The default value is true.
- rgw_max_objs_per_shard: The maximum number of objects per shard. The default value is 100000 objects per shard.
- rgw_reshard_thread_interval: The maximum time between rounds of reshard thread processing. The default value is 600 seconds.
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_reshard_num_logs 23
Add a bucket to the resharding queue:
Syntax
radosgw-admin reshard add --bucket BUCKET --num-shards NUMBER
Example
[ceph: root@host01 /]# radosgw-admin reshard add --bucket data --num-shards 10
List the resharding queue:
Example
[ceph: root@host01 /]# radosgw-admin reshard list
Check the bucket log generations and shards:
Example
[ceph: root@host01 /]# radosgw-admin bucket layout --bucket data { "layout": { "resharding": "None", "current_index": { "gen": 1, "layout": { "type": "Normal", "normal": { "num_shards": 23, "hash_type": "Mod" } } }, "logs": [ { "gen": 0, "layout": { "type": "InIndex", "in_index": { "gen": 0, "layout": { "num_shards": 11, "hash_type": "Mod" } } } }, { "gen": 1, "layout": { "type": "InIndex", "in_index": { "gen": 1, "layout": { "num_shards": 23, "hash_type": "Mod" } } } } ] } }
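The layout output above can be inspected programmatically to see how many log generations a bucket is carrying. This is a minimal Python sketch: the function names are hypothetical, the sample is the output shown above, and treating four or more log entries as the point where a reshard is refused is an assumption based on the "(4)" in the error message quoted in the limitations section.

```python
import json

# Assumption: limit taken from the "too many log generations (4)" error message.
MAX_LOG_GENERATIONS = 4


def log_generations(layout_doc: dict) -> list:
    """Return (generation, shard count) pairs for every log entry in the layout."""
    return [
        (entry["gen"], entry["layout"]["in_index"]["layout"]["num_shards"])
        for entry in layout_doc["layout"]["logs"]
    ]


def too_many_generations(layout_doc: dict) -> bool:
    """Return True when the bucket already carries the maximum number of log generations."""
    return len(layout_doc["layout"]["logs"]) >= MAX_LOG_GENERATIONS


sample = json.loads("""
{"layout": {"resharding": "None",
  "current_index": {"gen": 1, "layout": {"type": "Normal",
    "normal": {"num_shards": 23, "hash_type": "Mod"}}},
  "logs": [
    {"gen": 0, "layout": {"type": "InIndex",
      "in_index": {"gen": 0, "layout": {"num_shards": 11, "hash_type": "Mod"}}}},
    {"gen": 1, "layout": {"type": "InIndex",
      "in_index": {"gen": 1, "layout": {"num_shards": 23, "hash_type": "Mod"}}}}
  ]}}
""")

print(log_generations(sample))
print(too_many_generations(sample))
```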
Check bucket resharding status:
Syntax
radosgw-admin reshard status --bucket BUCKET
Example
[ceph: root@host01 /]# radosgw-admin reshard status --bucket data
Process entries on the resharding queue immediately:
[ceph: root@host01 /]# radosgw-admin reshard process
Cancel pending bucket resharding:
Warning: You can only cancel pending resharding operations. Do not cancel ongoing resharding operations.
Syntax
radosgw-admin reshard cancel --bucket BUCKET
Example
[ceph: root@host01 /]# radosgw-admin reshard cancel --bucket data
Verification
Check bucket resharding status:
Syntax
radosgw-admin reshard status --bucket BUCKET
Example
[ceph: root@host01 /]# radosgw-admin reshard status --bucket data
Additional resources
- See the Cleaning stale instances of bucket entries after resharding section to remove the stale bucket entries.
- See the Resharding bucket index manually.
- See the Configuring bucket index resharding in simple deployments.
9.3.7. Resharding bucket index dynamically in multi-site configuration
Red Hat Ceph Storage supports dynamic bucket index resharding in multi-site configuration. The feature allows buckets to be resharded in a multi-site configuration without interrupting the replication of their objects. When rgw_dynamic_resharding is enabled, it runs on each zone independently, and the zones might choose different shard counts for the same bucket.
These steps apply to an existing Red Hat Ceph Storage cluster only. You need to enable the resharding feature manually on the existing zones and the zone groups after upgrading the storage cluster.
For new deployments, the resharding feature is supported and enabled by default on zones and zone groups.
You can reshard a bucket three times before the other zones catch-up. See the Limitations of bucket index resharding for more details.
If a bucket is created and uploaded with more than the threshold number of objects for resharding dynamically, you need to continue to write I/Os to old buckets to begin the resharding process.
Prerequisites
- The Red Hat Ceph Storage clusters at both sites are upgraded to the latest version.
- All the Ceph Object Gateway daemons enabled at both the sites are upgraded to the latest version.
- Root-level access to all the nodes.
Procedure
Check if resharding is enabled on the zonegroup:
Example
[ceph: root@host01 /]# radosgw-admin sync status
If resharding is not listed under zonegroup features enabled in the output, continue with the procedure.
Enable the resharding feature on all the zonegroups in the multi-site configuration where Ceph Object Gateway is installed:
Syntax
radosgw-admin zonegroup modify --rgw-zonegroup=ZONEGROUP_NAME --enable-feature=resharding
Example
[ceph: root@host01 /]# radosgw-admin zonegroup modify --rgw-zonegroup=us --enable-feature=resharding
Update the period and commit:
Example
[ceph: root@host01 /]# radosgw-admin period update --commit
Enable the resharding feature on all the zones in the multi-site configuration where Ceph Object Gateway is installed:
Syntax
radosgw-admin zone modify --rgw-zone=ZONE_NAME --enable-feature=resharding
Example
[ceph: root@host01 /]# radosgw-admin zone modify --rgw-zone=us-east --enable-feature=resharding
Update the period and commit:
Example
[ceph: root@host01 /]# radosgw-admin period update --commit
Verify the resharding feature is enabled on the zones and zonegroups. Each zone lists its supported_features and each zonegroup lists its enabled_features:
Example
[ceph: root@host01 /]# radosgw-admin period get "zones": [ { "id": "505b48db-6de0-45d5-8208-8c98f7b1278d", "name": "us_east", "endpoints": [ "http://10.0.208.11:8080" ], "log_meta": "false", "log_data": "true", "bucket_index_max_shards": 11, "read_only": "false", "tier_type": "", "sync_from_all": "true", "sync_from": [], "redirect_zone": "", "supported_features": [ "resharding" ] "default_placement": "default-placement", "realm_id": "26cf6f23-c3a0-4d57-aae4-9b0010ee55cc", "sync_policy": { "groups": [] }, "enabled_features": [ "resharding" ]
Check the sync status:
Example
[ceph: root@host01 /]# radosgw-admin sync status
realm 26cf6f23-c3a0-4d57-aae4-9b0010ee55cc (usa)
zonegroup 33a17718-6c77-493e-99fe-048d3110a06e (us)
zone 505b48db-6de0-45d5-8208-8c98f7b1278d (us_east)
zonegroup features enabled: resharding
In this example, the resharding feature is enabled for the us zonegroup.
Optional: You can disable the resharding feature for the zonegroups:
Important: To disable resharding on any singular zone, set the rgw_dynamic_resharding configuration option to false on that specific zone.
Disable the feature on all the zonegroups in the multi-site where Ceph Object Gateway is installed:
Syntax
radosgw-admin zonegroup modify --rgw-zonegroup=ZONEGROUP_NAME --disable-feature=resharding
Example
[ceph: root@host01 /]# radosgw-admin zonegroup modify --rgw-zonegroup=us --disable-feature=resharding
Update the period and commit:
Example
[ceph: root@host01 /]# radosgw-admin period update --commit
Additional Resources
- For more configurable parameters for dynamic bucket index resharding, see the Resharding bucket index dynamically section in the Red Hat Ceph Storage Object Gateway Configuration and Administration Guide.
9.3.8. Resharding bucket index manually
If a bucket has grown larger than the initial configuration for which it was optimized, reshard the bucket index pool by using the radosgw-admin bucket reshard command. This command performs the following tasks:
- Creates a new set of bucket index objects for the specified bucket.
- Distributes object entries across these bucket index objects.
- Creates a new bucket instance.
- Links the new bucket instance with the bucket so that all new index operations go through the new bucket indexes.
- Prints the old and the new bucket ID to the command output.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed at a minimum of two sites.
Procedure
Back up the original bucket index:
Syntax
radosgw-admin bi list --bucket=BUCKET > BUCKET.list.backup
Example
[ceph: root@host01 /]# radosgw-admin bi list --bucket=data > data.list.backup
Reshard the bucket index:
Syntax
radosgw-admin bucket reshard --bucket=BUCKET --num-shards=NUMBER
Example
[ceph: root@host01 /]# radosgw-admin bucket reshard --bucket=data --num-shards=100
Verification
Check bucket resharding status:
Syntax
radosgw-admin reshard status --bucket BUCKET
Example
[ceph: root@host01 /]# radosgw-admin reshard status --bucket data
Additional Resources
- See the Configuring bucket index resharding in multi-site deployments in the Red Hat Ceph Storage Object Gateway Guide for more details.
- See the Resharding bucket index dynamically.
- See the Configuring bucket index resharding in simple deployments.
9.3.9. Cleaning stale instances of bucket entries after resharding
The resharding process might not clean stale instances of bucket entries automatically and these instances can impact performance of the storage cluster.
Clean them manually to prevent the stale instances from negatively impacting the performance of the storage cluster.
Contact Red Hat Support prior to cleaning the stale instances.
Use this procedure only in simple deployments, not in multi-site clusters.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Ceph Object Gateway installed.
Procedure
List stale instances:
[ceph: root@host01 /]# radosgw-admin reshard stale-instances list
Clean the stale instances of the bucket entries:
[ceph: root@host01 /]# radosgw-admin reshard stale-instances rm
Verification
Check bucket resharding status:
Syntax
radosgw-admin reshard status --bucket BUCKET
Example
[ceph: root@host01 /]# radosgw-admin reshard status --bucket data
9.3.10. Enabling compression
The Ceph Object Gateway supports server-side compression of uploaded objects using any of Ceph’s compression plugins. These include:
- zlib: Supported.
- snappy: Supported.
- zstd: Supported.
Configuration
To enable compression on a zone’s placement target, provide the --compression=TYPE option to the radosgw-admin zone placement modify command. The compression TYPE refers to the name of the compression plugin to use when writing new object data.
Each compressed object stores the compression type. Changing the setting does not hinder the ability to decompress existing compressed objects, nor does it force the Ceph Object Gateway to recompress existing objects.
This compression setting applies to all new objects uploaded to buckets using this placement target.
To disable compression on a zone’s placement target, provide the --compression=TYPE option to the radosgw-admin zone placement modify command and specify an empty string or none.
Example
[root@host01 ~]# radosgw-admin zone placement modify --rgw-zone=default --placement-id=default-placement --compression=zlib
{
...
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "data_pool": "default.rgw.buckets.data",
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0,
                "compression": "zlib"
            }
        }
    ],
...
}
After enabling or disabling compression, restart the Ceph Object Gateway instance so the change will take effect.
Ceph Object Gateway creates a default zone and a set of pools. For production deployments, see the Creating a Realm section first.
Statistics
While all existing commands and APIs continue to report object and bucket sizes based on their uncompressed data, the radosgw-admin bucket stats command includes compression statistics for all buckets.
The usage types for the radosgw-admin bucket stats command are:
- rgw.main refers to regular entries or objects.
- rgw.multimeta refers to the metadata of incomplete multipart uploads.
- rgw.cloudtiered refers to objects that a lifecycle policy has transitioned to a cloud tier. When configured with retain_head_object=true, a head object is left behind that no longer contains data, but can still serve the object’s metadata via HeadObject requests. These stub head objects use the rgw.cloudtiered category. See the Transitioning data to Amazon S3 cloud service section in the Red Hat Ceph Storage Object Gateway Guide for more information.
Syntax
radosgw-admin bucket stats --bucket=BUCKET_NAME
{
...
    "usage": {
        "rgw.main": {
            "size": 1075028,
            "size_actual": 1331200,
            "size_utilized": 592035,
            "size_kb": 1050,
            "size_kb_actual": 1300,
            "size_kb_utilized": 579,
            "num_objects": 104
        }
    },
...
}
The size is the accumulated size of the objects in the bucket, uncompressed and unencrypted. The size_kb is the accumulated size in kilobytes and is calculated as ceiling(size/1024). In this example, it is ceiling(1075028/1024) = 1050.
The size_actual is the accumulated size of all the objects after each object is distributed in a set of 4096-byte blocks. If a bucket has two objects, one of size 4100 bytes and the other of 8500 bytes, the first object is rounded up to 8192 bytes, and the second one is rounded up to 12288 bytes, and their total for the bucket is 20480 bytes. The size_kb_actual is the actual size in kilobytes and is calculated as size_actual/1024. In this example, it is 1331200/1024 = 1300.
The size_utilized is the total size of the data in bytes after it has been compressed and/or encrypted. Encryption could increase the size of the object while compression could decrease it. The size_kb_utilized is the total size in kilobytes and is calculated as ceiling(size_utilized/1024). In this example, it is ceiling(592035/1024) = 579.
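The arithmetic in the three paragraphs above can be reproduced with a few lines of Python. This is a worked illustration only: the function names are hypothetical, and the input values are the ones from the example output and the two-object example.

```python
BLOCK_SIZE = 4096  # block size used for size_actual


def ceil_div(value: int, divisor: int) -> int:
    """Integer ceiling division, avoiding floating point rounding."""
    return -(-value // divisor)


def to_kb(size_bytes: int) -> int:
    """ceiling(size/1024), as used for size_kb and size_kb_utilized."""
    return ceil_div(size_bytes, 1024)


def size_actual(object_sizes: list) -> int:
    """Round each object up to whole 4096-byte blocks and sum the results."""
    return sum(ceil_div(size, BLOCK_SIZE) * BLOCK_SIZE for size in object_sizes)


print(to_kb(1075028))             # size_kb
print(1331200 // 1024)            # size_kb_actual
print(to_kb(592035))              # size_kb_utilized
print(size_actual([4100, 8500]))  # two-object example
```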
9.4. User management
Ceph Object Storage user management refers to users that are client applications of the Ceph Object Storage service; not the Ceph Object Gateway as a client application of the Ceph Storage Cluster. You must create a user, access key, and secret to enable client applications to interact with the Ceph Object Gateway service.
There are two user types:
- User: The term 'user' reflects a user of the S3 interface.
- Subuser: The term 'subuser' reflects a user of the Swift interface. A subuser is associated with a user.
You can create, modify, view, suspend, and remove users and subusers.
When managing users in a multi-site deployment, ALWAYS issue the radosgw-admin command on a Ceph Object Gateway node within the master zone of the master zone group to ensure that users synchronize throughout the multi-site cluster. DO NOT create, modify, or delete users on a multi-site cluster from a secondary zone or a secondary zone group.
In addition to creating user and subuser IDs, you may add a display name and an email address for a user. You can specify a key and secret, or generate a key and secret automatically. When generating or specifying keys, note that user IDs correspond to an S3 key type and subuser IDs correspond to a swift key type. Swift keys also have access levels of read, write, readwrite and full.
User management command line syntax generally follows the pattern user COMMAND USER_ID where USER_ID is either the --uid= option followed by the user’s ID (S3) or the --subuser= option followed by the user name (Swift).
Syntax
radosgw-admin user <create|modify|info|rm|suspend|enable|check|stats> <--uid=USER_ID|--subuser=SUB_USER_NAME> [other-options]
Additional options may be required depending on the command you issue.
9.4.1. Multi-tenancy
The Ceph Object Gateway supports multi-tenancy for both the S3 and Swift APIs, where each user and bucket lies under a "tenant." Multi-tenancy prevents namespace clashes when multiple tenants use common bucket names, such as "test", "main", and so forth.
For backward compatibility, a "legacy" tenant with an empty name is added. Whenever a bucket is referenced without explicitly specifying a tenant, the Swift API assumes the "legacy" tenant. Existing users are also stored under the legacy tenant, so they access buckets and objects the same way as in earlier releases.
Tenants as such do not have any operations on them. They appear and disappear as needed, when users are administered. To create, modify, and remove users with explicit tenants, either supply the additional --tenant option or use the "TENANT$USER" syntax in the parameters of the radosgw-admin command.
To create a user testx$tester for S3, run the following command:
Example
[root@host01 ~]# radosgw-admin --tenant testx --uid tester \
    --display-name "Test User" --access_key TESTER \
    --secret test123 user create
To create a user testx$tester for Swift, run one of the following commands:
Example
[root@host01 ~]# radosgw-admin --tenant testx --uid tester \
    --display-name "Test User" --subuser tester:swift \
    --key-type swift --access full subuser create
[root@host01 ~]# radosgw-admin key create --subuser 'testx$tester:swift' \
    --key-type swift --secret test123
A subuser with an explicit tenant must be quoted in the shell, because the shell otherwise treats the $ character as the start of a variable name.
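The effect of quoting a tenant-qualified ID can be tried without a cluster. The following is a minimal sketch that only uses shell built-ins; the tenant_uid helper and the variable names are hypothetical and are not part of radosgw-admin.

```shell
# Hypothetical helper (not part of radosgw-admin): build a TENANT$USER ID.
# The single-quoted format string keeps "$" literal.
tenant_uid() {
    printf '%s$%s' "$1" "$2"
}

tester=''                      # stand-in for a variable the shell would expand

literal='testx$tester'         # single quotes: "$tester" is kept as typed
expanded="testx$tester"        # double quotes: "$tester" is replaced by its value

printf 'single quotes: %s\n' "$literal"
printf 'double quotes: %s\n' "$expanded"
printf 'helper:        %s\n' "$(tenant_uid testx tester)"
```

The single-quoted form and the helper both yield testx$tester, while the double-quoted form loses the user name and yields testx.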
9.4.2. Create a user
Use the user create command to create an S3-interface user. You MUST specify a user ID and a display name. You may also specify an email address. If you DO NOT specify a key or secret, radosgw-admin generates them for you automatically. However, you can specify a key, a secret, or both if you prefer not to use generated key and secret pairs.
Syntax
radosgw-admin user create --uid=USER_ID \
    [--key-type=KEY_TYPE] [--gen-access-key | --access-key=ACCESS_KEY] \
    [--gen-secret | --secret=SECRET_KEY] \
    [--email=EMAIL] --display-name=DISPLAY_NAME
Example
[root@host01 ~]# radosgw-admin user create --uid=janedoe --access-key=11BS02LGFB6AL6H1ADMW --secret=vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY --email=jane@example.com --display-name="Jane Doe"
{ "user_id": "janedoe", "display_name": "Jane Doe", "email": "jane@example.com", "suspended": 0, "max_buckets": 1000, "auid": 0, "subusers": [], "keys": [ { "user": "janedoe", "access_key": "11BS02LGFB6AL6H1ADMW", "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "placement_tags": [], "bucket_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1}, "user_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1}, "temp_url_keys": []}
Check the key output. Sometimes radosgw-admin generates a JSON escape (\) character, and some clients do not know how to handle JSON escape characters. Remedies include removing the JSON escape character (\), encapsulating the string in quotes, regenerating the key to ensure that it does not have a JSON escape character, or specifying the key and secret manually.
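As a rough illustration of the first remedy, the sketch below detects a backslash in a key and removes the JSON escape sequence \/ with sed. The function names has_json_escape and strip_json_escape are hypothetical helpers, and the sample value is the Swift secret shown in the key creation example in section 9.4.10.

```shell
# Hypothetical helper: succeed (exit status 0) if the key contains a backslash.
has_json_escape() {
    case "$1" in
        *\\*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Hypothetical helper: replace the JSON escape sequence "\/" with a plain "/".
strip_json_escape() {
    printf '%s\n' "$1" | sed 's|\\/|/|g'
}

# Secret exactly as it appears in the JSON output, including the escapes.
key='E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1'

if has_json_escape "$key"; then
    strip_json_escape "$key"
fi
```

Only the \/ sequence is handled here; other JSON escapes would need a real JSON parser.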
9.4.3. Create a subuser
To create a subuser (Swift interface), you must specify the user ID (--uid=USERNAME), a subuser ID, and the access level for the subuser. If you DO NOT specify a key or secret, radosgw-admin generates them for you automatically. However, you can specify a key, a secret, or both if you prefer not to use generated key and secret pairs.
The full access level is not the same as readwrite, because it also includes the access control policy.
Syntax
radosgw-admin subuser create --uid=USER_ID --subuser=SUB_USER_ID --access=[ read | write | readwrite | full ]
Example
[root@host01 ~]# radosgw-admin subuser create --uid=janedoe --subuser=janedoe:swift --access=full
{ "user_id": "janedoe", "display_name": "Jane Doe", "email": "jane@example.com", "suspended": 0, "max_buckets": 1000, "auid": 0, "subusers": [ { "id": "janedoe:swift", "permissions": "full-control"}], "keys": [ { "user": "janedoe", "access_key": "11BS02LGFB6AL6H1ADMW", "secret_key": "vzCEkuryfn060dfee4fgQPqFrncKEIkh3ZcdOANY"}], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "placement_tags": [], "bucket_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1}, "user_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1}, "temp_url_keys": []}
9.4.4. Get user information
To get information about a user, specify user info and the user ID (--uid=USERNAME).
Example
[root@host01 ~]# radosgw-admin user info --uid=janedoe
To get information about a tenanted user, specify both the user ID and the name of the tenant.
[root@host01 ~]# radosgw-admin user info --uid=janedoe --tenant=test
9.4.5. Modify user information
To modify information about a user, you must specify the user ID (--uid=USERNAME) and the attributes you want to modify. Typical modifications are to keys and secrets, email addresses, display names, and access levels.
Example
[root@host01 ~]# radosgw-admin user modify --uid=janedoe --display-name="Jane E. Doe"
To modify subuser values, specify subuser modify and the subuser ID.
Example
[root@host01 ~]# radosgw-admin subuser modify --subuser=janedoe:swift --access=full
9.4.6. Enable and suspend users
When you create a user, the user is enabled by default. However, you may suspend user privileges and re-enable them at a later time. To suspend a user, specify user suspend and the user ID.
[root@host01 ~]# radosgw-admin user suspend --uid=johndoe
To re-enable a suspended user, specify user enable and the user ID:
[root@host01 ~]# radosgw-admin user enable --uid=johndoe
Disabling the user disables the subuser.
9.4.7. Remove a user
When you remove a user, the user and subuser are removed from the system. However, you may remove only the subuser if you wish. To remove a user (and subuser), specify user rm and the user ID.
Syntax
radosgw-admin user rm --uid=USER_ID [--purge-keys] [--purge-data]
Example
[ceph: root@host01 /]# radosgw-admin user rm --uid=johndoe --purge-data
To remove the subuser only, specify subuser rm and the subuser name.
Example
[ceph: root@host01 /]# radosgw-admin subuser rm --subuser=johndoe:swift --purge-keys
Options include:
- Purge Data: The --purge-data option purges all data associated with the UID.
- Purge Keys: The --purge-keys option purges all keys associated with the UID.
9.4.8. Remove a subuser
When you remove a subuser, you are removing access to the Swift interface. The user remains in the system. To remove the subuser, specify subuser rm and the subuser ID.
Syntax
radosgw-admin subuser rm --subuser=SUB_USER_ID
Example
[root@host01 /]# radosgw-admin subuser rm --subuser=johndoe:swift
Options include:
- Purge Keys: The --purge-keys option purges all keys associated with the UID.
9.4.9. Rename a user
To change the name of a user, use the radosgw-admin user rename command. The time that this command takes depends on the number of buckets and objects that the user has. If the number is large, Red Hat recommends using the command in the Screen utility provided by the screen package.
Prerequisites
- A working Ceph cluster.
- root or sudo access to the host running the Ceph Object Gateway.
- Installed Ceph Object Gateway.
Procedure
Rename a user:
Syntax
radosgw-admin user rename --uid=CURRENT_USER_NAME --new-uid=NEW_USER_NAME
Example
[ceph: root@host01 /]# radosgw-admin user rename --uid=user1 --new-uid=user2
{ "user_id": "user2", "display_name": "user 2", "email": "", "suspended": 0, "max_buckets": 1000, "auid": 0, "subusers": [], "keys": [ { "user": "user2", "access_key": "59EKHI6AI9F8WOW8JQZJ", "secret_key": "XH0uY3rKCUcuL73X0ftjXbZqUbk0cavD11rD8MsA" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw" }
If a user is inside a tenant, specify both the user name and the tenant:
Syntax
radosgw-admin user rename --uid USER_NAME --new-uid NEW_USER_NAME --tenant TENANT
Example
[ceph: root@host01 /]# radosgw-admin user rename --uid='test$user1' --new-uid='test$user2' --tenant test
1000 objects processed in tvtester1. Next marker 80_tVtester1_99
2000 objects processed in tvtester1. Next marker 64_tVtester1_44
3000 objects processed in tvtester1. Next marker 48_tVtester1_28
4000 objects processed in tvtester1. Next marker 2_tVtester1_74
5000 objects processed in tvtester1. Next marker 14_tVtester1_53
6000 objects processed in tvtester1. Next marker 87_tVtester1_61
7000 objects processed in tvtester1. Next marker 6_tVtester1_57
8000 objects processed in tvtester1. Next marker 52_tVtester1_91
9000 objects processed in tvtester1. Next marker 34_tVtester1_74
9900 objects processed in tvtester1. Next marker 9_tVtester1_95
1000 objects processed in tvtester2. Next marker 82_tVtester2_93
2000 objects processed in tvtester2. Next marker 64_tVtester2_9
3000 objects processed in tvtester2. Next marker 48_tVtester2_22
4000 objects processed in tvtester2. Next marker 32_tVtester2_42
5000 objects processed in tvtester2. Next marker 16_tVtester2_36
6000 objects processed in tvtester2. Next marker 89_tVtester2_46
7000 objects processed in tvtester2. Next marker 70_tVtester2_78
8000 objects processed in tvtester2. Next marker 51_tVtester2_41
9000 objects processed in tvtester2. Next marker 33_tVtester2_32
9900 objects processed in tvtester2. Next marker 9_tVtester2_83
{ "user_id": "test$user2", "display_name": "User 2", "email": "", "suspended": 0, "max_buckets": 1000, "auid": 0, "subusers": [], "keys": [ { "user": "test$user2", "access_key": "user2", "secret_key": "123456789" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw" }
Verify that the user has been renamed successfully:
Syntax
radosgw-admin user info --uid=NEW_USER_NAME
Example
[ceph: root@host01 /]# radosgw-admin user info --uid=user2
If a user is inside a tenant, use the TENANT$USER_NAME format:
Syntax
radosgw-admin user info --uid='TENANT$USER_NAME'
Example
[ceph: root@host01 /]# radosgw-admin user info --uid='test$user2'
Additional Resources
- The screen(1) manual page
9.4.10. Create a key
To create a key, specify key create. For a user, specify the user ID and the s3 key type. For a subuser, specify the subuser ID and the swift key type.
Example
[ceph: root@host01 /]# radosgw-admin key create --subuser=johndoe:swift --key-type=swift --gen-secret
{ "user_id": "johndoe", "rados_uid": 0, "display_name": "John Doe", "email": "john@example.com", "suspended": 0, "subusers": [ { "id": "johndoe:swift", "permissions": "full-control"}], "keys": [ { "user": "johndoe", "access_key": "QFAMEDSJP5DEKJO0DDXY", "secret_key": "iaSFLDVvDdQt6lkNzHyW4fPLZugBAI1g17LO0+87"}], "swift_keys": [ { "user": "johndoe:swift", "secret_key": "E9T2rUZNu2gxUjcwUBO8n\/Ev4KX6\/GprEuH4qhu1"}]}
9.4.11. Add and remove access keys
Users and subusers must have access keys to use the S3 and Swift interfaces. When you create a user or subuser and you do not specify an access key and secret, the key and secret get generated automatically. You may create a key and either specify or generate the access key and/or secret. You may also remove an access key and secret. Options include:
- --secret=SECRET_KEY specifies a secret key, for example, manually generated.
- --gen-access-key generates a random access key (for S3 users by default).
- --gen-secret generates a random secret key.
- --key-type=KEY_TYPE specifies a key type. The options are: swift and s3.
To add a key, specify the user:
Example
[root@host01 ~]# radosgw-admin key create --uid=johndoe --key-type=s3 --gen-access-key --gen-secret
You can also specify a key and a secret instead of generating them.
To remove an access key, you need to specify the user and the key:
Find the access key for the specific user:
Example
[root@host01 ~]# radosgw-admin user info --uid=johndoe
The access key is the "access_key" value in the output:
Example
[root@host01 ~]# radosgw-admin user info --uid=johndoe
{ "user_id": "johndoe", ... "keys": [ { "user": "johndoe", "access_key": "0555b35654ad1656d804", "secret_key": "h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q==" } ], ... }
Specify the user ID and the access key from the previous step to remove the access key:
Syntax
radosgw-admin key rm --uid=USER_ID --access-key ACCESS_KEY
Example
[root@host01 ~]# radosgw-admin key rm --uid=johndoe --access-key 0555b35654ad1656d804
9.4.12. Add and remove admin capabilities
The Ceph Storage Cluster provides an administrative API that enables users to run administrative functions via the REST API. By default, users DO NOT have access to this API. To enable a user to exercise administrative functionality, provide the user with administrative capabilities.
To add administrative capabilities to a user, run the following command:
Syntax
radosgw-admin caps add --uid=USER_ID --caps=CAPS
You can add read, write, or all capabilities to users, buckets, metadata, and usage (utilization).
Syntax
--caps="[users|buckets|metadata|usage|zone]=[*|read|write|read, write]"
Example
[root@host01 ~]# radosgw-admin caps add --uid=johndoe --caps="users=*"
To remove administrative capabilities from a user, run the following command:
Syntax
radosgw-admin caps remove --uid=USER_ID --caps=CAPS
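A capability string can be checked against the syntax shown above before it is passed to --caps. The sketch below is an illustration only: valid_cap is a hypothetical helper, it accepts a single TYPE=PERM pair, and it knows only the types and permissions listed in the syntax line of this section.

```shell
# Hypothetical helper: check one TYPE=PERM pair against the documented syntax
# [users|buckets|metadata|usage|zone]=[*|read|write|read, write].
valid_cap() {
    cap_type=${1%%=*}      # text before the first "="
    cap_perm=${1#*=}       # text after the first "="
    case "$cap_type" in
        users|buckets|metadata|usage|zone) ;;
        *) return 1 ;;
    esac
    case "$cap_perm" in
        '*'|read|write|'read, write') return 0 ;;
        *) return 1 ;;
    esac
}

for cap in 'users=*' 'buckets=read, write' 'objects=read'; do
    if valid_cap "$cap"; then
        echo "ok:  $cap"
    else
        echo "bad: $cap"
    fi
done
```

The first two strings pass, and objects=read is rejected because objects is not one of the listed capability types.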
9.5. Role management
As a storage administrator, you can create, delete, or update a role and the permissions associated with that role with the radosgw-admin commands.
A role is similar to a user and has permission policies attached to it. It can be assumed by any identity. If a user assumes a role, a set of dynamically created temporary credentials is returned to the user. A role can be used to delegate access to users, applications, and services that do not have permissions to access some S3 resources.
9.5.1. Creating a role
Create a role for the user with the radosgw-admin role create command. You need to create the role with the assume-role-policy-doc parameter in the command, which is the trust relationship policy document that grants an entity the permission to assume the role.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- An S3 user created with user access.
Procedure
Create the role:
Syntax
radosgw-admin role create --role-name=ROLE_NAME [--path="PATH_TO_FILE"] [--assume-role-policy-doc=TRUST_RELATIONSHIP_POLICY_DOCUMENT]
Example
[root@host01 ~]# radosgw-admin role create --role-name=S3Access1 --path=/application_abc/component_xyz/ --assume-role-policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}
{ "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec", "RoleName": "S3Access1", "Path": "/application_abc/component_xyz/", "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1", "CreateDate": "2022-06-17T10:18:29.116Z", "MaxSessionDuration": 3600, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}" }
The value for --path is / by default.
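The backslash-escaped policy document in the example above is hard to read and easy to mistype. As a sketch of an alternative, the same JSON can be wrapped in single quotes; the comparison below shows that both spellings produce the identical string. The variable names escaped and policy_doc are illustrative, and the command is only printed, not run.

```shell
# Trust policy exactly as escaped in the example above.
escaped=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}

# The same document in single quotes, with no backslashes needed.
policy_doc='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["arn:aws:iam:::user/TESTER"]},"Action":["sts:AssumeRole"]}]}'

if [ "$escaped" = "$policy_doc" ]; then
    echo "both forms pass the same policy document"
fi

# Print the resulting command line instead of running it.
printf "radosgw-admin role create --role-name=S3Access1 --assume-role-policy-doc='%s'\n" "$policy_doc"
```

Single quotes work here because the policy document itself contains no single quote characters.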
9.5.2. Getting a role
Get the information about a role with the get command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Get the information about the role:
Syntax
radosgw-admin role get --role-name=ROLE_NAME
Example
[root@host01 ~]# radosgw-admin role get --role-name=S3Access1
{ "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec", "RoleName": "S3Access1", "Path": "/application_abc/component_xyz/", "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1", "CreateDate": "2022-06-17T10:18:29.116Z", "MaxSessionDuration": 3600, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}" }
Additional Resources
- See the Creating a role section in the Red Hat Ceph Storage Object Gateway Guide for details.
9.5.3. Listing a role
You can list the roles in a specific path with the list command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
List the roles:
Syntax
radosgw-admin role list
Example
[root@host01 ~]# radosgw-admin role list
[ { "RoleId": "85fb46dd-a88a-4233-96f5-4fb54f4353f7", "RoleName": "kvm-sts", "Path": "/application_abc/component_xyz/", "Arn": "arn:aws:iam:::role/application_abc/component_xyz/kvm-sts", "CreateDate": "2022-09-13T11:55:09.39Z", "MaxSessionDuration": 7200, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}" }, { "RoleId": "9116218d-4e85-4413-b28d-cdfafba24794", "RoleName": "kvm-sts-1", "Path": "/application_abc/component_xyz/", "Arn": "arn:aws:iam:::role/application_abc/component_xyz/kvm-sts-1", "CreateDate": "2022-09-16T00:05:57.483Z", "MaxSessionDuration": 3600, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}" } ]
9.5.4. Updating assume role policy document of a role
You can update the assume role policy document that grants an entity permission to assume the role with the modify command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Modify the assume role policy document of a role:
Syntax
radosgw-admin role-trust-policy modify --role-name=ROLE_NAME --assume-role-policy-doc=TRUST_RELATIONSHIP_POLICY_DOCUMENT
Example
[root@host01 ~]# radosgw-admin role-trust-policy modify --role-name=S3Access1 --assume-role-policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Principal\":\{\"AWS\":\[\"arn:aws:iam:::user/TESTER\"\]\},\"Action\":\[\"sts:AssumeRole\"\]\}\]\}
{ "RoleId": "ca43045c-082c-491a-8af1-2eebca13deec", "RoleName": "S3Access1", "Path": "/application_abc/component_xyz/", "Arn": "arn:aws:iam:::role/application_abc/component_xyz/S3Access1", "CreateDate": "2022-06-17T10:18:29.116Z", "MaxSessionDuration": 3600, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/TESTER\"]},\"Action\":[\"sts:AssumeRole\"]}]}" }
9.5.5. Getting permission policy attached to a role
You can get the specific permission policy attached to a role with the get command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Get the permission policy:
Syntax
radosgw-admin role-policy get --role-name=ROLE_NAME --policy-name=POLICY_NAME
Example
[root@host01 ~]# radosgw-admin role-policy get --role-name=S3Access1 --policy-name=Policy1
{ "Permission policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":\"arn:aws:s3:::example_bucket\"}]}" }
9.5.6. Deleting a role
You can delete the role only after removing the permission policy attached to it.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- A role created.
- An S3 bucket created.
- An S3 user created with user access.
Procedure
Delete the policy attached to the role:
Syntax
radosgw-admin role policy delete --role-name=ROLE_NAME --policy-name=POLICY_NAME
Example
[root@host01 ~]# radosgw-admin role policy delete --role-name=S3Access1 --policy-name=Policy1
Delete the role:
Syntax
radosgw-admin role delete --role-name=ROLE_NAME
Example
[root@host01 ~]# radosgw-admin role delete --role-name=S3Access1
9.5.7. Updating a policy attached to a role
You can either add or update the inline policy attached to a role with the put command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Update the inline policy:
Syntax
radosgw-admin role-policy put --role-name=ROLE_NAME --policy-name=POLICY_NAME --policy-doc=PERMISSION_POLICY_DOCUMENT
Example
[root@host01 ~]# radosgw-admin role-policy put --role-name=S3Access1 --policy-name=Policy1 --policy-doc=\{\"Version\":\"2012-10-17\",\"Statement\":\[\{\"Effect\":\"Allow\",\"Action\":\[\"s3:*\"\],\"Resource\":\"arn:aws:s3:::example_bucket\"\}\]\}
In this example, you attach Policy1 to the role S3Access1, which allows all S3 actions on example_bucket.
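Another way to avoid escaping every character of a permission policy is to keep the JSON in a file and read it into a variable. The following sketch uses a temporary file and a quoted here-document; the file handling and the variable names policy_file and policy_doc are assumptions for illustration, and the option is printed rather than passed to radosgw-admin.

```shell
# Write the permission policy from the example above to a temporary file.
# The quoted 'EOF' delimiter prevents any expansion inside the document.
policy_file=$(mktemp)
cat > "$policy_file" <<'EOF'
{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Action":["s3:*"],"Resource":"arn:aws:s3:::example_bucket"}]}
EOF

# Read the document back; double quotes keep it a single argument.
policy_doc=$(cat "$policy_file")
rm -f "$policy_file"

# Show the option as it would be passed on the command line.
printf '%s%s\n' '--policy-doc=' "$policy_doc"
```

Keeping the variable in double quotes when it is used also protects the * in s3:* from shell filename expansion.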
9.5.8. Listing permission policy attached to a role
You can list the names of the permission policies attached to a role with the list command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
List the names of the permission policies:
Syntax
radosgw-admin role-policy list --role-name=ROLE_NAME
Example
[root@host01 ~]# radosgw-admin role-policy list --role-name=S3Access1
[ "Policy1" ]
9.5.9. Deleting policy attached to a role
You can delete the permission policy attached to a role with the delete command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Delete the permission policy:
Syntax
radosgw-admin role policy delete --role-name=ROLE_NAME --policy-name=POLICY_NAME
Example
[root@host01 ~]# radosgw-admin role policy delete --role-name=S3Access1 --policy-name=Policy1
9.5.10. Updating the session duration of a role
You can update the session duration of a role with the update command to control the length of time that a user can be signed into the account with the provided credentials.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- A role created.
- An S3 user created with user access.
Procedure
Update the max-session-duration using the update command:
Syntax
radosgw-admin role update --role-name=ROLE_NAME --max-session-duration=MAX_SESSION_DURATION
Example
[root@node1 ~]# radosgw-admin role update --role-name=test-sts-role --max-session-duration=7200
Verification
List the roles to verify the updates:
Example
[root@node1 ~]# radosgw-admin role list
[ { "RoleId": "d4caf33f-caba-42f3-8bd4-48c84b4ea4d3", "RoleName": "test-sts-role", "Path": "/", "Arn": "arn:aws:iam:::role/test-role", "CreateDate": "2022-09-07T20:01:15.563Z", "MaxSessionDuration": 7200, "AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/kvm\"]},\"Action\":[\"sts:AssumeRole\"]}]}" } ]
Additional Resources
- See the REST APIs for manipulating a role section in the Red Hat Ceph Storage Developer Guide for details.
9.6. Quota management
The Ceph Object Gateway enables you to set quotas on users and buckets owned by users. Quotas include the maximum number of objects in a bucket and the maximum storage size in bytes.
- Bucket: The --bucket option allows you to specify a quota for buckets the user owns.
- Maximum Objects: The --max-objects setting allows you to specify the maximum number of objects. A negative value disables this setting.
- Maximum Size: The --max-size option allows you to specify a quota for the maximum number of bytes. A negative value disables this setting.
- Quota Scope: The --quota-scope option sets the scope for the quota. The options are bucket and user. Bucket quotas apply to buckets a user owns. User quotas apply to a user.
Buckets with a large number of objects can cause serious performance issues. The recommended maximum number of objects in one bucket is 100,000. To increase this number, configure bucket index sharding. See Configure bucket index resharding for details.
9.6.1. Set user quotas
Before you enable a quota, you must first set the quota parameters.
Syntax
radosgw-admin quota set --quota-scope=user --uid=USER_ID [--max-objects=NUMBER_OF_OBJECTS] [--max-size=MAXIMUM_SIZE_IN_BYTES]
Example
[root@host01 ~]# radosgw-admin quota set --quota-scope=user --uid=johndoe --max-objects=1024 --max-size=1024
A negative value for --max-objects, --max-size, or both means that the specific quota attribute check is disabled.
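Because the syntax above takes the maximum size in bytes, the value 1024 in the example corresponds to only 1 KiB. A small conversion helper can make larger limits less error prone. The sketch below is illustrative: mib_to_bytes is a hypothetical function, and the command is printed rather than executed.

```shell
# Hypothetical helper: convert a size in MiB to the byte count
# expected by --max-size (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

max_size=$(mib_to_bytes 10)    # 10 MiB expressed in bytes

# Print the resulting command line instead of running it.
echo "radosgw-admin quota set --quota-scope=user --uid=johndoe --max-size=$max_size"
```

For 10 MiB the helper yields 10485760, which is the value that ends up in the printed --max-size option.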
9.6.2. Enable and disable user quotas
Once you set a user quota, you can enable it.
Syntax
radosgw-admin quota enable --quota-scope=user --uid=USER_ID
You may disable an enabled user quota.
Syntax
radosgw-admin quota disable --quota-scope=user --uid=USER_ID
9.6.3. Set bucket quotas
Bucket quotas apply to the buckets owned by the specified uid. They are independent of the user.
Syntax
radosgw-admin quota set --uid=USER_ID --quota-scope=bucket --bucket=BUCKET_NAME [--max-objects=NUMBER_OF_OBJECTS] [--max-size=MAXIMUM_SIZE_IN_BYTES]
A negative value for NUMBER_OF_OBJECTS, MAXIMUM_SIZE_IN_BYTES, or both means that the specific quota attribute check is disabled.
9.6.4. Enable and disable bucket quotas
Once you set a bucket quota, you may enable it.
Syntax
radosgw-admin quota enable --quota-scope=bucket --uid=USER_ID
You may disable an enabled bucket quota.
Syntax
radosgw-admin quota disable --quota-scope=bucket --uid=USER_ID
9.6.5. Get quota settings
You may access each user’s quota settings via the user information API. To read user quota setting information with the CLI, run the following command:
Syntax
radosgw-admin user info --uid=USER_ID
To get quota settings for a tenanted user, specify the user ID and the name of the tenant:
Syntax
radosgw-admin user info --uid=USER_ID --tenant=TENANT
9.6.6. Update quota stats
Quota stats get updated asynchronously. You can update quota statistics for all users and all buckets manually to retrieve the latest quota stats.
Syntax
radosgw-admin user stats --uid=USER_ID --sync-stats
9.6.7. Get user quota usage stats
To see how much of the quota a user has consumed, run the following command:
Syntax
radosgw-admin user stats --uid=USER_ID
You should run the radosgw-admin user stats command with the --sync-stats option to receive the latest data.
9.6.8. Quota cache
Quota statistics are cached for each Ceph Object Gateway instance. If there are multiple instances, then the cache can keep quotas from being perfectly enforced, as each instance will have a different view of the quotas. The options that control this are rgw bucket quota ttl, rgw user quota bucket sync interval, and rgw user quota sync interval. The higher these values are, the more efficient quota operations are, but the more out-of-sync multiple instances will be. The lower these values are, the closer to perfect enforcement multiple instances will achieve. If all three are 0, then quota caching is effectively disabled, and multiple instances will have perfect quota enforcement.
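One possible way to apply the same value to all three options is the ceph config set command. The sketch below only prints the commands so they can be reviewed first. It assumes the underscore spelling of the option names and uses client.rgw as the configuration section, which is an assumption that may not match your deployment; quota_cache_cmds is a hypothetical helper.

```shell
# Hypothetical helper: print (do not run) one "ceph config set" command
# per quota cache option, all with the same value.
# "client.rgw" is an assumed configuration section; adjust it as needed.
quota_cache_cmds() {
    value=$1
    for opt in rgw_bucket_quota_ttl \
               rgw_user_quota_bucket_sync_interval \
               rgw_user_quota_sync_interval; do
        printf 'ceph config set client.rgw %s %s\n' "$opt" "$value"
    done
}

# Value 0 corresponds to effectively disabling the quota cache.
quota_cache_cmds 0
```

Printing the commands first keeps the sketch safe to run on any host and leaves the decision to apply them to the administrator.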
9.6.9. Reading and writing global quotas
You can read and write quota settings in a zonegroup map. To get a zonegroup map:
[root@host01 ~]# radosgw-admin global quota get
The global quota settings can be manipulated with the global quota counterparts of the quota set, quota enable, and quota disable commands, for example:
[root@host01 ~]# radosgw-admin global quota set --quota-scope bucket --max-objects 1024
[root@host01 ~]# radosgw-admin global quota enable --quota-scope bucket
In a multi-site configuration, where there is a realm and period present, changes to the global quotas must be committed using period update --commit. If there is no period present, the Ceph Object Gateways must be restarted for the changes to take effect.
9.7. Bucket management
As a storage administrator, when using the Ceph Object Gateway you can manage buckets by moving them between users and renaming them. You can create bucket notifications to trigger on specific events. Also, you can find orphan or leaky objects within the Ceph Object Gateway that can occur over the lifetime of a storage cluster.
When millions of objects are uploaded to a Ceph Object Gateway bucket with a high ingest rate, the radosgw-admin bucket stats command can report an incorrect num_objects value. Running the radosgw-admin bucket list command corrects the value of the num_objects parameter.
In a multi-site cluster, deletion of a bucket from the secondary site does not sync the metadata changes with the primary site. Hence, Red Hat recommends deleting a bucket only from the primary site and not from the secondary site.
9.7.1. Renaming buckets
You can rename buckets. If you want to allow underscores in bucket names, then set the rgw_relaxed_s3_bucket_names option to true.
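Before renaming, it can help to check whether the new name contains an underscore and therefore depends on the relaxed naming option. The following is a minimal sketch; needs_relaxed_names is a hypothetical helper and my_bucket is a made-up name, and the check covers only the underscore case mentioned above.

```shell
# Hypothetical helper: succeed (exit status 0) if the bucket name
# contains an underscore.
needs_relaxed_names() {
    case "$1" in
        *_*) return 0 ;;
        *)   return 1 ;;
    esac
}

for name in s3bucket1 my_bucket; do
    if needs_relaxed_names "$name"; then
        echo "$name: contains an underscore, needs rgw_relaxed_s3_bucket_names=true"
    else
        echo "$name: no underscore"
    fi
done
```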
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway software.
- An existing bucket.
Procedure
List the buckets:
Example
[ceph: root@host01 /]# radosgw-admin bucket list
[ "34150b2e9174475db8e191c188e920f6/swcontainer", "s3bucket1", "34150b2e9174475db8e191c188e920f6/swimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/ec2container", "c278edd68cfb4705bb3e07837c7ad1a8/demoten1", "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct", "c278edd68cfb4705bb3e07837c7ad1a8/demopostup", "34150b2e9174475db8e191c188e920f6/postimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/demoten2", "c278edd68cfb4705bb3e07837c7ad1a8/postupsw" ]
Rename the bucket:
Syntax
radosgw-admin bucket link --bucket=ORIGINAL_NAME --bucket-new-name=NEW_NAME --uid=USER_ID
Example
[ceph: root@host01 /]# radosgw-admin bucket link --bucket=s3bucket1 --bucket-new-name=s3newb --uid=testuser
If the bucket is inside a tenant, specify the tenant as well:
Syntax
radosgw-admin bucket link --bucket=TENANT/ORIGINAL_NAME --bucket-new-name=NEW_NAME --uid='TENANT$USER_ID'
Example
[ceph: root@host01 /]# radosgw-admin bucket link --bucket=test/s3bucket1 --bucket-new-name=s3newb --uid='test$testuser'
Verify the bucket was renamed:
Example
[ceph: root@host01 /]# radosgw-admin bucket list
[ "34150b2e9174475db8e191c188e920f6/swcontainer", "34150b2e9174475db8e191c188e920f6/swimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/ec2container", "s3newb", "c278edd68cfb4705bb3e07837c7ad1a8/demoten1", "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct", "c278edd68cfb4705bb3e07837c7ad1a8/demopostup", "34150b2e9174475db8e191c188e920f6/postimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/demoten2", "c278edd68cfb4705bb3e07837c7ad1a8/postupsw" ]
9.7.2. Removing buckets
Remove buckets from a Red Hat Ceph Storage cluster with the Ceph Object Gateway configuration.
When the bucket does not have objects, you can run the radosgw-admin bucket rm command. If there are objects in the bucket, then you can use the --purge-objects option.
For a multi-site configuration, Red Hat recommends deleting the buckets from the primary site.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway software.
- An existing bucket.
Procedure
List the buckets.
Example
[ceph: root@host01 /]# radosgw-admin bucket list [ "34150b2e9174475db8e191c188e920f6/swcontainer", "s3bucket1", "34150b2e9174475db8e191c188e920f6/swimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/ec2container", "c278edd68cfb4705bb3e07837c7ad1a8/demoten1", "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct", "c278edd68cfb4705bb3e07837c7ad1a8/demopostup", "34150b2e9174475db8e191c188e920f6/postimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/demoten2", "c278edd68cfb4705bb3e07837c7ad1a8/postupsw" ]
Remove the bucket.
Syntax
radosgw-admin bucket rm --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin bucket rm --bucket=s3bucket1
If the bucket has objects, then run the following command:
Syntax
radosgw-admin bucket rm --bucket=BUCKET_NAME --purge-objects --bypass-gc
Example
[ceph: root@host01 /]# radosgw-admin bucket rm --bucket=s3bucket1 --purge-objects --bypass-gc
The --purge-objects option purges the objects, and the --bypass-gc option deletes the objects without using the garbage collector, which makes the process more efficient.
Verify the bucket was removed.
Example
[ceph: root@host01 /]# radosgw-admin bucket list [ "34150b2e9174475db8e191c188e920f6/swcontainer", "34150b2e9174475db8e191c188e920f6/swimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/ec2container", "c278edd68cfb4705bb3e07837c7ad1a8/demoten1", "c278edd68cfb4705bb3e07837c7ad1a8/demo-ct", "c278edd68cfb4705bb3e07837c7ad1a8/demopostup", "34150b2e9174475db8e191c188e920f6/postimpfalse", "c278edd68cfb4705bb3e07837c7ad1a8/demoten2", "c278edd68cfb4705bb3e07837c7ad1a8/postupsw" ]
9.7.3. Moving buckets
The radosgw-admin bucket
utility provides the ability to move buckets between users. To do so, link the bucket to a new user and change the ownership of the bucket to the new user.
You can move buckets between non-tenanted users, between tenanted users, and from non-tenanted users to tenanted users.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Ceph Object Gateway is installed.
- An S3 bucket.
- Various tenanted and non-tenanted users.
9.7.3.1. Moving buckets between non-tenanted users
The radosgw-admin bucket chown
command provides the ability to change the ownership of buckets and all objects they contain from one user to another. To do so, unlink a bucket from the current user, link it to a new user, and change the ownership of the bucket to the new user.
Procedure
Link the bucket to a new user:
Syntax
radosgw-admin bucket link --uid=USER --bucket=BUCKET
Example
[ceph: root@host01 /]# radosgw-admin bucket link --uid=user2 --bucket=data
Verify that the bucket has been linked to user2 successfully:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --uid=user2 [ "data" ]
Change the ownership of the bucket to the new user:
Syntax
radosgw-admin bucket chown --uid=USER --bucket=BUCKET
Example
[ceph: root@host01 /]# radosgw-admin bucket chown --uid=user2 --bucket=data
Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --bucket=data
9.7.3.2. Moving buckets between tenanted users
You can move buckets between one tenanted user and another.
Procedure
Link the bucket to a new user:
Syntax
radosgw-admin bucket link --bucket=CURRENT_TENANT/BUCKET --uid='NEW_TENANT$USER'
Example
[ceph: root@host01 /]# radosgw-admin bucket link --bucket=test/data --uid='test2$user2'
Verify that the bucket has been linked to user2 successfully:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --uid='test2$user2' [ "data" ]
Change the ownership of the bucket to the new user:
Syntax
radosgw-admin bucket chown --bucket=NEW_TENANT/BUCKET --uid='NEW_TENANT$USER'
Example
[ceph: root@host01 /]# radosgw-admin bucket chown --bucket='test2/data' --uid='test2$user2'
Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --bucket=test2/data
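Because the user ID contains a $ character, an unquoted TENANT$USER value is subject to shell variable expansion. The following Python sketch shows one way to assemble the link and chown commands with safe quoting. It assumes Python 3.8 or later for shlex.join, and the function name move_bucket_commands is hypothetical; it only builds the command strings and does not run them.

```python
import shlex


def move_bucket_commands(bucket, current_tenant, new_tenant, new_user):
    """Build the link and chown command lines for a tenanted bucket move."""
    uid = f"{new_tenant}${new_user}"
    link = [
        "radosgw-admin", "bucket", "link",
        f"--bucket={current_tenant}/{bucket}", f"--uid={uid}",
    ]
    chown = [
        "radosgw-admin", "bucket", "chown",
        f"--bucket={new_tenant}/{bucket}", f"--uid={uid}",
    ]
    # shlex.join quotes any argument that the shell would otherwise interpret,
    # which here is the argument containing '$'.
    return [shlex.join(link), shlex.join(chown)]


for command in move_bucket_commands("data", "test", "test2", "user2"):
    print(command)
```

Building the argument list first and quoting it in one place keeps the tenant and user values identical across both steps.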
9.7.3.3. Moving buckets from non-tenanted users to tenanted users
You can move buckets from a non-tenanted user to a tenanted user.
Procedure
Optional: If you do not already have multiple tenants, you can create them by enabling rgw_keystone_implicit_tenants and accessing the Ceph Object Gateway from an external tenant:
Enable the rgw_keystone_implicit_tenants option:
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_keystone_implicit_tenants true
Access the Ceph Object Gateway from an external tenant using either the s3cmd or swift command:
Example
[ceph: root@host01 /]# swift list
Or use s3cmd:
Example
[ceph: root@host01 /]# s3cmd ls
The first access from an external tenant creates an equivalent Ceph Object Gateway user.
Move a bucket to a tenanted user:
Syntax
radosgw-admin bucket link --bucket=/BUCKET --uid='TENANT$USER'
Example
[ceph: root@host01 /]# radosgw-admin bucket link --bucket=/data --uid='test$tenanted-user'
Verify that the data bucket has been linked to tenanted-user successfully:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --uid='test$tenanted-user' [ "data" ]
Change the ownership of the bucket to the new user:
Syntax
radosgw-admin bucket chown --bucket='TENANT/BUCKET' --uid='TENANT$USER'
Example
[ceph: root@host01 /]# radosgw-admin bucket chown --bucket='test/data' --uid='test$tenanted-user'
Verify that the ownership of the data bucket has been successfully changed by checking the owner line in the output of the following command:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --bucket=test/data
9.7.3.4. Finding orphan and leaky objects
A healthy storage cluster does not have any orphan or leaky objects, but in some cases orphan or leaky objects can occur.
An orphan object exists in a storage cluster and has an object ID associated with the RADOS object. However, the bucket index contains no reference that ties the RADOS object to an S3 object. For example, if the Ceph Object Gateway goes down in the middle of an operation, some objects can become orphans. An undiscovered bug can also cause orphan objects to occur.
You can see how the Ceph Object Gateway objects map to the RADOS objects. The radosgw-admin
command provides a tool to search for and produce a list of these potential orphan or leaky objects. Using the radoslist
subcommand displays objects stored within buckets, or all buckets in the storage cluster. The rgw-orphan-list
script displays orphan objects within a pool.
The radoslist subcommand replaces the deprecated orphans find and orphans finish subcommands.
Do not use this command where indexless buckets are in use, because all of their objects appear as orphaned.
An alternative way to identify orphaned objects is to run the rados -p <pool> ls | grep BUCKET_ID command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A running Ceph Object Gateway.
Procedure
Generate a list of objects that hold data within a bucket.
Syntax
radosgw-admin bucket radoslist --bucket BUCKET_NAME
Example
[root@host01 ~]# radosgw-admin bucket radoslist --bucket mybucket
Note: If the BUCKET_NAME is omitted, then all objects in all buckets are displayed.
Check the version of rgw-orphan-list.
Example
[root@host01 ~]# head /usr/bin/rgw-orphan-list
The version should be 2023-01-11 or newer.
Create a directory where you want to generate the list of orphans.
Example
[root@host01 ~]# mkdir orphans
Navigate to the directory created earlier.
Example
[root@host01 ~]# cd orphans
From the pool list, select the pool in which you want to find orphans. This script might run for a long time depending on the objects in the cluster.
Example
[root@host01 orphans]# rgw-orphan-list
Example
Available pools: .rgw.root default.rgw.control default.rgw.meta default.rgw.log default.rgw.buckets.index default.rgw.buckets.data rbd default.rgw.buckets.non-ec ma.rgw.control ma.rgw.meta ma.rgw.log ma.rgw.buckets.index ma.rgw.buckets.data ma.rgw.buckets.non-ec Which pool do you want to search for orphans?
Enter the pool name to search for orphans.
Important: A data pool must be specified when using the rgw-orphan-list command, and not a metadata pool.
View the details of the rgw-orphan-list tool usage.
Syntax
rgw-orphan-list -h
rgw-orphan-list POOL_NAME /DIRECTORY
Example
[root@host01 orphans]# rgw-orphan-list default.rgw.buckets.data /orphans 2023-09-12 08:41:14 ceph-host01 Computing delta... 2023-09-12 08:41:14 ceph-host01 Computing results... 10 potential orphans found out of a possible 2412 (0%). <<<<<<< orphans detected The results can be found in './orphan-list-20230912124113.out'. Intermediate files are './rados-20230912124113.intermediate' and './radosgw-admin-20230912124113.intermediate'. *** *** WARNING: This is EXPERIMENTAL code and the results should be used *** only with CAUTION! *** Done at 2023-09-12 08:41:14.
Run the ls -l command and verify that the files ending with error are zero length, which indicates that the script ran without any issues.
Example
[root@host01 orphans]# ls -l -rw-r--r--. 1 root root 770 Sep 12 03:59 orphan-list-20230912075939.out -rw-r--r--. 1 root root 0 Sep 12 03:59 rados-20230912075939.error -rw-r--r--. 1 root root 248508 Sep 12 03:59 rados-20230912075939.intermediate -rw-r--r--. 1 root root 0 Sep 12 03:59 rados-20230912075939.issues -rw-r--r--. 1 root root 0 Sep 12 03:59 radosgw-admin-20230912075939.error -rw-r--r--. 1 root root 247738 Sep 12 03:59 radosgw-admin-20230912075939.intermediate
Review the orphan objects listed.
Example
[root@host01 orphans]# cat ./orphan-list-20230912124113.out a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.0 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.1 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.2 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.3 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.4 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.5 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.6 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.7 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.8 a9c042bc-be24-412c-9052-dda6b2f01f55.16749.1_key1.cherylf.433-bucky-4865-0.9
Remove orphan objects:
Syntax
rados -p POOL_NAME rm OBJECT_NAME
Example
[root@host01 orphans]# rados -p default.rgw.buckets.data rm myobject
Warning: Verify you are removing the correct objects. Running the rados rm command removes data from the storage cluster.
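The checks in this procedure can be scripted before any object is removed. The following Python sketch assumes the file naming shown in the ls -l example (files ending in .error, and an orphan-list-*.out file with one object name per line). The function names are hypothetical, and the sketch only reads files; it does not call rados rm.

```python
from pathlib import Path


def nonempty_error_files(directory):
    """Return the names of *.error files that are not zero length."""
    return sorted(
        path.name
        for path in Path(directory).glob("*.error")
        if path.stat().st_size > 0
    )


def read_orphan_list(path):
    """Return the object names listed in an orphan-list-*.out file."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]
```

If nonempty_error_files returns an empty list, the script ran without reporting errors, and the names returned by read_orphan_list can be reviewed one by one before removal.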
9.7.3.5. Managing bucket index entries
You can manage the bucket index entries of the Ceph Object Gateway in a Red Hat Ceph Storage cluster using the radosgw-admin bucket check
sub-command.
Each bucket index entry related to a piece of a multipart upload object is matched against its corresponding .meta
index entry. There should be one .meta
entry for all the pieces of a given multipart upload. If it fails to find a corresponding .meta
entry for a piece, it lists out the "orphaned" piece entries in a section of the output.
The stats for the bucket are stored in the bucket index headers. This phase loads those headers and also iterates through all the plain object entries in the bucket index and recalculates the stats. It then displays the actual and calculated stats in sections labeled "existing_header" and "calculated_header" respectively, so they can be compared.
If you use the --fix
option with the bucket check
sub-command, it removes the "orphaned" entries from the bucket index and also overwrites the existing stats in the header with those that it calculated. It causes all entries, including the multiple entries used in versioning, to be listed in the output.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A running Ceph Object Gateway.
- A newly created bucket.
Procedure
Check the bucket index of a specific bucket:
Syntax
radosgw-admin bucket check --bucket=BUCKET_NAME
Example
[root@rgw ~]# radosgw-admin bucket check --bucket=mybucket
Fix the inconsistencies in the bucket index, including removal of orphaned objects:
Syntax
radosgw-admin bucket check --fix --bucket=BUCKET_NAME
Example
[root@rgw ~]# radosgw-admin bucket check --fix --bucket=mybucket
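Comparing the "existing_header" and "calculated_header" sections by eye is error prone for large buckets. The following Python sketch compares them programmatically. It assumes that the output of radosgw-admin bucket check has been captured as a JSON string and that both sections appear somewhere in it. The exact nesting and the sample values below are assumptions, so the sketch searches for the keys rather than relying on a fixed path.

```python
import json


def find_key(node, key):
    """Depth-first search for the first value stored under key."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def headers_match(report_json):
    """Return True when the stored and recalculated stats are identical."""
    report = json.loads(report_json)
    existing = find_key(report, "existing_header")
    calculated = find_key(report, "calculated_header")
    if existing is None or calculated is None:
        raise ValueError("report does not contain both header sections")
    return existing == calculated


# Hypothetical report in which the stored object count is stale.
sample = json.dumps({
    "check_result": {
        "existing_header": {"usage": {"rgw.main": {"num_objects": 3}}},
        "calculated_header": {"usage": {"rgw.main": {"num_objects": 2}}},
    }
})
print(headers_match(sample))
```

A mismatch suggests that running the check with the --fix option would rewrite the stats in the header.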
9.7.3.6. Bucket notifications
Bucket notifications provide a way to send information out of the Ceph Object Gateway when certain events happen in the bucket. Bucket notifications can be sent to HTTP, AMQP 0.9.1, and Kafka endpoints. A notification entry must be created to send bucket notifications for events on a specific bucket and to a specific topic. A bucket notification can be created on a subset of event types or, by default, for all event types. The bucket notification can filter out events based on key prefix or suffix, a regular expression matching the keys, the metadata attributes attached to the object, or the object tags. Bucket notifications have a REST API that provides configuration and control interfaces for the bucket notification mechanism.
The bucket notifications API is enabled by default. If rgw_enable_apis
configuration parameter is explicitly set, ensure that s3
, and notifications
are included. To verify this, run the ceph --admin-daemon /var/run/ceph/ceph-client.rgw.NAME.asok config get rgw_enable_apis
command. Replace NAME with the Ceph Object Gateway instance name.
Topic management using CLI
You can list, get, and remove topics for the Ceph Object Gateway buckets:
List topics: Run the following command to list the configuration of all topics:
Example
[ceph: root@host01 /]# radosgw-admin topic list
Get topics: Run the following command to get the configuration of a specific topic:
Example
[ceph: root@host01 /]# radosgw-admin topic get --topic=topic1
Remove topics: Run the following command to remove the configuration of a specific topic:
Example
[ceph: root@host01 /]# radosgw-admin topic rm --topic=topic1
Note: The topic is removed even if a Ceph Object Gateway bucket is still configured to use that topic.
9.7.3.7. Creating bucket notifications
Create bucket notifications at the bucket level. The notification configuration has the Red Hat Ceph Storage Object Gateway S3 events, ObjectCreated
, ObjectRemoved
, and ObjectLifecycle:Expiration
. These need to be published with the destination to send the bucket notifications. Bucket notifications are S3 operations.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A running HTTP server, RabbitMQ server, or a Kafka server.
- Root-level access.
- Installation of the Red Hat Ceph Storage Object Gateway.
- User access key and secret key.
- Endpoint parameters.
Red Hat supports ObjectCreate
events, such as put
, post
, multipartUpload
, and copy
. Red Hat also supports ObjectRemove
events, such as object_delete
and s3_multi_object_delete
.
Procedure
- Create an S3 bucket.
- Create an SNS topic for the http, amqp, or kafka protocol.
- Create an S3 bucket notification for s3:ObjectCreated, s3:ObjectRemoved, and s3:ObjectLifecycle:Expiration events:
Example
client.put_bucket_notification_configuration( Bucket=bucket_name, NotificationConfiguration={ 'TopicConfigurations': [ { 'Id': notification_name, 'TopicArn': topic_arn, 'Events': ['s3:ObjectCreated:*', 's3:ObjectRemoved:*', 's3:ObjectLifecycle:Expiration:*'] }]})
- Create S3 objects in the bucket.
- Verify the object creation events at the http, rabbitmq, or kafka receiver.
- Delete the objects.
- Verify the object deletion events at the http, rabbitmq, or kafka receiver.
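The notification configuration passed to put_bucket_notification_configuration is a plain dictionary, so it can be built and inspected before any request is sent. The following Python sketch assembles that dictionary, including the optional key prefix and suffix filters mentioned in the bucket notifications overview. The function name, the topic ARN, and the notification ID are hypothetical placeholders, and the filter layout follows the S3 notification API as an assumption about what the gateway accepts.

```python
def notification_configuration(notification_id, topic_arn, events,
                               prefix=None, suffix=None):
    """Build the NotificationConfiguration argument for one topic."""
    topic = {
        "Id": notification_id,
        "TopicArn": topic_arn,
        "Events": list(events),
    }
    rules = []
    if prefix is not None:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix is not None:
        rules.append({"Name": "suffix", "Value": suffix})
    if rules:
        # Key filters restrict the notification to matching object names.
        topic["Filter"] = {"Key": {"FilterRules": rules}}
    return {"TopicConfigurations": [topic]}


config = notification_configuration(
    "notif1",                       # hypothetical notification ID
    "arn:aws:sns:default::topic1",  # hypothetical topic ARN
    ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
    prefix="images/",
)
print(config)
```

The resulting dictionary can then be passed as the NotificationConfiguration argument in the example above.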
9.7.4. S3 bucket replication API
The S3 bucket replication API is implemented, and allows users to create replication rules between different buckets. Note though that while the AWS replication feature allows bucket replication within the same zone, Ceph Object Gateway does not allow it at the moment. However, the Ceph Object Gateway API also added a Zone
array that allows users to select to what zones the specific bucket will be synced.
9.7.4.1. Creating S3 bucket replication
Create a replication configuration for a bucket or replace an existing one.
A replication configuration must include at least one rule. Each rule identifies a subset of objects to replicate by filtering the objects in the source bucket.
Prerequisites
- A running Red Hat Ceph Storage cluster with multi-site Ceph object gateway configured. For more information on creating multi-site sync policies, see Creating a sync policy group.
- Zone group level policy created. For more information on creating zone group policies, see Bucket granular sync policies.
Procedure
Create a replication configuration file that contains the details of replication:
Syntax
{ "Role": "arn:aws:iam::account-id:role/role-name", "Rules": [ { "ID": "String", "Status": "Enabled", "Priority": 1, "DeleteMarkerReplication": { "Status": "Enabled"|"Disabled" }, "Destination": { "Bucket": "BUCKET_NAME" } } ] }
Example
[root@host01 ~]# cat replication.json { "Role": "arn:aws:iam::account-id:role/role-name", "Rules": [ { "ID": "pipe-bkt", "Status": "Enabled", "Priority": 1, "DeleteMarkerReplication": { "Status": "Disabled" }, "Destination": { "Bucket": "testbucket" } } ] }
Create the S3 API put bucket replication:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL s3api put-bucket-replication --bucket BUCKET_NAME --replication-configuration file://REPLICATION_CONFIGURATION_FILE.json
Example
[root@host01 ~]# aws --endpoint-url=http://host01:80 s3api put-bucket-replication --bucket testbucket --replication-configuration file://replication.json
Verification
Verify the sync policy by using the sync policy get command.
Syntax
radosgw-admin sync policy get --bucket BUCKET_NAME
Note: When a replication policy is applied, the rules are converted to sync-policy rules, known as pipes, and are categorized as enabled and disabled.
- Enabled: These pipes are enabled and active, and the group status is set to 'rgw_sync_policy_group:STATUS'. For example, s3-bucket-replication:enabled.
- Disabled: The pipes under this set are not active, and the group status is set to 'rgw_sync_policy_group:STATUS'. For example, s3-bucket-replication:disabled.
Because multiple rules can be configured as part of a replication policy, it has two separate groups (one with 'enabled' and another with 'allowed' state) for accurate mapping of each group.
Example
[ceph: root@host01 /]# radosgw-admin sync policy get --bucket testbucket { "groups": [ { "id": "s3-bucket-replication:disabled", "data_flow": {}, "pipes": [], "status": "allowed" }, { "id": "s3-bucket-replication:enabled", "data_flow": {}, "pipes": [ { "id": "", "source": { "bucket": "*", "zones": [ "*" ] }, "dest": { "bucket": "testbucket", "zones": [ "*" ] }, "params": { "source": {}, "dest": {}, "priority": 1, "mode": "user", "user": "s3cmd" } } ], "status": "enabled" } ] }
Additional Resources
- See the Using multi-site sync policies section in the Red Hat Ceph Storage Object Gateway Guide for details.
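The replication configuration file can be generated rather than written by hand, which avoids JSON syntax mistakes. The following Python sketch produces a document with the same fields as the replication.json example. The function name and its default values are hypothetical, and the role string is passed through unchanged.

```python
import json


def replication_configuration(rule_id, destination_bucket, priority=1,
                              replicate_delete_markers=False, role=""):
    """Build a single-rule replication configuration."""
    marker_status = "Enabled" if replicate_delete_markers else "Disabled"
    return {
        "Role": role,
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Priority": priority,
                "DeleteMarkerReplication": {"Status": marker_status},
                "Destination": {"Bucket": destination_bucket},
            }
        ],
    }


document = json.dumps(
    replication_configuration("pipe-bkt", "testbucket"), indent=2
)
print(document)
```

Writing the printed document to replication.json gives a file that can be passed to the --replication-configuration option.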
9.7.4.2. Getting S3 bucket replication
You can retrieve the replication configuration of the bucket.
Prerequisites
- A running Red Hat Ceph Storage cluster with multi-site Ceph object gateway configured. For more information on creating multi-site sync policies, see Creating a sync policy group.
- Zone group level policy created. For more information on creating zone group policies, see Bucket granular sync policies.
- An S3 bucket replication created. For more information, see S3 bucket replication API.
Procedure
Get the S3 bucket replication configuration:
Syntax
aws s3api get-bucket-replication --bucket BUCKET_NAME --endpoint-url=RADOSGW_ENDPOINT_URL
Example
[root@host01 ~]# aws s3api get-bucket-replication --bucket testbucket --endpoint-url=http://host01:80 { "ReplicationConfiguration": { "Role": "", "Rules": [ { "ID": "pipe-bkt", "Status": "Enabled", "Priority": 1, "Destination": { "Bucket": "testbucket" } } ] } }
9.7.4.3. Deleting S3 bucket replication
Delete a replication configuration from a bucket.
The bucket owner can grant permission to others to remove the replication configuration.
Prerequisites
- A running Red Hat Ceph Storage cluster with multi-site Ceph object gateway configured. For more information on creating multi-site sync policies, see Creating a sync policy group.
- Zone group level policy created. For more information on creating zone group policies, see Bucket granular sync policies.
- An S3 bucket replication created. For more information, see S3 bucket replication API.
Procedure
Delete the S3 bucket replication configuration:
Syntax
aws s3api delete-bucket-replication --bucket BUCKET_NAME --endpoint-url=RADOSGW_ENDPOINT_URL
Example
[root@host01 ~]# aws s3api delete-bucket-replication --bucket testbucket --endpoint-url=http://host01:80
Verification
Verify that the existing replication rules are deleted:
Syntax
radosgw-admin sync policy get --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin sync policy get --bucket=testbucket
9.7.4.4. Disabling S3 bucket replication for user
As an administrator, you can set a user policy that prevents other users from performing any S3 replication API operations on the buckets that those users own.
Prerequisites
- A running Red Hat Ceph Storage cluster with multi-site Ceph object gateway configured. For more information on creating multi-site sync policies, see Creating a sync policy group.
- Zone group level policy created. For more information on creating zone group policies, see Bucket granular sync policies.
Procedure
Create a user policy configuration file to deny access to S3 bucket replication API:
Example
[root@host01 ~]# cat user_policy.json { "Version":"2012-10-17", "Statement": { "Effect":"Deny", "Action": [ "s3:PutReplicationConfiguration", "s3:GetReplicationConfiguration", "s3:DeleteReplicationConfiguration" ], "Resource": "arn:aws:s3:::*" } }
As an admin user, set the user policy on the user to disable access to the S3 bucket replication API:
Syntax
aws --endpoint-url=ENDPOINT_URL iam put-user-policy --user-name USER_NAME --policy-name USER_POLICY_NAME --policy-document POLICY_DOCUMENT_PATH
Example
[root@host01 ~]# aws --endpoint-url=http://host01:80 iam put-user-policy --user-name newuser1 --policy-name userpolicy --policy-document file://user_policy.json
Verification
As an admin user, verify the user policy set:
Syntax
aws --endpoint-url=ENDPOINT_URL iam get-user-policy --user-name USER_NAME --policy-name USER_POLICY_NAME --region us
Example
[root@host01 ~]# aws --endpoint-url=http://host01:80 iam get-user-policy --user-name newuser1 --policy-name userpolicy --region us
As the user on whom the admin user set the policy, try performing the S3 bucket replication API operations to verify that they are denied as expected.
Additional Resources
- See the S3 bucket replication API section in the Red Hat Ceph Storage Object Gateway Guide for details.
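A policy document that is not valid JSON, for example because of a trailing comma, is rejected when it is applied. The following Python sketch checks a policy file before it is passed to put-user-policy. The function name is hypothetical, and the check only covers the three replication actions used in this procedure.

```python
import json

REPLICATION_ACTIONS = {
    "s3:PutReplicationConfiguration",
    "s3:GetReplicationConfiguration",
    "s3:DeleteReplicationConfiguration",
}


def denies_replication(policy_text):
    """Return True when the policy denies all three replication actions.

    Raises json.JSONDecodeError when the text is not valid JSON.
    """
    policy = json.loads(policy_text)
    statements = policy["Statement"]
    if isinstance(statements, dict):
        # A single statement can be given without the enclosing list.
        statements = [statements]
    denied = set()
    for statement in statements:
        if statement.get("Effect") != "Deny":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        denied.update(actions)
    return REPLICATION_ACTIONS <= denied
```

Calling denies_replication with the contents of user_policy.json confirms both that the file parses and that it covers the put, get, and delete replication operations.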
9.8. Bucket lifecycle
As a storage administrator, you can use a bucket lifecycle configuration to manage your objects so they are stored effectively throughout their lifetime. For example, you can transition objects to less expensive storage classes, archive, or even delete them based on your use case.
RADOS Gateway supports S3 API object expiration by using rules defined for a set of bucket objects. Each rule has a prefix, which selects the objects, and a number of days after which objects become unavailable.
The radosgw-admin lc reshard
command is deprecated in Red Hat Ceph Storage 3.3 and not supported in Red Hat Ceph Storage 4 and later releases.
9.8.1. Creating a lifecycle management policy
You can manage a bucket lifecycle policy configuration using standard S3 operations rather than using the radosgw-admin
command. RADOS Gateway supports only a subset of the Amazon S3 API policy language applied to buckets. The lifecycle configuration contains one or more rules defined for a set of bucket objects.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- An S3 user created with user access.
- Access to a Ceph Object Gateway client with the AWS CLI package installed.
Procedure
Create a JSON file for lifecycle configuration:
Example
[user@client ~]$ vi lifecycle.json
Add the specific lifecycle configuration rules in the file:
Example
{ "Rules": [ { "Filter": { "Prefix": "images/" }, "Status": "Enabled", "Expiration": { "Days": 1 }, "ID": "ImageExpiration" } ] }
The lifecycle configuration example expires objects in the images directory after 1 day.
Set the lifecycle configuration on the bucket:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api put-bucket-lifecycle-configuration --bucket BUCKET_NAME --lifecycle-configuration file://PATH_TO_LIFECYCLE_CONFIGURATION_FILE/LIFECYCLE_CONFIGURATION_FILE.json
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api put-bucket-lifecycle-configuration --bucket testbucket --lifecycle-configuration file://lifecycle.json
In this example, the
lifecycle.json
file exists in the current directory.
Verification
Retrieve the lifecycle configuration for the bucket:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket { "Rules": [ { "Expiration": { "Days": 1 }, "ID": "ImageExpiration", "Filter": { "Prefix": "images/" }, "Status": "Enabled" } ] }
Optional: From the Ceph Object Gateway node, log into the Cephadm shell and retrieve the bucket lifecycle configuration:
Syntax
radosgw-admin lc get --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket { "prefix_map": { "images/": { "status": true, "dm_expiration": false, "expiration": 1, "noncur_expiration": 0, "mp_expiration": 0, "transitions": {}, "noncur_transitions": {} } }, "rule_map": [ { "id": "ImageExpiration", "rule": { "id": "ImageExpiration", "prefix": "", "status": "Enabled", "expiration": { "days": "1", "date": "" }, "mp_expiration": { "days": "", "date": "" }, "filter": { "prefix": "images/", "obj_tags": { "tagset": {} } }, "transitions": {}, "noncur_transitions": {}, "dm_expiration": false } } ] }
Additional Resources
- See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.
-
For more information on using the
AWS CLI
to manage lifecycle configurations, see the Setting lifecycle configuration on a bucket section of the Amazon Simple Storage Service documentation.
9.8.2. Deleting a lifecycle management policy
You can delete the lifecycle management policy for a specified bucket by using the s3api delete-bucket-lifecycle
command.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- An S3 user created with user access.
- Access to a Ceph Object Gateway client with the AWS CLI package installed.
Procedure
Delete a lifecycle configuration:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api delete-bucket-lifecycle --bucket BUCKET_NAME
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api delete-bucket-lifecycle --bucket testbucket
Verification
Retrieve lifecycle configuration for the bucket:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket
Optional: From the Ceph Object Gateway node, retrieve the bucket lifecycle configuration:
Syntax
radosgw-admin lc get --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket
Note: The command does not return any information if a bucket lifecycle policy is not present.
Additional Resources
- See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.
9.8.3. Updating a lifecycle management policy
You can update a lifecycle management policy by using the s3api put-bucket-lifecycle-configuration
command.
The put-bucket-lifecycle-configuration
command overwrites an existing bucket lifecycle configuration. If you want to retain any of the current lifecycle policy settings, you must include them in the lifecycle configuration file.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
- An S3 bucket created.
- An S3 user created with user access.
- Access to a Ceph Object Gateway client with the AWS CLI package installed.
Procedure
Create a JSON file for the lifecycle configuration:
Example
[user@client ~]$ vi lifecycle.json
Add the specific lifecycle configuration rules to the file:
Example
{ "Rules": [ { "Filter": { "Prefix": "images/" }, "Status": "Enabled", "Expiration": { "Days": 1 }, "ID": "ImageExpiration" }, { "Filter": { "Prefix": "docs/" }, "Status": "Enabled", "Expiration": { "Days": 30 }, "ID": "DocsExpiration" } ] }
Update the lifecycle configuration on the bucket:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api put-bucket-lifecycle-configuration --bucket BUCKET_NAME --lifecycle-configuration file://PATH_TO_LIFECYCLE_CONFIGURATION_FILE/LIFECYCLE_CONFIGURATION_FILE.json
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api put-bucket-lifecycle-configuration --bucket testbucket --lifecycle-configuration file://lifecycle.json
Verification
Retrieve the lifecycle configuration for the bucket:
Syntax
aws --endpoint-url=RADOSGW_ENDPOINT_URL:PORT s3api get-bucket-lifecycle-configuration --bucket BUCKET_NAME
Example
[user@client ~]$ aws --endpoint-url=http://host01:80 s3api get-bucket-lifecycle-configuration --bucket testbucket { "Rules": [ { "Expiration": { "Days": 30 }, "ID": "DocsExpiration", "Filter": { "Prefix": "docs/" }, "Status": "Enabled" }, { "Expiration": { "Days": 1 }, "ID": "ImageExpiration", "Filter": { "Prefix": "images/" }, "Status": "Enabled" } ] }
Optional: From the Ceph Object Gateway node, log into the Cephadm shell and retrieve the bucket lifecycle configuration:
Syntax
radosgw-admin lc get --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin lc get --bucket=testbucket { "prefix_map": { "docs/": { "status": true, "dm_expiration": false, "expiration": 1, "noncur_expiration": 0, "mp_expiration": 0, "transitions": {}, "noncur_transitions": {} }, "images/": { "status": true, "dm_expiration": false, "expiration": 1, "noncur_expiration": 0, "mp_expiration": 0, "transitions": {}, "noncur_transitions": {} } }, "rule_map": [ { "id": "DocsExpiration", "rule": { "id": "DocsExpiration", "prefix": "", "status": "Enabled", "expiration": { "days": "30", "date": "" }, "noncur_expiration": { "days": "", "date": "" }, "mp_expiration": { "days": "", "date": "" }, "filter": { "prefix": "docs/", "obj_tags": { "tagset": {} } }, "transitions": {}, "noncur_transitions": {}, "dm_expiration": false } }, { "id": "ImageExpiration", "rule": { "id": "ImageExpiration", "prefix": "", "status": "Enabled", "expiration": { "days": "1", "date": "" }, "mp_expiration": { "days": "", "date": "" }, "filter": { "prefix": "images/", "obj_tags": { "tagset": {} } }, "transitions": {}, "noncur_transitions": {}, "dm_expiration": false } } ] }
Additional Resources
- See the Red Hat Ceph Storage Developer Guide for details on Amazon S3 bucket lifecycles.
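Because put-bucket-lifecycle-configuration replaces the whole configuration, an update is effectively a merge of the current rules with the changed ones. The following Python sketch shows that merge on the JSON structures used in this procedure. The function name is hypothetical, and the sketch assumes that each rule carries a unique ID.

```python
def merge_lifecycle_rules(current, updates):
    """Return a configuration that keeps current rules and applies updates.

    Rules are matched by "ID": an update with an existing ID replaces that
    rule, and an update with a new ID is appended.
    """
    merged = {rule["ID"]: rule for rule in current.get("Rules", [])}
    for rule in updates:
        merged[rule["ID"]] = rule
    return {"Rules": list(merged.values())}


current = {
    "Rules": [
        {"Filter": {"Prefix": "images/"}, "Status": "Enabled",
         "Expiration": {"Days": 1}, "ID": "ImageExpiration"},
    ]
}
docs_rule = {"Filter": {"Prefix": "docs/"}, "Status": "Enabled",
             "Expiration": {"Days": 30}, "ID": "DocsExpiration"}
updated = merge_lifecycle_rules(current, [docs_rule])
print([rule["ID"] for rule in updated["Rules"]])
```

The current argument can be the output of get-bucket-lifecycle-configuration, and the merged result is what goes into the lifecycle configuration file.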
9.8.4. Monitoring bucket lifecycles
You can monitor lifecycle processing and manually process the lifecycle of buckets with the radosgw-admin lc list
and radosgw-admin lc process
commands.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to a Ceph Object Gateway node.
- Creation of an S3 bucket with a lifecycle configuration policy applied.
Procedure
Log into the Cephadm shell:
Example
[root@host01 ~]# cephadm shell
List bucket lifecycle progress:
Example
[ceph: root@host01 /]# radosgw-admin lc list
[ { "bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1", "started": "Thu, 01 Jan 1970 00:00:00 GMT", "status": "UNINITIAL" }, { "bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2", "started": "Thu, 01 Jan 1970 00:00:00 GMT", "status": "UNINITIAL" } ]
The bucket lifecycle processing status can be one of the following:
- UNINITIAL - The process has not run yet.
- PROCESSING - The process is currently running.
- COMPLETE - The process has completed.
Optional: You can manually process bucket lifecycle policies:
Process the lifecycle policy for a single bucket:
Syntax
radosgw-admin lc process --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin lc process --bucket=testbucket1
Process all bucket lifecycle policies immediately:
Example
[ceph: root@host01 /]# radosgw-admin lc process
Verification
List the bucket lifecycle policies:
[ceph: root@host01 /]# radosgw-admin lc list
[ { "bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1", "started": "Thu, 17 Mar 2022 21:48:50 GMT", "status": "COMPLETE" }, { "bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2", "started": "Thu, 17 Mar 2022 20:38:50 GMT", "status": "COMPLETE" } ]
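When many buckets have lifecycle policies, it can help to filter the lc list output programmatically. The following is a minimal Python sketch, not part of the product, that parses JSON shaped like the radosgw-admin lc list output and reports buckets whose status is not COMPLETE. The SAMPLE text and the pending_buckets helper are illustrative, and the assumption that the "bucket" field has the form TENANT:BUCKET_NAME:BUCKET_ID is inferred from the example output only.

```python
import json

# Illustrative sample shaped like the `radosgw-admin lc list` output.
SAMPLE = """
[
  {"bucket": ":testbucket:8b63d584-9ea1-4cf3-8443-a6a15beca943.54187.1",
   "started": "Thu, 17 Mar 2022 21:48:50 GMT", "status": "COMPLETE"},
  {"bucket": ":testbucket1:8b635499-9e41-4cf3-8443-a6a15345943.54187.2",
   "started": "Thu, 01 Jan 1970 00:00:00 GMT", "status": "UNINITIAL"}
]
"""


def pending_buckets(lc_list_json):
    """Return the names of buckets whose lifecycle status is not COMPLETE."""
    pending = []
    for entry in json.loads(lc_list_json):
        # Assumption: the field looks like "TENANT:BUCKET_NAME:BUCKET_ID",
        # with an empty tenant in the examples.
        _tenant, name, _bucket_id = entry["bucket"].split(":", 2)
        if entry["status"] != "COMPLETE":
            pending.append(name)
    return pending


print(pending_buckets(SAMPLE))
```

In practice, the JSON text would come from running radosgw-admin lc list inside the Cephadm shell and capturing its standard output.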
Additional Resources
- See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.
9.8.5. Configuring lifecycle expiration window
You can set the time that the lifecycle management process runs each day by setting the rgw_lifecycle_work_time parameter. By default, lifecycle processing occurs once per day, at midnight.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Installation of the Ceph Object Gateway.
- Root-level access to a Ceph Object Gateway node.
Procedure
Log into the Cephadm shell:
Example
[root@host01 ~]# cephadm shell
Set the lifecycle expiration time:
Syntax
ceph config set client.rgw rgw_lifecycle_work_time %d:%d-%d:%d
Replace %d:%d-%d:%d with start_hour:start_minute-end_hour:end_minute.
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_lifecycle_work_time 06:00-08:00
Verification
Retrieve the lifecycle expiration work time:
Example
[ceph: root@host01 /]# ceph config get client.rgw rgw_lifecycle_work_time
06:00-08:00
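Before applying a new work time, you might want to check that the value has the start_hour:start_minute-end_hour:end_minute shape. The following Python sketch is an illustration only: parse_work_time and in_window are hypothetical helpers, and the handling of a window that wraps past midnight is an assumption about how such a window could be interpreted, not a statement of how the Ceph Object Gateway evaluates it.

```python
import re
from datetime import time

# start_hour:start_minute-end_hour:end_minute, one or two digits per field.
WINDOW_RE = re.compile(r"^(\d{1,2}):(\d{1,2})-(\d{1,2}):(\d{1,2})$")


def parse_work_time(value):
    """Parse a work-time string into (start, end) datetime.time objects."""
    match = WINDOW_RE.match(value)
    if not match:
        raise ValueError(f"not in HH:MM-HH:MM form: {value!r}")
    start_h, start_m, end_h, end_m = (int(g) for g in match.groups())
    # time() raises ValueError for out-of-range hours or minutes.
    return time(start_h, start_m), time(end_h, end_m)


def in_window(now, start, end):
    """Return True if `now` falls inside the window (inclusive)."""
    if start <= end:
        return start <= now <= end
    # Assumption: a window such as 22:00-02:00 wraps past midnight.
    return now >= start or now <= end


start, end = parse_work_time("06:00-08:00")
print(in_window(time(7, 30), start, end))
```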
Additional Resources
- See the S3 bucket lifecycle section in the Red Hat Ceph Storage Developer Guide for details.
9.8.6. S3 bucket lifecycle transition within a storage cluster
You can use a bucket lifecycle configuration to manage objects so that they are stored effectively throughout their lifetime. With an object lifecycle transition rule, you can transition objects to less expensive storage classes, archive them, or even delete them.
You can create storage classes for:
- Fast media, such as SSD or NVMe for I/O sensitive workloads.
- Slow magnetic media, such as SAS or SATA for archiving.
You can create a schedule for data movement between a hot storage class and a cold storage class. For example, you can transition objects to a storage class 30 days after you create them, archive them to another storage class one year after you create them, and schedule their expiration so that they are deleted permanently after a specified time. You do this through a transition rule, which applies to an object transitioning from one storage class to another. The lifecycle configuration contains one or more rules using the <Rule> element.
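As an illustration of how such a rule is structured, the following Python sketch builds a lifecycle configuration in the JSON form accepted by the S3 put-bucket-lifecycle-configuration call. The transition_rule helper, the rule ID, and the COLD storage class name are hypothetical; substitute the storage classes that exist in your zone group.

```python
import json


def transition_rule(rule_id, prefix, transitions, expire_days=None):
    """Build one lifecycle rule.

    transitions is a list of (days, storage_class) pairs.
    """
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ],
    }
    if expire_days is not None:
        rule["Expiration"] = {"Days": expire_days}
    return rule


# Hypothetical policy: move to COLD after 30 days, delete after one year.
config = {"Rules": [transition_rule("hot-to-cold", "", [(30, "COLD")], expire_days=365)]}
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and passed to the AWS CLI with the file:// prefix.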
Additional Resources
- See the Red Hat Ceph Storage Developer Guide for details on bucket lifecycle.
9.8.7. Transitioning an object from one storage class to another
The object lifecycle transition rule allows you to transition an object from one storage class to another class.
You can migrate data between replicated pools, erasure-coded pools, replicated to erasure-coded pools, or erasure-coded to replicated pools with the Ceph Object Gateway lifecycle transition policy.
In a multi-site configuration, when a lifecycle transition rule is applied on the first site to transition objects from one data pool to another in the same storage cluster, the same rule is valid for the second site, provided that the second site has the respective data pool created and enabled with the rgw application.
Prerequisites
- Installation of the Ceph Object Gateway software.
- Root-level access to the Ceph Object Gateway node.
- An S3 user created with user access.
Procedure
Create a new data pool:
Syntax
ceph osd pool create POOL_NAME
Example
[ceph: root@host01 /]# ceph osd pool create test.hot.data
Add a new storage class:
Syntax
radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id default-placement --storage-class hot.test { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "STANDARD", "hot.test" ] } }
Provide the zone placement information for the new storage class:
Syntax
radosgw-admin zone placement add --rgw-zone default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS --data-pool DATA_POOL
Example
[ceph: root@host01 /]# radosgw-admin zone placement add --rgw-zone default --placement-id default-placement --storage-class hot.test --data-pool test.hot.data
{ "key": "default-placement", "val": { "index_pool": "test_zone.rgw.buckets.index", "storage_classes": { "STANDARD": { "data_pool": "test.hot.data" }, "hot.test": { "data_pool": "test.hot.data" } }, "data_extra_pool": "", "index_type": 0 } }
Note: Consider setting the compression_type when creating cold or archival data storage pools with write once.
Enable the rgw application on the data pool:
Syntax
ceph osd pool application enable POOL_NAME rgw
Example
[ceph: root@host01 /]# ceph osd pool application enable test.hot.data rgw
enabled application 'rgw' on pool 'test.hot.data'
Restart all the rgw daemons.
Create a bucket:
Example
[ceph: root@host01 /]# aws s3api create-bucket --bucket testbucket10 --create-bucket-configuration LocationConstraint=default:default-placement --endpoint-url http://10.0.0.80:8080
Add the object:
Example
[ceph: root@host01 /]# aws --endpoint=http://10.0.0.80:8080 s3api put-object --bucket testbucket10 --key compliance-upload --body /root/test2.txt
Create a second data pool:
Syntax
ceph osd pool create POOL_NAME
Example
[ceph: root@host01 /]# ceph osd pool create test.cold.data
Add a new storage class:
Syntax
radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id default-placement --storage-class cold.test { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "STANDARD", "cold.test" ] } }
Provide the zone placement information for the new storage class:
Syntax
radosgw-admin zone placement add --rgw-zone default --placement-id PLACEMENT_TARGET --storage-class STORAGE_CLASS --data-pool DATA_POOL
Example
[ceph: root@host01 /]# radosgw-admin zone placement add --rgw-zone default --placement-id default-placement --storage-class cold.test --data-pool test.cold.data
Enable the rgw application on the data pool:
Syntax
ceph osd pool application enable POOL_NAME rgw
Example
[ceph: root@host01 /]# ceph osd pool application enable test.cold.data rgw
enabled application 'rgw' on pool 'test.cold.data'
Restart all the rgw daemons.
To view the zone group configuration, run the following command:
Syntax
radosgw-admin zonegroup get { "id": "3019de59-ddde-4c5c-b532-7cdd29de09a1", "name": "default", "api_name": "default", "is_master": "true", "endpoints": [], "hostnames": [], "hostnames_s3website": [], "master_zone": "adacbe1b-02b4-41b8-b11d-0d505b442ed4", "zones": [ { "id": "adacbe1b-02b4-41b8-b11d-0d505b442ed4", "name": "default", "endpoints": [], "log_meta": "false", "log_data": "false", "bucket_index_max_shards": 11, "read_only": "false", "tier_type": "", "sync_from_all": "true", "sync_from": [], "redirect_zone": "" } ], "placement_targets": [ { "name": "default-placement", "tags": [], "storage_classes": [ "hot.test", "cold.test", "STANDARD" ] } ], "default_placement": "default-placement", "realm_id": "", "sync_policy": { "groups": [] } }
To view the zone configuration, run the following command:
Syntax
radosgw-admin zone get { "id": "adacbe1b-02b4-41b8-b11d-0d505b442ed4", "name": "default", "domain_root": "default.rgw.meta:root", "control_pool": "default.rgw.control", "gc_pool": "default.rgw.log:gc", "lc_pool": "default.rgw.log:lc", "log_pool": "default.rgw.log", "intent_log_pool": "default.rgw.log:intent", "usage_log_pool": "default.rgw.log:usage", "roles_pool": "default.rgw.meta:roles", "reshard_pool": "default.rgw.log:reshard", "user_keys_pool": "default.rgw.meta:users.keys", "user_email_pool": "default.rgw.meta:users.email", "user_swift_pool": "default.rgw.meta:users.swift", "user_uid_pool": "default.rgw.meta:users.uid", "otp_pool": "default.rgw.otp", "system_key": { "access_key": "", "secret_key": "" }, "placement_pools": [ { "key": "default-placement", "val": { "index_pool": "default.rgw.buckets.index", "storage_classes": { "cold.test": { "data_pool": "test.cold.data" }, "hot.test": { "data_pool": "test.hot.data" }, "STANDARD": { "data_pool": "default.rgw.buckets.data" } }, "data_extra_pool": "default.rgw.buckets.non-ec", "index_type": 0 } } ], "realm_id": "", "notif_pool": "default.rgw.log:notif" }
Create a bucket:
Example
[ceph: root@host01 /]# aws s3api create-bucket --bucket testbucket10 --create-bucket-configuration LocationConstraint=default:default-placement --endpoint-url http://10.0.0.80:8080
List the objects prior to transition:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --bucket testbucket10 { "ETag": "\"211599863395c832a3dfcba92c6a3b90\"", "Size": 540, "StorageClass": "STANDARD", "Key": "obj1", "VersionId": "W95teRsXPSJI4YWJwwSG30KxSCzSgk-", "IsLatest": true, "LastModified": "2023-11-23T10:38:07.214Z", "Owner": { "DisplayName": "test-user", "ID": "test-user" } }
Create a JSON file for lifecycle configuration:
Example
[ceph: root@host01 /]# vi lifecycle.json
Add the specific lifecycle configuration rule in the file:
Example
{ "Rules": [ { "Filter": { "Prefix": "" }, "Status": "Enabled", "Transitions": [ { "Days": 5, "StorageClass": "hot.test" }, { "Days": 20, "StorageClass": "cold.test" } ], "Expiration": { "Days": 365 }, "ID": "double transition and expiration" } ] }
The lifecycle configuration example shows an object that transitions from the default STANDARD storage class to the hot.test storage class after 5 days, transitions again after 20 days to the cold.test storage class, and finally expires after 365 days in the cold.test storage class.
Set the lifecycle configuration on the bucket:
Example
[ceph: root@host01 /]# aws s3api put-bucket-lifecycle-configuration --bucket testbucket10 --lifecycle-configuration file://lifecycle.json
Retrieve the lifecycle configuration on the bucket:
Example
[ceph: root@host01 /]# aws s3api get-bucket-lifecycle-configuration --bucket testbucket10
{ "Rules": [ { "Expiration": { "Days": 365 }, "ID": "double transition and expiration", "Prefix": "", "Status": "Enabled", "Transitions": [ { "Days": 20, "StorageClass": "cold.test" }, { "Days": 5, "StorageClass": "hot.test" } ] } ] }
Verify that the object is transitioned to the given storage class:
Example
[ceph: root@host01 /]# radosgw-admin bucket list --bucket testbucket10 { "ETag": "\"211599863395c832a3dfcba92c6a3b90\"", "Size": 540, "StorageClass": "cold.test", "Key": "obj1", "VersionId": "W95teRsXPSJI4YWJwwSG30KxSCzSgk-", "IsLatest": true, "LastModified": "2023-11-23T10:38:07.214Z", "Owner": { "DisplayName": "test-user", "ID": "test-user" } }
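To reason about which storage class an object should be in at a given age, the rule from this procedure can be modeled in a few lines of Python. This is a simplified sketch: expected_state is a hypothetical helper, and it ignores when lifecycle processing actually runs, so the real transition can lag behind the day thresholds.

```python
# The rule used in this procedure, reduced to the fields that matter here.
RULE = {
    "Transitions": [
        {"Days": 5, "StorageClass": "hot.test"},
        {"Days": 20, "StorageClass": "cold.test"},
    ],
    "Expiration": {"Days": 365},
}


def expected_state(age_days, rule, initial="STANDARD"):
    """Return the expected storage class at a given age, or None if expired."""
    expiry = rule.get("Expiration", {}).get("Days")
    if expiry is not None and age_days >= expiry:
        return None
    current = initial
    # Apply transitions in ascending order of their day threshold.
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (0, 5, 20, 365):
    print(age, expected_state(age, RULE))
```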
Additional Resources
- See the Red Hat Ceph Storage Developer Guide for details on bucket lifecycle.
9.8.8. Enabling object lock for S3
Using the S3 object lock mechanism, you can use object lock concepts like retention period, legal hold, and bucket configuration to implement Write-Once-Read-Many (WORM) functionality as part of the custom workflow overriding data deletion permissions.
The object version(s), not the object name, is the defining and required value for object lock to perform correctly to support the GOVERNANCE or COMPLIANCE mode. You need to know the version of the object when it is written so that you can retrieve it at a later time.
Prerequisites
- A running Red Hat Ceph Storage cluster with Ceph Object Gateway installed.
- Root-level access to the Ceph Object Gateway node.
- S3 user with version-bucket creation access.
Procedure
Create a bucket with object lock enabled:
Syntax
aws --endpoint=http://RGW_PORT:8080 s3api create-bucket --bucket BUCKET_NAME --object-lock-enabled-for-bucket
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api create-bucket --bucket worm-bucket --object-lock-enabled-for-bucket
Set a retention period for the bucket:
Syntax
aws --endpoint=http://RGW_PORT:8080 s3api put-object-lock-configuration --bucket BUCKET_NAME --object-lock-configuration '{ "ObjectLockEnabled": "Enabled", "Rule": { "DefaultRetention": { "Mode": "RETENTION_MODE", "Days": NUMBER_OF_DAYS }}}'
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object-lock-configuration --bucket worm-bucket --object-lock-configuration '{ "ObjectLockEnabled": "Enabled", "Rule": { "DefaultRetention": { "Mode": "COMPLIANCE", "Days": 10 }}}'
Note: You can choose either the GOVERNANCE or COMPLIANCE mode for the RETENTION_MODE in S3 object lock, to apply different levels of protection to any object version that is protected by object lock.
In GOVERNANCE mode, users cannot overwrite or delete an object version or alter its lock settings unless they have special permissions.
In COMPLIANCE mode, a protected object version cannot be overwritten or deleted by any user, including the root user in your AWS account. When an object is locked in COMPLIANCE mode, its RETENTION_MODE cannot be changed, and its retention period cannot be shortened. COMPLIANCE mode helps ensure that an object version cannot be overwritten or deleted for the duration of the period.
Put the object into the bucket with a retention time set:
Syntax
aws --endpoint=http://RGW_PORT:8080 s3api put-object --bucket BUCKET_NAME --object-lock-mode RETENTION_MODE --object-lock-retain-until-date "DATE" --key compliance-upload --body TEST_FILE
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object --bucket worm-bucket --object-lock-mode COMPLIANCE --object-lock-retain-until-date "2022-05-31" --key compliance-upload --body test.dd { "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"", "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD" }
Upload a new object using the same key:
Syntax
aws --endpoint=http://RGW_PORT:8080 s3api put-object --bucket BUCKET_NAME --object-lock-mode RETENTION_MODE --object-lock-retain-until-date "DATE" --key compliance-upload --body PATH
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object --bucket worm-bucket --object-lock-mode COMPLIANCE --object-lock-retain-until-date "2022-05-31" --key compliance-upload --body /etc/fstab { "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"", "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD" }
Set an object lock legal hold on an object version:
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api put-object-legal-hold --bucket worm-bucket --key compliance-upload --legal-hold Status=ON
Note: Using the object lock legal hold operation, you can place a legal hold on an object version, thereby preventing an object version from being overwritten or deleted. A legal hold doesn’t have an associated retention period and hence, remains in effect until removed.
List the objects from the bucket to retrieve only the latest version of the object:
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api list-objects --bucket worm-bucket
List the object versions from the bucket:
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api list-object-versions --bucket worm-bucket
{ "Versions": [ { "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"", "Size": 288, "StorageClass": "STANDARD", "Key": "hosts", "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD", "IsLatest": true, "LastModified": "2022-06-17T08:51:17.392000+00:00", "Owner": { "DisplayName": "Test User in Tenant test", "ID": "test$test.user" } } ] }
Access objects using version-ids:
Example
[root@rgw-2 ~]# aws --endpoint=http://rgw.ceph.com:8080 s3api get-object --bucket worm-bucket --key compliance-upload --version-id 'IGOU.vdIs3SPduZglrB-RBaK.sfXpcd' download.1 { "AcceptRanges": "bytes", "LastModified": "2022-06-17T08:51:17+00:00", "ContentLength": 288, "ETag": "\"d560ea5652951637ba9c594d8e6ea8c1\"", "VersionId": "Nhhk5kRS6Yp6dZXVWpZZdRcpSpBKToD", "ContentType": "binary/octet-stream", "Metadata": {}, "ObjectLockMode": "COMPLIANCE", "ObjectLockRetainUntilDate": "2023-06-17T08:51:17+00:00" }
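The JSON passed to put-object-lock-configuration and the date passed to --object-lock-retain-until-date can be generated rather than typed by hand. The following Python sketch shows one way to do that; default_retention_config and retain_until are hypothetical helpers, and the ISO 8601 UTC timestamp format is an assumption about what your S3 client accepts.

```python
import json
from datetime import datetime, timedelta, timezone

VALID_MODES = ("GOVERNANCE", "COMPLIANCE")


def default_retention_config(mode, days):
    """Build the object lock configuration with a default retention rule."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def retain_until(start, days):
    """Return a UTC timestamp `days` after the timezone-aware `start`."""
    moment = start.astimezone(timezone.utc) + timedelta(days=days)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(json.dumps(default_retention_config("COMPLIANCE", 10)))
print(retain_until(datetime(2022, 5, 21, tzinfo=timezone.utc), 10))
```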
9.9. Usage
The Ceph Object Gateway logs usage for each user. You can track user usage within date ranges too.
Options include:
- Start Date: The --start-date option allows you to filter usage stats from a particular start date (format: yyyy-mm-dd[HH:MM:SS]).
- End Date: The --end-date option allows you to filter usage up to a particular date (format: yyyy-mm-dd[HH:MM:SS]).
- Log Entries: The --show-log-entries option allows you to specify whether or not to include log entries with the usage stats (options: true | false).
You can specify time with minutes and seconds, but it is stored with 1 hour resolution.
9.9.1. Show usage
To show usage statistics, use the radosgw-admin usage show command. To show usage for a particular user, you must specify a user ID. You may also specify a start date, end date, and whether or not to show log entries.
Example
[ceph: root@host01 /]# radosgw-admin usage show --uid=johndoe --start-date=2022-06-01 --end-date=2022-07-01
You may also show a summary of usage information for all users by omitting a user ID.
Example
[ceph: root@host01 /]# radosgw-admin usage show --show-log-entries=false
9.9.2. Trim usage
With heavy use, usage logs can begin to take up storage space. You can trim usage logs for all users and for specific users. You may also specify date ranges for trim operations.
Example
[ceph: root@host01 /]# radosgw-admin usage trim --start-date=2022-06-01 --end-date=2022-07-31
[ceph: root@host01 /]# radosgw-admin usage trim --uid=johndoe
[ceph: root@host01 /]# radosgw-admin usage trim --uid=johndoe --end-date=2021-04-30
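Because a trim operation removes log data, it can be worth validating the date arguments before running the command. The following Python sketch assembles the argument list for usage show or usage trim and rejects impossible calendar dates. The usage_args helper is hypothetical, and it validates only the yyyy-mm-dd form, not the optional time part.

```python
from datetime import date


def usage_args(action, uid=None, start_date=None, end_date=None):
    """Build a radosgw-admin usage command line as a list of arguments."""
    if action not in ("show", "trim"):
        raise ValueError("action must be 'show' or 'trim'")
    args = ["radosgw-admin", "usage", action]
    if uid:
        args.append(f"--uid={uid}")
    for flag, value in (("--start-date", start_date), ("--end-date", end_date)):
        if value is not None:
            # Raises ValueError for dates that do not exist on the calendar.
            date.fromisoformat(value)
            args.append(f"{flag}={value}")
    return args


print(usage_args("trim", uid="johndoe", end_date="2022-07-31"))
```

The resulting list can be handed to a process runner inside the Cephadm shell; this sketch deliberately stops short of executing anything.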
9.10. Ceph Object Gateway data layout
Although RADOS only knows about pools and objects with their Extended Attributes (xattrs) and object map (OMAP), conceptually Ceph Object Gateway organizes its data into three different kinds:
- metadata
- bucket index
- data
Metadata
There are three sections of metadata:
- user: Holds user information.
- bucket: Holds a mapping between bucket name and bucket instance ID.
- bucket.instance: Holds bucket instance information.
You can use the following commands to view and set metadata entries:
Syntax
radosgw-admin metadata get bucket:BUCKET_NAME
radosgw-admin metadata get bucket.instance:BUCKET:BUCKET_ID
radosgw-admin metadata get user:USER
radosgw-admin metadata set user:USER
Example
[ceph: root@host01 /]# radosgw-admin metadata list
[ceph: root@host01 /]# radosgw-admin metadata list bucket
[ceph: root@host01 /]# radosgw-admin metadata list bucket.instance
[ceph: root@host01 /]# radosgw-admin metadata list user
Every metadata entry is kept on a single RADOS object.
A Ceph Object Gateway object might consist of several RADOS objects, the first of which is the head that contains the metadata, such as manifest, Access Control List (ACL), content type, ETag, and user-defined metadata. The metadata is stored in xattrs. The head might also contain up to 512 KB of object data, for efficiency and atomicity. The manifest describes how each object is laid out in RADOS objects.
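As a rough illustration of the head-plus-tail layout, the following Python sketch estimates how many RADOS objects back a single Ceph Object Gateway object. Only the 512 KB head capacity comes from the text above. The 4 MiB tail object size is a hypothetical value chosen for the example, and the real layout is whatever the manifest records (multipart uploads, for instance, are not modeled).

```python
import math

HEAD_MAX = 512 * 1024        # head data capacity stated in the text
TAIL_SIZE = 4 * 1024 * 1024  # hypothetical tail object size, not from the text


def rados_object_count(size_bytes, head_max=HEAD_MAX, tail_size=TAIL_SIZE):
    """Estimate the number of RADOS objects for an object of size_bytes."""
    if size_bytes <= head_max:
        return 1  # everything fits in the head object
    remaining = size_bytes - head_max
    return 1 + math.ceil(remaining / tail_size)


print(rados_object_count(100))
print(rados_object_count(10 * 1024 * 1024))
```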
Bucket index
The bucket index is a different kind of metadata and is kept separately. The bucket index holds a key-value map in RADOS objects. By default, it is a single RADOS object per bucket, but it is possible to shard the map over multiple RADOS objects.
The map itself is kept in OMAP associated with each RADOS object. The key of each OMAP entry is the name of an object, and the value holds some basic metadata of that object, namely the metadata that appears when listing the bucket. Each OMAP holds a header, which stores some bucket accounting metadata such as the number of objects, total size, and the like.
When using the radosgw-admin tool, ensure that the tool and the Ceph Cluster are of the same version. The use of mismatched versions is not supported.
OMAP is a key-value store, associated with an object, in a way similar to how extended attributes associate with a POSIX file. An object’s OMAP is not physically located in the object’s storage, but its precise implementation is invisible and immaterial to the Ceph Object Gateway.
Data
Object data is kept in one or more RADOS objects for each Ceph Object Gateway object.
9.10.1. Object lookup path
When accessing objects, REST APIs come to Ceph Object Gateway with three parameters:
- Account information, which has the access key in S3 or account name in Swift
- Bucket or container name
- Object name or key
At present, Ceph Object Gateway only uses account information to find out the user ID and for access control. It uses only the bucket name and object key to address the object in a pool.
Account information
The user ID in Ceph Object Gateway is a string, typically the actual user name from the user credentials and not a hashed or mapped identifier.
When accessing a user’s data, the user record is loaded from an object USER_ID in the default.rgw.meta pool with users.uid namespace.
Bucket names
They are represented in the default.rgw.meta pool with root namespace. The bucket record is loaded in order to obtain a marker, which serves as a bucket ID.
Object names
The object is located in the default.rgw.buckets.data pool. The object name is MARKER_KEY, for example default.7593.4_image.png, where the marker is default.7593.4 and the key is image.png. These concatenated names are not parsed and are passed down to RADOS only. Therefore, the choice of the separator is not important and causes no ambiguity. For the same reason, slashes are permitted in object names (keys).
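The naming scheme for data objects is plain concatenation, which a short Python sketch makes concrete. The rados_data_object_name helper is hypothetical and only mirrors the MARKER_KEY form described above.

```python
def rados_data_object_name(marker, key):
    """Join a bucket marker and an object key into the MARKER_KEY form."""
    # The result is never parsed back, so keys may contain "_" or "/".
    return f"{marker}_{key}"


print(rados_data_object_name("default.7593.4", "image.png"))
print(rados_data_object_name("default.7593.4", "photos/2023/image.png"))
```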
9.10.1.1. Multiple data pools
It is possible to create multiple data pools so that different users’ buckets are created in different RADOS pools by default, thus providing the necessary scaling. The layout and naming of these pools is controlled by a policy setting.
9.10.2. Bucket and object listing
Buckets that belong to a given user are listed in an OMAP of an object named USER_ID.buckets, for example, foo.buckets, in the default.rgw.meta pool with users.uid namespace. These objects are accessed when listing buckets, when updating bucket contents, and when updating and retrieving bucket statistics such as quota. These listings are kept consistent with buckets in the .rgw pool.
See the user-visible, encoded class cls_user_bucket_entry and its nested class cls_user_bucket for the values of these OMAP entries.
Objects that belong to a given bucket are listed in a bucket index. The default naming for index objects is .dir.MARKER in the default.rgw.buckets.index pool.
Additional Resources
- See the Configure bucket index resharding section in the Red Hat Ceph Storage Object Gateway Guide for more details.
9.10.3. Object Gateway data layout parameters
This is a list of data layout parameters for Ceph Object Gateway.
Known pools:
.rgw.root
- Unspecified region, zone, and global information records, one per object.
ZONE.rgw.control
- notify.N
ZONE.rgw.meta
Multiple namespaces with different kinds of metadata
- namespace: root
BUCKET
.bucket.meta.BUCKET:MARKER # see put_bucket_instance_info()
The tenant is used to disambiguate buckets, but not bucket instances.
Example
.bucket.meta.prodtx:test%25star:default.84099.6
.bucket.meta.testcont:default.4126.1
.bucket.meta.prodtx:testcont:default.84099.4
prodtx/testcont
prodtx/test%25star
testcont
- namespace: users.uid
Contains per-user information (RGWUserInfo) in USER objects and per-user lists of buckets in OMAPs of USER.buckets objects. The USER might contain the tenant if non-empty.
Example
prodtx$prodt
test2.buckets
prodtx$prodt.buckets
test2
- namespace: users.email
- Unimportant
- namespace: users.keys
47UA98JSTJZ9YAN3OS3O
This allows Ceph Object Gateway to look up users by their access keys during authentication.
- namespace: users.swift
- test:tester
ZONE.rgw.buckets.index
- Objects are named .dir.MARKER, and each contains a bucket index. If the index is sharded, each shard appends the shard index after the marker.
ZONE.rgw.buckets.data
default.7593.4__shadow_.488urDFerTYXavx4yAd-Op8mxehnvTI_1
MARKER_KEY
An example of a marker would be default.16004.1 or default.7593.4. The current format is ZONE.INSTANCE_ID.BUCKET_ID, but once generated, a marker is not parsed again, so its format might change freely in the future.
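The object names in the metadata and index pools follow simple patterns, sketched below in Python. All three helpers are hypothetical. The tenant$user form is taken from the examples above, while the dot used to append the shard index to .dir.MARKER is an assumption, because the text says only that the shard index follows the marker.

```python
def user_object(user, tenant=""):
    """Name of the per-user record in the users.uid namespace."""
    return f"{tenant}${user}" if tenant else user


def user_buckets_object(user, tenant=""):
    """Name of the object whose OMAP lists the user's buckets."""
    return user_object(user, tenant) + ".buckets"


def bucket_index_object(marker, shard=None):
    """Name of a bucket index object, optionally for one shard."""
    base = f".dir.{marker}"
    # Assumption: the shard index is appended after a dot.
    return base if shard is None else f"{base}.{shard}"


print(user_buckets_object("prodt", tenant="prodtx"))
print(bucket_index_object("default.7593.4"))
```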
Additional Resources
- See the Ceph Object Gateway data layout in the Red Hat Ceph Storage Object Gateway Guide for more details.
9.11. Rate limits for ingesting data
As a storage administrator, you can set rate limits on users and buckets based on the operations and bandwidth when saving an object in a Red Hat Ceph Storage cluster with a Ceph Object Gateway configuration.
9.11.1. Purpose of rate limits in a storage cluster
You can set rate limits on users and buckets in a Ceph Object Gateway configuration. The rate limit includes the maximum number of read operations, write operations per minute, and how many bytes per minute can be written or read per user or per bucket.
Requests that use the GET or HEAD method of the REST API are “read requests”; all other requests are “write requests”.
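The read and write classification can be expressed as a one-line rule. The following Python sketch is illustrative only, and request_kind is a hypothetical helper.

```python
READ_METHODS = {"GET", "HEAD"}


def request_kind(method):
    """Classify an HTTP method as a read or a write for rate limiting."""
    return "read" if method.upper() in READ_METHODS else "write"


for method in ("GET", "HEAD", "PUT", "DELETE"):
    print(method, request_kind(method))
```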
The Ceph Object Gateway tracks the user and bucket requests separately and does not share with other gateways, which means that the desired limits configured should be divided by the number of active Object Gateways.
For example, if user A should be limited by ten ops per minute and there are two Ceph Object Gateways in the cluster, the limit over user A should be five, that is, ten ops per minute for two Ceph Object Gateways. If the requests are not balanced between Ceph Object Gateways, the rate limit may be underutilized. For example, if the ops limit is five and there are two Ceph Object Gateways, but the load balancer sends load only to one of those Ceph Object Gateways, the effective limit would be five ops, because this limit is enforced per Ceph Object Gateway.
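The arithmetic in this example can be sketched in Python as follows. Both helpers are hypothetical. Rounding down and never returning 0 for a nonzero target is a design choice made for this sketch, because a configured value of 0 means unlimited access.

```python
def per_gateway_limit(cluster_wide_limit, active_gateways):
    """Split a desired cluster-wide limit across the active gateways."""
    if active_gateways < 1:
        raise ValueError("at least one active gateway is required")
    if cluster_wide_limit == 0:
        return 0  # 0 means unlimited; keep it unlimited
    # Never round a nonzero target down to 0, which would disable the limit.
    return max(1, cluster_wide_limit // active_gateways)


def effective_cluster_limit(gateway_limit, gateways_receiving_traffic):
    """Limit actually reachable when only some gateways receive requests."""
    return gateway_limit * gateways_receiving_traffic


limit = per_gateway_limit(10, 2)
print(limit, effective_cluster_limit(limit, 1))
```

With ten ops per minute split over two gateways, each gateway is configured with five, and a load balancer that uses only one gateway leaves the user with an effective five ops per minute.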
If the limit is reached for the bucket but not for the user, or vice versa, the request is still canceled.
The bandwidth counting happens after the request is accepted. As a result, this request proceeds even if the bucket or the user has reached its bandwidth limit in the middle of the request.
The Ceph Object Gateway keeps a “debt” of used bytes more than the configured value and prevents this user or bucket from sending more requests until their “debt” is paid. The “debt” maximum size is twice the max-read/write-bytes per minute. If user A has 1 byte read limit per minute and this user tries to GET 1 GB object, the user can do it.
After user A completes this 1 GB operation, the Ceph Object Gateway blocks the user request for up to two minutes until user A is able to send the GET request again.
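The debt behavior can be approximated with a small model. The following Python sketch is a simplification under stated assumptions, not the gateway's actual accounting: blocked_minutes is a hypothetical helper that treats the debt as the bytes used beyond one minute's allowance, caps it at twice the per-minute limit, and assumes it is repaid at the per-minute rate.

```python
import math


def blocked_minutes(request_bytes, limit_per_minute):
    """Estimate how long a user or bucket is blocked after one request."""
    if limit_per_minute <= 0:
        return 0  # a limit of 0 disables the check
    overshoot = max(0, request_bytes - limit_per_minute)
    # The text states the debt is capped at twice the per-minute limit.
    debt = min(overshoot, 2 * limit_per_minute)
    return math.ceil(debt / limit_per_minute)


# A 1 GB GET against a 1 byte per minute read limit.
print(blocked_minutes(10**9, 1))
```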
Different options for limiting rates:
- Bucket: The --bucket option allows you to specify a rate limit for a bucket.
- User: The --uid option allows you to specify a rate limit for a user.
- Maximum read ops: The --max-read-ops setting allows you to specify the maximum number of read ops per minute per Ceph Object Gateway. A value of 0 disables this setting, which means unlimited access.
- Maximum read bytes: The --max-read-bytes setting allows you to specify the maximum number of read bytes per minute per Ceph Object Gateway. A value of 0 disables this setting, which means unlimited access.
- Maximum write ops: The --max-write-ops setting allows you to specify the maximum number of write ops per minute per Ceph Object Gateway. A value of 0 disables this setting, which means unlimited access.
- Maximum write bytes: The --max-write-bytes setting allows you to specify the maximum number of write bytes per minute per Ceph Object Gateway. A value of 0 disables this setting, which means unlimited access.
- Rate limit scope: The --ratelimit-scope option sets the scope for the rate limit. The options are bucket, user, and anonymous. Bucket rate limit applies to buckets, user rate limit applies to a user, and anonymous applies to an unauthenticated user. Anonymous scope is only available for global rate limit.
9.11.2. Enabling user rate limit
You can set rate limits on users in a Ceph Object Gateway configuration. The rate limit on users includes the maximum number of read operations and write operations per minute, and how many bytes per minute can be written or read per user.
You can enable the rate limit on users after setting the value of rate limits by using the radosgw-admin ratelimit set command with the --ratelimit-scope option set to user.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed.
Procedure
Set the rate limit for the user:
Syntax
radosgw-admin ratelimit set --ratelimit-scope=user --uid=USER_ID [--max-read-ops=NUMBER_OF_OPERATIONS] [--max-read-bytes=NUMBER_OF_BYTES] [--max-write-ops=NUMBER_OF_OPERATIONS] [--max-write-bytes=NUMBER_OF_BYTES]
Example
[ceph: root@host01 /]# radosgw-admin ratelimit set --ratelimit-scope=user --uid=testing --max-read-ops=1024 --max-write-bytes=10240
A value of 0 for NUMBER_OF_OPERATIONS or NUMBER_OF_BYTES means that the specific rate limit attribute check is disabled.
Get the user rate limit:
Syntax
radosgw-admin ratelimit get --ratelimit-scope=user --uid=USER_ID
Example
[ceph: root@host01 /]# radosgw-admin ratelimit get --ratelimit-scope=user --uid=testing { "user_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 10240, "enabled": false } }
Enable user rate limit:
Syntax
radosgw-admin ratelimit enable --ratelimit-scope=user --uid=USER_ID
Example
[ceph: root@host01 /]# radosgw-admin ratelimit enable --ratelimit-scope=user --uid=testing { "user_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 10240, "enabled": true } }
Optional: Disable user rate limit:
Syntax
radosgw-admin ratelimit disable --ratelimit-scope=user --uid=USER_ID
Example
[ceph: root@host01 /]# radosgw-admin ratelimit disable --ratelimit-scope=user --uid=testing
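If you script rate limit changes, building the argument list in one place helps avoid typos in option names. The following Python sketch assembles a radosgw-admin ratelimit set command line from the options shown in this procedure. The ratelimit_set_args helper and its keyword names are hypothetical; it only builds the list and does not run the command.

```python
TARGET_FLAGS = {"user": "--uid", "bucket": "--bucket"}
LIMIT_NAMES = ("max_read_ops", "max_read_bytes", "max_write_ops", "max_write_bytes")


def ratelimit_set_args(scope, target, **limits):
    """Build the argument list for `radosgw-admin ratelimit set`."""
    if scope not in TARGET_FLAGS:
        raise ValueError(f"unsupported scope: {scope}")
    unknown = set(limits) - set(LIMIT_NAMES)
    if unknown:
        raise ValueError(f"unknown limits: {sorted(unknown)}")
    args = [
        "radosgw-admin", "ratelimit", "set",
        f"--ratelimit-scope={scope}",
        f"{TARGET_FLAGS[scope]}={target}",
    ]
    for name in LIMIT_NAMES:
        if name in limits:
            value = int(limits[name])
            if value < 0:
                raise ValueError(f"{name} must not be negative")
            # max_read_ops becomes --max-read-ops, and so on.
            args.append(f"--{name.replace('_', '-')}={value}")
    return args


print(" ".join(ratelimit_set_args("user", "testing", max_read_ops=1024, max_write_bytes=10240)))
```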
9.11.3. Enabling bucket rate limit
You can set rate limits on buckets in a Ceph Object Gateway configuration. The rate limit on buckets includes the maximum number of read operations and write operations per minute, and how many bytes per minute can be written or read per bucket.
You can enable the rate limit on buckets after setting the value of rate limits by using the radosgw-admin ratelimit set command with the --ratelimit-scope option set to bucket.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed.
Procedure
Set the rate limit for the bucket:
Syntax
radosgw-admin ratelimit set --ratelimit-scope=bucket --bucket=BUCKET_NAME [--max-read-ops=NUMBER_OF_OPERATIONS] [--max-read-bytes=NUMBER_OF_BYTES] [--max-write-ops=NUMBER_OF_OPERATIONS] [--max-write-bytes=NUMBER_OF_BYTES]
Example
[ceph: root@host01 /]# radosgw-admin ratelimit set --ratelimit-scope=bucket --bucket=mybucket --max-read-ops=1024 --max-write-bytes=10240
A value of 0 for NUMBER_OF_OPERATIONS or NUMBER_OF_BYTES means that the specific rate limit attribute check is disabled.
Get the bucket rate limit:
Syntax
radosgw-admin ratelimit get --ratelimit-scope=bucket --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin ratelimit get --ratelimit-scope=bucket --bucket=mybucket
{ "bucket_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 10240, "enabled": false } }
Enable bucket rate limit:
Syntax
radosgw-admin ratelimit enable --ratelimit-scope=bucket --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin ratelimit enable --ratelimit-scope=bucket --bucket=mybucket
{ "bucket_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 10240, "enabled": true } }
Optional: Disable bucket rate limit:
Syntax
radosgw-admin ratelimit disable --ratelimit-scope=bucket --bucket=BUCKET_NAME
Example
[ceph: root@host01 /]# radosgw-admin ratelimit disable --ratelimit-scope=bucket --bucket=mybucket
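The output of ratelimit get can be read programmatically to see which limits are actually in force. The following Python sketch is illustrative only: the function name summarize_ratelimit and the shape of its return value are assumptions made for this example, and the sample text is copied from the ratelimit get example above. It applies the rule stated earlier that a value of 0 means the attribute check is disabled.

```python
import json

# Sample text copied from the `radosgw-admin ratelimit get` example above.
SAMPLE_OUTPUT = """
{ "bucket_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0,
  "max_read_bytes": 0, "max_write_bytes": 10240, "enabled": false } }
"""

LIMIT_KEYS = ("max_read_ops", "max_write_ops", "max_read_bytes", "max_write_bytes")


def summarize_ratelimit(raw_json):
    """Split each scope's settings into configured limits and unchecked attributes.

    Hypothetical helper: a value of 0 is reported as "unchecked" because the
    corresponding rate limit attribute check is disabled.
    """
    summary = {}
    for scope, settings in json.loads(raw_json).items():
        configured = {k: settings[k] for k in LIMIT_KEYS if settings.get(k, 0) > 0}
        unchecked = [k for k in LIMIT_KEYS if settings.get(k, 0) == 0]
        summary[scope] = {
            "enabled": bool(settings.get("enabled", False)),
            "configured": configured,
            "unchecked": unchecked,
        }
    return summary


if __name__ == "__main__":
    for scope, info in summarize_ratelimit(SAMPLE_OUTPUT).items():
        print(scope, info)
```

Note that configured limits only take effect once the "enabled" field is true, which is why the summary reports it separately.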
9.11.4. Configuring global rate limits
You can read or write global rate limit settings in period configuration. You can override the user or bucket rate limit configuration by manipulating the global rate limit settings with the global ratelimit parameter, which is the counterpart of the ratelimit set, ratelimit enable, and ratelimit disable commands.
In a multi-site configuration, where there is a realm and period present, changes to the global rate limit must be committed using the period update --commit command. If there is no period present, the Ceph Object Gateways must be restarted for the changes to take effect.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- A Ceph Object Gateway installed.
Procedure
View the global rate limit settings:
Syntax
radosgw-admin global ratelimit get
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit get
{ "bucket_ratelimit": { "max_read_ops": 1024, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 0, "enabled": false }, "user_ratelimit": { "max_read_ops": 0, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 0, "enabled": false }, "anonymous_ratelimit": { "max_read_ops": 0, "max_write_ops": 0, "max_read_bytes": 0, "max_write_bytes": 0, "enabled": false } }
Configure and enable rate limit scope for the buckets:
Set the global rate limits for bucket:
Syntax
radosgw-admin global ratelimit set --ratelimit-scope=bucket [--max-read-ops=NUMBER_OF_OPERATIONS] [--max-read-bytes=NUMBER_OF_BYTES] [--max-write-ops=NUMBER_OF_OPERATIONS] [--max-write-bytes=NUMBER_OF_BYTES]
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit set --ratelimit-scope bucket --max-read-ops=1024
Enable bucket rate limit:
Syntax
radosgw-admin global ratelimit enable --ratelimit-scope=bucket
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit enable --ratelimit-scope bucket
Configure and enable rate limit scope for authenticated users:
Set the global rate limits for users:
Syntax
radosgw-admin global ratelimit set --ratelimit-scope=user [--max-read-ops=NUMBER_OF_OPERATIONS] [--max-read-bytes=NUMBER_OF_BYTES] [--max-write-ops=NUMBER_OF_OPERATIONS] [--max-write-bytes=NUMBER_OF_BYTES]
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit set --ratelimit-scope=user --max-read-ops=1024
Enable user rate limit:
Syntax
radosgw-admin global ratelimit enable --ratelimit-scope=user
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit enable --ratelimit-scope=user
Configure and enable rate limit scope for unauthenticated users:
Set the global rate limits for unauthenticated users:
Syntax
radosgw-admin global ratelimit set --ratelimit-scope=anonymous [--max-read-ops=NUMBER_OF_OPERATIONS] [--max-read-bytes=NUMBER_OF_BYTES] [--max-write-ops=NUMBER_OF_OPERATIONS] [--max-write-bytes=NUMBER_OF_BYTES]
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit set --ratelimit-scope=anonymous --max-read-ops=1024
Enable anonymous rate limit:
Syntax
radosgw-admin global ratelimit enable --ratelimit-scope=anonymous
Example
[ceph: root@host01 /]# radosgw-admin global ratelimit enable --ratelimit-scope=anonymous
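The set and enable steps follow the same pattern for each of the three scopes. As a rough sketch, the following Python helper assembles the command strings shown above for a given scope. The function name global_ratelimit_commands and its multisite parameter are hypothetical and exist only in this example; the helper prints commands and does not run them.

```python
import shlex

VALID_SCOPES = ("bucket", "user", "anonymous")

# Maps a keyword argument name to the CLI flag used in the syntax above.
LIMIT_FLAGS = {
    "max_read_ops": "--max-read-ops",
    "max_read_bytes": "--max-read-bytes",
    "max_write_ops": "--max-write-ops",
    "max_write_bytes": "--max-write-bytes",
}


def global_ratelimit_commands(scope, multisite=False, **limits):
    """Return the command strings that set and then enable a global rate limit.

    Hypothetical helper: when multisite is True, the period commit command
    is appended, as the text above requires when a realm and period exist.
    """
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    unknown = sorted(set(limits) - set(LIMIT_FLAGS))
    if unknown:
        raise ValueError(f"unknown limits: {unknown}")

    set_cmd = ["radosgw-admin", "global", "ratelimit", "set",
               f"--ratelimit-scope={scope}"]
    for name, flag in LIMIT_FLAGS.items():
        if name in limits:
            value = int(limits[name])
            if value < 0:
                raise ValueError(f"{name} must not be negative")
            set_cmd.append(f"{flag}={value}")

    enable_cmd = ["radosgw-admin", "global", "ratelimit", "enable",
                  f"--ratelimit-scope={scope}"]
    commands = [shlex.join(set_cmd), shlex.join(enable_cmd)]
    if multisite:
        commands.append("radosgw-admin period update --commit")
    return commands


if __name__ == "__main__":
    for line in global_ratelimit_commands("anonymous", max_read_ops=1024):
        print(line)
```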
9.12. Optimize the Ceph Object Gateway’s garbage collection
When new data objects are written into the storage cluster, the Ceph Object Gateway immediately allocates the storage for these new objects. After you delete or overwrite data objects in the storage cluster, the Ceph Object Gateway deletes those objects from the bucket index. Some time afterward, the Ceph Object Gateway then purges the space that was used to store the objects in the storage cluster. The process of purging the deleted object data from the storage cluster is known as Garbage Collection, or GC.
Garbage collection operations typically run in the background. You can configure these operations to either run continuously, or to run only during intervals of low activity and light workloads. By default, the Ceph Object Gateway conducts GC operations continuously. Because GC operations are a normal part of Ceph Object Gateway operations, deleted objects that are eligible for garbage collection exist most of the time.
9.12.1. Viewing the garbage collection queue
Before you purge deleted and overwritten objects from the storage cluster, use radosgw-admin to view the objects awaiting garbage collection.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the Ceph Object Gateway.
Procedure
To view the queue of objects awaiting garbage collection:
Example
[ceph: root@host01 /]# radosgw-admin gc list
To list all entries in the queue, including unexpired entries, use the --include-all option.
9.12.2. Adjusting garbage collection settings
The Ceph Object Gateway allocates storage for new and overwritten objects immediately. Additionally, the parts of a multi-part upload also consume some storage.
The Ceph Object Gateway purges the storage space used for deleted objects after deleting the objects from the bucket index. Similarly, the Ceph Object Gateway will delete data associated with a multi-part upload after the multi-part upload completes or when the upload has gone inactive or failed to complete for a configurable amount of time. The process of purging the deleted object data from the Red Hat Ceph Storage cluster is known as garbage collection (GC).
Viewing the objects awaiting garbage collection can be done with the following command:
radosgw-admin gc list
Garbage collection is a background activity that runs continuously or during times of low loads, depending upon how the storage administrator configures the Ceph Object Gateway. By default, the Ceph Object Gateway conducts garbage collection operations continuously. Since garbage collection operations are a normal function of the Ceph Object Gateway, especially with object delete operations, objects eligible for garbage collection exist most of the time.
Some workloads can temporarily or permanently outpace the rate of garbage collection activity. This is especially true of delete-heavy workloads, where many objects get stored for a short period of time and then deleted. For these types of workloads, storage administrators can increase the priority of garbage collection operations relative to other operations with the following configuration parameters:
- The rgw_gc_obj_min_wait configuration option sets the minimum length of time, in seconds, that the gateway waits before purging a deleted object’s data. The default value is two hours, or 7200 seconds. The object is not purged immediately, because a client might be reading the object. Under heavy workloads, this setting can consume too much storage or leave a large number of deleted objects to purge. Red Hat recommends not setting this value below 30 minutes, or 1800 seconds.
- The rgw_gc_processor_period configuration option is the garbage collection cycle run time, that is, the amount of time between the start of consecutive runs of garbage collection threads. If garbage collection runs longer than this period, the Ceph Object Gateway will not wait before running a garbage collection cycle again.
- The rgw_gc_max_concurrent_io configuration option specifies the maximum number of concurrent IO operations that the gateway garbage collection thread will use when purging deleted data. Under delete-heavy workloads, consider increasing this setting to a larger number of concurrent IO operations.
- The rgw_gc_max_trim_chunk configuration option specifies the maximum number of keys to remove from the garbage collector log in a single operation. Under delete-heavy workloads, consider increasing the maximum number of keys so that more objects are purged during each garbage collection operation.
Starting with Red Hat Ceph Storage 4.1, offloading the index object’s OMAP from the garbage collection log helps lessen the performance impact of garbage collection activities on the storage cluster. Some new configuration parameters have been added to Ceph Object Gateway to tune the garbage collection queue, as follows:
- The rgw_gc_max_deferred_entries_size configuration option sets the maximum size of deferred entries in the garbage collection queue.
- The rgw_gc_max_queue_size configuration option sets the maximum queue size used for garbage collection. This value should not be greater than osd_max_object_size minus rgw_gc_max_deferred_entries_size minus 1 KB.
- The rgw_gc_max_deferred configuration option sets the maximum number of deferred entries stored in the garbage collection queue.
These garbage collection configuration parameters are for Red Hat Ceph Storage 7 and higher.
In testing, with an evenly balanced delete-write workload, such as 50% delete and 50% write operations, the storage cluster filled completely in 11 hours. This is because Ceph Object Gateway garbage collection failed to keep pace with the delete operations. The cluster status switches to the HEALTH_ERR state if this happens. Aggressive settings for parallel garbage collection tunables significantly delayed the onset of storage cluster fill in testing and can be helpful for many workloads. Typical real-world storage cluster workloads are not likely to cause a storage cluster fill primarily due to garbage collection.
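The two numeric constraints described in this section, the upper bound on rgw_gc_max_queue_size and the recommended floor for rgw_gc_obj_min_wait, can be checked with simple arithmetic. The following Python sketch is an illustration only: the function names are made up for this example, and the sizes used in the demonstration (128 MiB and 3 MiB) are hypothetical inputs, not values confirmed by this guide. Read the real values from your own cluster configuration.

```python
KIB = 1024
MIB = 1024 * KIB

# Floor recommended in the text above: 30 minutes, in seconds.
MIN_RECOMMENDED_OBJ_MIN_WAIT = 1800


def max_gc_queue_size(osd_max_object_size, max_deferred_entries_size):
    """Upper bound for rgw_gc_max_queue_size, in bytes.

    Implements: osd_max_object_size - rgw_gc_max_deferred_entries_size - 1 KB.
    """
    bound = osd_max_object_size - max_deferred_entries_size - KIB
    if bound <= 0:
        raise ValueError("deferred entries size leaves no room for the queue")
    return bound


def check_gc_settings(obj_min_wait, queue_size,
                      osd_max_object_size, max_deferred_entries_size):
    """Return a list of warnings for values outside the documented guidance."""
    warnings = []
    if obj_min_wait < MIN_RECOMMENDED_OBJ_MIN_WAIT:
        warnings.append("rgw_gc_obj_min_wait is below 1800 seconds")
    bound = max_gc_queue_size(osd_max_object_size, max_deferred_entries_size)
    if queue_size > bound:
        warnings.append(f"rgw_gc_max_queue_size exceeds {bound} bytes")
    return warnings


if __name__ == "__main__":
    # Hypothetical inputs chosen only to demonstrate the arithmetic.
    print(max_gc_queue_size(128 * MIB, 3 * MIB))
    print(check_gc_settings(600, 128 * MIB, 128 * MIB, 3 * MIB))
```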
9.12.3. Adjusting garbage collection for delete-heavy workloads
Some workloads may temporarily or permanently outpace the rate of garbage collection activity. This is especially true of delete-heavy workloads, where many objects get stored for a short period of time and are then deleted. For these types of workloads, consider increasing the priority of garbage collection operations relative to other operations. Contact Red Hat Support with any additional questions about Ceph Object Gateway Garbage Collection.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to all nodes in the storage cluster.
Procedure
Set the value of rgw_gc_max_concurrent_io to 20, and the value of rgw_gc_max_trim_chunk to 64:
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_gc_max_concurrent_io 20
[ceph: root@host01 /]# ceph config set client.rgw rgw_gc_max_trim_chunk 64
- Restart the Ceph Object Gateway to allow the changed settings to take effect.
- Monitor the storage cluster during GC activity to verify that the increased values do not adversely affect performance.
Never modify the value for the rgw_gc_max_objs option in a running cluster. You should only change this value before deploying the RGW nodes.
9.13. Optimize the Ceph Object Gateway’s data object storage
Bucket lifecycle configuration optimizes data object storage to increase its efficiency and to provide effective storage throughout the lifetime of the data.
The S3 API in the Ceph Object Gateway currently supports a subset of the AWS bucket lifecycle configuration actions:
- Expiration
- NoncurrentVersionExpiration
- AbortIncompleteMultipartUpload
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to all of the nodes in the storage cluster.
9.13.1. Parallel thread processing for bucket life cycles
The Ceph Object Gateway now allows for parallel thread processing of bucket life cycles across multiple Ceph Object Gateway instances. Increasing the number of threads that run in parallel enables the Ceph Object Gateway to process large workloads more efficiently. In addition, the Ceph Object Gateway now uses a numbered sequence for index shard enumeration instead of using in-order numbering.
9.13.2. Optimizing the bucket lifecycle
Two options in the Ceph configuration file affect the efficiency of bucket lifecycle processing:
- rgw_lc_max_worker specifies the number of lifecycle worker threads to run in parallel. This enables the simultaneous processing of both bucket and index shards. The default value for this option is 3.
- rgw_lc_max_wp_worker specifies the number of threads in each lifecycle worker thread’s work pool. This option helps to accelerate processing for each bucket. The default value for this option is 3.
For a workload with a large number of buckets, for example, thousands of buckets, consider increasing the value of the rgw_lc_max_worker option.
For a workload with a smaller number of buckets but a higher number of objects in each bucket, such as hundreds of thousands, consider increasing the value of the rgw_lc_max_wp_worker option.
Before increasing the value of either of these options, please validate current storage cluster performance and Ceph Object Gateway utilization. Red Hat does not recommend that you assign a value of 10 or above for either of these options.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to all of the nodes in the storage cluster.
Procedure
To increase the number of threads to run in parallel, set the value of rgw_lc_max_worker to a value between 3 and 9:
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_lc_max_worker 7
To increase the number of threads in each thread’s work pool, set the value of rgw_lc_max_wp_worker to a value between 3 and 9:
Example
[ceph: root@host01 /]# ceph config set client.rgw rgw_lc_max_wp_worker 7
- Restart the Ceph Object Gateway to allow the changed settings to take effect.
- Monitor the storage cluster to verify that the increased values do not adversely affect performance.
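Because Red Hat does not recommend values of 10 or above for either option, it can help to validate a proposed value before applying it. The following Python sketch is a minimal illustration under that assumption; the function name lc_config_command is hypothetical, and the helper only builds the ceph config set command string shown in the procedure without running it.

```python
LC_OPTIONS = ("rgw_lc_max_worker", "rgw_lc_max_wp_worker")

# Range used in the procedure above: the default is 3, and values of
# 10 or above are not recommended.
MIN_VALUE = 3
MAX_VALUE = 9


def lc_config_command(option, value):
    """Return the `ceph config set` command for a lifecycle worker option.

    Hypothetical helper: raises ValueError for an unknown option or for a
    value outside the 3 to 9 range.
    """
    if option not in LC_OPTIONS:
        raise ValueError(f"unknown lifecycle option: {option!r}")
    if not isinstance(value, int) or not MIN_VALUE <= value <= MAX_VALUE:
        raise ValueError(f"{option} must be an integer from {MIN_VALUE} to {MAX_VALUE}")
    return f"ceph config set client.rgw {option} {value}"


if __name__ == "__main__":
    print(lc_config_command("rgw_lc_max_worker", 7))
```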
Additional Resources
- For more information about the bucket lifecycle and parallel thread processing, see Bucket lifecycle parallel processing
- For more information about Ceph Object Gateway lifecycle, contact Red Hat Support.
9.14. Transitioning data to Amazon S3 cloud service
You can transition data to a remote cloud service as part of the lifecycle configuration using storage classes to reduce cost and improve manageability. The transition is unidirectional and data cannot be transitioned back from the remote zone. This feature is to enable data transition to multiple cloud providers such as Amazon (S3).
Use cloud-s3 as tier-type to configure the remote cloud S3 object store service to which the data needs to be transitioned. Storage classes of this tier type do not need a data pool and are defined in terms of the zonegroup placement targets.
Prerequisites
- A Red Hat Ceph Storage cluster with Ceph Object Gateway installed.
- User credentials for the remote cloud service, Amazon S3.
- Target path created on Amazon S3.
- s3cmd installed on the bootstrapped node.
- Amazon AWS configured locally to download data.
Procedure
Create a user with access key and secret key:
Syntax
radosgw-admin user create --uid=USER_NAME --display-name="DISPLAY_NAME" [--access-key ACCESS_KEY --secret-key SECRET_KEY]
Example
[ceph: root@host01 /]# radosgw-admin user create --uid=test-user --display-name="test-user" --access-key a21e86bce636c3aa1 --secret-key cf764951f1fdde5e { "user_id": "test-user", "display_name": "test-user", "email": "", "suspended": 0, "max_buckets": 1000, "subusers": [], "keys": [ { "user": "test-user", "access_key": "a21e86bce636c3aa1", "secret_key": "cf764951f1fdde5e" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "default_storage_class": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw", "mfa_ids": [] }
On the bootstrapped node, add a storage class with the tier type as cloud-s3:
Note
Once a storage class is created with the --tier-type=cloud-s3 option, it cannot be later modified to any other storage class type.
Syntax
radosgw-admin zonegroup placement add --rgw-zonegroup=ZONE_GROUP_NAME \
      --placement-id=PLACEMENT_ID \
      --storage-class=STORAGE_CLASS_NAME \
      --tier-type=cloud-s3
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup=default \
      --placement-id=default-placement \
      --storage-class=CLOUDTIER \
      --tier-type=cloud-s3
[ { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "CLOUDTIER", "STANDARD" ], "tier_targets": [ { "key": "CLOUDTIER", "val": { "tier_type": "cloud-s3", "storage_class": "CLOUDTIER", "retain_head_object": "false", "s3": { "endpoint": "", "access_key": "", "secret": "", "host_style": "path", "target_storage_class": "", "target_path": "", "acl_mappings": [], "multipart_sync_threshold": 33554432, "multipart_min_part_size": 33554432 } } } ] } } ]
Update storage_class:
Note
If the cluster is part of a multi-site setup, run period update --commit so that the zonegroup changes are propagated to all the zones in the multi-site.
Note
Make sure access_key and secret do not start with a digit.
Mandatory parameters are:
- access_key is the remote cloud S3 access key used for a specific connection.
- secret is the secret key for the remote cloud S3 service.
- endpoint is the URL of the remote cloud S3 service endpoint.
- region (for AWS) is the remote cloud S3 service region name.
Optional parameters are:
- target_path defines how the target path is created. The target path specifies a prefix to which the source bucket-name/object-name is appended. If not specified, the target_path created is rgwx-ZONE_GROUP_NAME-STORAGE_CLASS_NAME-cloud-bucket.
- target_storage_class defines the target storage class to which the object transitions. If not specified, the object is transitioned to STANDARD storage class.
- retain_head_object, if true, retains the metadata of the object transitioned to cloud. If false (default), the object is deleted post transition. This option is ignored for current versioned objects.
- multipart_sync_threshold specifies that objects this size or larger are transitioned to the cloud using multipart upload.
- multipart_min_part_size specifies the minimum part size to use when transitioning objects using multipart upload.
Syntax
radosgw-admin zonegroup placement modify --rgw-zonegroup ZONE_GROUP_NAME \
      --placement-id PLACEMENT_ID \
      --storage-class STORAGE_CLASS_NAME \
      --tier-config=endpoint=AWS_ENDPOINT_URL,\
access_key=AWS_ACCESS_KEY,secret=AWS_SECRET_KEY,\
target_path="TARGET_BUCKET_ON_AWS",\
multipart_sync_threshold=44432,\
multipart_min_part_size=44432,\
retain_head_object=true region=REGION_NAME
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement modify --rgw-zonegroup default \
      --placement-id default-placement \
      --storage-class CLOUDTIER \
      --tier-config=endpoint=http://10.0.210.010:8080,\
access_key=a21e86bce636c3aa2,secret=cf764951f1fdde5f,\
target_path="dfqe-bucket-01",\
multipart_sync_threshold=44432,\
multipart_min_part_size=44432,\
retain_head_object=true region=us-east-1
[ { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "CLOUDTIER", "STANDARD", "cold.test", "hot.test" ], "tier_targets": [ { "key": "CLOUDTIER", "val": { "tier_type": "cloud-s3", "storage_class": "CLOUDTIER", "retain_head_object": "true", "s3": { "endpoint": "http://10.0.210.010:8080", "access_key": "a21e86bce636c3aa2", "secret": "cf764951f1fdde5f", "region": "", "host_style": "path", "target_storage_class": "", "target_path": "dfqe-bucket-01", "acl_mappings": [], "multipart_sync_threshold": 44432, "multipart_min_part_size": 44432 } } } ] } } ]
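The --tier-config value is a comma-separated list of key=value pairs, so it is easy to mistype by hand. The following Python sketch shows one way to assemble it from a dictionary. It is illustrative only: the function name build_tier_config is hypothetical, treating only endpoint, access_key, and secret as required is a simplification made for this example, and the check that the keys do not start with a digit mirrors the note in this procedure.

```python
REQUIRED_KEYS = ("endpoint", "access_key", "secret")


def build_tier_config(params):
    """Return a --tier-config argument built from a dict of settings.

    Hypothetical helper: booleans become "true"/"false", and values that
    contain a comma or whitespace are rejected because they would break
    the comma-separated key=value format.
    """
    missing = [key for key in REQUIRED_KEYS if not params.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {missing}")
    for key in ("access_key", "secret"):
        if str(params[key])[0].isdigit():
            raise ValueError(f"{key} must not start with a digit")

    parts = []
    for key, value in params.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        if "," in text or any(ch.isspace() for ch in text):
            raise ValueError(f"{key} contains a comma or whitespace")
        parts.append(f"{key}={text}")
    return "--tier-config=" + ",".join(parts)


if __name__ == "__main__":
    # Values copied from the example above.
    print(build_tier_config({
        "endpoint": "http://10.0.210.010:8080",
        "access_key": "a21e86bce636c3aa2",
        "secret": "cf764951f1fdde5f",
        "target_path": "dfqe-bucket-01",
        "multipart_sync_threshold": 44432,
        "multipart_min_part_size": 44432,
        "retain_head_object": True,
    }))
```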
Restart the Ceph Object Gateway:
Syntax
ceph orch restart CEPH_OBJECT_GATEWAY_SERVICE_NAME
Example
[ceph: root@host01 /]# ceph orch restart rgw.rgw.1
Scheduled to restart rgw.rgw.1.host03.vkfldf on host 'host03'
Exit the shell and as a root user, configure Amazon S3 on your bootstrapped node:
Example
[root@host01 ~]# s3cmd --configure Enter new values or accept defaults in brackets with Enter. Refer to user manual for detailed description of all options. Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables. Access Key: a21e86bce636c3aa2 Secret Key: cf764951f1fdde5f Default Region [US]: Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3. S3 Endpoint [s3.amazonaws.com]: 10.0.210.78:80 Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used if the target S3 system supports dns based buckets. DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: 10.0.210.78:80 Encryption password is used to protect your files from reading by unauthorized persons while in transfer to S3 Encryption password: Path to GPG program [/usr/bin/gpg]: When using secure HTTPS protocol all communication with Amazon S3 servers is protected from 3rd party eavesdropping. This method is slower than plain HTTP, and can only be proxied with Python 2.7 or newer Use HTTPS protocol [Yes]: No On some networks all internet access must go through a HTTP proxy. Try setting it here if you can't connect to S3 directly HTTP Proxy server name: New settings: Access Key: a21e86bce636c3aa2 Secret Key: cf764951f1fdde5f Default Region: US S3 Endpoint: 10.0.210.78:80 DNS-style bucket+hostname:port template for accessing a bucket: 10.0.210.78:80 Encryption password: Path to GPG program: /usr/bin/gpg Use HTTPS protocol: False HTTP Proxy server name: HTTP Proxy server port: 0 Test access with supplied credentials? [Y/n] Y Please wait, attempting to list all buckets... Success. Your access key and secret key worked fine :-) Now verifying that encryption works... Not configured. Never mind. Save settings? [y/N] y Configuration saved to '/root/.s3cfg'
Create the S3 bucket:
Syntax
s3cmd mb s3://NAME_OF_THE_BUCKET_FOR_S3
Example
[root@host01 ~]# s3cmd mb s3://awstestbucket Bucket 's3://awstestbucket/' created
Create your file, input all the data, and move it to S3 service:
Syntax
s3cmd put FILE_NAME s3://NAME_OF_THE_BUCKET_ON_S3
Example
[root@host01 ~]# s3cmd put test.txt s3://awstestbucket upload: 'test.txt' -> 's3://awstestbucket/test.txt' [1 of 1] 21 of 21 100% in 1s 16.75 B/s done
Create the lifecycle configuration transition policy:
Syntax
<LifecycleConfiguration> <Rule> <ID>RULE_NAME</ID> <Filter> <Prefix></Prefix> </Filter> <Status>Enabled</Status> <Transition> <Days>DAYS</Days> <StorageClass>STORAGE_CLASS_NAME</StorageClass> </Transition> </Rule> </LifecycleConfiguration>
Example
[root@host01 ~]# cat lc_cloud.xml <LifecycleConfiguration> <Rule> <ID>Archive all objects</ID> <Filter> <Prefix></Prefix> </Filter> <Status>Enabled</Status> <Transition> <Days>2</Days> <StorageClass>CLOUDTIER</StorageClass> </Transition> </Rule> </LifecycleConfiguration>
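If you prefer to generate the policy file rather than write it by hand, the same structure can be produced with a few lines of Python. This is a minimal sketch using the standard library; the function name build_transition_policy is hypothetical, and the rule values in the demonstration are taken from the example above.

```python
import xml.etree.ElementTree as ET


def build_transition_policy(rule_id, days, storage_class, prefix=""):
    """Return a LifecycleConfiguration XML string with one transition rule.

    Hypothetical helper: an empty prefix matches all objects in the bucket,
    as in the example above.
    """
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")

    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    rule_filter = ET.SubElement(rule, "Filter")
    ET.SubElement(rule_filter, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"
    transition = ET.SubElement(rule, "Transition")
    ET.SubElement(transition, "Days").text = str(days)
    ET.SubElement(transition, "StorageClass").text = storage_class
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_transition_policy("Archive all objects", 2, "CLOUDTIER"))
```

An empty prefix is serialized as a self-closing Prefix element, which is equivalent XML to the empty element shown in the example.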
Set the lifecycle configuration transition policy:
Syntax
s3cmd setlifecycle FILE_NAME s3://NAME_OF_THE_BUCKET_FOR_S3
Example
[root@host01 ~]# s3cmd setlifecycle lc_cloud.xml s3://awstestbucket
s3://awstestbucket/: Lifecycle Policy updated
Log in to cephadm shell:
Example
[root@host01 ~]# cephadm shell
Restart the Ceph Object Gateway:
Syntax
ceph orch restart CEPH_OBJECT_GATEWAY_SERVICE_NAME
Example
[ceph: root@host01 /]# ceph orch restart rgw.rgw.1
Scheduled to restart rgw.rgw.1.host03.vkfldf on host 'host03'
Verification
On the source cluster, verify that the data has moved to S3 with the radosgw-admin lc list command:
Example
[ceph: root@host01 /]# radosgw-admin lc list [ { "bucket": ":awstestbucket:552a3adb-39e0-40f6-8c84-00590ed70097.54639.1", "started": "Mon, 26 Sep 2022 18:32:07 GMT", "status": "COMPLETE" } ]
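For scripted verification, the lc list output can be parsed to find the lifecycle status of a single bucket. The following Python sketch is an illustration: the function name lifecycle_status is hypothetical, and it assumes that the "bucket" field has the colon-separated form seen in the sample above, with the bucket name as the second component.

```python
import json

# Sample text copied from the `radosgw-admin lc list` example above.
SAMPLE_LC_LIST = """
[ { "bucket": ":awstestbucket:552a3adb-39e0-40f6-8c84-00590ed70097.54639.1",
    "started": "Mon, 26 Sep 2022 18:32:07 GMT",
    "status": "COMPLETE" } ]
"""


def lifecycle_status(raw_json, bucket_name):
    """Return the lifecycle status for bucket_name, or None if not listed.

    Assumption: the "bucket" field is colon-separated and the bucket name
    is its second component, as in the sample output.
    """
    for entry in json.loads(raw_json):
        parts = entry.get("bucket", "").split(":")
        if len(parts) >= 2 and parts[1] == bucket_name:
            return entry.get("status")
    return None


if __name__ == "__main__":
    print(lifecycle_status(SAMPLE_LC_LIST, "awstestbucket"))
```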
Verify object transition at cloud endpoint:
Example
[root@client ~]$ radosgw-admin bucket list [ "awstestbucket" ]
List the objects in the bucket:
Example
[root@host01 ~]$ aws s3api list-objects --bucket awstestbucket --endpoint=http://10.0.209.002:8080 { "Contents": [ { "Key": "awstestbucket/test", "LastModified": "2022-08-25T16:14:23.118Z", "ETag": "\"378c905939cc4459d249662dfae9fd6f\"", "Size": 29, "StorageClass": "STANDARD", "Owner": { "DisplayName": "test-user", "ID": "test-user" } } ] }
List the contents of the S3 bucket:
Example
[root@host01 ~]# s3cmd ls s3://awstestbucket 2022-08-25 09:57 0 s3://awstestbucket/test.txt
Check the information of the file:
Example
[root@host01 ~]# s3cmd info s3://awstestbucket/test.txt s3://awstestbucket/test.txt (object): File size: 0 Last mod: Mon, 03 Aug 2022 09:57:49 GMT MIME type: text/plain Storage: CLOUDTIER MD5 sum: 991d2528bb41bb839d1a9ed74b710794 SSE: none Policy: none CORS: none ACL: test-user: FULL_CONTROL x-amz-meta-s3cmd-attrs: atime:1664790668/ctime:1664790668/gid:0/gname:root/md5:991d2528bb41bb839d1a9ed74b710794/mode:33188/mtime:1664790668/uid:0/uname:root
Download data locally from Amazon S3:
Configure AWS:
Example
[client@client01 ~]$ aws configure AWS Access Key ID [****************6VVP]: AWS Secret Access Key [****************pXqy]: Default region name [us-east-1]: Default output format [json]:
List the contents of the AWS bucket:
Example
[client@client01 ~]$ aws s3 ls s3://dfqe-bucket-01/awstest PRE awstestbucket/
Download data from S3:
Example
[client@client01 ~]$ aws s3 cp s3://dfqe-bucket-01/awstestbucket/test.txt . download: s3://dfqe-bucket-01/awstestbucket/test.txt to ./test.txt
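The download path above follows from the target_path rule described earlier: the source bucket-name/object-name is appended to the target path. The following Python sketch works through that mapping. The function name remote_object_url is hypothetical, and the fallback name follows the rgwx-ZONE_GROUP_NAME-STORAGE_CLASS_NAME-cloud-bucket pattern literally; the gateway might normalize that name, so treat the fallback as an approximation.

```python
def remote_object_url(source_bucket, object_name, target_path=None,
                      zonegroup=None, storage_class=None):
    """Return the expected s3:// location of a transitioned object.

    Hypothetical helper: when target_path is not set, the documented
    default pattern is used, which needs the zonegroup and storage class.
    """
    if not target_path:
        if not (zonegroup and storage_class):
            raise ValueError("zonegroup and storage_class are needed "
                             "when target_path is not set")
        target_path = f"rgwx-{zonegroup}-{storage_class}-cloud-bucket"
    return f"s3://{target_path.strip('/')}/{source_bucket}/{object_name}"


if __name__ == "__main__":
    # Values from the examples in this procedure.
    print(remote_object_url("awstestbucket", "test.txt",
                            target_path="dfqe-bucket-01"))
```

With the values from this procedure, the result matches the path used in the aws s3 cp example.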
9.15. Transitioning data to Azure cloud service
You can transition data to a remote cloud service as part of the lifecycle configuration using storage classes to reduce cost and improve manageability. The transition is unidirectional and data cannot be transitioned back from the remote zone. This feature is to enable data transition to multiple cloud providers such as Azure. One of the key differences with the AWS configuration is that you need to configure the multi-cloud gateway (MCG) and use MCG to translate from the S3 protocol to Azure Blob.
Use cloud-s3 as tier-type to configure the remote cloud S3 object store service to which the data needs to be transitioned. Storage classes of this tier type do not need a data pool and are defined in terms of the zonegroup placement targets.
Prerequisites
- A Red Hat Ceph Storage cluster with Ceph Object Gateway installed.
- User credentials for the remote cloud service, Azure.
- Azure configured locally to download data.
- s3cmd installed on the bootstrapped node.
- Azure container for the MCG namespace created. In this example, it is mcgnamespace.
Procedure
Create a user with access key and secret key:
Syntax
radosgw-admin user create --uid=USER_NAME --display-name="DISPLAY_NAME" [--access-key ACCESS_KEY --secret-key SECRET_KEY]
Example
[ceph: root@host01 /]# radosgw-admin user create --uid=test-user --display-name="test-user" --access-key a21e86bce636c3aa1 --secret-key cf764951f1fdde5e { "user_id": "test-user", "display_name": "test-user", "email": "", "suspended": 0, "max_buckets": 1000, "subusers": [], "keys": [ { "user": "test-user", "access_key": "a21e86bce636c3aa1", "secret_key": "cf764951f1fdde5e" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "default_storage_class": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw", "mfa_ids": [] }
As a root user, configure AWS CLI with the user credentials and create a bucket with default placement:
Syntax
aws s3 --ca-bundle CA_PERMISSION --profile rgw --endpoint ENDPOINT_URL --region default mb s3://BUCKET_NAME
Example
[root@host01 ~]$ aws s3 --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default mb s3://transition
Verify that the bucket is using default-placement with the placement rule:
Example
[root@host01 ~]# radosgw-admin bucket stats --bucket transition { "bucket": "transition", "num_shards": 11, "tenant": "", "zonegroup": "b29b0e50-1301-4330-99fc-5cdcfc349acf", "placement_rule": "default-placement", "explicit_placement": { "data_pool": "", "data_extra_pool": "", "index_pool": "" },
Log into the OpenShift Container Platform (OCP) cluster with OpenShift Data Foundation (ODF) deployed:
Example
[root@host01 ~]$ oc project openshift-storage [root@host01 ~]$ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.11.6 True False 4d1h Cluster version is 4.11.6 [root@host01 ~]$ oc get storagecluster NAME AGE PHASE EXTERNAL CREATED AT VERSION ocs-storagecluster 4d Ready 2023-06-27T15:23:01Z 4.11.0
Configure the multi-cloud gateway (MCG) namespace Azure bucket running on an OCP cluster in Azure:
Syntax
noobaa namespacestore create azure-blob az --account-key='ACCOUNT_KEY' --account-name='ACCOUNT_NAME' --target-blob-container='AZURE_CONTAINER_NAME'
Example
[root@host01 ~]$ noobaa namespacestore create azure-blob az --account-key='iq3+6hRtt9bQ46QfHKQ0nSm2aP+tyMzdn8dBSRW4XWrFhY+1nwfqEj4hk2q66nmD85E/o5OrrUqo+AStkKwm9w==' --account-name='transitionrgw' --target-blob-container='mcgnamespace'
Create an MCG bucket class pointing to the namespacestore:
Example
[root@host01 ~]$ noobaa bucketclass create namespace-bucketclass single aznamespace-bucket-class --resource az -n openshift-storage
Create an object bucket claim (OBC) for the transition to cloud:
Syntax
noobaa obc create OBC_NAME --bucketclass aznamespace-bucket-class -n openshift-storage
Example
[root@host01 ~]$ noobaa obc create rgwobc --bucketclass aznamespace-bucket-class -n openshift-storage
Note
Use the credentials provided by OBC to configure zonegroup placement on the Ceph Object Gateway.
On the bootstrapped node, create a storage class with the tier type as cloud-s3 on the default placement within the default zonegroup on the previously configured MCG in Azure:
Note
Once a storage class is created with the --tier-type=cloud-s3 option, it cannot be later modified to any other storage class type.
Syntax
radosgw-admin zonegroup placement add --rgw-zonegroup=ZONE_GROUP_NAME \
      --placement-id=PLACEMENT_ID \
      --storage-class=STORAGE_CLASS_NAME \
      --tier-type=cloud-s3
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup=default \
      --placement-id=default-placement \
      --storage-class=AZURE \
      --tier-type=cloud-s3
[ { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "AZURE", "STANDARD" ], "tier_targets": [ { "key": "AZURE", "val": { "tier_type": "cloud-s3", "storage_class": "AZURE", "retain_head_object": "false", "s3": { "endpoint": "", "access_key": "", "secret": "", "host_style": "path", "target_storage_class": "", "target_path": "", "acl_mappings": [], "multipart_sync_threshold": 33554432, "multipart_min_part_size": 33554432 } } } ] } } ]
Configure the cloud-s3 storage class:
Syntax
radosgw-admin zonegroup placement modify --rgw-zonegroup ZONE_GROUP_NAME \
      --placement-id PLACEMENT_ID \
      --storage-class STORAGE_CLASS_NAME \
      --tier-config=endpoint=ENDPOINT_URL,\
access_key=ACCESS_KEY,secret=SECRET_KEY,\
target_path="TARGET_BUCKET_ON",\
multipart_sync_threshold=44432,\
multipart_min_part_size=44432,\
retain_head_object=true region=REGION_NAME
Important
Setting the retain_head_object parameter to true retains the metadata or the head of the object to list the objects that are transitioned.
Example
[ceph: root@host01 /]# radosgw-admin zonegroup placement modify --rgw-zonegroup default \
      --placement-id default-placement \
      --storage-class AZURE \
      --tier-config=endpoint="https://s3-openshift-storage.apps.ocp410.0e73azopenshift.com",\
access_key=a21e86bce636c3aa2,secret=cf764951f1fdde5f,\
target_path="dfqe-bucket-01",\
multipart_sync_threshold=44432,\
multipart_min_part_size=44432,\
retain_head_object=true region=us-east-1
[ { "key": "default-placement", "val": { "name": "default-placement", "tags": [], "storage_classes": [ "AZURE", "STANDARD", "cold.test", "hot.test" ], "tier_targets": [ { "key": "AZURE", "val": { "tier_type": "cloud-s3", "storage_class": "AZURE", "retain_head_object": "true", "s3": { "endpoint": "https://s3-openshift-storage.apps.ocp410.0e73azopenshift.com", "access_key": "a21e86bce636c3aa2", "secret": "cf764951f1fdde5f", "region": "", "host_style": "path", "target_storage_class": "", "target_path": "dfqe-bucket-01", "acl_mappings": [], "multipart_sync_threshold": 44432, "multipart_min_part_size": 44432 } } } ] } } ]
Restart the Ceph Object Gateway:
Syntax
ceph orch restart CEPH_OBJECT_GATEWAY_SERVICE_NAME
Example
[ceph: root@host01 /]# ceph orch restart client.rgw.objectgwhttps.host02.udyllp Scheduled to restart client.rgw.objectgwhttps.host02.udyllp on host 'host02'
Create the lifecycle configuration transition policy for the bucket created previously. In this example, the bucket is transition:
Syntax
cat transition.json { "Rules": [ { "Filter": { "Prefix": "" }, "Status": "Enabled", "Transitions": [ { "Days": 30, "StorageClass": "STORAGE_CLASS" } ], "ID": "TRANSITION_ID" } ] }
Note: All objects in the bucket that are older than 30 days are transferred to the cloud storage class called AZURE.
Example
[root@host01 ~]$ cat transition.json { "Rules": [ { "Filter": { "Prefix": "" }, "Status": "Enabled", "Transitions": [ { "Days": 30, "StorageClass": "AZURE" } ], "ID": "Transition Objects in bucket to AZURE Blob after 30 days" } ] }
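As a minimal sketch, assuming bash, the policy file shown above could also be generated from parameters rather than edited by hand, which keeps the storage class and the number of days consistent between the rule and its ID. The write_transition_policy function name and the wording of the generated ID are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a one-rule lifecycle policy that transitions
# every object (empty prefix) to the given storage class after N days.
write_transition_policy() {
  local storage_class=$1 days=$2 out=$3
  cat > "$out" <<EOF
{
  "Rules": [
    {
      "ID": "Transition objects to ${storage_class} after ${days} days",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": ${days}, "StorageClass": "${storage_class}" }
      ]
    }
  ]
}
EOF
}

# Same values as the example above: AZURE storage class, 30 days.
write_transition_policy AZURE 30 transition.json
cat transition.json
```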
Apply the bucket lifecycle configuration using AWS CLI:
Syntax
aws s3api --ca-bundle CA_PERMISSION --profile rgw --endpoint ENDPOINT_URL --region default put-bucket-lifecycle-configuration --lifecycle-configuration file://BUCKET.json --bucket BUCKET_NAME
Example
[root@host01 ~]$ aws s3api --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default put-bucket-lifecycle-configuration --lifecycle-configuration file://transition.json --bucket transition
Optional: Get the lifecycle configuration:
Syntax
aws s3api --ca-bundle CA_PERMISSION --profile rgw --endpoint ENDPOINT_URL --region default get-bucket-lifecycle-configuration --bucket BUCKET_NAME
Example
[root@host01 ~]$ aws s3api --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default get-bucket-lifecycle-configuration --bucket transition { "Rules": [ { "ID": "Transition Objects in bucket to AZURE Blob after 30 days", "Prefix": "", "Status": "Enabled", "Transitions": [ { "Days": 30, "StorageClass": "AZURE" } ] } ] }
Optional: Get the lifecycle configuration with the radosgw-admin lc list command:
Example
[root@host01 ~]# radosgw-admin lc list [ { "bucket": ":transition:d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1", "started": "Thu, 01 Jan 1970 00:00:00 GMT", "status": "UNINITIAL" } ]
Note: The UNINITIAL status implies that the lifecycle configuration has not been processed yet. It moves to the COMPLETE state after the transition process is complete.
Log in to the cephadm shell:
Example
[root@host01 ~]# cephadm shell
Restart the Ceph Object Gateway daemon:
Syntax
ceph orch daemon restart CEPH_OBJECT_GATEWAY_DAEMON_NAME
Example
[ceph: root@host01 /]# ceph orch daemon restart rgw.objectgwhttps.host02.udyllp
[ceph: root@host01 /]# ceph orch daemon restart rgw.objectgw.host02.afwvyq
[ceph: root@host01 /]# ceph orch daemon restart rgw.objectgw.host05.ucpsrr
Migrate data from the source cluster to Azure:
Example
[root@host01 ~]# for i in 1 2 3 4 5
do
aws s3 --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default cp /etc/hosts s3://transition/transition$i
done
Verify transition of data:
Example
[root@host01 ~]# aws s3 --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default ls s3://transition
2023-06-30 10:24:01 3847 transition1
2023-06-30 10:24:04 3847 transition2
2023-06-30 10:24:07 3847 transition3
2023-06-30 10:24:09 3847 transition4
2023-06-30 10:24:13 3847 transition5
Verify that the data has moved to Azure with the rados ls command:
Example
[root@host01 ~]# rados ls -p default.rgw.buckets.data | grep transition
d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1_transition1
d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1_transition4
d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1_transition2
d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1_transition3
d9c4f708-5598-4c44-9d36-849552a08c4d.169377.1_transition5
If the data is not transitioned, you can run the lc process command:
Example
[root@host01 ~]# radosgw-admin lc process
This forces the lifecycle process to start and evaluate all configured bucket lifecycle policies. It then starts transitioning data wherever needed.
Verification
Run the radosgw-admin lc list command to verify the completion of the transition:
Example
[root@host01 ~]# radosgw-admin lc list [ { "bucket": ":transition:d9c4f708-5598-4c44-9d36-849552a08c4d.170017.5", "started": "Fri, 30 Jun 2023 16:52:56 GMT", "status": "COMPLETE" } ]
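Because lifecycle processing is asynchronous, it can be convenient to wait for the COMPLETE status instead of rerunning the command by hand. The following is a minimal sketch, assuming bash and standard tr and grep. The lc_is_complete function is a hypothetical helper that reads the JSON printed by radosgw-admin lc list on standard input. It assumes the bucket has no tenant, so that the bucket field starts with ":BUCKET_NAME:" as in the example above, and it uses a plain text match rather than a real JSON parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the lc list JSON on stdin contains an
# entry for the given bucket whose status is COMPLETE.
lc_is_complete() {
  local bucket=$1
  # Flatten the pretty-printed JSON to one line, then match the bucket
  # entry and its status inside the same object ([^}]* stops at the
  # closing brace of that entry).
  tr -d '\n' |
    grep -Eq "\"bucket\": *\":${bucket}:[^\"]*\",[^}]*\"status\": *\"COMPLETE\""
}

# Usage on a cluster node (commented out because it needs radosgw-admin):
#   until radosgw-admin lc list | lc_is_complete transition; do
#     sleep 30
#   done
```

The 30-second interval in the usage comment is an arbitrary choice for illustration.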
List the objects in the bucket:
Example
[root@host01 ~]$ aws s3api list-objects --bucket awstestbucket --endpoint=http://10.0.209.002:8080 { "Contents": [ { "Key": "awstestbucket/test", "LastModified": "2023-06-25T16:14:23.118Z", "ETag": "\"378c905939cc4459d249662dfae9fd6f\"", "Size": 29, "StorageClass": "STANDARD", "Owner": { "DisplayName": "test-user", "ID": "test-user" } } ] }
List the objects on the cluster:
Example
[root@host01 ~]$ aws s3 --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default ls s3://transition
2023-06-30 17:52:56 0 transition1
2023-06-30 17:51:59 0 transition2
2023-06-30 17:51:59 0 transition3
2023-06-30 17:51:58 0 transition4
2023-06-30 17:51:59 0 transition5
The objects are 0 in size. You can list the objects, but cannot copy them because they have been transitioned to Azure.
Check the head of the object using the S3 API:
Example
[root@host01 ~]$ aws s3api --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw --endpoint https://host02.example.com:8043 --region default head-object --key transition1 --bucket transition { "AcceptRanges": "bytes", "LastModified": "2023-06-30T16:52:56+00:00", "ContentLength": 0, "ETag": "\"46ecb42fd0def0e42f85922d62d06766\"", "ContentType": "binary/octet-stream", "Metadata": {}, "StorageClass": "CLOUDTIER" }
You can see that the storage class has changed from STANDARD to CLOUDTIER.
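To check the storage class of several objects without reading each JSON response, the StorageClass field can be extracted from the head-object output. The following is a minimal sketch, assuming bash and sed. The storage_class_of function is a hypothetical helper that reads one head-object response on standard input and prints the value of its StorageClass field. It relies on simple text matching, which is an assumption about the output layout rather than a full JSON parse.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the StorageClass value found in an
# `aws s3api head-object` JSON response read from stdin.
storage_class_of() {
  sed -n 's/.*"StorageClass": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage against the gateway (commented out because it needs the AWS CLI
# and a reachable endpoint; paths and names are those of the example above):
#   for i in 1 2 3 4 5; do
#     aws s3api --ca-bundle /etc/pki/ca-trust/source/anchors/myCA.pem \
#       --profile rgw --endpoint https://host02.example.com:8043 \
#       --region default head-object --bucket transition \
#       --key "transition$i" | storage_class_of
#   done
```

The AWS CLI global options --query StorageClass --output text return the same field without any text processing, which may be preferable where they are available.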