主页
产品
Red Hat Ceph Storage
8
对象网关指南
9.14. 基于策略的数据存档和重试到 S3 兼容平台

9.14. 基于策略的数据存档和重试到 S3 兼容平台

对象生命周期转换规则允许您将对象从一个存储类转换到另一个 S3 兼容平台。

您可以使用 Ceph 对象网关生命周期转换策略将数据迁移到 S3 兼容平台。

注意

在多站点配置中，当生命周期转换规则应用于第一个站点时，若要将对象从一个数据池转换到同一存储集群中的另一个数据，则同一规则对第二个站点有效，如果第二个站点具有使用 rgw 应用创建并启用了相应的数据池。

9.14.1. 将数据转换到 Amazon S3 云服务
复制链接

您可以使用存储类将数据转换到远程云服务，以降低成本并提高可管理性。过渡是单向的，无法从远程区转换数据。此功能是使数据转换到多个云供应商，如 Amazon (S3)。

使用 cloud-s3 作为 层类型，以配置需要转换数据的远程云 S3 对象存储服务。它们不需要数据池，并在 zonegroup 放置目标中定义。

先决条件

在开始前，请确保您有以下先决条件：

安装了 Ceph 对象网关的 Red Hat Ceph Storage 集群。
远程云服务 Amazon S3 的用户凭据。
在 Amazon S3 上创建的目标路径。
在 bootstrapped 节点上安装的 s3cmd。
Amazon AWS 在本地配置为下载数据。

流程

创建带有 access key 和 secret key 的用户：

语法

radosgw-admin user create --uid=USER_NAME --display-name="DISPLAY_NAME" [--access-key ACCESS_KEY --secret-key SECRET_KEY]

Example

[ceph: root@host01 /]# radosgw-admin user create --uid=test-user --display-name="test-user" --access-key a21e86bce636c3aa1 --secret-key cf764951f1fdde5e
{
    "user_id": "test-user",
    "display_name": "test-user",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "test-user",
            "access_key": "a21e86bce636c3aa1",
            "secret_key": "cf764951f1fdde5e"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

在 Red Hat Ceph Storage 8.1 或更高版本上，层类型选项为 cloud-s3 和 cloud-s3-glacier。要使用 cloud-s3，请转到下一步。使用这个步骤添加 cloud-s3-glacier 层类型。

语法

radosgw-admin zonegroup placement add --rgw-zonegroup =ZONE_GROUP_NAME \
                            --placement-id=PLACEMENT_ID \
                            --storage-class =CLOUDTIER-GLACIER \
                            --tier-type=cloud-s3-glacier

Example

[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup=default \
                                                 --placement-id=default-placement \
                                                 --storage-class=CLOUDTIER-GLACIER \
                                                 --tier-type=cloud-s3-glacier

[
    {
                "key": "CLOUDTIER-GLACIER",
                "val": {
                    "tier_type": "cloud-s3-glacier",
                    "storage_class": "CLOUDTIER-GLACIER",
                    "retain_head_object": true,
                    "s3": {
                        "endpoint": "http://s3.us-east-1.amazonaws.com",
                        "access_key": "XXXXXXXXXX",
                        "secret": "YYYYYYYYYY",
                        "region": "us-east-1",
                        "host_style": "path",
                        "target_storage_class": "GLACIER",
                        "target_path": "rgwbucket",
                        "acl_mappings": [],
                        "multipart_sync_threshold": 44432,
                        "multipart_min_part_size": 44432
                    },
                    "allow_read_through": false,
                    "read_through_restore_days": 10,
                    "restore_storage_class": "COLDTIER",
                    "s3-glacier": {
                        "glacier_restore_days": 2,
                        "glacier_restore_tier_type": "Expedited"
                    }
                }
            }

]

在 bootstrapped 节点上，添加层类型为 cloud-s3 的存储类：

注意

使用 --tier-type=cloud-s3 选项创建存储类后，无法在以后将其修改到任何其他存储类类型。

语法

radosgw-admin zonegroup placement add --rgw-zonegroup =ZONE_GROUP_NAME \
                            --placement-id=PLACEMENT_ID \
                            --storage-class =STORAGE_CLASS_NAME \
                            --tier-type=cloud-s3

Example

[ceph: root@host01 /]# radosgw-admin zonegroup placement add --rgw-zonegroup=default \
                                                 --placement-id=default-placement \
                                                 --storage-class=CLOUDTIER \
                                                 --tier-type=cloud-s3
[
    {
        "key": "default-placement",
        "val": {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "CLOUDTIER",
                "STANDARD"
            ],
            "tier_targets": [
                {
                    "key": "CLOUDTIER",
                    "val": {
                        "tier_type": "cloud-s3",
                        "storage_class": "CLOUDTIER",
                        "retain_head_object": "false",
                        "s3": {
                            "endpoint": "",
                            "access_key": "",
                            "secret": "",
                            "host_style": "path",
                            "target_storage_class": "",
                            "target_path": "",
                            "acl_mappings": [],
                            "multipart_sync_threshold": 33554432,
                            "multipart_min_part_size": 33554432
                        }
                    }
                }
            ]
        }
    }
]

更新 storage_class ：

注意

如果集群是多站点设置的一部分，请运行 period update --commit 以便 zonegroup 更改传播到多站点中的所有区域。

注意

确保 access_key 和 secret 没有以数字开头。

必需的参数有：

access_key

用于特定连接的远程云 S3 访问密钥。

secret

远程云 S3 服务的 secret 密钥。

端点

远程云 S3 服务端点的 URL。

region

AWS 的远程云 S3 服务区域名称。

可选参数是：

target_path

定义如何创建目标路径。目标路径指定一个附加源 bucket-name/object-name 的前缀。如果没有指定，则创建的 target_path 是 rgwx-ZONE_GROUP_NAME-STORAGE_CLASS_NAME-cloud-bucket。

target_storage_class

定义对象转换的目标存储类。如果没有指定，则对象将转换为 STANDARD 存储类。

retain_head_object

设置为 true 以保留转换到云的对象元数据。设置为 false，以在转换后删除对象。默认设置为 false。

注意

当前版本化对象会忽略这个选项。

multipart_sync_threshold

指定这个大小或更大的对象，使用多部分上传过渡到云。

multipart_min_part_size

指定使用多部分上传转换对象时使用的最小部分大小。

glacier_restore_days

在 Red Hat Ceph Storage 8.1 或更高版本中使用 cloud-s3-glacier。指定在 Glacier 或 Tape 端点上恢复对象的天数。默认设置为 1 ( 1 天)。

glacier_restore_tier_type

在 Red Hat Ceph Storage 8.1 或更高版本中使用 cloud-s3-glacier。指定恢复检索类型。选项为 Standard 或 Expedited。默认设置为 Standard。

语法

radosgw-admin zonegroup placement modify --rgw-zonegroup ZONE_GROUP_NAME \
                                         --placement-id PLACEMENT_ID \
                                         --storage-class STORAGE_CLASS_NAME  \
                                         --tier-config=endpoint=AWS_ENDPOINT_URL,\
                                         access_key=AWS_ACCESS_KEY,secret=AWS_SECRET_KEY,\
                                         target_path="TARGET_BUCKET_ON_AWS",\
                                         multipart_sync_threshold=44432,\
                                         multipart_min_part_size=44432,\
                                         retain_head_object=true
                                          region=REGION_NAME

Example

[ceph: root@host01 /]# radosgw-admin zonegroup placement modify --rgw-zonegroup default
                                                                --placement-id default-placement \
                                                                --storage-class CLOUDTIER \
                                                                --tier-config=endpoint=http://10.0.210.010:8080,\
                                                                access_key=a21e86bce636c3aa2,secret=cf764951f1fdde5f,\
                                                                target_path="dfqe-bucket-01",\
                                                                multipart_sync_threshold=44432,\
                                                                multipart_min_part_size=44432,\
                                                                retain_head_object=true
                                                                region=us-east-1

[
    {
        "key": "default-placement",
        "val": {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "CLOUDTIER",
                "STANDARD",
                "cold.test",
                "hot.test"
            ],
            "tier_targets": [
                {
                    "key": "CLOUDTIER",
                    "val": {
                        "tier_type": "cloud-s3",
                        "storage_class": "CLOUDTIER",
                        "retain_head_object": "true",
                        "s3": {
                            "endpoint": "http://10.0.210.010:8080",
                            "access_key": "a21e86bce636c3aa2",
                            "secret": "cf764951f1fdde5f",
                            "region": "",
                            "host_style": "path",
                            "target_storage_class": "",
                            "target_path": "dfqe-bucket-01",
                            "acl_mappings": [],
                            "multipart_sync_threshold": 44432,
                            "multipart_min_part_size": 44432
                        }
                    }
                }
            ]
        }
    }
]
]

重启 Ceph 对象网关：

语法

ceph orch restart CEPH_OBJECT_GATEWAY_SERVICE_NAME

Example

[ceph: root@host 01 /]# ceph orch restart rgw.rgw.1

Scheduled to restart rgw.rgw.1.host03.vkfldf on host 'host03’

退出 shell，并以 root 用户身份，在 bootstrapped 节点上配置 Amazon S3：

Example

[root@host01 ~]# s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: a21e86bce636c3aa2
Secret Key: cf764951f1fdde5f
Default Region [US]:

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: 10.0.210.78:80

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: 10.0.210.78:80

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: No

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: a21e86bce636c3aa2
  Secret Key: cf764951f1fdde5f
  Default Region: US
  S3 Endpoint: 10.0.210.78:80
  DNS-style bucket+hostname:port template for accessing a bucket: 10.0.210.78:80
  Encryption password:
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Not configured. Never mind.

Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'

创建 S3 存储桶：

语法

s3cmd mb s3://NAME_OF_THE_BUCKET_FOR_S3

Example

[root@host01 ~]# s3cmd mb s3://awstestbucket
Bucket 's3://awstestbucket/' created

创建文件，输入所有数据，并将其移到 S3 服务：

语法

s3cmd put FILE_NAME  s3://NAME_OF_THE_BUCKET_ON_S3

Example

[root@host01 ~]# s3cmd put test.txt s3://awstestbucket

upload: 'test.txt' -> 's3://awstestbucket/test.txt'  [1 of 1]
 21 of 21   100% in    1s    16.75 B/s  done

创建生命周期配置转换策略：

语法

<LifecycleConfiguration>
  <Rule>
    <ID>RULE_NAME</ID>
    <Filter>
      <Prefix></Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>DAYS</Days>
      <StorageClass>STORAGE_CLASS_NAME</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>

Example

[root@host01 ~]# cat lc_cloud.xml
<LifecycleConfiguration>
  <Rule>
    <ID>Archive all objects</ID>
    <Filter>
      <Prefix></Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>2</Days>
      <StorageClass>CLOUDTIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>

设置生命周期配置转换策略：

语法

s3cmd setlifecycle FILE_NAME s3://NAME_OF_THE_BUCKET_FOR_S3

Example

[root@host01 ~]#  s3cmd setlifecycle lc_config.xml s3://awstestbucket

s3://awstestbucket/: Lifecycle Policy updated

登录到 cephadm shell:
Example
```
[root@host 01 ~]# cephadm shell
```

重启 Ceph 对象网关：

语法

ceph orch restart CEPH_OBJECT_GATEWAY_SERVICE_NAME

Example

[ceph: root@host 01 /]# ceph orch restart rgw.rgw.1

Scheduled to restart rgw.rgw.1.host03.vkfldf on host 'host03’

验证

在源集群中，使用 radosgw-admin lc list 命令验证数据是否已移至 S3：

Example

[ceph: root@host01 /]# radosgw-admin lc list
[
    {
        "bucket": ":awstestbucket:552a3adb-39e0-40f6-8c84-00590ed70097.54639.1",
        "started": "Mon, 26 Sep 2022 18:32:07 GMT",
        "status": "COMPLETE"
    }
]

在云端点验证对象转换：

Example

[root@client ~]$ radosgw-admin bucket list
[
    "awstestbucket"
]

列出存储桶中的对象：

Example

[root@host01 ~]$ aws s3api list-objects --bucket awstestbucket --endpoint=http://10.0.209.002:8080
{
    "Contents": [
        {
            "Key": "awstestbucket/test",
            "LastModified": "2022-08-25T16:14:23.118Z",
            "ETag": "\"378c905939cc4459d249662dfae9fd6f\"",
            "Size": 29,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "test-user",
                "ID": "test-user"
            }
        }
    ]
}

列出 S3 存储桶的内容：

Example

[root@host01 ~]# s3cmd ls s3://awstestbucket
2022-08-25 09:57            0  s3://awstestbucket/test.txt

检查文件的信息：

Example

[root@host01 ~]# s3cmd info s3://awstestbucket/test.txt
s3://awstestbucket/test.txt (object):
   File size: 0
   Last mod:  Mon, 03 Aug 2022 09:57:49 GMT
   MIME type: text/plain
   Storage:   CLOUDTIER
   MD5 sum:   991d2528bb41bb839d1a9ed74b710794
   SSE:       none
   Policy:    none
   CORS:      none
   ACL:       test-user: FULL_CONTROL
   x-amz-meta-s3cmd-attrs: atime:1664790668/ctime:1664790668/gid:0/gname:root/md5:991d2528bb41bb839d1a9ed74b710794/mode:33188/mtime:1664790668/uid:0/uname:root

从 Amazon S3 本地下载数据：

配置 AWS：

Example

[client@client01 ~]$ aws configure

AWS Access Key ID [****************6VVP]:
AWS Secret Access Key [****************pXqy]:
Default region name [us-east-1]:
Default output format [json]:

列出 AWS 存储桶的内容：

Example

[client@client01 ~]$ aws s3 ls s3://dfqe-bucket-01/awstest
PRE awstestbucket/

从 S3: 下载数据：

Example

[client@client01 ~]$ aws s3 cp s3://dfqe-bucket-01/awstestbucket/test.txt .

download: s3://dfqe-bucket-01/awstestbucket/test.txt to ./test.txt

9.14.3. 从 S3 云层存储中恢复对象
复制链接

您可以在转换对象上使用 S3 Restore API 将对象从云恢复到 Ceph 对象网关集群。

注意

对于 Red Hat Ceph Storage 8.0，对象恢复到 STANDARD storage-class。但是，对于临时对象，x-amz-storage-class 仍然会返回原始 cloud-tier 存储类。
对于 Red Hat Ceph Storage 8.1 或更高版本，默认情况下对象恢复到 STANDARD 存储类。但是，您可以配置对象需要恢复到的存储类。

注意

如果对象的 null 版本不是最新版本，在发出 restore-object 请求时不要指定 'version-id null'。例如，假设名为 object1 的对象上传到名为 testbucket1 的存储桶，过渡到云，然后恢复。之后，testbucket1 上启用了版本控制，将上传 object1 的新版本。此时，原始(null)版本不再是最新的。在这种情况下，当恢复 object1 时，您应该从 restore -object 命令省略--version-id null 参数。

先决条件

在从 S3 云层存储中恢复对象之前，请确保满足以下先决条件：

在 Amazon S3 上创建的目标路径。
在配置 cloudtier 存储类时，retain_head_object 应该设为 true，以便能够从云服务恢复。
在 bootstrapped 节点上安装的 s3cmd。
确保为 cloud-tier 存储类提供的用户凭据保持有效，以便恢复功能正常工作。
对于 Red Hat Ceph Storage 8.1 或更新的 cloud-s3-glacier 层类型，请确定正确设置了 glacier_restore_days 和 glacier_restore_tier_type 选项。如需更多信息，请参阅将数据转换到 Amazon S3 云服务。

流程

使用以下步骤之一恢复对象。

使用 S3 restore-object 命令恢复云转换的对象。

注意

通过 read-through 恢复的对象副本是临时的，仅在 read_through_restore_days 的持续时间内保留。

语法

tier-type = cloud-s3
tier-config =
aws s3api restore-object
--bucket <value>
--key <value>
[--version-id <value>]
--restore-request (structure)
{
{
"Days": integer,
}
}

Example

tier-type = cloud-s3
tier-config =
aws s3api restore-object
--bucket my-glacier-bucket
--key doc1.rtf
[--version-id 3sL4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo]
--restore-request
{
{
"Days": 10,
}
}

使用 read-through 功能恢复并读取转换的对象。

确保设置了以下参数：

"Allow_read_through": "enable"

“read_through_restore_days”: 10

语法

tier-type = cloud-s3
tier-config =
{
"access_key": <access>,
"secret": <secret>,
"endpoint": <endpoint>,
"region": <region>,
"host_style": <path | virtual>,
"acls": [ { "type": <id | email | uri>,
"source_id": <source_id>,
"dest_id": <dest_id> } ... ],
"target_path": <target_path>,
"target_storage_class": <target-storage-class>,
"multipart_sync_threshold": {object_size},
"multipart_min_part_size": {part_size},
"retain_head_object": <true | false>,
“Allow_read_through”: “enable | disable”,
“read_through_restore_days”: integer
}

恢复 cloud-s3 层类型的示例

tier-type = cloud-s3
tier-config =
{
"access_key": a21e86bce636c3aa2,
"secret": cf764951f1fdde5f,
"endpoint": http://10.0.210.010:8080
“,
"region": “”,
"host_style": “path”,
"acls": [ { "type": “id”,
"source_id": “”,
"dest_id": “” } ... ],
"target_path": dfqe-bucket-01,
"target_storage_class": “”,
"multipart_sync_threshold": 44432,
"multipart_min_part_size": 44432,
"retain_head_object": “true”,
“Allow_read_through”: “enable”,
“read_through_restore_days”: 10
}

恢复 cloud-s3-glacier tier-type 的示例

tier-type = cloud-s3-glacier
tier-config =
{
"access_key": a21e86bce636c3aa2,
"secret": cf764951f1fdde5f,
"endpoint": http://s3.us-east-1.amazonaws.com
“,
"region": “us-east-1”,
"host_style": “path”,
"acls": [ { "type": “id”,
"source_id": “”,
"dest_id": “” } ... ],
"target_path": rgwbucket,
"target_storage_class": “GLACIER”,
"multipart_sync_threshold": 44432,
"multipart_min_part_size": 44432,
"retain_head_object": “true”,
“Allow_read_through”: “enable”,
“read_through_restore_days”: 10
"restore_storage_class": "COLDTIER",
"s3-glacier": {
    "glacier_restore_days": 2,
    "glacier_restore_tier_type": "Expedited"
}

验证

使用指定参数运行 S3 head-object 请求来验证恢复的状态。

使用 S3 head-object 请求检查恢复的状态。

Example

[root@host01 ~]$ aws s3api --ca-bundle
/etc/pki/ca-trust/source/anchors/myCA.pem --profile rgw
--endpoint https://host02.example.com:8043 --region default head-object
--key transition1 --bucket transition

9.14. 基于策略的数据存档和重试到 S3 兼容平台

9.14.1. 将数据转换到 Amazon S3 云服务
复制链接

9.14.2. 将数据转换到 Azure 云服务
复制链接

9.14.3. 从 S3 云层存储中恢复对象
复制链接

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

9.14. 基于策略的数据存档和重试到 S3 兼容平台

9.14.1. 将数据转换到 Amazon S3 云服务复制链接链接已复制到粘贴板!

9.14.2. 将数据转换到 Azure 云服务复制链接链接已复制到粘贴板!

9.14.3. 从 S3 云层存储中恢复对象复制链接链接已复制到粘贴板!

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

9.14.1. 将数据转换到 Amazon S3 云服务
复制链接

9.14.2. 将数据转换到 Azure 云服务
复制链接

9.14.3. 从 S3 云层存储中恢复对象
复制链接