Chapter 14. Scaling Multicloud Object Gateway performance
The Multicloud Object Gateway (MCG) performance may vary from one environment to another. In some cases, specific applications require faster performance, which can be addressed by scaling S3 endpoints.
The MCG resource pool is a group of NooBaa daemon containers that provide two types of services enabled by default:
- Storage service
- S3 endpoint service
S3 endpoint service
The S3 endpoint is a service that every Multicloud Object Gateway (MCG) provides by default, and it handles the heavy lifting of data digestion in the MCG. The endpoint service handles the inline data chunking, deduplication, compression, and encryption, and it accepts data placement instructions from the MCG.
14.1. Automatic scaling of MultiCloud Object Gateway endpoints
The number of MultiCloud Object Gateway (MCG) endpoints scales automatically when the load on the MCG S3 service increases or decreases. OpenShift Data Foundation clusters are deployed with one active MCG endpoint. Each MCG endpoint pod is configured by default with a 1 CPU and 2Gi memory request, with limits matching the request. When the CPU load on the endpoint crosses the 80% usage threshold for a consistent period of time, a second endpoint is deployed, lowering the load on the first endpoint. When the average CPU load on both endpoints falls below the 80% threshold for a consistent period of time, one of the endpoints is deleted. This feature improves performance and serviceability of the MCG.
For example, you can scale the Horizontal Pod Autoscaler (HPA) for noobaa-endpoint using the following oc patch command:
# oc patch -n openshift-storage storagecluster ocs-storagecluster \
--type merge \
--patch '{"spec": {"multiCloudGateway": {"endpoints": {"minCount": 3,"maxCount": 10}}}}'
The example above sets the minCount to 3 and the maxCount to 10.
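To check the resulting endpoint count and the autoscaler state, you can run commands such as the following. This is a minimal sketch, assuming the default openshift-storage namespace and the default noobaa-endpoint pod naming:

# List the autoscaler and the current endpoint pods
oc get hpa -n openshift-storage
oc get pods -n openshift-storage | grep noobaa-endpoint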
14.2. Autoscaling RGW in OpenShift Data Foundation using HPA and KEDA
This section describes how to enable autoscaling for RGW (RADOS Gateway) in OpenShift Data Foundation by integrating with OpenShift Horizontal Pod Autoscaler (HPA) - KEDA (Kubernetes Event‑Driven Autoscaling).
Prerequisites
- All resources described in this procedure (Roles, RoleBindings, TriggerAuthentication, ScaledObject, and so on) must be created in the openshift-storage namespace.
- Custom Metrics Autoscaler Operator version must be 2.6 or later.
Install the Custom Metrics Autoscaler Operator.
- Log in to the OpenShift Web Console.
- Click Ecosystem → Software Catalog.
- Scroll or type Custom Metrics Autoscaler into the Filter by keyword box to find the Custom Metrics Autoscaler Operator.
- Click Install.
Create a Custom Metrics Autoscaler controller and verify whether the following pods are running in the openshift-keda namespace:

oc get pods -n openshift-keda
NAME                                                  READY   STATUS    RESTARTS   AGE
custom-metrics-autoscaler-operator-6cb698bb4d-nnpmj   1/1     Running   0          103m
keda-admission-c6d879546-twprp                        1/1     Running   0          103m
keda-metrics-apiserver-68788fbd9b-h7l5n               1/1     Running   0          103m
keda-operator-664dc57859-mb49f                        1/1     Running   0          20m

Configure Custom Metrics Autoscaler with the Thanos Query Service.
The Prometheus scaler that is supported by Custom Metrics Autoscaler collects metrics through the Thanos Query service provided by OpenShift Monitoring. Because Thanos Query requires authentication, a ServiceAccount token must be used for bearer authentication.
Create a Service Account.

oc create serviceaccount thanos -n openshift-storage

Create a secret to generate the token.
apiVersion: v1
kind: Secret
metadata:
  name: thanos-token-secret
  namespace: openshift-storage
  annotations:
    kubernetes.io/service-account.name: "thanos"
type: kubernetes.io/service-account-token

oc apply -f secret.yaml

Add the cluster-monitoring-view cluster role to the created Service Account in the openshift-storage namespace.
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z thanos -n openshift-storage

Create a TriggerAuthentication CRD for the bearer authentication using the secret name.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-prometheus
  namespace: openshift-storage
spec:
  secretTargetRef:
  - parameter: bearerToken
    name: thanos-token-secret
    key: token
  - parameter: ca
    name: thanos-token-secret
    key: ca.crt
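Optionally, verify that the generated token is accepted by Thanos Query before referencing it from KEDA. The following is a minimal sketch, assuming the secret has been populated and that the commands run from a pod inside the cluster (the service hostname does not resolve externally); the -k flag skips TLS verification for this quick check only:

# Extract the ServiceAccount token from the generated secret
TOKEN=$(oc get secret thanos-token-secret -n openshift-storage -o jsonpath='{.data.token}' | base64 -d)
# Query Thanos Querier with bearer authentication
curl -k -H "Authorization: Bearer $TOKEN" \
  "https://thanos-querier.openshift-monitoring.svc.cluster.local:9091/api/v1/query?query=up"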
Enable RGW autoscaling.
oc annotate storagecluster ocs-storagecluster ocs.openshift.io/enable-rgw-autoscale="true" -n openshift-storage

This behavior takes precedence over manual scaling.
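To confirm that the annotation is set, you can inspect the StorageCluster. This is a minimal check, assuming the default ocs-storagecluster name:

oc get storagecluster ocs-storagecluster -n openshift-storage -o yaml | grep enable-rgw-autoscale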
You can revert this change by running the following command:

oc annotate storagecluster ocs-storagecluster ocs.openshift.io/enable-rgw-autoscale- -n openshift-storage

Create a ScaledObject for autoscaling RGW.
Autoscaling using the HPA feature of OpenShift triggers scaling based on custom metrics. The Prometheus metrics related to RGW that are exported by the Ceph manager are used. The following example uses the ceph_rgw_put_obj_ops metric and scales up to five pods.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rgw-scale-namespace
  namespace: openshift-storage
spec:
  scaleTargetRef:
    apiVersion: ceph.rook.io/v1
    kind: CephObjectStore
    name: ocs-storagecluster-cephobjectstore
    namespace: openshift-storage
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
      metricName: ceph_rgw_put_obj_ops
      query: |
        sum(rate(ceph_rgw_op_put_obj_ops[2m]))
      threshold: "1"
      authModes: "bearer"
      namespace: openshift-storage
    authenticationRef:
      name: keda-trigger-auth-prometheus
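After the ScaledObject is applied, KEDA creates a corresponding HorizontalPodAutoscaler. You can verify both objects with commands such as the following; the ScaledObject name assumes the example above:

oc get scaledobject rgw-scale-namespace -n openshift-storage
oc get hpa -n openshift-storage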
14.3. Increasing CPU and memory for PV pool resources
The default MCG configuration supports low resource consumption. However, when you need to increase CPU and memory to accommodate specific workloads and to increase MCG performance for those workloads, you can configure the required values for CPU and memory in the OpenShift Web Console.
Procedure
- In the OpenShift Web Console, navigate to Storage → Object Storage → Backing Store.
- Select the relevant backing store and click YAML.
- Scroll down until you find spec: and update pvPool with CPU and memory. Add a new property of limits and then add cpu and memory. An equivalent command-line approach is shown after this procedure.

Example reference:

spec:
  pvPool:
    resources:
      limits:
        cpu: 1000m
        memory: 4000Mi
      requests:
        cpu: 800m
        memory: 800Mi
        storage: 50Gi

- Click Save.
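As an alternative to editing the YAML in the web console, the same change can be applied from the command line. This is a sketch, assuming a backing store named noobaa-default-backing-store:

# Patch the backing store's PV pool resources in place
oc patch backingstore noobaa-default-backing-store -n openshift-storage \
  --type merge \
  --patch '{"spec":{"pvPool":{"resources":{"limits":{"cpu":"1000m","memory":"4000Mi"},"requests":{"cpu":"800m","memory":"800Mi"}}}}}'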
Verification steps
- To verify, check the resource values of the PV pool pods.
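For example, you can list the PV pool pods and print the resources of the first container; the pod name below is a placeholder for whatever your backing store's pods are called:

oc get pods -n openshift-storage | grep <backing-store-name>
oc get pod <pv-pool-pod-name> -n openshift-storage -o jsonpath='{.spec.containers[0].resources}'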