이 콘텐츠는 선택한 언어로 제공되지 않습니다.
Chapter 8. Scaling the Cluster Monitoring Operator
OpenShift Container Platform exposes metrics that the Cluster Monitoring Operator collects and stores in the Prometheus-based monitoring stack. As an administrator, you can view system resources, containers and components metrics in one dashboard interface, Grafana.
8.1. Prometheus database storage requirements
Red Hat performed various tests for different scale sizes.
The Prometheus storage requirements below are not prescriptive. Higher resource consumption might be observed in your cluster depending on workload activity and resource use.
Number of Nodes | Number of pods | Prometheus storage growth per day | Prometheus storage growth per 15 days | RAM Space (per scale size) | Network (per tsdb chunk) |
---|---|---|---|---|---|
50 | 1800 | 6.3 GB | 94 GB | 6 GB | 16 MB |
100 | 3600 | 13 GB | 195 GB | 10 GB | 26 MB |
150 | 5400 | 19 GB | 283 GB | 12 GB | 36 MB |
200 | 7200 | 25 GB | 375 GB | 14 GB | 46 MB |
Approximately 20 percent of the expected size was added as overhead to ensure that the storage requirements do not exceed the calculated value.
The above calculation is for the default OpenShift Container Platform Cluster Monitoring Operator.
CPU utilization has minor impact. The ratio is approximately 1 core out of 40 per 50 nodes and 1800 pods.
Recommendations for OpenShift Container Platform
- Use at least three infrastructure (infra) nodes.
- Use at least three openshift-container-storage nodes with non-volatile memory express (NVMe) drives.
8.2. Configuring cluster monitoring
Procedure
To increase the storage capacity for Prometheus:
Create a YAML configuration file,
cluster-monitoring-config.yml
. For example:apiVersion: v1 kind: ConfigMap data: config.yaml: | prometheusOperator: baseImage: quay.io/coreos/prometheus-operator prometheusConfigReloaderBaseImage: quay.io/coreos/prometheus-config-reloader configReloaderBaseImage: quay.io/coreos/configmap-reload nodeSelector: node-role.kubernetes.io/infra: "" prometheusK8s: retention: {{PROMETHEUS_RETENTION_PERIOD}} 1 baseImage: openshift/prometheus nodeSelector: node-role.kubernetes.io/infra: "" volumeClaimTemplate: spec: storageClassName: gp2 resources: requests: storage: {{PROMETHEUS_STORAGE_SIZE}} 2 alertmanagerMain: baseImage: openshift/prometheus-alertmanager nodeSelector: node-role.kubernetes.io/infra: "" volumeClaimTemplate: spec: storageClassName: gp2 resources: requests: storage: {{ALERTMANAGER_STORAGE_SIZE}} 3 nodeExporter: baseImage: openshift/prometheus-node-exporter kubeRbacProxy: baseImage: quay.io/coreos/kube-rbac-proxy kubeStateMetrics: baseImage: quay.io/coreos/kube-state-metrics nodeSelector: node-role.kubernetes.io/infra: "" grafana: baseImage: grafana/grafana nodeSelector: node-role.kubernetes.io/infra: "" auth: baseImage: openshift/oauth-proxy k8sPrometheusAdapter: nodeSelector: node-role.kubernetes.io/infra: "" metadata: name: cluster-monitoring-config namespace: openshift-monitoring
- 1
- A typical value is
PROMETHEUS_RETENTION_PERIOD=15d
. Units are measured in time using one of these suffixes: s, m, h, d. - 2
- A typical value is
PROMETHEUS_STORAGE_SIZE=2000Gi
. Storage values can be a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki. - 3
- A typical value is
ALERTMANAGER_STORAGE_SIZE=20Gi
. Storage values can be a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki.
- Set the values like the retention period and storage sizes.
Apply the changes by running:
$ oc create -f cluster-monitoring-config.yml