Este contenido no está disponible en el idioma seleccionado.

Chapter 5. Scaling the Cluster Monitoring Operator


OpenShift Container Platform exposes metrics that the Cluster Monitoring Operator collects and stores in the Prometheus-based monitoring stack. As an administrator, you can view system resources, containers and components metrics in one dashboard interface, Grafana.

5.1. Prometheus database storage requirements

Red Hat performed various tests for different scale sizes.

Expand
Table 5.1. Prometheus Database storage requirements based on number of nodes/pods in the cluster
Number of NodesNumber of PodsPrometheus storage growth per dayPrometheus storage growth per 15 daysRAM Space (per scale size)Network (per tsdb chunk)

50

1800

6.3 GB

94 GB

6 GB

16 MB

100

3600

13 GB

195 GB

10 GB

26 MB

150

5400

19 GB

283 GB

12 GB

36 MB

200

7200

25 GB

375 GB

14 GB

46 MB

Approximately 20 percent of the expected size was added as overhead to ensure that the storage requirements do not exceed the calculated value.

The above calculation is for the default OpenShift Container Platform Cluster Monitoring Operator.

Note

CPU utilization has minor impact. The ratio is approximately 1 core out of 40 per 50 nodes and 1800 pods.

Lab environment

In a previous release, all experiments were performed in an OpenShift Container Platform on RHOSP environment:

  • Infra nodes (VMs) - 40 cores, 157 GB RAM.
  • CNS nodes (VMs) - 16 cores, 62 GB RAM, NVMe drives.
Important

Currently, RHOSP environments are not supported for OpenShift Container Platform 4.2.

Recommendations for OpenShift Container Platform

  • Use at least three infrastructure (infra) nodes.
  • Use at least three openshift-container-storage nodes with non-volatile memory express (NVMe) drives.

5.2. Configuring cluster monitoring

Procedure

To increase the storage capacity for Prometheus:

  1. Create a YAML configuration file, cluster-monitoring-config.yml. For example:

    apiVersion: v1
    kind: ConfigMap
    data:
      config.yaml: |
        prometheusOperator:
          baseImage: quay.io/coreos/prometheus-operator
          prometheusConfigReloaderBaseImage: quay.io/coreos/prometheus-config-reloader
          configReloaderBaseImage: quay.io/coreos/configmap-reload
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        prometheusK8s:
          retention: {{PROMETHEUS_RETENTION_PERIOD}} 
    1
    
          baseImage: openshift/prometheus
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          volumeClaimTemplate:
            spec:
              storageClassName: gp2
              resources:
                requests:
                  storage: {{PROMETHEUS_STORAGE_SIZE}} 
    2
    
        alertmanagerMain:
          baseImage: openshift/prometheus-alertmanager
          nodeSelector:
            node-role.kubernetes.io/infra: ""
          volumeClaimTemplate:
            spec:
              storageClassName: gp2
              resources:
                requests:
                  storage: {{ALERTMANAGER_STORAGE_SIZE}} 
    3
    
        nodeExporter:
          baseImage: openshift/prometheus-node-exporter
        kubeRbacProxy:
          baseImage: quay.io/coreos/kube-rbac-proxy
        kubeStateMetrics:
          baseImage: quay.io/coreos/kube-state-metrics
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        grafana:
          baseImage: grafana/grafana
          nodeSelector:
            node-role.kubernetes.io/infra: ""
        auth:
          baseImage: openshift/oauth-proxy
        k8sPrometheusAdapter:
          nodeSelector:
            node-role.kubernetes.io/infra: ""
    metadata:
      name: cluster-monitoring-config
    namespace: openshift-monitoring
    Copy to Clipboard Toggle word wrap
    1
    A typical value is PROMETHEUS_RETENTION_PERIOD=15d. Units are measured in time using one of these suffixes: s, m, h, d.
    2
    A typical value is PROMETHEUS_STORAGE_SIZE=2000Gi. Storage values can be a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki.
    3
    A typical value is ALERTMANAGER_STORAGE_SIZE=20Gi. Storage values can be a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki.
  2. Set the values like the retention period and storage sizes.
  3. Apply the changes by running:

    $ oc create -f cluster-monitoring-config.yml
    Copy to Clipboard Toggle word wrap
Volver arriba
Red Hat logoGithubredditYoutubeTwitter

Aprender

Pruebe, compre y venda

Comunidades

Acerca de la documentación de Red Hat

Ayudamos a los usuarios de Red Hat a innovar y alcanzar sus objetivos con nuestros productos y servicios con contenido en el que pueden confiar. Explore nuestras recientes actualizaciones.

Hacer que el código abierto sea más inclusivo

Red Hat se compromete a reemplazar el lenguaje problemático en nuestro código, documentación y propiedades web. Para más detalles, consulte el Blog de Red Hat.

Acerca de Red Hat

Ofrecemos soluciones reforzadas que facilitan a las empresas trabajar en plataformas y entornos, desde el centro de datos central hasta el perímetro de la red.

Theme

© 2025 Red Hat