Chapter 2. Configuring performance and scalability for user workload monitoring
You can configure the monitoring stack to optimize the performance and scale of your clusters. The following documentation provides information about how to distribute the monitoring components and control the impact of the monitoring stack on CPU and memory resources.
2.1. Controlling the placement and distribution of monitoring components
Control the placement and distribution of monitoring components across cluster nodes to optimize system resource use, improve performance, and separate workloads based on specific requirements or policies.
You can move the monitoring stack components to specific nodes with the following methods:
- Use the nodeSelector constraint with labeled nodes to move any of the monitoring stack components to specific nodes.
- Assign tolerations to enable moving components to tainted nodes.
2.1.1. Moving monitoring components to different nodes
Move monitoring stack components to specific nodes to optimize performance or meet hardware requirements by configuring nodeSelector constraints in the user-workload-monitoring-config config map to match labels assigned to the nodes.
You cannot add a node selector constraint directly to an existing scheduled pod.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- A cluster administrator has enabled monitoring for user-defined projects.
- You have installed the OpenShift CLI (oc).
Procedure
If you have not done so already, add a label to the nodes on which you want to run the monitoring components:

$ oc label nodes <node_name> <node_label>

where:
<node_name> - Specifies the name of the node where you want to add the label.
<node_label> - Specifies the name of the label that you want to add.
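For example, assuming you want to dedicate a node named worker-1 to user workload monitoring, a hypothetical invocation might look like this (both the node name and the label are illustrative):

$ oc label nodes worker-1 monitoring=user-workload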
Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Specify the node labels for the nodeSelector constraint for the component under data/config.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    # ...
    <component>:
      nodeSelector:
        <node_label_1>
        <node_label_2>
    # ...

where:
<component> - Specifies the monitoring stack component.
<node_label_1> - Specifies the label you added to the node.
<node_label_2> - Optional: Specifies additional labels. If you specify additional labels, the pods for the component are only scheduled on the nodes that contain all of the specified labels.
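For example, to pin the Prometheus instance for user-defined projects to nodes carrying the illustrative monitoring: user-workload label from the earlier step, the configuration might look like the following sketch (substitute your own label):

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      nodeSelector:
        monitoring: user-workload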
Note: If monitoring components remain in a Pending state after configuring the nodeSelector constraint, check the pod events for errors relating to taints and tolerations.

Save the file to apply the changes. The components specified in the new configuration are automatically moved to the new nodes, and the pods affected by the new configuration are redeployed.
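To verify where the redeployed pods were scheduled, you can list them together with their node assignments; this is a general check rather than part of the documented procedure:

$ oc -n openshift-user-workload-monitoring get pods -o wide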
2.1.2. Assigning tolerations to monitoring components
You can assign tolerations to the components that monitor user-defined projects so that you can move them to tainted worker nodes. Scheduling is not permitted on control plane or infrastructure nodes.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- A cluster administrator has enabled monitoring for user-defined projects.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Specify tolerations for the component:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    <component>:
      tolerations:
        <toleration_specification>

Substitute <component> and <toleration_specification> accordingly.
<component>and<toleration_specification>accordingly.For example,
oc adm taint nodes node1 key1=value1:NoScheduleadds a taint tonode1with the keykey1and the valuevalue1. This prevents monitoring components from deploying pods onnode1unless a toleration is configured for that taint. The following example configures thethanosRulercomponent to tolerate the example taint:apiVersion: v1 kind: ConfigMap metadata: name: user-workload-monitoring-config namespace: openshift-user-workload-monitoring data: config.yaml: | thanosRuler: tolerations: - key: "key1" operator: "Equal" value: "value1" effect: "NoSchedule"- Save the file to apply the changes. The pods affected by the new configuration are automatically redeployed.
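If the pods still do not schedule onto the tainted node, you can compare the node's taints with the tolerations on a deployed pod; for example, using node1 from the example above (the pod name is a placeholder):

$ oc describe node node1 | grep Taints
$ oc -n openshift-user-workload-monitoring describe pod <thanos_ruler_pod> | grep -A 3 Tolerations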
2.2. Managing CPU and memory resources for monitoring components
Ensure that the containers that run monitoring components have enough CPU and memory resources by specifying values for resource limits and requests for those components.
Configure these limits and requests for monitoring components that monitor user-defined projects in the openshift-user-workload-monitoring namespace.
2.2.1. Specifying limits and requests
Prevent resource exhaustion and ensure stable monitoring operations by setting appropriate CPU and memory limits for each monitoring component in the user-workload-monitoring-config config map in the openshift-user-workload-monitoring namespace.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Add values to define resource limits and requests for each component you want to configure.
Important: Ensure that the value set for a limit is never lower than the value set for a request. Otherwise, an error occurs and the container does not run.
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    alertmanager:
      resources:
        limits:
          cpu: 500m
          memory: 1Gi
        requests:
          cpu: 200m
          memory: 500Mi
    prometheus:
      resources:
        limits:
          cpu: 500m
          memory: 3Gi
        requests:
          cpu: 200m
          memory: 500Mi
    thanosRuler:
      resources:
        limits:
          cpu: 500m
          memory: 1Gi
        requests:
          cpu: 200m
          memory: 500Mi

Save the file to apply the changes. The pods affected by the new configuration are automatically redeployed.
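To confirm that the new values took effect after the pods redeployed, you can inspect a pod's resource specification; the pod and container names below are typical for this stack but can vary by cluster:

$ oc -n openshift-user-workload-monitoring get pod prometheus-user-workload-0 \
    -o jsonpath='{.spec.containers[?(@.name=="prometheus")].resources}'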
2.3. Controlling the impact of unbound metrics attributes in user-defined projects
Prevent monitoring performance degradation and excessive resource consumption by controlling the impact of unbound metrics attributes.
Developers can create labels to define attributes for metrics in the form of key-value pairs. The number of potential key-value pairs corresponds to the number of possible values for an attribute.
An attribute that has an unlimited number of potential values is called an unbound attribute. For example, a customer_id attribute is unbound because it has an infinite number of possible values.
Every assigned key-value pair has a unique time series. The use of many unbound attributes in labels can result in an exponential increase in the number of time series created. This can impact Prometheus performance and can consume a lot of disk space.
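For example, a hypothetical counter labeled with an unbound customer_id attribute produces a separate time series for every distinct value it ever receives:

# One time series per distinct customer_id value (Prometheus exposition format):
http_requests_total{customer_id="12345"} 42
http_requests_total{customer_id="12346"} 7
# ...and so on for every customer, without bound.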
Cluster administrators can use the following measures to control the impact of unbound metrics attributes in user-defined projects:
- Limit the number of samples that can be accepted per target scrape in user-defined projects
- Limit the number of scraped labels, the length of label names, and the length of label values
- Configure the intervals between consecutive scrapes and between Prometheus rule evaluations
- Create alerts that fire when a scrape sample threshold is reached or when the target cannot be scraped
Limiting scrape samples can help prevent the issues caused by adding many unbound attributes to labels. Developers can also prevent the underlying cause by limiting the number of unbound attributes that they define for metrics. Using attributes that are bound to a limited set of possible values reduces the number of potential key-value pair combinations.
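If you suspect a cardinality problem, a commonly used PromQL query ranks metrics by the number of series they contribute; this is a general Prometheus technique rather than a step in the procedures that follow:

topk(10, count by (__name__)({__name__=~".+"}))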
2.3.1. Setting scrape intervals, evaluation intervals, and enforced limits for user-defined projects
Configure intervals between consecutive scrapes, intervals between Prometheus rule evaluations, and enforced limits for user-defined projects to control resource usage and optimize monitoring performance.
You can set the following scrape and label limits for user-defined projects:
- Limit the number of samples that can be accepted per target scrape
- Limit the number of scraped labels
- Limit the length of label names and label values
If you set sample or label limits, no further sample data is ingested for that target scrape after the limit is reached.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- A cluster administrator has enabled monitoring for user-defined projects.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Add the enforced limit and time interval configurations to data/config.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      enforcedSampleLimit: 50000
      enforcedLabelLimit: 500
      enforcedLabelNameLengthLimit: 50
      enforcedLabelValueLengthLimit: 600
      scrapeInterval: 1m30s
      evaluationInterval: 1m15s

where:
enforcedSampleLimit - Defines the maximum number of samples that can be accepted per target scrape. A value is required if this parameter is specified. This example limits the number to 50,000.
enforcedLabelLimit - Defines the maximum number of labels per scrape. The default value is 0, which specifies no limit.
enforcedLabelNameLengthLimit - Defines the maximum character length for a label name. The default value is 0, which specifies no limit.
enforcedLabelValueLengthLimit - Defines the maximum character length for a label value. The default value is 0, which specifies no limit.
scrapeInterval - Defines the interval between consecutive scrapes. The interval must be set between 5 seconds and 5 minutes. The default value is 30s.
evaluationInterval - Defines the interval between Prometheus rule evaluations. The interval must be set between 5 seconds and 5 minutes. The default value for Prometheus is 30s.
Note: You can also configure the evaluationInterval property for Thanos Ruler through the data/config.yaml/thanosRuler field. The default value for Thanos Ruler is 15s.

Save the file to apply the changes. The limits are applied automatically.
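To confirm that the sample limit is in effect for a target, you can query the scrape_sample_limit metric, which the alert example in the next section also relies on; the namespace selector here is illustrative:

scrape_sample_limit{namespace="ns1"}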
2.3.2. Creating scrape sample alerts
Create alerts that notify you when monitoring targets cannot be scraped or become unavailable, or when scrape sample thresholds are reached or exceeded.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- A cluster administrator has enabled monitoring for user-defined projects.
- You have limited the number of samples that can be accepted per target scrape in user-defined projects by using enforcedSampleLimit.
- You have installed the OpenShift CLI (oc).
Procedure
Create a YAML file with alerts that inform you when the targets are down and when the enforced sample limit is approaching. The file in this example is called monitoring-stack-alerts.yaml:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: monitoring-stack-alerts
  namespace: ns1
spec:
  groups:
  - name: general.rules
    rules:
    - alert: TargetDown
      annotations:
        message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service }} targets in {{ $labels.namespace }} namespace are down.'
      expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job, namespace, service)) > 10
      for: 10m
      labels:
        severity: warning
    - alert: ApproachingEnforcedSamplesLimit
      annotations:
        message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.'
      expr: (scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0)) > 0.9
      for: 10m
      labels:
        severity: warning

where:
metadata.name - Specifies the name of the alerting rule.
metadata.namespace - Specifies the user-defined project where the alerting rule is deployed.
TargetDown - Specifies an alert that fires if the target cannot be scraped and is not available for the for duration.
ApproachingEnforcedSamplesLimit - Specifies an alert that fires when the defined scrape sample threshold is exceeded and lasts for the specified for duration.
annotations.message - Specifies the message that is displayed when the alert fires.
expr - Specifies the PromQL query expression that defines the new rule. For example, the ApproachingEnforcedSamplesLimit alert fires when the number of ingested samples exceeds 90% of the configured limit.
for - Specifies that the conditions for the alert must be true for this duration before the alert is fired.
labels.severity - Specifies the severity for the alert.
Apply the configuration to the user-defined project:

$ oc apply -f monitoring-stack-alerts.yaml

Additionally, you can check if a target has hit the configured limit:
In the OpenShift Container Platform web console, go to Observe → Targets and select an endpoint with a Down status that you want to check.

The "Scrape failed: sample limit exceeded" message is displayed if the endpoint failed because of an exceeded sample limit.
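You can also look for endpoints that are approaching their budget from the Observe → Metrics page by reusing the expression from the ApproachingEnforcedSamplesLimit rule above:

(scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0)) > 0.9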
2.4. Configuring pod topology spread constraints
You can configure pod topology spread constraints for all the pods for user-defined monitoring to control how pod replicas are scheduled to nodes across zones.
This ensures that the pods are highly available and run more efficiently, because workloads are spread across nodes in different data centers or hierarchical infrastructure zones.
You can configure pod topology spread constraints for monitoring pods by using the user-workload-monitoring-config config map.
Prerequisites
- You have access to the cluster as a user with the cluster-admin cluster role or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- A cluster administrator has enabled monitoring for user-defined projects.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Add the following settings under the data/config.yaml field to configure pod topology spread constraints:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    <component>:
      topologySpreadConstraints:
      - maxSkew: <n>
        topologyKey: <key>
        whenUnsatisfiable: <value>
        labelSelector: <match_option>

where:
<component> - Specifies the name of the component for which you want to set up pod topology spread constraints.
<n> - Specifies a numeric value for maxSkew, which defines the degree to which pods are allowed to be unevenly distributed.
<key> - Specifies a key of node labels for topologyKey. Nodes that have a label with this key and identical values are considered to be in the same topology. The scheduler tries to put a balanced number of pods into each domain.
<value> - Specifies a value for whenUnsatisfiable. Available options are DoNotSchedule and ScheduleAnyway. Specify DoNotSchedule if you want the maxSkew value to define the maximum difference allowed between the number of matching pods in the target topology and the global minimum. Specify ScheduleAnyway if you want the scheduler to still schedule the pod but to give higher priority to nodes that might reduce the skew.
<match_option> - Specifies labelSelector to find matching pods. Pods that match this label selector are counted to determine the number of pods in their corresponding topology domain.
Example configuration for Thanos Ruler:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    thanosRuler:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: monitoring
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: thanos-ruler

Save the file to apply the changes. The pods affected by the new configuration are automatically redeployed.
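To check how the replicas spread after the change, you can list the pods that match the example's label selector together with the nodes they landed on; this is a general check rather than part of the documented procedure:

$ oc -n openshift-user-workload-monitoring get pods \
    -l app.kubernetes.io/name=thanos-ruler -o wide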