Chapter 3. Configuring external alertmanager instances
The OpenShift Container Platform monitoring stack includes a local Alertmanager instance that routes alerts from Prometheus. You can add external Alertmanager instances by configuring the cluster-monitoring-config config map in either the openshift-monitoring project or the user-workload-monitoring-config project.
If you add the same external Alertmanager configuration for multiple clusters and disable the local instance for each cluster, you can then manage alert routing for multiple clusters by using a single external Alertmanager instance.
Prerequisites

- You have installed the OpenShift CLI (oc).

If you are configuring core OpenShift Container Platform monitoring components in the openshift-monitoring project:

- You have access to the cluster as a user with the cluster-admin role.
- You have created the cluster-monitoring-config config map.

If you are configuring components that monitor user-defined projects:

- You have access to the cluster as a user with the cluster-admin role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- You have created the user-workload-monitoring-config config map.
Procedure

Edit the ConfigMap object.

To configure additional Alertmanagers for routing alerts from core OpenShift Container Platform projects:

Edit the cluster-monitoring-config config map in the openshift-monitoring project:

$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

Add an additionalAlertmanagerConfigs: section under data/config.yaml/prometheusK8s. Add the configuration details for additional Alertmanagers in this section:
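The resulting config map might look like the following sketch, where <alertmanager_specification> is a placeholder for the Alertmanager details described in the next paragraph, not a literal value:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      additionalAlertmanagerConfigs:
      # Each list entry describes one additional Alertmanager configuration.
      - <alertmanager_specification>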
For <alertmanager_specification>, substitute authentication and other configuration details for additional Alertmanager instances. Currently supported authentication methods are bearer token (bearerToken) and client TLS (tlsConfig). The following sample config map configures an additional Alertmanager using a bearer token with client TLS authentication:
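A sketch of such a sample config map is shown below. The secret names (alertmanager-bearer-token, alertmanager-tls), the keys within them, and the Alertmanager hostnames are illustrative placeholders, not values taken from this document:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      additionalAlertmanagerConfigs:
      - scheme: https
        pathPrefix: /
        timeout: "30s"
        apiVersion: v1
        # Bearer token read from a secret (placeholder names)
        bearerToken:
          name: alertmanager-bearer-token
          key: token
        # Client TLS material read from a secret (placeholder names)
        tlsConfig:
          key:
            name: alertmanager-tls
            key: tls.key
          cert:
            name: alertmanager-tls
            key: tls.crt
          ca:
            name: alertmanager-tls
            key: tls.ca
        # Hostnames of the external Alertmanager instances (placeholders)
        staticConfigs:
        - external-alertmanager1-remote.com
        - external-alertmanager1-remote2.com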
To configure additional Alertmanager instances for routing alerts from user-defined projects:

Edit the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Add a <component>/additionalAlertmanagerConfigs: section under data/config.yaml/. Add the configuration details for additional Alertmanagers in this section:
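The layout might look like the following sketch; <component> and <alertmanager_specification> are placeholders that are explained below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    <component>:
      additionalAlertmanagerConfigs:
      - <alertmanager_specification>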
For <component>, substitute one of two supported external Alertmanager components: prometheus or thanosRuler.

For <alertmanager_specification>, substitute authentication and other configuration details for additional Alertmanager instances. Currently supported authentication methods are bearer token (bearerToken) and client TLS (tlsConfig). The following sample config map configures an additional Alertmanager using Thanos Ruler with a bearer token and client TLS authentication:
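A sketch of such a sample config map follows; again, the secret names, keys, and Alertmanager hostnames are illustrative placeholders:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    thanosRuler:
      additionalAlertmanagerConfigs:
      - scheme: https
        pathPrefix: /
        timeout: "30s"
        apiVersion: v1
        # Bearer token read from a secret (placeholder names)
        bearerToken:
          name: alertmanager-bearer-token
          key: token
        # Client TLS material read from a secret (placeholder names)
        tlsConfig:
          key:
            name: alertmanager-tls
            key: tls.key
          cert:
            name: alertmanager-tls
            key: tls.crt
          ca:
            name: alertmanager-tls
            key: tls.ca
        # Hostnames of the external Alertmanager instances (placeholders)
        staticConfigs:
        - external-alertmanager1-remote.com
        - external-alertmanager1-remote2.com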
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.
Save the file to apply the changes to the ConfigMap object. The new configuration is applied automatically.
3.1. Attaching additional labels to your time series and alerts
Using the external labels feature of Prometheus, you can attach custom labels to all time series and alerts leaving Prometheus.
Prerequisites

If you are configuring core OpenShift Container Platform monitoring components:

- You have access to the cluster as a user with the cluster-admin role.
- You have created the cluster-monitoring-config ConfigMap object.

If you are configuring components that monitor user-defined projects:

- You have access to the cluster as a user with the cluster-admin role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- You have created the user-workload-monitoring-config ConfigMap object.

- You have installed the OpenShift CLI (oc).
Procedure

Edit the ConfigMap object:

To attach custom labels to all time series and alerts leaving the Prometheus instance that monitors core OpenShift Container Platform projects:

Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

Define a map of labels you want to add for every metric under data/config.yaml:
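A minimal sketch of the expected layout, with <key>: <value> as a placeholder for your own label map:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      externalLabels:
        <key>: <value>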
Substitute <key>: <value> with a map of key-value pairs where <key> is a unique name for the new label and <value> is its value.

Warning: Do not use prometheus or prometheus_replica as key names, because they are reserved and will be overwritten.

For example, to add metadata about the region and environment to all time series and alerts, use:
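A sketch of such a configuration follows; the region and environment values shown are illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      externalLabels:
        region: eu
        environment: prod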
To attach custom labels to all time series and alerts leaving the Prometheus instance that monitors user-defined projects:

Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Define a map of labels you want to add for every metric under data/config.yaml:
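A minimal sketch of the expected layout, again with <key>: <value> as a placeholder:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      externalLabels:
        <key>: <value>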
Substitute <key>: <value> with a map of key-value pairs where <key> is a unique name for the new label and <value> is its value.

Warning: Do not use prometheus or prometheus_replica as key names, because they are reserved and will be overwritten.

Note: In the openshift-user-workload-monitoring project, Prometheus handles metrics and Thanos Ruler handles alerting and recording rules. Setting externalLabels for prometheus in the user-workload-monitoring-config ConfigMap object will only configure external labels for metrics and not for any rules.

For example, to add metadata about the region and environment to all time series and alerts related to user-defined projects, use:
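A sketch of such a configuration follows; the label values shown are illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      externalLabels:
        region: eu
        environment: prod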
Save the file to apply the changes. The new configuration is applied automatically.
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.

Warning: When changes are saved to a monitoring config map, the pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
3.2. Setting log levels for monitoring components
You can configure the log level for Alertmanager, Prometheus Operator, Prometheus, Thanos Querier, and Thanos Ruler.
The following log levels can be applied to the relevant component in the cluster-monitoring-config and user-workload-monitoring-config ConfigMap objects:
- debug. Log debug, informational, warning, and error messages.
- info. Log informational, warning, and error messages.
- warn. Log warning and error messages only.
- error. Log error messages only.
The default log level is info.
Prerequisites

If you are setting a log level for Alertmanager, Prometheus Operator, Prometheus, or Thanos Querier in the openshift-monitoring project:

- You have access to the cluster as a user with the cluster-admin role.
- You have created the cluster-monitoring-config ConfigMap object.

If you are setting a log level for Prometheus Operator, Prometheus, or Thanos Ruler in the openshift-user-workload-monitoring project:

- You have access to the cluster as a user with the cluster-admin role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
- You have created the user-workload-monitoring-config ConfigMap object.

- You have installed the OpenShift CLI (oc).
Procedure

Edit the ConfigMap object:

To set a log level for a component in the openshift-monitoring project:

Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

Add logLevel: <log_level> for a component under data/config.yaml:
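The layout might look like the following sketch; <component> and <log_level> are placeholders that are explained below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    <component>:
      logLevel: <log_level>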
- <component>: The monitoring stack component for which you are setting a log level. For default platform monitoring, available component values are prometheusK8s, alertmanagerMain, prometheusOperator, and thanosQuerier.
- <log_level>: The log level to set for the component. The available values are error, warn, info, and debug. The default value is info.
To set a log level for a component in the openshift-user-workload-monitoring project:

Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config

Add logLevel: <log_level> for a component under data/config.yaml:
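The layout might look like the following sketch; <component> and <log_level> are placeholders that are explained below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    <component>:
      logLevel: <log_level>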
- <component>: The monitoring stack component for which you are setting a log level. For user workload monitoring, available component values are prometheus, prometheusOperator, and thanosRuler.
- <log_level>: The log level to set for the component. The available values are error, warn, info, and debug. The default value is info.
Save the file to apply the changes. The pods for the component restart automatically when you apply the log-level change.
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.

Warning: When changes are saved to a monitoring config map, the pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
Confirm that the log level has been applied by reviewing the deployment or pod configuration in the related project. The following example checks the log level in the prometheus-operator deployment in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring get deploy prometheus-operator -o yaml | grep "log-level"

Example output

- --log-level=debug

Check that the pods for the component are running. The following example lists the status of pods in the openshift-user-workload-monitoring project:

$ oc -n openshift-user-workload-monitoring get pods

Note: If an unrecognized logLevel value is included in the ConfigMap object, the pods for the component might not restart successfully.
3.3. Disabling the default Grafana deployment
By default, a read-only Grafana instance is deployed with a collection of dashboards displaying cluster metrics. The Grafana instance is not user-configurable.
You can disable the Grafana deployment, causing the associated resources to be deleted from the cluster. You might do this if you do not need these dashboards and want to conserve resources in your cluster. You will still be able to view metrics and dashboards included in the web console. Grafana can be safely enabled again at any time.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have created the cluster-monitoring-config ConfigMap object.
- You have installed the OpenShift CLI (oc).
Procedure

Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

Add enabled: false for the grafana component under data/config.yaml:
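The resulting config map might look like the following sketch:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    grafana:
      enabled: false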
Save the file to apply the changes. The resources will begin to be removed automatically when you apply the change.
Warning: This change results in some components, including Prometheus and the Thanos Querier, being restarted. This might lead to previously collected metrics being lost if you have not yet followed the steps in the "Configuring persistent storage" section.
Check that the Grafana pod is no longer running. The following example lists the status of pods in the openshift-monitoring project:

$ oc -n openshift-monitoring get pods

Note: It may take a few minutes after applying the change for these pods to terminate.
3.4. Disabling the local Alertmanager
A local Alertmanager that routes alerts from Prometheus instances is enabled by default in the openshift-monitoring project of the OpenShift Container Platform monitoring stack.
If you do not need the local Alertmanager, you can disable it by configuring the cluster-monitoring-config config map in the openshift-monitoring project.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have created the cluster-monitoring-config config map.
- You have installed the OpenShift CLI (oc).
Procedure

Edit the cluster-monitoring-config config map in the openshift-monitoring project:

$ oc -n openshift-monitoring edit configmap cluster-monitoring-config

Add enabled: false for the alertmanagerMain component under data/config.yaml:
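The resulting config map might look like the following sketch:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      enabled: false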
Save the file to apply the changes. The Alertmanager instance is disabled automatically when you apply the change.
3.5. Next steps
- Enabling monitoring for user-defined projects
- Learn about remote health reporting and, if necessary, opt out of it