Chapter 3. Configuring external alertmanager instances
The OpenShift Container Platform monitoring stack includes a local Alertmanager instance that routes alerts from Prometheus. You can add external Alertmanager instances by configuring the cluster-monitoring-config config map in the openshift-monitoring project or the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project.
If you add the same external Alertmanager configuration for multiple clusters and disable the local instance for each cluster, you can then manage alert routing for multiple clusters by using a single external Alertmanager instance.
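If you intend to rely solely on the external instance, you can also disable the local Alertmanager. In releases that support disabling it, this is done through the alertmanagerMain component in the same config map. The following is a minimal sketch only; see "Disabling the local Alertmanager" in the monitoring documentation for your release before using it:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        alertmanagerMain:
          enabled: false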
Prerequisites
- You have installed the OpenShift CLI (oc).
- If you are configuring core OpenShift Container Platform monitoring components in the openshift-monitoring project:
  - You have access to the cluster as a user with the cluster-admin cluster role.
  - You have created the cluster-monitoring-config config map.
- If you are configuring components that monitor user-defined projects:
  - You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
  - You have created the user-workload-monitoring-config config map.
Procedure
Edit the ConfigMap object.

To configure additional Alertmanagers for routing alerts from core OpenShift Container Platform projects:

- Edit the cluster-monitoring-config config map in the openshift-monitoring project:

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Add an additionalAlertmanagerConfigs: section under data/config.yaml/prometheusK8s. Add the configuration details for additional Alertmanagers in this section:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          additionalAlertmanagerConfigs:
          - <alertmanager_specification>
  For <alertmanager_specification>, substitute authentication and other configuration details for additional Alertmanager instances. Currently supported authentication methods are bearer token (bearerToken) and client TLS (tlsConfig). The following sample config map configures an additional Alertmanager using a bearer token with client TLS authentication:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          additionalAlertmanagerConfigs:
          - scheme: https
            pathPrefix: /
            timeout: "30s"
            apiVersion: v1
            bearerToken:
              name: alertmanager-bearer-token
              key: token
            tlsConfig:
              key:
                name: alertmanager-tls
                key: tls.key
              cert:
                name: alertmanager-tls
                key: tls.crt
              ca:
                name: alertmanager-tls
                key: tls.ca
            staticConfigs:
            - external-alertmanager1-remote.com
            - external-alertmanager1-remote2.com
To configure additional Alertmanager instances for routing alerts from user-defined projects:
- Edit the user-workload-monitoring-config config map in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
- Add a <component>/additionalAlertmanagerConfigs: section under data/config.yaml/. Add the configuration details for additional Alertmanagers in this section:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        <component>:
          additionalAlertmanagerConfigs:
          - <alertmanager_specification>
  For <component>, substitute one of the two supported external Alertmanager components: prometheus or thanosRuler.

  For <alertmanager_specification>, substitute authentication and other configuration details for additional Alertmanager instances. Currently supported authentication methods are bearer token (bearerToken) and client TLS (tlsConfig). The following sample config map configures an additional Alertmanager using Thanos Ruler with a bearer token and client TLS authentication:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        thanosRuler:
          additionalAlertmanagerConfigs:
          - scheme: https
            pathPrefix: /
            timeout: "30s"
            apiVersion: v1
            bearerToken:
              name: alertmanager-bearer-token
              key: token
            tlsConfig:
              key:
                name: alertmanager-tls
                key: tls.key
              cert:
                name: alertmanager-tls
                key: tls.crt
              ca:
                name: alertmanager-tls
                key: tls.ca
            staticConfigs:
            - external-alertmanager1-remote.com
            - external-alertmanager1-remote2.com
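  If you want the additional Alertmanager to receive alerts from the Prometheus instance for user-defined projects rather than from Thanos Ruler, place the same specification under the prometheus component instead. The following is a minimal sketch; the endpoint name is a placeholder:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        prometheus:
          additionalAlertmanagerConfigs:
          - scheme: https
            timeout: "30s"
            staticConfigs:
            - external-alertmanager1-remote.com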
  Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.
- Save the file to apply the changes to the ConfigMap object. The new configuration is applied automatically.
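After the changes are applied, you can optionally confirm that Prometheus has registered the additional Alertmanager endpoints by querying the Prometheus alertmanagers API from inside a Prometheus pod. The following is a sketch; it assumes the prometheus container in the prometheus-k8s-0 pod serves its API on localhost:9090:

    $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s http://localhost:9090/api/v1/alertmanagers

The activeAlertmanagers list in the response should include the endpoints that you added under staticConfigs.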
3.1. Attaching additional labels to your time series and alerts
Using the external labels feature of Prometheus, you can attach custom labels to all time series and alerts leaving Prometheus.
Prerequisites
- If you are configuring core OpenShift Container Platform monitoring components:
  - You have access to the cluster as a user with the cluster-admin cluster role.
  - You have created the cluster-monitoring-config ConfigMap object.
- If you are configuring components that monitor user-defined projects:
  - You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
  - You have created the user-workload-monitoring-config ConfigMap object.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the ConfigMap object:

To attach custom labels to all time series and alerts leaving the Prometheus instance that monitors core OpenShift Container Platform projects:

- Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Define a map of labels you want to add for every metric under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          externalLabels:
            <key>: <value>

  For <key>: <value>, substitute a map of key-value pairs where <key> is a unique name for the new label and <value> is its value.
  Warning: Do not use prometheus or prometheus_replica as key names, because they are reserved and will be overwritten.

  For example, to add metadata about the region and environment to all time series and alerts, use:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          externalLabels:
            region: eu
            environment: prod
To attach custom labels to all time series and alerts leaving the Prometheus instance that monitors user-defined projects:
- Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
- Define a map of labels you want to add for every metric under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        prometheus:
          externalLabels:
            <key>: <value>

  For <key>: <value>, substitute a map of key-value pairs where <key> is a unique name for the new label and <value> is its value.
  Warning: Do not use prometheus or prometheus_replica as key names, because they are reserved and will be overwritten.

  Note: In the openshift-user-workload-monitoring project, Prometheus handles metrics and Thanos Ruler handles alerting and recording rules. Setting externalLabels for prometheus in the user-workload-monitoring-config ConfigMap object will only configure external labels for metrics and not for any rules.

  For example, to add metadata about the region and environment to all time series and alerts related to user-defined projects, use:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        prometheus:
          externalLabels:
            region: eu
            environment: prod
Save the file to apply the changes. The new configuration is applied automatically.
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.

Warning: When changes are saved to a monitoring config map, the pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
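After the changes are applied, you can optionally confirm that the external labels appear in the running Prometheus configuration. The following is a sketch for default platform monitoring; it assumes the prometheus container in the prometheus-k8s-0 pod serves its API on localhost:9090:

    $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s http://localhost:9090/api/v1/status/config

The response embeds the running Prometheus configuration; its external_labels section should contain the key-value pairs you defined.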
3.2. Setting log levels for monitoring components
You can configure the log level for Alertmanager, Prometheus Operator, Prometheus, Thanos Querier, and Thanos Ruler.
The following log levels can be applied to the relevant component in the cluster-monitoring-config and user-workload-monitoring-config ConfigMap objects:

- debug. Log debug, informational, warning, and error messages.
- info. Log informational, warning, and error messages.
- warn. Log warning and error messages only.
- error. Log error messages only.

The default log level is info.
Prerequisites
- If you are setting a log level for Alertmanager, Prometheus Operator, Prometheus, or Thanos Querier in the openshift-monitoring project:
  - You have access to the cluster as a user with the cluster-admin cluster role.
  - You have created the cluster-monitoring-config ConfigMap object.
- If you are setting a log level for Prometheus Operator, Prometheus, or Thanos Ruler in the openshift-user-workload-monitoring project:
  - You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
  - You have created the user-workload-monitoring-config ConfigMap object.
- You have installed the OpenShift CLI (oc).
Procedure
Edit the ConfigMap object:

To set a log level for a component in the openshift-monitoring project:

- Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Add logLevel: <log_level> for a component under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        <component>:
          logLevel: <log_level>

  For <component>, substitute the monitoring stack component for which you are setting a log level. For default platform monitoring, available component values are prometheusK8s, alertmanagerMain, prometheusOperator, and thanosQuerier.

  For <log_level>, substitute the log level to set for the component. The available values are error, warn, info, and debug. The default value is info.
To set a log level for a component in the openshift-user-workload-monitoring project:

- Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
- Add logLevel: <log_level> for a component under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        <component>:
          logLevel: <log_level>

  For <component>, substitute the monitoring stack component for which you are setting a log level. For user workload monitoring, available component values are prometheus, prometheusOperator, and thanosRuler.

  For <log_level>, substitute the log level to set for the component. The available values are error, warn, info, and debug. The default value is info.
Save the file to apply the changes. The pods for the component restart automatically when you apply the log-level change.
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.

Warning: When changes are saved to a monitoring config map, the pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
Confirm that the log level has been applied by reviewing the deployment or pod configuration in the related project. The following example checks the log level in the prometheus-operator deployment in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring get deploy prometheus-operator -o yaml | grep "log-level"
  Example output

    - --log-level=debug
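  A similar check works for default platform monitoring. For example, the following sketch inspects the prometheus-operator deployment in the openshift-monitoring project; the exact flag name can vary between components:

    $ oc -n openshift-monitoring get deploy prometheus-operator -o yaml | grep "log-level"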
Check that the pods for the component are running. The following example lists the status of pods in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring get pods
Note: If an unrecognized logLevel value is included in the ConfigMap object, the pods for the component might not restart successfully.
3.3. Enabling the query log file for Prometheus
You can configure Prometheus to write all queries that have been run by the engine to a log file. You can do so for default platform monitoring and for user-defined workload monitoring.
Because log rotation is not supported, only enable this feature temporarily when you need to troubleshoot an issue. After you finish troubleshooting, disable query logging by reverting the changes you made to the ConfigMap object to enable the feature.
Prerequisites
- You have installed the OpenShift CLI (oc).
- If you are enabling the query log file feature for Prometheus in the openshift-monitoring project:
  - You have access to the cluster as a user with the cluster-admin cluster role.
  - You have created the cluster-monitoring-config ConfigMap object.
- If you are enabling the query log file feature for Prometheus in the openshift-user-workload-monitoring project:
  - You have access to the cluster as a user with the cluster-admin cluster role, or as a user with the user-workload-monitoring-config-edit role in the openshift-user-workload-monitoring project.
  - You have created the user-workload-monitoring-config ConfigMap object.
Procedure
To set the query log file for Prometheus in the openshift-monitoring project:

- Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Add queryLogFile: <path> for prometheusK8s under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          queryLogFile: <path>

  For <path>, substitute the full path to the file in which queries will be logged.
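  For example, the following sketch logs queries to a file under /tmp; it assumes that path is writable by the Prometheus container, and the file name is a placeholder:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          queryLogFile: /tmp/promql-queries.log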
Save the file to apply the changes.
Warning: When you save changes to a monitoring config map, pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
Verify that the pods for the component are running. The following sample command lists the status of pods in the openshift-monitoring project:

    $ oc -n openshift-monitoring get pods
Read the query log:

    $ oc -n openshift-monitoring exec prometheus-k8s-0 -- cat <path>

Important: Revert the setting in the config map after you have examined the logged query information.
To set the query log file for Prometheus in the openshift-user-workload-monitoring project:

- Edit the user-workload-monitoring-config ConfigMap object in the openshift-user-workload-monitoring project:

    $ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
- Add queryLogFile: <path> for prometheus under data/config.yaml:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: user-workload-monitoring-config
      namespace: openshift-user-workload-monitoring
    data:
      config.yaml: |
        prometheus:
          queryLogFile: <path>

  For <path>, substitute the full path to the file in which queries will be logged.
Save the file to apply the changes.
Note: Configurations applied to the user-workload-monitoring-config ConfigMap object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.

Warning: When you save changes to a monitoring config map, pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
Verify that the pods for the component are running. The following example command lists the status of pods in the
openshift-user-workload-monitoring
project:oc -n openshift-user-workload-monitoring get pods
$ oc -n openshift-user-workload-monitoring get pods
Read the query log:

    $ oc -n openshift-user-workload-monitoring exec prometheus-user-workload-0 -- cat <path>

Important: Revert the setting in the config map after you have examined the logged query information.
3.4. Enabling query logging for Thanos Querier
For default platform monitoring in the openshift-monitoring project, you can enable the Cluster Monitoring Operator to log all queries run by Thanos Querier.
Because log rotation is not supported, only enable this feature temporarily when you need to troubleshoot an issue. After you finish troubleshooting, disable query logging by reverting the changes you made to the ConfigMap object to enable the feature.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin cluster role.
- You have created the cluster-monitoring-config ConfigMap object.
Procedure
You can enable query logging for Thanos Querier in the openshift-monitoring project:
- Edit the cluster-monitoring-config ConfigMap object in the openshift-monitoring project:

    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Add a thanosQuerier section under data/config.yaml and add values as shown in the following example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        thanosQuerier:
          enableRequestLogging: <value>
          logLevel: <value>

  For enableRequestLogging, set the value to true to enable query logging or false to disable it.

  For logLevel, set the log level for the logged queries. The available values are error, warn, info, and debug, as described in "Setting log levels for monitoring components".
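  For example, the following sketch enables query logging and sets the log level to debug:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        thanosQuerier:
          enableRequestLogging: true
          logLevel: debug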
- Save the file to apply the changes.

  Warning: When you save changes to a monitoring config map, pods and other resources in the related project might be redeployed. The running monitoring processes in that project might also be restarted.
Verification
- Verify that the Thanos Querier pods are running. The following sample command lists the status of pods in the openshift-monitoring project:

    $ oc -n openshift-monitoring get pods
- Run a test query using the following sample commands as a model:

    $ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
    $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?query=cluster_version'
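  The read-log step that follows requires a Thanos Querier pod name. To find one, you can list the pods whose names begin with thanos-querier, for example:

    $ oc -n openshift-monitoring get pods | grep thanos-querier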
- Run the following command to read the query log:

    $ oc -n openshift-monitoring logs <thanos_querier_pod_name> -c thanos-query

  Note: Because the thanos-querier pods are highly available (HA) pods, you might be able to see logs in only one pod.
- After you examine the logged query information, disable query logging by changing the enableRequestLogging value to false in the config map.