Chapter 5. Managing observability alerts


Receive and define alerts for the observability service to be notified of hub cluster and managed cluster changes.

5.1. Configuring Alertmanager

Integrate external messaging tools such as email, Slack, and PagerDuty to receive notifications from Alertmanager. You must override the alertmanager-config secret in the open-cluster-management-observability namespace to add integrations, and configure routes for Alertmanager. Complete the following steps to update the custom receiver rules:

  1. Extract the data from the alertmanager-config secret. Run the following command:

    oc -n open-cluster-management-observability get secret alertmanager-config --template='{{ index .data "alertmanager.yaml" }}' |base64 -d > alertmanager.yaml
  2. Edit and save the alertmanager.yaml file configuration by running the following command:

    oc -n open-cluster-management-observability create secret generic alertmanager-config --from-file=alertmanager.yaml --dry-run -o=yaml |  oc -n open-cluster-management-observability replace secret --filename=-

    Your updated secret might resemble the following content:

    global
      smtp_smarthost: 'localhost:25'
      smtp_from: 'alertmanager@example.org'
      smtp_auth_username: 'alertmanager'
      smtp_auth_password: 'password'
    templates:
    - '/etc/alertmanager/template/*.tmpl'
    route:
      group_by: ['alertname', 'cluster', 'service']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: team-X-mails
      routes:
      - match_re:
          service: ^(foo1|foo2|baz)$
        receiver: team-X-mails

Your changes are applied immediately after it is modified. For an example of Alertmanager, see prometheus/alertmanager.

5.2. Forwarding alerts

After you enable observability, alerts from your OpenShift Container Platform managed clusters are automatically sent to the hub cluster. You can use the alertmanager-config YAML file to configure alerts with an external notification system.

View the following example of the alertmanager-config YAML file:

global:
  slack_api_url: '<slack_webhook_url>'

route:
  receiver: 'slack-notifications'
  group_by: [alertname, datacenter, app]

receivers:
- name: 'slack-notifications'
  slack_configs:
  - channel: '#alerts'
    text: 'https://internal.myorg.net/wiki/alerts/{{ .GroupLabels.app }}/{{ .GroupLabels.alertname }}'

If you want to configure a proxy for alert forwarding, add the following global entry to the alertmanager-config YAML file:

global:
  slack_api_url: '<slack_webhook_url>'
  http_config:
    proxy_url: http://****

5.2.1. Disabling alert forwarding for managed clusters

To disable alert forwarding for managed clusters, add the following annotation to the MultiClusterObservability custom resource:

metadata:
      annotations:
        mco-disable-alerting: true

When you set the annotation, the alert forwarding configuration on the managed clusters is reverted. Any changes made to the ocp-monitoring-config ConfigMap in the openshift-monitoring namespace are also reverted. Setting the annotation ensures that the ocp-monitoring-config ConfigMap is no longer managed or updated by the observability operator endpoint. After you update the configuration, the Prometheus instance on your managed cluster restarts.

Important: Metrics on your managed cluster are lost if you have a Prometheus instance with a persistent volume for metrics, and the Prometheus instance restarts. Metrics from the hub cluster are not affected.

When the changes are reverted, a ConfigMap named cluster-monitoring-reverted is created in the open-cluster-management-addon-observability namespace. Any new, manually added alert forward configurations are not reverted from the ConfigMap.

Verify that the hub cluster alert manager is no longer propagating managed cluster alerts to third-party messaging tools. See the previous section, Configuring Alertmanager.

5.3. Silencing alerts

Add alerts that you do not want to receive. You can silence alerts by the alert name, match label, or time duration. After you add the alert that you want to silence, an ID is created. Your ID for your silenced alert might resemble the following string, d839aca9-ed46-40be-84c4-dca8773671da.

Continue reading for ways to silence alerts:

  • To silence a Red Hat Advanced Cluster Management alert, you must have access to the alertmanager-main pod in the open-cluster-management-observability namespace. For example, enter the following command in the pod terminal to silence SampleAlert:

    amtool silence add --alertmanager.url="http://localhost:9093" --author="user" --comment="Silencing sample alert" alertname="SampleAlert"
  • Silence an alert by using multiple match labels. The following command uses match-label-1 and match-label-2:

    amtool silence add --alertmanager.url="http://localhost:9093" --author="user" --comment="Silencing sample alert" <match-label-1>=<match-value-1> <match-label-2>=<match-value-2>
  • If you want to silence an alert for a specific period of time, use the --duration flag. Run the following command to silence the SampleAlert for an hour:

    amtool silence add --alertmanager.url="http://localhost:9093" --author="user" --comment="Silencing sample alert" --duration="1h" alertname="SampleAlert"

    You can also specify a start or end time for the silenced alert. Enter the following command to silence the SampleAlert at a specific start time:

    amtool silence add --alertmanager.url="http://localhost:9093" --author="user" --comment="Silencing sample alert" --start="2023-04-14T15:04:05-07:00" alertname="SampleAlert"
  • To view all silenced alerts that are created, run the following command:

    amtool silence --alertmanager.url="http://localhost:9093"
  • If you no longer want an alert to be silenced, end the silencing of the alert by running the following command:

    amtool silence expire --alertmanager.url="http://localhost:9093" "d839aca9-ed46-40be-84c4-dca8773671da"
  • To end the silencing of all alerts, run the following command:

    amtool silence expire --alertmanager.url="http://localhost:9093" $(amtool silence query --alertmanager.url="http://localhost:9093" -q)

5.4. Suppressing alerts

Suppress Red Hat Advanced Cluster Management alerts across your clusters globally that are less severe. Suppress alerts by defining an inhibition rule in the alertmanager-config in the open-cluster-management-observability namespace.

An inhibition rule mutes an alert when there is a set of parameter matches that match another set of existing matchers. In order for the rule to take effect, both the target and source alerts must have the same label values for the label names in the equal list. Your inhibit_rules might resemble the following:

global:
  resolve_timeout: 1h
inhibit_rules:1
  - equal:
      - namespace
    source_match:2
      severity: critical
    target_match_re:
      severity: warning|info
1
The inhibit_rules parameter section is defined to look for alerts in the same namespace. When a critical alert is initiated within a namespace and if there are any other alerts that contain the severity level warning or info in that namespace, only the critical alerts are routed to the Alertmanager receiver. The following alerts might be displayed when there are matches:
ALERTS{alertname="foo", namespace="ns-1", severity="critical"}
ALERTS{alertname="foo", namespace="ns-1", severity="warning"}
2
If the value of the source_match and target_match_re parameters do not match, the alert is routed to the receiver:
ALERTS{alertname="foo", namespace="ns-1", severity="critical"}
ALERTS{alertname="foo", namespace="ns-2", severity="warning"}
  • To view suppressed alerts in Red Hat Advanced Cluster Management, enter the following command:

    amtool alert --alertmanager.url="http://localhost:9093" --inhibited

5.5. Additional resources

Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.