Chapter 6. Managing alerts


6.1. Managing alerts as an Administrator

In OpenShift Container Platform, the Alerting UI enables you to manage alerts, silences, and alerting rules.

Note

The alerts, silences, and alerting rules that are available in the Alerting UI relate to the projects that you have access to. For example, if you are logged in as a user with the cluster-admin role, you can access all alerts, silences, and alerting rules.

6.1.1. Accessing the Alerting UI from the Administrator perspective

The Alerting UI is accessible through the Administrator perspective of the OpenShift Container Platform web console.

  • From the Administrator perspective, go to Observe → Alerting. The three main pages in the Alerting UI in this perspective are the Alerts, Silences, and Alerting rules pages.

6.1.2. Getting information about alerts, silences, and alerting rules from the Administrator perspective

The Alerting UI provides detailed information about alerts and their governing alerting rules and silences.

Prerequisites

  • You have access to the cluster as a user with view permissions for the project that you are viewing alerts for.

Procedure

To obtain information about alerts:

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to the Observe → Alerting → Alerts page.
  2. Optional: Search for alerts by name by using the Name field in the search list.
  3. Optional: Filter alerts by state, severity, and source by selecting filters in the Filter list.
  4. Optional: Sort the alerts by clicking one or more of the Name, Severity, State, and Source column headers.
  5. Click the name of an alert to view its Alert details page. The page includes a graph that illustrates alert time series data. It also provides the following information about the alert:

    • A description of the alert
    • Messages associated with the alert
    • Labels attached to the alert
    • A link to its governing alerting rule
    • Silences for the alert, if any exist

To obtain information about silences:

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to the Observe → Alerting → Silences page.
  2. Optional: Filter the silences by name using the Search by name field.
  3. Optional: Filter silences by state by selecting filters in the Filter list. By default, Active and Pending filters are applied.
  4. Optional: Sort the silences by clicking one or more of the Name, Firing alerts, State, and Creator column headers.
  5. Select the name of a silence to view its Silence details page. The page includes the following details:

    • Alert specification
    • Start time
    • End time
    • Silence state
    • Number and list of firing alerts

To obtain information about alerting rules:

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to the Observe → Alerting → Alerting rules page.
  2. Optional: Filter alerting rules by state, severity, and source by selecting filters in the Filter list.
  3. Optional: Sort the alerting rules by clicking one or more of the Name, Severity, Alert state, and Source column headers.
  4. Select the name of an alerting rule to view its Alerting rule details page. The page provides the following details about the alerting rule:

    • Alerting rule name, severity, and description.
    • The expression that defines the condition for firing the alert.
    • The time for which the condition should be true for an alert to fire.
    • A graph for each alert governed by the alerting rule, showing the value with which the alert is firing.
    • A table of all alerts governed by the alerting rule.

6.1.3. Managing silences

You can create a silence for an alert in the OpenShift Container Platform web console in the Administrator perspective. After you create silences, you can view, edit, and expire them. While a silence is active, you do not receive notifications about the silenced alert when it fires.

Note

When you create silences, they are replicated across Alertmanager pods. However, if you do not configure persistent storage for Alertmanager, silences might be lost. This can happen, for example, if all Alertmanager pods restart at the same time.
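
The persistent storage mentioned in the note could be requested roughly as follows. This is a minimal sketch, not a definitive configuration: it assumes that Alertmanager is configured through the cluster-monitoring-config config map in the openshift-monitoring project, and the file name, the <storage_class> placeholder, and the 10Gi size are illustrative assumptions.

```shell
# Sketch only: write a ConfigMap that requests a persistent volume for Alertmanager.
# The file name, storage class placeholder, and size are illustrative assumptions.
cat > alertmanager-storage-sketch.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: <storage_class>
          resources:
            requests:
              storage: 10Gi
EOF

# Review the file and merge it with any existing cluster-monitoring-config
# content before applying it, because applying replaces the current config.yaml:
#   oc apply -f alertmanager-storage-sketch.yaml
```

The file is only written to disk here; the apply step is left as a comment so that an existing monitoring configuration is not overwritten by accident.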

6.1.3.1. Silencing alerts from the Administrator perspective

You can silence a specific alert or silence alerts that match a specification that you define.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

Procedure

To silence a specific alert:

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to Observe → Alerting → Alerts.
  2. For the alert that you want to silence, click the kebab menu and select Silence alert to open the Silence alert page with a default configuration for the chosen alert.
  3. Optional: Change the default configuration details for the silence.

    Note

    You must add a comment before saving a silence.

  4. To save the silence, click Silence.

To silence a set of alerts:

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to Observe → Alerting → Silences.
  2. Click Create silence.
  3. On the Create silence page, set the schedule, duration, and label details for an alert.

    Note

    You must add a comment before saving a silence.

  4. To create silences for alerts that match the labels that you entered, click Silence.
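
The same kind of silence can also be sketched from the command line. The following is a hedged illustration, not the documented procedure: it assumes that the pod alertmanager-main-0 with a container named alertmanager runs in the openshift-monitoring project, that the amtool binary is available in that container, and that Alertmanager listens on localhost:9093 inside the pod. The function name create_silence is hypothetical.

```shell
# Sketch only: create a silence with amtool inside an Alertmanager pod.
# Assumptions: pod alertmanager-main-0, container alertmanager, amtool present,
# and Alertmanager reachable at http://localhost:9093 inside the pod.
create_silence() {
  # $1 = alert name, $2 = duration (for example 2h), $3 = author, $4 = comment
  oc -n openshift-monitoring exec alertmanager-main-0 -c alertmanager -- \
    amtool silence add "alertname=$1" \
      --alertmanager.url=http://localhost:9093 \
      --author="$3" \
      --duration="$2" \
      --comment="$4"
}

# Example call (ExampleAlert is an illustrative alert name):
#   create_silence ExampleAlert 2h admin "Planned maintenance"
```

The label matcher, duration, and comment correspond to the label details, duration, and mandatory comment that the Create silence page asks for.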

6.1.3.2. Editing silences from the Administrator perspective

You can edit a silence, which expires the existing silence and creates a new one with the changed configuration.

Prerequisites

  • If you are a cluster administrator, you have access to the cluster as a user with the cluster-admin role.
  • If you are a non-administrator user, you have access to the cluster as a user with the following user roles:

    • The cluster-monitoring-view cluster role, which allows you to access Alertmanager.
    • The monitoring-alertmanager-edit role, which permits you to create and silence alerts in the Administrator perspective in the web console.
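
A cluster administrator could grant these roles to a non-administrator user along the following lines. This is a sketch under the assumption that monitoring-alertmanager-edit is a role defined in the openshift-monitoring project; the function name grant_silence_roles is hypothetical.

```shell
# Sketch only: grant a non-administrator user the two roles listed above.
# Assumes monitoring-alertmanager-edit is a Role in openshift-monitoring.
grant_silence_roles() {
  # $1 = user name
  oc adm policy add-cluster-role-to-user cluster-monitoring-view "$1"
  oc -n openshift-monitoring adm policy add-role-to-user \
    monitoring-alertmanager-edit "$1" --role-namespace openshift-monitoring
}

# Example call, run as a cluster administrator:
#   grant_silence_roles <user>
```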

Procedure

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to Observe → Alerting → Silences.
  2. For the silence you want to modify, click the kebab menu and select Edit silence.

    Alternatively, you can click Actions and select Edit silence on the Silence details page for a silence.

  3. On the Edit silence page, make changes and click Silence. Doing so expires the existing silence and creates one with the updated configuration.

6.1.3.3. Expiring silences from the Administrator perspective

You can expire a single silence or multiple silences. Expiring a silence deactivates it permanently.

Note

You cannot delete expired silences. Expired silences older than 120 hours are garbage collected.

Prerequisites

  • If you are a cluster administrator, you have access to the cluster as a user with the cluster-admin role.
  • If you are a non-administrator user, you have access to the cluster as a user with the following user roles:

    • The cluster-monitoring-view cluster role, which allows you to access Alertmanager.
    • The monitoring-alertmanager-edit role, which permits you to create and silence alerts in the Administrator perspective in the web console.

Procedure

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to Observe → Alerting → Silences.
  2. For the silence or silences you want to expire, select the checkbox in the corresponding row.
  3. Click Expire 1 silence to expire a single selected silence or Expire <n> silences to expire multiple selected silences, where <n> is the number of silences you selected.

    Alternatively, to expire a single silence you can click Actions and select Expire silence on the Silence details page for a silence.
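
Expiring a silence can be sketched from the command line as well. As with any CLI illustration of a web console procedure, this is hedged: it assumes the pod alertmanager-main-0 with a container named alertmanager, the amtool binary inside that container, and Alertmanager listening on localhost:9093. The function names are hypothetical.

```shell
# Sketch only: list silences and expire one by ID with amtool.
# Assumptions: pod alertmanager-main-0, container alertmanager, amtool present,
# and Alertmanager reachable at http://localhost:9093 inside the pod.
list_silences() {
  oc -n openshift-monitoring exec alertmanager-main-0 -c alertmanager -- \
    amtool silence query --alertmanager.url=http://localhost:9093
}

expire_silence() {
  # $1 = silence ID as printed by list_silences
  oc -n openshift-monitoring exec alertmanager-main-0 -c alertmanager -- \
    amtool silence expire "$1" --alertmanager.url=http://localhost:9093
}

# Example calls:
#   list_silences
#   expire_silence <silence_id>
```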

6.1.4. Managing alerting rules for core platform monitoring

OpenShift Container Platform monitoring includes a large set of default alerting rules for platform metrics. As a cluster administrator, you can customize this set of rules in two ways:

  • Modify the settings for existing platform alerting rules by adjusting thresholds or by adding and modifying labels. For example, you can change the severity label for an alert from warning to critical to help you route and triage issues flagged by an alert.
  • Define and add new custom alerting rules by constructing a query expression based on core platform metrics in the openshift-monitoring project.

6.1.4.1. Creating new alerting rules

As a cluster administrator, you can create new alerting rules based on platform metrics. These alerting rules trigger alerts based on the values of chosen metrics.

Note

  • If you create a customized AlertingRule resource based on an existing platform alerting rule, silence the original alert to avoid receiving conflicting alerts.
  • To help users understand the impact and cause of the alert, ensure that your alerting rule contains an alert message and severity value.

Prerequisites

  • You have access to the cluster as a user that has the cluster-admin cluster role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Create a new YAML configuration file named example-alerting-rule.yaml.
  2. Add an AlertingRule resource to the YAML file. The following example creates a new alerting rule named example, similar to the default Watchdog alert:

    apiVersion: monitoring.openshift.io/v1
    kind: AlertingRule
    metadata:
      name: example
      namespace: openshift-monitoring # (1)
    spec:
      groups:
      - name: example-rules
        rules:
        - alert: ExampleAlert # (2)
          for: 1m # (3)
          expr: vector(1) # (4)
          labels:
            severity: warning # (5)
          annotations:
            message: This is an example alert. # (6)

    (1) Ensure that the namespace is openshift-monitoring.
    (2) The name of the alerting rule you want to create.
    (3) The duration for which the condition should be true before an alert is fired.
    (4) The PromQL query expression that defines the new rule.
    (5) The severity that the alerting rule assigns to the alert.
    (6) The message associated with the alert.

    Important

    You must create the AlertingRule object in the openshift-monitoring namespace. Otherwise, the alerting rule is not accepted.

  3. Apply the configuration file to the cluster:

    $ oc apply -f example-alerting-rule.yaml
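
After applying the file, you might want to confirm that the object exists. The following sketch assumes that the AlertingRule custom resource can be queried with the resource name alertingrule; the function name show_alerting_rule is hypothetical.

```shell
# Sketch only: print an AlertingRule object from the openshift-monitoring project.
show_alerting_rule() {
  # $1 = AlertingRule name, for example "example"
  oc -n openshift-monitoring get alertingrule "$1" -o yaml
}

# Example call for the rule created above:
#   show_alerting_rule example
```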

6.1.4.2. Modifying core platform alerting rules

As a cluster administrator, you can modify core platform alerts before Alertmanager routes them to a receiver. For example, you can change the severity label of an alert, add a custom label, or exclude an alert from being sent to Alertmanager.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin cluster role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Create a new YAML configuration file named example-modified-alerting-rule.yaml.
  2. Add an AlertRelabelConfig resource to the YAML file. The following example changes the severity label to critical for the default platform Watchdog alerting rule:

    apiVersion: monitoring.openshift.io/v1
    kind: AlertRelabelConfig
    metadata:
      name: watchdog
      namespace: openshift-monitoring # (1)
    spec:
      configs:
      - sourceLabels: [alertname,severity] # (2)
        regex: "Watchdog;none" # (3)
        targetLabel: severity # (4)
        replacement: critical # (5)
        action: Replace # (6)

    (1) Ensure that the namespace is openshift-monitoring.
    (2) The source labels for the values you want to modify.
    (3) The regular expression against which the value of sourceLabels is matched.
    (4) The target label of the value you want to modify.
    (5) The new value for the target label.
    (6) The relabel action that replaces the old value based on regex matching. The default action is Replace. Other possible values are Keep, Drop, HashMod, LabelMap, LabelDrop, and LabelKeep.

    Important

    You must create the AlertRelabelConfig object in the openshift-monitoring namespace. Otherwise, the alert label will not change.

  3. Apply the configuration file to the cluster:

    $ oc apply -f example-modified-alerting-rule.yaml
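
The introduction to this section also mentions excluding an alert from being sent to Alertmanager. A minimal sketch of that case, using the Drop action listed among the possible values, might look as follows. The file name, the object name drop-example-alert, and the alert name ExampleAlert are illustrative assumptions.

```shell
# Sketch only: write an AlertRelabelConfig that drops alerts named ExampleAlert.
# File name, object name, and alert name are illustrative assumptions.
cat > example-drop-alert-relabel.yaml <<'EOF'
apiVersion: monitoring.openshift.io/v1
kind: AlertRelabelConfig
metadata:
  name: drop-example-alert
  namespace: openshift-monitoring
spec:
  configs:
  - sourceLabels: [alertname]
    regex: "ExampleAlert"
    action: Drop
EOF

# Apply it in the same way as the previous example:
#   oc apply -f example-drop-alert-relabel.yaml
```

Because the Drop action needs no targetLabel or replacement, the configuration is shorter than the Replace example.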

6.1.5. Managing alerting rules for user-defined projects

In OpenShift Container Platform, you can create, view, edit, and remove alerting rules for user-defined projects. Those alerting rules will trigger alerts based on the values of the chosen metrics.

6.1.5.1. Creating alerting rules for user-defined projects

You can create alerting rules for user-defined projects. Those alerting rules will trigger alerts based on the values of the chosen metrics.

Note

  • When you create an alerting rule, a project label is enforced on it even if a rule with the same name exists in another project.
  • To help users understand the impact and cause of the alert, ensure that your alerting rule contains an alert message and severity value.

Prerequisites

  • You have enabled monitoring for user-defined projects.
  • You are logged in as a cluster administrator or as a user that has the monitoring-rules-edit cluster role for the project where you want to create an alerting rule.
  • You have installed the OpenShift CLI (oc).
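
If the user does not yet have the monitoring-rules-edit cluster role for the project, a cluster administrator could bind it roughly as follows. This is a sketch; the function name grant_rules_edit is hypothetical.

```shell
# Sketch only: bind the monitoring-rules-edit cluster role to a user in one project.
grant_rules_edit() {
  # $1 = user name, $2 = project (namespace)
  oc adm policy add-role-to-user monitoring-rules-edit "$1" -n "$2"
}

# Example call, run as a cluster administrator:
#   grant_rules_edit <user> ns1
```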

Procedure

  1. Create a YAML file for alerting rules. In this example, it is called example-app-alerting-rule.yaml.
  2. Add an alerting rule configuration to the YAML file. The following example creates a new alerting rule named example-alert. The alerting rule fires an alert when the version metric exposed by the sample service becomes 0:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: example-alert
      namespace: ns1
    spec:
      groups:
      - name: example
        rules:
        - alert: VersionAlert # (1)
          for: 1m # (2)
          expr: version{job="prometheus-example-app"} == 0 # (3)
          labels:
            severity: warning # (4)
          annotations:
            message: This is an example alert. # (5)

    (1) The name of the alerting rule you want to create.
    (2) The duration for which the condition should be true before an alert is fired.
    (3) The PromQL query expression that defines the new rule.
    (4) The severity that the alerting rule assigns to the alert.
    (5) The message associated with the alert.

  3. Apply the configuration file to the cluster:

    $ oc apply -f example-app-alerting-rule.yaml

6.1.5.2. Listing alerting rules for all projects in a single view

As a cluster administrator, you can list alerting rules for core OpenShift Container Platform and user-defined projects together in a single view.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. From the Administrator perspective of the OpenShift Container Platform web console, go to Observe → Alerting → Alerting rules.
  2. Select the Platform and User sources in the Filter drop-down menu.

    Note

    The Platform source is selected by default.
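
A rough command-line counterpart to this view is to list PrometheusRule objects across all projects. This is a sketch only: the output lists rule objects rather than individual alerting rules, so it does not reproduce the web console view exactly. The function name list_all_rule_objects is hypothetical.

```shell
# Sketch only: list PrometheusRule objects in every project.
list_all_rule_objects() {
  oc get prometheusrule --all-namespaces
}

# Example call:
#   list_all_rule_objects
```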

6.1.5.3. Removing alerting rules for user-defined projects

You can remove alerting rules for user-defined projects.

Prerequisites

  • You have enabled monitoring for user-defined projects.
  • You are logged in as a cluster administrator or as a user that has the monitoring-rules-edit cluster role for the project where you want to remove an alerting rule.
  • You have installed the OpenShift CLI (oc).

Procedure

  • To remove rule <foo> in <namespace>, run the following:

    $ oc -n <namespace> delete prometheusrule <foo>

6.2. Managing alerts as a Developer

In OpenShift Container Platform, the Alerting UI enables you to manage alerts, silences, and alerting rules.

Note

The alerts, silences, and alerting rules that are available in the Alerting UI relate to the projects that you have access to.

6.2.1. Accessing the Alerting UI from the Developer perspective

The Alerting UI is accessible through the Developer perspective of the OpenShift Container Platform web console.

  • From the Developer perspective, go to Observe and go to the Alerts tab.
  • Select the project that you want to manage alerts for from the Project: list.

In this perspective, alerts, silences, and alerting rules are all managed from the Alerts tab. The results shown in the Alerts tab are specific to the selected project.

Note

In the Developer perspective, you can select from core OpenShift Container Platform and user-defined projects that you have access to in the Project: <project_name> list. However, alerts, silences, and alerting rules relating to core OpenShift Container Platform projects are not displayed if you are not logged in as a cluster administrator.

6.2.2. Getting information about alerts, silences, and alerting rules from the Developer perspective

The Alerting UI provides detailed information about alerts and their governing alerting rules and silences.

Prerequisites

  • You have access to the cluster as a user with view permissions for the project that you are viewing alerts for.

Procedure

To obtain information about alerts, silences, and alerting rules:

  1. From the Developer perspective of the OpenShift Container Platform web console, go to the Observe → <project_name> → Alerts page.
  2. View details for an alert, silence, or an alerting rule:

    • Alert details can be viewed by clicking a greater than symbol (>) next to an alert name and then selecting the alert from the list.
    • Silence details can be viewed by clicking a silence in the Silenced by section of the Alert details page. The Silence details page includes the following information:

      • Alert specification
      • Start time
      • End time
      • Silence state
      • Number and list of firing alerts
    • Alerting rule details can be viewed by clicking the kebab menu next to an alert in the Alerts page and then clicking View Alerting Rule.

Note

Only alerts, silences, and alerting rules relating to the selected project are displayed in the Developer perspective.

6.2.3. Managing silences

You can create a silence for an alert in the OpenShift Container Platform web console in the Developer perspective. After you create silences, you can view, edit, and expire them. While a silence is active, you do not receive notifications about the silenced alert when it fires.

Note

When you create silences, they are replicated across Alertmanager pods. However, if you do not configure persistent storage for Alertmanager, silences might be lost. This can happen, for example, if all Alertmanager pods restart at the same time.

6.2.3.1. Silencing alerts from the Developer perspective

You can silence a specific alert or silence alerts that match a specification that you define.

Prerequisites

  • If you are a cluster administrator, you have access to the cluster as a user with the cluster-admin role.
  • If you are a non-administrator user, you have access to the cluster as a user with the following user roles:

    • The cluster-monitoring-view cluster role, which allows you to access Alertmanager.
    • The monitoring-alertmanager-edit role, which permits you to create and silence alerts in the Administrator perspective in the web console.
    • The monitoring-rules-edit cluster role, which permits you to create and silence alerts in the Developer perspective in the web console.

Procedure

To silence a specific alert:

  1. From the Developer perspective of the OpenShift Container Platform web console, go to Observe and go to the Alerts tab.
  2. Select the project that you want to silence an alert for from the Project: list.
  3. If necessary, expand the details for the alert by clicking a greater than symbol (>) next to the alert name.
  4. Click the alert message in the expanded view to open the Alert details page for the alert.
  5. Click Silence alert to open the Silence alert page with a default configuration for the alert.
  6. Optional: Change the default configuration details for the silence.

    Note

    You must add a comment before saving a silence.

  7. To save the silence, click Silence.

To silence a set of alerts:

  1. From the Developer perspective of the OpenShift Container Platform web console, go to Observe and go to the Silences tab.
  2. Select the project that you want to silence alerts for from the Project: list.
  3. Click Create silence.
  4. On the Create silence page, set the duration and label details for an alert.

    Note

    You must add a comment before saving a silence.

  5. To create silences for alerts that match the labels that you entered, click Silence.

6.2.3.2. Editing silences from the Developer perspective

You can edit a silence, which expires the existing silence and creates a new one with the changed configuration.

Prerequisites

  • If you are a cluster administrator, you have access to the cluster as a user with the cluster-admin role.
  • If you are a non-administrator user, you have access to the cluster as a user with the following user roles:

    • The cluster-monitoring-view cluster role, which allows you to access Alertmanager.
    • The monitoring-rules-edit cluster role, which permits you to create and silence alerts in the Developer perspective in the web console.

Procedure

  1. From the Developer perspective of the OpenShift Container Platform web console, go to Observe and go to the Silences tab.
  2. Select the project that you want to edit silences for from the Project: list.
  3. For the silence you want to modify, click the kebab menu and select Edit silence.

    Alternatively, you can click Actions and select Edit silence on the Silence details page for a silence.

  4. On the Edit silence page, make changes and click Silence. Doing so expires the existing silence and creates one with the updated configuration.

6.2.3.3. Expiring silences from the Developer perspective

You can expire a single silence or multiple silences. Expiring a silence deactivates it permanently.

Note

You cannot delete expired silences. Expired silences older than 120 hours are garbage collected.

Prerequisites

  • If you are a cluster administrator, you have access to the cluster as a user with the cluster-admin role.
  • If you are a non-administrator user, you have access to the cluster as a user with the following user roles:

    • The cluster-monitoring-view cluster role, which allows you to access Alertmanager.
    • The monitoring-rules-edit cluster role, which permits you to create and silence alerts in the Developer perspective in the web console.

Procedure

  1. From the Developer perspective of the OpenShift Container Platform web console, go to Observe and go to the Silences tab.
  2. Select the project that you want to expire a silence for from the Project: list.
  3. For the silence or silences you want to expire, select the checkbox in the corresponding row.
  4. Click Expire 1 silence to expire a single selected silence or Expire <n> silences to expire multiple selected silences, where <n> is the number of silences you selected.

    Alternatively, to expire a single silence you can click Actions and select Expire silence on the Silence details page for a silence.

6.2.4. Managing alerting rules for user-defined projects

In OpenShift Container Platform, you can create, view, edit, and remove alerting rules for user-defined projects. Those alerting rules will trigger alerts based on the values of the chosen metrics.

6.2.4.1. Creating alerting rules for user-defined projects

You can create alerting rules for user-defined projects. Those alerting rules will trigger alerts based on the values of the chosen metrics.

Note

  • When you create an alerting rule, a project label is enforced on it even if a rule with the same name exists in another project.
  • To help users understand the impact and cause of the alert, ensure that your alerting rule contains an alert message and severity value.

Prerequisites

  • You have enabled monitoring for user-defined projects.
  • You are logged in as a cluster administrator or as a user that has the monitoring-rules-edit cluster role for the project where you want to create an alerting rule.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. Create a YAML file for alerting rules. In this example, it is called example-app-alerting-rule.yaml.
  2. Add an alerting rule configuration to the YAML file. The following example creates a new alerting rule named example-alert. The alerting rule fires an alert when the version metric exposed by the sample service becomes 0:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: example-alert
      namespace: ns1
    spec:
      groups:
      - name: example
        rules:
        - alert: VersionAlert # (1)
          for: 1m # (2)
          expr: version{job="prometheus-example-app"} == 0 # (3)
          labels:
            severity: warning # (4)
          annotations:
            message: This is an example alert. # (5)

    (1) The name of the alerting rule you want to create.
    (2) The duration for which the condition should be true before an alert is fired.
    (3) The PromQL query expression that defines the new rule.
    (4) The severity that the alerting rule assigns to the alert.
    (5) The message associated with the alert.

  3. Apply the configuration file to the cluster:

    $ oc apply -f example-app-alerting-rule.yaml

6.2.4.2. Accessing alerting rules for user-defined projects

To list alerting rules for a user-defined project, you must have been assigned the monitoring-rules-view cluster role for the project.

Prerequisites

  • You have enabled monitoring for user-defined projects.
  • You are logged in as a user that has the monitoring-rules-view cluster role for your project.
  • You have installed the OpenShift CLI (oc).

Procedure

  1. To list alerting rules in <project>, run the following:

    $ oc -n <project> get prometheusrule

  2. To list the configuration of an alerting rule, run the following:

    $ oc -n <project> get prometheusrule <rule> -o yaml
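
When only the alert names inside a rule object are of interest, the YAML output can be narrowed with a JSONPath expression. The following is a sketch; the function name list_alert_names is hypothetical, and the expression assumes the spec.groups[].rules[].alert layout shown in the PrometheusRule example in this chapter.

```shell
# Sketch only: print the alert names defined in one PrometheusRule object.
list_alert_names() {
  # $1 = project, $2 = PrometheusRule name
  oc -n "$1" get prometheusrule "$2" \
    -o jsonpath='{.spec.groups[*].rules[*].alert}'
}

# Example call for the rule created earlier in this section:
#   list_alert_names ns1 example-alert
```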

6.2.4.3. Removing alerting rules for user-defined projects

You can remove alerting rules for user-defined projects.

Prerequisites

  • You have enabled monitoring for user-defined projects.
  • You are logged in as a cluster administrator or as a user that has the monitoring-rules-edit cluster role for the project where you want to remove an alerting rule.
  • You have installed the OpenShift CLI (oc).

Procedure

  • To remove rule <foo> in <namespace>, run the following:

    $ oc -n <namespace> delete prometheusrule <foo>
