Chapter 2. Custom logging alerts
You can configure the LokiStack deployment to produce customized alerts and recorded metrics. If you want to use customized alerting and recording rules, you must enable the LokiStack ruler component.
2.1. About configuring log-based alerts and recording rules for Loki
Learn how to configure log-based alerts and recorded metrics for Loki by using LogQL expressions and custom resources (CRs).
LokiStack log-based alerts and recorded metrics are triggered by providing LogQL (Grafana documentation) expressions to the ruler component.
To provide these expressions, create an AlertingRule CR containing alerting rules, or a RecordingRule CR containing Prometheus-compatible recording rules (Prometheus documentation).
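A RecordingRule CR has much the same shape as an AlertingRule CR. The following is a minimal sketch, not a definitive configuration. The CR name, namespace, group name, recorded metric name, and LogQL query are placeholders chosen for illustration, and the label under metadata.labels is an assumption that must match whatever selector your LokiStack CR defines.

```yaml
apiVersion: loki.grafana.com/v1
kind: RecordingRule
metadata:
  name: <name>                  # hypothetical CR name
  namespace: <namespace>        # a namespace selected by the LokiStack ruler configuration
  labels:
    <label_name>: "true"        # assumed to match the LokiStack rules selector
spec:
  tenantID: application
  groups:
    - name: <group_name>        # hypothetical group name
      interval: 10m             # how often the rules in this group are evaluated
      rules:
        # Hypothetical metric name; it must be a valid Prometheus metric name.
        - record: app:error_lines:rate1m
          expr: |
            sum(rate({kubernetes_namespace_name="<namespace>"} |= "error" [1m])) by (job)
```

Instead of firing an alert, the ruler evaluates the expression at each interval and stores the result as a metric under the recorded name.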
Administrators can configure log-based alerts or recorded metrics for application, audit, or infrastructure tenants. Users without administrator permissions can configure log-based alerts or recorded metrics for application tenants of the applications that they have access to.
Application, audit, and infrastructure alerts are sent by default to the OpenShift Container Platform monitoring stack Alertmanager in the openshift-monitoring namespace, unless you have disabled the local Alertmanager instance. If the Alertmanager that is used to monitor user-defined projects in the openshift-user-workload-monitoring namespace is enabled, application alerts are sent to the Alertmanager in this namespace by default.
2.2. Configuring the ruler
When the LokiStack ruler component is enabled, users can define a group of LogQL expressions that trigger logging alerts or recorded metrics.
Administrators can enable the ruler by modifying the LokiStack custom resource (CR).
Prerequisites
- You have installed the Red Hat OpenShift Logging Operator and the Loki Operator.
- You have created a LokiStack CR.
- You have administrator permissions.
Procedure
Enable the ruler by ensuring that the LokiStack CR has the following spec configuration:

    apiVersion: loki.grafana.com/v1
    kind: LokiStack
    metadata:
      name: <name>
      namespace: <namespace>
    spec:
      # ...
      rules:
        enabled: true
        selector:
          matchLabels:
            <label_name>: "true"
        namespaceSelector:
          matchLabels:
            <label_name>: "true"

- rules.enabled: Enables Loki alerting and recording rules in the cluster when set to true.
- rules.selector: Specifies the selector for alerting and recording resources.
- rules.selector.matchLabels.<label_name>: Defines a custom label that you can apply to namespaces to enable logging alerts and metrics.
- rules.namespaceSelector: Specifies the namespaces where alerting and recording rules are defined. If undefined, the system uses only the rules in the same namespace as the LokiStack.
- rules.namespaceSelector.matchLabels.<label_name>: Defines a custom label for selecting namespaces where logging alerts and metrics are enabled.
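For the namespaceSelector in the LokiStack CR to pick up a namespace, that namespace must carry the matching label. The following is a minimal sketch of a labeled namespace. It assumes the same <label_name> placeholder as the LokiStack CR, and the namespace name is hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: <namespace>        # hypothetical namespace that holds your rule CRs
  labels:
    # Key and value must equal the entry under rules.namespaceSelector.matchLabels.
    <label_name>: "true"
```

Rule CRs in namespaces without this label are not selected when a namespaceSelector is set.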
2.3. Authorizing LokiStack rules RBAC permissions
Administrators bind cluster roles to users to enable them to create and manage alerting and recording rules. A cluster role is defined as a ClusterRole object that has the required role-based access control (RBAC) permissions.
The following cluster roles for alerting and recording rules are available for LokiStack:
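| Role name | Description |
|---|---|
| alertingrules.loki.grafana.com-v1-admin | Users with this role have administrative-level access to manage alerting rules. This cluster role grants permissions to create, read, update, delete, list, and watch AlertingRule resources. |
| alertingrules.loki.grafana.com-v1-crdview | Users with this role can view the definitions of Custom Resource Definitions (CRDs) related to AlertingRule resources. |
| alertingrules.loki.grafana.com-v1-edit | Users with this role have permission to create, update, and delete AlertingRule resources. |
| alertingrules.loki.grafana.com-v1-view | Users with this role can read AlertingRule resources. |
| recordingrules.loki.grafana.com-v1-admin | Users with this role have administrative-level access to manage recording rules. This cluster role grants permissions to create, read, update, delete, list, and watch RecordingRule resources. |
| recordingrules.loki.grafana.com-v1-crdview | Users with this role can view the definitions of Custom Resource Definitions (CRDs) related to RecordingRule resources. |
| recordingrules.loki.grafana.com-v1-edit | Users with this role have permission to create, update, and delete RecordingRule resources. |
| recordingrules.loki.grafana.com-v1-view | Users with this role can read RecordingRule resources. |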
2.3.1. Examples
To apply cluster roles for a user, you must bind an existing cluster role to a specific username.
Cluster roles can be cluster or namespace scoped, depending on which type of role binding you use. When a RoleBinding object is used, as when using the oc adm policy add-role-to-user command, the cluster role only applies to the specified namespace. When a ClusterRoleBinding object is used, as when using the oc adm policy add-cluster-role-to-user command, the cluster role applies to all namespaces in the cluster.
The following example command gives the specified user create, read, update, and delete (CRUD) permissions for alerting rules in a specific namespace:
$ oc adm policy add-role-to-user alertingrules.loki.grafana.com-v1-admin -n <namespace> <username>
The following command gives the specified user administrator permissions for alerting rules in all namespaces:
$ oc adm policy add-cluster-role-to-user alertingrules.loki.grafana.com-v1-admin <username>
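The two commands above create binding objects on your behalf. If you manage RBAC declaratively, roughly equivalent manifests can be sketched as follows. The binding names are hypothetical, and <namespace> and <username> are the same placeholders used in the commands.

```yaml
# Namespace-scoped binding, comparable to "oc adm policy add-role-to-user".
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: alertingrules-admin-binding            # hypothetical name
  namespace: <namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alertingrules.loki.grafana.com-v1-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: <username>
---
# Cluster-wide binding, comparable to "oc adm policy add-cluster-role-to-user".
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alertingrules-admin-cluster-binding    # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alertingrules.loki.grafana.com-v1-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: <username>
```

Both bindings reference the same ClusterRole. Only the kind of binding determines whether the permissions apply to one namespace or to the whole cluster.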
2.4. Creating a log-based alerting rule with Loki
The AlertingRule custom resource (CR) has a set of specifications and webhook validation definitions to declare groups of alerting rules for a single LokiStack instance. In addition, the webhook validation definition provides support for rule validation conditions:
- If an AlertingRule CR includes an invalid interval period, it is an invalid alerting rule.
- If an AlertingRule CR includes an invalid for period, it is an invalid alerting rule.
- If an AlertingRule CR includes an invalid LogQL expr, it is an invalid alerting rule.
- If an AlertingRule CR includes two groups with the same name, it is an invalid alerting rule.
- If none of the above applies, an alerting rule is considered valid.
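To make these conditions concrete, the following spec fragment is a sketch of a rule group intended to satisfy each check. The group name, alert name, namespace, and threshold are hypothetical. The field names follow the AlertingRule examples in this section, plus the group-level interval field named in the first condition.

```yaml
spec:
  tenantID: application
  groups:
    # Hypothetical group name; each group name must be unique within the CR.
    - name: example-group
      # Evaluation interval for the group; must be a valid duration.
      interval: 1m
      rules:
        - alert: ExampleHighErrorRate      # hypothetical alert name
          # Must be a valid LogQL expression.
          expr: |
            sum(rate({kubernetes_namespace_name="app-ns"} |= "error" [1m])) by (job) > 0.01
          # How long the condition must hold before the alert fires; must be a valid duration.
          for: 10s
          labels:
            severity: warning
          annotations:
            summary: Example summary.
            description: Example description.
```

Adding a second group that is also named example-group, or using a value such as "ten seconds" for the for field, would make the CR fail the corresponding check.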
| Tenant type | Valid namespaces for AlertingRule CRs |
|---|---|
| audit | |
| infrastructure | openshift-*, kube-*, default |
| application | All other namespaces. |
Prerequisites
- Red Hat OpenShift Logging Operator 5.7 and later
- OpenShift Container Platform 4.13 and later
Procedure
Create an AlertingRule custom resource (CR):

The following example shows an infrastructure AlertingRule CR:

    apiVersion: loki.grafana.com/v1
    kind: AlertingRule
    metadata:
      name: loki-operator-alerts
      namespace: openshift-operators-redhat
      labels:
        openshift.io/cluster-monitoring: "true"
    spec:
      tenantID: infrastructure
      groups:
        - name: LokiOperatorHighReconciliationError
          rules:
            - alert: HighPercentageError
              expr: |
                sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"} |= "error" [1m])) by (job)
                  /
                sum(rate({kubernetes_namespace_name="openshift-operators-redhat", kubernetes_pod_name=~"loki-operator-controller-manager.*"}[1m])) by (job)
                  > 0.01
              for: 10s
              labels:
                severity: critical
              annotations:
                summary: High Loki Operator Reconciliation Errors
                description: High Loki Operator Reconciliation Errors

- metadata.namespace: The namespace must have a label that matches the LokiStack spec.rules.namespaceSelector configuration.
- metadata.labels: The labels must match the LokiStack spec.rules.selector configuration.
- spec.tenantID: For infrastructure tenants, create the AlertingRule only in openshift-*, kube-*, or default namespaces.
- expr: The value of kubernetes_namespace_name must match the metadata.namespace value.
- labels.severity: Set this mandatory field to critical, warning, or info.
- annotations.summary: This field is mandatory.
- annotations.description: This field is mandatory.
The following example shows an application AlertingRule CR:

    apiVersion: loki.grafana.com/v1
    kind: AlertingRule
    metadata:
      name: app-user-workload
      namespace: app-ns
      labels:
        openshift.io/cluster-monitoring: "true"
    spec:
      tenantID: application
      groups:
        - name: AppUserWorkloadHighError
          rules:
            - alert: <alert_name>
              expr: |
                sum(rate({kubernetes_namespace_name="app-ns", kubernetes_pod_name=~"podName.*"} |= "error" [1m])) by (job)
              for: 10s
              labels:
                severity: critical
              annotations:
                summary: This is an example summary.
                description: This is an example description.

- metadata.namespace: The namespace must have a label that matches the LokiStack spec.rules.namespaceSelector configuration.
- metadata.labels: The labels must match the LokiStack spec.rules.selector configuration.
- expr: The value of kubernetes_namespace_name must match the metadata.namespace value.
- labels.severity: Set this mandatory field to critical, warning, or info.
- annotations.summary: Provide a summary of the rule. This field is mandatory.
- annotations.description: Provide a detailed description of the rule. This field is mandatory.
Apply the AlertingRule CR:
$ oc apply -f <filename>.yaml