Red Hat Advanced Cluster Management for Kubernetes
2.9
Observability
Chapter 4. Customizing observability configuration

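After you enable observability, customize the observability configuration to meet the specific needs of your environment.

To learn more about how to manage and view the cluster fleet data that the observability service collects, read the following sections:

Required access: Cluster administrator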

4.1. Creating custom rules

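Create custom rules for the observability installation by adding Prometheus recording rules and alerting rules to the observability resource.

To precompute expensive expressions, use recording rules. The results are saved as a new set of time series. With alerting rules, you can specify alert conditions based on how you want to send an alert to an external service.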

Note: When you update your custom rules, the observability-thanos-rule pods restart automatically.

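Define custom rules with Prometheus to create alert conditions and send notifications to an external messaging service. View the following examples of custom rules:

Create a custom alert rule. Create a config map named thanos-ruler-custom-rules in the open-cluster-management-observability namespace. You must name the key custom_rules.yaml, as shown in the following example. You can create multiple rules in the configuration.

Create a custom alert rule that notifies you when CPU usage passes a defined value. Your YAML might resemble the following content: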

data:
  custom_rules.yaml: |
    groups:
      - name: cluster-health
        rules:
        - alert: ClusterCPUHealth-jb
          annotations:
            summary: Notify when CPU utilization on a cluster is greater than the defined utilization limit
            description: "The cluster has a high CPU usage: {{ $value }} core for {{ $labels.cluster }} {{ $labels.clusterID }}."
          expr: |
            max(cluster:cpu_usage_cores:sum) by (clusterID, cluster, prometheus) > 0
          for: 5s
          labels:
            cluster: "{{ $labels.cluster }}"
            prometheus: "{{ $labels.prometheus }}"
            severity: critical

The default alert rules are in the thanos-ruler-default-rules config map in the open-cluster-management-observability namespace.
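
The alert rule example shows only the data section of the config map. As a minimal sketch, assuming the standard Kubernetes ConfigMap layout, the full manifest might look like the following. The name, namespace, and key come from the text above, and the rule body is the example rule unchanged:

```yaml
# Sketch of a complete manifest for the custom alert rule.
# apiVersion, kind, and metadata follow the standard ConfigMap layout;
# name and namespace are the values required by the text above.
apiVersion: v1
kind: ConfigMap
metadata:
  name: thanos-ruler-custom-rules
  namespace: open-cluster-management-observability
data:
  # The key must be named custom_rules.yaml.
  custom_rules.yaml: |
    groups:
      - name: cluster-health
        rules:
        - alert: ClusterCPUHealth-jb
          annotations:
            summary: Notify when CPU utilization on a cluster is greater than the defined utilization limit
            description: "The cluster has a high CPU usage: {{ $value }} core for {{ $labels.cluster }} {{ $labels.clusterID }}."
          expr: |
            max(cluster:cpu_usage_cores:sum) by (clusterID, cluster, prometheus) > 0
          for: 5s
          labels:
            cluster: "{{ $labels.cluster }}"
            prometheus: "{{ $labels.prometheus }}"
            severity: critical
```

If the manifest is saved to a file, for example custom-rules.yaml (a hypothetical file name), it could be applied with oc apply -f custom-rules.yaml. The threshold of 0 and the 5s duration are copied from the example, so the alert fires almost as soon as any CPU usage is reported. A real deployment would likely use higher values.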

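Create a custom recording rule in the thanos-ruler-custom-rules config map. Create a recording rule that gives you the sum of the container memory cache of a pod. Your YAML might resemble the following content: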

data:
  custom_rules.yaml: |
    groups:
      - name: container-memory
        rules:
        - record: pod:container_memory_cache:sum
          expr: sum(container_memory_cache{pod!=""}) BY (pod, container)

Note: After you modify the config map, the configuration reloads automatically. The reload is triggered by the config-reload in the observability-thanos-ruler sidecar.
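
Both examples above use the same custom_rules.yaml key in the same config map, so applying the second example on its own would replace the first. As a sketch of how the two might coexist, assuming the usual Prometheus rule file layout of multiple named groups under groups, the alerting rule and the recording rule can sit side by side:

```yaml
data:
  custom_rules.yaml: |
    groups:
      # Alerting rule group from the first example.
      - name: cluster-health
        rules:
        - alert: ClusterCPUHealth-jb
          annotations:
            summary: Notify when CPU utilization on a cluster is greater than the defined utilization limit
            description: "The cluster has a high CPU usage: {{ $value }} core for {{ $labels.cluster }} {{ $labels.clusterID }}."
          expr: |
            max(cluster:cpu_usage_cores:sum) by (clusterID, cluster, prometheus) > 0
          for: 5s
          labels:
            cluster: "{{ $labels.cluster }}"
            prometheus: "{{ $labels.prometheus }}"
            severity: critical
      # Recording rule group from the second example.
      - name: container-memory
        rules:
        - record: pod:container_memory_cache:sum
          expr: sum(container_memory_cache{pod!=""}) BY (pod, container)
```

The comment lines inside the block are part of the rule file content. They are ordinary YAML comments there and do not affect the rules.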

To verify that the alert rules work, go to the Grafana dashboard, select the Explore page, and query ALERTS. An alert appears in Grafana only after it has been created.
