This documentation is for a release that is no longer maintained
See documentation for the latest supported version 3 or the latest supported version 4.3.4.3. 移动监控解决方案
默认情况下,部署包含 Prometheus、Grafana 和 AlertManager 的 Prometheus Cluster Monitoring 堆栈来提供集群监控功能。它由 Cluster Monitoring Operator 进行管理。若要将其组件移到其他机器上,需要创建并应用自定义配置映射。
流程
将以下
ConfigMap
定义保存为cluster-monitoring-configmap.yaml
文件:apiVersion: v1 kind: ConfigMap metadata: name: cluster-monitoring-config namespace: openshift-monitoring data: config.yaml: |+ alertmanagerMain: nodeSelector: node-role.kubernetes.io/infra: "" prometheusK8s: nodeSelector: node-role.kubernetes.io/infra: "" prometheusOperator: nodeSelector: node-role.kubernetes.io/infra: "" grafana: nodeSelector: node-role.kubernetes.io/infra: "" k8sPrometheusAdapter: nodeSelector: node-role.kubernetes.io/infra: "" kubeStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" telemeterClient: nodeSelector: node-role.kubernetes.io/infra: "" openshiftStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" thanosQuerier: nodeSelector: node-role.kubernetes.io/infra: ""
apiVersion: v1 kind: ConfigMap metadata: name: cluster-monitoring-config namespace: openshift-monitoring data: config.yaml: |+ alertmanagerMain: nodeSelector: node-role.kubernetes.io/infra: "" prometheusK8s: nodeSelector: node-role.kubernetes.io/infra: "" prometheusOperator: nodeSelector: node-role.kubernetes.io/infra: "" grafana: nodeSelector: node-role.kubernetes.io/infra: "" k8sPrometheusAdapter: nodeSelector: node-role.kubernetes.io/infra: "" kubeStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" telemeterClient: nodeSelector: node-role.kubernetes.io/infra: "" openshiftStateMetrics: nodeSelector: node-role.kubernetes.io/infra: "" thanosQuerier: nodeSelector: node-role.kubernetes.io/infra: ""
Copy to Clipboard Copied! 运行此配置映射会强制将监控堆栈的组件重新部署到基础架构节点。
应用新的配置映射:
oc create -f cluster-monitoring-configmap.yaml
$ oc create -f cluster-monitoring-configmap.yaml
Copy to Clipboard Copied! 观察监控 pod 移至新机器:
watch 'oc get pod -n openshift-monitoring -o wide'
$ watch 'oc get pod -n openshift-monitoring -o wide'
Copy to Clipboard Copied! 如果组件没有移到
infra
节点,请删除带有这个组件的 pod:oc delete pod -n openshift-monitoring <pod>
$ oc delete pod -n openshift-monitoring <pod>
Copy to Clipboard Copied! 已删除 pod 的组件在
infra
节点上重新创建。