Chapter 8. Machine Config Daemon metrics overview


The Machine Config Daemon is a part of the Machine Config Operator. It runs on every node in the cluster. The Machine Config Daemon manages configuration changes and updates on each of the nodes.

8.1. Understanding Machine Config Daemon metrics

Beginning with OpenShift Container Platform 4.3, the Machine Config Daemon provides a set of metrics. These metrics can be accessed using the Prometheus Cluster Monitoring stack.

The following table describes this set of metrics. Some entries contain commands for getting specific logs. However, the most comprehensive set of logs is available using the oc adm must-gather command.

Note

Metrics marked with * in the Name and Description columns represent serious errors that might cause performance problems. Such problems might prevent updates and upgrades from proceeding.
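
As a rough illustration of how the metrics marked with * could be checked from saved samples, the following shell sketch filters Prometheus text-format lines for those metrics and keeps only the ones with a non-zero value. It is a sketch under assumptions, not part of the product: the function name mcd_nonzero_errors is hypothetical, each sample line is assumed to end with its value (no trailing timestamp), and a non-zero value is assumed to signal a recorded failure. How the samples are obtained from the monitoring stack is left out.

```shell
# Hypothetical helper: read Prometheus text-format samples on stdin and print
# only the starred MCD error metrics whose value is non-zero.
# Assumption: every sample line ends with its value (no trailing timestamp).
mcd_nonzero_errors() {
  # Comment lines start with '#', so the anchored pattern skips them.
  awk '/^mcd_(drain_err|pivot_err|reboot_err|kubelet_state)/ && ($NF + 0) != 0'
}
```

For example, mcd_nonzero_errors < samples.txt would print only the failing series, where samples.txt is a placeholder for a file of exported samples.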

Table 8.1. MCO metrics
Name | Format | Description | Notes

mcd_host_os_and_version

[]string{"os", "version"}

Shows the operating system that the MCD is running on, such as RHCOS or RHEL. For RHCOS, the version is also provided.

 

mcd_drain_err*

 

Logs errors received during a failed drain. *

While drains might need multiple tries to succeed, terminal failed drains prevent updates from proceeding. The drain_time metric, which shows how much time the drain took, might help with troubleshooting.

For further investigation, see the logs by running:

$ oc logs -f -n openshift-machine-config-operator machine-config-daemon-<hash> -c machine-config-daemon

mcd_pivot_err*

[]string{"err", "node", "pivot_target"}

Logs errors encountered during pivot. *

Pivot errors might prevent OS upgrades from proceeding.

For further investigation, run this command to see the logs from the machine-config-daemon container:

$ oc logs -f -n openshift-machine-config-operator machine-config-daemon-<hash> -c machine-config-daemon

mcd_state

[]string{"state", "reason"}

State of the Machine Config Daemon for the indicated node. Possible states are "Done", "Working", and "Degraded". For "Degraded", the reason is included.

For further investigation, see the logs by running:

$ oc logs -f -n openshift-machine-config-operator machine-config-daemon-<hash> -c machine-config-daemon

mcd_kubelet_state*

 

Logs kubelet health failures. *

This is expected to be empty, with a failure count of 0. If the failure count exceeds 2, an error is reported indicating that the threshold has been exceeded, which points to a possible issue with the health of the kubelet.

For further investigation, run this command to access the node and see its kubelet logs:

$ oc debug node/<node> -- chroot /host journalctl -u kubelet

mcd_reboot_err*

[]string{"message", "err", "node"}

Logs the failed reboots and the corresponding errors. *

This is expected to be empty, which indicates a successful reboot.

For further investigation, see the logs by running:

$ oc logs -f -n openshift-machine-config-operator machine-config-daemon-<hash> -c machine-config-daemon

mcd_update_state

[]string{"config", "err"}

Logs success or failure of configuration updates and the corresponding errors.

The expected value is the name of a rendered configuration, such as rendered-master-XXXX or rendered-worker-XXXX. If the update fails, an error is present.

For further investigation, see the logs by running:

$ oc logs -f -n openshift-machine-config-operator machine-config-daemon-<hash> -c machine-config-daemon
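
The log commands in the table refer to a pod named machine-config-daemon-<hash>, which differs on every node. One possible way to look up that pod for a given node is sketched below. The helper name mcd_pod_for_node is hypothetical, and the label k8s-app=machine-config-daemon is an assumption about how the daemon pods are labeled; verify it in your cluster before relying on it.

```shell
# Hypothetical helper: print the machine-config-daemon pod scheduled on a node.
# Assumption: the daemon pods carry the label k8s-app=machine-config-daemon.
# Usage: mcd_pod_for_node <node>
mcd_pod_for_node() {
  oc get pods -n openshift-machine-config-operator \
    -l k8s-app=machine-config-daemon \
    --field-selector "spec.nodeName=$1" \
    -o name
}
```

The output has the form pod/<name>, so it could be passed straight to the log command, for example: oc logs -n openshift-machine-config-operator "$(mcd_pod_for_node <node>)" -c machine-config-daemon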

© 2024 Red Hat, Inc.