Chapter 9. Deploying machine health checks


You can configure and deploy a machine health check to automatically repair damaged machines in a machine pool.

Important

Machine health checks are a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.

Important

This process does not apply to clusters where you manually provisioned the machines. You can use the advanced machine management and scaling capabilities only in clusters where the machine API is operational.

Prerequisites

  • Enable a FeatureGate so you can access Technology Preview features.

    Note

    Turning on Technology Preview features cannot be undone and prevents upgrades.
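As a rough illustration of this prerequisite, the following sketch writes a FeatureGate definition to a file. It assumes that the cluster-wide FeatureGate resource is named cluster and that the TechPreviewNoUpgrade feature set is the one that enables Technology Preview features. The file name featuregate.yml is hypothetical. Check the FeatureGate documentation for your release before you apply anything, because the change cannot be undone.

```shell
# Write a FeatureGate definition to a local file. Nothing is applied to the cluster here.
# Assumptions: the resource name "cluster" and the feature set "TechPreviewNoUpgrade".
# The file name featuregate.yml is hypothetical.
cat > featuregate.yml <<'EOF'
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
EOF
```

You would then apply the file with oc apply -f featuregate.yml, keeping in mind that this prevents upgrades.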

9.1. About MachineHealthChecks

A MachineHealthCheck automatically repairs unhealthy Machines in a particular MachinePool.

To monitor machine health, you create a resource to define the configuration for a controller. You set a condition to check for, such as staying in the NotReady status for 15 minutes or displaying a permanent condition in the node-problem-detector, and a label for the set of machines to monitor.

Note

You cannot apply a MachineHealthCheck to a machine with the master role.

The controller that observes a MachineHealthCheck resource checks for the status that you defined. If a machine fails the health check, the machine is automatically deleted and a new one is created to take its place. When a machine is deleted, you see a machine deleted event. To limit the disruptive impact of machine deletion, the controller drains and deletes only one node at a time.

To stop the check, you remove the resource.
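Removing the resource could look like the following sketch. It assumes the resource is named example and lives in the openshift-machine-api namespace, which are the names used in the sample resource in this chapter. Both variable names are hypothetical. The guard around oc is only there so that the snippet fails gracefully on a workstation without the CLI.

```shell
# Hypothetical values taken from the sample resource; replace them with your own.
MHC_NAME=example
MHC_NAMESPACE=openshift-machine-api

# Delete the MachineHealthCheck so that the controller stops checking the machines.
if command -v oc >/dev/null 2>&1; then
  oc delete machinehealthcheck "$MHC_NAME" -n "$MHC_NAMESPACE"
else
  echo "oc not found; install the oc command line interface first"
fi
```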

9.2. Sample MachineHealthCheck resource

The MachineHealthCheck resource resembles the following YAML file:

MachineHealthCheck

apiVersion: healthchecking.openshift.io/v1alpha1
kind: MachineHealthCheck
metadata:
  name: example # (1)
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: <label> # (2)
      machine.openshift.io/cluster-api-machine-type: <label> # (3)
      machine.openshift.io/cluster-api-machineset: <cluster_name>-<label>-<zone> # (4)

(1) Specify the name of the MachineHealthCheck to deploy. Include the name of the MachinePool to track.
(2) (3) Specify a label for the MachinePool that you want to check.
(4) Specify the MachineSet to track in <cluster_name>-<label>-<zone> format. For example, prod-node-us-east-1a.

9.3. Creating a MachineHealthCheck resource

You can create a MachineHealthCheck resource for all MachinePools in your cluster except the master pool.

Prerequisites

  • Install the oc command line interface.

Procedure

  1. Create a healthcheck.yml file that contains the definition of your MachineHealthCheck.
  2. Apply the healthcheck.yml file to your cluster:

    $ oc apply -f healthcheck.yml
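For step 1, the file can be created from the shell. The following is a minimal sketch with hypothetical values: example as the resource name, node as the pool label, and prod-node-us-east-1a as the MachineSet. Substitute the values from your own cluster.

```shell
# Write a MachineHealthCheck definition to healthcheck.yml.
# All values are hypothetical: "example" as the resource name, "node" as the
# pool label, and "prod-node-us-east-1a" as the MachineSet to track.
cat > healthcheck.yml <<'EOF'
apiVersion: healthchecking.openshift.io/v1alpha1
kind: MachineHealthCheck
metadata:
  name: example
  namespace: openshift-machine-api
spec:
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machine-role: node
      machine.openshift.io/cluster-api-machine-type: node
      machine.openshift.io/cluster-api-machineset: prod-node-us-east-1a
EOF
```

After you apply the file, running oc get machinehealthcheck -n openshift-machine-api should list the new resource, assuming the resource type is registered under that name in your cluster.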