Chapter 3. Greenboot workload health check scripts

Greenboot health check scripts are helpful on edge devices where direct serviceability is either limited or non-existent. If you installed the microshift-greenboot RPM package, you can also create health check scripts that assess the health of your workloads and applications. These additional health check scripts are useful components of software problem checks and automatic system rollbacks.

A Red Hat build of MicroShift health check script is included in the microshift-greenboot RPM. You can also create your own health check scripts based on the workloads you are running. For example, you can write one that verifies that a service has started.

3.1. How workload health check scripts work

The workload or application health check script described in this tutorial uses the Red Hat build of MicroShift health check functions that are available in the /usr/share/microshift/functions/greenboot.sh file. This enables you to reuse procedures already implemented for the Red Hat build of MicroShift core services.

The script starts by checking that the basic functions of the workload are operating as expected. To run the script successfully:

Execute the script from a root user account.
Enable the Red Hat build of MicroShift service.

The health check performs the following actions:

Gets the wait timeout for the current boot cycle, which is passed to the wait_for function.
Calls the namespace_images_downloaded function to wait until pod images are available.
Calls the namespace_pods_ready function to wait until pods are ready.
Calls the namespace_pods_not_restarting function to verify pods are not restarting.

Note

Restarting pods can indicate a crash loop.
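
The wait steps above follow a poll-until-success pattern. As a rough illustration only, the following sketch shows that general pattern with an attempt counter. The names `retry_until` and `RETRY_INTERVAL_SECS` are hypothetical, and this is not the implementation of `wait_for` in /usr/share/microshift/functions/greenboot.sh.

```shell
# Hypothetical helper that illustrates polling; not the greenboot.sh wait_for function.
# Usage: retry_until <max_attempts> <command> [args...]
retry_until() {
    attempts=$1
    shift
    n=0
    while [ "$n" -lt "$attempts" ]; do
        # Run the check; stop polling as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Pause between attempts (defaults to one second).
        sleep "${RETRY_INTERVAL_SECS:-1}"
    done
    # Every attempt failed.
    return 1
}

# Example: a check that always succeeds returns on the first attempt.
retry_until 3 true && echo "condition met"
```

A real health check would pass a function that queries the workload instead of `true`, and would exit with an error when the helper returns non-zero.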

3.2. Included greenboot health checks

Health check scripts are available in /usr/lib/greenboot/check, a read-only directory in RPM-OSTree systems. The following health checks are included with the greenboot-default-health-checks framework.

Check if repository URLs are still DNS resolvable:
This script is under /usr/lib/greenboot/check/required.d/01_repository_dns_check.sh and checks that DNS queries for the repository URLs still succeed.
Check if update platforms are still reachable:
This script is under /usr/lib/greenboot/check/wanted.d/01_update_platform_check.sh and tries to connect and get a 2XX or 3XX HTTP code from the update platforms defined in /etc/ostree/remotes.d.
Check if the current boot has been triggered by the hardware watchdog:
This script is under /usr/lib/greenboot/check/required.d/02_watchdog.sh and checks whether the current boot has been watchdog-triggered or not.
- If the watchdog-triggered reboot occurs within the grace period, the current boot is marked as red. Greenboot does not trigger a rollback to the previous deployment.
- If the watchdog-triggered reboot occurs after the grace period, the current boot is not marked as red. Greenboot does not trigger a rollback to the previous deployment.
- A 24-hour grace period is enabled by default. You can either disable the grace period by setting GREENBOOT_WATCHDOG_CHECK_ENABLED to false in /etc/greenboot/greenboot.conf, or change its length by setting the GREENBOOT_WATCHDOG_GRACE_PERIOD=number_of_hours variable in the same file.
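
As an illustration of the two settings named above, a greenboot.conf fragment might look like the following. The value 12 is an arbitrary example chosen for this sketch, not a recommendation.

```shell
# /etc/greenboot/greenboot.conf (fragment)
# Keep the watchdog check enabled; set to false to disable it.
GREENBOOT_WATCHDOG_CHECK_ENABLED=true
# Grace period in hours; 12 is an illustrative value, the default is 24.
GREENBOOT_WATCHDOG_GRACE_PERIOD=12
```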

3.3. How to create a health check script for your application

You can create workload or application health check scripts in the text editor of your choice using the example in this documentation. Save the scripts in the /etc/greenboot/check/required.d directory. When a script in the /etc/greenboot/check/required.d directory exits with an error, greenboot triggers a reboot in an attempt to heal the system.

Note

Any script in the /etc/greenboot/check/required.d directory triggers a reboot if it exits with an error.
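
Because the exit status is what greenboot acts on, even a very small script can serve as a required check. The following is a minimal sketch that verifies a systemd service has started. The unit name `my-app.service` and the function name `service_started` are hypothetical placeholders.

```shell
# Hypothetical unit name; replace it with the unit that your workload provides.
WORKLOAD_SERVICE="my-app.service"

# Return 0 when the given unit is active, non-zero otherwise.
service_started() {
    systemctl is-active --quiet "$1"
}

# In a real script saved under /etc/greenboot/check/required.d, finish with:
#   service_started "${WORKLOAD_SERVICE}" || exit 1
```

The final call is left as a comment so that the sketch can be pasted into a shell without ending the session. In a deployed script, the non-zero exit is what signals the failure to greenboot.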

If your health check logic requires any post-check steps, you can also create additional scripts and save them in the relevant greenboot directories. For example:

You can place shell scripts you want to run after a boot has been declared successful in /etc/greenboot/green.d.
You can place shell scripts you want to run after a boot has been declared failed in /etc/greenboot/red.d. For example, if you have steps to heal the system before restarting, you can create scripts for your use case and place them in the /etc/greenboot/red.d directory.
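
As a sketch of a post-check step, a script placed in /etc/greenboot/red.d could record that the boot failed before the system restarts. The file name in the comment and the function name `report_failed_boot` are hypothetical, and the message text is only an example.

```shell
# Hypothetical file: /etc/greenboot/red.d/50_my_app_failed.sh

# Print a timestamped message that marks the failed boot.
report_failed_boot() {
    echo "my-app: boot declared failed at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

report_failed_boot
```

Any healing steps for your use case would follow the message in the same script.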

3.3.1. About the workload health check script example

The following example uses the Red Hat build of MicroShift health check script as a template. You can use this example with the provided libraries as a guide for creating basic health check scripts for your applications.

3.3.1.1. Basic prerequisites for creating a health check script

The workload must be installed.
You must have root access.

3.3.1.2. Example and functional requirements

You can start with the following example health check script. Modify it for your use case. In your workload health check script, you must complete the following minimum steps:

Set the environment variables.
Define the user workload namespaces.
List the expected pod count.
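
The namespace list and the pod count list are read by index, so each namespace needs a matching count. The following bash sketch fills both arrays with hypothetical values (`my-app`, `my-db`, and their counts) and adds a length check that is not part of the documented example.

```shell
# Hypothetical namespaces and pod counts, for illustration only.
PODS_NS_LIST=(my-app my-db)
PODS_CT_LIST=(2 1)

# Both arrays are indexed together, so they must have the same length.
if [ "${#PODS_NS_LIST[@]}" -ne "${#PODS_CT_LIST[@]}" ]; then
    echo "PODS_NS_LIST and PODS_CT_LIST must have the same number of entries" >&2
    exit 1
fi

# Show which pod count belongs to which namespace.
for i in "${!PODS_NS_LIST[@]}"; do
    echo "namespace=${PODS_NS_LIST[$i]} expected_pods=${PODS_CT_LIST[$i]}"
done
```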

Important

Choose a name prefix for your application health check script that ensures it runs after the 40_microshift_running_check.sh script, which implements the Red Hat build of MicroShift health check procedure for its core services.
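
One way to sanity-check a candidate file name is to sort it against the core script name. This sketch assumes that scripts run in lexical file name order, which is what the name prefix advice implies. The name `50_my_app_check.sh` is a hypothetical example.

```shell
# Hypothetical file name for the workload check.
workload_script="50_my_app_check.sh"
core_script="40_microshift_running_check.sh"

# Sort both names lexically and keep the one that comes last.
last=$(printf '%s\n' "$workload_script" "$core_script" | LC_ALL=C sort | tail -n 1)

if [ "$last" = "$workload_script" ]; then
    echo "OK: ${workload_script} sorts after ${core_script}"
else
    echo "WARNING: ${workload_script} sorts before ${core_script}"
fi
```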

Example workload health check script

#!/bin/bash
set -e

SCRIPT_NAME=$(basename "$0")
# Update the following two lines with at least one namespace and the pod counts that are specific to your workloads. Use the Kubernetes <namespace> where your workload is deployed.
PODS_NS_LIST=(<user_workload_namespace1> <user_workload_namespace2>)
PODS_CT_LIST=(<user_workload_namespace1_pod_count> <user_workload_namespace2_pod_count>)

# Set greenboot to read and execute the workload health check functions library.
source /usr/share/microshift/functions/greenboot.sh

# Set the exit handler to log the exit status.
trap 'script_exit' EXIT

# Set the script exit handler to log a `FAILURE` or `FINISHED` message depending on the exit status of the last command.
# args: None
# return: None
function script_exit() {
    [ "$?" -ne 0 ] && status=FAILURE || status=FINISHED
    echo $status
}

# Stop the script if the user running it is not 'root'.
if [ "$(id -u)" -ne 0 ] ; then
    echo "The '${SCRIPT_NAME}' script must be run with the 'root' user privileges"
    exit 1
fi

echo "STARTED"

# Set the script to stop without reporting an error if the MicroShift service is not enabled.
if [ "$(systemctl is-enabled microshift.service 2>/dev/null)" != "enabled" ] ; then
    echo "MicroShift service is not enabled. Exiting..."
    exit 0
fi

# Set the wait timeout for the current check based on the boot counter.
WAIT_TIMEOUT_SECS=$(get_wait_timeout)

# Set the script to wait for the pod images to be downloaded.
for i in ${!PODS_NS_LIST[@]}; do
    CHECK_PODS_NS=${PODS_NS_LIST[$i]}

    echo "Waiting ${WAIT_TIMEOUT_SECS}s for pod image(s) from the ${CHECK_PODS_NS} namespace to be downloaded"
    wait_for ${WAIT_TIMEOUT_SECS} namespace_images_downloaded
done

# Set the script to wait for pods to enter ready state.
for i in ${!PODS_NS_LIST[@]}; do
    CHECK_PODS_NS=${PODS_NS_LIST[$i]}
    CHECK_PODS_CT=${PODS_CT_LIST[$i]}

    echo "Waiting ${WAIT_TIMEOUT_SECS}s for ${CHECK_PODS_CT} pod(s) from the ${CHECK_PODS_NS} namespace to be in 'Ready' state"
    wait_for ${WAIT_TIMEOUT_SECS} namespace_pods_ready
done

# Verify that pods are not restarting, which could indicate a crash loop.
for i in ${!PODS_NS_LIST[@]}; do
    CHECK_PODS_NS=${PODS_NS_LIST[$i]}

    echo "Checking pod restart count in the ${CHECK_PODS_NS} namespace"
    namespace_pods_not_restarting ${CHECK_PODS_NS}
done

3.4. Testing a workload health check script

Prerequisites

You have root access.
You have installed a workload.
You have created a health check script for the workload.
The Red Hat build of MicroShift service is enabled.

Procedure

To test that greenboot is running a health check script file, reboot the host by running the following command:
```
sudo reboot
```

Examine the output of greenboot health checks by running the following command:

sudo journalctl -o cat -u greenboot-healthcheck.service

Note

Red Hat build of MicroShift core service health checks run before the workload health checks.

Example output

GRUB boot variables:
boot_success=0
boot_indeterminate=0
Greenboot variables:
GREENBOOT_WATCHDOG_CHECK_ENABLED=true
...
...
FINISHED
Script '40_microshift_running_check.sh' SUCCESS
Running Wanted Health Check Scripts...
Finished greenboot Health Checks Runner.
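
On a host with many checks, the per-script result lines can be easier to read on their own. The following sketch filters journal text for lines shaped like the `Script '...' SUCCESS` line in the example output. The function name `script_results` is hypothetical, and the pattern assumes that other result lines start the same way.

```shell
# Read greenboot journal text on stdin and print only the per-script result lines.
script_results() {
    grep -E "^Script '[^']+' "
}

# Usage on a real host:
#   sudo journalctl -o cat -u greenboot-healthcheck.service | script_results
```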

