Chapter 22. Resource monitoring operations
To ensure that resources remain healthy, you can add a monitoring operation to a resource’s definition. If you do not specify a monitoring operation for a resource, by default the pcs command will create a monitoring operation, with an interval that is determined by the resource agent. If the resource agent does not provide a default monitoring interval, the pcs command will create a monitoring operation with an interval of 60 seconds.
The following table summarizes the properties of a resource monitoring operation.
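Field | Description |
---|---|
id | Unique name for the action. The system assigns this when you configure an operation. |
name | The action to perform. Common values: monitor, start, stop. |
interval | If set to a nonzero value, a recurring operation is created that repeats at this frequency, in seconds. A nonzero value makes sense only when the action name is set to monitor. If set to zero, which is the default value, this parameter allows you to provide values to be used for operations created by the cluster. For example, if the interval is set to zero, the name of the operation is set to start, and the timeout value is set to 40, Pacemaker uses a timeout of 40 seconds when starting this resource. |
timeout | If the operation does not complete in the amount of time set by this parameter, abort the operation and consider it failed. The default value is the value of timeout if set with the pcs resource op defaults command, or 20 seconds if it is not set. The timeout value is not a delay of any kind, nor does the cluster wait the entire timeout period if the operation returns before the timeout period has completed. |
on-fail | The action to take if this action ever fails. Allowed values: ignore, block, stop, restart, fence, standby, demote. The default for the stop operation is fence when STONITH is enabled and block otherwise. All other operations default to restart. |
enabled | If false, the operation is treated as if it does not exist. Allowed values: true, false. |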
22.1. Configuring resource monitoring operations
You can configure monitoring operations when you create a resource with the following command.
pcs resource create resource_id standard:provider:type|type [resource_options] [op operation_action operation_options [operation_type operation_options]...]
For example, the following command creates an IPaddr2 resource with a monitoring operation. The new resource is called VirtualIP and has an IP address of 192.168.0.99 and a netmask of 24 on eth2. A monitoring operation will be performed every 30 seconds.
# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.99 cidr_netmask=24 nic=eth2 op monitor interval=30s
Alternatively, you can add a monitoring operation to an existing resource with the following command.
pcs resource op add resource_id operation_action [operation_properties]
Use the following command to delete a configured resource operation.
pcs resource op remove resource_id operation_name operation_properties
You must specify the exact operation properties to properly remove an existing operation.
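As a hedged illustration of the two syntax lines above, the following sketch wraps them in small shell functions. The function names add_monitor and remove_monitor, and the interval=60s and timeout=30s values, are assumptions chosen for illustration only; VirtualIP is the resource from the earlier example.

```shell
#!/bin/sh
# Hypothetical helpers; the names and property values are illustrative.

# Add a monitor operation with the given properties to resource $1.
add_monitor() {
    resource=$1; shift
    pcs resource op add "$resource" monitor "$@"
}

# Remove a monitor operation from resource $1. The properties passed here
# must match the configured operation exactly.
remove_monitor() {
    resource=$1; shift
    pcs resource op remove "$resource" monitor "$@"
}

# Usage on a cluster node:
#   add_monitor    VirtualIP interval=60s timeout=30s
#   remove_monitor VirtualIP interval=60s timeout=30s
```

Passing the same property list to both helpers is one way to satisfy the exact-match requirement when removing an operation that a script added earlier.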
To change the values of a monitoring option, you can update the resource. For example, you can create a VirtualIP with the following command.
# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.99 cidr_netmask=24 nic=eth2
By default, this command creates these operations.
Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
stop interval=0s timeout=20s (VirtualIP-stop-timeout-20s)
monitor interval=10s timeout=20s (VirtualIP-monitor-interval-10s)
To change the timeout value of the stop operation, execute the following command.
# pcs resource update VirtualIP op stop interval=0s timeout=40s
# pcs resource config VirtualIP
Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=192.168.0.99 cidr_netmask=24 nic=eth2
Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
monitor interval=10s timeout=20s (VirtualIP-monitor-interval-10s)
stop interval=0s timeout=40s (VirtualIP-name-stop-interval-0s-timeout-40s)
22.2. Configuring global resource operation defaults
You can change the default value of a resource operation for all resources with the pcs resource op defaults update command.
The following command sets a global default timeout value of 240 seconds for all resource operations.
# pcs resource op defaults update timeout=240s
The original pcs resource op defaults name=value command, which set resource operation defaults for all resources in previous releases, remains supported unless there is more than one set of defaults configured. However, pcs resource op defaults update is now the preferred version of the command.
22.2.1. Overriding resource-specific operation values
Note that a cluster resource will use the global default only when the option is not specified in the cluster resource definition. By default, resource agents define the timeout option for all operations. For the global operation timeout value to be honored, you must either create the cluster resource without an explicit timeout option or remove the timeout option by updating the cluster resource, as in the following command.
# pcs resource update VirtualIP op monitor interval=10s
For example, after setting a global default timeout value of 240 seconds and updating the cluster resource VirtualIP to remove the timeout value for the monitor operation, the resource VirtualIP has timeout values for the start, stop, and monitor operations of 20s, 40s, and 240s, respectively. The global default timeout value is applied here only to the monitor operation, where the default timeout option was removed by the previous command.
# pcs resource config VirtualIP
Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=192.168.0.99 cidr_netmask=24 nic=eth2
Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
monitor interval=10s (VirtualIP-monitor-interval-10s)
stop interval=0s timeout=40s (VirtualIP-name-stop-interval-0s-timeout-40s)
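The fallback rule described above can be sketched in a few lines of shell. This is a minimal illustration, not part of pcs: the function names op_timeout and effective_timeout are hypothetical, the global default is passed in by hand, and the parser assumes the one-operation-per-line layout shown in the output above, which other pcs versions may not use.

```shell
#!/bin/sh
# Sketch: work out which timeout applies to an operation.
# Assumes the `pcs resource config` layout shown in this section.

# Print the timeout set on operation $1 in the config text read from stdin,
# or nothing if that operation has no explicit timeout.
op_timeout() {
    awk -v op="$1" '
        { sub(/^[ \t]*Operations:[ \t]*/, "") }   # strip the label on the first operation line
        $1 == op {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^timeout=/) { print substr($i, 9); exit }
        }'
}

# Print the explicit timeout of operation $1 if there is one,
# otherwise fall back to the global default passed as $2.
effective_timeout() {
    explicit=$(op_timeout "$1")
    echo "${explicit:-$2}"
}

# Sample input copied from the output above.
config='Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=192.168.0.99 cidr_netmask=24 nic=eth2
Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
monitor interval=10s (VirtualIP-monitor-interval-10s)
stop interval=0s timeout=40s (VirtualIP-name-stop-interval-0s-timeout-40s)'

for op in start stop monitor; do
    printf '%s: %s\n' "$op" "$(printf '%s\n' "$config" | effective_timeout "$op" 240s)"
done
```

With a global default of 240s, the loop reports 20s for start, 40s for stop, and 240s for monitor, matching the values described above.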
22.2.2. Changing the default value of a resource operation for sets of resources
You can create multiple sets of resource operation defaults with the pcs resource op defaults set create command, which allows you to specify a rule that contains resource and operation expressions. All of the rule expressions supported by Pacemaker are allowed.
With this command, you can configure a default resource operation value for all resources of a particular type. For example, it is now possible to configure implicit podman resources created by Pacemaker when bundles are in use.
The following command sets a default timeout value of 90s for all operations for all podman resources. In this example, ::podman means a resource of any class, any provider, of type podman.
The id option, which names the set of resource operation defaults, is not mandatory. If you do not set this option, pcs will generate an ID automatically. Setting this value allows you to provide a more descriptive name.
# pcs resource op defaults set create id=podman-timeout meta timeout=90s rule resource ::podman
The following command sets a default timeout value of 120s for the stop operation for all resources.
# pcs resource op defaults set create id=stop-timeout meta timeout=120s rule op stop
It is possible to set the default timeout value for a specific operation for all resources of a particular type. The following example sets a default timeout value of 120s for the stop operation for all podman resources.
# pcs resource op defaults set create id=podman-stop-timeout meta timeout=120s rule resource ::podman and op stop
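The pattern in the last command can be parameterized. The following sketch assumes a hypothetical helper named set_type_op_timeout and an ID scheme of type-operation-timeout; neither is part of pcs, and the rule it builds has the same shape as the podman example above.

```shell
#!/bin/sh
# Hypothetical helper: create a defaults set that applies timeout $3 to
# operation $2 on every resource of type $1 (any class, any provider).
set_type_op_timeout() {
    pcs resource op defaults set create id="$1-$2-timeout" \
        meta timeout="$3" rule resource "::$1" and op "$2"
}

# Usage on a cluster node; this builds the same command as the example above:
#   set_type_op_timeout podman stop 120s
```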
22.2.3. Displaying currently configured resource operation default values
The pcs resource op defaults [config] command displays a list of currently configured default values for resource operations, including any rules you specified. In Red Hat Enterprise Linux 9.5 and later, you can display the output of this command in text, JSON, and command formats.
- Specifying --output-format=text displays the configured resource operation defaults in plain text format, which is the default value for this option.
- Specifying --output-format=cmd displays the pcs resource op defaults commands created from the current cluster defaults configuration. You can use these commands to re-create configured resource operation defaults on a different system.
- Specifying --output-format=json displays the configured resource operation defaults in JSON format, which is suitable for machine parsing.
The following examples show the three different output formats of the pcs resource op defaults config command after the default resource operation values for any ocf:pacemaker:podman resource were reset with this example command:
# pcs resource op defaults set create id=op-set-1 score=100 meta timeout=30s rule op monitor and resource ocf:pacemaker:podman
Warning: Defaults do not apply to resources which override them with their own defined values
This example displays the configured resource operation default values in plain text.
# pcs resource op defaults config
Meta Attrs: op-set-1 score=100
timeout=30s
Rule: boolean-op=and score=INFINITY
Expression: op monitor
Expression: resource ocf:pacemaker:podman
This example displays the pcs resource op defaults commands created from the current cluster defaults configuration.
# pcs resource op defaults config --output-format=cmd
pcs -- resource op defaults set create id=op-set-1 score=100 \
meta timeout=30s \
rule 'op monitor and resource ocf:pacemaker:podman'
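Because the cmd format prints runnable commands, one possible way to carry the defaults to another system is to save that output to a file and run it there. The sketch below assumes a hypothetical helper name, export_op_defaults, and a file name, op-defaults.sh, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: save the commands that re-create the current
# resource operation defaults into the file named by $1.
export_op_defaults() {
    pcs resource op defaults config --output-format=cmd > "$1"
}

# On the source cluster:
#   export_op_defaults op-defaults.sh
# On the target cluster, after reviewing the file:
#   sh op-defaults.sh
```

Reviewing the file before running it is a reasonable precaution, because the commands change the cluster configuration on the target system.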
This example displays the configured resource operation default values in JSON format.
# pcs resource op defaults config --output-format=json
{"instance_attributes": [], "meta_attributes": [{"id": "op-set-1", "options": {"score": "100"}, "rule": {"id": "op-set-1-rule", "type": "RULE", "in_effect": "UNKNOWN", "options": {"boolean-op": "and", "score": "INFINITY"}, "date_spec": null, "duration": null, "expressions": [{"id": "op-set-1-rule-op-monitor", "type": "OP_EXPRESSION", "in_effect": "UNKNOWN", "options": {"name": "monitor"}, "date_spec": null, "duration": null, "expressions": [], "as_string": "op monitor"}, {"id": "op-set-1-rule-rsc-ocf-pacemaker-podman", "type": "RSC_EXPRESSION", "in_effect": "UNKNOWN", "options": {"class": "ocf", "provider": "pacemaker", "type": "podman"}, "date_spec": null, "duration": null, "expressions": [], "as_string": "resource ocf:pacemaker:podman"}], "as_string": "op monitor and resource ocf:pacemaker:podman"}, "nvpairs": [{"id": "op-set-1-timeout", "name": "timeout", "value": "30s"}]}]}
22.3. Configuring multiple monitoring operations
You can configure a single resource with as many monitor operations as a resource agent supports. In this way, you can run a superficial health check every minute and progressively more intensive checks at longer intervals.
When configuring multiple monitor operations, you must ensure that no two operations are performed at the same interval.
To configure additional monitoring operations for a resource that supports more in-depth checks at different levels, you add an OCF_CHECK_LEVEL=n option.
For example, if you configure the following IPaddr2 resource, by default this creates a monitoring operation with an interval of 10 seconds and a timeout value of 20 seconds.
# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.99 cidr_netmask=24 nic=eth2
If the Virtual IP supports a different check with a depth of 10, the following command causes Pacemaker to perform the more advanced monitoring check every 60 seconds in addition to the normal Virtual IP check every 10 seconds. (As noted, you should not configure the additional monitoring operation with a 10-second interval as well.)
# pcs resource op add VirtualIP monitor interval=60s OCF_CHECK_LEVEL=10
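Before adding an operation like the one above, a script could check whether the chosen interval is already taken by another monitor operation. The following is a minimal sketch under stated assumptions: the function name monitor_interval_in_use is hypothetical, the parser expects the one-operation-per-line layout of pcs resource config shown earlier in this chapter, and the comparison is purely textual, so 60s and 1min are treated as different values.

```shell
#!/bin/sh
# Succeed (exit 0) if a monitor operation in the config text on stdin
# already uses interval $1; fail (exit 1) otherwise.
monitor_interval_in_use() {
    awk -v want="interval=$1" '
        { sub(/^[ \t]*Operations:[ \t]*/, "") }   # strip the label on the first operation line
        $1 == "monitor" {
            for (i = 2; i <= NF; i++)
                if ($i == want) found = 1
        }
        END { exit !found }'
}

# Sample input modeled on the default operations of the VirtualIP resource.
config='Operations: start interval=0s timeout=20s (VirtualIP-start-timeout-20s)
stop interval=0s timeout=20s (VirtualIP-stop-timeout-20s)
monitor interval=10s timeout=20s (VirtualIP-monitor-interval-10s)'

for interval in 10s 60s; do
    if printf '%s\n' "$config" | monitor_interval_in_use "$interval"; then
        echo "$interval: already used by a monitor operation"
    else
        echo "$interval: free"
    fi
done

# On a cluster node, the check could guard the command above:
#   pcs resource config VirtualIP | monitor_interval_in_use 60s ||
#       pcs resource op add VirtualIP monitor interval=60s OCF_CHECK_LEVEL=10
```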