Chapter 34. Interpreting resource agent OCF return codes


Pacemaker resource agents conform to the Open Cluster Framework (OCF) Resource Agent API. The following tables describe the OCF return codes and how Pacemaker interprets them.

The first thing the cluster does when an agent returns a code is to check the return code against the expected result. If the result does not match the expected value, then the operation is considered to have failed, and recovery action is initiated.
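As a minimal sketch of the check described above, the following Python fragment treats an operation as failed whenever the return code differs from the expected one. The helper name `operation_failed` and the constants are illustrative assumptions, not part of Pacemaker; the numeric values follow Table 34.2.

```python
# Illustrative constants; the values follow the OCF return codes in Table 34.2.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7
OCF_RUNNING_PROMOTED = 8


def operation_failed(actual_rc: int, expected_rc: int) -> bool:
    """Return True when the agent's return code differs from the expected one.

    Hypothetical helper: the cluster performs this comparison internally.
    """
    return actual_rc != expected_rc


# A result of 0 where 0 was expected is not a failure.
print(operation_failed(OCF_SUCCESS, OCF_SUCCESS))           # False
# A result of 0 where 8 (promoted) was expected counts as a failure.
print(operation_failed(OCF_SUCCESS, OCF_RUNNING_PROMOTED))  # True
```

The point of the sketch is that failure is defined relative to the expectation, not by the numeric value alone.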

For any invocation, resource agents must exit with a defined return code that informs the caller of the outcome of the invoked action.
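To illustrate what "a defined return code for every invocation" can look like, here is a toy sketch in Python. The function `run_action` and its `running` flag are hypothetical, and the sketch pretends that start and stop always succeed; it only shows that every path, including an unknown action, ends in a defined OCF code.

```python
# Illustrative constants; the values follow the OCF return codes in Table 34.2.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


def run_action(action: str, running: bool) -> int:
    """Map an action on a toy service to an OCF return code.

    Hypothetical helper: `running` stands in for a real status check.
    """
    if action in ("start", "stop"):
        # Toy assumption: these actions always succeed.
        return OCF_SUCCESS
    if action == "monitor":
        return OCF_SUCCESS if running else OCF_NOT_RUNNING
    # Any action the agent does not know still yields a defined code.
    return OCF_ERR_UNIMPLEMENTED


print(run_action("monitor", running=False))  # 7
print(run_action("reload", running=True))    # 3
```

An agent written as a script would pass the value returned by `run_action` to its process exit call so that the caller sees it as the exit status.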

There are three types of failure recovery, as described in the following table.

Table 34.1. Types of Recovery Performed by the Cluster
Type | Description | Action Taken by the Cluster

soft

A transient error occurred.

Restart the resource or move it to a new location.

hard

A non-transient error that may be specific to the current node occurred.

Move the resource elsewhere and prevent it from being retried on the current node.

fatal

A non-transient error that will be common to all cluster nodes occurred (for example, a bad configuration was specified).

Stop the resource and prevent it from being started on any cluster node.
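The three recovery types in Table 34.1 can be captured in a small lookup. This is only a sketch: the `Recovery` enum, the `ACTION` mapping, and `may_retry_on_same_node` are names invented for illustration, and the action strings paraphrase the table.

```python
from enum import Enum


class Recovery(Enum):
    """Recovery types from Table 34.1 (names are illustrative)."""
    SOFT = "soft"
    HARD = "hard"
    FATAL = "fatal"


# Cluster action per recovery type, paraphrased from Table 34.1.
ACTION = {
    Recovery.SOFT: "restart the resource or move it to a new location",
    Recovery.HARD: "move the resource elsewhere; do not retry on the current node",
    Recovery.FATAL: "stop the resource; do not start it on any cluster node",
}


def may_retry_on_same_node(recovery: Recovery) -> bool:
    """Only a soft failure leaves the current node eligible for a restart."""
    return recovery is Recovery.SOFT


for kind in Recovery:
    print(f"{kind.value}: {ACTION[kind]}")
```

Reading the mapping top to bottom shows how the scope of the restriction widens: none for soft, the current node for hard, and every node for fatal.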

The following table provides the OCF return codes and the type of recovery the cluster will initiate when a failure code is received. Note that even actions that return 0 (OCF alias OCF_SUCCESS) can be considered to have failed, if 0 was not the expected return value.

Table 34.2. OCF Return Codes
Return Code | OCF Label | Description

0

OCF_SUCCESS

* The action completed successfully. This is the expected return code for any successful start, stop, promote, and demote command.

* Type if unexpected: soft

1

OCF_ERR_GENERIC

* The action returned a generic error.

* Type: soft

* The resource manager will attempt to recover the resource or move it to a new location.

2

OCF_ERR_ARGS

* The resource’s configuration is not valid on this machine. For example, it refers to a location not found on the node.

* Type: hard

* The resource manager will move the resource elsewhere and prevent it from being retried on the current node.

3

OCF_ERR_UNIMPLEMENTED

* The requested action is not implemented.

* Type: hard

4

OCF_ERR_PERM

* The resource agent does not have sufficient privileges to complete the task. This may be due, for example, to the agent not being able to open a certain file, to listen on a specific socket, or to write to a directory.

* Type: hard

* Unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with this error by restarting the resource on a different node (where the permission problem may not exist).

5

OCF_ERR_INSTALLED

* A required component is missing on the node where the action was executed. This may be due to a required binary not being executable, or a vital configuration file being unreadable.

* Type: hard

* Unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with this error by restarting the resource on a different node (where the required files or binaries may be present).

6

OCF_ERR_CONFIGURED

* The resource’s configuration on the local node is invalid.

* Type: fatal

* When this code is returned, Pacemaker will prevent the resource from running on any node in the cluster, even if the service configuration is valid on some other node.

7

OCF_NOT_RUNNING

* The resource is safely stopped. This implies that the resource has either gracefully shut down, or has never been started.

* Type if unexpected: soft

* The cluster will not attempt to stop a resource that returns this for any action.

8

OCF_RUNNING_PROMOTED

* The resource is running in promoted role.

* Type if unexpected: soft

9

OCF_FAILED_PROMOTED

* The resource is (or might be) in promoted role but has failed.

* Type: soft

* The resource will be demoted, stopped and then started (and possibly promoted) again.

190

 

* (RHEL 8.4 and later) The service is found to be properly active, but in such a condition that future failures are more likely.

191

 

* (RHEL 8.4 and later) The resource agent supports roles and the service is found to be properly active in the promoted role, but in such a condition that future failures are more likely.

other

N/A

Custom error code.
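Table 34.2 can be transcribed into a lookup that yields the recovery type for an unexpected result. The following is a sketch under stated assumptions: `OCF_CODES` and `recovery_for` are hypothetical names, codes 190 and 191 are left out because the table gives them no label or recovery type, and returning `"unknown"` for any other code is a choice made for this example, not documented behavior.

```python
# (label, recovery type) per return code, transcribed from Table 34.2.
# Codes 190 and 191 are omitted: the table lists no label or type for them.
OCF_CODES = {
    0: ("OCF_SUCCESS", "soft"),
    1: ("OCF_ERR_GENERIC", "soft"),
    2: ("OCF_ERR_ARGS", "hard"),
    3: ("OCF_ERR_UNIMPLEMENTED", "hard"),
    4: ("OCF_ERR_PERM", "hard"),
    5: ("OCF_ERR_INSTALLED", "hard"),
    6: ("OCF_ERR_CONFIGURED", "fatal"),
    7: ("OCF_NOT_RUNNING", "soft"),
    8: ("OCF_RUNNING_PROMOTED", "soft"),
    9: ("OCF_FAILED_PROMOTED", "soft"),
}


def recovery_for(actual_rc, expected_rc):
    """Return the recovery type for a result, or None if it was expected."""
    if actual_rc == expected_rc:
        return None  # The result matched the expectation: no failure.
    if actual_rc not in OCF_CODES:
        return "unknown"  # Assumption of this sketch for unlisted codes.
    return OCF_CODES[actual_rc][1]


print(recovery_for(0, 0))  # None
print(recovery_for(6, 0))  # fatal
print(recovery_for(0, 8))  # soft
```

The last call shows the case noted before the table: a return of 0 is still treated as a failure when a different code was expected.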
