28.8. Writing a cluster alert agent


There are three types of Pacemaker cluster alerts: node alerts, fencing alerts, and resource alerts. The environment variables that are passed to the alert agents can differ, depending on the type of alert. The following table describes the environment variables that are passed to alert agents and specifies when the environment variable is associated with a specific alert type.

Expand
表 28.2. Environment Variables Passed to Alert Agents
Environment VariableDescription

CRM_alert_kind

The type of alert (node, fencing, or resource)

CRM_alert_version

The version of Pacemaker sending the alert

CRM_alert_recipient

The configured recipient

CRM_alert_node_sequence

A sequence number increased whenever an alert is being issued on the local node, which can be used to reference the order in which alerts have been issued by Pacemaker. An alert for an event that happened later in time reliably has a higher sequence number than alerts for earlier events. Be aware that this number has no cluster-wide meaning.

CRM_alert_timestamp

A timestamp created prior to executing the agent, in the format specified by the timestamp-format meta option. This allows the agent to have a reliable, high-precision time of when the event occurred, regardless of when the agent itself was invoked (which could potentially be delayed due to system load or other circumstances).

CRM_alert_node

Name of affected node

CRM_alert_desc

Detail about event. For node alerts, this is the node’s current state (member or lost). For fencing alerts, this is a summary of the requested fencing operation, including origin, target, and fencing operation error code, if any. For resource alerts, this is a readable string equivalent of CRM_alert_status.

CRM_alert_nodeid

ID of node whose status changed (provided with node alerts only)

CRM_alert_task

The requested fencing or resource operation (provided with fencing and resource alerts only)

CRM_alert_rc

The numerical return code of the fencing or resource operation (provided with fencing and resource alerts only)

CRM_alert_rsc

The name of the affected resource (resource alerts only)

CRM_alert_interval

The interval of the resource operation (resource alerts only)

CRM_alert_target_rc

The expected numerical return code of the operation (resource alerts only)

CRM_alert_status

A numerical code used by Pacemaker to represent the operation result (resource alerts only)

When writing an alert agent, you must take the following concerns into account.

  • Alert agents may be called with no recipient (if none is configured), so the agent must be able to handle this situation, even if it only exits in that case. Users may modify the configuration in stages, and add a recipient later.
  • If more than one recipient is configured for an alert, the alert agent will be called once per recipient. If an agent is not able to run concurrently, it should be configured with only a single recipient. The agent is free, however, to interpret the recipient as a list.
  • When a cluster event occurs, all alerts are fired off at the same time as separate processes. Depending on how many alerts and recipients are configured and on what is done within the alert agents, a significant load burst may occur. The agent could be written to take this into consideration, for example by queueing resource-intensive actions into some other instance, instead of directly executing them.
  • Alert agents are run as the hacluster user, which has a minimal set of permissions. If an agent requires additional privileges, it is recommended to configure sudo to allow the agent to run the necessary commands as another user with the appropriate privileges.
  • Take care to validate and sanitize user-configured parameters, such as CRM_alert_timestamp (whose content is specified by the user-configured timestamp-format), CRM_alert_recipient, and all alert options. This is necessary to protect against configuration errors. In addition, if some user can modify the CIB without having hacluster-level access to the cluster nodes, this is a potential security concern as well, and you should avoid the possibility of code injection.
  • If a cluster contains resources with operations for which the on-fail parameter is set to fence, there will be multiple fence notifications on failure, one for each resource for which this parameter is set plus one additional notification. Both the pacemaker-fenced and pacemaker-controld will send notifications. Pacemaker performs only one actual fence operation in this case, however, no matter how many notifications are sent.
注意

The alerts interface is designed to be backward compatible with the external scripts interface used by the ocf:pacemaker:ClusterMon resource. To preserve this compatibility, the environment variables passed to alert agents are available prepended with CRM_notify_ as well as CRM_alert_. One break in compatibility is that the ClusterMon resource ran external scripts as the root user, while alert agents are run as the hacluster user.

Red Hat logoGithubredditYoutubeTwitter

学习

尝试、购买和销售

社区

关于红帽文档

通过我们的产品和服务,以及可以信赖的内容,帮助红帽用户创新并实现他们的目标。 了解我们当前的更新.

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

Theme

© 2026 Red Hat
返回顶部