6.2. pacemaker commands
6.2.1. Starting and stopping the cluster
To start the cluster on all nodes, run:
# pcs cluster start --all
After a reboot, the cluster will only start automatically if the service is enabled. The following command shows whether the cluster is already running and whether the daemons are enabled to start automatically:
# pcs cluster status
Automatic cluster start can be enabled with:
# pcs cluster enable --all
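To verify that the cluster services will indeed come back after a reboot, you can also check their enablement with systemd in addition to pcs (a minimal sketch; it assumes the standard corosync, pacemaker, and pcsd units):
# systemctl is-enabled corosync pacemaker pcsd
# pcs status | grep -A3 "Daemon Status"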
Other options are (examples are sketched below):
- Stopping the cluster.
- Putting a node into standby.
- Putting the cluster into maintenance-mode.

For more details, check the pcs cluster help:
# pcs cluster stop --all
# pcs cluster help
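For illustration, this is how the options listed above could be used (a sketch only; <nodename> is a placeholder for one of your cluster nodes):
# pcs cluster stop --all                   # stop the cluster on all nodes
# pcs node standby <nodename>              # put a single node into standby
# pcs node unstandby <nodename>            # bring the node back from standby
# pcs property set maintenance-mode=true   # put the whole cluster into maintenance-mode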
6.2.2. Putting the cluster into maintenance-mode
Whenever you want to make changes and want to avoid interference from the pacemaker cluster, you can "ignore" the cluster by putting it into maintenance-mode:
# pcs property set maintenance-mode=true
An easy way to verify maintenance-mode is to check whether the resources are unmanaged:
# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02] (unmanaged):
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode1 (unmanaged)
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode2 (unmanaged)
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable, unmanaged):
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Unpromoted clusternode1 (unmanaged)
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Promoted clusternode2 (unmanaged)
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode2 (unmanaged)
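Another way to verify maintenance-mode is to query the cluster property directly and count the unmanaged resources (a minimal sketch based on the output format shown above):
# pcs property | grep maintenance-mode
# pcs resource status | grep -c '(unmanaged)'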
Refresh the cluster resources, which detects the resource status while the cluster is in maintenance-mode without updating the resource status changes:
# pcs resource refresh
This will indicate whether anything is not yet correct and will lead to remediation actions as soon as maintenance-mode is removed.
Run the following command to remove maintenance-mode:
# pcs property set maintenance-mode=false
The cluster now continues to work as usual. If anything is configured incorrectly, it will react to it now.
6.2.3. Checking the cluster status
Here are several ways to check the cluster status:
Check whether the cluster is running:
# pcs cluster status
Check the cluster and all resources:
# pcs status
Check the cluster, all resources, and all node attributes:
# pcs status --full
Check the resources only:
# pcs resource status --full
Check the stonith history:
# pcs stonith history
Check the location constraints:
# pcs constraint location
Fencing must be configured and tested. To get a solution that is as automated as possible, the cluster has to be constantly enabled, which then lets the cluster start automatically after a reboot. In a production environment, disabling the automatic restart allows manual intervention, for example after the crash of an instance. Please also check the daemon status.
Example:
# pcs status --full
Cluster name: cluster1
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-06-22 17:56:01 +02:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode2 (2) (version 2.1.5-7.el9-a3f44794f94) - partition with quorum
  * Last updated: Thu Jun 22 17:56:01 2023
  * Last change:  Thu Jun 22 17:53:34 2023 by root via crm_attribute on clusternode1
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Node clusternode1 (1): online, feature set 3.16.2
  * Node clusternode2 (2): online, feature set 3.16.2

Full List of Resources:
  * h7fence  (stonith:fence_rhevm):   Started clusternode2
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode1
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode2
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Promoted clusternode1
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Unpromoted clusternode2
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode1

Node Attributes:
  * Node: clusternode1 (1):
    * hana_rh2_clone_state            : PROMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode2
    * hana_rh2_roles                  : 4:P:master1:master:worker:master
    * hana_rh2_site                   : DC1
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : PRIM
    * hana_rh2_version                : 2.00.059.02
    * hana_rh2_vhost                  : clusternode1
    * lpa_rh2_lpt                     : 1687449214
    * master-SAPHana_RH2_02           : 150
  * Node: clusternode2 (2):
    * hana_rh2_clone_state            : DEMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode1
    * hana_rh2_roles                  : 4:S:master1:master:worker:master
    * hana_rh2_site                   : DC2
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : SOK
    * hana_rh2_version                : 2.00.059.02
    * hana_rh2_vhost                  : clusternode2
    * lpa_rh2_lpt                     : 30
    * master-SAPHana_RH2_02           : 100

Migration Summary:

Tickets:

PCSD Status:
  clusternode1: Online
  clusternode2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
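If you are mainly interested in the SAP HANA related information, the full output can be filtered (a minimal sketch; the patterns match the attribute names shown in the example above):
# pcs status --full | grep -E "hana_|master-SAPHana"
# pcs node attribute                 # node attributes only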
6.2.4. Checking the resource status
Use pcs resource to check the status of all resources. This prints the list of resources together with their current status.
Example:
# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * Started: [ clusternode1 clusternode2 ]
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * Promoted: [ clusternode1 ]
    * Unpromoted: [ clusternode2 ]
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode1
6.2.5. Checking the resource configuration
The following displays the current resource configuration:
# pcs resource config
Resource: vip_RH2_02_MASTER (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: vip_RH2_02_MASTER-instance_attributes
    ip=192.168.5.136
  Operations:
    monitor: vip_RH2_02_MASTER-monitor-interval-10s interval=10s timeout=20s
    start: vip_RH2_02_MASTER-start-interval-0s interval=0s timeout=20s
    stop: vip_RH2_02_MASTER-stop-interval-0s interval=0s timeout=20s
Clone: SAPHanaTopology_RH2_02-clone
  Meta Attributes: SAPHanaTopology_RH2_02-clone-meta_attributes
    clone-max=2 clone-node-max=1 interleave=true
  Resource: SAPHanaTopology_RH2_02 (class=ocf provider=heartbeat type=SAPHanaTopology)
    Attributes: SAPHanaTopology_RH2_02-instance_attributes
      InstanceNumber=02 SID=RH2
    Operations:
      methods: SAPHanaTopology_RH2_02-methods-interval-0s interval=0s timeout=5
      monitor: SAPHanaTopology_RH2_02-monitor-interval-10 interval=10 timeout=600
      reload: SAPHanaTopology_RH2_02-reload-interval-0s interval=0s timeout=5
      start: SAPHanaTopology_RH2_02-start-interval-0s interval=0s timeout=600
      stop: SAPHanaTopology_RH2_02-stop-interval-0s interval=0s timeout=600
Clone: SAPHana_RH2_02-clone
  Meta Attributes: SAPHana_RH2_02-clone-meta_attributes
    clone-max=2 clone-node-max=1 interleave=true notify=true promotable=true
  Resource: SAPHana_RH2_02 (class=ocf provider=heartbeat type=SAPHana)
    Attributes: SAPHana_RH2_02-instance_attributes
      AUTOMATED_REGISTER=true DUPLICATE_PRIMARY_TIMEOUT=300 HANA_CALL_TIMEOUT=10 InstanceNumber=02 PREFER_SITE_TAKEOVER=true SID=RH2
    Operations:
      demote: SAPHana_RH2_02-demote-interval-0s interval=0s timeout=3600
      methods: SAPHana_RH2_02-methods-interval-0s interval=0s timeout=5
      monitor: SAPHana_RH2_02-monitor-interval-251 interval=251 timeout=700 role=Unpromoted
      monitor: SAPHana_RH2_02-monitor-interval-249 interval=249 timeout=700 role=Promoted
      promote: SAPHana_RH2_02-promote-interval-0s interval=0s timeout=3600
      reload: SAPHana_RH2_02-reload-interval-0s interval=0s timeout=5
      start: SAPHana_RH2_02-start-interval-0s interval=0s timeout=3200
      stop: SAPHana_RH2_02-stop-interval-0s interval=0s timeout=3100
This lists all parameters that were used to configure the installed resource agents.
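pcs resource config also accepts a resource name if you only want to inspect a single resource (a short sketch using the resource names from this example):
# pcs resource config SAPHana_RH2_02-clone
# pcs resource config vip_RH2_02_MASTER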
6.2.6. SAPHana resource option AUTOMATED_REGISTER=true
If this option is used in the SAPHana resource, pacemaker will automatically re-register the secondary database.
It is recommended to use this option for the first tests. With AUTOMATED_REGISTER=false, the administrator needs to re-register the secondary node manually.
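To check which value is currently configured, and to change it, you can query and update the resource (a sketch using the resource name from this example; changing the option should be planned carefully, for example while the cluster is in maintenance-mode):
# pcs resource config SAPHana_RH2_02 | grep AUTOMATED_REGISTER
# pcs resource update SAPHana_RH2_02 AUTOMATED_REGISTER=false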
6.2.7. Resource handling
There are several options for managing resources. For more information, check the available help:
# pcs resource help
List the resource agents that are used:
# pcs resource config | grep "type=" | awk -F"type=" '{ print $2 }' | sed -e "s/)//g"
Example output:
IPaddr2
SAPHanaTopology
SAPHana
Display the description and the configuration parameters of a specific resource agent:
# pcs resource describe <resource agent>
Example (without output):
# pcs resource describe IPaddr2
Example of the resource agent IPaddr2 (with output):
Assumed agent name 'ocf:heartbeat:IPaddr2' (deduced from 'IPaddr2') ocf:heartbeat:IPaddr2 - Manages virtual IPv4 and IPv6 addresses (Linux specific version) This Linux-specific resource manages IP alias IP addresses. It can add an IP alias, or remove one. In addition, it can implement Cluster Alias IP functionality if invoked as a clone resource. If used as a clone, "shared address with a trivial, stateless (autonomous) load-balancing/mutual exclusion on ingress" mode gets applied (as opposed to "assume resource uniqueness" mode otherwise). For that, Linux firewall (kernel and userspace) is assumed, and since recent distributions are ambivalent in plain "iptables" command to particular back-end resolution, "iptables-legacy" (when present) gets prioritized so as to avoid incompatibilities (note that respective ipt_CLUSTERIP firewall extension in use here is, at the same time, marked deprecated, yet said "legacy" layer can make it workable, literally, to this day) with "netfilter" one (as in "iptables-nft"). In that case, you should explicitly set clone-node-max >= 2, and/or clone-max < number of nodes. In case of node failure, clone instances need to be re- allocated on surviving nodes. This would not be possible if there is already an instance on those nodes, and clone-node-max=1 (which is the default). When the specified IP address gets assigned to a respective interface, the resource agent sends unsolicited ARP (Address Resolution Protocol, IPv4) or NA (Neighbor Advertisement, IPv6) packets to inform neighboring machines about the change. This functionality is controlled for both IPv4 and IPv6 by shared 'arp_*' parameters. Resource options: ip (required) (unique): The IPv4 (dotted quad notation) or IPv6 address (colon hexadecimal notation) example IPv4 "192.168.1.1". example IPv6 "2001:db8:DC28:0:0:FC57:D4C8:1FFF". nic: The base network interface on which the IP address will be brought online. If left empty, the script will try and determine this from the routing table. Do NOT specify an alias interface in the form eth0:1 or anything here; rather, specify the base interface only. If you want a label, see the iflabel parameter. Prerequisite: There must be at least one static IP address, which is not managed by the cluster, assigned to the network interface. If you can not assign any static IP address on the interface, modify this kernel parameter: sysctl -w net.ipv4.conf.all.promote_secondaries=1 # (or per device) cidr_netmask: The netmask for the interface in CIDR format (e.g., 24 and not 255.255.255.0) If unspecified, the script will also try to determine this from the routing table. broadcast: Broadcast address associated with the IP. It is possible to use the special symbols '+' and '-' instead of the broadcast address. In this case, the broadcast address is derived by setting/resetting the host bits of the interface prefix. iflabel: You can specify an additional label for your IP address here. This label is appended to your interface name. The kernel allows alphanumeric labels up to a maximum length of 15 characters including the interface name and colon (e.g. eth0:foobar1234) A label can be specified in nic parameter but it is deprecated. If a label is specified in nic name, this parameter has no effect. lvs_support: Enable support for LVS Direct Routing configurations. In case a IP address is stopped, only move it to the loopback device to allow the local node to continue to service requests, but no longer advertise it on the network. 
Notes for IPv6: It is not necessary to enable this option on IPv6. Instead, enable 'lvs_ipv6_addrlabel' option for LVS-DR usage on IPv6. lvs_ipv6_addrlabel: Enable adding IPv6 address label so IPv6 traffic originating from the address's interface does not use this address as the source. This is necessary for LVS-DR health checks to realservers to work. Without it, the most recently added IPv6 address (probably the address added by IPaddr2) will be used as the source address for IPv6 traffic from that interface and since that address exists on loopback on the realservers, the realserver response to pings/connections will never leave its loopback. See RFC3484 for the detail of the source address selection. See also 'lvs_ipv6_addrlabel_value' parameter. lvs_ipv6_addrlabel_value: Specify IPv6 address label value used when 'lvs_ipv6_addrlabel' is enabled. The value should be an unused label in the policy table which is shown by 'ip addrlabel list' command. You would rarely need to change this parameter. mac: Set the interface MAC address explicitly. Currently only used in case of the Cluster IP Alias. Leave empty to chose automatically. clusterip_hash: Specify the hashing algorithm used for the Cluster IP functionality. unique_clone_address: If true, add the clone ID to the supplied value of IP to create a unique address to manage arp_interval: Specify the interval between unsolicited ARP (IPv4) or NA (IPv6) packets in milliseconds. This parameter is deprecated and used for the backward compatibility only. It is effective only for the send_arp binary which is built with libnet, and send_ua for IPv6. It has no effect for other arp_sender. arp_count: Number of unsolicited ARP (IPv4) or NA (IPv6) packets to send at resource initialization. arp_count_refresh: For IPv4, number of unsolicited ARP packets to send during resource monitoring. Doing so helps mitigate issues of stuck ARP caches resulting from split-brain situations. arp_bg: Whether or not to send the ARP (IPv4) or NA (IPv6) packets in the background. The default is true for IPv4 and false for IPv6. arp_sender: For IPv4, the program to send ARP packets with on start. Available options are: - send_arp: default - ipoibarping: default for infiniband interfaces if ipoibarping is available - iputils_arping: use arping in iputils package - libnet_arping: use another variant of arping based on libnet send_arp_opts: For IPv4, extra options to pass to the arp_sender program. Available options are vary depending on which arp_sender is used. A typical use case is specifying '-A' for iputils_arping to use ARP REPLY instead of ARP REQUEST as Gratuitous ARPs. flush_routes: Flush the routing table on stop. This is for applications which use the cluster IP address and which run on the same physical host that the IP address lives on. The Linux kernel may force that application to take a shortcut to the local loopback interface, instead of the interface the address is really bound to. Under those circumstances, an application may, somewhat unexpectedly, continue to use connections for some time even after the IP address is deconfigured. Set this parameter in order to immediately disable said shortcut when the IP address goes away. run_arping: For IPv4, whether or not to run arping for collision detection check. nodad: For IPv6, do not perform Duplicate Address Detection when adding the address. noprefixroute: Use noprefixroute flag (see 'man ip-address'). preferred_lft: For IPv6, set the preferred lifetime of the IP address. 
This can be used to ensure that the created IP address will not be used as a source address for routing. Expects a value as specified in section 5.5.4 of RFC 4862. network_namespace: Specifies the network namespace to operate within. The namespace must already exist, and the interface to be used must be within the namespace. Default operations: start: interval=0s timeout=20s stop: interval=0s timeout=20s monitor: interval=10s timeout=20s
If the cluster is stopped, all resources are stopped as well; if the cluster is put into maintenance-mode, all resources remain running but are no longer monitored or managed.
6.2.8. Cluster property handling for maintenance-mode
List all defined properties:
[root@clusternode1] pcs property
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: cluster1
 concurrent-fencing: true
 dc-version: 2.1.5-7.el9-a3f44794f94
 hana_rh2_site_srHook_DC1: PRIM
 hana_rh2_site_srHook_DC2: SFAIL
 have-watchdog: false
 last-lrm-refresh: 1688548036
 maintenance-mode: true
 priority-fencing-delay: 10s
 stonith-enabled: true
 stonith-timeout: 900
To reconfigure the database, the cluster must be told to ignore any changes until the configuration is complete. You can put the cluster into maintenance-mode with:
# pcs property set maintenance-mode=true
Check maintenance-mode:
# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02] (unmanaged):
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode1 (unmanaged)
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode2 (unmanaged)
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable, unmanaged):
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Promoted clusternode1 (unmanaged)
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Unpromoted clusternode2 (unmanaged)
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode1 (unmanaged)
Verify that all resources are "unmanaged":
[root@clusternode1]# pcs status
Cluster name: cluster1
Status of pacemakerd: 'Pacemaker is running' (last updated 2023-06-27 16:02:15 +02:00)
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode2 (version 2.1.5-7.el9-a3f44794f94) - partition with quorum
  * Last updated: Tue Jun 27 16:02:16 2023
  * Last change:  Tue Jun 27 16:02:14 2023 by root via cibadmin on clusternode1
  * 2 nodes configured
  * 6 resource instances configured

              *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

Node List:
  * Online: [ clusternode1 clusternode2 ]

Full List of Resources:
  * h7fence  (stonith:fence_rhevm):   Started clusternode2 (unmanaged)
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02] (unmanaged):
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode1 (unmanaged)
    * SAPHanaTopology_RH2_02  (ocf:heartbeat:SAPHanaTopology):   Started clusternode2 (unmanaged)
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable, unmanaged):
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Promoted clusternode1 (unmanaged)
    * SAPHana_RH2_02  (ocf:heartbeat:SAPHana):   Unpromoted clusternode2 (unmanaged)
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode1 (unmanaged)

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
If you unset maintenance-mode, the resources switch back to being managed:
# pcs property set maintenance-mode=false
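You can then verify that the resources are managed again, for example by checking that no resource is reported as unmanaged anymore (a minimal sketch based on the output format shown above):
# pcs resource | grep unmanaged || echo "all resources are managed again"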
6.2.9. Failover of the SAPHana resource using Move
A simple example of how to fail over the SAP HANA database is to use the pcs resource move command. You need to use the clone resource name and move the resource as follows:
# pcs resource move <SAPHana-clone-resource>
In this example, the clone resource is SAPHana_RH2_02-clone:
[root@clusternode1]# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * Started: [ clusternode1 clusternode2 ]
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * Promoted: [ clusternode1 ]
    * Unpromoted: [ clusternode2 ]
  * vip_RH2_02_MASTER  (ocf:heartbeat:IPaddr2):   Started clusternode1
Move the resource:
# pcs resource move SAPHana_RH2_02-clone
Location constraint to move resource 'SAPHana_RH2_02-clone' has been created
Waiting for the cluster to apply configuration changes...
Location constraint created to move resource 'SAPHana_RH2_02-clone' has been removed
Waiting for the cluster to apply configuration changes...
resource 'SAPHana_RH2_02-clone' is promoted on node 'clusternode2'; unpromoted on node 'clusternode1'
Check whether there are any remaining constraints:
# pcs constraint location
You can remove the location constraints created during the failover by clearing the resource. Example:
[root@clusternode1]# pcs resource clear SAPHana_RH2_02-clone
检查 "Migration Summary" 中是否存在剩余的警告或条目:
# pcs status --full
Check the stonith history:
# pcs stonith history
Clean up the stonith history, if needed:
# pcs stonith history cleanup
If you are using a pacemaker version earlier than 2.1.5, refer to Is there a way to manage constraints when running pcs resource move? and check for remaining constraints.
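After the move you may also want to confirm which node now runs the promoted instance and that nothing was left behind (a minimal sketch based on the commands in this section; the clear command is only needed if a constraint is still listed):
# pcs resource status | grep -E "Promoted|Unpromoted"
# pcs constraint location
# pcs resource clear SAPHana_RH2_02-clone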
6.2.10. Monitoring the failover and the sync status
All pacemaker activities are recorded in the /var/log/messages file on the cluster nodes. Since there are many other messages, it is sometimes difficult to read the messages related to the SAP resource agents. You can configure a command alias that filters out only the messages related to the SAP resource agents.
Example of the alias tmsl:
# alias tmsl='tail -1000f /var/log/messages | egrep -s "Setting master-rsc_SAPHana_$SAPSYSTEMNAME_HDB${TINSTANCE}|sr_register|WAITING4LPA|PROMOTED|DEMOTED|UNDEFINED|master_walk|SWAIT|WaitforStopped|FAILED|LPT"'
Example output of tmsl:
[root@clusternode1]# tmsl Jun 22 13:59:54 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: secondary with sync status SOK ==> possible takeover node Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: saphana_monitor_secondary: scoring_crm_master(4:S:master1:master:worker:master,SOK) Jun 22 13:59:55 clusternode1 SAPHana(SAPHana_RH2_02)[907482]: INFO: DEC: scoring_crm_master: sync(SOK) is matching syncPattern (SOK) Jun 22 14:04:06 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:04:06 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:04:06 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: secondary with sync status SOK ==> possible takeover node Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: saphana_monitor_secondary: scoring_crm_master(4:S:master1:master:worker:master,SOK) Jun 22 14:04:09 clusternode1 SAPHana(SAPHana_RH2_02)[914625]: INFO: DEC: scoring_crm_master: sync(SOK) is matching syncPattern (SOK) Jun 22 14:08:21 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:08:21 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:08:21 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:08:23 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:08:23 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:08:23 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: secondary with sync status SOK ==> possible takeover node Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. 
Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: saphana_monitor_secondary: scoring_crm_master(4:S:master1:master:worker:master,SOK) Jun 22 14:08:24 clusternode1 SAPHana(SAPHana_RH2_02)[922136]: INFO: DEC: scoring_crm_master: sync(SOK) is matching syncPattern (SOK) Jun 22 14:12:35 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:12:35 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:12:36 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:12:38 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:12:38 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:12:38 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:12:38 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: secondary with sync status SOK ==> possible takeover node Jun 22 14:12:39 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:12:39 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:12:39 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:12:39 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: saphana_monitor_secondary: scoring_crm_master(4:S:master1:master:worker:master,SOK) Jun 22 14:12:39 clusternode1 SAPHana(SAPHana_RH2_02)[929408]: INFO: DEC: scoring_crm_master: sync(SOK) is matching syncPattern (SOK) Jun 22 14:14:01 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_clone_state[clusternode2]: PROMOTED -> DEMOTED Jun 22 14:14:02 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_clone_state[clusternode2]: DEMOTED -> UNDEFINED Jun 22 14:14:19 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_clone_state[clusternode1]: DEMOTED -> PROMOTED Jun 22 14:14:21 clusternode1 SAPHana(SAPHana_RH2_02)[932762]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:14:21 clusternode1 SAPHana(SAPHana_RH2_02)[932762]: INFO: DEC: hana_rh2_site_srHook_DC1 is empty or SWAIT. 
Take polling attribute: hana_rh2_sync_state=SOK Jun 22 14:14:21 clusternode1 SAPHana(SAPHana_RH2_02)[932762]: INFO: DEC: Finally get_SRHOOK()=SOK Jun 22 14:15:14 clusternode1 SAPHana(SAPHana_RH2_02)[932762]: INFO: DEC: hana_rh2_site_srHook_DC1=SWAIT Jun 22 14:15:22 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_sync_state[clusternode1]: SOK -> PRIM Jun 22 14:15:23 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_sync_state[clusternode2]: PRIM -> SOK Jun 22 14:15:23 clusternode1 SAPHana(SAPHana_RH2_02)[934810]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:15:25 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_clone_state[clusternode2]: UNDEFINED -> DEMOTED Jun 22 14:15:32 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_sync_state[clusternode2]: SOK -> SFAIL Jun 22 14:19:36 clusternode1 pacemaker-attrd[10150]: notice: Setting hana_rh2_sync_state[clusternode2]: SFAIL -> SOK Jun 22 14:19:36 clusternode1 SAPHana(SAPHana_RH2_02)[942693]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:23:49 clusternode1 SAPHana(SAPHana_RH2_02)[950623]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:28:02 clusternode1 SAPHana(SAPHana_RH2_02)[958633]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:32:15 clusternode1 SAPHana(SAPHana_RH2_02)[966683]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:36:27 clusternode1 SAPHana(SAPHana_RH2_02)[974736]: INFO: ACT site=DC1, setting SOK for secondary (1) Jun 22 14:40:40 clusternode1 SAPHana(SAPHana_RH2_02)[982934]: INFO: ACT site=DC1, setting SOK for secondary (1)
With the filter, it is easier to understand which status changes are happening. If details are missing, you can open the whole messages file to read all the information.
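If you prefer reading from the journal instead of /var/log/messages, a similar filter can be applied to journalctl (a sketch only; the pattern is a subset of the one used in the alias above):
# journalctl -f | egrep "sr_register|WAITING4LPA|PROMOTED|DEMOTED|UNDEFINED|SWAIT|WaitforStopped|FAILED|LPT"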
After a failover, you can clear the resource. Also check whether there are leftover location constraints.
6.2.11. Checking cluster consistency
During the installation, resources are sometimes started before the configuration is finally complete. This can lead to entries in the Cluster Information Base (CIB) that can cause incorrect behavior. This can easily be checked, and manually corrected, after the configuration is complete.
If you start the SAPHana resources, missing entries are recreated. Wrong entries cannot be addressed with a pcs command and need to be removed manually.
Check the CIB entries:
# cibadmin --query
DC3 and SFAIL are entries that should not be present in the Cluster Information Base when the cluster members are DC1 and DC2 and the sync state between the nodes is reported as SOK.
Example of how to check for corresponding entries:
# cibadmin --query |grep '"DC3"' # cibadmin --query |grep '"SFAIL"'
The command can be executed as the root user on any node in the cluster. Usually the output of the command is empty. If there is still an error in the configuration, the output could look like this:
<nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>
These entries can be removed with the following command:
# cibadmin --delete --xml-text '<...>'
To remove the entries from the example above, you would enter the following. Note that the output contains double quotes, so the text must be embedded in single quotes:
# cibadmin --delete --xml-text ' <nvpair id="SAPHanaSR-hana_rh1_glob_sec" name="hana_rh1_glob_sec" value="DC3"/>'
Verify that the removed CIB entries are gone. The returned output should be empty.
# cibadmin --query |grep 'DC3"'
6.2.12. Cluster cleanup
During failover tests, constraints and other remains from previous tests may be left behind. The cluster needs to be cleared of these before starting the next test.
Check the cluster status for failure events:
# pcs status --full
如果您在 "Migration Summary" 中看到集群警告或条目,您应该清除并清理资源:
# pcs resource clear SAPHana_RH2_02-clone
# pcs resource cleanup SAPHana_RH2_02-clone
Output:
Cleaned up SAPHana_RH2_02:0 on clusternode1
Cleaned up SAPHana_RH2_02:1 on clusternode2
Check whether there are unwanted location constraints, for example from a previous failover:
# pcs constraint location
Check the existing constraints in more detail:
# pcs constraint --full
Example of a location constraint after a resource move:
Node: hana08 (score:-INFINITY) (role:Started) (id:cli-ban-SAPHana_RH2_02-clone-on-hana08)
Clear this location constraint:
# pcs resource clear SAPHana_RH2_02-clone
Verify that the constraint has disappeared from the constraint list. If it persists, delete it explicitly using its constraint id:
# pcs constraint delete cli-ban-SAPHana_RH2_02-clone-on-hana08
If you run several tests involving fencing, you might also want to clean up the stonith history:
# pcs stonith history cleanup
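Before starting the next test, the cleanup steps can be run in one go (a sketch, again using the resource names from this example):
# pcs resource clear SAPHana_RH2_02-clone
# pcs resource cleanup SAPHana_RH2_02-clone
# pcs constraint location              # should no longer show leftover constraints
# pcs stonith history cleanup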
All pcs commands are executed as the root user. Please also check for leftovers.
6.2.13. Other cluster commands
Examples of various cluster commands:
# pcs status --full
# crm_mon -1Arf              # Provides an overview
# pcs resource               # Lists all resources and shows if they are running
# pcs constraint --full      # Lists all constraint ids which should be removed
# pcs cluster start --all    # This will start the cluster on all nodes
# pcs cluster stop --all     # This will stop the cluster on all nodes
# pcs node attribute         # Lists node attributes