5.6. Test 4: Failover of the primary node to the first site
Subject of the test | The primary switches back to a cluster node. Fail back and enable the cluster again. Re-register the third site as a secondary. |
Test preconditions | |
Test steps | Check the expected primary of the cluster. Fail over from the DC3 node to the DC1 node. Check whether the former secondary has switched to the new primary. Re-register remotehost3 as the new secondary. Set up the cluster. |
Monitoring the test | On the new primary, start: |
Starting the test | Check the expected primary of the cluster: the VIP and the promoted SAP HANA resource should run on the same node, which is the potential new primary. On this potential new primary, re-register the former primary as the new secondary. |
Expected result | The new primary starts SAP HANA. The replication status will show all 3 sites replicating. The second cluster site automatically re-registers with the new primary site. The disaster recovery (DR) site becomes an additional copy of the database. |
Way to return to the initial state | Run Test 3. |
Detailed description
Check whether the cluster is set to maintenance-mode:
[root@clusternode1]# pcs property config maintenance-mode
Cluster Properties:
 maintenance-mode: true
If maintenance-mode is not true, you can set it with:
[root@clusternode1]# pcs property set maintenance-mode=true
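If you script this prerequisite, a minimal sketch (assumption: it is run as root on one cluster node) sets the property only when it is not already true:
# Minimal sketch: enable maintenance-mode only if it is not already set to true.
if ! pcs property config maintenance-mode | grep -q "maintenance-mode: true"; then
    pcs property set maintenance-mode=true
fi
# Verify the result.
pcs property config maintenance-mode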
Check the system replication status and discover the primary database on all nodes. First, discover the primary database with the following command:
clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
The output should look like the following:
On clusternode1:
clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3
On clusternode2:
clusternode2:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3
On remotehost3:
remotehost3:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: primary
On all three nodes, the primary database is remotehost3. On this primary, you must ensure that the system replication status is ACTIVE for all three nodes and that the return code is 15:
remotehost3:rh2adm> python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py
|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |remotehost3 |30201 |nameserver | 1 | 3 |DC3 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30207 |xsengine | 2 | 3 |DC3 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30203 |indexserver | 3 | 3 |DC3 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|SYSTEMDB |remotehost3 |30201 |nameserver | 1 | 3 |DC3 |clusternode1 | 30201 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30207 |xsengine | 2 | 3 |DC3 |clusternode1 | 30207 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30203 |indexserver | 3 | 3 |DC3 |clusternode1 | 30203 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |

status system replication site "2": ACTIVE
status system replication site "1": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 3
site name: DC3
[rh2adm@remotehost3: python_support]# echo $?
15
- Check that all three sr_states are consistent.
On all three nodes, run hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode:
clusternode1:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
clusternode2:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
remotehost3:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
The output should be identical on all nodes:
siteReplicationMode/DC1=primary
siteReplicationMode/DC3=async
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
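Instead of logging in to each host, you can run the same check from one place. This is only a sketch and assumes password-less ssh to the rh2adm user on all three hosts and that hdbnsutil is available in that user's login environment:
# Sketch: compare the replication and operation modes reported by all three sites.
for node in clusternode1 clusternode2 remotehost3; do
    echo "### ${node}"
    ssh rh2adm@${node} "hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode"
done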
- Start the monitors in separate windows.
On clusternode1, start:
clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
On remotehost3, start:
remotehost3:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
On clusternode2, start:
clusternode2:rh2adm> watch "hdbnsutil -sr_state --sapcontrol=1 |grep siteReplicationMode"
- Start the test.
To switch the primary over to clusternode1, run on clusternode1:
clusternode1:rh2adm> hdbnsutil -sr_takeover
done.
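Right after the takeover, you can run a quick sanity check on clusternode1; this is a sketch based on the hdbnsutil calls already used above, and after a successful takeover the local mode should be primary:
clusternode1:rh2adm> hdbnsutil -sr_state | grep "^mode:"
mode: primary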
- Check the output of the monitors.
The monitor on clusternode1 will change to:
Every 2.0s: python systemReplicationStatus.py; echo $?    clusternode1: Mon Sep 4 23:34:30 2023

|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |

status system replication site "2": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 1
site name: DC1
15
The important information is, again, the return code 15. The monitor on clusternode2 will change to:
Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode    clusternode2: Mon Sep 4 23:35:18 2023

siteReplicationMode/DC1=primary
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC2=logreplay
DC3 has vanished and needs to be re-registered. On remotehost3, systemReplicationStatus reports an error and the return code changes to 11.
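If you prefer a one-shot check over the watch monitors, the return code can be printed directly. This sketch reuses the script path from above and can be run as rh2adm on any of the hosts:
# Sketch: print only the systemReplicationStatus.py return code
# (15 = all attached secondaries ACTIVE on the primary; 11 = error, as reported on remotehost3 here).
python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py > /dev/null 2>&1
echo "returncode: $?"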
Check whether the cluster nodes have been re-registered:
clusternode1:rh2adm> hdbnsutil -sr_state

System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~

online: true

mode: primary
operation mode: primary
site id: 1
site name: DC1

is source system: true
is secondary/consumer system: false
has secondaries/consumers attached: true
is a takeover active: false
is primary suspended: false

Host Mappings:
~~~~~~~~~~~~~~

clusternode1 -> [DC2] clusternode2
clusternode1 -> [DC1] clusternode1

Site Mappings:
~~~~~~~~~~~~~~
DC1 (primary/primary)
    |---DC2 (syncmem/logreplay)

Tier of DC1: 1
Tier of DC2: 2

Replication mode of DC1: primary
Replication mode of DC2: syncmem

Operation mode of DC1: primary
Operation mode of DC2: logreplay

Mapping: DC1 -> DC2
done.
The Site Mappings show that clusternode2 (DC2) has been re-registered.
Check or enable the vip resource:
[root@clusternode1]# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02] (unmanaged):
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode2 (unmanaged)
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode1 (unmanaged)
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable, unmanaged):
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Master clusternode2 (unmanaged)
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Slave clusternode1 (unmanaged)
  * vip_RH2_02_MASTER  (ocf::heartbeat:IPaddr2):  Stopped (disabled, unmanaged)
The vip resource vip_RH2_02_MASTER is stopped. To run it again:
[root@clusternode1]# pcs resource enable vip_RH2_02_MASTER
Warning: 'vip_RH2_02_MASTER' is unmanaged
The warning is correct, because the cluster will not start any resources unless maintenance-mode is set to false.
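As a quick check (a sketch reusing the pcs resource listing from above), the "(disabled)" flag should now be gone, while the resource stays Stopped (unmanaged) until maintenance-mode is removed:
[root@clusternode1]# pcs resource | grep vip_RH2_02_MASTER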
- Disable the cluster maintenance-mode.
Before we unset maintenance-mode, we should start two monitors in separate windows to see the changes. On clusternode2, run:
[root@clusternode2]# watch pcs status --full
On clusternode1, run:
clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
Now you can unset maintenance-mode on clusternode1 by running:
[root@clusternode1]# pcs property set maintenance-mode=false
The monitor on clusternode2 should show that everything is now running as expected:
Every 2.0s: pcs status --full    clusternode1: Tue Sep 5 00:01:17 2023

Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode1 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
  * Last updated: Tue Sep 5 00:01:17 2023
  * Last change:  Tue Sep 5 00:00:30 2023 by root via crm_attribute on clusternode1
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ clusternode1 (1) clusternode2 (2) ]

Full List of Resources:
  * auto_rhevm_fence1  (stonith:fence_rhevm):  Started clusternode1
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode2
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode1
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Slave clusternode2
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Master clusternode1
  * vip_RH2_02_MASTER  (ocf::heartbeat:IPaddr2):  Started clusternode1

Node Attributes:
  * Node: clusternode1 (1):
    * hana_rh2_clone_state            : PROMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode2
    * hana_rh2_roles                  : 4:P:master1:master:worker:master
    * hana_rh2_site                   : DC1
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : PRIM
    * hana_rh2_version                : 2.00.062.00
    * hana_rh2_vhost                  : clusternode1
    * lpa_rh2_lpt                     : 1693872030
    * master-SAPHana_RH2_02           : 150
  * Node: clusternode2 (2):
    * hana_rh2_clone_state            : DEMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode1
    * hana_rh2_roles                  : 4:S:master1:master:worker:master
    * hana_rh2_site                   : DC2
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : SOK
    * hana_rh2_version                : 2.00.062.00
    * hana_rh2_vhost                  : clusternode2
    * lpa_rh2_lpt                     : 30
    * master-SAPHana_RH2_02           : 100

Migration Summary:

Tickets:

PCSD Status:
  clusternode1: Online
  clusternode2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
After manual interaction, it is best to clean up the cluster, as described in Cluster Cleanup.
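As a minimal sketch (see the Cluster Cleanup section for the full procedure), you can clear the resource operation history and failcounts with:
[root@clusternode1]# pcs resource cleanup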
- Re-register remotehost3 with the new primary on clusternode1.
Remotehost3 needs to be re-registered. To monitor the progress, start on clusternode1:
clusternode1:rh2adm> watch -n 5 'python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?'
On remotehost3, start:
remotehost3:rh2adm> watch 'hdbnsutil -sr_state --sapcontrol=1 |grep siteReplicationMode'
Now you can re-register remotehost3 with this command:
remotehost3:rh2adm> hdbnsutil -sr_register --remoteHost=clusternode1 --remoteInstance=${TINSTANCE} --replicationMode=async --name=DC3 --remoteName=DC1 --operationMode=logreplay --online
The monitor on clusternode1 will change to:
Every 5.0s: python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?    clusternode1: Tue Sep 5 00:14:40 2023

|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |remotehost3 | 30201 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |remotehost3 | 30207 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |remotehost3 | 30203 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |

status system replication site "3": ACTIVE
status system replication site "2": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 1
site name: DC1
Status 15
The monitor on remotehost3 will change to:
Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode    remotehost3: Tue Sep 5 02:15:28 2023

siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
Now we have three entries again, and remotehost3 (DC3) is once more a secondary site replicating from clusternode1 (DC1).
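If you want to wait for this state in a script instead of watching, a sketch run as rh2adm on clusternode1 polls until site 3 reports ACTIVE:
# Sketch: block until system replication for site "3" (DC3) is ACTIVE again on the new primary.
until python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py \
      | grep -q 'system replication site "3": ACTIVE'; do
    sleep 10
done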
- Check that all nodes are part of the system replication status on clusternode1.
On all three nodes, run hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode:
clusternode1:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
clusternode2:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
remotehost3:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
We should get the same output on all nodes:
siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
- Check pcs status --full for SOK. Run:
[root@clusternode1]# pcs status --full| grep sync_state
The output should show PRIM or SOK:
* hana_rh2_sync_state             : PRIM
* hana_rh2_sync_state             : SOK
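If the secondary has not caught up yet, you can keep polling this attribute with a monitor similar to the ones above; a minimal sketch reusing the same commands:
[root@clusternode1]# watch -n 5 'pcs status --full | grep sync_state'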
Finally, the cluster status should look like the following, including the sync_state values PRIM and SOK:
[root@clusternode1]# pcs status --full
Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode1 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
  * Last updated: Tue Sep 5 00:18:52 2023
  * Last change:  Tue Sep 5 00:16:54 2023 by root via crm_attribute on clusternode1
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ clusternode1 (1) clusternode2 (2) ]

Full List of Resources:
  * auto_rhevm_fence1  (stonith:fence_rhevm):  Started clusternode1
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode2
    * SAPHanaTopology_RH2_02  (ocf::heartbeat:SAPHanaTopology):  Started clusternode1
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Slave clusternode2
    * SAPHana_RH2_02  (ocf::heartbeat:SAPHana):  Master clusternode1
  * vip_RH2_02_MASTER  (ocf::heartbeat:IPaddr2):  Started clusternode1

Node Attributes:
  * Node: clusternode1 (1):
    * hana_rh2_clone_state            : PROMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode2
    * hana_rh2_roles                  : 4:P:master1:master:worker:master
    * hana_rh2_site                   : DC1
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : PRIM
    * hana_rh2_version                : 2.00.062.00
    * hana_rh2_vhost                  : clusternode1
    * lpa_rh2_lpt                     : 1693873014
    * master-SAPHana_RH2_02           : 150
  * Node: clusternode2 (2):
    * hana_rh2_clone_state            : DEMOTED
    * hana_rh2_op_mode                : logreplay
    * hana_rh2_remoteHost             : clusternode1
    * hana_rh2_roles                  : 4:S:master1:master:worker:master
    * hana_rh2_site                   : DC2
    * hana_rh2_sra                    : -
    * hana_rh2_srah                   : -
    * hana_rh2_srmode                 : syncmem
    * hana_rh2_sync_state             : SOK
    * hana_rh2_version                : 2.00.062.00
    * hana_rh2_vhost                  : clusternode2
    * lpa_rh2_lpt                     : 30
    * master-SAPHana_RH2_02           : 100

Migration Summary:

Tickets:

PCSD Status:
  clusternode1: Online
  clusternode2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
See Check cluster status and Check database to verify that everything is working again.