Topic of the test
- The primary switches back to a cluster node.
- Recovery from the failure and enabling the cluster again.
- Re-registering the third site as a secondary site.

Test preconditions
- The SAP HANA primary is running on the third site.
- The cluster is partially running.
- The cluster is in maintenance-mode.
- The former cluster primary is accessible.

Test steps
- Check the expected primary of the cluster.
- Fail over from the DC3 node to the DC1 node.
- Check whether the former secondary has switched to the new primary.
- Re-register remotehost3 as the new secondary.
- Set the cluster property maintenance-mode=false; the cluster continues to work.

Monitoring the test
On the new primary, start:
remotehost3:rh2adm> watch python $DIR_EXECUTABLE/python_support/systemReplicationStatus.py
[root@clusternode1]# watch pcs status --full
On the secondary, start:
clusternode:rh2adm> watch hdbnsutil -sr_state
Starting the test
- Check the expected primary of the cluster: [root@clusternode1]# pcs resource
  The VIP and the promoted SAP HANA resource should run on the same node, which is the potential new primary.
- On this potential new primary, run: clusternode1:rh2adm> hdbnsutil -sr_takeover
- Re-register the former primary as the new secondary:
  remotehost3:rh2adm> hdbnsutil -sr_register \
  --remoteHost=clusternode1 \
  --remoteInstance=${TINSTANCE} \
  --replicationMode=syncmem \
  --name=DC3 \
  --remoteName=DC1 \
  --operationMode=logreplay \
  --force_full_replica \
  --online
- After setting maintenance-mode=false, the cluster continues to work.
Expected results
- SAP HANA starts on the new primary.
- The replication status shows all 3 sites replicating.
- The second cluster site re-registers automatically with the new primary.
- The disaster recovery (DR) site becomes an additional replica of the database.

Way to return to the initial state
- Run test 3.
Detailed description
Check whether the cluster is set to maintenance-mode:
[root@clusternode1]# pcs property config maintenance-mode
Cluster Properties:
maintenance-mode: true
If maintenance-mode is not true, you can set it with:
[root@clusternode1]# pcs property set maintenance-mode=true
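If you script this check, a small sketch like the following sets maintenance-mode only when it is not already active, using just the two pcs commands shown above (assuming it runs as root on a cluster node):
# Sketch: enable maintenance-mode only if it is not already set.
if ! pcs property config maintenance-mode | grep -q 'maintenance-mode: true'; then
    pcs property set maintenance-mode=true
fi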
Check the system replication status and discover the primary database on all nodes. First discover the primary database with:
clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
The output should look like the following.
On clusternode1:
clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3
On clusternode2:
clusternode2:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3
On remotehost3:
remotehost3:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: primary
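To avoid logging in to each node separately, the same discovery can be scripted; this is only a sketch and assumes password-less ssh for the rh2adm user, whose non-interactive shell must provide hdbnsutil in the PATH:
# Sketch: print the replication mode and primary master of every node.
for node in clusternode1 clusternode2 remotehost3; do
    echo "== ${node} =="
    ssh rh2adm@${node} hdbnsutil -sr_state | egrep -e '^mode:|primary masters'
done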
On all three nodes, the primary database is remotehost3. On this primary, you must ensure that the system replication status is ACTIVE for all three nodes, with a return code of 15:
remotehost3:rh2adm> python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py
|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |remotehost3 |30201 |nameserver | 1 | 3 |DC3 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30207 |xsengine | 2 | 3 |DC3 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30203 |indexserver | 3 | 3 |DC3 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|SYSTEMDB |remotehost3 |30201 |nameserver | 1 | 3 |DC3 |clusternode1 | 30201 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30207 |xsengine | 2 | 3 |DC3 |clusternode1 | 30207 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |remotehost3 |30203 |indexserver | 3 | 3 |DC3 |clusternode1 | 30203 | 1 |DC1 |YES |SYNCMEM |ACTIVE | | True |
status system replication site "2": ACTIVE
status system replication site "1": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 3
site name: DC3
[rh2adm@remotehost3: python_support]# echo $?
15
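The return code of systemReplicationStatus.py is what this test relies on: 15 means the overall replication status is ACTIVE, while 11 indicates an error (as seen later when DC3 is unregistered). A minimal sketch that turns the code into a verdict, run as rh2adm on the primary:
# Sketch: evaluate the systemReplicationStatus.py return code.
python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py > /dev/null
case $? in
    15) echo 'replication ACTIVE' ;;
    11) echo 'replication ERROR' ;;
    *)  echo 'replication not (yet) active' ;;
esac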
Run hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode on all three nodes:
clusternode1:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
clusternode2:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
remotehost3:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
The output should be identical on all nodes:
siteReplicationMode/DC1=primary
siteReplicationMode/DC3=async
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
On clusternode1, start:
clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
On remotehost3, start:
remotehost3:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
On clusternode2, start:
clusternode2:rh2adm> watch "hdbnsutil -sr_state --sapcontrol=1 |grep siteReplicationMode"
Now start the takeover to clusternode1. On clusternode1, run:
clusternode1:rh2adm> hdbnsutil -sr_takeover
done.
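Whether the takeover really made clusternode1 the primary can also be verified directly; a quick sketch, run as rh2adm on clusternode1:
# Sketch: confirm that this node is now the system replication primary.
hdbnsutil -sr_state | grep -q '^mode: primary' && echo 'takeover complete' || echo 'still secondary'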
The monitor on clusternode1 will change to:
Every 2.0s: python systemReplicationStatus.py; echo $? clusternode1: Mon Sep 4 23:34:30 2023
|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
15
Also important is the return code of 15. The monitor on clusternode2 will change to:
Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode clusternode2: Mon Sep 4 23:35:18 2023
siteReplicationMode/DC1=primary
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC2=logreplay
DC3 has vanished and needs to be re-registered. On remotehost3, systemReplicationStatus reports an error, and the return code changes to 11.
Enable the vip resource:
[root@clusternode1]# pcs resource enable vip_RH2_02_MASTER
Warning: 'vip_RH2_02_MASTER' is unmanaged
The warning is expected because the cluster will not start any resources as long as maintenance-mode is not set back to false.
Before we unset maintenance-mode, start two monitors in separate windows to watch the changes. On clusternode2, run:
[root@clusternode2]# watch pcs status --full
On clusternode1, run:
clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"
Now you can unset maintenance-mode by running the following on clusternode1:
[root@clusternode1]# pcs property set maintenance-mode=false
The monitor on clusternode2 should show that everything is now running as expected:
Every 2.0s: pcs status --full clusternode1: Tue Sep 5 00:01:17 2023
Cluster name: cluster1
Cluster Summary:
* Stack: corosync
* Current DC: clusternode1 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
* Last updated: Tue Sep 5 00:01:17 2023
* Last change: Tue Sep 5 00:00:30 2023 by root via crm_attribute on clusternode1
* 2 nodes configured
* 6 resource instances configured
Node List:
* Online: [ clusternode1 (1) clusternode2 (2) ]
Full List of Resources:
* auto_rhevm_fence1 (stonith:fence_rhevm): Started clusternode1
* Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
* SAPHanaTopology_RH2_02 (ocf::heartbeat:SAPHanaTopology): Started clusternode2
* SAPHanaTopology_RH2_02 (ocf::heartbeat:SAPHanaTopology): Started clusternode1
* Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
* SAPHana_RH2_02 (ocf::heartbeat:SAPHana): Slave clusternode2
* SAPHana_RH2_02 (ocf::heartbeat:SAPHana): Master clusternode1
* vip_RH2_02_MASTER (ocf::heartbeat:IPaddr2): Started clusternode1
Node Attributes:
* Node: clusternode1 (1):
* hana_rh2_clone_state : PROMOTED
* hana_rh2_op_mode : logreplay
* hana_rh2_remoteHost : clusternode2
* hana_rh2_roles : 4:P:master1:master:worker:master
* hana_rh2_site : DC1
* hana_rh2_sra : -
* hana_rh2_srah : -
* hana_rh2_srmode : syncmem
* hana_rh2_sync_state : PRIM
* hana_rh2_version : 2.00.062.00
* hana_rh2_vhost : clusternode1
* lpa_rh2_lpt : 1693872030
* master-SAPHana_RH2_02 : 150
* Node: clusternode2 (2):
* hana_rh2_clone_state : DEMOTED
* hana_rh2_op_mode : logreplay
* hana_rh2_remoteHost : clusternode1
* hana_rh2_roles : 4:S:master1:master:worker:master
* hana_rh2_site : DC2
* hana_rh2_sra : -
* hana_rh2_srah : -
* hana_rh2_srmode : syncmem
* hana_rh2_sync_state : SOK
* hana_rh2_version : 2.00.062.00
* hana_rh2_vhost : clusternode2
* lpa_rh2_lpt : 30
* master-SAPHana_RH2_02 : 100
Migration Summary:
Tickets:
PCSD Status:
clusternode1: Online
clusternode2: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
After manual interactions, it is best to clean up the cluster, as described in Cluster Cleanup.
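Such a cleanup typically clears the collected failures and operation history of the affected resources; a sketch using the resource name from this example:
# Sketch: reset the collected state of the SAPHana clone after manual changes.
[root@clusternode1]# pcs resource cleanup SAPHana_RH2_02-clone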
- Re-register remotehost3 with the new primary on clusternode1.
Remotehost3 needs to be re-registered. To monitor the progress, start on clusternode1:
clusternode1:rh2adm> watch -n 5 'python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?'
On remotehost3, start:
remotehost3:rh2adm> watch 'hdbnsutil -sr_state --sapcontrol=1 |grep siteReplicationMode'
Now you can re-register remotehost3 with this command:
remotehost3:rh2adm> hdbnsutil -sr_register --remoteHost=clusternode1 --remoteInstance=${TINSTANCE} --replicationMode=async --name=DC3 --remoteName=DC1 --operationMode=logreplay --online
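A freshly registered replica can take a while to synchronize. Instead of watching manually, you can also poll until the status script reports ACTIVE again; a sketch, run as rh2adm on clusternode1:
# Sketch: block until systemReplicationStatus.py returns 15 (ACTIVE).
until python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py > /dev/null; test $? -eq 15; do
    sleep 10
done
echo 'all secondaries ACTIVE'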
The monitor on clusternode1 will change to:
Every 5.0s: python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $? clusternode1: Tue Sep 5 00:14:40 2023
|Database |Host |Port |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary |Replication |Replication |Replication |Secondary |
| | | | | | | |Host |Port |Site ID |Site Name |Active Status |Mode |Status |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |remotehost3 | 30201 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |remotehost3 | 30207 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |remotehost3 | 30203 | 3 |DC3 |YES |ASYNC |ACTIVE | | True |
|SYSTEMDB |clusternode1 |30201 |nameserver | 1 | 1 |DC1 |clusternode2 | 30201 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30207 |xsengine | 2 | 1 |DC1 |clusternode2 | 30207 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
|RH2 |clusternode1 |30203 |indexserver | 3 | 1 |DC1 |clusternode2 | 30203 | 2 |DC2 |YES |SYNCMEM |ACTIVE | | True |
status system replication site "3": ACTIVE
status system replication site "2": ACTIVE
overall system replication status: ACTIVE
Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode: PRIMARY
site id: 1
site name: DC1
Status 15
The monitor on remotehost3 will change to:
Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode remotehost3: Tue Sep 5 02:15:28 2023
siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
Now we have 3 entries again, and remotehost3 (DC3) is once more a secondary site replicating from clusternode1 (DC1).
- Check whether all nodes are part of the system replication status on clusternode1.
Run hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode on all three nodes:
clusternode1:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
clusternode2:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
remotehost3:rh2adm> hdbnsutil -sr_state --sapcontrol=1 | grep site.*Mode
On all nodes, we should get the same output:
siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay
- Check pcs status --full for sync_state SOK. Run:
[root@clusternode1]# pcs status --full | grep sync_state
The output should show PRIM or SOK:
* hana_rh2_sync_state : PRIM
* hana_rh2_sync_state : SOK
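When scripting this verification, any sync_state other than PRIM or SOK means the replication pair is not healthy; a minimal sketch based on the grep above:
# Sketch: fail if a node reports a sync_state that is neither PRIM nor SOK.
if pcs status --full | grep sync_state | grep -qvE 'PRIM|SOK'; then
    echo 'unexpected sync_state found' >&2
    exit 1
fi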
Finally, the cluster status should look like the following, including the sync_state values PRIM and SOK:
[root@clusternode1]# pcs status --full
Cluster name: cluster1
Cluster Summary:
* Stack: corosync
* Current DC: clusternode1 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
* Last updated: Tue Sep 5 00:18:52 2023
* Last change: Tue Sep 5 00:16:54 2023 by root via crm_attribute on clusternode1
* 2 nodes configured
* 6 resource instances configured
Node List:
* Online: [ clusternode1 (1) clusternode2 (2) ]
Full List of Resources:
* auto_rhevm_fence1 (stonith:fence_rhevm): Started clusternode1
* Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
* SAPHanaTopology_RH2_02 (ocf::heartbeat:SAPHanaTopology): Started clusternode2
* SAPHanaTopology_RH2_02 (ocf::heartbeat:SAPHanaTopology): Started clusternode1
* Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
* SAPHana_RH2_02 (ocf::heartbeat:SAPHana): Slave clusternode2
* SAPHana_RH2_02 (ocf::heartbeat:SAPHana): Master clusternode1
* vip_RH2_02_MASTER (ocf::heartbeat:IPaddr2): Started clusternode1
Node Attributes:
* Node: clusternode1 (1):
* hana_rh2_clone_state : PROMOTED
* hana_rh2_op_mode : logreplay
* hana_rh2_remoteHost : clusternode2
* hana_rh2_roles : 4:P:master1:master:worker:master
* hana_rh2_site : DC1
* hana_rh2_sra : -
* hana_rh2_srah : -
* hana_rh2_srmode : syncmem
* hana_rh2_sync_state : PRIM
* hana_rh2_version : 2.00.062.00
* hana_rh2_vhost : clusternode1
* lpa_rh2_lpt : 1693873014
* master-SAPHana_RH2_02 : 150
* Node: clusternode2 (2):
* hana_rh2_clone_state : DEMOTED
* hana_rh2_op_mode : logreplay
* hana_rh2_remoteHost : clusternode1
* hana_rh2_roles : 4:S:master1:master:worker:master
* hana_rh2_site : DC2
* hana_rh2_sra : -
* hana_rh2_srah : -
* hana_rh2_srmode : syncmem
* hana_rh2_sync_state : SOK
* hana_rh2_version : 2.00.062.00
* hana_rh2_vhost : clusternode2
* lpa_rh2_lpt : 30
* master-SAPHana_RH2_02 : 100
Migration Summary:
Tickets:
PCSD Status:
clusternode1: Online
clusternode2: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
Refer to Check cluster status and Check database to verify that everything works again.