5.6. テスト 4: プライマリーノードを 1 番目のサイトにフェイルバックする

PDF

テストの内容	プライマリーがクラスターノードに戻る。フェイルバックしてクラスターを再度有効にする。 3 番目のサイトをセカンダリーとして再登録する。
テストの前提条件	SAP HANA プライマリーノードが 3 番目のサイトで稼働している。クラスターが部分的に稼働している。クラスターが `maintenance-mode` になっている。以前のクラスタープライマリーにアクセスできる。
テストの手順	クラスターの予想されるプライマリーを確認します。 DC3 ノードから DC1 ノードにフェイルオーバーします。以前のセカンダリーが新しいプライマリーに切り替わったかどうかを確認します。 remotehost3 を新しいセカンダリーとして再登録します。クラスターを `maintenance-mode=false` に設定し、クラスターを引き続き動作させます。
テストのモニタリング	新しいプライマリーの起動時に、次のコマンドを実行します。 `remotehost3:rh2adm> watch python $DIR_EXECUTABLE/python_support/systemReplicationStatus.py [root@clusternode1]# watch pcs status --full on the secondary start: clusternode:rh2adm> watch hdbnsutil -sr_state`
テストの開始	クラスターの予想されるプライマリーを確認します： `[root@clusternode1]# pcs resource` VIP およびプロモートされた SAP HANA リソースを、新しい潜在的なプライマリーである同じノード上で実行する必要があります。この潜在的なプライマリー実行： `clusternode1:rh2adm> hdbnsutil -sr_takeover` 以前のプライマリーを新しいセカンダリーとして再登録します。 `clusternode1:rh2adm> hdbnsutil -sr_register \ --remoteHost=clusternode1 \ --remoteInstance=${TINSTANCE} \ --replicationMode=syncmem \ --name=DC3 \ --remoteName=DC1 \ --operationMode=logreplay \ --force_full_replica \ --online` `maintenance-mode=false` を設定すると、クラスターが引き続き動作します。
期待される結果	新しいプライマリーが SAP HANA を起動します。レプリケーションステータスで、3 つのサイトすべてがレプリケートされたことが示されます。 2 番目のクラスターサイトが、新しいプライマリーに自動的に再登録されます。 Disaster Recovery (DR)サイトは、データベースの追加レプリカになります。
初期状態に戻す方法	テスト 3 を実行します。

詳細な説明:

クラスターが maintenance-mode になっているか確認します。

[root@clusternode1]# pcs property config maintenance-mode
Cluster Properties:
 maintenance-mode: true

maintenance-mode が true でない場合は、次のように設定できます。

[root@clusternode1]# pcs property set  maintenance-mode=true

システムレプリケーションのステータスを確認し、すべてのノードでプライマリーデータベースを検出します。まず、次を使用してプライマリーデータベースを検出します。
```
clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
```

出力は、以下のようになります。

clusternode1 では：

clusternode1:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3

clusternode2 の場合：

clusternode2:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: syncmem
primary masters: remotehost3

remotehost3 では、以下を行います。

remotehost3:rh2adm> hdbnsutil -sr_state | egrep -e "^mode:|primary masters"
mode: primary

3 つのノードでは、プライマリーデータベースは remotehost3 です。このプライマリーデータベースでは、3 つすべてのノードでシステムレプリケーションのステータスが active であり、戻りコードが 15 である必要があります。

remotehost3:rh2adm> python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py
|Database |Host   |Port  |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary     |Replication |Replication |Replication    |Secondary    |
|         |       |      |             |          |        |          |Host      |Port      |Site ID   |Site Name |Active Status |Mode        |Status      |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |remotehost3 |30201 |nameserver   |        1 |      3 |DC3       |clusternode2    |    30201 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |remotehost3 |30207 |xsengine     |        2 |      3 |DC3       |clusternode2    |    30207 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |remotehost3 |30203 |indexserver  |        3 |      3 |DC3       |clusternode2    |    30203 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|SYSTEMDB |remotehost3 |30201 |nameserver   |        1 |      3 |DC3       |clusternode1    |    30201 |        1 |DC1       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |remotehost3 |30207 |xsengine     |        2 |      3 |DC3       |clusternode1    |    30207 |        1 |DC1       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |remotehost3 |30203 |indexserver  |        3 |      3 |DC3       |clusternode1    |    30203 |        1 |DC1       |YES           |SYNCMEM     |ACTIVE      |               |        True |

status system replication site "2": ACTIVE
status system replication site "1": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 3
site name: DC3
[rh2adm@remotehost3: python_support]# echo $?
15

3 つの sr_states がすべて一貫してあるかどうかを確認します。

3 つのノードすべてで hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode を実行してください。

clusternode1:rh2adm>hdbnsutil -sr_state --sapcontrol=1 |grep  site.*Mode

clusternode2:rh2adm>hsbnsutil -sr_state --sapcontrol=1 | grep site.*Mode

remotehost3:rh2adm>hsbnsutil -sr_state --sapcontrol=1 | grep site.*Mode

この出力はすべてのノードで同じである必要があります。

siteReplicationMode/DC1=primary
siteReplicationMode/DC3=async
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay

別のウィンドウでモニタリングを開始します。

clusternode1 が起動する場合は、以下のようになります。

clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"

remotehost3 の起動では、以下のようになります。

remotehost3:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo \$?"

clusternode2 が起動する場合：

clusternode2:rh2adm> watch "hdbnsutil -sr_state --sapcontrol=1 |grep  siteReplicationMode"

テストを開始します。

clusternode1 へのフェイルオーバーは、clusternode1 で起動するには、以下を実行します。

clusternode1:rh2adm> hdbnsutil -sr_takeover
done.

モニターの出力を確認します。

clusternode1 のモニターは次のように変更されます。

Every 2.0s: python systemReplicationStatus.py; echo $?                                                                                                                                                            clusternode1: Mon Sep  4 23:34:30 2023

|Database |Host   |Port  |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary     |Replication |Replication |Replication    |Secondary    |
|         |       |      |             |          |        |          |Host      |Port      |Site ID   |Site Name |Active Status |Mode        |Status      |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver   |        1 |      1 |DC1       |clusternode2    |    30201 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30207 |xsengine     |        2 |      1 |DC1       |clusternode2    |    30207 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30203 |indexserver  |        3 |      1 |DC1       |clusternode2    |    30203 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |

status system replication site "2": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 1
site name: DC1
15

重要点も戻りコード 15 です。clusternode2 のモニターは次のように変更されます。

Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep  site.*Mode                                                                clusternode2: Mon Sep  4 23:35:18 2023

siteReplicationMode/DC1=primary
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC2=logreplay

DC3 はなくなっており、再登録する必要があります。remotehost3 では、systemReplicationStatus がエラーを報告し、リターンコードが 11 に変わります。

クラスターノードが再登録されるかどうかを確認します。

clusternode1:rh2adm>  hdbnsutil -sr_state

System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~

online: true

mode: primary
operation mode: primary
site id: 1
site name: DC1

is source system: true
is secondary/consumer system: false
has secondaries/consumers attached: true
is a takeover active: false
is primary suspended: false

Host Mappings:
~~~~~~~~~~~~~~

clusternode1 -> [DC2] clusternode2
clusternode1 -> [DC1] clusternode1


Site Mappings:
~~~~~~~~~~~~~~
DC1 (primary/primary)
    |---DC2 (syncmem/logreplay)

Tier of DC1: 1
Tier of DC2: 2

Replication mode of DC1: primary
Replication mode of DC2: syncmem

Operation mode of DC1: primary
Operation mode of DC2: logreplay

Mapping: DC1 -> DC2
done.

Site Mappings が表示され、clusternode2 (DC2)が再登録されました。

vip リソースを確認または有効にします。

[root@clusternode1]# pcs resource
  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02] (unmanaged):
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode2 (unmanaged)
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode1
   (unmanaged)
  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable, unmanaged):
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Master clusternode2 (unmanaged)
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Slave clusternode1
   (unmanaged)
  * vip_RH2_02_MASTER   (ocf::heartbeat:IPaddr2):        Stopped (disabled, unmanaged)

vip リソース vip_RH2_02_MASTER が停止します。再度起動するには、以下のコマンドを実行します。

[root@clusternode1]# pcs resource enable vip_RH2_02_MASTER
Warning: 'vip_RH2_02_MASTER' is unmanaged

クラスターは maintenance-mode=false でない限りリソースを開始しないため、警告は右です。

クラスター maintenance-mode を停止します。

maintenance-mode を停止する前に、変更を確認するために、別のウィンドウで 2 つのモニターを起動する必要があります。clusternode2 で、以下を実行します。

[root@clusternode2]# watch pcs status --full

clusternode1 で、次を実行します。

clusternode1:rh2adm> watch "python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py; echo $?"

以下を実行して、clusternode1 の maintenance-mode の設定を解除できるようになりました。

[root@clusternode1]# pcs property set maintenance-mode=false

clusternode2 のモニターには、すべてが予想通りに実行されていることが表示されます。

Every 2.0s: pcs status --full                                                                                                                                                                                     clusternode1: Tue Sep  5 00:01:17 2023

Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode1
 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
  * Last updated: Tue Sep  5 00:01:17 2023
  * Last change:  Tue Sep  5 00:00:30 2023 by root via crm_attribute on clusternode1

  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ clusternode1
 (1) clusternode2 (2) ]

Full List of Resources:
  * auto_rhevm_fence1   (stonith:fence_rhevm):   Started clusternode1

  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode2
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode1

  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Slave clusternode2
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Master clusternode1

  * vip_RH2_02_MASTER   (ocf::heartbeat:IPaddr2):        Started clusternode1


Node Attributes:
  * Node: clusternode1
 (1):
    * hana_rh2_clone_state              : PROMOTED
    * hana_rh2_op_mode                  : logreplay
    * hana_rh2_remoteHost               : clusternode2
    * hana_rh2_roles                    : 4:P:master1:master:worker:master
    * hana_rh2_site                     : DC1
    * hana_rh2_sra                      : -
    * hana_rh2_srah                     : -
    * hana_rh2_srmode                   : syncmem
    * hana_rh2_sync_state               : PRIM
    * hana_rh2_version                  : 2.00.062.00
    * hana_rh2_vhost                    : clusternode1

    * lpa_rh2_lpt                       : 1693872030
    * master-SAPHana_RH2_02             : 150
  * Node: clusternode2 (2):
    * hana_rh2_clone_state              : DEMOTED
    * hana_rh2_op_mode                  : logreplay
    * hana_rh2_remoteHost               : clusternode1

    * hana_rh2_roles                    : 4:S:master1:master:worker:master
    * hana_rh2_site                     : DC2
    * hana_rh2_sra                      : -
    * hana_rh2_srah                     : -
    * hana_rh2_srmode                   : syncmem
    * hana_rh2_sync_state               : SOK
    * hana_rh2_version                  : 2.00.062.00
    * hana_rh2_vhost                    : clusternode2
    * lpa_rh2_lpt                       : 30
    * master-SAPHana_RH2_02             : 100

Migration Summary:

Tickets:

PCSD Status:
  clusternode1
: Online
  clusternode2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

手動との対話後、クラスターのクリーンアップで説明されているように、クラスターをクリーンアップするのに適したアドバイスが常にあります。???

remotehost3 を clusternode1 の新しいプライマリーに再登録します。

Remotehost3 を再登録する必要があります。進行状況を監視するには、clusternode1 で開始してください。

clusternode1:rh2adm> watch -n 5 'python
/usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?'

remotehost3で起動してください。

remotehost3:rh2adm> watch 'hdbnsutil -sr_state --sapcontrol=1 |grep  siteReplicationMode'

これで、以下のコマンドを使用して remotehost3 を再登録できるようになります。

remotehost3:rh2adm> hdbnsutil -sr_register --remoteHost=clusternode1 --remoteInstance=${TINSTANCE} --replicationMode=async --name=DC3 --remoteName=DC1 --operationMode=logreplay --online

clusternode1 のモニターは次のように変更されます。

Every 5.0s: python /usr/sap/$SAPSYSTEMNAME/HDB${TINSTANCE}/exe/python_support/systemReplicationStatus.py ; echo Status $?                                                                                         clusternode1: Tue Sep  5 00:14:40 2023

|Database |Host   |Port  |Service Name |Volume ID |Site ID |Site Name |Secondary |Secondary |Secondary |Secondary |Secondary     |Replication |Replication |Replication    |Secondary    |
|         |       |      |             |          |        |          |Host      |Port      |Site ID   |Site Name |Active Status |Mode        |Status      |Status Details |Fully Synced |
|-------- |------ |----- |------------ |--------- |------- |--------- |--------- |--------- |--------- |--------- |------------- |----------- |----------- |-------------- |------------ |
|SYSTEMDB |clusternode1 |30201 |nameserver   |        1 |      1 |DC1       |remotehost3    |    30201 |        3 |DC3       |YES           |ASYNC     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30207 |xsengine     |        2 |      1 |DC1       |remotehost3    |    30207 |        3 |DC3       |YES           |ASYNC     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30203 |indexserver  |        3 |      1 |DC1       |remotehost3    |    30203 |        3 |DC3       |YES           |ASYNC     |ACTIVE      |               |        True |
|SYSTEMDB |clusternode1 |30201 |nameserver   |        1 |      1 |DC1       |clusternode2    |    30201 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30207 |xsengine     |        2 |      1 |DC1       |clusternode2    |    30207 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |
|RH2      |clusternode1 |30203 |indexserver  |        3 |      1 |DC1       |clusternode2    |    30203 |        2 |DC2       |YES           |SYNCMEM     |ACTIVE      |               |        True |

status system replication site "3": ACTIVE
status system replication site "2": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 1
site name: DC1
Status 15

また、remotehost3 のモニターは次のように変更されます。

Every 2.0s: hdbnsutil -sr_state --sapcontrol=1 |grep  site.*Mode                                                                remotehost3: Tue Sep  5 02:15:28 2023

siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay

これで 3 つのエントリーが再度あり、remotehost3 (DC3)が clusternode1 (DC1)からレプリケートされたセカンダリーサイトになりました。

すべてのノードが clusternode1 のシステムレプリケーションステータスの一部であるかどうかを確認します。

3 つのノードすべてで hdbnsutil -sr_state --sapcontrol=1 |grep site.*Mode を実行してください。

clusternode1:rh2adm> hdbnsutil -sr_state --sapcontrol=1 |grep  site.*ModesiteReplicationMode

clusternode2:rh2adm> hsbnsutil -sr_state --sapcontrol=1 | grep site.*Mode

remotehost3:rh2adm> hsbnsutil -sr_state --sapcontrol=1 | grep site.*Mode

すべてのノードで、同じ出力を取得する必要があります。

siteReplicationMode/DC1=primary
siteReplicationMode/DC3=syncmem
siteReplicationMode/DC2=syncmem
siteOperationMode/DC1=primary
siteOperationMode/DC3=logreplay
siteOperationMode/DC2=logreplay

pcs status --full および SOK を確認します。以下を実行します。

[root@clusternode1]# pcs status --full| grep sync_state

出力は PRIM または SOK のいずれかである必要があります。

 * hana_rh2_sync_state             	: PRIM
    * hana_rh2_sync_state             	: SOK

最後に、クラスターのステータスは、sync_state PRIM および SOK を含むようになります。

[root@clusternode1]# pcs status --full
Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: clusternode1
 (1) (version 2.1.2-4.el8_6.6-ada5c3b36e2) - partition with quorum
  * Last updated: Tue Sep  5 00:18:52 2023
  * Last change:  Tue Sep  5 00:16:54 2023 by root via crm_attribute on clusternode1

  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ clusternode1
 (1) clusternode2 (2) ]

Full List of Resources:
  * auto_rhevm_fence1   (stonith:fence_rhevm):   Started clusternode1

  * Clone Set: SAPHanaTopology_RH2_02-clone [SAPHanaTopology_RH2_02]:
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode2
    * SAPHanaTopology_RH2_02    (ocf::heartbeat:SAPHanaTopology):        Started clusternode1

  * Clone Set: SAPHana_RH2_02-clone [SAPHana_RH2_02] (promotable):
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Slave clusternode2
    * SAPHana_RH2_02    (ocf::heartbeat:SAPHana):        Master clusternode1

  * vip_RH2_02_MASTER   (ocf::heartbeat:IPaddr2):        Started clusternode1


Node Attributes:
  * Node: clusternode1
 (1):
    * hana_rh2_clone_state              : PROMOTED
    * hana_rh2_op_mode                  : logreplay
    * hana_rh2_remoteHost               : clusternode2
    * hana_rh2_roles                    : 4:P:master1:master:worker:master
    * hana_rh2_site                     : DC1
    * hana_rh2_sra                      : -
    * hana_rh2_srah                     : -
    * hana_rh2_srmode                   : syncmem
    * hana_rh2_sync_state               : PRIM
    * hana_rh2_version                  : 2.00.062.00
    * hana_rh2_vhost                    : clusternode1

    * lpa_rh2_lpt                       : 1693873014
    * master-SAPHana_RH2_02             : 150
  * Node: clusternode2 (2):
    * hana_rh2_clone_state              : DEMOTED
    * hana_rh2_op_mode                  : logreplay
    * hana_rh2_remoteHost               : clusternode1

    * hana_rh2_roles                    : 4:S:master1:master:worker:master
    * hana_rh2_site                     : DC2
    * hana_rh2_sra                      : -
    * hana_rh2_srah                     : -
    * hana_rh2_srmode                   : syncmem
    * hana_rh2_sync_state               : SOK
    * hana_rh2_version                  : 2.00.062.00
    * hana_rh2_vhost                    : clusternode2
    * lpa_rh2_lpt                       : 30
    * master-SAPHana_RH2_02             : 100

Migration Summary:

Tickets:

PCSD Status:
  clusternode1
: Online
  clusternode2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

すべてが再び正常に動作することを確認するには、クラスターのステータスの確認およびデータベースの確認を参照してください。

5.6. テスト 4: プライマリーノードを 1 番目のサイトにフェイルバックする

詳細情報

試用、購入および販売

コミュニティー

Red Hat ドキュメントについて

多様性を受け入れるオープンソースの強化

会社概要

Red Hat legal and privacy links

Red Hat legal and privacy links