Chapter 10. Troubleshooting


10.1. The srHook cluster attribute value is incorrect

When the srHook attribute value does not match the actual HANA system replication status, the cluster can behave unexpectedly when the primary instance fails.

Check and correct your sudo configuration if the srHook attribute of the secondary site and the HANA system replication status do not match in any of the following ways:

  • The srHook cluster attribute of the secondary is empty.
  • The srHook cluster attribute of the secondary is set to SOK while the HANA system replication is not healthy.
  • The srHook cluster attribute of the secondary is set to SFAIL while the system replication is in ACTIVE state.

The primary site receives the events of HANA system replication changes and stores the result as a cluster attribute for the secondary site.

Procedure

  1. Check the secure log for crm_attribute update errors, because the hook script runs the command through sudo. The log shows the command that the hook script tried to execute and whether sudo rejected it. On the primary instance node, look for a command not allowed error, as in this example:

    [root]# grep crm_attribute /var/log/secure
    ... rh1adm : command not allowed ; PWD=/hana/shared/RH1/HDB02/<node> ; USER=root ; COMMAND=/usr/sbin/crm_attribute -n hana_rh1_site_srHook_DC2 -v SFAIL -t crm_config -s SAPHanaSR
  2. Compare the logged COMMAND with your sudoers configuration and correct the sudoers file so that a sudo entry matches the command exactly. As a temporary measure, you can replace the command parameters with a wildcard. This rules out typos in the parameters as the cause and confirms that the sudo entry itself works:

    [root]# cat /etc/sudoers.d/20-saphana
    Defaults:<sid>adm !requiretty
    <sid>adm ALL=(ALL) NOPASSWD: /usr/sbin/crm_attribute *
    • Replace <sid> with your lower-case HANA SID.
  3. Verify that the command path is correct:

    [root]# ls /usr/sbin/crm_attribute
    /usr/sbin/crm_attribute
  4. Fix the sudo configuration. For more information, see Configuring the HanaSR HA/DR provider for the srConnectionChanged() hook method.
  5. Repeat the fixes on all nodes. The sudo configuration must be identical on all instances.
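
The log check in step 1 can be scripted. The following is a minimal sketch and not part of the product: the function name denied_sudo_commands is hypothetical, and it assumes the log line format shown in the example above, where the rejected command follows COMMAND= at the end of the line.

```shell
# Hypothetical helper: print each distinct command line that sudo rejected.
# Assumes the /var/log/secure line format shown in step 1:
#   ... command not allowed ; ... ; COMMAND=<command line>
denied_sudo_commands() {
    grep 'command not allowed' \
        | sed -n 's/.*COMMAND=//p' \
        | sort -u
}

# Usage, as root on the primary instance node:
#   denied_sudo_commands < /var/log/secure
```

Each printed command line must be covered by a sudoers entry for the <sid>adm user. After you edit the file, running visudo -c -f /etc/sudoers.d/20-saphana checks its syntax, and sudo -l -U <sid>adm lists the commands that the user is allowed to run.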

10.2. The HANA instance does not start after hook changes

After you change an HA/DR provider section in the global.ini file, the HANA instance no longer starts.

Procedure

  1. As the <sid>adm user, change to the HANA trace log directory:

    rh1adm $ cdtrace
  2. Check for errors related to the HA/DR providers in the HANA nameserver process alert log:

    rh1adm $ grep ha_dr_provider nameserver_alert_*.trc
    ... ha_dr_provider   PythonProxyImpl.cpp(00145) : import of hanasr failed: No module named 'hanasr'
    ... ha_dr_provider   HADRProviderManager.cpp(00100) : could not load HA/DR Provider 'hanasr' from /usr/share/sap-hana-ha/
  3. Identify the root cause, for example a misspelled HA/DR provider name or a wrong path. Check the path and the hook script name. In this example, the HA/DR provider name hanasr does not match the hook script name HanaSR:

    rh1adm $ ls /usr/share/sap-hana-ha/
    ChkSrv.py  HanaSR.py  samples
  4. Correct the HanaSR HA/DR provider configuration:

    [ha_dr_provider_hanasr]
    provider = HanaSR
    path = /usr/share/sap-hana-ha/
    execution_order = 1
    • provider must match the name of the Python hook script. The value is case-sensitive and must not include the .py file suffix.
    • path must be the path in which the hook script is stored.
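
The two rules above can be checked before you restart the instance. The following is a minimal sketch under stated assumptions: the function name provider_script_exists is hypothetical, and you pass it the provider and path values from global.ini by hand. It compares file names as strings, so the check is case-sensitive.

```shell
# Hypothetical helper: succeed only if <path>/<provider>.py exists
# with exactly matching case.
#   $1 = provider value from global.ini
#   $2 = path value from global.ini
provider_script_exists() {
    provider=$1
    dir=${2%/}   # drop a trailing slash, if present
    for script in "$dir"/*.py; do
        # Compare names as strings so that hanasr does not match HanaSR.
        [ "$(basename "$script" .py)" = "$provider" ] && return 0
    done
    return 1
}

# Usage:
#   provider_script_exists HanaSR /usr/share/sap-hana-ha/ && echo "match"
```

With the directory listing from step 3, the value HanaSR passes the check and the value hanasr fails it.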

10.3. A cluster node is reported as offline during maintenance

When maintenance-mode is set for the cluster, for example for a HANA update, the cluster still detects issues between the nodes but does not trigger recovery actions until maintenance mode is removed.

If you encounter such a situation, you must fix the cause of the issue before you remove maintenance mode.

Example: the corosync communication between the nodes is blocked in a 2-node cluster

Both nodes report the other node as offline. If the maintenance mode is removed in this situation, the cluster tries to recover by fencing one node. This can have a severe impact on your ongoing HANA maintenance activity.

...
               *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

Node List:
  * Node hana1 (1): online, feature set 3.19.0
  * Node hana2 (2): UNCLEAN (offline)

Full List of Resources:
  * Clone Set: cln_SAPHanaTop_RH1_HDB02 [rsc_SAPHanaTop_RH1_HDB02] (maintenance):
    * rsc_SAPHanaTop_RH1_HDB02  (ocf:heartbeat:SAPHanaTopology):         Started hana2 (UNCLEAN, maintenance)
    * rsc_SAPHanaTop_RH1_HDB02  (ocf:heartbeat:SAPHanaTopology):         Started hana1 (maintenance)
  * Clone Set: cln_SAPHanaCon_RH1_HDB02 [rsc_SAPHanaCon_RH1_HDB02] (promotable, maintenance):
    * rsc_SAPHanaCon_RH1_HDB02  (ocf:heartbeat:SAPHanaController):       Unpromoted hana2 (UNCLEAN, maintenance)
    * rsc_SAPHanaCon_RH1_HDB02  (ocf:heartbeat:SAPHanaController):       Promoted hana1 (maintenance)
  * Clone Set: cln_SAPHanaFil_RH1_HDB02 [rsc_SAPHanaFil_RH1_HDB02] (maintenance):
    * rsc_SAPHanaFil_RH1_HDB02  (ocf:heartbeat:SAPHanaFilesystem):       Started hana2 (UNCLEAN, maintenance)
    * rsc_SAPHanaFil_RH1_HDB02  (ocf:heartbeat:SAPHanaFilesystem):       Started hana1 (maintenance)
  * rsc_vip_RH1_HDB02_primary   (ocf:heartbeat:IPaddr2):         Started hana1 (maintenance)
  * rsc_vip_RH1_HDB02_readonly  (ocf:heartbeat:IPaddr2):         Started hana2 (UNCLEAN, maintenance)

...

Identify the root cause of the issue, for example:

  • Planned network maintenance on the cluster communication connection in parallel to your HANA maintenance.
  • Unplanned outage of network connections due to network device failures or misconfiguration at the operating system or network level.
  • Firewall configuration blocking cluster communication ports.

Fix every such issue before you remove maintenance mode, so that the cluster does not take recovery measures when it resumes resource management.
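
Before you remove maintenance mode, you can confirm that no node is still reported as offline or unclean. The following is a minimal sketch, not a product tool: the function name nodes_not_online is hypothetical, and it assumes status output with node lines in the "* Node <name> (<id>): <state>" format shown in the example above, for example as printed by pcs status --full.

```shell
# Hypothetical helper: print the names of nodes not reported as online.
# Reads cluster status text on standard input and assumes node lines of
# the form "* Node <name> (<id>): <state>", as in the example output.
nodes_not_online() {
    awk '$1 == "*" && $2 == "Node" && $5 !~ /^online/ { print $3 }'
}

# Usage (assumes the status output uses the format above):
#   pcs status --full | nodes_not_online
```

For the example output above, the helper prints hana2. Empty output means that every listed node is reported as online.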
