Chapter 9. Troubleshooting
9.1. SAPInstance resources time out during monitor operations
The monitor operation of a cluster resource can fail when the application does not respond within the defined timeout.
For application resources like SAPInstance, one reason for such a timeout could be that the underlying filesystem on which the application is running is not responding at that moment. When your SAP instances are running on a shared NFS filesystem, a delay in the filesystem response can affect all instances on that filesystem on different cluster nodes. This can happen, for example, when the NFS server is temporarily overloaded or when there is maintenance ongoing on the NFS infrastructure or the related network connection.
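One rough way to check whether the filesystem itself is slow is to time a simple metadata call against the instance directory. The following is a minimal sketch and not part of the SAPInstance resource agent: the function name fs_responds and the example path /usr/sap/S4H/ASCS20 are assumptions for illustration, and the sketch relies on the coreutils timeout and stat commands being available.

```shell
# Check whether a path answers a stat call within a time limit.
# Usage: fs_responds <path> [limit_seconds]   (default limit: 10)
fs_responds() {
    path=$1
    limit=${2:-10}
    if timeout "$limit" stat "$path" >/dev/null 2>&1; then
        echo "OK: $path responded within ${limit}s"
        return 0
    else
        rc=$?
        # timeout exits with status 124 when the limit is reached
        if [ "$rc" -eq 124 ]; then
            echo "TIMEOUT: $path did not respond within ${limit}s"
        else
            echo "ERROR: stat on $path failed with status $rc"
        fi
        return 1
    fi
}

# Example call (hypothetical instance directory on the shared NFS filesystem):
# fs_responds /usr/sap/S4H/ASCS20 10
```

Running the check on every cluster node at the time of a monitor failure can help to tell a filesystem delay apart from a problem in the SAP instance itself.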
Commands that the SAPInstance resource runs during a monitor operation:
- systemctl status <instance_service>, when the instance is systemd-enabled.
- pgrep, to check for running processes related to the instance.
- sapcontrol, for different functions, like GetProcessList to get the list of instance components and compare them to the MONITOR_SERVICES list.
- sapstartsrv, to start the SAP start service of the instance, if it is not running.
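To find out which of these commands is slow, you can run them manually with a time limit and compare the durations. The following is a minimal sketch under stated assumptions: time_cmd is a hypothetical helper name, the systemd unit name SAPS4H_20.service and the instance number 20 are example values derived from the ASCS20 instance used in this chapter, and the resource agent itself may call the commands with different arguments.

```shell
# Run a command with a time limit and print its duration and exit status.
# Usage: time_cmd <limit_seconds> <command> [args...]
time_cmd() {
    limit=$1
    shift
    start=$(date +%s)
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    # rc=124 means that the time limit was reached
    echo "elapsed=$((end - start))s rc=$rc cmd=$*"
    return "$rc"
}

# Example calls (assumed unit name and instance number):
# time_cmd 60 systemctl status SAPS4H_20.service
# time_cmd 60 sapcontrol -nr 20 -function GetProcessList
```

A non-zero rc is not always an error. For example, sapcontrol -function GetProcessList typically returns 3 when all processes of the instance are running. The elapsed value is the relevant part when you compare it with the monitor timeout of the resource.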
You can increase the monitor operation timeouts of your resources to allow for lags in monitoring responses.
Procedure
- Review the current settings of the affected resource, for example, the SAPInstance resource of your ASCS instance. The default monitor timeout of the SAPInstance resource is 60 seconds:

  [root]# pcs resource config rsc_SAPInstance_S4H_ASCS20
  Resource: rsc_SAPInstance_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPInstance)
    Attributes: rsc_SAPInstance_S4H_ASCS20-instance_attributes
      InstanceName=S4H_ASCS20_s4hascs
      MINIMAL_PROBE=true
    Meta Attributes: rsc_SAPInstance_S4H_ASCS20-meta_attributes
      resource-stickiness=5000
    Operations:
      demote: rsc_SAPInstance_S4H_ASCS20-demote-interval-0s
        interval=0s timeout=320s
      methods: rsc_SAPInstance_S4H_ASCS20-methods-interval-0s
        interval=0s timeout=5s
      monitor: rsc_SAPInstance_S4H_ASCS20-monitor-interval-20
        interval=20 timeout=60 on-fail=restart
      promote: rsc_SAPInstance_S4H_ASCS20-promote-interval-0s
        interval=0s timeout=320s
      reload: rsc_SAPInstance_S4H_ASCS20-reload-interval-0s
        interval=0s timeout=320s
      start: rsc_SAPInstance_S4H_ASCS20-start-interval-0s
        interval=0s timeout=180s
      stop: rsc_SAPInstance_S4H_ASCS20-stop-interval-0s
        interval=0s timeout=240s

- Update the monitor timeout of the resource to a value that fits your environment and requirements. For example, increase the timeout to 120s:

  [root]# pcs resource update rsc_SAPInstance_S4H_ASCS20 op monitor timeout=120s

- Verify the updated resource settings:

  [root]# pcs resource config rsc_SAPInstance_S4H_ASCS20
  Resource: rsc_SAPInstance_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPInstance)
    Attributes: rsc_SAPInstance_S4H_ASCS20-instance_attributes
      InstanceName=S4H_ASCS20_s4hascs
      MINIMAL_PROBE=true
    Meta Attributes: rsc_SAPInstance_S4H_ASCS20-meta_attributes
      resource-stickiness=5000
    Operations:
      …
      monitor: rsc_SAPInstance_S4H_ASCS20-monitor-interval-60s
        interval=60s timeout=120s
      …
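If you check several resources or nodes, it can help to extract only the monitor timeout from the pcs resource config output. The following is a minimal sketch: monitor_timeout is a hypothetical helper name, and the parsing assumes that the timeout= option is printed either on the monitor line itself or on the line directly after it. The layout of the output differs between pcs versions, so verify the result against the full output.

```shell
# Print the timeout= value of each monitor operation found in
# `pcs resource config` output that is read from standard input.
monitor_timeout() {
    awk '
        # A monitor operation starts here; inspect this line and the next one.
        /^[ \t]*monitor[: ]/ { left = 2 }
        left > 0 {
            left--
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^timeout=/) {
                    # "timeout=" is 8 characters long, the value starts at 9
                    print substr($i, 9)
                    left = 0
                }
            }
        }
    '
}

# Example call:
# pcs resource config rsc_SAPInstance_S4H_ASCS20 | monitor_timeout
```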
9.2. Communication gets lost between SAP application servers
The connections between your SAP application servers can get closed when they are idle for some time. This depends on the network landscape of your systems.
If you encounter issues like the communication between applications getting lost, you can try to tune the keepalive settings in the application instances and in the OS.
Procedure
- Check the current values of the following TCP keepalive operating system settings on your SAP application servers, for example:

  [root]# sysctl -a --pattern net.ipv4.tcp_keepalive
  net.ipv4.tcp_keepalive_intvl = 75
  net.ipv4.tcp_keepalive_probes = 9
  net.ipv4.tcp_keepalive_time = 7200

- Temporarily update the TCP keepalive settings in a way that helps in your particular situation. The following are example values that SAP recommends in SAP Note 1410736 - TCP/IP: setting keepalive interval. Apply changes only as required for your environment and on all cluster nodes:

  [root]# sysctl -w \
  net.ipv4.tcp_keepalive_time=300 \
  net.ipv4.tcp_keepalive_intvl=75 \
  net.ipv4.tcp_keepalive_probes=9

- Test if the settings improve the situation.
- Make the changes permanent. Add the TCP keepalive settings to a sysctl configuration file for a persistent setup. Do this on every cluster node:

  [root]# cat << EOF >> /etc/sysctl.d/sap.conf
  net.ipv4.tcp_keepalive_time=300
  net.ipv4.tcp_keepalive_intvl=75
  net.ipv4.tcp_keepalive_probes=9
  EOF
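To judge what the keepalive values mean in practice, it helps to work out the resulting times. The kernel sends the first keepalive probe after tcp_keepalive_time seconds of idle time, and it treats the peer as unreachable after tcp_keepalive_probes unanswered probes that are sent tcp_keepalive_intvl seconds apart. The following minimal sketch does this arithmetic for the default values and the example values from the procedure above. The function name keepalive_detect_seconds is an assumption for illustration, and the result is an approximation.

```shell
# Approximate number of seconds until an unreachable peer is detected on an
# idle connection: idle time before the first probe plus all unanswered probes.
# Usage: keepalive_detect_seconds <time> <intvl> <probes>
keepalive_detect_seconds() {
    time=$1
    intvl=$2
    probes=$3
    echo $((time + intvl * probes))
}

keepalive_detect_seconds 7200 75 9   # default values: prints 7875
keepalive_detect_seconds 300 75 9    # example values: prints 975
```

With the example values, an idle connection sees keepalive traffic after 5 minutes instead of 2 hours, which can prevent network components from closing it as idle.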
Verification
Check the new values of the TCP keepalive operating system settings on all cluster nodes:
[root]# sysctl -a --pattern net.ipv4.tcp_keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 300
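When you repeat this verification on several nodes, a small helper can compare the sysctl output with the values that you expect. The following is a minimal sketch: check_keepalive is a hypothetical helper name, and the values 300, 75 and 9 in the example call are the example values used in this section.

```shell
# Compare "key = value" lines read from standard input with the expected
# TCP keepalive values. Returns a non-zero status if a value is missing
# or differs.
# Usage: sysctl -a --pattern net.ipv4.tcp_keepalive | check_keepalive <time> <intvl> <probes>
check_keepalive() {
    awk -v want_time="$1" -v want_intvl="$2" -v want_probes="$3" '
        BEGIN {
            prefix = "net.ipv4.tcp_keepalive_"
            want[prefix "time"] = want_time
            want[prefix "intvl"] = want_intvl
            want[prefix "probes"] = want_probes
            nkeys = split("time intvl probes", suffix, " ")
        }
        {
            # Accept both "key = value" and "key=value"
            if (split($0, kv, "=") != 2) next
            gsub(/[ \t]/, "", kv[1])
            gsub(/[ \t]/, "", kv[2])
            have[kv[1]] = kv[2]
        }
        END {
            bad = 0
            for (n = 1; n <= nkeys; n++) {
                key = prefix suffix[n]
                if ((key in have) && have[key] == want[key]) {
                    print "OK " key "=" have[key]
                } else {
                    print "MISMATCH " key " (expected " want[key] ")"
                    bad = 1
                }
            }
            exit bad
        }
    '
}

# Example call on a cluster node:
# sysctl -a --pattern net.ipv4.tcp_keepalive | check_keepalive 300 75 9
```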