Deploying SAP NetWeaver or S/4HANA Application Server High Availability with simple mount


Red Hat Enterprise Linux for SAP Solutions 9

Creating an HA cluster for managing SAP NetWeaver or SAP S/4HANA application server instances with a simplified filesystem configuration using the SAPStartSrv cluster resource agent

Red Hat Customer Content Services

Abstract

Configure SAP NetWeaver or S/4HANA application server instances and install a Pacemaker cluster. Manage your SAP NetWeaver or S/4HANA application server instances more efficiently by using the high-availability components of RHEL for SAP Solutions.

Providing feedback on Red Hat documentation

We appreciate your feedback on our documentation. Let us know how we can improve it.

Submitting feedback through Jira (account required)

  1. Make sure you are logged in to the Jira website.
  2. Click on this link to provide feedback.
  3. Enter a descriptive title in the Summary field.
  4. Enter your suggestion for improvement in the Description field. Include links to the relevant parts of the documentation.
  5. Click Create at the bottom of the dialogue.

Chapter 1. Overview

Deploy two or more systems in a Pacemaker cluster and configure your SAP NetWeaver or S/4HANA application server instances in the cluster for advanced high availability (HA) of your applications. The cluster HA setup helps you manage your SAP services automatically in case of a failure.

1.1. Terminology

  • node

    One host or system in an HA cluster setup, also called a cluster member.

  • cluster

    The high-availability setup using the Pacemaker cluster manager from the RHEL HA Add-On. It consists of two or more members, or nodes.

  • instance

    One dedicated SAP application server installation.

1.2. SAP NetWeaver or S/4HANA High Availability

An SAP NetWeaver or S/4HANA environment as a high-availability system consists of a set of different instances. The central services instance (SCS or ASCS) and the database instances are single points of failure. Therefore, it is important that you configure an HA solution to protect these instances to avoid data loss or corruption and unnecessary outages of the SAP system.

For the application servers, the enqueue lock table, which is managed by the Standalone Enqueue Server in the ASCS instance, is the most critical component. To enhance resilience, the Enqueue Replication Server (ERS) keeps a backup copy of the Standalone Enqueue Server’s lock table. In an HA setup the ASCS instance runs on a different server than the ERS instance to keep the lock table and its backup separate.

The Standalone Enqueue Server 2 and Enqueue Replicator 2 are improved versions of the classic enqueue replication components.

Figure 1.1: SAP application server instances in an HA cluster setup using a shared filesystem for the instances

Standalone Enqueue Server (ENSA1)

SAP NetWeaver and SAP S/4HANA older than ABAP Platform 2020 support enqueue replication as ENSA1. When there is an issue with the ASCS instance in an ENSA1 setup, it is required that the ASCS instance recover next to the ERS instance. That means an HA cluster must start the ASCS instance on the host where the ERS instance is currently running. Using the classic enqueue replication components, this is necessary to restore the enqueue lock table from the backup managed by the ERS instance into the ASCS instance. See The SAP Lock Concept and Standalone Enqueue Server for more information on how the Standalone Enqueue Server (ENSA1) works.

Standalone Enqueue Server 2 (ENSA2)

Starting with SAP NetWeaver 7.52 and S/4HANA, the ENSA2 setup is supported. Starting with S/4HANA 1809, ENSA2 is the default installation. Contrary to the ENSA1 setup, the new Standalone Enqueue Server 2 does not have to follow the Enqueue Replicator 2 anymore. This means that the HA cluster can start the ASCS instance on any available cluster node, no matter on which node the ERS instance is running, because the new version restores the lock table through the network connection between the ASCS and ERS instances. The ENSA2 configuration enhances flexibility and allows configuring more than two HA cluster nodes for even higher resiliency. For more information on ENSA2, see SAP Note 2630416 - Support for Standalone Enqueue Server 2.

Using SAP S/4HANA, you can also configure a cost-optimized HA cluster. In such a setup, you configure the cluster for managing the HANA system replication as well as the ASCS and ERS instances. For more information, see Configuring a Cost-Optimized SAP S/4HANA HA cluster (HANA System Replication + ENSA2) using the RHEL HA Add-On.
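If you are unsure which enqueue server version an existing ASCS instance uses, you can, for example, check the instance start profile for the name of the enqueue server binary. This is a quick check that assumes default profile contents; a match on enserver indicates ENSA1, a match on enq_server indicates ENSA2:

[root]# grep -E 'enserver|enq_server' /sapmnt/<SID>/profile/<SID>_ASCS*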

Multi-SID support

You can manage the ASCS/ERS instances for multiple SAP environments (Multi-SID) within the same HA cluster. In this case, take the following additional considerations into account.

  • Unique SID and instance number

To avoid conflicts, you must install each pair of ASCS/ERS instances with a different SID. Additionally, each instance must have a unique instance number, even if the instances belong to different SIDs.

  • Sizing

Ensure that each HA cluster node meets the SAP requirements for sizing to support multiple instances. You can check SAP resources like Hardware Requirements or Sizing - Helping our Customers Determine Their Hardware Requirements.

1.3. Components for managing SAP application server instances

The following two packages provide the components for managing SAP application server instances in an HA cluster:

  • resource-agents-sap
  • sap-cluster-connector (only needed when the SAP HA Interface is used)
Note

You must use resource-agents-sap-4.15.1 or a newer version for the simplified filesystem configuration. Older versions do not provide the mandatory SAPStartSrv resource agent.
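For example, you can check the installed version with:

[root]# rpm -q resource-agents-sap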

The listed packages provide the following resource agents and additional tools for your setup:

  • SAPDatabase

    The SAPDatabase resource agent manages a legacy database for a SAP environment, like Oracle, IBM DB2, SAP ASE or SAP MaxDB. You can use this resource only in combination with a SAP NetWeaver setup.

  • SAPInstance

    The SAPInstance resource agent manages the SAP application server instances using the SAP Start Service that is part of the SAP Kernel. In addition to the ASCS, ERS, PAS, and AAS instances, it can also manage other SAP instance types, like standalone SAP Web Dispatcher or standalone SAP Gateway instances. See How to manage standalone SAP Web Dispatcher instances using the RHEL HA Add-On for information on how to configure a Pacemaker resource for managing such instances. The SAPInstance resource agent performs all of its operations through the SAP startup framework, and it communicates with the sapstartsrv process of each SAP instance for status information. sapstartsrv reports four status colors:

    Color    Meaning
    GREEN    Everything is fine.
    YELLOW   Something is wrong, but the service is still working.
    RED      The service does not work.
    GRAY     The service is stopped.

    The SAPInstance resource agent interprets GREEN and YELLOW as healthy, and it reports the statuses RED and GRAY as NOT_RUNNING to the cluster. The versions of the SAPInstance resource agent shipped with RHEL 9 also support SAP instances that are managed by the systemd-enabled SAP Startup framework. See The Systemd-Based SAP Startup Framework for further details.

  • SAPStartSrv

    The SAPStartSrv resource agent manages the sapstartsrv service for a given SAP application instance. It is responsible for starting, stopping and probing the service. Configure it without a recurring monitor operation to avoid resource group and instance failures. The SAPInstance resource automatically handles the recovery of a failed sapstartsrv process itself. The SAPStartSrv resource must be part of the instance resource group, and it must start before and stop after the SAPInstance resource.

  • sapping and sappong

    The sapping and sappong systemd services manage the visibility of the sapservices file during the system startup process. This mechanism prevents the sapinit startup script from automatically starting the SAP instance services when the instances are managed by the cluster. These two services are part of the resource-agents-sap package. The sapping systemd service runs before the sapinit startup script and renames the /usr/sap/sapservices file temporarily during the system startup to make it unavailable to sapinit. The sappong systemd service runs after the sapinit script and restores the /usr/sap/sapservices file to the original name to make it available again for manual control.

  • sap_cluster_connector

    The sap_cluster_connector tool connects the SAP HA interface with the Pacemaker cluster. The SAP application instance uses the tool to query the cluster for resource status information or to execute cluster commands for resource actions, like stopping a resource. Configure this interface for any individual instance that you configure in the cluster but also want to control using SAP tools. The sap_cluster_connector tool is optional and provided in the package sap-cluster-connector.
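For example, after you have configured the integration, you can verify from the SAP side that the HA interface is connected. HAGetFailoverConfig and HACheckConfig are standard sapcontrol functions; run them as the <sid>adm user against one of the cluster-managed instances:

<sid>adm $ sapcontrol -nr <instance> -function HAGetFailoverConfig
<sid>adm $ sapcontrol -nr <instance> -function HACheckConfig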

1.4. Support policies for SAP NetWeaver or S/4HANA

Red Hat supports the following components of the solution:

  • Basic operating system configuration for running SAP NetWeaver or S/4HANA on RHEL, based on SAP guidelines
  • RHEL HA Add-On
  • Red Hat HA solutions for SAP NetWeaver or S/4HANA

Chapter 2. Planning the HA cluster setup

Check the SAP documentation for Planning the Switchover Cluster for High Availability for more information about the HA concepts of SAP application servers.

2.1. Subscription and repository requirements

Dedicated repositories provide the solutions for SAP NetWeaver or S/4HANA in a Pacemaker cluster for High Availability (HA). You need the RHEL for SAP Solutions subscription to access all relevant content. This subscription includes the following repositories:

  • High Availability

    Name of the repository that contains the content for the RHEL HA Add-On in general. The repository ID is represented as rhel-9-for-<arch>-highavailability-e4s-rpms.

  • SAP NetWeaver

    Name of the repository that contains the SAP NetWeaver specific content. The repository ID is represented as rhel-9-for-<arch>-sap-netweaver-e4s-rpms.

The <arch> denotes the specific hardware architecture:

  • x86_64
  • ppc64le
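For example, you can enable the two repositories on an x86_64 system with subscription-manager. This is a sketch; adjust the architecture to your environment, and note that the e4s repositories additionally require locking the minor release with subscription-manager release --set:

[root]# subscription-manager repos \
--enable=rhel-9-for-x86_64-highavailability-e4s-rpms \
--enable=rhel-9-for-x86_64-sap-netweaver-e4s-rpms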

Example list of repositories enabled as part of the RHEL for SAP Solutions subscription:

[root]# dnf repolist
Updating Subscription Management repositories.
repo id                                     repo name
rhel-9-for-x86_64-appstream-e4s-rpms        Red Hat Enterprise Linux 9 for x86_64 - AppStream - Update Services for SAP Solutions (RPMs)
rhel-9-for-x86_64-baseos-e4s-rpms           Red Hat Enterprise Linux 9 for x86_64 - BaseOS - Update Services for SAP Solutions (RPMs)
rhel-9-for-x86_64-highavailability-e4s-rpms Red Hat Enterprise Linux 9 for x86_64 - High Availability - Update Services for SAP Solutions (RPMs)
rhel-9-for-x86_64-sap-netweaver-e4s-rpms    Red Hat Enterprise Linux 9 for x86_64 - SAP NetWeaver - Update Services for SAP Solutions (RPMs)
rhel-9-for-x86_64-sap-solutions-e4s-rpms    Red Hat Enterprise Linux 9 for x86_64 - SAP Solutions - Update Services for SAP Solutions (RPMs)

2.2. Operating system requirements

Check the SAP Note 3108316 - Red Hat Enterprise Linux 9.x: Installation and Configuration for operating system installation and configuration information to prepare the systems for SAP products on RHEL 9.

Deploy your host operating system as described in Installing RHEL 9 for SAP Solutions.

Root privileges

For the installation and the HA setup, you need the root user or a privileged user that can execute any command with sudo.

2.3. Storage requirements

Configure the directories used by a SAP NetWeaver or S/4HANA installation that is managed by the cluster according to the guidelines provided by SAP. See Required File Systems and Directories for more information about the filesystems and directory structure.

Note

If you plan to upgrade your RHEL 9 systems to later RHEL releases, we recommend not using GFS2 filesystems for a new setup. Support for the GFS2 filesystem is discontinued in RHEL 10.

Alternatively, you can plan a migration of the filesystem setup to a supported configuration as part of the future operating system upgrade.

See Resilient Storage Add-On will be discontinued with RHEL 10 for more information.
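To check whether any GFS2 filesystems are currently mounted on a node, you can run, for example:

[root]# findmnt -t gfs2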

Local storage

In the simplified filesystem configuration, most of the SAP content is stored on filesystems that are shared between the cluster nodes.

However, the following directory must be a local directory on all cluster nodes:

  • /usr/sap/

    This directory contains node-specific SAP system files, like sapservices or hostctrl. It contains unique files for each node in a distributed SAP environment, separate from instance-specific files. You can configure a dedicated filesystem for this or plan it as part of the parent / or /usr filesystem, depending on your standard operating system filesystem structure. It requires limited filesystem space, because the majority of the SAP instance files are managed in the instance-specific sub-directories on dedicated filesystems.
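For example, you can verify on which filesystem the /usr/sap/ directory itself resides. With the simplified filesystem setup, this must be a local filesystem, while only the shared sub-directories are NFS mounts:

[root]# findmnt -T /usr/sap/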

Shared storage

You must configure the following directories on dedicated Network File Systems (NFS) and share them between all servers running SAP instances of an SAP system:

  • /usr/sap/trans/

    The SAP Change and Transport System (CTS) uses the transport directory to organize development projects and move changes between SAP systems in your landscape. Configure this filesystem physically separate from the application server directories to ensure that it cannot affect the instances if it runs full.

  • /sapmnt/

    The /sapmnt/ content must be accessible on the cluster nodes but also on any other server that is running services that are part of the same SAP system. This includes the servers hosting the HANA DB instances or servers hosting additional application servers that are not managed by the cluster.

  • /usr/sap/<SID>/

    This directory is the parent of the instance-specific directories and other SAP system sub-directories associated with this particular SAP System ID (SID). Similar to /usr/sap/, it requires limited space because most data is in instance-specific subdirectories. Using the simplified filesystem setup for your SAP application server instances, you configure /usr/sap/<SID> to be statically mounted on all cluster nodes.

Note

Using the same host as an NFS server and as an NFS client that mounts the same NFS exports (loopback mounts) from this NFS server at the same time is not supported. See Support Policies for RHEL High Availability Clusters - Management of Highly Available Filesystem Mounts for more information.

2.4. HA cluster requirements

Fencing

For a supported HA cluster setup using the RHEL HA Add-on you must configure a fencing or STONITH device on each cluster node. Which fencing or STONITH device you can use depends on the platform the cluster is running on. Check the Support Policies for RHEL High Availability Clusters - Fencing/STONITH for recommendations on fencing agents or consult your hardware or cloud provider to find out which fence device is supported on their platform.

Note

Using fence_scsi or fence_mpath as the fencing/STONITH mechanism requires shared storage between the cluster nodes that is fully managed by the HA cluster. If your SAP environment does not include such a shared disk setup, using these fencing options is not supported.
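After you install the cluster packages as described in Chapter 5, you can, for example, list the fence agents that are available on your system, and display the details of a specific agent:

[root]# pcs stonith list
[root]# pcs stonith describe <fence_agent>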

Quorum

In general, a quorum device is recommended for clusters with an even number of nodes. Two-node clusters have a built-in mechanism that handles split-brain situations on its own, so for this specific case a quorum device is optional. Using a quorum device allows the cluster to better determine which node survives in a split-brain situation.

The options for setting up quorum devices vary depending on the platform, infrastructure and configuration.
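As an example, the following sketch adds a corosync-qdevice based quorum device to an existing cluster. It assumes that a separate host, <qdevice_host>, already runs the corresponding quorum device server (corosync-qnetd); see the RHEL HA Add-On documentation for the complete setup:

[root]# dnf install corosync-qdevice
[root]# pcs quorum device add model net host=<qdevice_host> algorithm=ffsplit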

2.5. SAP application server planning

To prepare the central services, enqueue replicator, and application server setup, compile a list of the parameters that are required for the installation and configuration of your planned environment.

  • ASCS/ERS instances

    ASCS and ERS are the main instances in the target HA configuration. Example configuration parameters for ASCS and ERS instances in a 2-node cluster:

    Parameter                      Example value
    cluster node1 FQDN             node1.example.com
    cluster node2 FQDN             node2.example.com
    SID                            S4H
    ASCS instance number           20
    ASCS virtual hostname          s4hascs
    ASCS virtual IP address        192.168.200.101
    ERS instance number            29
    ERS virtual hostname           s4hers
    ERS virtual IP address         192.168.200.102
    ASCS/ERS administrative user   s4hadm

  • Optional: Database instance

    With SAP NetWeaver you can also install a legacy database on the same nodes and configure this single database instance to be managed by the same cluster. The configuration of the database instance on this cluster is optional. Example configuration parameters for a legacy database instance:

    Parameter               Example value
    DB SID                  RH1
    DB virtual IP address   192.168.200.115

  • Optional: PAS instance

    You can configure the Primary Application Server (PAS) instance in the same HA cluster. The configuration of a PAS instance on the ASCS/ERS cluster is optional. Example configuration parameters for a Primary Application Server (PAS) instance:

    Parameter                Example value
    PAS instance number      21
    PAS virtual hostname     s4hpas
    PAS virtual IP address   192.168.200.103

  • Optional: AAS instance

    You can configure an Additional Application Server (AAS) instance in the same HA cluster. The configuration of an AAS instance on the ASCS/ERS cluster is optional. Example configuration parameters for an Additional Application Server (AAS) instance:

    Parameter                Example value
    AAS instance number      22
    AAS virtual hostname     s4haas
    AAS virtual IP address   192.168.200.104

Chapter 3. Preparing the systems and installing the SAP instances

3.1. Configuring the virtual IP addresses temporarily

Virtual IPs for the virtual instance hostnames are mandatory for highly available SAP application server setups. You must configure the virtual IP of each instance in a non-persistent way to make it available at instance installation time. Ensure that the IP configuration is only temporary, because the cluster later manages the same addresses as part of the instance resource groups.

Prerequisites

  • You have reserved IP addresses for the virtual hostname of each instance that you plan to configure in the cluster.

Procedure

  • Temporarily configure each IP on its initial target node, for example, add the IP for the ASCS instance on node1 and add the IP for the ERS instance on node2:

    [root]# ip address add <ip>/<netmask> dev <nic>
    • Replace <ip> with the virtual IP of the instance, for example, 192.168.200.101.
    • Replace <netmask> with the netmask of the subnet, for example, 32.
    • Replace <nic> with the network device name on which the IP should run, for example, eth0.

Verification

  • Check on each node that the virtual IP is up, for example:

    [root]# ip address show dev eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    …
        inet 192.168.200.101/32 scope global eth0
    …

3.2. Configuring the virtual instance hostnames

In an SAP environment you must configure virtual hostnames for the instances that you make highly available. See Using Virtual Host Names in the SAP documentation.

Prerequisites

  • You have reserved IP addresses for the virtual hostname of each instance that you plan to configure in the cluster.

Procedure

  • Add the virtual hostnames of the instances to all cluster nodes:

    [root]# cat /etc/hosts
    ...
    192.168.200.101 s4hascs.example.com s4hascs
    192.168.200.102 s4hers.example.com s4hers

    Also add the PAS or AAS virtual hostnames if you configure the application server instances on the same systems.

Verification

  • Check that you can ping the virtual hostnames if you have already configured the virtual IPs. This step is optional and only an example of a basic verification. The system resolves the entries in /etc/hosts when you use the ping command:

    [root]# ping s4hascs
    PING s4hascs.example.com (192.168.200.101) 56(84) bytes of data.
    64 bytes from s4hascs.example.com (192.168.200.101): icmp_seq=1 ttl=64 time=0.017 ms
    …

3.3. Configuring the shared SAP filesystems

You must configure the shared filesystems on all systems on which you plan to run SAP application server instances as part of the HA cluster.

Prerequisites

  • You have prepared the shared NFS-based filesystems, and all cluster nodes are able to access them. The NFS shares must be external and not exported on one of the cluster nodes.

Procedure

  1. Create the directories for the shared filesystems:

    [root]# mkdir -p /sapmnt/ /usr/sap/trans/ /usr/sap/<SID>
    • Replace <SID> with the SID of your planned instances.
  2. Add the shared NFS filesystems to /etc/fstab to mount them automatically on system start:

    [root]# vi /etc/fstab
    …
    <nfs_server>:/usr/sap/<SID> /usr/sap/<SID> nfs4 defaults 0 0
    <nfs_server>:/usr/sap/trans /usr/sap/trans nfs4 defaults 0 0
    <nfs_server>:/sapmnt /sapmnt nfs4 defaults 0 0
    • Replace <SID> with the SID of your planned instances.
    • Replace <nfs_server> with the NFS server DNS name or the IP address of each share, for example, nfs01-datacenter1a.example.com.
  3. Reload the systemd configuration to make the new /etc/fstab entries known to systemd:

    [root]# systemctl daemon-reload
  4. Mount any new filesystems that you configured in the /etc/fstab:

    [root]# mount -a
  5. Repeat the configuration steps on each cluster node.

Verification

  1. Check that the filesystems are mounted:

    [root]# df -hP | grep sap
    nfs01-datacenter1a.example.com:/sapmnt         8.0E  1.3G  8.0E   1% /sapmnt
    nfs01-datacenter1a.example.com:/usr/sap/trans  8.0E  1.3G  8.0E   1% /usr/sap/trans
    nfs01-datacenter1a.example.com:/usr/sap/S4H    8.0E  1.3G  8.0E   1% /usr/sap/S4H
  2. Check that the systemd mount targets exist for the filesystems configured in the /etc/fstab:

    [root]# systemctl list-units --all | grep -e '.*sap.*mount' | column -t
    sapmnt.mount         loaded  active  mounted  /sapmnt
    usr-sap-S4H.mount    loaded  active  mounted  /usr/sap/S4H
    usr-sap-trans.mount  loaded  active  mounted  /usr/sap/trans
  3. Repeat the verification steps on each cluster node. The results must be identical.

3.4. Configuring the SAP users and groups

In a high-availability environment where the application can move between different systems, you must configure the application user and groups with identical numerical values for their user ID (UID) and group ID (GID). Different IDs for the same application users or groups cause access conflicts and prevent you from switching the application between the cluster nodes.

Prepare the following operating system group:

  • sapsys

Prepare the following operating system users:

  • sapadm
  • <sid>adm, using your target SID

Prerequisites

  • You have reserved identical user and group IDs for the required groups and users, for example, in your central identity management system for application users.

Procedure

  1. Create the sapsys group. Use the prepared group ID, for example, ID 10001:

    [root]# groupadd -g 10001 sapsys
  2. Create the sapadm user as a member of the sapsys group. The user does not need a login shell. Use the prepared user ID, for example, ID 10200:

    [root]# useradd -u 10200 -g sapsys sapadm \
    -c 'SAP Local Administrator' -s /sbin/nologin
  3. Create the <sid>adm user as a member of the sapsys group. Use the prepared user ID, for example, ID 10201 for user s4hadm:

    [root]# useradd -u 10201 -g sapsys s4hadm \
    -c 'SAP System Administrator' -s /bin/sh

    As the user shell, we recommend that you use either /bin/sh or /bin/csh. SAP installations provide user profiles and useful shell aliases for these shells.

  4. Repeat the steps on all nodes.

Verification

  1. Check that the users sapadm and <sid>adm exist and have the correct groups and IDs configured, for example:

    [root]# id sapadm s4hadm
    uid=10200(sapadm) gid=10001(sapsys) groups=10001(sapsys)
    uid=10201(s4hadm) gid=10001(sapsys) groups=10001(sapsys)
  2. Check that the users have the correct description, home directory and shell defined:

    [root]# grep -E 'sapadm|s4hadm' /etc/passwd
    sapadm:x:10200:10001:SAP Local Administrator:/home/sapadm:/sbin/nologin
    s4hadm:x:10201:10001:SAP System Administrator:/home/s4hadm:/bin/sh
  3. Repeat the check on all nodes and verify that the names and IDs are identical.

3.5. Installing the ASCS instance

Install the SAP application server instance according to the SAP documentation. See Running Software Provisioning Manager for more details about the SAP software installation.

Prerequisites

  • You have installed and configured the HA cluster nodes according to the recommendations from SAP and Red Hat for running SAP application server instances on RHEL 9. See Operating system requirements.
  • You have configured the virtual IP address for the ASCS instance on node1.
  • You have mounted the following filesystems on the HA cluster node where you install the ASCS instance, for example, node1:

    • /sapmnt
    • /usr/sap/trans
    • /usr/sap/<SID>
  • The virtual hostname for the ASCS instance resolves to the reserved virtual IP of the ASCS instance on all nodes. You have tested with the command getent hosts <virtual_hostname>, for example, getent hosts s4hascs, and it must return the correct IP address.
  • You have the installation media available on the system.

Procedure

  1. On the node where you install the ASCS instance, go to the directory where you have extracted the installation media:

    [root]# cd <software_path>
    • Replace <software_path> with the path to your unpacked media, for example, /sapmedia/SWPM20_SP19/.
  2. Run the installer command and specify the virtual hostname of your ASCS instance, for example, s4hascs:

    [root]# ./sapinst SAPINST_USE_HOSTNAME=s4hascs
  3. Open the web installer UI using the link provided in the terminal.
  4. Open the SAP product you want to install and enter the installation option. Expand the High-Availability System option and select ASCS Instance for the installation of the ASCS instance. Click Next.
  5. Provide the requested installation information on each page and click Next to move forward.

    Some steps, like extracting SAP packages, can take a while. Keep an eye on the terminal in which you started the installer for details of the ongoing process that are not displayed in the web UI.

Verification

  1. Switch to the <sid>adm user:

    [root]# su - s4hadm
  2. Check the instance status. Ensure that the service status is GREEN for all service components:

    s4hadm $ sapcontrol -nr 20 -function GetProcessList
    1. The following example shows a minimal setup with ENSA1:

      name, description, dispstatus, textstatus, starttime, elapsedtime, pid
      msg_server, MessageServer, GREEN, Running, YYYY MM DD 15:34:05, 0:05:38, 45041
      enserver, EnqueueServer, GREEN, Running, YYYY MM DD 15:34:05, 0:05:38, 45042
    2. The following example shows a minimal setup with ENSA2:

      name, description, dispstatus, textstatus, starttime, elapsedtime, pid
      msg_server, MessageServer, GREEN, Running, YYYY MM DD 15:43:49, 0:01:51, 5460
      enq_server, Enqueue Server 2, GREEN, Running, YYYY MM DD 15:43:49, 0:01:51, 5461

3.6. Installing the ERS instance

Install the SAP application server instance according to the SAP documentation. See Running Software Provisioning Manager for more details about the SAP software installation.

Prerequisites

  • You have installed and configured the ASCS instance on the first node, for example, node1.
  • You have configured the virtual IP address for the ERS instance on the second node, for example, node2.
  • You have mounted the following filesystems on node2:

    • /sapmnt
    • /usr/sap/trans
    • /usr/sap/<SID>
  • The virtual hostname for the ERS instance resolves to the reserved virtual IP of the ERS instance on all nodes. You have tested with the command getent hosts <virtual_hostname>, for example, getent hosts s4hers, and it must return the correct IP address.
  • You have the installation media available on the system.

Procedure

  1. On the node where you install the ERS instance, go to the directory where you have extracted the installation media:

    [root]# cd <software_path>
    • Replace <software_path> with the path to your unpacked media, for example, /sapmedia/SWPM20_SP19/.
  2. Run the installer command and specify the virtual hostname of your ERS instance, for example, s4hers:

    [root]# ./sapinst SAPINST_USE_HOSTNAME=s4hers
  3. Open the web installer UI using the link provided in the terminal.
  4. Open the SAP product you want to install and enter the installation option. Expand the High-Availability System option and select ERS Instance for the installation of the ERS instance. Click Next.
  5. Provide the requested installation information on each page and click Next to move forward.

    Some steps, like extracting SAP packages, can take a while. Keep an eye on the terminal in which you started the installer for details of the ongoing process that are not displayed in the web UI.

Verification

  1. Switch to the <sid>adm user:

    [root]# su - s4hadm
  2. Check the instance status. Ensure that the service status is GREEN for all service components:

    s4hadm $ sapcontrol -nr 29 -function GetProcessList
    1. The following example shows a minimal setup with ENSA1:

      name, description, dispstatus, textstatus, starttime, elapsedtime, pid
      enrepserver, EnqueueReplicator, GREEN, Running, YYYY MM DD 16:34:05, 0:00:26, 11484
    2. The following example shows a minimal setup with ENSA2:

      name, description, dispstatus, textstatus, starttime, elapsedtime, pid
      enq_replicator, Enqueue Replicator 2, GREEN, Running, YYYY MM DD 11:47:34, 0:33:14, 15623

3.7. Installing PAS or AAS instances

You can install a Primary Application Server (PAS) or an Additional Application Server (AAS) instance on the same systems and also configure them in the same cluster as your ASCS/ERS HA setup.

Install the SAP application server instance according to the SAP documentation. See Running Software Provisioning Manager for more details about the SAP software installation.

Skip the PAS or AAS setup if you do not need it.

Prerequisites

  • You have installed and configured a database instance and you can connect to the database instance from the application server system.
  • You have installed and started the ASCS and ERS instances.
  • You have installed the DB client on the application nodes.
  • You have configured the virtual IP address for the application server instance on the installation node.
  • You have mounted the following filesystems on the installation node:

    • /sapmnt
    • /usr/sap/trans
    • /usr/sap/<SID>
  • The virtual hostname for the application server instance resolves to the reserved virtual IP of the respective instance on all nodes. You have tested with the command getent hosts <virtual_hostname>, for example, getent hosts s4hpas, and it must return the correct IP address.
  • You have the installation media available on the system.

Procedure

  1. On the node where you install the application server instance, go to the directory where you have extracted the installation media:

    [root]# cd <software_path>
    • Replace <software_path> with the path to your unpacked media, for example, /sapmedia/SWPM20_SP19/.
  2. Run the installer command and specify the virtual hostname of your application server instance, for example, s4hpas:

    [root]# ./sapinst SAPINST_USE_HOSTNAME=s4hpas
  3. Open the web installer UI using the link provided in the terminal.
  4. Open the SAP product you want to install and enter the installation option. Expand the High-Availability System option and select Primary Application Server Instance for the installation of the PAS instance, or select Additional Application Server Instance for the installation of an AAS instance. Click Next.
  5. Provide the requested installation information on each page and click Next to move forward.

    Some steps, like extracting SAP packages, can take a while. Keep an eye on the terminal in which you started the installer for details of the ongoing process that are not displayed in the web UI.

Verification

  1. Switch to the <sid>adm user:

    [root]# su - s4hadm
  2. Check the instance status. Ensure that the service status is GREEN for all service components:

    s4hadm $ sapcontrol -nr 21 -function GetProcessList
    name, description, dispstatus, textstatus, starttime, elapsedtime, pid
    disp+work, Dispatcher, GREEN, Running, YYYY MM DD 16:40:47, 68:02:08, 17973
    igswd_mt, IGS Watchdog, GREEN, Running, YYYY MM DD 16:40:47, 68:02:08, 17974
    gwrd, Gateway, GREEN, Running, YYYY MM DD 16:40:50, 68:02:05, 18326
    icman, ICM, GREEN, Running, YYYY MM DD 16:40:50, 68:02:05, 18327

3.8. Verifying the SAP Host Agent installation

The SAP Host Agent is typically installed as part of the application server installation. It must meet the minimum version requirements for the intended setup, and the installed version must be the same on all cluster nodes.

Check that the /usr/sap/hostctrl/ path is present and that the version is the same on each cluster node:

[root]# /usr/sap/hostctrl/exe/saphostexec -version

If you configure spare cluster nodes where you do not run the software installation of an instance, you must install the SAP Host Agent separately.

Also, update the SAP Host Agent if the versions do not match between the cluster nodes.

Refer to SAP Note 1031096 - Installing Package SAPHOSTAGENT for information and instructions.
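For example, assuming the agent is already installed and you have downloaded the SAPHOSTAGENT archive from SAP, you can upgrade the agent in place:

[root]# /usr/sap/hostctrl/exe/saphostexec -upgrade -archive <path>/SAPHOSTAGENT.SAR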

3.9. Verifying the /etc/services file

The SAP software installation for application servers appends various standard ports for SAP applications to the /etc/services file on the host on which you run the installation.

Procedure

  1. Check that all nodes have the SAP ports in the services file. For example, count the entries that contain SAP System in their port description and compare the result on all nodes:

    [root]# grep -i "SAP System" /etc/services | wc -l
    401
  2. Update the /etc/services file on any node that is missing entries.
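For example, you can compare the SAP entries between two nodes and review any differences before you copy the missing lines. This sketch assumes ssh access from the current node to node2:

[root]# diff <(grep -i "SAP System" /etc/services) \
<(ssh node2 'grep -i "SAP System" /etc/services')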

Chapter 4. Configuring the ASCS and ERS instances

4.1. Configuring the ASCS/ERS instance profiles

The ASCS and ERS instance profiles contain Restart directives for certain instance services by default. You must modify the profiles to prevent the SAP Start service from automatically restarting the enqueue server and enqueue replication server processes, because the cluster manages the availability of these services.

Procedure

  1. Update the profile of the ASCS instance and rename the Restart_Program_01 parameter of the enqueue server:

    [root]# sed -i -e 's/Restart_Program_01/Start_Program_01/' \
    /sapmnt/<SID>/profile/<SID>_ASCS<instance>_<ascs_virtual_hostname>
    • Replace <SID> with your instance SID, for example, S4H.
    • Replace <instance> with the ASCS instance number, for example, 20.
    • Replace <ascs_virtual_hostname> with the virtual hostname of your ASCS instance, for example, s4hascs.
  2. Update the profile of the ERS instance and rename the Restart_Program_00 parameter of the enqueue replication server:

    [root]# sed -i -e 's/Restart_Program_00/Start_Program_00/' \
    /sapmnt/<SID>/profile/<SID>_ERS<instance>_<ers_virtual_hostname>
    • Replace <SID> with your instance SID, for example, S4H.
    • Replace <instance> with the ERS instance number, for example, 29.
    • Replace <ers_virtual_hostname> with the virtual hostname of your ERS instance, for example, s4hers.

Verification

  1. Verify that the ASCS instance profile does not contain the Restart_Program_01 parameter:

    [root]# grep Restart_Program_01 /sapmnt/S4H/profile/*_ASCS*
  2. Verify that the ERS instance profile does not contain the Restart_Program_00 parameter:

    [root]# grep Restart_Program_00 /sapmnt/S4H/profile/*_ERS*

4.2. Configuring the systemd integration of the instances

Systemd integration is the default configuration as of SAP Kernel Release 788. In HA environments, you must apply additional modifications to integrate the different systemd services that are involved in the cluster setup.

You must execute the configuration on every cluster node and for every instance that you plan to manage in the cluster. At a minimum you must configure this for the ASCS and ERS instances. PAS and AAS instance configuration is optional and only required if you manage them in the same cluster.

Procedure

  1. Register the ASCS instance. Run the following SAP command as the root user to create the systemd integration:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/ASCS<instance>/exe && \
    /usr/sap/<SID>/ASCS<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_ASCS<instance>_<ascs_virtual_hostname> \
    -reg

    The command executes the sapstartsrv service for the selected instance profile and registers the instance service on the current system. It creates the systemd unit for the instance service, if it does not exist, and updates the local /usr/sap/sapservices file.

    • Replace <SID> with your ASCS instance SID, for example, S4H.
    • Replace <instance> with your ASCS instance number, for example, 20.
    • Replace <ascs_virtual_hostname> with the virtual hostname for your ASCS instance, for example, s4hascs.
  2. Register the ERS instance by repeating step 1 for the ERS profile:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/ERS<instance>/exe && \
    /usr/sap/<SID>/ERS<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_ERS<instance>_<ers_virtual_hostname> \
    -reg
    • Replace <SID> with your ERS instance SID, for example, S4H.
    • Replace <instance> with your ERS instance number, for example, 29.
    • Replace <ers_virtual_hostname> with the virtual hostname for your ERS instance, for example, s4hers.
  3. Optional: Register any PAS or AAS instance by repeating step 1 for the respective application server profile. Skip this step if you do not configure PAS or AAS instances in this cluster:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/D<instance>/exe && \
    /usr/sap/<SID>/D<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_D<instance>_<as_virtual_hostname> \
    -reg
    • Replace <SID> with your PAS or AAS instance SID, for example, S4H.
    • Replace <instance> with your PAS or AAS instance number, for example, 21.
    • Replace <as_virtual_hostname> with the virtual hostname for your PAS or AAS instance, for example, s4hpas.
  4. Disable the ASCS, ERS and any other application instance service that the cluster manages after the setup:

    [root]# systemctl disable SAP<SID>_<instance>.service
    Removed "/etc/systemd/system/multi-user.target.wants/SAP<SID>_<instance>.service".

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat the same for the PAS or AAS instance services.

  5. Create the systemd drop-in directory for the ASCS, ERS and any other application instance service that the cluster manages after the setup:

    [root]# mkdir /etc/systemd/system/SAP<SID>_<instance>.service.d

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat using the PAS or AAS instance number.

  6. Create the drop-in files for the instances in the new directory:

    [root]# cat << EOF > /etc/systemd/system/SAP<SID>_<instance>.service.d/HA.conf
    [Service]
    Restart=no
    EOF

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat using the PAS or AAS instance number.

  7. Reload the systemd units to activate the drop-in configuration:

    [root]# systemctl daemon-reload
  8. Repeat all steps on each cluster node for every instance.

Verification

  1. Check that all instances have instance systemd units and that they are disabled:

    [root]# systemctl list-unit-files SAPS4H*
    UNIT FILE         STATE    PRESET
    SAPS4H_20.service disabled disabled
    SAPS4H_29.service disabled disabled

    Optional: PAS or AAS instance service files are listed as well in all of the verification steps when you have configured the application server instances.

  2. Check that the sapservices file contains entries for every instance:

    [root]# cat /usr/sap/sapservices
    systemctl --no-ask-password start SAPS4H_20 # sapstartsrv pf=/sapmnt/S4H/profile/S4H_ASCS20_s4hascs
    systemctl --no-ask-password start SAPS4H_29 # sapstartsrv pf=/sapmnt/S4H/profile/S4H_ERS29_s4hers
  3. Check that all systemd configuration overrides are present:

    [root]# systemd-delta | grep SAP
    ...
    [EXTENDED]   /etc/systemd/system/SAPS4H_20.service → /etc/systemd/system/SAPS4H_20.service.d/HA.conf
    [EXTENDED]   /etc/systemd/system/SAPS4H_29.service → /etc/systemd/system/SAPS4H_29.service.d/HA.conf
  4. Repeat the steps on every cluster node. The results must be the same on all nodes.

4.3. Verifying the manual start and stop of the instances

Start and stop the application instances on all cluster nodes to verify that the prerequisites are met to configure and manage the instances in the cluster.

Prerequisites

  • You have configured the systemd-based SAP Start framework. If not, you must use the respective sapcontrol commands instead of the systemctl commands for stopping and starting the instances (see the example after this list).
  • You have stopped each instance before starting it on another node.
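For instances without the systemd integration, the following sketch shows the equivalent stop and start sequence with sapcontrol. Run the commands as the <sid>adm user, the first two on the node where the instance currently runs, the last two on the target node. Note that the StartService function requires the SID as an argument:

<sid>adm $ sapcontrol -nr <instance> -function Stop
<sid>adm $ sapcontrol -nr <instance> -function StopService
<sid>adm $ sapcontrol -nr <instance> -function StartService <SID>
<sid>adm $ sapcontrol -nr <instance> -function Start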

Procedure

  1. Stop the instance on the node where it currently runs, for example, stop ASCS on node1:

    [root]# systemctl stop SAP<SID>_<instance>.service

    Stopping the SAP instance systemd service stops both the instance service and the instance itself.

  2. Start the instance service sapstartsrv on the node where it was not running before, for example, start the ASCS service on node2:

    [root]# systemctl start SAP<SID>_<instance>.service
  3. Start the instance using sapcontrol as the <sid>adm user. Replace <instance> with the instance number, for example, 20 for the ASCS instance:

    <sid>adm $ sapcontrol -nr <instance> -function Start
  4. Verify the instance’s health after it starts on the other node. Run this check as the <sid>adm user. Replace <instance> with the instance number, for example, 20 for the ASCS instance. The status of all services must be GREEN:

    <sid>adm $ sapcontrol -nr <instance> -function GetProcessList
  5. Optional: Repeat the previous steps to move the instance back to its original node, for example, move ASCS back to node1.
  6. Repeat all steps for the ERS instance.
  7. Optional: If you configure PAS or AAS instances, then repeat the steps for these application server instances as well.

4.4. Installing the SAP license keys

To ensure that the SAP instances continue to run after a failover, you have to install several SAP license keys based on the hardware key of each cluster node. See SAP Note 1178686 - Linux: Alternative method to generate a SAP hardware key for more information.
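For example, you can display the hardware key of the current node with the saplicense tool that is part of the SAP kernel. Run it as the <sid>adm user on each cluster node:

<sid>adm $ saplicense -get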

Chapter 5. Configuring the Pacemaker cluster

5.1. Deploying the basic cluster configuration

The following basic cluster setup covers the minimum steps to get started with Pacemaker for managing SAP instances in a two-node cluster.

For more information on settings and options for complex configurations, refer to the documentation for RHEL HA Add-On, for example, Configuring and managing high availability clusters.

Prerequisites

  • You have installed and configured all SAP instances that you plan to manage in the cluster.
  • You have configured the RHEL High Availability repository on the planned cluster nodes.
  • You have verified fencing and quorum requirements according to your planned environment. For more details see HA cluster requirements.

Procedure

  1. Install the Red Hat High Availability Add-On software packages from the High Availability repository. Choose which fence agents you want to install and execute the installation on all cluster nodes.

    1. Either install the cluster packages and all fence agents:

      [root]# dnf install pcs pacemaker fence-agents-all
    2. Or install the cluster packages and only a specific fence agent, depending on your environment:

      [root]# dnf install pcs pacemaker fence-agents-<model>
  2. Start and enable the pcsd service on all cluster nodes:

    [root]# systemctl enable --now pcsd.service
  3. Optional: If you are running the firewalld service, enable the ports that are required by the Red Hat High Availability Add-On. Run this on all cluster nodes:

    [root]# firewall-cmd --add-service=high-availability
    [root]# firewall-cmd --runtime-to-permanent
  4. Set a password for the user hacluster. Repeat the command on each node using the same password:

    [root]# passwd hacluster
  5. Authenticate the user hacluster for each node in the cluster. Run this on the first node:

    [root]# pcs host auth <node1> <node2>
    Username: hacluster
    Password:
    <node1>: Authorized
    <node2>: Authorized
    • Enter the node names with or without FQDN, as defined in the /etc/hosts file.
    • Enter the hacluster user password in the prompt.
  6. Create the cluster with a name and provide the names of the cluster members, for example node1 and node2 with fully qualified host names. This propagates the cluster configuration on both nodes and starts the cluster. Run this command on the first node:

    [root]# pcs cluster setup <cluster_name> --start <node1> <node2>
    No addresses specified for host 'node1', using 'node1'
    No addresses specified for host 'node2', using 'node2'
    Destroying cluster on hosts: 'node1', 'node2'...
    node2: Successfully destroyed cluster
    node1: Successfully destroyed cluster
    Requesting remove 'pcsd settings' from 'node1', 'node2'
    node1: successful removal of the file 'pcsd settings'
    node2: successful removal of the file 'pcsd settings'
    Sending 'corosync authkey', 'pacemaker authkey' to 'node1', 'node2'
    node1: successful distribution of the file 'corosync authkey'
    node1: successful distribution of the file 'pacemaker authkey'
    node2: successful distribution of the file 'corosync authkey'
    node2: successful distribution of the file 'pacemaker authkey'
    Sending 'corosync.conf' to 'node1', 'node2'
    node1: successful distribution of the file 'corosync.conf'
    node2: successful distribution of the file 'corosync.conf'
    Cluster has been successfully set up.
    Starting cluster on hosts: 'node1', 'node2'...
  7. Enable the cluster to be started automatically on system start, which enables the corosync and pacemaker services. Skip this step if you prefer to manually control the start of the cluster after a node restarts. Run on one node:

    [root]# pcs cluster enable --all
    node1: Cluster Enabled
    node2: Cluster Enabled

Verification

  • Check the cluster status. Verify that the cluster daemon services are in the desired state:

    [root]# pcs status --full
    Cluster name: node1-node2-cluster
    WARNINGS:
    No stonith devices and stonith-enabled is not false
    Cluster Status:
     Cluster Summary:
       * Stack: corosync (Pacemaker is running)
       * Current DC: node1 (version ***********) - partition with quorum
       * Last updated: ************************ on node1
       * Last change:  ************************ by hacluster via hacluster on node1
       * 2 nodes configured
       * 0 resource instances configured
    ...
    PCSD Status:
      node1: Online
      node2: Online
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

5.2. Configuring general cluster properties

Adjust the cluster resource defaults to avoid unnecessary failovers of the resources while still restoring the service on a different node if a resource fails repeatedly.

Procedure

  • Update cluster resource defaults to avoid unnecessary failovers of the resources. Run the command on one cluster node to apply the change to the cluster configuration:

    [root]# pcs resource defaults update \
    resource-stickiness=1 \
    migration-threshold=3
    • resource-stickiness=1 encourages the resource to stay on a node.
    • migration-threshold=3 causes the resource to move to a different node after 3 failures.

Verification

  • Check that the resource defaults are set:

    [root]# pcs resource defaults
    Meta Attrs: build-resource-defaults
      migration-threshold=3
      resource-stickiness=1

5.3. Installing the SAP resource agents and components

The resource-agents-sap RPM package in the Red Hat Enterprise Linux 9 for <arch> - SAP NetWeaver (RPMs) repository provides the resource agents and other SAP application server specific components for setting up an HA cluster for managing SAP application server instances.

Prerequisites

  • You have configured the repository Red Hat Enterprise Linux 9 for <arch> - SAP NetWeaver (RPMs) on all cluster nodes.

Procedure

  1. Install the resource-agents-sap package on all cluster nodes:

    [root]# dnf install resource-agents-sap
  2. Enable the sapping and sappong services:

    [root]# systemctl enable sapping.service sappong.service
    Created symlink /etc/systemd/system/multi-user.target.wants/sapping.service → /usr/lib/systemd/system/sapping.service.
    Created symlink /etc/systemd/system/multi-user.target.wants/sappong.service → /usr/lib/systemd/system/sappong.service.

Verification

  1. Check on all nodes that the package is installed:

    [root]# rpm -q resource-agents-sap
    resource-agents-sap-<version>.<release>.noarch
  2. Verify that the sapping and sappong services are enabled:

    [root]# systemctl list-unit-files sapping.service sappong.service
    UNIT FILE       STATE   PRESET
    sapping.service enabled disabled
    sappong.service enabled disabled

5.4. Creating the ASCS resource group

You must configure a virtual IP (VIP) resource so that SAP clients can access the ASCS instance independently of the cluster node on which it is currently running.

The resource agent needed for the VIP resource depends on the platform used. This guide uses the IPaddr2 resource agent to demonstrate the setup.

Prerequisites

  • You have reserved a virtual IP address for the ASCS service.
  • You have mounted the shared application filesystems /sapmnt and /usr/sap/trans on all cluster nodes.
  • You have configured the simplified filesystem setup in the /etc/fstab and mounted the ASCS filesystem on all cluster nodes.
  • You have installed the ASCS instance in the ASCS instance filesystem.
  • You have disabled the auto restart of the enqueue server in the ASCS instance profile.
  • You have tested that the ASCS instance can start and run on all cluster nodes.
Warning

Since RHEL 9.4, a new syntax for creating a resource in a group has been introduced in addition to the --group parameter. You now receive the following deprecation warning:

Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.

You can ignore this warning. It only informs you of a change in later operating system versions.

Procedure

  1. Use the appropriate resource agent for managing the virtual IP address based on the platform on which the HA cluster is running. Adjust the parameters according to the resource agent you are using. Create the cluster resource for the ASCS virtual IP, for example, using the IPaddr2 agent:

    [root]# pcs resource create rsc_vip_<SID>_ASCS<instance> \
    ocf:heartbeat:IPaddr2 \
    ip=<address> cidr_netmask=<netmask> nic=<device> \
    --group grp_<SID>_ASCS<instance>
    • Replace <SID> with your ASCS SID.
    • Replace <instance> with your ASCS instance number.
    • Replace <address>, <netmask> and <device> with the details of your virtual IP address.
  2. Create the SAPStartSrv resource for the ASCS instance:

    [root]# pcs resource create rsc_SAPStartSrv_<SID>_ASCS<instance> \
    ocf:heartbeat:SAPStartSrv \
    InstanceName="<sap_instance_name>" \
    --group grp_<SID>_ASCS<instance> \
    op monitor interval=0 timeout=20 enabled=0
    • Replace <SID> with your ASCS SID, for example, S4H.
    • Replace <instance> with your ASCS instance number, for example, 20.
    • Replace <sap_instance_name> with the SAP start profile name of your ASCS instance, for example, S4H_ASCS20_s4hascs.
    • Ensure that the recurring monitor operation is disabled using enabled=0. The single resource probe still runs at resource start, but recurring monitors must not run afterwards.
  3. Create the SAPInstance resource for the ASCS instance:

    [root]# pcs resource create rsc_SAPInstance_<SID>_ASCS<instance> \
    ocf:heartbeat:SAPInstance \
    InstanceName="<sap_instance_name>" \
    MINIMAL_PROBE=true \
    meta resource-stickiness=5000 \
    --group grp_<SID>_ASCS<instance> \
    op monitor interval=20 on-fail=restart timeout=60
    • Replace <sap_instance_name> with the SAP start profile name of your ASCS instance, for example, S4H_ASCS20_s4hascs.

      resource-stickiness=5000 balances out the failover constraint with the ERS resource so that the resource stays on the node where it started. When you use the SAPStartSrv resource for the simplified filesystem setup, then you must set MINIMAL_PROBE=true.

  4. For an ENSA1 setup only, you must add migration-threshold=1 to the SAPInstance resource of the ASCS instance:

    [root]# pcs resource update rsc_SAPInstance_<SID>_ASCS<instance> \
    meta migration-threshold=1

    migration-threshold=1 ensures that the ASCS instance does not restart on the same node but instead fails over to another node. In ENSA2 setups, restarting the ASCS instance on the same node is allowed due to an improved lock table recovery mechanism through the network.

  5. Add a resource-stickiness to the ASCS resource group to ensure that the ASCS instance group stays on its current HA cluster node if possible:

    [root]# pcs resource meta grp_<SID>_ASCS<instance> resource-stickiness=3000

Verification

  1. Check the cluster status of the ASCS resources:

    [root]# pcs status --full | grep ASCS
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node1
  2. Verify the resource configuration details of all resources in the ASCS group:

    [root]# pcs resource config grp_S4H_ASCS20
    Group: grp_S4H_ASCS20
      Meta Attributes: grp_S4H_ASCS20-meta_attributes
        resource-stickiness=3000
      Resource: rsc_vip_S4H_ASCS20 (class=ocf provider=heartbeat type=IPaddr2)
        Attributes: rsc_vip_S4H_ASCS20-instance_attributes
          cidr_netmask=32
          ip=192.168.200.101
          nic=eth0
        Operations:
    ...
      Resource: rsc_SAPStartSrv_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPStartSrv)
        Attributes: rsc_SAPStartSrv_S4H_ASCS20-instance_attributes
          InstanceName=S4H_ASCS20_s4hascs
        Operations:
          monitor: rsc_SAPStartSrv_S4H_ASCS20-monitor-interval-0s
            interval=0s timeout=20s enabled=0
    ...
      Resource: rsc_SAPInstance_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPInstance)
        Attributes: rsc_SAPInstance_S4H_ASCS20-instance_attributes
          InstanceName=S4H_ASCS20_s4hascs
          MINIMAL_PROBE=true
        Meta Attributes: rsc_SAPInstance_S4H_ASCS20-meta_attributes
          resource-stickiness=5000
        Operations:
    ...
          monitor: rsc_SAPInstance_S4H_ASCS20-monitor-interval-20
            interval=20 timeout=60 on-fail=restart
    ...
Note

With the default Pacemaker configuration for RHEL 9, certain failures of resource actions, for example, a failed stop of a resource, cause the cluster to fence the node. This leads to an outage for all other resources running on the same HA cluster node. See the description of the on-fail property for monitoring operations in Configuring and managing high availability clusters - Chapter 21. Resource monitoring operations for options on how to modify this behavior.

5.5. Creating the ERS resource group

You must configure a virtual IP (VIP) resource so that SAP clients can access the ERS instance independently of the cluster node on which it is currently running.

The resource agent needed for the VIP resource depends on the platform used. This guide uses the IPaddr2 resource agent to demonstrate the setup.

Prerequisites

  • You have reserved a virtual IP address for the ERS service.
  • You have mounted the shared application filesystems /sapmnt and /usr/sap/trans on all cluster nodes.
  • You have configured the simplified filesystem setup in the /etc/fstab and mounted the ERS filesystem on all cluster nodes.
  • You have installed the ERS instance in the ERS instance filesystem and configured the instance profile and systemd integration.
  • You have tested that the ERS instance can start and run on all cluster nodes.
Warning

Since RHEL 9.4, a new syntax for creating a resource in a group is available in addition to the --group parameter. You now receive the following deprecation warning:

Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.

You can ignore this warning. It only informs you of a change in later operating system versions.

Procedure

  1. Use the appropriate resource agent for managing the virtual IP address based on the platform on which the HA cluster is running. Adjust the parameters according to the resource agent you are using. Create the cluster resource for the ERS virtual IP, for example, using the IPaddr2 agent:

    [root]# pcs resource create rsc_vip_<SID>_ERS<instance> \
    ocf:heartbeat:IPaddr2 \
    ip=<address> cidr_netmask=<netmask> nic=<device> \
    --group grp_<SID>_ERS<instance>
    • Replace <SID> with your ERS SID.
    • Replace <instance> with your ERS instance number.
    • Replace <address>, <netmask> and <device> with the details of your virtual IP address.
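
      For example, with the sample values used in this guide (SID S4H, ERS instance number 29, virtual IP address 192.168.200.102):

    [root]# pcs resource create rsc_vip_S4H_ERS29 \
    ocf:heartbeat:IPaddr2 \
    ip=192.168.200.102 cidr_netmask=32 nic=eth0 \
    --group grp_S4H_ERS29
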
  2. Create the SAPStartSrv resource for the ERS instance:

    [root]# pcs resource create rsc_SAPStartSrv_<SID>_ERS<instance> \
    ocf:heartbeat:SAPStartSrv \
    InstanceName="<sap_instance_name>" \
    --group grp_<SID>_ERS<instance> \
    op monitor interval=0 timeout=20 enabled=0
    • Replace <SID> with your ERS SID, for example, S4H.
    • Replace <instance> with your ERS instance number, for example, 29.
    • Replace <sap_instance_name> with the SAP start profile name of your ERS instance, for example, S4H_ERS_s4hers.
    • Ensure that the recurring monitor operation is disabled using enabled=0. The single resource probe is still run at resource start, but recurring monitors must not run afterwards.
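
      For example:

    [root]# pcs resource create rsc_SAPStartSrv_S4H_ERS29 \
    ocf:heartbeat:SAPStartSrv \
    InstanceName="S4H_ERS29_s4hers" \
    --group grp_S4H_ERS29 \
    op monitor interval=0 timeout=20 enabled=0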
  3. Create the SAPInstance resource for the ERS instance:

    [root]# pcs resource create rsc_SAPInstance_<SID>_ERS<instance> \
    ocf:heartbeat:SAPInstance \
    InstanceName="<sap_instance_name>" \
    IS_ERS=true \
    --group grp_<SID>_ERS<instance> \
    op monitor interval=20 on-fail=restart timeout=60 \
    op start interval=0 timeout=600 \
    op stop interval=0 timeout=600
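
    For example:

    [root]# pcs resource create rsc_SAPInstance_S4H_ERS29 \
    ocf:heartbeat:SAPInstance \
    InstanceName="S4H_ERS29_s4hers" \
    IS_ERS=true \
    --group grp_S4H_ERS29 \
    op monitor interval=20 on-fail=restart timeout=60 \
    op start interval=0 timeout=600 \
    op stop interval=0 timeout=600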

Verification

  1. Check the cluster status of the ERS resources:

    [root]# pcs status --full | grep ERS
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ERS29        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ERS29        (ocf:heartbeat:SAPInstance):     Started node2
  2. Verify the resource configuration details of all resources in the ERS group:

    [root]# pcs resource config grp_S4H_ERS29
    Group: grp_S4H_ERS29
      Resource: rsc_vip_S4H_ERS29 (class=ocf provider=heartbeat type=IPaddr2)
        Attributes: rsc_vip_S4H_ERS29-instance_attributes
          cidr_netmask=32
          ip=192.168.200.102
          nic=eth0
    …
      Resource: rsc_SAPStartSrv_S4H_ERS29 (class=ocf provider=heartbeat type=SAPStartSrv)
        Attributes: rsc_SAPStartSrv_S4H_ERS29-instance_attributes
          InstanceName=S4H_ERS29_s4hers
        Operations:
          monitor: rsc_SAPStartSrv_S4H_ERS29-monitor-interval-0
            interval=0 timeout=20 enabled=0
    …
      Resource: rsc_SAPInstance_S4H_ERS29 (class=ocf provider=heartbeat type=SAPInstance)
        Attributes: rsc_SAPInstance_S4H_ERS29-instance_attributes
          IS_ERS=true
          InstanceName=S4H_ERS29_s4hers
        Operations:
    …
          monitor: rsc_SAPInstance_S4H_ERS29-monitor-interval-20
            interval=20 timeout=60 on-fail=restart
    …
          start: rsc_SAPInstance_S4H_ERS29-start-interval-0
            interval=0 timeout=600
          stop: rsc_SAPInstance_S4H_ERS29-stop-interval-0
            interval=0 timeout=600
    …
Note

The IS_ERS=true attribute is mandatory for an ENSA1 deployment. More information about IS_ERS can be found in How does the IS_ERS attribute work on a SAP NetWeaver cluster with Standalone Enqueue Server (ENSA1 and ENSA2)?.

5.6. Creating constraints for ASCS and ERS

Configure a colocation constraint for the ASCS and ERS resource groups so that the two groups avoid running on the same node under normal circumstances.

Add an order constraint to ensure that the ERS resource group is stopped only after the ASCS group has fully started. If pacemaker decides to start the ASCS resource group at the same time as stopping the ERS resource group, it must wait for the ASCS instance to be fully started first, which includes the recovery of the replicated enqueue data from the running ERS instance. Only then can the ERS group stop, for example, as a result of the colocation constraint between the groups.

Warning

Ignore the informational deprecation warning for a change in a later operating system version:

Deprecation Warning: Using '-5000' without '--' is deprecated, those parameters will be considered position independent options in future pcs versions.

Procedure

  1. Create the colocation constraint for the ASCS and ERS resource groups. The order of the groups in the command matters:

    [root]# pcs constraint colocation add grp_<SID>_ERS<ers_instance> \
    with grp_<SID>_ASCS<ascs_instance> -5000
    • Replace <SID> with your ASCS and ERS SID, for example, S4H.
    • Replace <ers_instance> with your ERS instance number, for example, 29.
    • Replace <ascs_instance> with your ASCS instance number, for example, 20.
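
      For example, with the sample values used in this guide:

    [root]# pcs constraint colocation add grp_S4H_ERS29 \
    with grp_S4H_ASCS20 -5000
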
  2. Optional: When using ENSA1, you must ensure that the ASCS instance starts on the node where the ERS instance is running. This is required to recover the lock table from ERS into ASCS. Create a location constraint rule for the ASCS resource:

    [root]# pcs constraint location grp_<SID>_ASCS<instance> rule score=2000 runs_ers_<SID> eq 1
    • Replace <SID> with your ASCS and ERS SID, for example, S4H.
    • Replace <instance> with your ASCS instance number, for example, 20.

      runs_ers_<SID> is a cluster node attribute which the SAPInstance resource agent creates. It can be used to locate where the ERS instance resource is running in the cluster.
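      For example:

    [root]# pcs constraint location grp_S4H_ASCS20 rule score=2000 runs_ers_S4H eq 1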

  3. Create the order constraint for the ASCS and ERS resource groups. This ensures that the ERS instance is not moved before ASCS has fully started and imported the enqueue data from the ERS instance:

    [root]# pcs constraint order start grp_<SID>_ASCS<ascs_instance> \
    then stop grp_<SID>_ERS<ers_instance> symmetrical=false kind=Optional
    • Replace <SID> with your ASCS and ERS SID, for example, S4H.
    • Replace <ers_instance> with your ERS instance number, for example, 29.
    • Replace <ascs_instance> with your ASCS instance number, for example, 20.
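
      For example:

    [root]# pcs constraint order start grp_S4H_ASCS20 \
    then stop grp_S4H_ERS29 symmetrical=false kind=Optional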

Verification

  • Check that the constraints are configured correctly.

    • Example when using ENSA1:

      [root]# pcs constraint
      Location Constraints:
        Resource: grp_S4H_ASCS20
          Constraint: location-grp_S4H_ASCS20
            Rule: score=2000
              Expression: runs_ers_S4H eq 1
      Colocation Constraints:
        resource 'grp_S4H_ERS29' with resource 'grp_S4H_ASCS20'
          score=-5000
      Order Constraints:
        start resource 'grp_S4H_ASCS20' then stop resource 'grp_S4H_ERS29'
          symmetrical=0 kind=Optional
    • Example when using ENSA2:

      [root]# pcs constraint
      Colocation Constraints:
        resource 'grp_S4H_ERS29' with resource 'grp_S4H_ASCS20'
          score=-5000
      Order Constraints:
        start resource 'grp_S4H_ASCS20' then stop resource 'grp_S4H_ERS29'
          symmetrical=0 kind=Optional
Note

Because symmetrical=false and kind=Optional are used, there can be a situation where the order constraint does not take effect. For more information, refer to Determining the order in which cluster resources are run.

5.7. Creating the database instance resource group

When you deploy an HA cluster for managing an SAP NetWeaver-based SAP product that still uses a legacy database, such as Oracle, IBM DB2, SAP ASE, or SAP MaxDB, you can also add the database instance to the cluster configuration.

You cannot use the SAPDatabase resource with S/4HANA or to manage a HANA database instance. For a cluster setup of a HANA instance, use one of the SAP HANA guides from Red Hat Enterprise Linux for SAP Solutions 9 - High Availability, which applies to your target setup.

The filesystem on which the database is installed can only be mounted on one cluster node at a time. You must configure the filesystem in the cluster as part of the DB instance resource group, even when it is an NFS filesystem.

Skip this procedure if you use a HANA database for your SAP environment, or if you do not want to configure your legacy database instance in the cluster.

Prerequisites

  • You have installed a legacy database for your SAP NetWeaver environment on shared storage, for example, NFS or SAN.
  • You have ensured that the database filesystem is not automatically mounted by the operating system.
  • You have reserved a virtual IP address for your database client access.
  • You have tested that the database instance can start and run on all cluster nodes.

Procedure

  1. Option 1: Create the filesystem resource for the DB instance when the filesystem is an NFS share:

    [root]# pcs resource create rsc_fs_<SID>_db \
    ocf:heartbeat:Filesystem \
    device='<nfs_server>:<db_nfs_share>' \
    directory=<db_mountpoint> \
    fstype=nfs \
    force_unmount=safe \
    --group grp_<SID>_db
    • Replace <SID> with your database SID, for example, RH1.
    • Replace <nfs_server> with the NFS server name or IP of the DB instance share, for example, nfs01-datacenter1a.example.com.
    • Replace <db_nfs_share> with the NFS volume name for your DB instance.
    • Replace <db_mountpoint> with the path to the mountpoint of the DB instance filesystem, for example, /sybase.
    • Use the option force_unmount=safe when the NFS share is a directory on a shared tree, as is common on Azure NetApp Files (ANF) or Amazon EFS.
  2. Option 2: Create the filesystem resource for the DB instance when the filesystem is on SAN storage. Use HA-LVM to manage the filesystem you set up on SAN. Apply the configuration according to What is a Highly Available LVM (HA-LVM) configuration and how do I implement it?.

    • Create the HA-LVM cluster resource:

      [root]# pcs resource create rsc_lvm_<SID>_db \
      ocf:heartbeat:LVM-activate \
      vgname='<db_volume_group>' \
      vg_access_mode=system_id \
      --group grp_<SID>_db
    • Replace <db_volume_group> with the LVM group name of your DB volume, for example, vg_db.
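
      For example, with the sample database SID RH1 and the volume group vg_db:

      [root]# pcs resource create rsc_lvm_RH1_db \
      ocf:heartbeat:LVM-activate \
      vgname='vg_db' \
      vg_access_mode=system_id \
      --group grp_RH1_db
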
    • Create the filesystem resource:

      [root]# pcs resource create rsc_fs_<SID>_db \
      ocf:heartbeat:Filesystem \
      device='<lvm_volume>' \
      directory=<db_mountpoint> \
      fstype=<fs_type> \
      force_unmount=safe \
      --group grp_<SID>_db
    • Replace <SID> with your database SID, for example, RH1.
    • Replace <lvm_volume> with the path to the LVM share of the DB instance, for example /dev/vg_db/lv_db.
    • Replace <db_mountpoint> with the path to the mountpoint of the DB instance filesystem, for example, /sybase.
    • Replace <fs_type> with the type of filesystem you configured for the DB instance filesystem, for example, xfs.
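
      For example:

      [root]# pcs resource create rsc_fs_RH1_db \
      ocf:heartbeat:Filesystem \
      device='/dev/vg_db/lv_db' \
      directory=/sybase \
      fstype=xfs \
      force_unmount=safe \
      --group grp_RH1_db
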
  3. Create the virtual IP resource for the database. Use the appropriate resource agent for managing the virtual IP address based on the platform on which the HA cluster is running. Adjust the parameters according to the resource agent you are using. For example, use the IPaddr2 agent:

    [root]# pcs resource create rsc_vip_<SID>_db \
    ocf:heartbeat:IPaddr2 \
    ip=<address> cidr_netmask=<netmask> nic=<device> \
    --group grp_<SID>_db
    • Replace <SID> with your database SID, for example, RH1.
    • Replace <address>, <netmask> and <device> with the details of your virtual IP address.
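
      For example, with the sample virtual IP address 192.168.200.115 used in this guide (cidr_netmask=32 and nic=eth0 are example values; adjust them to your network):

    [root]# pcs resource create rsc_vip_RH1_db \
    ocf:heartbeat:IPaddr2 \
    ip=192.168.200.115 cidr_netmask=32 nic=eth0 \
    --group grp_RH1_db
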
  4. Create the SAPDatabase resource for the DB instance:

    [root]# pcs resource create rsc_SAPDatabase_<SID>_db \
    ocf:heartbeat:SAPDatabase \
    DBTYPE="<db_type>" \
    SID="<SID>" \
    STRICT_MONITORING="TRUE" \
    AUTOMATIC_RECOVER="TRUE" \
    --group grp_<SID>_db
    • Replace <SID> with your database SID, for example, RH1.
    • Replace <db_type> with the type of your database. It must be one of ADA, DB6, ORA or SYB.
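
      For example, for an SAP ASE (Sybase) database with the SID RH1:

    [root]# pcs resource create rsc_SAPDatabase_RH1_db \
    ocf:heartbeat:SAPDatabase \
    DBTYPE="SYB" \
    SID="RH1" \
    STRICT_MONITORING="TRUE" \
    AUTOMATIC_RECOVER="TRUE" \
    --group grp_RH1_db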

Verification

  1. Check the cluster status of the database resources:

    [root]# pcs status --full | grep RH1
     * Resource Group: grp_RH1_db:
       * rsc_lvm_RH1_db	(ocf:heartbeat:LVM-activate):	 Started node1
       * rsc_fs_RH1_db	(ocf:heartbeat:Filesystem):	 Started node1
       * rsc_vip_RH1_db	(ocf:heartbeat:IPaddr2):	 Started node1
       * rsc_SAPDatabase_RH1_db	(ocf:heartbeat:SAPDatabase):	 Started node1
  2. Verify the resource configuration details of all resources in the database resource group:

    [root]# pcs resource config grp_RH1_db
    Group: grp_RH1_db
     Resource: rsc_lvm_RH1_db (class=ocf provider=heartbeat type=LVM-activate)
       Attributes: vg_access_mode=system_id vgname=vg_db
       Operations: monitor interval=30s timeout=90s (rsc_lvm_RH1_db-monitor-interval-30s)
                   start interval=0s timeout=90s (rsc_lvm_RH1_db-start-interval-0s)
                   stop interval=0s timeout=90s (rsc_lvm_RH1_db-stop-interval-0s)
      Resource: rsc_fs_RH1_db (class=ocf provider=heartbeat type=Filesystem)
       Attributes: device=/dev/vg_db/lv_db directory=/sybase fstype=xfs
       Operations: monitor interval=20s timeout=40s (rsc_fs_RH1_db-monitor-interval-20s)
                   start interval=0s timeout=60s (rsc_fs_RH1_db-start-interval-0s)
                   stop interval=0s timeout=60s (rsc_fs_RH1_db-stop-interval-0s)
      Resource: rsc_vip_RH1_db (class=ocf provider=heartbeat type=IPaddr2)
       Attributes: ip=192.168.200.115
       Operations: monitor interval=10s timeout=20s (rsc_vip_RH1_db-monitor-interval-10s)
                   start interval=0s timeout=20s (rsc_vip_RH1_db-start-interval-0s)
                   stop interval=0s timeout=20s (rsc_vip_RH1_db-stop-interval-0s)
      Resource: rsc_SAPDatabase_RH1_db (class=ocf provider=heartbeat type=SAPDatabase)
       Attributes: AUTOMATIC_RECOVER=TRUE DBTYPE=SYB SID=RH1 STRICT_MONITORING=TRUE
       Operations: methods interval=0s timeout=5s (rsc_SAPDatabase_RH1_db-methods-interval-0s)
                   monitor interval=120s timeout=60s (rsc_SAPDatabase_RH1_db-monitor-interval-120s)
                   start interval=0s timeout=1800s (rsc_SAPDatabase_RH1_db-start-interval-0s)
                   stop interval=0s timeout=1800s (rsc_SAPDatabase_RH1_db-stop-interval-0s)

5.8. Creating PAS or AAS resource groups

You can optionally add Primary Application Server (PAS) or Additional Application Server (AAS) instances to be managed by the same cluster. The resource group configuration is similar to the ASCS and ERS setup, but with fewer attributes required for the resources.

Prerequisites

  • You have installed and prepared PAS or AAS instances on the nodes to be managed by the same cluster.
  • You have tested that the PAS or AAS instance can start and run on all cluster nodes.
Warning

Since RHEL 9.4, a new syntax for creating a resource in a group is available in addition to the --group parameter. You now receive the following deprecation warning:

Deprecation Warning: Using '--group' is deprecated and will be replaced with 'group' in a future release. Specify --future to switch to the future behavior.

You can ignore this warning. It only informs you of a change in later operating system versions.

Procedure

  1. Use the appropriate resource agent for managing the virtual IP address based on the platform on which the HA cluster is running. Adjust the parameters according to the resource agent you are using. Create the cluster resource for the PAS or AAS virtual IP, for example, using the IPaddr2 agent:

    [root]# pcs resource create rsc_vip_<SID>_D<instance> \
    ocf:heartbeat:IPaddr2 \
    ip=<address> cidr_netmask=<netmask> nic=<device> \
    --group grp_<SID>_D<instance>
    • Replace <SID> with your PAS or AAS SID, for example, S4H.
    • Replace <instance> with your PAS or AAS instance number, for example, 21 for PAS.
    • Replace <address>, <netmask> and <device> with the details of your virtual IP address.
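
      For example, with the sample values used in this guide (SID S4H, PAS instance number 21, virtual IP address 192.168.200.103):

    [root]# pcs resource create rsc_vip_S4H_D21 \
    ocf:heartbeat:IPaddr2 \
    ip=192.168.200.103 cidr_netmask=32 nic=eth0 \
    --group grp_S4H_D21
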
  2. Create the SAPStartSrv resource for the PAS or AAS instance for the simplified filesystem management:

    [root]# pcs resource create rsc_SAPStartSrv_<SID>_D<instance> \
    ocf:heartbeat:SAPStartSrv \
    InstanceName="<sap_instance_name>" \
    --group grp_<SID>_D<instance> \
    op monitor interval=0 timeout=20 enabled=0
    • Replace <SID> with your PAS or AAS SID, for example, S4H.
    • Replace <instance> with your PAS or AAS instance number, for example, 21 for PAS.
    • Replace <sap_instance_name> with the SAP start profile name of your PAS or AAS instance, for example, S4H_PAS_s4hpas for the PAS instance.
    • Ensure that the recurring monitor operation is disabled using enabled=0. The single resource probe is still run at resource start.
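
      For example:

    [root]# pcs resource create rsc_SAPStartSrv_S4H_D21 \
    ocf:heartbeat:SAPStartSrv \
    InstanceName="S4H_PAS_s4hpas" \
    --group grp_S4H_D21 \
    op monitor interval=0 timeout=20 enabled=0
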
  3. Create the SAPInstance resource for the PAS or AAS instance:

    [root]# pcs resource create rsc_SAPInstance_<SID>_D<instance> \
    ocf:heartbeat:SAPInstance \
    InstanceName="<sap_instance_name>" \
    MINIMAL_PROBE=true \
    --group grp_<SID>_D<instance> \
    op monitor interval=20 on-fail=restart timeout=60
    • Replace <SID> with your PAS or AAS SID, for example, S4H.
    • Replace <instance> with your PAS or AAS instance number, for example, 21 for PAS.
    • Replace <sap_instance_name> with the SAP start profile name of your PAS or AAS instance, for example, S4H_PAS_s4hpas for the PAS instance.
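
      For example:

    [root]# pcs resource create rsc_SAPInstance_S4H_D21 \
    ocf:heartbeat:SAPInstance \
    InstanceName="S4H_PAS_s4hpas" \
    MINIMAL_PROBE=true \
    --group grp_S4H_D21 \
    op monitor interval=20 on-fail=restart timeout=60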

Verification

  1. Check the cluster status of the application resources:

    [root]# pcs status --full | grep S4H_D
     * Resource Group: grp_S4H_D21:
        * rsc_vip_S4H_D21	(ocf:heartbeat:IPaddr2):	 Started node1
        * rsc_SAPStartSrv_S4H_D21	(ocf:heartbeat:SAPStartSrv):	 Started node1
        * rsc_SAPInstance_S4H_D21	(ocf:heartbeat:SAPInstance):	 Started node1
  2. Verify the resource configuration details of all resources in the PAS group:

    [root]# pcs resource config grp_S4H_D21
    Group: grp_S4H_D21
      Meta Attributes: grp_S4H_D21-meta_attributes
      Resource: rsc_vip_S4H_D21 (class=ocf provider=heartbeat type=IPaddr2)
        Attributes: rsc_vip_S4H_D21-instance_attributes
          cidr_netmask=32
          ip=192.168.200.103
          nic=eth0
        Operations:
    ...
      Resource: rsc_SAPStartSrv_S4H_D21 (class=ocf provider=heartbeat type=SAPStartSrv)
        Attributes: rsc_SAPStartSrv_S4H_D21-instance_attributes
          InstanceName=S4H_PAS_s4hpas
        Operations:
          monitor: rsc_SAPStartSrv_S4H_D21-monitor-interval-0s
            interval=0s timeout=20s enabled=0
    ...
      Resource: rsc_SAPInstance_S4H_D21 (class=ocf provider=heartbeat type=SAPInstance)
        Attributes: rsc_SAPInstance_S4H_D21-instance_attributes
          InstanceName=S4H_PAS_s4hpas
          MINIMAL_PROBE=true
        Meta Attributes: rsc_SAPInstance_S4H_D21-meta_attributes
        Operations:
    ...
          monitor: rsc_SAPInstance_S4H_D21-monitor-interval-20
            interval=20 timeout=60 on-fail=restart
    ...

5.9. Creating constraints for PAS and AAS

The PAS or AAS instances require the ASCS and database instance to be running before they can start properly. Configure cluster constraints to fulfill these requirements.

Prerequisites

  • You have installed and prepared PAS or AAS instances on the nodes to be managed by the same cluster as the ASCS instance.

Procedure

  1. Create the order constraint to start the PAS or AAS resource group only after the ASCS group is running:

    [root]# pcs constraint order start grp_<SID>_ASCS<ascs_instance> \
    then grp_<SID>_D<app_instance> kind=Optional symmetrical=false
    • Replace <SID> with your SAP system SID, for example, S4H.
    • Replace <ascs_instance> with your ASCS instance number, for example, 20.
    • Replace <app_instance> with your PAS or AAS instance number, for example, 21 for PAS.
  2. Optional: When you have configured the PAS and the AAS instances in the same cluster, create a colocation constraint to ensure that the PAS and the AAS instances avoid running on the same cluster node, when possible, as shown in the example after this list:

    [root]# pcs constraint colocation add grp_<SID>_D<pas_instance> \
    with grp_<SID>_D<aas_instance> score=-1000
    • Replace <SID> with your PAS and AAS SID, for example, S4H.
    • Replace <pas_instance> with your PAS instance number, for example, 21.
    • Replace <aas_instance> with your AAS instance number, for example, 22.

      Use a score of -1000 to allow the instances to run together if there is only one cluster node available. If you prefer to keep the AAS instance down in this situation, you can use score=-INFINITY, which prevents the AAS instance from running on the same node even when only one node is available.
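
      For example, with the sample PAS instance number 21 and AAS instance number 22:

    [root]# pcs constraint colocation add grp_S4H_D22 \
    with grp_S4H_D21 score=-1000
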
  3. Optional: When you have configured the HANA database in the same cluster, create a constraint to ensure that the HANA instance resource is promoted first:

    [root]# pcs constraint order promote cln_SAPHanaCon_<db_SID>_HDB<db_instance> \
    then grp_<app_SID>_D<app_instance> kind=Optional symmetrical=false
    • Replace <db_SID> with your database SID, for example, RH1.
    • Replace <db_instance> with your HANA instance number, for example, 02.
    • Replace <app_SID> with your NetWeaver PAS or AAS SID.
    • Replace <app_instance> with your PAS or AAS instance number.
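
      For example, with the sample HANA SID RH1 and instance number 02; the clone resource name follows the pattern from the command above, so adjust it to the actual name in your HANA cluster:

    [root]# pcs constraint order promote cln_SAPHanaCon_RH1_HDB02 \
    then grp_S4H_D21 kind=Optional symmetrical=false
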
  4. Optional: When you have configured a resource group for managing a legacy database, create resource constraints to ensure that the database resources are started before the PAS instance:

    [root]# pcs constraint order grp_<db_SID>_db \
    then grp_<app_SID>_D<instance> kind=Optional symmetrical=false
    • Replace <db_SID> with your database SID, for example, RH1.
    • Replace <app_SID> with your NetWeaver PAS or AAS SID, for example, S4H.
    • Replace <instance> with your PAS or AAS instance number, for example, 21.
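
      For example:

    [root]# pcs constraint order grp_RH1_db \
    then grp_S4H_D21 kind=Optional symmetrical=false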

Verification

  • Check that the constraints are configured correctly:

    [root]# pcs constraint
    Ordering Constraints:
      start grp_S4H_ASCS20 then start grp_S4H_D21 (kind:Optional)
      start grp_RH1_db then start grp_S4H_D21 (kind:Optional)

5.10. Enabling the SAP HA interface

Managing SAP instances in an HA cluster also means that the instances cannot be managed using SAP tools while the cluster is active and in control of the instances.

You can configure the SAP HA interface to allow SAP admins to manage the SAP application server instances that are controlled by the Pacemaker cluster.

When you enable the SAP HA interface for each SAP application server instance, you ensure that the HA cluster becomes aware of any action performed by the SAP management tools that affects the application instance cluster resources. For example, the HA interface notifies the cluster when an instance it manages is being started or stopped by a SAP tool like SAP Landscape Management (LaMa) or sapcontrol commands.

Procedure

  1. Install the sap-cluster-connector package on all cluster nodes:

    [root]# dnf install sap-cluster-connector
  2. Add the SAP administrative user <sid>adm to the haclient group to allow the SAP user to run cluster commands:

    [root]# usermod -a -G haclient <sid>adm
  3. Add the service/halib configuration to the profiles of all application instances that are managed by the cluster:

    [root]# vi /sapmnt/<SID>/profile/<profile_name>
    …
    service/halib = $(DIR_EXECUTABLE)/saphascriptco.so
    service/halib_cluster_connector = /usr/bin/sap_cluster_connector
    • Replace <SID> with your instance SID, for example, S4H.
    • Replace <profile_name> with each instance profile, for example, S4H_ASCS_s4hascs for the ASCS instance.
  4. Restart the sapstartsrv process of all SAP instances for which you updated the instance profile in the previous step:

    [root]# su - <sid>adm -c "sapcontrol -nr <instance> -function RestartService"
    • Replace <sid> with your lower-case instance SID, for example, s4h.
    • Replace <instance> with the instance number, for example, 20 for the ASCS instance.
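
      For example, for the ASCS instance used in this guide:

    [root]# su - s4hadm -c "sapcontrol -nr 20 -function RestartService"
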
  5. Repeat steps 3 and 4 for every ASCS, ERS and application server instance that the cluster manages.

Verification

  1. Verify that sap_cluster_connector is loaded. You find the information in the sapstartsrv.log file on the node where the instance is running:

    [root]# grep -E "cluster_connector|HA_GetVersion" /usr/sap/S4H/ASCS20/work/sapstartsrv.log
    SAP HA Trace: profile_params: setting cluster_connector = "/usr/bin/sap_cluster_connector"
    SAP HA Trace: Fire system command /usr/bin/sap_cluster_connector init ...
    SAP HA Trace: === SAP_HA_GetVersionInfo ===
    SAP HA Trace: Fire system command /usr/bin/sap_cluster_connector gvi ...
    SAP HA Trace: SAP_HA_GetVersionInfo HA interface version: 3
    SAP HA Trace: SAP_HA_GetVersionInfo HAproduct: Pacemaker
    SAP HA Trace: SAP_HA_GetVersionInfo SAPinterface: sap_cluster_connector
    SAP HA Trace: SAP_HA_GetVersionInfo documentation: https://github.com/ClusterLabs/sap_cluster_connector
    SAP HA Trace: --- SAP_HA_GetVersionInfo Exit-Code: SAP_HA_OK ---
  2. Run the SAP HA interface check on the node where the instance is running. Verify that each line returns a SUCCESS state, for example:

    [root]# su - s4hadm -c "sapcontrol -nr 20 -function HACheckConfig"
    state, category, description, comment
    SUCCESS, SAP CONFIGURATION, Redundant ABAP instance configuration, 0 ABAP instances detected
    SUCCESS, SAP CONFIGURATION, Enqueue separation, All Enqueue server separated from application server
    SUCCESS, SAP CONFIGURATION, MessageServer separation, All MessageServer separated from application server
    SUCCESS, SAP STATE, SCS instance running, SCS instance status ok
    SUCCESS, SAP CONFIGURATION, SAPInstance RA sufficient version (s4hascs_S4H_20), SAPInstance includes is-ers patch
    SUCCESS, SAP CONFIGURATION, Enqueue replication (s4hascs_S4H_20), Enqueue replication enabled
    SUCCESS, SAP STATE, Enqueue replication state (s4hascs_S4H_20), Enqueue replication active
    SUCCESS, SAP CONFIGURATION, SAPInstance RA sufficient version (s4hers_S4H_29), SAPInstance includes is-ers patch
  3. Repeat steps 1-2 for all instances for which you configured the HA interface.

Chapter 6. Testing the setup

Test your new SAP HA cluster thoroughly before you enable it for production workloads.

Enhance the basic example test cases with your specific requirements.

Note

The following test case examples show the testing on a 2-node cluster with ASCS and ERS resource groups of an S/4HANA setup.

6.1. Manually moving the ASCS instance

Test how the cluster moves an application server instance and its related resources from one node to another on demand. You can use this procedure to distribute instances to specific nodes.

Prerequisites

  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  • Move the ASCS resource to any other HA cluster node. You can use either the SAPInstance resource or the resource group in the command:

    [root]# pcs resource move rsc_SAPInstance_<SID>_ASCS<instance> [<node>]
    Location constraint to move resource 'rsc_SAPInstance_S4H_ASCS20' has been created
    Waiting for the cluster to apply configuration changes...
    Location constraint created to move resource 'rsc_SAPInstance_S4H_ASCS20' has been removed
    Waiting for the cluster to apply configuration changes...
    resource 'rsc_SAPInstance_S4H_ASCS20' is running on node 'node2'
    • Replace <SID> with the ASCS SID, for example, S4H.
    • Replace <instance> with the ASCS instance number, for example, 20.
    • Optionally, you can define a target node where the instance is moved to. When you do not define a node, the cluster chooses a healthy target node that meets the configuration.

Verification

  1. Check that the resource group starts fully on the other node, for example, after moving the ASCS group from node1 to node2:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node1
  2. Optional: In a 2-node cluster, verify that after the ASCS resource group has fully started on the new node, the ERS resource group automatically stops and then moves to the node where the ASCS resource group was running before. The colocation constraint triggers this move. Check the system log for the related chain of actions by pacemaker-controld:

    [root]# less /var/log/messages
    …
    … notice: Result of start operation for rsc_SAPInstance_S4H_ASCS20 on node2: ok
    …
    … notice: Requesting local execution of stop operation for rsc_SAPInstance_S4H_ERS29 on node2

6.2. Moving the ASCS instance using sapcontrol

This test verifies that the sapcontrol command can move the instance to the other HA cluster node when the SAP HA interface is enabled for the instance.

Prerequisites

  • You have enabled the SAP HA interface for the ASCS instance.
  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  1. Run the HAFailoverToNode function of sapcontrol to move the ASCS instance to the other node. Execute as user <sid>adm:

    <sid>adm $ sapcontrol -nr <instance> -function HAFailoverToNode ""
  2. Check that the instance stops on the current node and starts on the other node. The ERS instance automatically stops and starts as well after ASCS is fully up. For example, cluster resource status after you have moved the ASCS instance from node1 to node2:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node1
  3. Check the new location constraint that has been created by the manual move due to the HA integration:

    [root]# pcs constraint
    Location Constraints:
      Started resource 'grp_S4H_ASCS20'
        Rules:
          Rule: boolean-op=and score=-INFINITY
            Expression: #uname eq string node1
            Expression: date lt YYYY-MM-DDT13:40:45Z

    The constraint defines that the ASCS resource group is banned for 5 minutes from the original node, which enforces the move to the other node. The date string in the rule defines the time at which the cluster deletes the constraint automatically.

  4. Optional: Remove the temporary constraint to enable the ASCS resource group on the previous node immediately and end this test:

    [root]# pcs resource clear grp_S4H_ASCS20

6.3. Testing failures of the ASCS instance

Test that the pacemaker cluster executes the recovery action when the enqueue server of the ASCS instance or the whole ASCS instance fails.

Prerequisites

  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  1. Identify the process ID (PID) of the enqueue server on the node where the ASCS instance is running. Run the sapcontrol GetProcessList function as user <sid>adm; the PID is the last entry in the output:

    <sid>adm $ sapcontrol -nr <instance> -function GetProcessList
    name, description, dispstatus, textstatus, starttime, elapsedtime, pid
    msg_server, MessageServer, GREEN, Running, YYYY MM DD 14:10:29, 0:01:00, 142607
    enq_server, Enqueue Server 2, GREEN, Running, YYYY MM DD 14:10:29, 0:01:00, 142608

    In the example, the enqueue server PID is 142608.

  2. Send a SIGKILL signal to the identified process to kill it instantly:

    <sid>adm $ kill -9 <pid>
    • Replace <pid> with the PID of the enqueue server, for example, 142608.
  3. Check that the ASCS instance recovers. With the default migration-threshold of all resources set to 3, the cluster restarts the resource on the same node twice before it recovers on a different node.
  4. Repeat steps 1 and 2 until you have killed the process 3 times. The cluster recovers the resource on another node after 3 consecutive failures of the same resource.

Verification

  1. Check that the ASCS instance is running on the other node:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node1

    In our example, the ASCS instance failed on node1 and has been moved to node2. As configured, the ERS instance moved to node1 after ASCS was up on node2.

  2. Check the fail count and notice that the resource failed 3 times, which triggered the recovery on the other node:

    [root]# pcs resource failcount
    Failcounts for resource 'rsc_SAPInstance_S4H_ASCS20'
      node1: 3

6.4. Testing failures of the ERS instance

Test that the pacemaker cluster executes the recovery action when the enqueue replicator of the ERS instance or the whole ERS instance fails.

Prerequisites

  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  1. Identify the process ID (PID) of the enqueue replicator on the node where the ERS instance is running. Run the sapcontrol GetProcessList function as user <sid>adm; the PID is the last entry in the output:

    <sid>adm $ sapcontrol -nr <instance> -function GetProcessList
    name, description, dispstatus, textstatus, starttime, elapsedtime, pid
    enq_replicator, Enqueue Replicator 2, GREEN, Running, YYYY MM DD 15:42:03, 0:00:04, 19124

    In the example, the enqueue replicator PID is 19124.

  2. Send a SIGKILL signal to the identified process to kill it instantly:

    <sid>adm $ kill -9 <pid>
    • Replace <pid> with the PID of the enqueue replicator, for example, 19124.
  3. Check that the ERS instance recovers. With the default migration-threshold of all resources set to 3, the cluster restarts the resource on the same node twice before it recovers on a different node.
  4. Repeat steps 1 and 2 until you have killed the process 3 times. The cluster recovers the resource on another node after 3 consecutive failures of the same resource.

Verification

  1. Check that the ERS instance is running on the other node:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node2

    In our example, the ERS instance failed on node1 and has been moved to node2. Because the ERS instance could no longer run on a different node than the ASCS instance due to the failures, it is restarted on the same node as the ASCS instance.

    When you clear the failure, the cluster automatically moves the ERS instance back to the other node.

  2. Check the fail count and notice that the resource failed 3 times, which triggered the recovery on the other node:

    [root]# pcs resource failcount
    Failcounts for resource 'rsc_SAPInstance_S4H_ERS29'
      node1: 3

6.5. Crashing the node with the ASCS instance

Simulate the crash of the cluster node on which the ASCS instance runs to test the behavior of your cluster resources.

Prerequisites

  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  • Trigger a crash on the node that runs the ASCS instance, for example, node1. This immediately causes a kernel panic on the node, effectively simulating a system crash and the node becomes unresponsive.

    The cluster’s fencing mechanism (STONITH) detects the failure and initiates recovery actions. Typically it fences the node and restarts any failed resources on a surviving cluster node.

    The following command immediately causes a crash of the node on which you run the command, with no further warning:

    [root]# echo c > /proc/sysrq-trigger

Verification

  1. Check that the cluster on the other node fences the crashed node:

    [root]# pcs stonith history
    reboot of node1 successful: delegate=node2, client=pacemaker-controld.1468, origin=node2, completed=...
    1 event found
  2. Check that the cluster starts the ASCS resources on the remaining node. In a 2-node cluster this leads to ASCS and ERS running on the same node:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node2
  3. Check that the cluster automatically moves the ERS instance to the previously failed node after the fenced node is running again:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node2
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node1

    The higher resource-stickiness of the ASCS group and the constraints you configured earlier ensure that the ASCS instance stays in place, which avoids another unnecessary disruption of the service.

6.6. Crashing the node with the ERS instance

Simulate the crash of the cluster node on which the ERS instance runs to test the behavior of your cluster resources.

Prerequisites

  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  • Trigger a crash on the node that runs the ERS instance, for example, node2. This immediately causes a kernel panic on the node, effectively simulating a system crash and the node becomes unresponsive.

    The cluster’s fencing mechanism (STONITH) detects the failure and initiates recovery actions. Typically it fences the node and restarts any failed resources on a surviving cluster node.

    The following command immediately causes a crash of the node on which you run the command, with no further warning:

    [root]# echo c > /proc/sysrq-trigger

Verification

  1. Check that the cluster on the other node fences the crashed node:

    [root]# pcs stonith history
    reboot of node2 successful: delegate=node1, client=pacemaker-controld.1426, origin=node1, completed=...
    1 event found
  2. Check that the cluster starts the ERS resources on the remaining node. In a 2-node cluster this leads to ASCS and ERS running on the same node:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node1
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node1
  3. Check that the cluster automatically moves the ERS instance to the previously failed node after the fenced node is running again:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node1
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node1
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node1
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node2

6.7. Testing instance failures in a multi-node ENSA2 cluster

Test that the pacemaker cluster recovers the ASCS or the ERS instance on a different node after consecutive failures.

The constraints you have configured in the ENSA2 setup try to keep the ASCS and ERS instances on separate nodes and the cluster uses any extra node for the recovery.

In a cluster with more than 2 nodes, the recovery of a failed ASCS or ERS instance is similar. The following test demonstrates an example of a failing ASCS instance.

Prerequisites

  • You have configured your ASCS and ERS instance in an ENSA2 setup.
  • You have configured 3 or more cluster nodes in this cluster.
  • You have configured the additional cluster node(s) to be able to run the instances.
  • You have ensured that all cluster nodes are up and the resource groups for the ASCS and ERS are running on different nodes.
  • You have no failures in the cluster status.

Procedure

  1. Identify the process ID (PID) of the enqueue server on the node where the ASCS instance is running. Run the sapcontrol GetProcessList function as user <sid>adm; the PID is the last entry in the output:

    <sid>adm $ sapcontrol -nr <instance> -function GetProcessList
    name, description, dispstatus, textstatus, starttime, elapsedtime, pid
    enq_server, Enqueue Server 2, GREEN, Running, YYYY MM DD 13:20:07, 0:00:08, 161323

    In the example, the enqueue server PID is 161323.

  2. Send a SIGKILL signal to the identified process to kill it instantly, for example, ASCS on node1:

    <sid>adm $ kill -9 <pid>
    • Replace <pid> with the PID of the enqueue server, for example, 161323.
  3. Check that the ASCS instance recovers on the same node. With the default migration-threshold of all resources set to 3, the cluster restarts the resource on the same node twice before it recovers on a different node.
  4. Repeat steps 1 and 2 until you have killed the process 3 times. The cluster recovers the resource on another node after 3 consecutive failures of the same resource.

Verification

  1. Check that the ASCS instance is now running on the additional node:

    [root]# pcs resource
      * Resource Group: grp_S4H_ASCS20:
        * rsc_vip_S4H_ASCS20        (ocf:heartbeat:IPaddr2):         Started node3
        * rsc_SAPStartSrv_S4H_ASCS20        (ocf:heartbeat:SAPStartSrv):     Started node3
        * rsc_SAPInstance_S4H_ASCS20        (ocf:heartbeat:SAPInstance):     Started node3
      * Resource Group: grp_S4H_ERS29:
        * rsc_vip_S4H_ERS29 (ocf:heartbeat:IPaddr2):         Started node2
        * rsc_SAPStartSrv_S4H_ERS29 (ocf:heartbeat:SAPStartSrv):     Started node2
        * rsc_SAPInstance_S4H_ERS29 (ocf:heartbeat:SAPInstance):     Started node2

    In our example, the ASCS instance failed on node1 and has been moved to node3. The ERS instance stays in place on node2.

  2. Check the fail count and notice that the resource failed 3 times, which triggered the recovery on the other node:

    [root]# pcs resource failcount
    Failcounts for resource 'rsc_SAPInstance_S4H_ASCS20'
      node1: 3

Chapter 7. Adding a node to the cluster

In an SAP S/4HANA ENSA2 setup of your ASCS and ERS instances, you can configure more than two nodes in the cluster to increase the resiliency and flexibility of your environment.

7.1. Preparing a new cluster node

To add a new node to an existing cluster that manages SAP application server instances, you first prepare the instance-specific operating system setup in the same way as you have already configured it on the existing cluster nodes.

Before you proceed, use Software Provisioning Manager (SWPM) to prepare the node for the existing instances. See Running Software Provisioning Manager for more details about the SAP software installation.

Prerequisites

  • You have installed and configured the new HA cluster node according to the recommendations from SAP and Red Hat for running SAP application server instances on RHEL 9. See Operating system requirements.
  • You have mounted the following filesystems on the new HA cluster node:

    • /sapmnt
    • /usr/sap/trans
    • /usr/sap/<SID>
  • You have the installation media available on the new system.

Procedure

  1. On the new node, go to the directory where you have extracted the installation media:

    [root]# cd <software_path>
    • Replace <software_path> with the path to your unpacked media, for example, /sapmedia/SWPM20_SP19/.
  2. Run the installer command on the new node:

    [root]# ./sapinst
  3. Open the web installer UI using the link provided in the terminal.
  4. Open the SAP product you want to install and enter the installation option. Expand the High-Availability System option and select Prepare Additional Cluster Node. Click Next.
  5. Provide the requested installation information on each page and click Next to move forward.

    Some steps, like extracting SAP packages, can take a while. Keep an eye on the terminal in which you started the installer for details of the ongoing process that are not displayed in the web UI.

Verification

  1. Check that the new node has the SAP ports in the services file. For example, count the entries that contain SAP System in their port description and compare the result on all existing nodes:

    [root]# grep -i "SAP System" /etc/services | wc -l
    401

    Update the /etc/services file on the node if it is missing entries.

  2. Check that the /usr/sap/hostctrl/ path is present and that the version is the same as on the existing cluster nodes:

    [root]# /usr/sap/hostctrl/exe/saphostexec -version

7.2. Copying the sapservices file to the new node

SAP instance services are managed through the local /usr/sap/sapservices file, which is created during the instance installation.

On the new cluster node you do not perform an instance installation. Therefore, you must copy this file from an existing node.

Procedure

  • Copy the /usr/sap/sapservices file directly from one node to the new node, for example, using root ssh keys between node1 and node3:

    [root]# rsync -av node1:/usr/sap/sapservices /usr/sap/sapservices

Verification

  1. Check that the file exists and has the same owner and permissions as on the source node:

    [root]# ls -lh /usr/sap/sapservices
    -rwxr-xr-x. 1 root sapinst 208 Jun 16 13:59 /usr/sap/sapservices
  2. Check that the file contains the configured instances, for example, ASCS and ERS:

    [root]# cat /usr/sap/sapservices
    systemctl --no-ask-password start SAPS4H_20 # sapstartsrv pf=/usr/sap/S4H/SYS/profile/S4H_ASCS20_s4hascs
    systemctl --no-ask-password start SAPS4H_29 # sapstartsrv pf=/usr/sap/S4H/SYS/profile/S4H_ERS29_s4hers

7.3. Configuring systemd integration for the SAP instances on the new node

Systemd integration is the default configuration as of SAP Kernel Release 788. In HA environments, you must apply additional modifications to integrate the different systemd services that are involved in the cluster setup.

Prerequisites

  • You have configured the systemd-based SAP startup framework on the existing cluster nodes. Otherwise, skip this configuration.

Procedure

  1. Register the ASCS instance. Run the following SAP command as the root user on the new node to create the systemd integration:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/ASCS<instance>/exe && \
    /usr/sap/<SID>/ASCS<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_ASCS<instance>_<ascs_virtual_hostname> \
    -reg

    The command executes the sapstartsrv service for the selected instance profile and registers the instance service on the current system. It creates the systemd unit for the instance service, if it does not exist, and updates the local /usr/sap/sapservices file.

    • Replace <SID> with your ASCS instance SID, for example, S4H.
    • Replace <instance> with your ASCS instance number, for example, 20.
    • Replace <ascs_virtual_hostname> with the virtual hostname for your ASCS instance, for example, s4hascs.
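
      For example, for the ASCS instance used in this guide:

    [root]# export LD_LIBRARY_PATH=/usr/sap/S4H/ASCS20/exe && \
    /usr/sap/S4H/ASCS20/exe/sapstartsrv \
    pf=/usr/sap/S4H/SYS/profile/S4H_ASCS20_s4hascs \
    -reg
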
  2. Register the ERS instance by repeating step 1 for the ERS profile:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/ERS<instance>/exe && \
    /usr/sap/<SID>/ERS<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_ERS<instance>_<ers_virtual_hostname> \
    -reg
  3. Optional: Register any PAS or AAS instance by repeating step 1 for the respective application server profile. Skip this step if you have not configured PAS or AAS instances in this cluster:

    [root]# export LD_LIBRARY_PATH=/usr/sap/<SID>/D<instance>/exe && \
    /usr/sap/<SID>/D<instance>/exe/sapstartsrv \
    pf=/usr/sap/<SID>/SYS/profile/<SID>_D<instance>_<as_virtual_hostname> \
    -reg
  4. Disable the ASCS, ERS and any other application instance service that the cluster manages:

    [root]# systemctl disable SAP<SID>_<instance>.service
    Removed "/etc/systemd/system/multi-user.target.wants/SAP<SID>_<instance>.service".

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat the same for the PAS or AAS instance services.
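
      For example, for the ASCS instance used in this guide:

    [root]# systemctl disable SAPS4H_20.service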

  5. Create the systemd drop-in directory for the ASCS, ERS and any other application instance service that the cluster manages:

    [root]# mkdir /etc/systemd/system/SAP<SID>_<instance>.service.d

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat using the PAS or AAS instance number.

  6. Create the drop-in files for the instances in the new directory:

    [root]# cat << EOF > /etc/systemd/system/SAP<SID>_<instance>.service.d/HA.conf
    [Service]
    Restart=no
    EOF

    Run this using the ASCS instance number and repeat the command using the ERS instance number.

    Optional: Repeat using the PAS or AAS instance number.

  7. Reload the systemd units to activate the drop-in configuration:

    [root]# systemctl daemon-reload

Verification

  1. Check that all instances have instance systemd units and that they are disabled on the new node:

    [root]# systemctl list-unit-files SAPS4H*
    UNIT FILE         STATE    PRESET
    SAPS4H_20.service disabled disabled
    SAPS4H_29.service disabled disabled

    Optional: PAS or AAS instance service files are listed as well in all of the verification steps when you have configured the application server instances.

  2. Check that the sapservices file contains entries for every instance on every cluster node:

    [root]# cat /usr/sap/sapservices
    systemctl --no-ask-password start SAPS4H_20 # sapstartsrv pf=/sapmnt/S4H/profile/S4H_ASCS20_s4hascs
    systemctl --no-ask-password start SAPS4H_29 # sapstartsrv pf=/sapmnt/S4H/profile/S4H_ERS29_s4hers
  3. Check that all systemd configuration overrides are present:

    [root]# systemd-delta | grep SAP
    ...
    [EXTENDED]   /etc/systemd/system/SAPS4H_20.service → /etc/systemd/system/SAPS4H_20.service.d/HA.conf
    [EXTENDED]   /etc/systemd/system/SAPS4H_29.service → /etc/systemd/system/SAPS4H_29.service.d/HA.conf

7.4. Installing the cluster software and adding the node to the cluster

Prerequisites

  • You have configured the RHEL High Availability repository on the planned cluster nodes.

Procedure

  1. Install the Red Hat High Availability Add-On software packages from the High Availability repository. Choose the same fence agents as you have configured on the existing nodes and execute the installation on the new node:

    [root]# dnf install pcs pacemaker fence-agents-<model>
  2. Start and enable the pcsd service:

    [root]# systemctl enable --now pcsd.service
  3. Optional: If you are running the firewalld service, enable the ports that are required by the Red Hat High Availability Add-On:

    [root]# firewall-cmd --add-service=high-availability
    [root]# firewall-cmd --runtime-to-permanent
  4. Set a password for the user hacluster:

    [root]# passwd hacluster
  5. Authenticate the user hacluster for the new node in the existing cluster. Run this on an existing node, for example, node1:

    [root]# pcs host auth <node3>
    Username: hacluster
    Password:
    <node3>: Authorized
    • Enter the node names with or without FQDN, as defined in the /etc/hosts file.
    • Enter the hacluster user password in the prompt.
  6. Add the new node to the existing cluster. This syncs cluster files between the nodes. Run this on an existing node, for example, node1:

    [root]# pcs cluster node add <node3>
    No addresses specified for host 'node3', using 'node3'
    Disabling sbd...
    node3: sbd disabled
    Sending 'corosync authkey', 'pacemaker authkey' to 'node3'
    node3: successful distribution of the file 'corosync authkey'
    node3: successful distribution of the file 'pacemaker authkey'
    Sending updated corosync.conf to nodes...
    node1: Succeeded
    node3: Succeeded
    node2: Succeeded
    node1: Corosync configuration reloaded
  7. Start the cluster on the new node. Run this on the new node, for example, node3:

    [root]# pcs cluster start
    Starting Cluster...
  8. Enable the cluster to be started automatically on system start, which enables the corosync and pacemaker services. Skip this step if you prefer to manually control the start of the cluster after a node restarts. Run on the new node:

    [root]# pcs cluster enable

Verification

  • Check that the new node is available as a cluster member:

    [root]# pcs cluster status
    Cluster Status:
     Cluster Summary:
       * Stack: corosync (Pacemaker is running)
    …
       * 3 nodes configured
       * 7 resource instances configured
     Node List:
       * Online: [ node1 node2 node3 ]
    
    PCSD Status:
      node3: Online
      node2: Online
      node1: Online

Next steps

For detailed steps, refer to Installing the SAP application server HA components.

Chapter 8. Maintenance procedures

You must apply specific steps when performing maintenance on the different components of an SAP HA environment to ensure that the cluster does not cause unplanned impact.

Use maintenance procedures to keep your cluster in a healthy state during planned change activity or to restore the health after unplanned incidents.

8.1. Cleaning up the failure history

Clear any failure notifications from the cluster that may remain from previous tests. This resets the failure counters that the cluster evaluates against the migration thresholds.

Procedure

  1. Clean up resource failures (to limit the cleanup to a single resource, see the example after this procedure):

    [root]# pcs resource cleanup
  2. Clean up the STONITH failure history:

    [root]# pcs stonith history cleanup
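
Without an argument, the cleanup command processes all resources. To limit the cleanup to a single resource, pass its resource ID, for example, the ASCS instance resource from the examples in this document:

    [root]# pcs resource cleanup rsc_SAPInstance_S4H_ASCS20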

Verification

  1. Check the overall cluster status and confirm that failures are no longer displayed:

    [root]# pcs status --full
  2. Check that the STONITH history for fencing actions has 0 events:

    [root]# pcs stonith history

You can use either the cluster management tools or the SAP tools to move an application server instance to a different node.

Use one of the following sequences of steps and apply it to the application server instance that you want to move:

For updates or offline changes on the HA cluster, the operating system or even the system hardware, you must follow the Recommended Practices for Applying Software Updates to a RHEL High Availability or Resilient Storage Cluster.

For any kind of maintenance of applications or other components that the HA cluster manages, you must enable the cluster maintenance mode to prevent the cluster from interfering during the maintenance.

During the update of your application server instances, the cluster remains running but does not actively monitor resources or take any actions.

Prerequisites

  • You have configured the Pacemaker cluster to manage SAP application server instances.

Procedure

  1. Set maintenance mode for the entire cluster:

    [root]# pcs property set maintenance-mode=true

    Setting maintenance mode for the whole cluster ensures that no activity during the maintenance phase can trigger cluster actions and impact the SAP instance update process.

  2. Verify that the cluster resource management is fully disabled:

    [root]# pcs status
    ...
    
                  *** Resource management is DISABLED ***
      The cluster will not attempt to start, stop or recover services
    ...
  3. Perform the required maintenance on the application server instances using the SAP procedure.
  4. Remove the maintenance mode from the cluster again (a check of the result is shown after this procedure):

    [root]# pcs property set maintenance-mode=
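
To confirm that the property has been removed, you can display the cluster properties. Depending on your pcs version, the maintenance-mode property should no longer be listed or should show its default value of false:

    [root]# pcs property config maintenance-mode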

Chapter 9. Troubleshooting

The monitor operation of a cluster resource can fail when the application does not respond within the defined timeout.

For application resources like SAPInstance, one reason for such a timeout could be that the underlying filesystem on which the application is running is not responding at that moment. When your SAP instances are running on a shared NFS filesystem, a delay in the filesystem response can affect all instances on that filesystem on different cluster nodes. This can happen, for example, when the NFS server is temporarily overloaded or when there is maintenance ongoing on the NFS infrastructure or the related network connection.
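As an informal check of whether the shared filesystem currently responds, you can time a simple metadata operation; the path follows the examples in this document and is not part of the cluster monitoring itself:

    [root]# time stat /sapmnt/S4H/profile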

Commands that the SAPInstance resource runs during a monitor operation (a manual check example follows this list):

  • systemctl status <instance_service>, when the instance is systemd-enabled.
  • pgrep to check for running processes related to the instance.
  • sapcontrol for different functions, like GetProcessList to get the list of instance components and compare them to the MONITOR_SERVICES list.
  • sapstartsrv to start the SAP start service of the instance, if it is not running.

You can increase the monitor operation timeouts of your resources to allow for lags in monitoring responses.

Procedure

  1. Review the current settings of the affected resource, for example, the SAPInstance resource of your ASCS instance. The default monitor timeout of the SAPInstance resource is 60 seconds:

    [root]# pcs resource config rsc_SAPInstance_S4H_ASCS20
    Resource: rsc_SAPInstance_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPInstance)
      Attributes: rsc_SAPInstance_S4H_ASCS20-instance_attributes
        InstanceName=S4H_ASCS20_s4hascs
        MINIMAL_PROBE=true
      Meta Attributes: rsc_SAPInstance_S4H_ASCS20-meta_attributes
        resource-stickiness=5000
      Operations:
        demote: rsc_SAPInstance_S4H_ASCS20-demote-interval-0s
          interval=0s timeout=320s
        methods: rsc_SAPInstance_S4H_ASCS20-methods-interval-0s
          interval=0s timeout=5s
        monitor: rsc_SAPInstance_S4H_ASCS20-monitor-interval-20
          interval=20 timeout=60 on-fail=restart
        promote: rsc_SAPInstance_S4H_ASCS20-promote-interval-0s
          interval=0s timeout=320s
        reload: rsc_SAPInstance_S4H_ASCS20-reload-interval-0s
          interval=0s timeout=320s
        start: rsc_SAPInstance_S4H_ASCS20-start-interval-0s
          interval=0s timeout=180s
        stop: rsc_SAPInstance_S4H_ASCS20-stop-interval-0s
          interval=0s timeout=240s
  2. Update the monitor timeout of the resource to a value that fits your environment and requirements, for example, 120 seconds. If you do not also specify the interval, pcs replaces the previous interval with its default for monitor operations, as shown in the verification output. See the example after this procedure for preserving the interval:

    [root]# pcs resource update rsc_SAPInstance_S4H_ASCS20 op monitor timeout=120s
  3. Verify the updated resource settings:

    [root]# pcs resource config rsc_SAPInstance_S4H_ASCS20
    Resource: rsc_SAPInstance_S4H_ASCS20 (class=ocf provider=heartbeat type=SAPInstance)
      Attributes: rsc_SAPInstance_S4H_ASCS20-instance_attributes
        InstanceName=S4H_ASCS20_s4hascs
        MINIMAL_PROBE=true
      Meta Attributes: rsc_SAPInstance_S4H_ASCS20-meta_attributes
        resource-stickiness=5000
      Operations:
    …
        monitor: rsc_SAPInstance_S4H_ASCS20-monitor-interval-60s
          interval=60s timeout=120s
    …
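
If you want to keep the original monitor interval of 20 seconds while increasing only the timeout, specify both values explicitly when updating the operation:

    [root]# pcs resource update rsc_SAPInstance_S4H_ASCS20 op monitor interval=20 timeout=120s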

The connections between your SAP application servers can be closed when they are idle for some time, depending on the network landscape of your systems.

If you encounter issues, such as lost communication between application instances, you can tune the TCP keepalive settings in the application instances and in the operating system.

Procedure

  1. Check the current values of the following TCP keepalive operating system settings on your SAP application servers, for example:

    [root]# sysctl -a --pattern net.ipv4.tcp_keepalive
    net.ipv4.tcp_keepalive_intvl = 75
    net.ipv4.tcp_keepalive_probes = 9
    net.ipv4.tcp_keepalive_time = 7200
  2. Temporarily update the TCP keepalive settings in a way that helps in your particular situation. The following are example values that SAP recommends in SAP Note 1410736 - TCP/IP: setting keepalive interval. Apply changes only as required for your environment and on all cluster nodes:

    [root]# sysctl -w \
    net.ipv4.tcp_keepalive_time=300 \
    net.ipv4.tcp_keepalive_intvl=75 \
    net.ipv4.tcp_keepalive_probes=9
  3. Test if the settings improve the situation.
  4. Make the changes permanent. Add the TCP keepalive settings to a sysctl configuration file for a persistent setup, and do this on every cluster node (see the example after this procedure for applying the file immediately):

    [root]# cat << EOF >> /etc/sysctl.d/sap.conf
    net.ipv4.tcp_keepalive_time=300
    net.ipv4.tcp_keepalive_intvl=75
    net.ipv4.tcp_keepalive_probes=9
    EOF
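
If you created the configuration file on a node where you did not already apply the settings with sysctl -w, you can load the file immediately without a reboot:

    [root]# sysctl -p /etc/sysctl.d/sap.conf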

Verification

  • Check the new values of the TCP keepalive operating system settings on all cluster nodes:

    [root]# sysctl -a --pattern net.ipv4.tcp_keepalive
    net.ipv4.tcp_keepalive_intvl = 75
    net.ipv4.tcp_keepalive_probes = 9
    net.ipv4.tcp_keepalive_time = 300

Appendix A. Component options

A.1. SAPInstance resource parameters

Parameters that are available for the configuration of SAPInstance resources are shown below:

Each parameter is listed with an indication of whether it is required, its default value, and a description.

InstanceName

Required: yes. No default.

The full SAP instance profile name (<SID>_<NAME><instance>_<virt_hostname>), for example, S4H_ASCS20_s4hascs.

START_PROFILE

Required: no. No default.

The name of the SAP instance profile. Specify this parameter if you have changed the name of the SAP instance profile after the default SAP installation.

SAP standard paths are searched by default.

IS_ERS

Required: no. Default: false.

Only used for ASCS/ERS SAP NetWeaver installations without a promotable resource, to allow the ASCS instance to find the ERS instance running on another cluster node after a resource failure.

You must set this parameter to true for the ERS instance resource in an ENSA1 setup.

DIR_EXECUTABLE

Required: no. No default.

The fully qualified path to binaries such as sapstartsrv and sapcontrol. Specify this parameter if you have changed the SAP kernel directory location after the default SAP installation.

SAP standard paths are searched by default.

DIR_PROFILE

Required: no. No default.

The fully qualified path to the SAP START profile. Specify this parameter if you have changed the SAP profile directory location after the default SAP installation.

SAP standard paths are searched by default.

AUTOMATIC_RECOVER

Required: no. Default: false.

The resource agent tries to recover a failed start attempt automatically one time. This is done by killing running instance processes, removing the kill.sap file, and executing cleanipc. Sometimes a crashed SAP instance leaves processes or shared memory segments behind. Set this option to true to remove those leftovers during a start operation.

MONITOR_SERVICES

Required: no. Default: disp+work|msg_server|enserver|enrepserver|jcontrol|jstart|enq_server|enq_replicator

The list of services of an SAP instance that are monitored to determine the health of the instance. To monitor more, fewer, or other services that sapstartsrv supports, change the list by using this parameter. The names must match the strings used in the output of the command sapcontrol -nr <instance> -function GetProcessList. Use the pipe sign to separate multiple services. The value of this parameter must always be the complete list of the services to monitor.

SHUTDOWN_METHOD

Required: no. Default: normal.

Usually an SAP instance is stopped by the command sapcontrol -nr <instance> -function Stop.

Set this parameter to KILL to kill the SAP instance by using operating system commands. This terminates the SAP processes of the instance with kill -9, deletes shared memory by using cleanipc, and deletes the kill.sap file. This method is much faster than the graceful stop, but the instance does not have the chance to notify other SAP instances in the same system. Use with care!

START_WAITTIME

Required: no. Default: 3600.

The maximum time in seconds that the cluster waits after a resource start before the resource executes a monitor operation.

ERS_InstanceName

Required: no. No default.

Only enable this parameter in a promotable resource configuration.

The fully qualified SAP enqueue replication instance name. Usually this is the name of the SAP instance profile. You must use the following properties for the promotable configuration in the cluster:

  • clone_max = 2
  • clone_node_max = 1
  • master_node_max = 1
  • master_max = 1

ERS_START_PROFILE

Required: no. No default.

Only use this parameter in a promotable resource configuration.

Specify this parameter if you have changed the name of the ERS instance profile after the default SAP installation.

SAP standard paths are searched by default.

MINIMAL_PROBE

Required: no. Default: false.

Setting this parameter to true forces the resource agent to do only a minimal check during the resource probe. This is needed for special filesystem setups.

Enabling this parameter is only supported when specified in the configuration steps.
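
As an illustrative sketch of how several of these parameters are combined, the following command creates an ERS instance resource for an ENSA1 setup. The resource and profile names follow the examples in this document; the filesystem, group, and constraint configuration of a complete setup is omitted:

    [root]# pcs resource create rsc_SAPInstance_S4H_ERS29 SAPInstance \
        InstanceName=S4H_ERS29_s4hers \
        AUTOMATIC_RECOVER=false \
        IS_ERS=true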

A.2. SAPStartSrv resource parameters

Parameters that are available for the configuration of SAPStartSrv resources are shown below:

Each parameter is listed with an indication of whether it is required, its default value, and a description.

InstanceName

Required: yes. No default.

The full SAP instance profile name (<SID>_<NAME><instance>_<virt_hostname>), for example, S4H_ASCS20_s4hascs.

START_PROFILE

Required: no. No default.

The name of the SAP instance profile. Specify this parameter if you have changed the name of the SAP instance profile after the default SAP installation.

SAP standard paths are searched by default.
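
A minimal sketch of a SAPStartSrv resource for the ASCS instance from the examples in this document; in a simple mount setup, this resource is typically placed in the same group as the corresponding SAPInstance resource:

    [root]# pcs resource create rsc_SAPStartSrv_S4H_ASCS20 SAPStartSrv \
        InstanceName=S4H_ASCS20_s4hascs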

A.3. SAPDatabase resource parameters

Parameters that are available for the configuration of SAPDatabase resources are shown below:

Each parameter is listed with an indication of whether it is required, its default value, and a description.

SID

Required: yes. No default.

The SAP system identifier.

DBTYPE

Required: yes. No default.

The name of the database vendor you use. Set this parameter to ADA, DB6, ORA, or SYB accordingly.

DBINSTANCE

Required: no. No default.

You must set this parameter for implementations where the database instance name is not the same as the SID, for example, when using Oracle Data Guard.

DBOSUSER

Required: no.

Set this parameter when your database uses a different operating system user than the default.

Defaults per database type:

  • ADA: defined in /etc/opt/sdb
  • DB6: db2<sid>
  • ORA: ora<sid> and oracle
  • SYB: syb<sid>

STRICT_MONITORING

Required: no. Default: false.

This parameter controls how the resource agent monitors the database. Set it to true to use saphostctrl -function GetDatabaseStatus for testing the database state. By default, and when set to false, only operating system processes are monitored.

AUTOMATIC_RECOVER

Required: no. Default: false.

Set this parameter to true to always call saphostctrl -function StartDatabase with the -force option.

MONITOR_SERVICES

Required: no.

Defines which services are monitored by the SAPDatabase resource agent. Service names must correspond to the output of the saphostctrl -function GetDatabaseStatus command.

The database type DBTYPE defines the default value as follows:

  • ADA: "Database"
  • DB6: "{SID}|{db2sid}"
  • ORA: "Instance|Database|Listener"
  • SYB: "Server"

Set this parameter only if you need to monitor different services than the ones listed above.
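
An illustrative sketch of a SAPDatabase resource, assuming an SAP ASE (SYB) database for the S4H system; adapt SID and DBTYPE to your environment:

    [root]# pcs resource create rsc_SAPDatabase_S4H SAPDatabase \
        SID=S4H \
        DBTYPE=SYB \
        STRICT_MONITORING=true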

Legal Notice

Copyright © 2025 Red Hat, Inc.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, the Red Hat logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.