3.1. Shared File System Master/Slave
File locking requirements
The shared file system requires an efficient and reliable file locking mechanism to function correctly. Not all SAN file systems are compatible with the configuration needs of the shared file system.
OCFS2 is incompatible with this master/slave configuration, because mutex file locking from Java is not supported.
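For context, the kind of lock at issue can be illustrated with the standard java.nio API. The following is a minimal sketch, not the A-MQ implementation: the class name SharedStoreLock and its methods are hypothetical, and only FileChannel and FileLock are standard Java. It tries once to take an exclusive lock on a lock file. Whether such a call behaves reliably depends on the underlying file system, which is why the choice of shared file system matters.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of the A-MQ API.
final class SharedStoreLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private SharedStoreLock(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    // Tries once to take an exclusive lock on lockFile.
    // Returns null if another process already holds the lock.
    static SharedStoreLock tryAcquire(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock;
        try {
            lock = channel.tryLock();
        } catch (IOException | RuntimeException e) {
            channel.close(); // do not leak the channel if locking fails
            throw e;
        }
        if (lock == null) {
            channel.close();
            return null;
        }
        return new SharedStoreLock(channel, lock);
    }

    // True while the lock has not been released and the channel is open.
    boolean isValid() {
        return lock.isValid();
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

A process acting as master would hold the returned object for as long as it owns the store and call close() when it shuts down.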
NFSv3 is incompatible with this master/slave configuration. In the event of an abnormal termination of a master broker, which is an NFSv3 client, the NFSv3 server does not time out the lock held by the client. This renders the Red Hat JBoss A-MQ data directory inaccessible. Because of this, the slave broker cannot acquire the lock and therefore cannot start up. In this case, the only way to unblock the master/slave in NFSv3 is to reboot all broker instances.
NFSv4, on the other hand, is compatible with the master/slave configuration, as its design includes timeouts for locks. When an NFSv4 client holding a lock terminates abnormally, NFSv4 automatically releases the lock after thirty seconds, allowing another NFSv4 client to grab it.
It is possible for a slave to grab the lock from the master without the master's knowledge when NFSv4 crashes. This can happen because the master broker does not automatically check whether it still holds the lock, which gives a slave the chance to grab it once the NFSv4 thirty-second timeout elapses.
The persistence adapter's lockKeepAlivePeriod attribute enables you to avoid this scenario. Setting the lockKeepAlivePeriod attribute instructs the master to check, at the specified interval (in milliseconds), whether it still holds the lock (that is, the lock is valid) and whether the lock file still exists. If the master discovers that the lock is invalid, it tries to regain it. If that attempt fails, or if the lock file no longer exists, the master shuts down, allowing a slave to try to get the lock and become master. In attempting to get the lock, the slave also checks whether the lock file exists; if it does not, the slave assumes that the integrity of the store has been compromised and shuts down.
To enable this lock-checking mechanism, add the lockKeepAlivePeriod attribute to the persistence adapter configuration (the kahaDB element inside persistenceAdapter, as in Example 3.1) in the broker configuration. For example, setting lockKeepAlivePeriod="5000" instructs the master broker to check at five-second intervals whether the lock is still valid and that the lock file exists.
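The keep-alive check described above can be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the broker's actual code: the class LockKeepAliveSketch, the Action enum, and the tryRegain and shutdownBroker callbacks are all hypothetical names. The decision logic is kept in a pure method so that it can be read, and tested, separately from the timer.

```java
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of the periodic lock check; not A-MQ code.
final class LockKeepAliveSketch {
    enum Action { KEEP_RUNNING, SHUT_DOWN }

    // Decides what the master does at each check:
    // a missing lock file means shut down; an invalid lock means
    // try to regain it and shut down only if that attempt fails.
    static Action check(boolean lockFileExists, boolean lockValid,
                        BooleanSupplier tryRegain) {
        if (!lockFileExists) {
            return Action.SHUT_DOWN;
        }
        if (lockValid) {
            return Action.KEEP_RUNNING;
        }
        return tryRegain.getAsBoolean() ? Action.KEEP_RUNNING : Action.SHUT_DOWN;
    }

    // Runs the check every periodMillis (for example 5000) and invokes
    // shutdownBroker once when the master can no longer claim the lock.
    static ScheduledExecutorService startChecks(FileLock lock, Path lockFile,
                                                long periodMillis,
                                                BooleanSupplier tryRegain,
                                                Runnable shutdownBroker) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            Action action = check(Files.exists(lockFile), lock.isValid(), tryRegain);
            if (action == Action.SHUT_DOWN) {
                shutdownBroker.run();
                timer.shutdown(); // stop further checks
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

In this sketch the period plays the role of lockKeepAlivePeriod: a shorter period narrows the window in which a master could keep running after losing the lock, at the cost of more frequent checks against the shared file system.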
In the shared file system master/slave configuration, there is nothing special to distinguish a master broker from the slave brokers. The membership of a particular master/slave group is defined by the fact that all of the brokers in the group use the same persistence layer and store their data in the same shared directory.
Example 3.1. Shared File System Broker Configuration
<broker ... >
  ...
  <persistenceAdapter>
    <kahaDB directory="/sharedFileSystem/sharedBrokerData" lockKeepAlivePeriod="5000"/>
  </persistenceAdapter>
  ...
</broker>
All of the brokers in the group must share the same persistenceAdapter element.