此内容没有您所选择的语言版本。
Chapter 6. Set Up Distribution Mode
6.1. About Distribution Mode 复制链接链接已复制到粘贴板!
When enabled, Red Hat JBoss Data Grid’s distribution mode stores each entry on a subset of the nodes in the grid instead of replicating each entry on every node. Typically, each entry is stored on more than one node for redundancy and fault tolerance.
As a result of storing entries on selected nodes across the cluster, distribution mode provides improved scalability compared to other clustered modes.
A cache using distribution mode can transparently locate keys across a cluster using the consistent hash algorithm.
6.2. Distribution Mode’s Consistent Hash Algorithm 复制链接链接已复制到粘贴板!
The hashing algorithm in Red Hat JBoss Data Grid is based on consistent hashing. The term consistent hashing is still used for this implementation, despite some divergence from a traditional consistent hash.
Distribution mode uses a consistent hash algorithm to select a node from the cluster to store entries upon. The consistent hash algorithm is configured with the number of copies of each cache entry to be maintained within the cluster. Unlike generic consistent hashing, the implementation used in JBoss Data Grid splits the key space into fixed segments. The number of segments is configurable using numSegments and cannot be changed without restarting the cluster. The mapping of keys to segments is also fixed — a key maps to the same segment, regardless of how the topology of the cluster changes.
The number of copies set for each data item requires balancing performance and fault tolerance. Creating too many copies of the entry can impair performance and too few copies can result in data loss in case of node failure.
Each hash segment is mapped to a list of nodes called owners. The order is important because the first owner (also known as the primary owner) has a special role in many cache operations (for example, locking). The other owners are called backup owners. There is no rule about mapping segments to owners, although the hashing algorithms simultaneously balance the number of segments allocated to each node and minimize the number of segments that have to move after a node joins or leaves the cluster.
6.3. Locating Entries in Distribution Mode 复制链接链接已复制到粘贴板!
The consistent hash algorithm used in Red Hat JBoss Data Grid’s distribution mode can locate entries deterministically, without multicasting a request or maintaining expensive metadata.
A PUT operation can result in as many remote calls as specified by the owners parameter, while a GET operation executed on any node in the cluster results in a single remote call. In the background, the GET operation results in the same number of remote calls as a PUT operation (specifically the value of the owners parameter), but these occur in parallel and the returned entry is passed to the caller as soon as one returns.
6.4. Return Values in Distribution Mode 复制链接链接已复制到粘贴板!
In Red Hat JBoss Data Grid’s distribution mode, a synchronous request is used to retrieve the previous return value if it cannot be found locally. A synchronous request is used for this task irrespective of whether distribution mode is using asynchronous or synchronous processes.
6.5. Configure Distribution Mode 复制链接链接已复制到粘贴板!
Distribution mode is a clustered mode in Red Hat JBoss Data Grid. Distribution mode can be added to any cache container, in both Library Mode and Remote Client-Server Mode, using the following procedure:
The distributed-cache Element
The distributed-cache element configures settings for the distributed cache using the following parameters:
-
The
nameparameter provides a unique identifier for the cache. -
If
statisticsare enabled at the container level, per-cache statistics can be selectively disabled for caches that do not require monitoring by setting thestatisticsattribute tofalse.
JGroups must be appropriately configured for clustered mode before attempting to load this configuration.
6.6. Synchronous and Asynchronous Distribution 复制链接链接已复制到粘贴板!
To elicit meaningful return values from certain public API methods, it is essential to use synchronized communication when using distribution mode.
Communication Mode example
For example, with three nodes in a cluster, node A, B and C, and a key K that maps nodes A and B. Perform an operation on node C that requires a return value, for example Cache.remove(K). To execute successfully, the operation must first synchronously forward the call to both node A and B, and then wait for a result returned from either node A or B. If asynchronous communication was used, the usefulness of the returned values cannot be guaranteed, despite the operation behaving as expected.