Chapter 5. The Grouping API
5.1. The Grouping API
The Grouping API can relocate groups of entries to a specified node or to a node selected using the hash of the group.
5.2. Grouping API Operations
Normally, Red Hat JBoss Data Grid uses the hash of a specific key to determine an entry’s destination node. However, when the Grouping API is used, a hash of the group associated with the key is used instead of the hash of the key to determine the destination node.
Each node can use an algorithm to determine the owner of each key. This removes the need to pass metadata (and metadata updates) about the location of entries between nodes. This approach is beneficial because:
- Every node can determine which node owns a particular key without expensive metadata updates across nodes.
- Redundancy is improved because ownership information does not need to be replicated if a node fails.
When using the Grouping API, each node must be able to calculate the owner of an entry. As a result, the group cannot be specified manually and must be either:
- Intrinsic to the entry, which means it was generated by the key class.
- Extrinsic to the entry, which means it was generated by an external function.
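The routing rule described above can be modelled in a few lines. The following is a minimal sketch, not the consistent-hash implementation used by JBoss Data Grid: the class name GroupRoutingSketch, the method ownerIndex, and the modulo-based node selection are all assumptions made for illustration. It shows only that each node can compute the owner locally, and that keys sharing a group resolve to the same owner.

```java
import java.util.Objects;

// Hypothetical, simplified model of group-based routing.
// This is NOT how JBoss Data Grid selects owners internally.
final class GroupRoutingSketch {

    // Returns the index of the owning node. When a group is present, the
    // group's hash replaces the key's hash, so all keys in that group
    // resolve to the same owner.
    static int ownerIndex(Object key, String group, int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        Object hashed = (group != null) ? group : key;
        return Math.floorMod(Objects.hashCode(hashed), nodeCount);
    }

    public static void main(String[] args) {
        int nodes = 4;
        // Two different keys in the same group always share an owner.
        System.out.println(ownerIndex("user-17", "London", nodes));
        System.out.println(ownerIndex("user-42", "London", nodes));
        // Without a group, the key's own hash decides the owner.
        System.out.println(ownerIndex("user-42", null, nodes));
    }
}
```

Because the calculation depends only on the key (or its group) and the number of nodes, any node can repeat it and reach the same answer without exchanging location metadata.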
5.3. Grouping API Use Case
This feature allows logically related data to be stored on a single node. For example, if the cache contains user information, the information for all users in a single location can be stored on a single node.
The benefit of this approach is that when seeking specific (logically related) data, the Distributed Executor task is directed to run only on the relevant node rather than across all nodes in the cluster. Such directed operations result in optimized performance.
Grouping API Example
Acme, Inc. is a home appliance company with over one hundred offices worldwide. Some offices house employees from various departments, while certain locations are occupied exclusively by the employees of one or two departments. The Human Resources (HR) department has employees in Bangkok, London, Chicago, Nice and Venice.
Acme, Inc. uses Red Hat JBoss Data Grid’s Grouping API to ensure that all the employee records for the HR department are stored on a single node (Node AB) in the cache. As a result, when attempting to retrieve a record for an HR employee, the DistributedExecutor only checks Node AB and quickly retrieves the required employee records.
Storing related entries on a single node, as illustrated, optimizes data access and avoids wasting time and resources, because information is sought on a single node (or a small subset of nodes) instead of on all the nodes in the cluster.
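The Acme, Inc. scenario can be imitated with a small in-memory model. This is an illustrative sketch only: the class AcmeGroupingSketch, its methods, and the use of plain maps as "nodes" are hypothetical, and the department is passed in explicitly, whereas a real cache would derive the group from the key. It shows that records written with the same group land on one node and that a lookup consults only that node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a cluster; each map plays the role of a node.
final class AcmeGroupingSketch {

    private final List<Map<String, String>> nodes = new ArrayList<>();

    AcmeGroupingSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // The department acts as the group: its hash alone selects the node.
    int nodeFor(String department) {
        return Math.floorMod(department.hashCode(), nodes.size());
    }

    void put(String employeeId, String department, String record) {
        nodes.get(nodeFor(department)).put(employeeId, record);
    }

    // Only the node that owns the department is consulted.
    String get(String employeeId, String department) {
        return nodes.get(nodeFor(department)).get(employeeId);
    }

    int entriesOnNode(int index) {
        return nodes.get(index).size();
    }

    public static void main(String[] args) {
        AcmeGroupingSketch cluster = new AcmeGroupingSketch(5);
        cluster.put("emp-1", "HR", "Bangkok office");
        cluster.put("emp-2", "HR", "London office");
        cluster.put("emp-3", "HR", "Chicago office");
        // All three HR records sit on the same node, regardless of office.
        System.out.println(cluster.entriesOnNode(cluster.nodeFor("HR")));
        System.out.println(cluster.get("emp-2", "HR"));
    }
}
```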
5.4. Configure the Grouping API
5.4.1. Configure the Grouping API
Use the following steps to configure the Grouping API:
- Enable groups using either the declarative or programmatic method.
- Specify either an intrinsic or extrinsic group. For more information about these group types, see Specify an Intrinsic Group and Specify an Extrinsic Group.
- Register all specified groupers.
5.4.2. Enable Groups
The first step in setting up the Grouping API is to enable groups. The following example demonstrates how to enable groups programmatically:
Configuration c = new ConfigurationBuilder().clustering().hash().groups().enabled().build();
5.4.3. Specify an Intrinsic Group
Use an intrinsic group with the Grouping API if:
- the key class definition can be altered, that is, it is not part of an unmodifiable library.
- the key class is not concerned with the determination of a key/value pair group.
Use the @Group annotation on the relevant method to specify an intrinsic group. The group must always be a String, as illustrated in the following example:
Specifying an Intrinsic Group Example
import org.infinispan.distribution.group.Group;

class User {

    // Additional configuration information here
    String office;
    // Additional configuration information here

    public int hashCode() {
        // Defines the hash for the key, normally used to determine location
        // Additional configuration information here
    }

    // Override the location by specifying a group, all keys in the same
    // group end up with the same owner
    @Group
    String getOffice() {
        return office;
    }
}
5.4.4. Specify an Extrinsic Group
Specify an extrinsic group for the Grouping API if:
- the key class definition cannot be altered, that is, it is part of an unmodifiable library.
- the key class is concerned with the determination of a key/value pair group.
An extrinsic group is specified using an implementation of the Grouper interface. This interface uses the computeGroup method to return the group.
In the process of specifying an extrinsic group, the Grouper interface acts as an interceptor by passing the previously computed value to computeGroup. If the @Group annotation is used, the group it specifies is passed to the first Grouper. As a result, a Grouper provides even greater control over the group when an intrinsic group is used.
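The interceptor behaviour described above can be pictured as a simple chain. The following is a sketch under stated assumptions, not the library's internal code: GrouperSketch is a hypothetical stand-in that mirrors only the computeGroup method of the Grouper interface, and GrouperChainSketch.resolveGroup is an invented helper. It shows the intrinsic group (or null when none exists) entering the first grouper, with each result handed to the next.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the Grouper interface, reduced to the one
// method discussed here.
interface GrouperSketch<T> {
    String computeGroup(T key, String group);
}

final class GrouperChainSketch {

    // Starts with the intrinsic group (null if the key defines none) and
    // passes each grouper's result to the next grouper in the list.
    static <T> String resolveGroup(T key, String intrinsicGroup,
                                   List<GrouperSketch<T>> groupers) {
        String group = intrinsicGroup;
        for (GrouperSketch<T> grouper : groupers) {
            group = grouper.computeGroup(key, group);
        }
        return group;
    }

    public static void main(String[] args) {
        // Keeps an existing group, or supplies a fallback when there is none.
        GrouperSketch<String> keepOrDefault =
                (key, group) -> group != null ? group : "unassigned";
        // Normalizes whatever group the previous grouper produced.
        GrouperSketch<String> upperCase =
                (key, group) -> group.toUpperCase(Locale.ROOT);

        List<GrouperSketch<String>> chain = Arrays.asList(keepOrDefault, upperCase);
        System.out.println(resolveGroup("k1", "london", chain));
        System.out.println(resolveGroup("k1", null, chain));
    }
}
```

A grouper is free to ignore the incoming value entirely, in which case any intrinsic group on the key has no effect on the result.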
Specifying an Extrinsic Group Example
The following example consists of a simple Grouper that uses a pattern to extract the group from a key. Any group information specified on the key class is ignored in such a situation.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.infinispan.distribution.group.Grouper;

public class KXGrouper implements Grouper<String> {

    // A pattern that can extract from a "kX" (e.g. k1, k2) style key.
    // The pattern requires a String key, of length 2, where the first character is
    // "k" and the second character is a digit. We take that digit, and perform
    // modular arithmetic on it to assign it to group "0" or group "1".
    private static final Pattern kPattern = Pattern.compile("(^k)(\\d)$");

    @Override
    public String computeGroup(String key, String group) {
        Matcher matcher = kPattern.matcher(key);
        if (matcher.matches()) {
            // The incoming group argument is deliberately ignored
            return String.valueOf(Integer.parseInt(matcher.group(2)) % 2);
        }
        return null;
    }

    @Override
    public Class<String> getKeyType() {
        return String.class;
    }
}
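To see which groups the pattern produces, the same matching logic can be run in isolation. This is a standalone sketch for experimentation only: the class KXPatternSketch and its groupFor method are hypothetical and do not implement the Grouper interface, so the snippet runs without the library on the classpath.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone copy of the kX matching logic, for illustration only.
final class KXPatternSketch {

    private static final Pattern K_PATTERN = Pattern.compile("(^k)(\\d)$");

    // Returns "0" or "1" for keys such as k1 or k2, and null for any other key.
    static String groupFor(String key) {
        Matcher matcher = K_PATTERN.matcher(key);
        if (matcher.matches()) {
            return String.valueOf(Integer.parseInt(matcher.group(2)) % 2);
        }
        return null;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"k1", "k2", "k8", "k10", "user"}) {
            System.out.println(key + " -> " + groupFor(key));
        }
    }
}
```

Keys with an odd digit fall into group "1" and keys with an even digit into group "0". Keys that do not match the pattern, such as k10 or user, yield null.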
5.4.5. Register Groupers
After creation, each grouper must be registered before it can be used.
Programmatically Register a Grouper
Configuration c = new ConfigurationBuilder().clustering().hash().groups().addGrouper(new KXGrouper()).enabled().build();