Chapter 4. Data Grid Cache Interface
Data Grid provides a Cache interface that exposes simple methods for adding, retrieving and removing entries, including atomic mechanisms exposed by the JDK’s ConcurrentMap interface. Based on the cache mode used, invoking these methods will trigger a number of things to happen, potentially even including replicating an entry to a remote node or looking up an entry from a remote node, or potentially a cache store.
4.1. Cache API
For simple usage, using the Cache API should be no different from using the JDK Map API, and hence migrating from simple in-memory caches based on a Map to Data Grid’s Cache should be trivial.
4.1.1. Performance Concerns of Certain Map Methods
Certain methods exposed in Map have certain performance consequences when used with Data Grid, such as size() , values() , keySet() and entrySet() . Specific methods on the keySet
, values
and entrySet
are fine for use please see their Javadoc for further details.
Attempting to perform these operations globally would have large performance impact as well as become a scalability bottleneck. As such, these methods should only be used for informational or debugging purposes only.
It should be noted that using certain flags with the withFlags() method can mitigate some of these concerns, please check each method’s documentation for more details.
4.1.2. Mortal and Immortal Data
Further to simply storing entries, Data Grid’s cache API allows you to attach mortality information to data. For example, simply using put(key, value) would create an immortal entry, i.e., an entry that lives in the cache forever, until it is removed (or evicted from memory to prevent running out of memory). If, however, you put data in the cache using put(key, value, lifespan, timeunit) , this creates a mortal entry, i.e., an entry that has a fixed lifespan and expires after that lifespan.
In addition to lifespan , Data Grid also supports maxIdle as an additional metric with which to determine expiration. Any combination of lifespans or maxIdles can be used.
4.1.3. putForExternalRead operation
Data Grid’s Cache class contains a different 'put' operation called putForExternalRead . This operation is particularly useful when Data Grid is used as a temporary cache for data that is persisted elsewhere. Under heavy read scenarios, contention in the cache should not delay the real transactions at hand, since caching should just be an optimization and not something that gets in the way.
To achieve this, putForExternalRead()
acts as a put call that only operates if the key is not present in the cache, and fails fast and silently if another thread is trying to store the same key at the same time. In this particular scenario, caching data is a way to optimise the system and it’s not desirable that a failure in caching affects the on-going transaction, hence why failure is handled differently. putForExternalRead()
is considered to be a fast operation because regardless of whether it’s successful or not, it doesn’t wait for any locks, and so returns to the caller promptly.
To understand how to use this operation, let’s look at basic example. Imagine a cache of Person instances, each keyed by a PersonId , whose data originates in a separate data store. The following code shows the most common pattern of using putForExternalRead within the context of this example:
// Id of the person to look up, provided by the application PersonId id = ...; // Get a reference to the cache where person instances will be stored Cache<PersonId, Person> cache = ...; // First, check whether the cache contains the person instance // associated with with the given id Person cachedPerson = cache.get(id); if (cachedPerson == null) { // The person is not cached yet, so query the data store with the id Person person = dataStore.lookup(id); // Cache the person along with the id so that future requests can // retrieve it from memory rather than going to the data store cache.putForExternalRead(id, person); } else { // The person was found in the cache, so return it to the application return cachedPerson; }
Note that putForExternalRead should never be used as a mechanism to update the cache with a new Person instance originating from application execution (i.e. from a transaction that modifies a Person’s address). When updating cached values, please use the standard put operation, otherwise the possibility of caching corrupt data is likely.
4.2. AdvancedCache API
In addition to the simple Cache interface, Data Grid offers an AdvancedCache interface, geared towards extension authors. The AdvancedCache offers the ability to access certain internal components and to apply flags to alter the default behavior of certain cache methods. The following code snippet depicts how an AdvancedCache can be obtained:
AdvancedCache advancedCache = cache.getAdvancedCache();
4.2.1. Flags
Flags are applied to regular cache methods to alter the behavior of certain methods. For a list of all available flags, and their effects, see the Flag enumeration. Flags are applied using AdvancedCache.withFlags() . This builder method can be used to apply any number of flags to a cache invocation, for example:
advancedCache.withFlags(Flag.CACHE_MODE_LOCAL, Flag.SKIP_LOCKING) .withFlags(Flag.FORCE_SYNCHRONOUS) .put("hello", "world");
4.3. Listeners and Notifications
Data Grid offers a listener API, where clients can register for and get notified when events take place. This annotation-driven API applies to 2 different levels: cache level events and cache manager level events.
Events trigger a notification which is dispatched to listeners. Listeners are simple POJOs annotated with @Listener and registered using the methods defined in the Listenable interface.
Both Cache and CacheManager implement Listenable, which means you can attach listeners to either a cache or a cache manager, to receive either cache-level or cache manager-level notifications.
For example, the following class defines a listener to print out some information every time a new entry is added to the cache, in a non blocking fashion:
@Listener public class PrintWhenAdded { Queue<CacheEntryCreatedEvent> events = new ConcurrentLinkedQueue<>(); @CacheEntryCreated public CompletionStage<Void> print(CacheEntryCreatedEvent event) { events.add(event); return null; } }
For more comprehensive examples, please see the Javadocs for @Listener.
4.3.1. Cache-level notifications
Cache-level events occur on a per-cache basis, and by default are only raised on nodes where the events occur. Note in a distributed cache these events are only raised on the owners of data being affected. Examples of cache-level events are entries being added, removed, modified, etc. These events trigger notifications to listeners registered to a specific cache.
Please see the Javadocs on the org.infinispan.notifications.cachelistener.annotation package for a comprehensive list of all cache-level notifications, and their respective method-level annotations.
Please refer to the Javadocs on the org.infinispan.notifications.cachelistener.annotation package for the list of cache-level notifications available in Data Grid.
4.3.1.1. Cluster Listeners
The cluster listeners should be used when it is desirable to listen to the cache events on a single node.
To do so all that is required is set to annotate your listener as being clustered.
@Listener (clustered = true) public class MyClusterListener { .... }
There are some limitations to cluster listeners from a non clustered listener.
-
A cluster listener can only listen to
@CacheEntryModified
,@CacheEntryCreated
,@CacheEntryRemoved
and@CacheEntryExpired
events. Note this means any other type of event will not be listened to for this listener. - Only the post event is sent to a cluster listener, the pre event is ignored.
4.3.1.2. Event filtering and conversion
All applicable events on the node where the listener is installed will be raised to the listener. It is possible to dynamically filter what events are raised by using a KeyFilter (only allows filtering on keys) or CacheEventFilter (used to filter for keys, old value, old metadata, new value, new metadata, whether command was retried, if the event is before the event (ie. isPre) and also the command type).
The example here shows a simple KeyFilter
that will only allow events to be raised when an event modified the entry for the key Only Me
.
public class SpecificKeyFilter implements KeyFilter<String> { private final String keyToAccept; public SpecificKeyFilter(String keyToAccept) { if (keyToAccept == null) { throw new NullPointerException(); } this.keyToAccept = keyToAccept; } public boolean accept(String key) { return keyToAccept.equals(key); } } ... cache.addListener(listener, new SpecificKeyFilter("Only Me")); ...
This can be useful when you want to limit what events you receive in a more efficient manner.
There is also a CacheEventConverter that can be supplied that allows for converting a value to another before raising the event. This can be nice to modularize any code that does value conversions.
The mentioned filters and converters are especially beneficial when used in conjunction with a Cluster Listener. This is because the filtering and conversion is done on the node where the event originated and not on the node where event is listened to. This can provide benefits of not having to replicate events across the cluster (filter) or even have reduced payloads (converter).
4.3.1.3. Initial State Events
When a listener is installed it will only be notified of events after it is fully installed.
It may be desirable to get the current state of the cache contents upon first registration of listener by having an event generated of type @CacheEntryCreated
for each element in the cache. Any additionally generated events during this initial phase will be queued until appropriate events have been raised.
This only works for clustered listeners at this time. ISPN-4608 covers adding this for non clustered listeners.
4.3.1.4. Duplicate Events
It is possible in a non transactional cache to receive duplicate events. This is possible when the primary owner of a key goes down while trying to perform a write operation such as a put.
Data Grid internally will rectify the put operation by sending it to the new primary owner for the given key automatically, however there are no guarantees in regards to if the write was first replicated to backups. Thus more than 1 of the following write events (CacheEntryCreatedEvent
, CacheEntryModifiedEvent
& CacheEntryRemovedEvent
) may be sent on a single operation.
If more than one event is generated Data Grid will mark the event that it was generated by a retried command to help the user to know when this occurs without having to pay attention to view changes.
@Listener public class MyRetryListener { @CacheEntryModified public void entryModified(CacheEntryModifiedEvent event) { if (event.isCommandRetried()) { // Do something } } }
Also when using a CacheEventFilter
or CacheEventConverter
the EventType contains a method isRetry
to tell if the event was generated due to retry.
4.3.2. Cache manager-level notifications
Cache manager-level events occur on a cache manager. These too are global and cluster-wide, but involve events that affect all caches created by a single cache manager. Examples of cache manager-level events are nodes joining or leaving a cluster, or caches starting or stopping.
See the org.infinispan.notifications.cachemanagerlistener.annotation package for a comprehensive list of all cache manager-level notifications, and their respective method-level annotations.
4.3.3. Synchronicity of events
By default, all async notifications are dispatched in the notification thread pool. Sync notifications will delay the operation from continuing until the listener method completes or the CompletionStage completes (the former causing the thread to block). Alternatively, you could annotate your listener as asynchronous in which case the operation will continue immediately, while the notification is completed asynchronously on the notification thread pool. To do this, simply annotate your listener such:
Asynchronous Listener
@Listener (sync = false) public class MyAsyncListener { @CacheEntryCreated void listen(CacheEntryCreatedEvent event) { } }
Blocking Synchronous Listener
@Listener public class MySyncListener { @CacheEntryCreated void listen(CacheEntryCreatedEvent event) { } }
Non-Blocking Listener
@Listener public class MyNonBlockingListener { @CacheEntryCreated CompletionStage<Void> listen(CacheEntryCreatedEvent event) { } }
4.3.3.1. Asynchronous thread pool
To tune the thread pool used to dispatch such asynchronous notifications, use the <listener-executor />
XML element in your configuration file.
4.4. Asynchronous API
In addition to synchronous API methods like Cache.put() , Cache.remove() , etc., Data Grid also has an asynchronous, non-blocking API where you can achieve the same results in a non-blocking fashion.
These methods are named in a similar fashion to their blocking counterparts, with "Async" appended. E.g., Cache.putAsync() , Cache.removeAsync() , etc. These asynchronous counterparts return a CompletableFuture that contains the actual result of the operation.
For example, in a cache parameterized as Cache<String, String>
, Cache.put(String key, String value)
returns String
while Cache.putAsync(String key, String value)
returns CompletableFuture<String>
.
4.4.1. Why use such an API?
Non-blocking APIs are powerful in that they provide all of the guarantees of synchronous communications - with the ability to handle communication failures and exceptions - with the ease of not having to block until a call completes. This allows you to better harness parallelism in your system. For example:
Set<CompletableFuture<?>> futures = new HashSet<>(); futures.add(cache.putAsync(key1, value1)); // does not block futures.add(cache.putAsync(key2, value2)); // does not block futures.add(cache.putAsync(key3, value3)); // does not block // the remote calls for the 3 puts will effectively be executed // in parallel, particularly useful if running in distributed mode // and the 3 keys would typically be pushed to 3 different nodes // in the cluster // check that the puts completed successfully for (CompletableFuture<?> f: futures) f.get();
4.4.2. Which processes actually happen asynchronously?
There are 4 things in Data Grid that can be considered to be on the critical path of a typical write operation. These are, in order of cost:
- network calls
- marshalling
- writing to a cache store (optional)
- locking
Using the async methods will take the network calls and marshalling off the critical path. For various technical reasons, writing to a cache store and acquiring locks, however, still happens in the caller’s thread.