Chapter 3. Migrating Data Grid configuration
Find changes to Data Grid configuration that affect migration to Data Grid 8.
3.1. Data Grid cache configuration
Data Grid 8 provides empty cache containers by default. When you start Data Grid, it instantiates a cache manager so you can create caches at runtime.
However, in comparison with previous versions, there is no "default" cache out of the box.
In Data Grid 8, caches that you create through the CacheContainerAdmin API are permanent to ensure that they survive cluster restarts.
Permanent caches
.administration()
.withFlags(AdminFlag.PERMANENT) 1
.getOrCreateCache("myPermanentCache", "org.infinispan.DIST_SYNC");
- 1 AdminFlag.PERMANENT is enabled by default to ensure that caches survive restarts.
You do not need to set this flag when you create caches. However, you must separately add persistent storage to Data Grid for data to survive restarts, for example:
ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence()
  .addSingleFileStore()
  .location("/tmp/myDataStore")
  .maxEntries(5000);
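As a rough declarative counterpart to the builder calls above, the same file store could be sketched in XML as follows. The cache name reuses myPermanentCache from the earlier example, the path and max-entries values mirror the programmatic example, and the attribute names assume the Data Grid 8.1 configuration schema.

```xml
<distributed-cache name="myPermanentCache">
  <persistence>
    <!-- Illustrative location; choose a directory that survives restarts. -->
    <file-store path="/tmp/myDataStore" max-entries="5000"/>
  </persistence>
</distributed-cache>
```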
Volatile caches
.administration()
  .withFlags(AdminFlag.VOLATILE)
  .getOrCreateCache("myTemporaryCache", "org.infinispan.DIST_SYNC");
Data Grid 8 provides cache templates for server installations that you can use to create caches with recommended settings.
You can get a list of available cache templates as follows:
- Use Tab auto-completion with the CLI:
  [//containers/default]> create cache --template=
- Use the REST API:
  GET 127.0.0.1:11222/rest/v2/cache-managers/default/cache-configs/templates
3.1.1. Cache encoding
When you create remote caches you should configure the MediaType for keys and values. Configuring the MediaType guarantees the storage format for your data.
To encode caches, you specify the MediaType in your configuration. Unless you have other requirements, you should use ProtoStream, which stores your data in a language-neutral, backwards-compatible format.
<encoding media-type="application/x-protostream"/>
Distributed cache configuration with encoding
<infinispan>
  <cache-container>
    <distributed-cache name="myCache" mode="SYNC">
      <encoding media-type="application/x-protostream"/>
      ...
    </distributed-cache>
  </cache-container>
</infinispan>
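If keys and values need different formats, the MediaType can also be declared separately for each. The following is a minimal sketch, assuming the key and value child elements of encoding; the media types shown are only an example.

```xml
<distributed-cache name="myCache" mode="SYNC">
  <encoding>
    <!-- Keys and values are configured independently. -->
    <key media-type="application/x-protostream"/>
    <value media-type="application/x-protostream"/>
  </encoding>
</distributed-cache>
```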
If you do not encode remote caches, Data Grid Server logs the following message:
WARN (main) [org.infinispan.encoding.impl.StorageConfigurationManager] ISPN000599: Configuration for cache 'mycache' does not define the encoding for keys or values. If you use operations that require data conversion or queries, you should configure the cache with a specific MediaType for keys or values.
In a future version, cache encoding will be required for operations where data conversion takes place; for example, cache indexing and searching the data container, remote task execution, reading and writing data in different formats from the Hot Rod and REST endpoints, as well as using remote filters, converters, and listeners.
3.1.2. Cache health status
Data Grid 7.x includes a Health Check API that returns health status of the cluster as well as caches within it.
Data Grid 8 also provides a Health API. For embedded and server installations, you can access the Health API via JMX with the following MBean:
org.infinispan:type=CacheManager,name="default",component=CacheContainerHealth
Data Grid Server also exposes the Health API through the REST endpoint and the Data Grid Console.
7.x | 8.x | Description |
---|---|---|
HEALTHY | HEALTHY | Indicates a cache is operating as expected. |
REBALANCING | HEALTHY_REBALANCING | Indicates a cache is in the rebalancing state but otherwise operating as expected. |
UNHEALTHY | DEGRADED | Indicates a cache is not operating as expected and possibly requires troubleshooting. |
3.1.3. Changes to the Data Grid 8.1 configuration schema
This topic lists changes to the Data Grid configuration schema between 8.0 and 8.1.
New and modified elements and attributes
- stack adds support for inline JGroups stack definitions.
- stack.combine and stack.position attributes let you override and modify JGroups stack definitions.
- metrics lets you configure how Data Grid exports metrics that are compatible with the Eclipse MicroProfile Metrics API.
- context-initializer lets you specify a SerializationContextInitializer implementation that initializes a Protostream-based marshaller for user types.
- key-transformers lets you register transformers that convert custom keys to String for indexing with Lucene.
- statistics now defaults to "false".
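To make a few of these additions concrete, the following sketch combines metrics, context-initializer, and an explicit statistics setting in one cache container. The class name org.example.LibraryInitializerImpl is hypothetical, the gauges and histograms attributes are assumed from the 8.1 schema, and the order of child elements may need adjusting if you validate against the schema.

```xml
<infinispan>
  <!-- statistics defaults to "false" in 8.1, so enable it explicitly when exporting metrics. -->
  <cache-container name="default" statistics="true">
    <serialization>
      <!-- Hypothetical SerializationContextInitializer implementation. -->
      <context-initializer class="org.example.LibraryInitializerImpl"/>
    </serialization>
    <metrics gauges="true" histograms="false"/>
    <distributed-cache name="myCache" statistics="true"/>
  </cache-container>
</infinispan>
```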
Deprecated elements and attributes
The following elements and attributes are now deprecated:
- address-count attribute for the off-heap element.
- protocol attribute for the transaction element.
- duplicate-domains attribute for the jmx element.
- advanced-externalizer
- custom-interceptors
- state-transfer-executor
- transaction-protocol
Removed elements and attributes
The following elements and attributes were deprecated in a previous release and are now removed:
- deadlock-detection-spin
- compatibility
- write-skew
- versioning
- data-container
- eviction
- eviction-thread-policy
3.2. Eviction configuration
Data Grid 8 simplifies eviction configuration in comparison with previous versions. However, eviction configuration has undergone numerous changes across different Data Grid versions, which means migration might not be straightforward.
As of Data Grid 7.2, the memory element replaces the eviction element in the configuration. This section refers to eviction configuration with the memory element only. For information on migrating configuration that uses the eviction element, refer to the Data Grid 7.2 documentation.
3.2.1. Storage types
Data Grid lets you control how to store entries in memory, with the following options:
- Store objects in JVM heap memory.
- Store bytes in native memory (off-heap).
- Store bytes in JVM heap memory.
Changes in Data Grid 8
In 7.x versions and 8.0, you use the object, binary, and off-heap elements to configure the storage type.
Starting with Data Grid 8.1, you use a storage attribute to store objects in JVM heap memory or as bytes in off-heap memory.
To store bytes in JVM heap memory, you use the encoding element to specify a binary storage format for your data.
Data Grid 7.x | Data Grid 8 |
---|---|
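<memory> <object /> </memory> | <memory /> |
<memory> <off-heap /> </memory> | <memory storage="OFF_HEAP" /> |
<memory> <binary /> </memory> | <encoding media-type="..." /> |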
Object storage in Data Grid 8
By default, Data Grid 8.1 uses object storage (JVM heap):
<distributed-cache> <memory /> </distributed-cache>
You can also configure storage="HEAP" explicitly to store data as objects in JVM heap memory:
<distributed-cache> <memory storage="HEAP" /> </distributed-cache>
Off-heap storage in Data Grid 8
Set "OFF_HEAP" as the value of the storage attribute to store data as bytes in native memory:
<distributed-cache> <memory storage="OFF_HEAP" /> </distributed-cache>
Off-heap address count
In previous versions, the address-count attribute for the off-heap element let you specify the number of pointers that are available in the hash map to avoid collisions. With Data Grid 8.1, address-count is no longer used and off-heap memory is dynamically resized to avoid collisions.
Binary storage in Data Grid 8
Specify a binary storage format for cache entries with the encoding element:
<distributed-cache>
  <!--Configure MediaType for entries with binary formats.-->
  <encoding media-type="application/x-protostream"/>
  <memory ... />
</distributed-cache>
As a result of this change, Data Grid no longer stores primitives and String mixed with byte[], but stores only byte[].
3.2.2. Eviction threshold
Eviction lets Data Grid control the size of the data container by removing entries when the container becomes larger than a configured threshold.
In Data Grid 7.x and 8.0, you specify one of two eviction types that define the maximum limit for entries in the cache:
- COUNT measures the number of entries in the cache.
- MEMORY measures the amount of memory that all entries in the cache take up.
Depending on the configuration you set, when either the count or the total amount of memory exceeds the maximum, Data Grid removes unused entries.
Data Grid 7.x and 8.0 also use the size attribute that defines the size of the data container as a long. Depending on the storage type you configure, eviction occurs either when the number of entries or amount of memory exceeds the value of the size attribute.
With Data Grid 8.1, the size attribute is deprecated along with COUNT and MEMORY. Instead, you configure the maximum size of the data container in one of two ways:
- Total number of entries with the max-count attribute.
- Maximum amount of memory, in bytes, with the max-size attribute.
Eviction based on total number of entries
<distributed-cache> <memory max-count="..." /> </distributed-cache>
Eviction based on maximum amount of memory
<distributed-cache> <memory max-size="..." /> </distributed-cache>
3.2.3. Eviction strategies
Eviction strategies control how Data Grid performs eviction.
Data Grid 7.x and 8.0 let you set one of the following eviction strategies with the strategy attribute:
Strategy | Description |
---|---|
NONE | Data Grid does not evict entries. This is the default setting unless you configure eviction. |
REMOVE | Data Grid removes entries from memory so that the cache does not exceed the configured size. This is the default setting when you configure eviction. |
MANUAL | Data Grid does not perform eviction. Eviction takes place manually by invoking the evict() method from the Cache API. |
EXCEPTION | Data Grid does not write new entries to the cache if doing so would exceed the configured size. Instead of writing new entries to the cache, Data Grid throws a ContainerFullException. |
With Data Grid 8.1, you can use the same strategies as in previous versions. However, the strategy attribute is replaced with the when-full attribute.
<distributed-cache> <memory when-full="<eviction_strategy>" /> </distributed-cache>
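Putting the storage type, threshold, and strategy together, an 8.1 memory element might look like the following sketch. The cache name and the limit of 10000 entries are illustrative. As far as we are aware, max-count and max-size are alternatives, so only one of them is set on a given cache.

```xml
<distributed-cache name="myCache">
  <!-- Store objects in JVM heap, cap the cache at 10000 entries,
       and remove entries when the limit is reached. -->
  <memory storage="HEAP" max-count="10000" when-full="REMOVE"/>
</distributed-cache>
```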
Eviction algorithms
With Data Grid 7.2, the ability to configure eviction algorithms was deprecated along with the Low Inter-Reference Recency Set (LIRS).
From version 7.2 onwards, Data Grid includes the Caffeine caching library that implements a variation of the Least Frequently Used (LFU) cache replacement algorithm known as TinyLFU. For off-heap storage, Data Grid uses a custom implementation of the Least Recently Used (LRU) algorithm.
3.2.4. Eviction configuration comparison
Compare eviction configuration between different Data Grid versions.
Object storage and evict on number of entries
7.2 to 8.0
<memory> <object size="1000000" eviction="COUNT" strategy="REMOVE"/> </memory>
8.1
<memory max-count="1000000" when-full="REMOVE"/>
Object storage and evict on amount of memory
7.2 to 8.0
<memory> <object size="1000000" eviction="MEMORY" strategy="MANUAL"/> </memory>
8.1
<memory max-size="1MB" when-full="MANUAL"/>
Binary storage and evict on number of entries
7.2 to 8.0
<memory> <binary size="500000000" eviction="COUNT" strategy="MANUAL"/> </memory>
8.1
<cache> <encoding media-type="application/x-protostream"/> <memory max-count="500000000" when-full="MANUAL"/> </cache>
Binary storage and evict on amount of memory
7.2 to 8.0
<memory> <binary size="500000000" eviction="MEMORY" strategy="EXCEPTION"/> </memory>
8.1
<cache> <encoding media-type="application/x-protostream"/> <memory max-size="500 MB" when-full="EXCEPTION"/> </cache>
Off-heap storage and evict on number of entries
7.2 to 8.0
<memory> <off-heap size="10000000" eviction="COUNT"/> </memory>
8.1
<memory storage="OFF_HEAP" max-count="10000000"/>
Off-heap storage and evict on amount of memory
7.2 to 8.0
<memory> <off-heap size="1000000000" eviction="MEMORY"/> </memory>
8.1
<memory storage="OFF_HEAP" max-size="1GB"/>
3.3. Expiration configuration
Expiration removes entries from caches based on their lifespan or maximum idle time.
When migrating your configuration from Data Grid 7.x to 8, there are no changes that you need to make for expiration. The configuration remains the same:
Lifespan expiration
<expiration lifespan="1000" />
Max-idle expiration
<expiration max-idle="1000" interval="120000" />
For Data Grid 7.2 and earlier, using max-idle with clustered caches had technical limitations that resulted in performance degradation.
As of Data Grid 7.3, Data Grid sends touch commands to all owners in clustered caches when clients read entries that have max-idle expiration values. This ensures that the entries have the same relative access time across the cluster.
Data Grid 8 sends the same touch commands for max-idle expiration across clusters. However, there are some technical considerations you should take into account before you start using max-idle. Refer to Configuring Data Grid caches to read more about how expiration works and to review how the touch commands affect performance with clustered caches.
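For reference, lifespan and max-idle can be combined on a single expiration element. In this sketch the cache name and values are illustrative; all three attributes take milliseconds, and interval controls how often expired entries are purged.

```xml
<distributed-cache name="myCache">
  <!-- Entries expire 5 seconds after creation, or after 1 second without access. -->
  <expiration lifespan="5000" max-idle="1000" interval="120000"/>
</distributed-cache>
```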
3.4. Persistent cache stores
In comparison with Data Grid 7.x, there are some changes to cache store configuration in Data Grid 8.
Persistence SPI
Data Grid 8.1 introduces the NonBlockingStore interface for cache stores. The NonBlockingStore SPI exposes methods that must never block the invoking thread.
Cache stores that connect Data Grid to persistent data sources implement the NonBlockingStore interface.
For custom cache store implementations that use blocking operations, Data Grid provides a BlockingManager utility class to handle those operations.
The introduction of the NonBlockingStore interface deprecates the following interfaces:
- CacheLoader
- CacheWriter
- AdvancedCacheLoader
- AdvancedCacheWriter
Custom cache stores
Data Grid 8 lets you configure custom cache stores with the store element as in previous versions.
The following changes apply:
- The singleton attribute is removed. Use shared=true instead.
- The segmented attribute is added and defaults to true.
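A migrated custom store declaration might look like the following sketch. The class name org.example.MyCustomStore and the property are hypothetical. Setting segmented="false" is shown on the assumption that the custom implementation does not support segmentation; omit it to keep the new default.

```xml
<persistence>
  <!-- shared="true" replaces the removed singleton attribute. -->
  <store class="org.example.MyCustomStore" shared="true" segmented="false">
    <!-- Hypothetical property passed to the store implementation. -->
    <property name="myProperty">myValue</property>
  </store>
</persistence>
```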
Segmented cache stores
As of Data Grid 8, cache store configuration defaults to segmented="true" and applies to the following cache store elements:
- store
- file-store
- string-keyed-jdbc-store
- jpa-store
- remote-store
- rocksdb-store
- soft-index-file-store
Single file cache stores
The relative-to attribute for Single File cache stores is removed in Data Grid 8. If your cache store configuration includes this attribute, Data Grid ignores it and uses only the path attribute to configure store location.
JDBC cache stores
JDBC cache stores must include an xmlns namespace declaration, which was not required in some Data Grid 7.x versions.
<persistence>
  <string-keyed-jdbc-store xmlns="urn:infinispan:config:store:jdbc:11.0" shared="true">
    ...
  </string-keyed-jdbc-store>
</persistence>
JDBC connection factories
Data Grid 7.x JDBC cache stores can use the following ConnectionFactory implementations to obtain a database connection:
- ManagedConnectionFactory
- SimpleConnectionFactory
- PooledConnectionFactory
Data Grid 8 now uses connection factories based on Agroal, the same connection pool that Red Hat JBoss EAP uses, to connect to databases. It is no longer possible to use c3p0.properties and hikari.properties files.
Segmentation
JDBC String-Based cache store configuration that enables segmentation, which is now the default, must include the segmentColumnName and segmentColumnType parameters, as in the following programmatic examples:
MySQL Example
builder.table()
  .tableNamePrefix("ISPN")
  .idColumnName("ID_COLUMN").idColumnType("VARCHAR(255)")
  .dataColumnName("DATA_COLUMN").dataColumnType("VARBINARY(1000)")
  .timestampColumnName("TIMESTAMP_COLUMN").timestampColumnType("BIGINT")
  .segmentColumnName("SEGMENT_COLUMN").segmentColumnType("INTEGER");
PostgreSQL Example
builder.table()
  .tableNamePrefix("ISPN")
  .idColumnName("ID_COLUMN").idColumnType("VARCHAR(255)")
  .dataColumnName("DATA_COLUMN").dataColumnType("BYTEA")
  .timestampColumnName("TIMESTAMP_COLUMN").timestampColumnType("BIGINT")
  .segmentColumnName("SEGMENT_COLUMN").segmentColumnType("INTEGER");
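The declarative form of the PostgreSQL example can be sketched as follows. The connection URL, credentials, and driver are placeholders, and the element and attribute names assume the JDBC store schema for the 11.0 namespace shown earlier.

```xml
<persistence>
  <string-keyed-jdbc-store xmlns="urn:infinispan:config:store:jdbc:11.0" shared="true">
    <!-- Placeholder connection settings; replace with your own database details. -->
    <connection-pool connection-url="jdbc:postgresql://localhost:5432/example"
                     username="dbuser"
                     password="changeme"
                     driver="org.postgresql.Driver"/>
    <string-keyed-table prefix="ISPN">
      <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
      <data-column name="DATA_COLUMN" type="BYTEA"/>
      <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
      <!-- Required when segmentation is enabled, which is the default. -->
      <segment-column name="SEGMENT_COLUMN" type="INTEGER"/>
    </string-keyed-table>
  </string-keyed-jdbc-store>
</persistence>
```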
Write-behind
The thread-pool-size attribute for Write-Behind mode is removed in Data Grid 8.
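A write-behind configuration without thread-pool-size might then look like this sketch. The store path and queue size are illustrative, and the modification-queue-size and fail-silently attributes are assumed from the 8.1 schema.

```xml
<persistence>
  <file-store path="/tmp/myDataStore">
    <!-- No thread-pool-size attribute; the queue size bounds pending modifications. -->
    <write-behind modification-queue-size="2048" fail-silently="true"/>
  </file-store>
</persistence>
```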
Removed cache stores and loaders
Data Grid 7.3 deprecates the following cache stores and loaders that are no longer available in Data Grid 8:
- Cassandra Cache Store
- REST Cache Store
- LevelDB Cache Store
- CLI Cache Loader
Cache store migrator
Cache stores in previous versions of Data Grid store data in a binary format that is not compatible with Data Grid 8.
Use the StoreMigrator utility to migrate data in persistent cache stores to Data Grid 8.
3.5. Data Grid cluster transport
Data Grid uses JGroups technology to handle communication between clustered nodes.
JGroups stack configuration elements and attributes have not significantly changed from previous Data Grid versions.
As in previous versions, Data Grid provides preconfigured JGroups stacks that you can use as a starting point for building custom cluster transport configuration optimized for your network requirements. Likewise, Data Grid provides the ability to add JGroups stacks defined in external XML files to your infinispan.xml.
Data Grid 8 has brought usability improvements to make cluster transport configuration easier:
- Inline stacks let you configure JGroups stacks directly within infinispan.xml using the jgroups element.
- Declare JGroups schemas within infinispan.xml.
- Preconfigured JGroups stacks for UDP and TCP protocols.
- Inheritance attributes that let you extend JGroups stacks to adjust specific protocols and properties.
<infinispan
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="urn:infinispan:config:11.0 https://infinispan.org/schemas/infinispan-config-11.0.xsd
                          urn:infinispan:server:11.0 https://infinispan.org/schemas/infinispan-server-11.0.xsd
                          urn:org:jgroups http://www.jgroups.org/schema/jgroups-4.2.xsd"
      xmlns="urn:infinispan:config:11.0"
      xmlns:server="urn:infinispan:server:11.0">
   <jgroups>
      <stack name="xsite" extends="udp">
         <relay.RELAY2 site="LON" xmlns="urn:org:jgroups"/>
         <remote-sites default-stack="tcp">
            <remote-site name="LON"/>
            <remote-site name="NYC"/>
         </remote-sites>
      </stack>
   </jgroups>
   <cache-container ...>
   ...
</infinispan>
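The inheritance attributes, stack.combine and stack.position, can be illustrated with a stack that extends the preconfigured tcp stack. This is a sketch only: the stack name my-stack, the host addresses, and the timeout are hypothetical, and the protocols being replaced or removed are assumed to exist in the inherited stack.

```xml
<infinispan>
  <jgroups>
    <!-- Inherit every protocol from the preconfigured "tcp" stack. -->
    <stack name="my-stack" extends="tcp">
      <!-- Replace the MPING discovery protocol with TCPPING. -->
      <TCPPING initial_hosts="192.0.2.10[7800],192.0.2.11[7800]"
               stack.combine="REPLACE" stack.position="MPING"/>
      <!-- Remove a protocol from the inherited stack. -->
      <FD_SOCK stack.combine="REMOVE"/>
      <!-- With no stack.combine attribute, override a property of an inherited protocol. -->
      <VERIFY_SUSPECT timeout="2000"/>
    </stack>
  </jgroups>
  <cache-container>
    <transport stack="my-stack"/>
  </cache-container>
</infinispan>
```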
3.5.1. Transport security
As in previous versions, Data Grid 8 uses the JGroups SYM_ENCRYPT and ASYM_ENCRYPT protocols to encrypt cluster communication.
Node authentication
In Data Grid 7.x, the JGroups SASL protocol enables nodes to authenticate against security realms in both embedded and remote server installations.
As of Data Grid 8, it is not possible to configure node authentication against security realms. Likewise, Data Grid 8 does not recommend using the JGroups AUTH protocol for authenticating clustered nodes.
However, with embedded Data Grid installations, JGroups cluster transport includes a SASL configuration as part of the jgroups element. As in previous versions, the SASL configuration relies on JAAS notions, such as CallbackHandlers, to obtain certain information necessary for node authentication.
3.6. Data Grid authorization
Data Grid uses role-based access control (RBAC) to restrict access to data, and cluster encryption to secure communication between nodes.
Cache manager authorization
<infinispan>
  <cache-container default-cache="secured" name="secured">
    <security>
      <authorization>
        <identity-role-mapper />
        <role name="admin" permissions="ALL" />
        <role name="reader" permissions="READ" />
        <role name="writer" permissions="WRITE" />
        <role name="supervisor" permissions="READ WRITE EXEC"/>
      </authorization>
    </security>
  </cache-container>
</infinispan>
Implicit cache authorization
Data Grid 8 improves usability by allowing caches to inherit authorization configuration from the cache-container so you do not need to explicitly configure roles and permissions for each cache.
<local-cache name="secured">
<security>
<authorization/> 1
</security>
</local-cache>
- 1 Uses roles and permissions defined in the cache container.
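If a cache should be limited to a subset of the container's roles, the roles can still be listed explicitly. The following sketch assumes the admin and supervisor roles from the cache manager example and the roles attribute of the authorization element.

```xml
<local-cache name="secured">
  <security>
    <!-- Only the listed roles, defined in the cache container, can access this cache. -->
    <authorization roles="admin supervisor"/>
  </security>
</local-cache>
```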