Chapter 6. Configuring persistent storage
Data Grid uses cache stores and loaders to interact with persistent storage.
- Durability
- Adding cache stores allows you to persist data to non-volatile storage so it survives restarts.
- Write-through caching
- Configuring Data Grid as a caching layer in front of persistent storage simplifies data access for applications because Data Grid handles all interactions with the external storage.
- Data overflow
- Using eviction and passivation techniques ensures that Data Grid keeps only frequently used data in-memory and writes older entries to persistent storage.
6.1. Segmented cache stores
Cache stores can organize data into hash space segments to which keys map. Stores are segmented by default.
Segmented stores increase read performance for bulk operations; for example, streaming over data (Cache.size, Cache.entrySet.stream), pre-loading the cache, and doing state transfer operations.
If you change the numSegments parameter in the configuration after you add a segmented cache store, Data Grid cannot read data from that cache store.
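As a hedged sketch (the cache name and segment count are illustrative), the number of segments is set on the cache, and stores are segmented by default:

```xml
<distributed-cache name="mycache" segments="256">
    <persistence>
        <!-- segmented="true" is the default; keys map to the cache's 256 hash-space segments -->
        <file-store segmented="true"/>
    </persistence>
</distributed-cache>
```

Because a segmented store lays data out per segment, changing the segment count after data is written renders the store unreadable.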
6.3. Transactions with persistent cache stores
Data Grid supports transactional operations with JDBC-based cache stores only. To configure caches as transactional, you set transactional=true to keep data in persistent storage synchronized with data in memory.
For all other cache stores, Data Grid does not enlist cache loaders in transactional operations. This can result in data inconsistency if transactions succeed in modifying data in memory but do not completely apply changes to data in the cache store. In these cases manual recovery is not possible with cache stores.
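As a sketch of a transactional JDBC-based store (the JDBC store schema namespace is omitted for brevity; the data source, table prefix, and column definitions are illustrative and must match your environment):

```xml
<local-cache name="transactional-cache">
    <!-- Make the cache transactional -->
    <transaction mode="NON_XA"/>
    <persistence>
        <!-- transactional="true" enlists the store in cache transactions,
             keeping persistent storage synchronized with data in memory -->
        <string-keyed-jdbc-store transactional="true">
            <data-source jndi-url="jdbc/datasource"/>
            <string-keyed-table prefix="ISPN">
                <id-column name="ID" type="VARCHAR(255)"/>
                <data-column name="DATA" type="BYTEA"/>
                <timestamp-column name="TS" type="BIGINT"/>
            </string-keyed-table>
        </string-keyed-jdbc-store>
    </persistence>
</local-cache>
```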
6.4. Write-through cache stores
Write-through is a cache writing mode where writes to memory and writes to cache stores are synchronous. When a client application updates a cache entry, in most cases by invoking Cache.put(), Data Grid does not return the call until it updates the cache store. This cache writing mode results in updates to the cache store concluding within the boundaries of the client thread.
The primary advantage of write-through mode is that the cache and cache store are updated simultaneously, which ensures that the cache store is always consistent with the cache.
However, write-through mode can potentially decrease performance because the need to access and update cache stores directly adds latency to cache operations.
Write-through configuration
Data Grid uses write-through mode unless you explicitly add write-behind configuration to your caches. There is no separate element or method for configuring write-through mode.
For example, the following configuration adds a file-based store to the cache that implicitly uses write-through mode:
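A minimal sketch of such a configuration (the cache name is illustrative):

```xml
<distributed-cache name="mycache">
    <persistence passivation="false">
        <!-- No write-behind element, so the store implicitly uses write-through mode -->
        <file-store/>
    </persistence>
</distributed-cache>
```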
6.5. Write-behind cache stores
Write-behind is a cache writing mode where writes to memory are synchronous and writes to cache stores are asynchronous.
When clients send write requests, Data Grid adds those operations to a modification queue. Data Grid processes operations as they join the queue so that the calling thread is not blocked and the operation completes immediately.
If the number of write operations exceeds the size of the modification queue, Data Grid still adds the additional operations to the queue. However, those operations do not complete until Data Grid processes operations that are already in the queue.
For example, calling Cache.putAsync returns immediately and the Stage also completes immediately if the modification queue is not full. If the modification queue is full, or if Data Grid is currently processing a batch of write operations, then Cache.putAsync returns immediately and the Stage completes later.
Write-behind mode provides a performance advantage over write-through mode because cache operations do not need to wait for updates to the underlying cache store to complete. However, data in the cache store remains inconsistent with data in the cache until the modification queue is processed. For this reason, write-behind mode is suitable for cache stores with low latency, such as unshared and local file-based cache stores, where the time between the write to the cache and the write to the cache store is as small as possible.
Write-behind configuration
ConfigurationBuilder
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.persistence()
.async()
.modificationQueueSize(2048)
.failSilently(true);
Failing silently
Write-behind configuration includes a fail-silently parameter that controls what happens when either the cache store is unavailable or the modification queue is full.

- If fail-silently="true", Data Grid logs WARN messages and rejects write operations.
- If fail-silently="false", Data Grid throws exceptions if it detects that the cache store is unavailable during a write operation. Likewise, if the modification queue becomes full, Data Grid throws an exception.

In some cases, data loss can occur if Data Grid restarts while write operations exist in the modification queue. For example, the cache store goes offline but, during the time it takes to detect that the cache store is unavailable, write operations are added to the modification queue because it is not full. If Data Grid restarts or otherwise becomes unavailable before the cache store comes back online, the write operations in the modification queue are lost because they were not persisted.
6.6. Passivation
Passivation configures Data Grid to write entries to cache stores when it evicts those entries from memory. In this way, passivation prevents unnecessary and potentially expensive writes to persistent storage.
Activation is the process of restoring entries to memory from the cache store when there is an attempt to access passivated entries. For this reason, when you enable passivation, you must configure a cache store that implements the NonBlockingStore interface. The store’s characteristics() method must also indicate that it supports both read and write operations.
When Data Grid evicts an entry from the cache, it notifies cache listeners that the entry is passivated then stores the entry in the cache store. When Data Grid gets an access request for an evicted entry, it lazily loads the entry from the cache store into memory and then notifies cache listeners that the entry is activated while keeping the value still in the store.
- Passivation uses the first cache loader in the Data Grid configuration and ignores all others.
Passivation is not supported with:
- Transactional stores. Passivation writes and removes entries from the store outside the scope of the actual Data Grid commit boundaries.
- Shared stores. Shared cache stores require entries to always exist in the store for other owners. For this reason, passivation is not supported because entries cannot be removed.
If you enable passivation with transactional stores or shared stores, Data Grid throws an exception.
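As a sketch, passivation pairs a non-shared store with eviction (the cache name and eviction threshold are illustrative):

```xml
<distributed-cache name="mycache">
    <!-- Evict entries from memory once the cache exceeds 1000 entries -->
    <memory max-count="1000"/>
    <persistence passivation="true">
        <!-- Non-shared store; purge="true" clears potentially stale entries on startup -->
        <file-store shared="false" purge="true"/>
    </persistence>
</distributed-cache>
```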
6.6.1. How passivation works
Passivation disabled
Writes to data in memory result in writes to persistent storage.
If Data Grid evicts data from memory, then data in persistent storage includes entries that are evicted from memory. In this way persistent storage is a superset of the in-memory cache. This configuration is recommended when you require the highest consistency, because the store can be read again after a crash.
If you do not configure eviction, then data in persistent storage provides a copy of data in memory.
Passivation enabled
Data Grid adds data to persistent storage only when it evicts data from memory, when an entry is removed, or upon node shutdown.

When Data Grid activates entries, it restores data in memory but keeps the data in the store. This allows writes to be as fast as without a store while still maintaining consistency. When an entry is created or updated, only the in-memory copy is updated, so the store is outdated for the time being.

Passivation is not supported when a store is also configured as shared, because entries can become out of sync between nodes depending on when a write is evicted versus read.

To guarantee data consistency, any store that is not shared should always have purgeOnStartup enabled. This is true whether passivation is enabled or disabled, since a store could hold an outdated entry while the node is down and resurrect it at a later point.
The following table shows data in memory and in persistent storage after a series of operations:
| Operation | Passivation disabled | Passivation enabled |
|---|---|---|
| Insert k1. | Memory: k1; Disk: k1 | Memory: k1; Disk: - |
| Insert k2. | Memory: k1, k2; Disk: k1, k2 | Memory: k1, k2; Disk: - |
| Eviction thread runs and evicts k1. | Memory: k2; Disk: k1, k2 | Memory: k2; Disk: k1 |
| Read k1. | Memory: k1, k2; Disk: k1, k2 | Memory: k1, k2; Disk: k1 |
| Eviction thread runs and evicts k2. | Memory: k1; Disk: k1, k2 | Memory: k1; Disk: k1, k2 |
| Remove k2. | Memory: k1; Disk: k1 | Memory: k1; Disk: k1 |
6.7. Global persistent location
Data Grid preserves global state so that it can restore cluster topology and cached data after restart.
Data Grid uses file locking to prevent concurrent access to the global persistent location. The lock is acquired on startup and released on clean node shutdown. A dangling lock file indicates that the node was not shut down cleanly, because of either a crash or external termination. In the default configuration, Data Grid refuses to start up to avoid data corruption, and logs the following message:
ISPN000693: Dangling lock file '%s' in persistent global state, probably left behind by an unclean shutdown
The behavior can be changed by configuring the global state unclean-shutdown-action setting to one of the following:

- FAIL: Prevents startup of the cache manager if a dangling lock file is found in the persistent global state. This is the default behavior.
- PURGE: Clears the persistent global state if a dangling lock file is found in the persistent global state.
- IGNORE: Ignores the presence of a dangling lock file in the persistent global state.
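For example, a hedged sketch that purges global state instead of failing on startup (the path is illustrative):

```xml
<infinispan>
    <cache-container>
        <!-- PURGE clears the persistent global state when a dangling lock file is found -->
        <global-state unclean-shutdown-action="PURGE">
            <persistent-location path="global/state"/>
        </global-state>
    </cache-container>
</infinispan>
```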
Remote caches
Data Grid Server saves cluster state to the $RHDG_HOME/server/data directory.
You should never delete or modify the server/data directory or its content. Data Grid restores cluster state from this directory when you restart your server instances.
Changing the default configuration or directly modifying the server/data directory can cause unexpected behavior and lead to data loss.
Embedded caches
Data Grid defaults to the user.dir system property as the global persistent location. In most cases this is the directory where your application starts.
For clustered embedded caches, such as replicated or distributed, you should always enable and configure a global persistent location to restore cluster topology.
When using a file-based cache store, you should always configure a global persistent location. You should never configure an absolute path for a file-based cache store that is outside this location; if you do, Data Grid writes the following exception to logs. For more details, see Configuring the global persistent location.
ISPN000558: "The store location 'foo' is not a child of the global persistent location 'bar'"
6.7.1. Configuring the global persistent location
Enable and configure the location where Data Grid stores global state for clustered embedded caches.
Data Grid Server enables global persistence and configures a default location. You should not disable global persistence or change the default configuration for remote caches.
Prerequisites
- Add Data Grid to your project.
Procedure
1. Enable global state in one of the following ways:
   - Add the global-state element to your Data Grid configuration.
   - Call the globalState().enable() methods in the GlobalConfigurationBuilder API.
2. Define whether the global persistent location is unique to each node or shared between the cluster.

   | Location type | Configuration |
   |---|---|
   | Unique to each node | persistent-location element or persistentLocation() method |
   | Shared between the cluster | shared-persistent-location element or sharedPersistentLocation(String) method |

3. Set the path where Data Grid stores cluster state.
For example, for file-based cache stores the path is a directory on the host filesystem.
Values can be:
- Absolute and contain the full location including the root.
- Relative to a root location.
If you specify a relative value for the path, you must also specify a system property that resolves to a root location.
For example, on a Linux host system you set global/state as the path. You also set the my.data property that resolves to the /opt/data root location. In this case Data Grid uses /opt/data/global/state as the global persistent location.
Global persistent location configuration
YAML
cacheContainer:
  globalState:
    persistentLocation:
      path: "global/state"
      relativeTo: "my.data"
GlobalConfigurationBuilder
new GlobalConfigurationBuilder().globalState()
.enable()
.persistentLocation("global/state", "my.data");
6.8. File-based cache stores
File-based cache stores provide persistent storage on the local host filesystem where Data Grid is running. For clustered caches, file-based cache stores are unique to each Data Grid node.
Never use filesystem-based cache stores on shared file systems, such as an NFS or Samba share, because they do not provide file locking capabilities and data corruption can occur.
Additionally if you attempt to use transactional caches with shared file systems, unrecoverable failures can happen when writing to files during the commit phase.
Soft-Index File Stores
SoftIndexFileStore is the default implementation for file-based cache stores and stores data in a set of append-only files.
When append-only files:
- Reach their maximum size, Data Grid creates a new file and starts writing to it.
- Reach the compaction threshold of less than 50% usage, Data Grid overwrites the entries to a new file and then deletes the old file.
When you use SoftIndexFileStore in a clustered cache, you should enable purge on startup to ensure that stale entries are not resurrected.
B+ trees
To improve performance, append-only files in a SoftIndexFileStore are indexed using a B+ Tree that can be stored both on disk and in memory. The in-memory index uses Java soft references to ensure it can be rebuilt if removed by Garbage Collection (GC) then requested again.
Because SoftIndexFileStore uses Java soft references to keep indexes in memory, it helps prevent out-of-memory exceptions. GC removes indexes before they consume too much memory while still falling back to disk.
SoftIndexFileStore creates a B+ tree per configured cache segment. Each per-segment tree holds only a fraction of the entries, and the separate trees allow index updates to proceed in parallel.
Each entry in the B+ tree is a node. By default, the size of each node is limited to 4096 bytes. SoftIndexFileStore throws an exception if a key is longer than this limit after serialization.
File limits
SoftIndexFileStore uses up to two more files than the configured openFilesLimit at any given time. The two additional file pointers are reserved for the log appender, which writes newly updated data, and for the compactor, which writes compacted entries into a new file.

The number of open files allocated for indexing is one tenth of the configured openFilesLimit, with a minimum of 1 or the number of cache segments. Any number remaining from the configured limit is allocated for the open data files themselves.
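The limit itself is set on the store; a hedged sketch (the value is illustrative):

```xml
<persistence>
    <!-- At most 1000 open data and index files, plus the two reserved file pointers -->
    <file-store open-files-limit="1000"/>
</persistence>
```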
Segmentation
Soft-index file stores are segmented by default. The append logs are not directly segmented; segmentation is handled by the index. It is possible to disable segmentation, which effectively changes the store to keep a single index for all contents of the cache, as if the number of cache segments were set to 1.

Note that this store can be segmented even when the cache is not clustered. In that case, the clustering segments configuration value is still used to determine how many segments the store uses.
Expiration
The SoftIndexFileStore has full support for expired entries and their requirements.
6.8.1. Configuring file-based cache stores
Add file-based cache stores to Data Grid to persist data on the host filesystem.
Prerequisites
- Enable global state and configure a global persistent location if you are configuring embedded caches.
Procedure
1. Add the persistence element to your cache configuration.
2. Optionally specify true as the value for the passivation attribute to write to the file-based cache store only when data is evicted from memory.
3. Include the file-store element and configure attributes as appropriate.
4. Specify false as the value for the shared attribute.

   File-based cache stores should always be unique to each Data Grid instance. If you want to use the same persistent data across a cluster, configure shared storage such as a JDBC string-based cache store.

5. Configure the index and data elements to specify the location where Data Grid creates indexes and stores data.
6. Include the write-behind element if you want to configure the cache store with write-behind mode.
File-based cache store configuration
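A hedged XML sketch that follows the steps above (the cache name, paths, and queue size are illustrative):

```xml
<distributed-cache name="mycache">
    <persistence passivation="true">
        <!-- shared="false": the store is unique to each Data Grid instance -->
        <file-store shared="false">
            <data path="data"/>
            <index path="index"/>
            <!-- Optional: enable write-behind mode -->
            <write-behind modification-queue-size="2048"/>
        </file-store>
    </persistence>
</distributed-cache>
```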