Chapter 9. Monitoring Data Grid Servers
9.1. Working with Data Grid Server Logs
Data Grid uses Apache Log4j 2 to provide configurable logging mechanisms that capture details about the environment and record cache operations for troubleshooting purposes and root cause analysis.
9.1.1. Data Grid Log Files
Data Grid writes log messages to the following directory:
$RHDG_HOME/${infinispan.server.root}/log
server.log
- Messages in human-readable format, including boot logs that relate to the server startup. Data Grid creates this file by default when you launch servers.
server.log.json
- Messages in JSON format that let you parse and analyze Data Grid logs. Data Grid creates this file when you enable the JSON-FILE appender.
9.1.2. Configuring Data Grid Log Properties
You configure Data Grid logs with log4j2.xml, which is described in the Log4j 2 manual.
Procedure
- Open $RHDG_HOME/${infinispan.server.root}/conf/log4j2.xml with any text editor.
- Change logging configuration as appropriate.
- Save and close log4j2.xml.
9.1.2.1. Log Levels
Log levels indicate the nature and severity of messages.
Log level | Description |
---|---|
TRACE | Fine-grained debug messages, capturing the flow of individual requests through the application. |
DEBUG | Messages for general debugging, not related to an individual request. |
INFO | Messages about the overall progress of applications, including lifecycle events. |
WARN | Events that can lead to errors or degrade performance. |
ERROR | Error conditions that might prevent operations or activities from being successful but do not prevent applications from running. |
FATAL | Events that could cause critical service failure and application shutdown. |
In addition to the levels of individual messages presented above, the configuration allows two more values: ALL to include all messages, and OFF to exclude all messages.
9.1.2.2. Data Grid Log Categories
Data Grid provides categories for INFO, WARN, ERROR, and FATAL level messages that organize logs by functional area.
org.infinispan.CLUSTER
- Messages specific to Data Grid clustering that include state transfer operations, rebalancing events, partitioning, and so on.
org.infinispan.CONFIG
- Messages specific to Data Grid configuration.
org.infinispan.CONTAINER
- Messages specific to the data container that include expiration and eviction operations, cache listener notifications, transactions, and so on.
org.infinispan.PERSISTENCE
- Messages specific to cache loaders and stores.
org.infinispan.SECURITY
- Messages specific to Data Grid security.
org.infinispan.SERVER
- Messages specific to Data Grid servers.
org.infinispan.XSITE
- Messages specific to cross-site replication operations.
9.1.2.3. Log Appenders
Log appenders define how Data Grid records log messages.
CONSOLE
- Writes log messages to the host standard out (stdout) or standard error (stderr) stream. Uses the org.apache.logging.log4j.core.appender.ConsoleAppender class by default.
FILE
- Writes log messages to a file. Uses the org.apache.logging.log4j.core.appender.RollingFileAppender class by default.
JSON-FILE
- Writes log messages to a file in JSON format. Uses the org.apache.logging.log4j.core.appender.RollingFileAppender class by default.
9.1.2.4. Log Patterns
The CONSOLE and FILE appenders use a PatternLayout to format the log messages according to a pattern.
An example is the default pattern in the FILE appender:
%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p (%t) [%c{1}] %m%throwable%n
- %d{yyyy-MM-dd HH:mm:ss,SSS} adds the current time and date.
- %-5p adds the log level, left-aligned and padded on the right to five characters.
- %t adds the name of the current thread.
- %c{1} adds the short name of the logging category.
- %m adds the log message.
- %throwable adds the exception stack trace.
- %n adds a new line.
Patterns are fully described in the PatternLayout documentation.
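As a rough illustration of how the default FILE pattern maps onto the text of a log line, the following Java sketch splits one line back into its fields with a regular expression. The class name `ServerLogLine` and the sample line are hypothetical and not taken from an actual server.log; the expression assumes the default pattern is unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Data Grid.
public class ServerLogLine {
    // Mirrors the default FILE pattern:
    // %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p (%t) [%c{1}] %m
    private static final Pattern LINE = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) " // %d{...}
        + "(\\S+)\\s+"                                          // %-5p plus its padding
        + "\\((.*?)\\) "                                        // (%t)
        + "\\[([^\\]]+)\\] "                                    // [%c{1}]
        + "(.*)$");                                             // %m

    // Returns the fields of one log line, or empty if the line does not
    // match (for example, a stack trace line written by %throwable).
    public static Optional<Map<String, String>> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timestamp", m.group(1));
        fields.put("level", m.group(2));
        fields.put("thread", m.group(3));
        fields.put("category", m.group(4));
        fields.put("message", m.group(5));
        return Optional.of(fields);
    }

    public static void main(String[] args) {
        // Invented sample line for illustration only.
        String sample = "2024-01-15 10:23:45,123 INFO  (main) [SERVER] Example message";
        System.out.println(parse(sample));
    }
}
```

If you change the pattern in log4j2.xml, the regular expression needs to change with it. For machine parsing, the JSON-FILE appender is likely a better fit than a regular expression.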
9.1.2.5. Enabling and Configuring the JSON Log Handler
Data Grid provides a JSON log handler to write messages in JSON format.
Prerequisites
Ensure that Data Grid is not running. You cannot dynamically enable log handlers.
Procedure
When you start Data Grid, it writes each log message as a JSON map in the following file:
$RHDG_HOME/${infinispan.server.root}/log/server.log.json
9.1.3. Access Logs
Hot Rod and REST endpoints can record all inbound client requests as log entries with the following categories:
- org.infinispan.HOTROD_ACCESS_LOG logging category for the Hot Rod endpoint.
- org.infinispan.REST_ACCESS_LOG logging category for the REST endpoint.
9.1.3.1. Enabling Access Logs
Access logs for Hot Rod and REST endpoints are disabled by default. To enable either logging category, set the level to TRACE in the Data Grid logging configuration, as in the following example:
<Logger name="org.infinispan.HOTROD_ACCESS_LOG" additivity="false" level="TRACE">
  <AppenderRef ref="HR-ACCESS-FILE"/>
</Logger>
9.1.3.2. Access Log Properties
The default format for access logs is as follows:
%X{address} %X{user} [%d{dd/MMM/yyyy:HH:mm:ss Z}] "%X{method} %m %X{protocol}" %X{status} %X{requestSize} %X{responseSize} %X{duration}%n
The preceding format creates log entries such as the following:
127.0.0.1 - [DD/MM/YYYY:HH:MM:SS +0000] "PUT /rest/v2/caches/default/key HTTP/1.1" 404 5 77 10
Logging properties use the %X{name} notation and let you modify the format of access logs. The following are the default logging properties:
Property | Description |
---|---|
address | Either the X-Forwarded-For header or the client IP address. |
user | Principal name, if using authentication. |
method | Method used. |
protocol | Protocol used. |
status | An HTTP status code for the REST endpoint. |
requestSize | Size, in bytes, of the request. |
responseSize | Size, in bytes, of the response. |
duration | Number of milliseconds that the server took to handle the request. |
Use the header name prefixed with h: to log headers that were included in requests; for example, %X{h:User-Agent}.
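To show how the default access log format lines up with the properties above, here is a minimal Java sketch that parses one entry into named fields. The class name `AccessLogEntry` is hypothetical. The sketch assumes the default format is in use and that an absent user appears as `-`, as in the sample entry shown earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Data Grid.
public class AccessLogEntry {
    // Mirrors the default format:
    // address user [date] "method message protocol" status requestSize responseSize duration
    private static final Pattern ENTRY = Pattern.compile(
        "^(\\S+) (\\S+) \\[([^\\]]+)\\] \"(\\S+) (.*) (\\S+)\" (\\S+) (\\d+) (\\d+) (\\d+)$");

    // Returns the fields of one access log entry, or empty if the line
    // does not follow the default format.
    public static Optional<Map<String, String>> parse(String line) {
        Matcher m = ENTRY.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("address", m.group(1));
        fields.put("user", m.group(2));
        fields.put("date", m.group(3));
        fields.put("method", m.group(4));
        fields.put("message", m.group(5));
        fields.put("protocol", m.group(6));
        fields.put("status", m.group(7));
        fields.put("requestSize", m.group(8));
        fields.put("responseSize", m.group(9));
        fields.put("duration", m.group(10));
        return Optional.of(fields);
    }

    public static void main(String[] args) {
        // The sample entry from this section, with its date placeholder kept as-is.
        String sample = "127.0.0.1 - [DD/MM/YYYY:HH:MM:SS +0000] "
            + "\"PUT /rest/v2/caches/default/key HTTP/1.1\" 404 5 77 10";
        System.out.println(parse(sample));
    }
}
```

The status field is matched as any non-blank token rather than digits only, because the table describes it as an HTTP status code for the REST endpoint specifically. If you add header properties such as %X{h:User-Agent} to the format, extend the expression accordingly.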
9.2. Configuring Statistics, Metrics, and JMX
Enable statistics that Data Grid exports to a MicroProfile Metrics endpoint or via JMX MBeans. You can also register JMX MBeans to perform management operations.
9.2.1. Enabling Data Grid Statistics
Data Grid lets you enable statistics for Cache Managers and caches. However, enabling statistics for a Cache Manager does not enable statistics for the caches that it controls. You must explicitly enable statistics for your caches.
Data Grid server enables statistics for Cache Managers by default.
Procedure
- Enable statistics declaratively or programmatically.
Declaratively
<cache-container statistics="true"> 1
  <local-cache name="mycache" statistics="true"/> 2
</cache-container>
- 1
- Enables statistics for the Cache Manager.
- 2
- Enables statistics for the named cache.
Programmatically
GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
  .cacheContainer().statistics(true) 1
  .build();
...
Configuration config = new ConfigurationBuilder()
  .statistics().enable() 2
  .build();
- 1
- Enables statistics for the Cache Manager.
- 2
- Enables statistics for the cache.
9.2.2. Enabling Data Grid Metrics
Configure Data Grid to export gauges and histograms.
Procedure
- Configure metrics declaratively or programmatically.
Declaratively
<cache-container statistics="true"> 1
  <metrics gauges="true" histograms="true" /> 2
</cache-container>
- 1
- Enables statistics for the Cache Manager.
- 2
- Exports gauges and histograms.
Programmatically
GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
  .statistics().enable() 1
  .metrics().gauges(true).histograms(true) 2
  .build();
- 1
- Enables statistics for the Cache Manager.
- 2
- Exports gauges and histograms.
9.2.3. Collecting Data Grid Metrics
Collect Data Grid metrics with monitoring tools such as Prometheus.
Prerequisites
- Enable statistics. If you do not enable statistics, Data Grid provides 0 and -1 values for metrics.
- Optionally enable histograms. By default, Data Grid generates gauges but not histograms.
Procedure
Get metrics in Prometheus (OpenMetrics) format:
$ curl -v http://localhost:11222/metrics
Get metrics in MicroProfile JSON format:
$ curl --header "Accept: application/json" http://localhost:11222/metrics
Next steps
Configure monitoring applications to collect Data Grid metrics. For example, add the following to prometheus.yml:
static_configs:
  - targets: ['localhost:11222']
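If you want to read the metrics endpoint from your own code rather than from a monitoring tool, the following Java sketch shows one possible approach. It fetches the text returned by the endpoint and reduces it to a map of sample names to values. The class name `MetricsScrape` and the metric names in the sample are hypothetical. The sketch also assumes that the endpoint is reachable without authentication and that sample lines carry no trailing timestamp.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Data Grid.
public class MetricsScrape {

    // Fetches the metrics text, for example from http://localhost:11222/metrics.
    public static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    // Parses "name value" and "name{labels} value" lines and skips
    // comment lines (# HELP, # TYPE) and values that are not parseable numbers.
    public static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : body.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue;
            }
            try {
                samples.put(line.substring(0, split).trim(),
                            Double.parseDouble(line.substring(split + 1)));
            } catch (NumberFormatException e) {
                // Ignore values this sketch does not understand.
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Invented sample in Prometheus text format, for illustration only.
        String sample = "# TYPE example_gauge gauge\n"
            + "example_gauge 3.0\n"
            + "example_labeled{cache=\"mycache\"} 42\n";
        System.out.println(parse(sample));
    }
}
```

Against a running server, `MetricsScrape.parse(MetricsScrape.fetch("http://localhost:11222/metrics"))` would return whatever samples the server exposes. Values of 0 and -1 throughout suggest that statistics are not enabled.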
Reference
- Prometheus Configuration
- Enabling Data Grid Statistics
9.2.4. Configuring Data Grid to Register JMX MBeans
Data Grid can register JMX MBeans that you can use to collect statistics and perform administrative operations. However, you must enable statistics separately from JMX; otherwise, Data Grid provides 0 values for all statistic attributes.
Procedure
- Enable JMX declaratively or programmatically.
Declaratively
<cache-container>
<jmx enabled="true" /> 1
</cache-container>
- 1
- Registers Data Grid JMX MBeans.
Programmatically
GlobalConfiguration globalConfig = new GlobalConfigurationBuilder()
.jmx().enable() 1
.build();
- 1
- Registers Data Grid JMX MBeans.
9.2.4.1. Data Grid MBeans
Data Grid exposes JMX MBeans that represent manageable resources.
org.infinispan:type=Cache
- Attributes and operations available for cache instances.
org.infinispan:type=CacheManager
- Attributes and operations available for cache managers, including Data Grid cache and cluster health statistics.
For a complete list of available JMX MBeans along with descriptions and available operations and attributes, see the Data Grid JMX Components documentation.
9.3. Retrieving Server Health Statistics
Monitor the health of your Data Grid clusters in the following ways:
- Programmatically with embeddedCacheManager.getHealth() method calls.
- JMX MBeans
- Data Grid REST Server
9.3.1. Accessing the Health API via JMX
Retrieve Data Grid cluster health statistics via JMX.
Procedure
- Connect to Data Grid server using any JMX-capable tool, such as JConsole, and navigate to the following object:
org.infinispan:type=CacheManager,name="default",component=CacheContainerHealth
- Select available MBeans to retrieve cluster health statistics.
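The same lookup can be done from code with the standard javax.management API. The following Java sketch builds the object name shown above and reads every readable attribute of the MBean through an MBeanServerConnection. The class name `HealthMBeanReader` is hypothetical, and the sketch makes no assumption about which attributes the health MBean exposes; it simply lists what it finds.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Data Grid.
public class HealthMBeanReader {

    // Builds org.infinispan:type=CacheManager,name="<name>",component=CacheContainerHealth
    public static ObjectName healthObjectName(String cacheManagerName) {
        try {
            return new ObjectName("org.infinispan:type=CacheManager,name="
                + ObjectName.quote(cacheManagerName)
                + ",component=CacheContainerHealth");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads all readable attributes of an MBean; attributes that fail to
    // resolve are skipped rather than aborting the whole read.
    public static Map<String, Object> readAttributes(MBeanServerConnection connection,
                                                     ObjectName name) {
        Map<String, Object> values = new TreeMap<>();
        try {
            for (MBeanAttributeInfo attribute : connection.getMBeanInfo(name).getAttributes()) {
                if (!attribute.isReadable()) {
                    continue;
                }
                try {
                    values.put(attribute.getName(),
                               connection.getAttribute(name, attribute.getName()));
                } catch (Exception e) {
                    // Skip attributes that cannot be read.
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Cannot inspect " + name, e);
        }
        return values;
    }

    public static void main(String[] args) {
        // In-JVM lookup; the MBean is present only if Data Grid runs in this JVM with JMX enabled.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = healthObjectName("default");
        if (server.isRegistered(name)) {
            System.out.println(readAttributes(server, name));
        } else {
            System.out.println("Not registered in this JVM: " + name);
        }
    }
}
```

For a remote server, you would obtain the MBeanServerConnection from a JMX connector (javax.management.remote.JMXConnectorFactory) instead of the platform MBean server. The service URL and credentials depend on your deployment and are not covered here.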
9.3.2. Accessing the Health API via REST
Get Data Grid cluster health via the REST API.
Procedure
Invoke a GET request to retrieve cluster health.
GET /rest/v2/cache-managers/{cacheManagerName}/health
Data Grid responds with a JSON document such as the following:
{
  "cluster_health": {
    "cluster_name": "ISPN",
    "health_status": "HEALTHY",
    "number_of_nodes": 2,
    "node_names": [
      "NodeA-36229",
      "NodeB-28703"
    ]
  },
  "cache_health": [
    {
      "status": "HEALTHY",
      "cache_name": "___protobuf_metadata"
    },
    {
      "status": "HEALTHY",
      "cache_name": "cache2"
    },
    {
      "status": "HEALTHY",
      "cache_name": "mycache"
    },
    {
      "status": "HEALTHY",
      "cache_name": "cache1"
    }
  ]
}
Get cache manager status as follows:
GET /rest/v2/cache-managers/{cacheManagerName}/health/status
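The following Java sketch shows one way a client might call the health endpoint and pull the cluster status out of the response. The class name `HealthCheck` is hypothetical. The base URL `http://localhost:11222` and the cache manager name `default` are assumptions borrowed from other examples in this chapter, and the sketch assumes the endpoint is reachable without authentication. A regular expression stands in for a real JSON parser only to keep the example free of dependencies.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Data Grid.
public class HealthCheck {
    private static final Pattern HEALTH_STATUS =
        Pattern.compile("\"health_status\"\\s*:\\s*\"([^\"]+)\"");

    // Builds the URI for GET /rest/v2/cache-managers/{cacheManagerName}/health
    public static URI healthUri(String baseUrl, String cacheManagerName) {
        return URI.create(baseUrl + "/rest/v2/cache-managers/" + cacheManagerName + "/health");
    }

    // Extracts cluster_health.health_status from the JSON response body.
    public static Optional<String> clusterHealthStatus(String json) {
        Matcher m = HEALTH_STATUS.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Sends the GET request and returns the response body.
    public static String get(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Excerpt of the sample response from this section.
        String sample = "{\"cluster_health\":{\"cluster_name\":\"ISPN\","
            + "\"health_status\":\"HEALTHY\",\"number_of_nodes\":2}}";
        System.out.println(healthUri("http://localhost:11222", "default"));
        System.out.println(clusterHealthStatus(sample));
    }
}
```

Against a running server, `HealthCheck.clusterHealthStatus(HealthCheck.get(HealthCheck.healthUri(baseUrl, name)))` combines the three steps. In production code, a JSON library is the safer choice for reading the response.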
Reference
See the REST v2 (version 2) API documentation for more information.