Chapter 27. Integration with Apache Hadoop
The JBoss Data Grid connector allows JBoss Data Grid to act as a Hadoop-compliant data source. It achieves this integration by providing implementations of Hadoop's InputFormat and OutputFormat, which allow applications to read data from and write data to a JBoss Data Grid server with the best possible data locality. Although JBoss Data Grid's implementations of InputFormat and OutputFormat allow traditional Hadoop Map/Reduce jobs to be run, they may also be used with any tool or utility that supports Hadoop's InputFormat data source.
27.1. Hadoop Dependencies
The JBoss Data Grid implementations of Hadoop's formats are found in the following Maven dependency:
<dependency>
    <groupId>org.infinispan.hadoop</groupId>
    <artifactId>infinispan-hadoop-core</artifactId>
    <version>0.2.0.Final-redhat-1</version>
</dependency>
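Once the dependency is on the classpath, a job is usually pointed at the grid through Hadoop configuration properties. The following is a minimal sketch, not the connector's own code: it only assembles the settings as a plain map so that it runs without Hadoop installed. The property keys are assumptions based on the connector's documented configuration parameters and should be checked against the version in use; the class name GridJobSettings, the cache names, and the server address are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the settings a Hadoop job might need to
// read from and write to a JBoss Data Grid server.
class GridJobSettings {

    // Assumed property keys; verify them against the connector release you use.
    static final String INPUT_SERVERS = "hadoop.ispn.input.remote.cache.servers";
    static final String INPUT_CACHE = "hadoop.ispn.input.cache.name";
    static final String OUTPUT_SERVERS = "hadoop.ispn.output.remote.cache.servers";
    static final String OUTPUT_CACHE = "hadoop.ispn.output.cache.name";

    // Builds the settings in insertion order so they print predictably.
    static Map<String, String> build(String servers, String inputCache, String outputCache) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(INPUT_SERVERS, servers);
        settings.put(INPUT_CACHE, inputCache);
        settings.put(OUTPUT_SERVERS, servers);
        settings.put(OUTPUT_CACHE, outputCache);
        return settings;
    }

    public static void main(String[] args) {
        // Example values only: a single local server and two illustrative caches.
        Map<String, String> settings = build("localhost:11222", "map-reduce-in", "map-reduce-out");
        settings.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real job, each entry would typically be copied into a Hadoop Configuration with conf.set(key, value), and the job's input and output format classes would be set to the connector's InputFormat and OutputFormat implementations.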