16.2. The MapReduce API
16.2.1. The MapReduce API
In JBoss Data Grid, each MapReduce task has four main components:
Mapper
Reducer
Collator
MapReduceTask
The Mapper class implementation is a component of MapReduceTask and is invoked once per input cache entry key/value pair. Map is the process of applying a given function to each element of a list, returning a list of results.
Each node in the JBoss Data Grid executes the Mapper on a given cache entry key/value input pair. It then transforms this cache entry key/value pair into an intermediate key/value pair, which is emitted into the provided Collector instance.
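The mapping step can be sketched in plain Java as a word count. This is a minimal illustration, not the library's implementation: the Collector and Mapper interfaces below are local stand-ins, assumed to resemble the shape of the JBoss Data Grid interfaces, so that the example runs without the product libraries. WordCountMapper, MapperSketch, and mapAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapperSketch {
    // Local stand-in (assumption): receives the intermediate key/value pairs.
    interface Collector<K, V> {
        void emit(K key, V value);
    }

    // Local stand-in (assumption): invoked once per input cache entry.
    interface Mapper<KIn, VIn, KOut, VOut> {
        void map(KIn key, VIn value, Collector<KOut, VOut> collector);
    }

    // Hypothetical mapper: emits (word, 1) for every word in the entry's value.
    static class WordCountMapper implements Mapper<String, String, String, Integer> {
        @Override
        public void map(String key, String value, Collector<String, Integer> collector) {
            for (String word : value.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    collector.emit(word, 1);
                }
            }
        }
    }

    // Runs the mapper over every entry and groups the emitted values by key.
    static Map<String, List<Integer>> mapAll(Map<String, String> entries) {
        Map<String, List<Integer>> intermediate = new LinkedHashMap<>();
        Collector<String, Integer> collector =
            (k, v) -> intermediate.computeIfAbsent(k, unused -> new ArrayList<>()).add(v);
        Mapper<String, String, String, Integer> mapper = new WordCountMapper();
        entries.forEach((k, v) -> mapper.map(k, v, collector));
        return intermediate;
    }

    public static void main(String[] args) {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("1", "hello world");
        entries.put("2", "hello grid");
        System.out.println(mapAll(entries)); // {hello=[1, 1], world=[1], grid=[1]}
    }
}
```

Note that the key "hello" ends up with two intermediate values, one per input entry in which the word appears.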
At this stage, for each output key there may be multiple output values. The multiple values must be reduced to a single value, and this is the task of the Reducer. JBoss Data Grid's distributed execution environment creates one instance of Reducer per execution node.
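As a rough illustration of reducing many values to one, the following sketch sums the intermediate values for a key. The Reducer interface here is a local stand-in, assumed to resemble the shape of the product interface, and SumReducer is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Iterator;

class ReducerSketch {
    // Local stand-in (assumption): folds all values for one key into a single value.
    interface Reducer<KOut, VOut> {
        VOut reduce(KOut reducedKey, Iterator<VOut> values);
    }

    // Hypothetical reducer: adds up the counts emitted for a word.
    static class SumReducer implements Reducer<String, Integer> {
        @Override
        public Integer reduce(String reducedKey, Iterator<Integer> values) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        Reducer<String, Integer> reducer = new SumReducer();
        // Three intermediate values for the key "hello" collapse into one.
        System.out.println(reducer.reduce("hello", Arrays.asList(1, 1, 1).iterator())); // 3
    }
}
```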
The same Reducer interface is used for Combiners. A Combiner is similar to a Reducer, except that it must be able to work on partial results. The Combiner is executed on the results of the Mapper, on the same node, without considering the other nodes that might have generated values for the same intermediate key.
Because Combiners see only part of the intermediate values, they cannot be used in all scenarios. When they can be used, however, they can reduce network traffic significantly.
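To illustrate why a Combiner fits some reductions and not others, the sketch below simulates two nodes in plain Java. A sum gives the same answer whether or not it is first applied per node, while a naive average does not, because a mean of per-node means ignores how many values each node held. The Reducer interface is a local stand-in, and all other names (SUM, AVERAGE, combineThenReduce, reduceOnly) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CombinerSketch {
    // Local stand-in (assumption): the same interface serves as Reducer and Combiner.
    interface Reducer<KOut, VOut> {
        VOut reduce(KOut reducedKey, Iterator<VOut> values);
    }

    // Sum is safe on partial results: a sum of partial sums equals the total sum.
    static final Reducer<String, Integer> SUM = (key, values) -> {
        int sum = 0;
        while (values.hasNext()) sum += values.next();
        return sum;
    };

    // Naive average is not safe on partial results.
    static final Reducer<String, Double> AVERAGE = (key, values) -> {
        double sum = 0;
        int n = 0;
        while (values.hasNext()) { sum += values.next(); n++; }
        return sum / n;
    };

    // Applies the reducer per node first (the combine step), then to the partial results.
    static <V> V combineThenReduce(Reducer<String, V> reducer, String key, List<List<V>> perNode) {
        List<V> partials = new ArrayList<>();
        for (List<V> nodeValues : perNode) {
            partials.add(reducer.reduce(key, nodeValues.iterator()));
        }
        return reducer.reduce(key, partials.iterator());
    }

    // Applies the reducer once to all values from all nodes (no combine step).
    static <V> V reduceOnly(Reducer<String, V> reducer, String key, List<List<V>> perNode) {
        List<V> all = new ArrayList<>();
        for (List<V> nodeValues : perNode) all.addAll(nodeValues);
        return reducer.reduce(key, all.iterator());
    }

    public static void main(String[] args) {
        List<List<Integer>> counts = Arrays.asList(Arrays.asList(1, 1, 1), Arrays.asList(1));
        System.out.println(reduceOnly(SUM, "hello", counts));        // 4
        System.out.println(combineThenReduce(SUM, "hello", counts)); // 4

        List<List<Double>> samples = Arrays.asList(Arrays.asList(2.0, 4.0, 6.0), Arrays.asList(10.0));
        System.out.println(reduceOnly(AVERAGE, "k", samples));        // 5.5
        System.out.println(combineThenReduce(AVERAGE, "k", samples)); // 7.0
    }
}
```

In the sum case, the combine step shrinks the first node's three values to one before anything would cross the network, which is where the traffic saving comes from.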
The Collator coordinates results from Reducers that have been executed on JBoss Data Grid, and assembles a final result that is delivered to the initiator of the MapReduceTask. The Collator is applied to the final map key/value result of MapReduceTask.
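The collation step can be sketched as a function from the reduced key/value map to a single result. In this minimal example the Collator interface is a local stand-in, assumed to resemble the shape of the product interface, and MostFrequentWord is a hypothetical collator that picks the word with the highest count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CollatorSketch {
    // Local stand-in (assumption): turns the reduced map into one final result.
    interface Collator<KOut, VOut, R> {
        R collate(Map<KOut, VOut> reducedResults);
    }

    // Hypothetical collator: returns the key with the largest count,
    // or null when the reduced map is empty. Ties keep the first key seen.
    static class MostFrequentWord implements Collator<String, Integer, String> {
        @Override
        public String collate(Map<String, Integer> reducedResults) {
            String best = null;
            int bestCount = -1;
            for (Map.Entry<String, Integer> entry : reducedResults.entrySet()) {
                if (entry.getValue() > bestCount) {
                    best = entry.getKey();
                    bestCount = entry.getValue();
                }
            }
            return best;
        }
    }

    public static void main(String[] args) {
        // Stands in for the reduced output of a word-count task.
        Map<String, Integer> reduced = new LinkedHashMap<>();
        reduced.put("hello", 2);
        reduced.put("world", 1);
        reduced.put("grid", 1);
        System.out.println(new MostFrequentWord().collate(reduced)); // hello
    }
}
```

Using a Collator this way means the initiator receives only the single value it needs rather than the whole reduced map.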