Chapter 16. MapReduce


16.1. About MapReduce

The JBoss Data Grid MapReduce model is an adaptation of Google's MapReduce model.
MapReduce is a programming model used to process and generate large data sets. It is typically used in distributed computing environments where nodes are clustered. In JBoss Data Grid, MapReduce allows transparent distributed processing of very large amounts of data across the data grid by performing computations as close as possible to where the data is stored.
MapReduce uses the two distinct computational phases of map and reduce to process information requests through the data grid. The process occurs as follows:
  1. The user initiates a task on a cache instance, which runs on a cluster node (the master node).
  2. The master node receives the task input, divides the task, and sends tasks for map phase execution on the grid.
  3. Each node executes a Mapper function on its input and returns the intermediate results to the master node.
    • If the distributedReducePhase parameter is set to "true", the map results are inserted in an intermediary cache, rather than being returned to the master node.
    • If a Combiner has been specified with task.combinedWith(Reducer), the Combiner is called on the Mapper results, and the Combiner's results are returned to the master node or inserted in the intermediary cache.
  4. The master node collects all intermediate results from the map phase and merges all intermediate values associated with the same intermediate key.
    • If the distributedReducePhase parameter is set to "true", the merging of the intermediate values is done on each node, as the Mapper or Combiner results are inserted in the intermediary cache. The master node only receives the intermediate keys.
  5. The master node sends intermediate key/value pairs for reduction on the grid.
    • If the distributedReducePhase parameter is set to "false", the reduction phase is executed only on the master node.
  6. The final results of the reduction phase are returned.
    • If the distributedReducePhase parameter is set to "true", the master node running the task receives all results from the reduction phase and returns the final result to the MapReduce task initiator.
    • If a Collator has been specified with task.execute(Collator), the Collator is executed on the reduction phase results, and the Collator result is returned to the task initiator.
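
The steps above can be illustrated with a minimal, plain-Java simulation of a word count. This is only a sketch under stated assumptions: it does not use any JBoss Data Grid classes, every class and method name in it (MapReduceSketch, map, reduce, group, run, collate) is hypothetical, each inner list stands in for the cache entries held by one node, and it follows the path where distributedReducePhase is "false", so merging and reduction happen in one place.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the MapReduce phases. All names are hypothetical;
// no JBoss Data Grid API is used here.
class MapReduceSketch {

    // Map phase: emit one (word, 1) pair for every word in an entry's value.
    static List<Map.Entry<String, Integer>> map(String value) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : value.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reduce function, also reused as the Combiner: sum the values of one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Merge step: collect all values that share the same intermediate key.
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Runs the whole task; each inner list represents one node's entries.
    static Map<String, Integer> run(List<List<String>> nodes) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        for (List<String> nodeEntries : nodes) {
            // Step 3: the node maps its own entries.
            List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
            for (String value : nodeEntries) {
                mapped.addAll(map(value));
            }
            // Combiner: pre-reduce locally before results leave the node.
            for (Map.Entry<String, List<Integer>> e : group(mapped).entrySet()) {
                intermediate.add(new AbstractMap.SimpleEntry<>(e.getKey(), reduce(e.getValue())));
            }
        }
        // Step 4: the master merges values that share an intermediate key.
        Map<String, List<Integer>> merged = group(intermediate);
        // Step 5: reduction of each key's merged values.
        Map<String, Integer> reduced = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : merged.entrySet()) {
            reduced.put(e.getKey(), reduce(e.getValue()));
        }
        return reduced;
    }

    // Collator: turn the reduced map into a single result (most frequent word).
    static String collate(Map<String, Integer> reduced) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : reduced.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<List<String>> nodes = Arrays.asList(
                Arrays.asList("a b a"),
                Arrays.asList("b c", "a"));
        Map<String, Integer> counts = run(nodes);
        System.out.println(counts + " most frequent: " + collate(counts));
    }
}
```

In this sketch the Combiner reuses the reduce function, which works only because summation is associative and commutative. A real task on the data grid would also have to account for where the intermediate results are stored, which the simulation leaves out.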