MapReduce is a design pattern for data processing. The idea behind MapReduce is simple. A large task is broken down into smaller subtasks. Each subtask is performed independently. The results of all these subtasks are then combined to produce the final result. It should be obvious that MapReduce has two principal phases:
If you have done functional programming in the past, the idea should not be new to you. In the paradigm of functional programming, map()
takes an array as an input and performs an operation on each element on the array. reduce()takes
the result array of map()
as its input and combines all the elements in that array into a single element by performing some operation. To elaborate the idea, consider the array of integers [1, 2, 3, 4, 5]. We have to find the sum of...