This is a very early draft preview of the upcoming book.
The best feature of computing is its ability to automate tasks that would be laborious for a human. Most of our day-to-day lives consist of repeating the same process over and over. Doing crossword puzzles, washing the dishes, making a grocery list, turning the pages of a book: all of these require repetition.
The notion of a _loop_ is certainly one of the best parts of computer science. It feels odd to have to mention it, but there was a time before loops, when programming was in its infancy. The invention of the loop is for us what the 1969 moon landing will be for space travelers in 2389: a quaint idea that seems trivial in hindsight, but which blazed away the underbrush and inspired future generations to aim higher.
It's probably better for us to use the term _iteration_ rather than loop. Every language you encounter has some facility for iteration, but it isn't always in the form of the classical _for_ loop that you're probably familiar with. Functional languages (which, to believe the current trends, portend the future of large-scale systems) often iterate without loops, relying instead on recursion (and, where tail calls are not optimized, on deep call stacks) to provide iteration.
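To make the contrast concrete, here is a minimal sketch in Python of the same iteration written twice: once with a classical _for_ loop and once with recursion. The function names `sum_with_loop` and `sum_with_recursion` are invented for this illustration, and Python is used only for readability; it is not a functional language and does not optimize tail calls, so the recursive version really does grow the call stack with every element.

```python
def sum_with_loop(values):
    """Iterate with a classical for loop, keeping a running total."""
    total = 0
    for value in values:
        total += value
    return total


def sum_with_recursion(values):
    """Iterate without a loop: handle the first item, then recurse on the rest."""
    if not values:  # base case: nothing left to visit
        return 0
    head, *tail = values  # split off the first element
    return head + sum_with_recursion(tail)


numbers = [3, 1, 4, 1, 5]
print(sum_with_loop(numbers))       # 14
print(sum_with_recursion(numbers))  # 14
```

Both functions visit every element exactly once. The difference lies in where the bookkeeping lives: the loop keeps it in the `total` variable, while the recursive version keeps it in the chain of pending calls.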
In the same way that the invention of the loop vaulted computer science forward, we could say much the same for the _Map-Reduce algorithm_. Calling it an algorithm is correct, since it's a recipe for performing work, but it could also be considered a design pattern, because it succeeds at solving so many different kinds of problems.
To get a real-world sense of the simplicity and power of map-reduce, we begin by considering the following small data set:
We realize that we can come up with the answer by breaking the solution into two small steps. First, we transform the original data set into a new one:
Now that the data is organized this way, it's easy to find the answer by _reducing_ the key data characteristic.
(To Be Continued...)
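The two steps described above can be sketched in Python. The `purchases` data set and the question it answers (the total amount spent) are hypothetical stand-ins invented for this sketch, not the data set the text refers to. The built-in `map` and `functools.reduce` are used here as one plausible way to express the idea.

```python
from functools import reduce

# Hypothetical data set (an assumption for illustration): one record per purchase.
purchases = [
    {"customer": "ana", "amount": 12},
    {"customer": "ben", "amount": 7},
    {"customer": "ana", "amount": 30},
    {"customer": "cy", "amount": 5},
]

# Map step: transform each record into just the characteristic we care about.
amounts = list(map(lambda purchase: purchase["amount"], purchases))

# Reduce step: fold the transformed values into a single answer.
total = reduce(lambda running, amount: running + amount, amounts, 0)

print(amounts)  # [12, 7, 30, 5]
print(total)    # 54
```

The map step never looks at more than one record at a time, and the reduce step only ever combines two values. That independence is what lets the same pattern be spread across many workers.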