Learning Hadoop 2


Summary


This chapter explored Spark and showed how it brings iterative processing to Hadoop as a rich new framework on which applications can be built atop YARN. In particular, we highlighted:

  • Spark's processing model, which is built on distributed data structures, and how it enables efficient in-memory data processing

  • The broader Spark ecosystem and how multiple additional projects are built atop it to specialize the computational model even further
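
To make the first point concrete, the following is a minimal single-process sketch of why keeping an intermediate dataset in memory helps iterative workloads. It is not the Spark API: ToyDataset and its map, filter, cache, and collect methods are hypothetical names chosen to resemble the style of Spark's operations, and nothing here is distributed. The sketch only assumes that transformations are lazy and that a cached dataset is computed once and then reused.

```python
# Toy, single-process stand-in for a Spark-style dataset (hypothetical, not the Spark API).
class ToyDataset:
    def __init__(self, compute):
        self._compute = compute    # zero-argument function that produces the records
        self._use_cache = False    # set by cache()
        self._cached = None        # holds the records after the first evaluation
        self.evaluations = 0       # how many times the records were recomputed

    @classmethod
    def from_list(cls, items):
        return cls(lambda: list(items))

    def map(self, fn):
        # Lazy: nothing is computed until collect() is called on the result.
        return ToyDataset(lambda: [fn(x) for x in self.collect()])

    def filter(self, pred):
        return ToyDataset(lambda: [x for x in self.collect() if pred(x)])

    def cache(self):
        # Mark the dataset so its records are kept in memory once computed.
        self._use_cache = True
        return self

    def collect(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        self.evaluations += 1
        result = self._compute()
        if self._use_cache:
            self._cached = result
        return result


# Build an intermediate dataset once and mark it as cached.
squares = ToyDataset.from_list(range(1, 6)).map(lambda x: x * x).cache()

# Iterative workload: several passes over the same intermediate dataset.
totals = [sum(squares.map(lambda x, k=k: x * k).collect()) for k in range(1, 4)]

print(totals)               # [55, 110, 165]
print(squares.evaluations)  # 1: the squares were computed once and then reused
```

Without the call to cache(), each of the three passes would recompute the squares from the source list; that repeated work is the cost that in-memory processing is meant to avoid.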

In the next chapter, we will explore Apache Pig and its programming language, Pig Latin. We will see how this tool can greatly simplify software development for Hadoop by abstracting away some of the complexity of MapReduce and Spark.