Learning Hadoop 2

Summary


In this chapter, we introduced four tools to ease development on Hadoop. In particular, we covered:

  • How Hadoop streaming lets us write MapReduce jobs in dynamic languages

  • How Kite Data simplifies interfacing with heterogeneous data sources

  • How Apache Crunch provides a high-level abstraction to write pipelines of Spark and MapReduce jobs that implement common design patterns

  • How Morphlines allows us to declare chains of commands and data transformations that can then be embedded in any Java codebase
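
As a brief reminder of the first point, the following is a minimal sketch of a word count written in the style Hadoop streaming expects: the mapper and the reducer exchange tab-separated key/value lines, and the reducer receives its input sorted by key. The function names (map_lines, reduce_lines, simulate_job) are hypothetical and chosen for illustration; they are not taken from the chapter's own examples.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(records):
    """Reducer: sum the counts for each key.

    Assumes records arrive sorted by key, which is how Hadoop streaming
    delivers them to a reducer.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def simulate_job(lines):
    """Local stand-in for the map -> sort -> reduce flow (illustrative only)."""
    mapped = sorted(map_lines(lines), key=lambda r: r.split("\t", 1)[0])
    return reduce_lines(mapped)


if __name__ == "__main__":
    for record in simulate_job(["Hadoop streaming", "hadoop crunch"]):
        print(record)
```

On a cluster, each function would instead be wrapped in its own script that reads sys.stdin and prints to standard output, and the two scripts would be passed to the streaming JAR through its -mapper and -reducer options. The local simulation only mimics the sort that the framework performs between the two phases.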

In Chapter 10, Running a Hadoop 2 Cluster, we will shift our focus from the domain of software development to system administration. We will discuss how to set up, manage, and scale a Hadoop cluster, while taking aspects such as monitoring and security into consideration.