Learning Hadoop 2

Choosing a framework


In the previous chapters, we looked at the MapReduce and Spark programming APIs for writing distributed applications. Although very powerful and flexible, these APIs come with a certain level of complexity and can require significant development time.

To reduce verbosity, we introduced the Pig and Hive frameworks, which compile their domain-specific languages, Pig Latin and HiveQL, into a series of MapReduce jobs or Spark DAGs, effectively abstracting the APIs away. Both languages can be extended with user-defined functions (UDFs), which map complex logic onto the Pig and Hive data models.
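As a reminder of what extending these languages looks like, here is a minimal sketch of a Python UDF for Pig. The function name `extract_hashtags` and its output schema are hypothetical, chosen only for illustration; the sketch assumes Pig's `streaming_python` engine, where `pig_util.outputSchema` can be imported, and falls back to a stand-in decorator so the function can also be run outside Pig.

```python
# Sketch of a Python UDF for Pig. The function name and schema are
# hypothetical and exist only for illustration.
try:
    # Importable when Pig runs the script with its streaming_python engine.
    from pig_util import outputSchema
except ImportError:
    # Stand-in decorator so the function can be run and tested outside Pig.
    def outputSchema(schema):
        def decorator(func):
            func.output_schema = schema
            return func
        return decorator


@outputSchema("hashtags:bag{t:tuple(tag:chararray)}")
def extract_hashtags(text):
    """Return the hashtags found in a string as a list of 1-tuples.

    A list of tuples is how a Python UDF hands a bag back to Pig.
    """
    if text is None:
        return []
    return [(word[1:].lower(),) for word in text.split()
            if word.startswith("#") and len(word) > 1]
```

Assuming the file is saved as `udfs.py`, a Pig Latin script could load it with `REGISTER 'udfs.py' USING streaming_python AS udfs;` and then call `udfs.extract_hashtags(text)` inside a `FOREACH ... GENERATE` statement. The schema string is what ties the Python return value to Pig's data model.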

When we need more flexibility and modularity, however, things can get tricky. Depending on the use case and the developer's needs, the Hadoop ecosystem offers a vast choice of APIs, frameworks, and libraries. In this chapter, we identify four categories of users and match them with the relevant tools:

  • Developers that want to avoid Java in favor of scripting MapReduce...