Mastering Hadoop

By: Karanth

Overview of this book

Do you want to broaden your Hadoop skill set and take your knowledge to the next level? Do you want to use Hadoop to solve challenging data processing problems? Are your Hadoop jobs, Pig scripts, or Hive queries not running as fast as you expect? Are you looking to understand the benefits of upgrading Hadoop? If the answer to any of these is yes, this book is for you. It assumes novice-level familiarity with Hadoop.

Hadoop on the cloud

All major cloud service providers offer Hadoop as a PaaS. Amazon with Elastic MapReduce (EMR), Microsoft with HDInsight, and Google with Hadoop on the Google Cloud Platform are the frontrunners in this space. Amazon was the first to offer Hadoop on the cloud, in 2009.

We will briefly compare and contrast EMR and HDInsight in the following table:

| Amazon AWS EMR | Microsoft Azure HDInsight |
| --- | --- |
| It was released in 2009. It has more than five years of service and technology maturity. | It was released in 2012. It has around two years of service and technology maturity. |
| The popularity of AWS makes the learning curve less steep for a new user. EMR is integrated with the popular AWS console. | Microsoft Azure is picking up, but it's not yet as popular as AWS. HDInsight is also integrated with Microsoft Azure dashboards. |
| It has the ability to deploy clusters with the MapR Hadoop distribution. | Hadoop distribution is limited to Microsoft's... |