Learning Hadoop 2

Sharing resources


In Hadoop 1, the only time one had to consider resource sharing was when choosing which scheduler to use for the MapReduce JobTracker. Since all jobs were eventually translated into MapReduce code, having a policy for resource sharing at the MapReduce level was usually sufficient to manage cluster workloads as a whole.

Hadoop 2 and YARN changed this picture. As well as running many MapReduce jobs, a cluster might also be running many other applications atop other YARN ApplicationMasters. Tez and Spark, for example, are frameworks in their own right that run additional applications atop the interfaces they provide.

If everything runs on YARN, then YARN provides ways of configuring the maximum resource allocation (in terms of CPU, memory, and soon I/O) that each container allocated to an application can consume. The primary goal here is to allocate enough resources to keep the hardware fully utilized, neither leaving capacity unused nor overloading it.
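
To make the utilization trade-off concrete, the following is a minimal Python sketch of the arithmetic behind container sizing; it is not YARN code. It assumes that a container request is rounded up to a multiple of the scheduler minimum (the values that properties such as yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb would supply in yarn-site.xml) and that a request above the maximum is rejected. The exact rounding behavior depends on the scheduler and resource calculator in use, and the Resources class, function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A (memory, CPU) pair; hypothetical stand-in for a YARN resource."""
    memory_mb: int
    vcores: int


def _round_up(value: int, step: int) -> int:
    """Round value up to the nearest multiple of step, with step as a floor."""
    return max(step, -(-value // step) * step)


def normalize(request: Resources, minimum: Resources, maximum: Resources) -> Resources:
    """Round a request up to multiples of the minimum allocation.

    Assumption: requests larger than the maximum are rejected outright,
    which is modeled here by raising ValueError.
    """
    if request.memory_mb > maximum.memory_mb or request.vcores > maximum.vcores:
        raise ValueError("request exceeds the configured maximum allocation")
    return Resources(
        memory_mb=_round_up(request.memory_mb, minimum.memory_mb),
        vcores=_round_up(request.vcores, minimum.vcores),
    )


def containers_per_node(node: Resources, container: Resources) -> int:
    """Count containers that fit on one node; the scarcer resource decides."""
    return min(node.memory_mb // container.memory_mb,
               node.vcores // container.vcores)


# Illustrative numbers only: a node offering 48 GB and 12 vcores to YARN.
node = Resources(memory_mb=49152, vcores=12)
minimum = Resources(memory_mb=1024, vcores=1)
maximum = Resources(memory_mb=8192, vcores=4)

granted = normalize(Resources(memory_mb=3000, vcores=1), minimum, maximum)
count = containers_per_node(node, granted)
unused_mb = node.memory_mb - count * granted.memory_mb

print(granted, count, unused_mb)
```

With these illustrative numbers the node runs out of vcores before it runs out of memory, so a quarter of its memory stays idle. This is the kind of imbalance that tuning the per-container limits is meant to avoid.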

Things get somewhat more...