Mastering Hadoop

By: Sandeep Karanth

Job scheduling in YARN


Most clusters are multitenant in nature, that is, a number of teams or people share the cluster resources. Allocating resources to satisfy the needs of all these tenants is therefore important, and it is the responsibility of the scheduler. Giving each team or person an individual cluster is not viable because it leads to poor utilization.

YARN provides a pluggable model for scheduling policies. The initial versions of Hadoop had a simple First In, First Out (FIFO) scheduler. However, FIFO proved inadequate for dealing with the complexities of multitenancy. We will discuss two other scheduling strategies that are used in Hadoop today: CapacityScheduler and FairScheduler.
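
To make the idea of a pluggable policy concrete, here is a minimal sketch in plain Java. The names SchedulingPolicy, FifoPolicy, and PluggableSchedulerDemo are hypothetical and exist only for this illustration; they are not YARN's actual scheduler interfaces. In a real cluster, the scheduler implementation is chosen through the yarn.resourcemanager.scheduler.class property in yarn-site.xml. The sketch also shows why FIFO struggles with multitenancy: one tenant's earlier submissions hold back everyone else.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy driver: it depends only on the SchedulingPolicy interface,
// so a different policy could be swapped in without changing it.
final class PluggableSchedulerDemo {
    // Pulls jobs from the policy until none are left and records the order.
    static List<String> drain(SchedulingPolicy policy) {
        List<String> order = new ArrayList<>();
        for (String job = policy.next(); job != null; job = policy.next()) {
            order.add(job);
        }
        return order;
    }

    public static void main(String[] args) {
        SchedulingPolicy policy = new FifoPolicy();
        policy.submit("teamA", "etl-1");
        policy.submit("teamA", "etl-2");
        policy.submit("teamA", "etl-3");
        policy.submit("teamB", "report"); // waits behind all of teamA's jobs
        System.out.println(drain(policy));
        // prints [teamA/etl-1, teamA/etl-2, teamA/etl-3, teamB/report]
    }
}

// Hypothetical, simplified policy interface (an assumption for this sketch,
// not part of the Hadoop API).
interface SchedulingPolicy {
    void submit(String tenant, String job);

    // Returns the next job to run, or null when nothing is waiting.
    String next();
}

// FIFO policy: jobs run strictly in submission order, regardless of tenant.
final class FifoPolicy implements SchedulingPolicy {
    private final Queue<String> jobs = new ArrayDeque<>();

    @Override
    public void submit(String tenant, String job) {
        jobs.add(tenant + "/" + job);
    }

    @Override
    public String next() {
        return jobs.poll();
    }
}
```

Because the driver sees only the interface, a capacity-based or fair policy could replace FifoPolicy without touching the calling code, which is the essence of a pluggable design.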

CapacityScheduler

The concept behind CapacityScheduler is to guarantee each tenant its promised capacity on a shared cluster. If other tenants use less than their requested capacity, the scheduler allows the tenant to tap into these unused resources. The number one goal of CapacityScheduler is not to allow a single...
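
The two ideas described here, a guaranteed share plus borrowing of unused capacity, can be sketched with simple arithmetic. The following is a simplified model under stated assumptions, not the CapacityScheduler algorithm itself: the class CapacitySharingDemo and its allocate method are hypothetical, capacities are plain integer units, and spare capacity is handed out greedily in tenant order.

```java
import java.util.Arrays;

// Simplified model of guaranteed capacity with borrowing.
// Assumption: this greedy hand-out is for illustration only and does not
// reflect how the real CapacityScheduler distributes spare resources.
final class CapacitySharingDemo {
    // guaranteed[i]: capacity promised to tenant i
    // demand[i]:     capacity tenant i currently asks for
    // Returns the capacity granted to each tenant.
    static int[] allocate(int[] guaranteed, int[] demand) {
        int n = guaranteed.length;
        int[] granted = new int[n];
        int spare = 0;

        // Step 1: every tenant gets its demand, capped at its guarantee.
        for (int i = 0; i < n; i++) {
            granted[i] = Math.min(demand[i], guaranteed[i]);
            spare += guaranteed[i] - granted[i]; // unused part of the guarantee
        }

        // Step 2: tenants with unmet demand borrow from the unused capacity.
        for (int i = 0; i < n && spare > 0; i++) {
            int extra = Math.min(demand[i] - granted[i], spare);
            granted[i] += extra;
            spare -= extra;
        }
        return granted;
    }

    public static void main(String[] args) {
        int[] guaranteed = {50, 30, 20}; // promised units per tenant
        int[] demand = {80, 10, 20};     // tenant 0 wants more than its share
        System.out.println(Arrays.toString(allocate(guaranteed, demand)));
        // prints [70, 10, 20]
    }
}
```

In this example the second tenant uses only 10 of its 30 guaranteed units, so the first tenant can borrow the 20 idle units on top of its own 50, while no tenant is ever pushed below the smaller of its demand and its guarantee.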