YARN Essentials

ResourceManager failures


In the initial versions of the YARN framework, the ResourceManager was a single point of failure, so a ResourceManager failure meant a total cluster failure. The ResourceManager stores the state of the cluster, such as the metadata of submitted applications, information on cluster resource containers, and the cluster's general configuration. Therefore, if the ResourceManager went down because of a hardware failure, there was no way to avoid manually debugging the cluster and restarting the ResourceManager. While the ResourceManager was down, the cluster was unavailable. Once it was restarted, every job had to be restarted as well, so half-completed jobs lost their work and had to run again from the beginning. In short, a restart of the ResourceManager used to restart all the running ApplicationMasters.
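The failure mode described above can be pictured with a small toy model. This is only an illustrative sketch, not YARN code: the class `ToyResourceManager`, its methods, and the application ID `app_1` are hypothetical names invented for the example. It assumes that all cluster state lives in the memory of a single process, which is the situation the paragraph describes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for a ResourceManager that keeps all state in memory."""

    # application id -> progress in percent (illustrative only)
    applications: dict = field(default_factory=dict)
    available: bool = True

    def submit(self, app_id: str) -> None:
        if not self.available:
            raise RuntimeError("cluster unavailable: ResourceManager is down")
        self.applications[app_id] = 0

    def report_progress(self, app_id: str, percent: int) -> None:
        self.applications[app_id] = percent

    def crash(self) -> None:
        # A hardware failure takes the process down; its in-memory state is gone.
        self.available = False
        self.applications = {}

    def restart(self) -> None:
        # A manual restart brings the process back, but with empty state.
        self.available = True


rm = ToyResourceManager()
rm.submit("app_1")
rm.report_progress("app_1", 50)

rm.crash()
try:
    rm.submit("app_2")          # the cluster accepts no work while the RM is down
    rejected = False
except RuntimeError:
    rejected = True

rm.restart()
lost = "app_1" not in rm.applications   # the half-completed job is forgotten
rm.submit("app_1")                      # so it has to be submitted and run again
print(rejected, lost, rm.applications)
```

In this sketch, the job that had reached 50 percent is unknown after the restart and begins again at 0, which mirrors why a ResourceManager restart used to force every running application to start over.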

The latest versions of YARN address this problem in two ways. One way is by creating an active-passive ResourceManager architecture, so that when one goes...