Learn Apache Mesos

By: Manuj Aggarwal

Overview of this book

Apache Mesos is an open source cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. This book will help you build a strong foundation in Mesos' capabilities, with practical examples to support the concepts explained throughout. Learn Apache Mesos dives straight into how Mesos works. You will be introduced to distributed systems and their challenges, and then learn how you can use Mesos and its frameworks to solve data problems. You will also gain a full understanding of Mesos' internal mechanisms and be equipped to use Mesos and develop applications. Furthermore, this book lets you explore all the steps required to create highly available clusters and build your own Mesos frameworks. You will also cover application deployment and monitoring. By the end of this book, you will have learned how to use Mesos to make full use of machines and how to simplify data center maintenance.

Fault tolerance

To handle failures gracefully, Mesos implements two features, both enabled by default: checkpointing and slave recovery. Checkpointing is enabled both in the framework and on the slave, and allows certain information about the state of the cluster to be persisted periodically to disk on the Mesos slave server. The checkpointed data includes information about tasks, executors, and status updates. Slave recovery allows the Mesos slave daemon, should it fail or be restarted, to read that state back from disk and reconnect to the executors and tasks that are still running.
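The checkpoint-then-recover cycle described above can be illustrated with a small toy model. This is only a sketch of the general idea, not how Mesos itself stores or restores state: the function names `checkpoint`, `recover`, and `demo`, the JSON file format, the file name `agent_state.json`, and the executor and task identifiers are all hypothetical and chosen for illustration.

```python
import json
import os
import tempfile


def checkpoint(state, path):
    """Persist state to disk atomically (toy stand-in for checkpointing).

    The state is written to a temporary file first and then renamed over
    the target, so a crash mid-write never leaves a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def recover(path):
    """Read the last checkpointed state (toy stand-in for slave recovery).

    Returns an empty state if nothing was ever checkpointed.
    """
    if not os.path.exists(path):
        return {"executors": {}, "status_updates": []}
    with open(path) as f:
        return json.load(f)


def demo():
    """Checkpoint some state, 'lose' it in memory, then recover it."""
    with tempfile.TemporaryDirectory() as work_dir:
        path = os.path.join(work_dir, "agent_state.json")  # hypothetical name
        state = {
            "executors": {"executor-1": {"pid": 4242, "tasks": ["task-a"]}},
            "status_updates": [{"task": "task-a", "state": "TASK_RUNNING"}],
        }
        checkpoint(state, path)

        # Simulate the daemon failing: the in-memory state is gone.
        state = None

        # On restart, the daemon rebuilds its view from disk and would then
        # reconnect to the executors listed there.
        return recover(path)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the executors and tasks outlive the daemon: because their details were written to disk beforehand, a restarted daemon has enough information to find them again rather than starting from an empty view.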

So, by just refreshing the Mesos...