Apache Mesos Essentials

By Dharmesh Kakadia

Overview of this book

Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It allows developers to concurrently run the likes of Hadoop, Spark, Storm, and other applications on a dynamically shared pool of nodes. With Mesos, you have the power to manage a wide range of resources in a multi-tenant environment.

Starting with the basics, this book will give you an insight into all the features that Mesos has to offer. You will first learn how to set up Mesos in various environments, from data centers to the cloud. You will then learn how to implement a self-managed Platform as a Service (PaaS) environment with Mesos using various service schedulers, such as Chronos, Aurora, and Marathon. After that, you will delve into Mesos fundamentals and learn how to build distributed applications using Mesos primitives.

Finally, you will round things off by covering the operational aspects of Mesos, including logging, monitoring, high availability, and recovery.

An example Hadoop job


Let's run an example job that counts the frequency of each word across files:

  1. We need some input text to run the example. We will download Alice's Adventures in Wonderland from Project Gutenberg:

    ubuntu@master:~ $ wget http://www.gutenberg.org/files/11/11.txt -O /tmp/alice.txt
  2. Create an input directory on HDFS and put the downloaded file into it:

    ubuntu@master:~ $ bin/hadoop dfs -mkdir input
    ubuntu@master:~ $ bin/hadoop dfs -put /tmp/alice.txt input
  3. Run the Hadoop WordCount example from the hadoop-examples jar file, which is part of the Hadoop distribution (a sketch of the mapper and reducer logic behind this example appears after these steps):

    ubuntu@master:~ $ bin/hadoop jar hadoop-examples.jar wordcount input output
  4. The output of the program will be written to the output directory on HDFS. We can view it using the following command:

    ubuntu@master:~ $ bin/hadoop dfs -cat output/*

The output lists each word with its corresponding frequency, one pair per line, separated by a tab.
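For instance, a few lines of the output might look like the following (the words and counts shown here are purely illustrative, not the actual frequencies for this text):

    Alice   42
    Rabbit  17
    Queen   23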
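The wordcount program invoked in step 3 is the canonical MapReduce example: the mapper emits a count of 1 for every token it reads, and the reducer sums the counts for each word. For reference, here is a minimal sketch of such a job using the org.apache.hadoop.mapreduce API; it mirrors the WordCount example shipped with Hadoop, though the exact class names and details vary across Hadoop versions:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every token in each input line
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer: sums the counts emitted for each word
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // Using the reducer as a combiner computes partial sums on the
        // map side, reducing the data shuffled across the network
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Because word counting is both commutative and associative, the same class can safely serve as the combiner and the reducer, which is why the stock example registers IntSumReducer for both roles.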