MongoDB Cookbook

By: Amol Nayak
Overview of this book

MongoDB is a high-performance and feature-rich NoSQL database that forms the backbone of numerous complex development systems. You will certainly find the MongoDB solution you are searching for in this book.

Starting with how to initialize the server in three different modes with various configurations, you will then learn a variety of skills including the basics of advanced query operations and features in MongoDB and monitoring and backup using MMS. From there, you can delve into recipes on cloud deployment, integration with Hadoop, and improving developer productivity. By the end of this book, you will have a clear idea about how to design, develop, and deploy MongoDB.

Running MapReduce jobs on Hadoop using streaming


In the previous recipe, we implemented a simple MapReduce job using the Java API of Hadoop. The use case was the same as the one in the recipes of Chapter 3, Programming Language Drivers, where we saw MapReduce implemented using Mongo client APIs in Python and Java. In this recipe, we will use Hadoop streaming to implement MapReduce jobs.

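Hadoop streaming works by exchanging data with the mapper and reducer programs over stdin and stdout, so these programs can be written in any language that can read and write the standard streams. For more information on what Hadoop streaming is and how it works, see http://hadoop.apache.org/docs/r1.2.1/streaming.html.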
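
To make the stdin/stdout contract concrete, here is a minimal sketch of a generic streaming mapper and reducer in Python. It is not the code used in this recipe and does not use the mongo-hadoop connector. It assumes plain text input and tab-separated key/value lines, and the function names and the `map`/`reduce` command-line argument are hypothetical choices made for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'key<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for token in line.split():
            yield f"{token}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key.

    Assumes the input is already sorted by key, as it is when the framework
    sorts mapper output before handing it to the reducer, so records with
    the same key arrive on consecutive lines.
    """
    records = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(records, key=lambda rec: rec[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


# Hypothetical convention: run as `script.py map` or `script.py reduce`.
# With no argument nothing is read, so the functions can be imported or tested.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    sys.stdout.writelines(record + "\n" for record in step(sys.stdin))
```

Because both steps only read stdin and write stdout, the pipeline can be tried locally without Hadoop by piping a text file through the map step, then `sort`, then the reduce step.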

Getting ready

Refer to the Executing our first sample MapReduce job using the mongo-hadoop connector recipe to see how to set up Hadoop for development purposes and build the mongo-hadoop project using Gradle. We will install the required Python library from source. However, you can use pip to carry out the setup if you do not wish to build from source. We will also see how to...