Visual Studio 2013 and .NET 4.5 Expert Cookbook


Working with HDInsight (Hadoop) for Big Data processing


SQL Azure brings relational database technology to the Windows Azure platform. Sometimes, however, the data grows so large that a relational database cannot handle it, and sometimes the data that needs to be analyzed is not relational at all. Hadoop is a relatively recent technology that was introduced to help with such Big Data problems.

Hadoop is an open source Apache project. It stores data in the Hadoop Distributed File System (HDFS) and lets developers create MapReduce jobs to analyze that data. The main advantage of HDFS is that it stores data across multiple servers, so a MapReduce job can be split into chunks that run on those servers, letting Big Data be processed in parallel.
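To make the map, shuffle, and reduce idea concrete, here is a minimal sketch in plain Python. It does not use Hadoop or HDInsight at all. It only simulates the pattern in memory, with a thread pool standing in for the servers that would each process one chunk. The function names (`map_chunk`, `shuffle`, `reduce_counts`, `word_count`) and the `chunk_size` and `workers` parameters are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map phase: emit a (word, 1) pair for every word in one chunk of lines."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Shuffle phase: group all emitted values by key across every chunk."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, chunk_size=2, workers=4):
    # Split the input into chunks, standing in for blocks held on different servers.
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    # Each chunk is mapped independently, so it can be handed to a separate worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = ["big data big", "data is big", "hadoop stores data"]
    print(word_count(sample))
```

The point of the design is that `map_chunk` never looks outside its own chunk. That independence is what allows a real cluster to run the map step on whichever machine already holds the data, and to combine the partial results only in the reduce step.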

HDInsight is the name of the Apache Hadoop-based service on Windows Azure. HDInsight lets HDFS store data in clusters and distribute it across multiple virtual machines (VMs). It also spreads the MapReduce job across the VMs....