The Ultimate Hands-On Hadoop [Video]

By: Frank Kane

Overview of this book

Understanding Hadoop is a valuable skill for anyone at a company that works with large amounts of data. Companies such as Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo use Hadoop in some way to process huge volumes of data. This video course will familiarize you with the Hadoop ecosystem and help you understand how to apply Hadoop skills in the real world. The course starts by taking you through the installation of Hadoop on your desktop. Next, you will manage big data on a cluster with the Hadoop Distributed File System (HDFS) and MapReduce, and use Pig and Spark to analyze data on Hadoop. Moving along, you will learn how to store and query your data using applications such as Sqoop, Hive, MySQL, Phoenix, and MongoDB. You will then design real-world systems using the Hadoop ecosystem and learn how to manage clusters with Yet Another Resource Negotiator (YARN), Mesos, ZooKeeper, Oozie, Zeppelin, and Hue. Towards the end, you will learn techniques to handle and stream data in real time using Kafka, Flume, Spark Streaming, Flink, and Storm. By the end of this course, you will be well versed in the Hadoop ecosystem and will have the skills required to store, analyze, and scale big data using Hadoop. All the code and supporting files for this course are available at https://github.com/packtpublishing/the-ultimate-hands-on-hadoop
Table of Contents (12 chapters)
Chapter 1
Learning All the Buzzwords and Installing the Hortonworks Data Platform Sandbox
Section 1
Introduction and Installation of Hadoop
This video introduces Hadoop. You will learn how to install the Hortonworks Sandbox in a virtual machine on a PC, which is the quickest way to get up and running with Hadoop, so you can start learning and experimenting with it. You will also learn how to download some real movie ratings data and use Hive to analyze it.
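As a rough illustration of the kind of analysis described above, the sketch below counts how many ratings each movie received, similar in spirit to a Hive query that groups rows by movie ID. It is plain Python rather than Hive, and the sample rows, the column layout (user ID, movie ID, rating), and the function name are assumptions made for illustration, not material from the course.

```python
from collections import Counter

# Hypothetical sample rows: (user_id, movie_id, rating).
# These values are made up for illustration; they are not the course's data.
SAMPLE_RATINGS = [
    (1, 50, 5),
    (2, 50, 4),
    (3, 172, 5),
    (1, 172, 3),
    (4, 50, 5),
    (2, 133, 1),
]


def count_ratings_per_movie(ratings):
    """Return (movie_id, rating_count) pairs, most-rated movie first."""
    counts = Counter(movie_id for _user_id, movie_id, _rating in ratings)
    # Sort by count descending, then by movie_id ascending to break ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for movie_id, rating_count in count_ratings_per_movie(SAMPLE_RATINGS):
        print(movie_id, rating_count)
```

On a real cluster the same grouping and counting would be expressed as a query and run across the data in HDFS; the sketch only shows the shape of the computation on a handful of in-memory rows.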