Learning Hadoop 2

Chapter 2. Storage

After the overview of Hadoop in the previous chapter, we will now look at its component parts in more detail. We start at the conceptual bottom of the stack: the means and mechanisms for storing data within Hadoop. In particular, we will:

  • Describe the architecture of the Hadoop Distributed File System (HDFS)

  • Show what enhancements to HDFS have been made in Hadoop 2

  • Explore how to access HDFS using command-line tools and the Java API

  • Give a brief description of ZooKeeper—another (sort of) filesystem within Hadoop

  • Survey considerations for storing data in Hadoop and the available file formats

In Chapter 3, Processing – MapReduce and Beyond, we will describe how Hadoop provides the framework to allow data to be processed.