Mastering Hadoop

By: Sandeep Karanth

HDFS – advantages and drawbacks


HDFS has its advantages and drawbacks. Some of its advantages are as follows:

  • HDFS is inexpensive for two reasons. First, the filesystem relies on commodity storage disks, which cost much less than the storage media used for enterprise-grade storage. Second, the filesystem shares its hardware with the computation framework, in this case MapReduce. In addition, HDFS is open source and charges the user no licensing fees.

  • HDFS has been around for more than seven years and is considered a mature technology. It is backed by a large community, and a broad range of organizations store petabytes of data on it.

  • HDFS is optimized for MapReduce workloads. It provides very high performance for sequential reads and writes, which is the typical access pattern in MapReduce jobs.
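
The sequential access pattern mentioned in the last point can be illustrated with a minimal sketch. It does not use the HDFS API: a temporary local file stands in for a file stored on HDFS, and the names `stream_line_count`, `make_sample_file`, and `CHUNK_SIZE` are hypothetical, chosen only for this illustration. The reader makes a single front-to-back pass over the file in fixed-size chunks and never seeks backward, which is roughly how a map task consumes its input.

```python
import os
import tempfile

# Illustrative read size only; this is not an HDFS configuration value.
CHUNK_SIZE = 1024 * 1024  # 1 MiB


def stream_line_count(path, chunk_size=CHUNK_SIZE):
    """Count newline bytes in one front-to-back pass, one chunk at a time."""
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            count += chunk.count(b"\n")
    return count


def make_sample_file(num_lines):
    """Write a throwaway local file with num_lines records and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for i in range(num_lines):
            f.write(b"record %d\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file(1000)
    try:
        print(stream_line_count(sample))
    finally:
        os.remove(sample)  # clean up the temporary file
```

Because the loop only ever asks for the next chunk, the result does not depend on the chunk size; a larger chunk simply means fewer read calls for the same data.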

However, HDFS does not cater to all the data needs that may arise in an enterprise. Its main drawback is that it is not POSIX compliant. This means:

  • HDFS is immutable...