Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Using multiple disks/volumes and limiting HDFS disk usage


Hadoop lets you specify multiple directories as the DataNode data directory. This makes it possible to spread the data blocks of a DataNode across several disks or volumes. By default, the DataNode assigns new blocks to these directories in round-robin order, so each directory receives roughly the same amount of data. Hadoop also supports limiting the amount of disk space used by HDFS.
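One way to picture how data ends up spread evenly across the directories is round-robin placement, which is the DataNode's default volume-choosing policy. The following Python sketch is a toy model of that idea, not Hadoop's implementation: the `Volume`, `RoundRobinChooser`, and `place_blocks` names are hypothetical, and the free-space figures are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Volume:
    """Hypothetical stand-in for one DataNode data directory."""
    path: str
    free_bytes: int


class RoundRobinChooser:
    """Toy model of round-robin block placement across data directories."""

    def __init__(self, volumes: List[Volume]) -> None:
        if not volumes:
            raise ValueError("at least one volume is required")
        self.volumes = volumes
        self._next = 0  # index of the volume to try first on the next call

    def choose(self, block_size: int) -> Volume:
        # Try each volume at most once, starting from the rotating cursor,
        # and skip any volume that cannot hold the block.
        for _ in range(len(self.volumes)):
            vol = self.volumes[self._next]
            self._next = (self._next + 1) % len(self.volumes)
            if vol.free_bytes >= block_size:
                vol.free_bytes -= block_size
                return vol
        raise RuntimeError("no volume has enough free space for the block")


def place_blocks(chooser: RoundRobinChooser, n_blocks: int,
                 block_size: int) -> Dict[str, int]:
    """Place n_blocks blocks and return how many landed in each directory."""
    counts = {v.path: 0 for v in chooser.volumes}
    for _ in range(n_blocks):
        counts[chooser.choose(block_size).path] += 1
    return counts


if __name__ == "__main__":
    chooser = RoundRobinChooser([
        Volume("/u1/hadoop/data", 1000),  # illustrative capacities only
        Volume("/u2/hadoop/data", 1000),
    ])
    print(place_blocks(chooser, n_blocks=10, block_size=100))
```

With two equally sized volumes and ten blocks, each directory receives five blocks. If one volume fills up, the model skips it and the remaining blocks go to the other volume, which is why the distribution is only approximately even in practice.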

How to do it...

The following steps will show you how to add multiple disk volumes:

  1. Create HDFS data storage directories in each volume.

  2. Locate the hdfs-site.xml configuration file. Provide a comma-separated list of directories corresponding to the data storage locations in each volume under the dfs.datanode.data.dir property as follows:

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/u1/hadoop/data, /u2/hadoop/data</value>
    </property>
  3. To limit disk usage, add the following property to the hdfs-site.xml file to reserve space for non-DFS usage. The value specifies the number...
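Steps 1 and 2 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the recipe: `create_data_dirs` and `render_data_dir_property` are hypothetical helper names, the `hadoop/data` subdirectory mirrors the example paths, and temporary directories stand in for the real mount points `/u1` and `/u2`.

```python
import os
import tempfile
from xml.sax.saxutils import escape


def create_data_dirs(volume_roots, subdir=("hadoop", "data")):
    """Step 1: create one HDFS data storage directory under each volume root."""
    dirs = []
    for root in volume_roots:
        path = os.path.join(root, *subdir)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        dirs.append(path)
    return dirs


def render_data_dir_property(dirs):
    """Step 2: build the dfs.datanode.data.dir property for hdfs-site.xml."""
    value = ",".join(dirs)  # comma-separated list of data directories
    return (
        "<property>\n"
        "    <name>dfs.datanode.data.dir</name>\n"
        f"    <value>{escape(value)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Assumption: temporary directories replace the real volumes /u1 and /u2.
    with tempfile.TemporaryDirectory() as base:
        roots = [os.path.join(base, "u1"), os.path.join(base, "u2")]
        print(render_data_dir_property(create_data_dirs(roots)))
```

The printed fragment would still need to be pasted into the `<configuration>` element of hdfs-site.xml by hand; the sketch deliberately does not edit the configuration file itself.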