Hadoop supports specifying multiple directories for the DataNode data directory. This feature allows us to use multiple disks or volumes to store data blocks on a DataNode. Hadoop tries to store equal amounts of data in each directory. It also supports limiting the amount of disk space used by HDFS.
The following steps will show you how to add multiple disk volumes:
Create HDFS data storage directories in each volume.
Locate the hdfs-site.xml configuration file. Provide a comma-separated list of directories corresponding to the data storage locations in each volume under the dfs.datanode.data.dir property as follows:
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/u1/hadoop/data, /u2/hadoop/data</value>
</property>
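The two steps above can be sketched as a small script. This is only an illustration, not part of Hadoop: the helper names make_data_dirs and data_dir_property and the hadoop/data subpath default are assumptions made for the example, and temporary directories stand in for real mount points such as /u1 and /u2.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def make_data_dirs(volume_roots, subpath="hadoop/data"):
    """Create one data directory under each volume root and return the paths.

    Hypothetical helper: the subpath default mirrors the /u1/hadoop/data
    layout used in the text.
    """
    paths = []
    for root in volume_roots:
        path = os.path.join(root, subpath)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths


def data_dir_property(paths):
    """Render a <property> element for dfs.datanode.data.dir as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.datanode.data.dir"
    # The property takes a comma-separated list of directories.
    ET.SubElement(prop, "value").text = ",".join(paths)
    return ET.tostring(prop, encoding="unicode")


# Usage: throwaway directories stand in for the real volumes.
with tempfile.TemporaryDirectory() as u1, tempfile.TemporaryDirectory() as u2:
    dirs = make_data_dirs([u1, u2])
    print(data_dir_property(dirs))
```

On a real DataNode you would pass the actual mount points, paste the printed element into hdfs-site.xml, and make sure the directories are owned by the user that runs the DataNode process.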
To limit disk usage, add the following property to the hdfs-site.xml file to reserve space for non-DFS usage. The value specifies the number...