HDFS stores files across the cluster by breaking them into coarse-grained, fixed-size blocks. The default HDFS block size is 64 MB. The block size of a data product can affect the performance of filesystem operations: larger block sizes are more effective when you are storing and processing very large files. The block size can also affect the performance of MapReduce computations, because the default behavior of Hadoop is to create one Map task for each data block of the input files.
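As a rough sketch of the relationship described above, the number of Map tasks for a file is the number of blocks it occupies, i.e. the file size divided by the block size, rounded up. The 1 GB file size below is a hypothetical example:

```shell
# Number of Map tasks ~= ceil(file_size / block_size).
# Hypothetical 1 GB input file at the default 64 MB block size.
file_size=$((1024 * 1024 * 1024))   # 1 GB in bytes
block_size=$((64 * 1024 * 1024))    # 64 MB in bytes
map_tasks=$(( (file_size + block_size - 1) / block_size ))   # ceiling division
echo "$map_tasks"                    # prints 16
```

Doubling the block size to 128 MB would halve the number of Map tasks for the same file, which is why larger blocks suit very large files.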
The following steps show you how to use the NameNode configuration file to set the HDFS block size:
Add or modify the following code in the
$HADOOP_HOME/etc/hadoop/hdfs-site.xml
file. The block size is specified in bytes; the value below (134217728, that is, 128 MB) is only an example. This change does not alter the block size of files that are already in HDFS; only files copied after the change will use the new block size.
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value>
</property>
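Because the setting above only affects newly written files, it can be useful to override the block size for a single upload and then verify the result. A sketch, assuming a running HDFS cluster and a hypothetical file and destination path:

```shell
# Override the block size (in bytes) for one upload only; 134217728 = 128 MB.
# largefile.txt and /user/hadoop/ are hypothetical names.
hdfs dfs -D dfs.blocksize=134217728 -put largefile.txt /user/hadoop/

# Inspect the stored file's blocks to confirm the block size took effect.
hdfs fsck /user/hadoop/largefile.txt -files -blocks
```

The fsck report lists each block of the file along with its length, so you can confirm the new size without restarting any daemons.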