HBase is a database running on Hadoop, and like other databases, it keeps many files open at the same time. Linux limits the number of file descriptors that any one process may open; the default limit is 1024 per process. To run HBase smoothly, you need to increase the maximum number of open file descriptors for the user who starts HBase. In our case, the user is called hadoop.
You should also increase the nproc setting for the hadoop user. The nproc setting specifies the maximum number of processes that can exist simultaneously for the user. If nproc is too low, an OutOfMemoryError may occur.
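Before changing anything, both limits can be inspected together. A minimal sketch using the standard ulimit flags (-S for the soft, effective limit; -H for the hard ceiling); the exact values will vary by system:

```shell
# Show the current user's soft and hard limits for
# open files (-n) and maximum processes (-u).
echo "open files  soft: $(ulimit -Sn)  hard: $(ulimit -Hn)"
echo "max procs   soft: $(ulimit -Su)  hard: $(ulimit -Hu)"
```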
In this recipe, we will describe how to show and change these kernel settings.
You will need to make the following kernel setting changes on all servers of the cluster:
1. To confirm the current open file limit, log in as the hadoop user and execute the following command:

hadoop$ ulimit -n
1024
2. To show the setting for maximum processes, use the -u option of the ulimit command:

hadoop$ ulimit -u
unlimited
3. Log in as the root user to increase the open file and nproc limits. Add the following settings to the limits.conf file:

root# vi /etc/security/limits.conf
hadoop soft nofile 65535
hadoop hard nofile 65535
hadoop soft nproc 32000
hadoop hard nproc 32000
4. To apply the changes, add the following line to the /etc/pam.d/common-session file:

root# echo "session required pam_limits.so" >> /etc/pam.d/common-session
5. Log out and back in again as the hadoop user, and confirm the setting values again; you should see that the above changes have been applied:

hadoop$ ulimit -n
65535
hadoop$ ulimit -u
32000
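As an alternative to editing the file interactively in step 3, the same entries can be appended non-interactively. The sketch below writes to a temporary file so it is safe to run anywhere; on a real server you would target /etc/security/limits.conf (or a drop-in file under /etc/security/limits.d/) instead:

```shell
# Sketch: generate the four limits.conf entries non-interactively.
# A temporary file stands in for /etc/security/limits.conf here for safety.
conf=$(mktemp)
cat >> "$conf" <<'EOF'
hadoop soft nofile 65535
hadoop hard nofile 65535
hadoop soft nproc 32000
hadoop hard nproc 32000
EOF
grep -c '^hadoop' "$conf"    # prints 4: all four entries were written
rm -f "$conf"
```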
The previous settings change the hadoop user's open file limit to 65535, and its maximum number of processes to 32000. With these kernel setting changes, HBase can keep enough files open at the same time and run smoothly.
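A startup script can verify the limit before launching HBase. A minimal sketch, assuming the 65535 target set above (the variable names are illustrative, not part of HBase):

```shell
# Sketch: warn if the effective open file limit is below the target value.
required_nofile=65535
current_nofile=$(ulimit -Sn)
if [ "$current_nofile" != "unlimited" ] && [ "$current_nofile" -lt "$required_nofile" ]; then
    echo "WARNING: open file limit is $current_nofile; expected >= $required_nofile" >&2
else
    echo "open file limit OK ($current_nofile)"
fi
```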
Chapter 8, Basic Performance Tuning