In this recipe, we will look at network design for a Hadoop cluster and what to consider when planning one.
Make sure that you have a running cluster with HDFS and YARN, and that the cluster has at least two nodes.
Connect to the master1.cyrus.com Namenode and switch to the user hadoop.
Execute the following commands to check the link speed and other network settings:
$ ethtool eth0
$ iftop
$ netstat -s
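When the same check has to be repeated on many nodes, it can help to reduce the `ethtool` output to the two fields that matter most here. The following is a minimal sketch, not part of the original recipe: the function name `link_summary` is hypothetical, and it assumes the usual `Speed:` and `Duplex:` lines that `ethtool` prints for an interface.

```shell
#!/bin/sh
# Hypothetical helper: read ethtool output on stdin and print "<speed> <duplex>".
# Assumes the standard "Speed: ..." and "Duplex: ..." lines are present.
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/  { speed = $2 }   # for example 1000Mb/s
        /^[[:space:]]*Duplex:/ { duplex = $2 }  # for example Full
        END { print speed, duplex }
    '
}
```

It would be used as `ethtool eth0 | link_summary`, where `eth0` is the interface name from the commands above. A node that reports a lower speed than its peers, or half duplex, is worth investigating before blaming Hadoop for slow transfers.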
Always have a separate network for Hadoop traffic by using VLANs.
Ensure that DNS resolution works for both forward and reverse lookups.
Run a caching-only DNS within the Hadoop network, which caches records for faster resolution.
Consider NIC teaming or bonding for better performance.
Use dedicated core switches and top-of-rack switches.
Consider having static IPs per node in the cluster.
Disable IPv6 for all nodes and just use IPv4.
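Two of the recommendations above, DNS resolution and IPv6, can be verified from the shell on each node. The sketch below is an illustration under stated assumptions rather than the recipe's own procedure: the function names `check_dns` and `ipv6_disabled` are hypothetical, name resolution is assumed to go through `getent hosts`, and the IPv6 state is assumed to be exposed through the Linux `net.ipv6.conf.all.disable_ipv6` sysctl key.

```shell
#!/bin/sh
# Hypothetical helper: verify that a hostname resolves to an IP address
# and that the same IP address resolves back to that hostname.
check_dns() {
    host=$1
    # Forward lookup: the first field of the getent output is the IP address.
    ip=$(getent hosts "$host" | awk 'NR == 1 { print $1 }')
    if [ -z "$ip" ]; then
        echo "FAIL forward: $host"
        return 1
    fi
    # Reverse lookup: the second field is the canonical name for that address.
    name=$(getent hosts "$ip" | awk 'NR == 1 { print $2 }')
    if [ "$name" = "$host" ]; then
        echo "OK $host <-> $ip"
    else
        echo "FAIL reverse: $ip -> ${name:-none}, expected $host"
        return 1
    fi
}

# Hypothetical helper: succeed only if IPv6 is disabled on all interfaces.
# A value of 1 for this sysctl key means IPv6 is disabled.
ipv6_disabled() {
    [ "$(sysctl -n net.ipv6.conf.all.disable_ipv6 2>/dev/null)" = "1" ]
}
```

For example, `check_dns master1.cyrus.com` reports whether the Namenode's name and address agree in both directions, and `ipv6_disabled && echo "IPv6 is off"` confirms the IPv6 setting. Running both against every hostname in the cluster is a quick way to catch nodes that were missed.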
Increasing the size of the cluster will mean more connections and more data across nodes...