We will discuss the different types of nodes in the Hadoop ecosystem, along with their roles and usage:
- NameNode: The NameNode is an important part of an HDFS file system. It keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. It does not store the data of these files itself. Client applications communicate with the NameNode whenever they need to locate a file or want to modify one. The NameNode records these modifications as a log appended to a native file system file, edits. When a NameNode starts up, it reads the HDFS state from an image file, fsimage, and then applies the changes recorded in the edits log file.
- Secondary NameNode: The Secondary NameNode's whole purpose is to create checkpoints in HDFS. It is just a helper node for the NameNode: it periodically merges the fsimage and edits log files, which keeps the size of the edits log within a limit.
- DataNode: A DataNode stores data in HDFS....
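
The startup and checkpoint behavior described for the NameNode and Secondary NameNode can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hadoop's actual implementation or on-disk format: the function names (`apply_edit`, `load_namespace`, `checkpoint`), the tuple-based edit records, and the `blk_…` block identifiers are all hypothetical and chosen for illustration. The fsimage is modeled as a plain dictionary mapping file paths to block IDs, and the edits log as a list of operations.

```python
# Toy model of fsimage + edits replay. All names and data shapes here are
# hypothetical; real HDFS uses its own binary formats and Java classes.

def apply_edit(namespace, edit):
    """Apply one logged modification to the in-memory namespace."""
    op, path, *args = edit
    if op == "create":
        # args[0] is the list of block IDs that make up the new file
        namespace[path] = list(args[0]) if args else []
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        # args[0] is the new path
        namespace[args[0]] = namespace.pop(path)
    else:
        raise ValueError(f"unknown edit operation: {op}")


def load_namespace(fsimage, edits):
    """Mimic NameNode startup: read the image, then replay the edits log."""
    namespace = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(namespace, edit)
    return namespace


def checkpoint(fsimage, edits):
    """Mimic a checkpoint: merge edits into a new image and empty the log."""
    new_fsimage = load_namespace(fsimage, edits)
    return new_fsimage, []


fsimage = {"/logs/a.txt": ["blk_1"]}
edits = [
    ("create", "/logs/b.txt", ["blk_2", "blk_3"]),
    ("rename", "/logs/a.txt", "/logs/old.txt"),
    ("delete", "/logs/b.txt"),
]

state = load_namespace(fsimage, edits)
new_fsimage, new_edits = checkpoint(fsimage, edits)
print(state)        # namespace after startup replay
print(new_fsimage)  # merged image after the checkpoint
print(new_edits)    # edits log is empty after the checkpoint
```

In this model, restarting from `new_fsimage` with the empty log yields the same namespace as replaying the full log against the old image, which is why periodic merging keeps the edits log small without changing the resulting state.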