Scaling Big Data with Hadoop and Solr, Second Edition

By: Hrishikesh Vijay Karambelkar

Common problems and their solutions


The following is a list of common problems and their solutions:

  • When I try to format the HDFS node, I get the exception java.io.IOException: Incompatible clusterIDs in namenode and datanode. How do I fix this?

    This issue usually appears when you already have a different or older cluster and you format a new namenode while the datanodes still point to the old cluster ID. It can be handled in one of the following ways:

    1. By deleting the DFS data folder (its location is configured in hdfs-site.xml) and restarting the cluster. Note that this discards the blocks stored on that datanode.

    2. By modifying the VERSION file of the datanode, usually located at <HDFS-STORAGE-PATH>/hdfs/datanode/current/, so that its clusterID matches that of the namenode

    3. By formatting the namenode with the problematic datanode's cluster ID:

        $ hdfs namenode -format -clusterId <cluster-id>
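
Before choosing one of the options above, it can help to confirm that the two cluster IDs really differ. The following is a minimal sketch, assuming each VERSION file is a plain key=value file containing a clusterID line; the function names cluster_id_of and cluster_ids_match are hypothetical helpers, not part of Hadoop.

```shell
# Print the clusterID recorded in an HDFS VERSION file.
# $1: path to a VERSION file (namenode or datanode "current" directory)
cluster_id_of() {
    sed -n 's/^clusterID=//p' "$1"
}

# Show both cluster IDs and succeed only if they are non-empty and equal.
# $1: namenode VERSION file, $2: datanode VERSION file
cluster_ids_match() {
    nn_id=$(cluster_id_of "$1")
    dn_id=$(cluster_id_of "$2")
    echo "namenode: $nn_id"
    echo "datanode: $dn_id"
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

A possible invocation is cluster_ids_match <namenode-VERSION-file> <HDFS-STORAGE-PATH>/hdfs/datanode/current/VERSION, where the namenode path depends on your configuration. If the IDs differ, the datanode value is the one to pass to -clusterId in option 3.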
      
  • My Hadoop instance is not starting up with the ./start-all.sh script, and when I try to access the web application, it shows a page not found error. What should I do?

    There can be several causes. To identify the one that applies, look at the Hadoop logs first. Typically, Hadoop logs can be accessed from the /var/log folder if the precompiled binaries are installed as the root user. Otherwise, they are available inside the Hadoop installation folder.
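
As a starting point for reading the logs, the sketch below prints the last few error lines from every .log file in a directory. It assumes the daemons write files ending in .log and that problems show up as lines containing ERROR, FATAL, or Exception; the function name recent_log_errors and the default of 20 lines are hypothetical choices.

```shell
# Show the most recent error lines from each Hadoop log file in a directory.
# $1: log directory (for example a folder under /var/log, or the logs folder
#     inside the Hadoop installation; the exact path is an assumption)
# $2: maximum number of matching lines to show per file (default 20)
recent_log_errors() {
    log_dir=$1
    limit=${2:-20}
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue          # skip if the pattern matched nothing
        echo "== $f"
        grep -E 'ERROR|FATAL|Exception' "$f" | tail -n "$limit"
    done
}
```

For example, recent_log_errors "$HADOOP_HOME/logs" 10 would scan the logs folder of an installation whose location is stored in HADOOP_HOME.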

  • I have set up an N-node cluster, and I am starting the Hadoop cluster with ./start-all.sh. Why am I not seeing all of the nodes in the YARN/NameNode web application?

    This, again, can have multiple causes. You need to verify the following:

    1. Can you reach (connect to) each of the cluster nodes from the namenode by using the IP address/machine name? If not, you need to add an entry for the node in the /etc/hosts file.

    2. Is the ssh login working without a password? If not, you need to put the authorization keys in place to allow passwordless logins.

    3. Is the datanode/nodemanager running on each of the nodes, and can each node connect to the namenode/ResourceManager? You can validate this by running ssh from the node to the machine running the namenode/ResourceManager.

    4. If all of these checks pass, you need to check the logs for exceptions, as explained in the previous question.

    5. Based on the errors/exceptions in the logs, take the specific action they call for.
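
The first three checks above can be sketched as a small script run from the master node. This is an illustration under several assumptions: a Linux host with getent available, jps on the PATH of non-interactive ssh sessions, and worker daemons that appear in jps output as DataNode and NodeManager. The function names missing_daemons and check_node are hypothetical.

```shell
# Read `jps` output on stdin and print each expected worker daemon that is absent.
missing_daemons() {
    jps_output=$(cat)
    for daemon in DataNode NodeManager; do
        echo "$jps_output" | grep -qw "$daemon" || echo "$daemon"
    done
}

# Run checks 1 to 3 against one worker node.
# $1: hostname of the worker
check_node() {
    host=$1
    # Check 1: the name must resolve (an /etc/hosts entry satisfies this).
    getent hosts "$host" >/dev/null \
        || { echo "$host: cannot resolve"; return 1; }
    # Check 2: BatchMode makes ssh fail instead of prompting for a password.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null \
        || { echo "$host: passwordless ssh failed"; return 1; }
    # Check 3: list the Java processes on the worker and look for the daemons.
    missing=$(ssh -o BatchMode=yes "$host" jps | missing_daemons)
    [ -z "$missing" ] || { echo "$host: not running:" $missing; return 1; }
    echo "$host: ok"
}
```

Calling check_node once per worker hostname gives a quick picture of which nodes need attention; for any node that fails, the logs on that node remain the authoritative source for the cause.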