In this recipe, we will look at how to recover deleted files from a Hadoop cluster. What if a user deletes a critical file with the -skipTrash option? Can it be recovered?
This recipe is a best-effort attempt to restore files after deletion. When the delete command is executed, the Namenode updates its metadata in the edits file and then fires the invalidate command to remove the blocks. If the cluster is very busy, the invalidation might take some time, which leaves a window in which the files can be recovered. On an idle cluster, however, the Namenode fires the invalidate command immediately in response to the next Datanode heartbeat, and because the Datanode has no pending operations, it deletes the blocks right away.
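To make the timing argument concrete, the following is a minimal toy model written in Python. It is not Hadoop code: the class names, the block IDs, the file path, and the idea of counting time in "ticks" are all assumptions made purely for illustration. It only mimics the sequence described above, in which a delete is recorded in the edits log, the blocks are queued for invalidation, and the Datanode picks up the invalidate commands on its heartbeat and works through them after whatever it already has queued.

```python
from collections import deque


class ToyNamenode:
    """Hypothetical stand-in for the Namenode; not the real HDFS API."""

    def __init__(self):
        self.edits = []                    # stand-in for the edits file
        self.pending_invalidate = deque()  # blocks waiting to be invalidated

    def delete(self, path, blocks):
        # Step 1: record the deletion in the metadata (edits).
        self.edits.append(("DELETE", path))
        # Step 2: queue the file's blocks for invalidation.
        self.pending_invalidate.extend(blocks)

    def heartbeat_response(self):
        # Invalidate commands ride back on the heartbeat response.
        commands = list(self.pending_invalidate)
        self.pending_invalidate.clear()
        return commands


class ToyDatanode:
    """Hypothetical stand-in for a Datanode holding block replicas."""

    def __init__(self, blocks, pending_ops=0):
        self.blocks = set(blocks)
        # Work already queued on a busy node; an idle node has none.
        self.work = deque(("other", None) for _ in range(pending_ops))

    def heartbeat(self, namenode):
        for block in namenode.heartbeat_response():
            self.work.append(("invalidate", block))

    def tick(self):
        # Process one queued operation per tick.
        if self.work:
            op, block = self.work.popleft()
            if op == "invalidate":
                self.blocks.discard(block)


def ticks_until_first_block_lost(pending_ops):
    """Return how many ticks pass before the first block is removed."""
    namenode = ToyNamenode()
    datanode = ToyDatanode({"blk_1", "blk_2"}, pending_ops)
    namenode.delete("/data/critical.txt", ["blk_1", "blk_2"])
    datanode.heartbeat(namenode)
    ticks = 0
    while len(datanode.blocks) == 2:
        datanode.tick()
        ticks += 1
    return ticks


idle_window = ticks_until_first_block_lost(pending_ops=0)
busy_window = ticks_until_first_block_lost(pending_ops=50)
print("idle:", idle_window, "busy:", busy_window)
```

In this sketch the idle node loses its first block on the very first tick, while the busy node still holds every block until its backlog is drained. That gap is the recovery window this recipe relies on; on a real cluster its length depends on the actual load and configuration and cannot be predicted from a model like this.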
Make sure that you have a running cluster with at least HDFS configured and working correctly.