In this recipe, we are going to see how to recover deleted data from the trash to HDFS.
To recover accidently deleted data from HDFS, we first need to enable the trash folder, which is not enabled by default in HDFS. This can be achieved by adding the following property to core-site.xml
:
<property> <name>fs.trash.interval</name> <value>120</value> </property>
Then, restart the HDFS daemons:
/usr/local/hadoop/sbin/stop-dfs.sh /usr/local/hadoop/sbin/start-dfs.sh
This will set the deleted file retention to 120 minutes.
Now, let's try to delete a file from HDFS:
hadoop fs -rmr /LICENSE.txt 15/10/30 10:26:26 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 120 minutes, Emptier interval = 0 minutes. Moved: 'hdfs://localhost:9000/LICENSE.txt' to trash at: hdfs://localhost:9000/user...