Big Data Forensics: Learning Hadoop Investigations

The Hadoop shell command collection


Collecting HDFS data from within the Hadoop layer avoids many of the problems that affect host operating system collections. First, the collection needs to be performed from only a single machine. A Hadoop client's command line provides access to all HDFS files, so there is no need to collect data from each node individually. Second, the data is collected as complete logical Hadoop files, so no file carving or data reconstruction is required in the analysis phase.
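As a rough illustration of a single-machine collection, the following Python sketch wraps the standard `hdfs dfs -get` shell command and then hashes the resulting local copies. It is a minimal sketch under stated assumptions, not a prescribed procedure: the function names are hypothetical, the `hdfs` client is assumed to be on the PATH of a machine with cluster access, and hashing the local copies after the transfer is a step added here for illustration.

```python
import hashlib
import subprocess
from pathlib import Path


def build_get_command(hdfs_path, local_dir):
    """Return the argument list for copying an HDFS path to local storage."""
    # `hdfs dfs -get <src> <localdst>` copies logical HDFS files to the
    # local file system of the client machine.
    return ["hdfs", "dfs", "-get", hdfs_path, local_dir]


def collect(hdfs_path, local_dir):
    """Copy an HDFS path to local_dir (assumes a working `hdfs` client)."""
    Path(local_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the Hadoop command fails.
    subprocess.run(build_get_command(hdfs_path, local_dir), check=True)


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_collected_files(local_dir):
    """Map each collected file (relative path) to its MD5 digest."""
    root = Path(local_dir)
    return {
        str(path.relative_to(root)): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

For example, calling `collect("/data/example", "/mnt/evidence/hdfs")` followed by `hash_collected_files("/mnt/evidence/hdfs")` would copy one HDFS directory and record a digest for every local copy. Both paths are placeholders chosen for this sketch.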

Collecting HDFS data from the Hadoop shell command line has the following limitations:

  • This method is only possible when Hadoop is online and its command line is accessible

  • Forensic tools such as dd and md5sum cannot easily be used during the collection of the data

  • Deleted data and data in memory that has not been written to disk may not be available

  • Hadoop...