Big Data Forensics: Learning Hadoop Investigations

Collecting Hive evidence


Hive is a platform for analyzing data. It uses a familiar SQL-like query language, so there is no need to write Java code for MapReduce functions. Hive operates like a database and stores all of its metadata in a database, so querying it should feel familiar to anyone with experience working with relational databases. Hive has several important components that are critical to understand for investigations:

  • Hive Data Storage: The type and location of data stored and accessed by Hive, which includes HDFS, Amazon S3, and other locations

  • Metastore: The database that contains the metadata for Hive data (stored outside HDFS)

  • HiveQL: The Hive query language, which is a SQL-like language

  • Databases and Tables: The logical containers of Hive data

  • Hive Shell: The shell interpreter for HiveQL

  • Hive Clients: The mechanisms for connecting to a Hive server, such as Hive Thrift clients, Java Database Connectivity (JDBC) clients, and Open Database Connectivity (ODBC) clients
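As a rough illustration of how the HiveQL, Databases and Tables, and Hive Shell components fit together during collection, the following Python sketch builds a list of metadata-gathering HiveQL statements that an investigator could review and then run through the Hive shell or a Hive client. It is a minimal sketch under stated assumptions, not a prescribed procedure: the function names `check_identifier` and `metadata_queries`, and the example database and table names, are hypothetical. The statements themselves (`SHOW DATABASES`, `SHOW TABLES IN`, and `DESCRIBE FORMATTED`) are standard HiveQL.

```python
import re
from typing import Dict, List

# Assumption for this sketch: only plain alphanumeric/underscore names are accepted,
# so that no quoting or escaping of identifiers is needed.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def check_identifier(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def metadata_queries(tables_by_database: Dict[str, List[str]]) -> List[str]:
    """Build HiveQL statements that list databases and tables and describe each table.

    tables_by_database maps a database name to the table names of interest
    (hypothetical input, for example taken from an earlier SHOW TABLES run).
    """
    queries = ["SHOW DATABASES;"]
    for database, tables in tables_by_database.items():
        db = check_identifier(database)
        queries.append(f"SHOW TABLES IN {db};")
        for table in tables:
            # DESCRIBE FORMATTED reports columns plus storage details such as location.
            queries.append(f"DESCRIBE FORMATTED {db}.{check_identifier(table)};")
    return queries


if __name__ == "__main__":
    # Hypothetical database and table names, used only to show the output shape.
    for statement in metadata_queries({"sales": ["orders", "customers"]}):
        print(statement)
```

Generating the statements separately from executing them lets the investigator record exactly which queries were issued against the system, which may help when documenting the collection.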

Hive stores record-based data in files...