Mastering Hadoop

By: Sandeep Karanth

The data model


Hive data is organized into databases. A database is a logical collection of Hive tables and provides a namespace for them. A table that is not created in a specific database belongs to the default database. Creating a database creates an HDFS directory that holds the files of the database, and this directory serves as the namespace for its tables. For example, the CREATE DATABASE MasteringHadoop command creates a database named MasteringHadoop. When we list the HDFS directory structure, we see a directory created for this database, as shown:

drwxr-xr-x   - sandeepkaranth supergroup          0 2014-05-15 08:55 /user/hive/warehouse/masteringhadoop.db
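The naming convention visible in this listing can be modeled with a minimal sketch. It is not part of Hive. It assumes the warehouse root is /user/hive/warehouse, as in the listing (the location is configurable through hive.metastore.warehouse.dir), and that managed tables use their default locations. The helper names database_dir and table_dir, and the table name Sales, are hypothetical and introduced only for illustration.

```python
import posixpath

# Assumption: the warehouse root seen in the listing above. A cluster may
# configure a different value for hive.metastore.warehouse.dir.
WAREHOUSE_DIR = "/user/hive/warehouse"


def database_dir(database: str, warehouse_dir: str = WAREHOUSE_DIR) -> str:
    """Return the HDFS directory a Hive database maps to by default."""
    # Hive treats database names as case-insensitive; the directory name is lowercase.
    name = database.lower()
    if name == "default":
        # Tables of the default database sit directly under the warehouse root.
        return warehouse_dir
    return posixpath.join(warehouse_dir, name + ".db")


def table_dir(database: str, table: str, warehouse_dir: str = WAREHOUSE_DIR) -> str:
    """Return the default directory of a managed table inside a database."""
    return posixpath.join(database_dir(database, warehouse_dir), table.lower())


# The database from the text, and a hypothetical table named Sales inside it.
print(database_dir("MasteringHadoop"))
print(table_dir("MasteringHadoop", "Sales"))
```

The first call reproduces the path shown in the listing. The sketch only mirrors default locations: a database or table created with an explicit location does not follow this pattern.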

A table is the basic unit of data storage, similar to a table in a traditional RDBMS. It logically groups records of the same type. Records are rows whose values correspond to typed columns. A table maps to a single directory within HDFS. Hive also allows a structure to be imposed on data in existing locations via external tables. Metadata stored...