Book Image

Mastering Hadoop

By : Sandeep Karanth
Book Image

Mastering Hadoop

By: Sandeep Karanth

Overview of this book

Table of Contents (21 chapters)
Mastering Hadoop
About the Author
About the Reviewers

The data model

Hive data is organized as databases. A database is a logical collection of Hive tables. A database within Hive assigns a namespace for its tables. If no namespace is assigned to Hive tables, it belongs to the default namespace. The creation of a database results in the creation of an HDFS directory for the files in the database. This directory serves as the namespace for the tables. The CREATE DATABASE MasteringHadoop command creates a MasteringHadoop database. When we list the HDFS directory structure, we see a directory created for this database, as shown:

drwxr-xr-x   - sandeepkaranth supergroup          0 2014-05-15 08:55 /user/hive/warehouse/masteringhadoop.db

A table is the basic unit of data storage similar to traditional RDBMS. It logically groups records of the same type. Records are rows corresponding to typed columns. A table maps to a single directory within HDFS. Hive also allows imposing structures on existing data locations via external tables. Metadata stored...