Mastering Hadoop

By: Sandeep Karanth

Chapter 4. Advanced Hive

SQL is a popular data-processing language that has been around for four decades, and scores of people are already familiar with relational data stores and SQL. A natural step in onboarding more users onto Hadoop is to flatten the learning curve by bringing in concepts they are already well versed in. Hive introduces relational and SQL concepts into Hadoop MapReduce. In the chapter on Pig, you saw the advanced usage of Pig scripts to author MapReduce workflows. In this chapter, we will delve into the advanced usage of Hive.

Apache Hive is often described as a data warehouse infrastructure. Traditionally, business intelligence is gathered from a data warehouse, a database that stores both historical and current data from many sources within an enterprise. This data store is primarily queried for reporting and analytics. The infrastructure underlying data warehouses has traditionally consisted of relational databases (RDBMS), and...