Learning Hadoop 2

Summary


In its early days, Hadoop was sometimes erroneously seen as the latest supposed relational database killer. Over time, it has become more apparent that the more sensible approach is to view it as a complement to RDBMS technologies and that, in fact, the RDBMS community has developed tools such as SQL that are also valuable in the Hadoop world.

HiveQL is an implementation of SQL on Hadoop and was the primary focus of this chapter. In regard to HiveQL and its implementations, we covered the following topics:

  • How HiveQL provides a logical model atop data stored in HDFS, in contrast to relational databases, where the table structure is enforced in advance

  • How HiveQL supports many standard SQL data types and commands including joins and views

  • The ETL-like features offered by HiveQL, including the ability to import data into tables and optimize the table structure through partitioning and similar mechanisms

  • How HiveQL offers the ability to extend its core set of operators with user-defined code...
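
The first point above, a logical model applied on top of data that is already stored, can be illustrated outside Hive with a minimal Python sketch. It is not Hive code: the `SCHEMA` list, the `read_row` helper, and the sample lines are all hypothetical names invented for this illustration. The sketch assumes tab-delimited text and mirrors the way Hive yields NULL for a field that is missing or cannot be converted to the declared column type, rather than rejecting the data when it is written.

```python
# Hypothetical schema: (column name, converter) pairs, playing the role
# of a table definition that is applied only when the data is read.
SCHEMA = [
    ("user_id", int),
    ("country", str),
    ("amount", float),
]


def read_row(line, delimiter="\t"):
    """Apply SCHEMA to one raw text line at read time.

    Fields that are missing or fail conversion become None, similar to
    the NULL a schema-on-read system returns for such values.
    """
    fields = line.rstrip("\n").split(delimiter)
    row = {}
    for index, (name, convert) in enumerate(SCHEMA):
        try:
            row[name] = convert(fields[index])
        except (IndexError, ValueError):
            row[name] = None
    return row


# Raw lines as they might sit in a file; nothing validated them on write.
RAW_LINES = [
    "1\tUK\t9.99",
    "2\tUS\tnot-a-number",  # third field does not parse as a float
    "3\tFR",                # third field is missing entirely
]

rows = [read_row(line) for line in RAW_LINES]
for row in rows:
    print(row)
```

The stored lines are never modified or rejected; changing `SCHEMA` changes how the same bytes are interpreted. A relational database would instead refuse the second and third lines at insert time.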