Apache Hive Essentials

By: Dayong Du
The INNER JOIN statement


Hive JOIN combines rows from two or more tables. Hive supports the common JOIN operations found in an RDBMS, such as JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, and CROSS JOIN. However, Hive only supports equi-joins (joins on equality conditions) and not non-equi joins, because non-equality conditions are difficult to convert into MapReduce jobs.
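As a rough illustration of the equi-join restriction described above, the following sketch contrasts an equality-based join condition with a non-equality one. The tables `t_left` and `t_right` and their columns `id` and `val` are hypothetical names used only for this example; they do not appear elsewhere in this chapter.

    -- Hypothetical tables, assumed to exist only for this sketch:
    --   t_left(id INT, val STRING), t_right(id INT, val STRING)

    -- Equi-join: the ON clause compares the join keys with "=".
    SELECT l.id, l.val, r.val
    FROM t_left l
    JOIN t_right r ON l.id = r.id;

    -- Non-equi join: the ON clause uses a comparison other than "=".
    -- Per the restriction described above, this form is the one Hive rejects,
    -- so it is left commented out here.
    -- SELECT l.id, r.id
    -- FROM t_left l
    -- JOIN t_right r ON l.id < r.id;

    -- One possible workaround: keep an equality in the ON clause
    -- and apply the non-equality test as a filter in WHERE.
    SELECT l.id, l.val, r.val
    FROM t_left l
    JOIN t_right r ON l.id = r.id
    WHERE l.val <> r.val;

The workaround applies only when the two tables still share a key that can be compared for equality; it is not a general replacement for a non-equi join.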

The INNER JOIN in Hive uses the JOIN keyword and returns only the rows from the left and right tables that meet the JOIN condition. Since Hive 0.13.0, the JOIN keyword can also be omitted by listing the table names separated by commas. The following examples show various inner JOIN statements in Hive:

  • Prepare another table to join and load data:

    jdbc:hive2://> CREATE TABLE IF NOT EXISTS employee_hr
    . . . . . . .> (
    . . . . . . .>   name string,
    . . . . . . .>   employee_id int,
    . . . . . . .>   sin_number string,
    . . . . . . .>   start_date date
    . . . . . . .> )
    . . . . . . .> ROW FORMAT DELIMITED
    . . . . . ...