Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Performing a join with Hive


This recipe shows how to use Hive to perform a join across two datasets. The first dataset is the book details dataset of the Book-Crossing database, and the second is the reviewer ratings for those books. The recipe uses Hive to find the authors with the largest number of ratings above 3 stars.
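To make the shape of this computation concrete before running it in Hive, the following is a minimal sketch of the same join, filter, and count-per-author logic, run against an in-memory SQLite database from Python rather than against Hive. The tiny sample rows and the column names (`isbn`, `author`, `rating`) are assumptions for illustration only; they are not taken from the Book-Crossing schema created in the earlier recipe, and the SQL is standard SQL as accepted by SQLite, not necessarily the exact HiveQL used in this recipe.

```python
import sqlite3

# Hypothetical miniature stand-ins for the books and ratings tables.
# Column names (isbn, author, rating) are assumptions for illustration.
BOOKS = [
    ("111", "Author A"),
    ("222", "Author A"),
    ("333", "Author B"),
]
RATINGS = [
    ("111", 5),
    ("111", 2),
    ("222", 4),
    ("333", 4),
    ("333", 3),
]

# Join the two tables on ISBN, keep only ratings above 3,
# then count the surviving ratings per author.
JOIN_QUERY = """
SELECT b.author AS author, COUNT(*) AS cnt
FROM books b
JOIN ratings r ON b.isbn = r.isbn
WHERE r.rating > 3
GROUP BY b.author
ORDER BY cnt DESC, author
"""


def top_authors():
    """Return (author, count) rows, most highly rated authors first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE books (isbn TEXT, author TEXT)")
        conn.execute("CREATE TABLE ratings (isbn TEXT, rating INTEGER)")
        conn.executemany("INSERT INTO books VALUES (?, ?)", BOOKS)
        conn.executemany("INSERT INTO ratings VALUES (?, ?)", RATINGS)
        return conn.execute(JOIN_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for author, cnt in top_authors():
        print(author, cnt)
```

In this sketch only ratings strictly greater than 3 are counted, so a book with many low ratings does not raise its author's rank.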

Getting ready

Complete the previous recipe, Hive batch mode – using a query file, before you start.

How to do it...

This section demonstrates how to perform a join using Hive. Proceed with the following steps:

  1. Start the Hive CLI and use the Book-Crossing database:

    $ hive
    hive > USE bookcrossing;
    
  2. Create the books and book ratings tables by executing the create-book-crossing.hql Hive query file, as described in the previous Hive batch mode – using a query file recipe. Then use the following commands to verify that those tables exist in the Book-Crossing database:

    hive > SELECT * FROM books LIMIT 10;
    ….
    hive > SELECT * FROM RATINGS LIMIT 10;
    ….
    
  3. Now, we...