Hadoop MapReduce v2 Cookbook - Second Edition: RAW


Accessing Hive table data in Pig using HCatalog


We may want to access the same dataset from both Hive and Pig, or use Pig to process the results of a Hive query that are mapped to a Hive table. In such cases, we can take advantage of the HCatalog integration in Pig to access HCatalog-managed Hive tables from Pig without worrying about the data definition, the data storage format, or the storage location.
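
As a rough illustration of what this integration looks like in practice, here is a minimal Pig Latin sketch. It assumes a Grunt shell started with the -useHCatalog flag, a Hive table named users in the current default database, and a Hive release in which the loader class lives at org.apache.hive.hcatalog.pig.HCatLoader (older HCatalog releases used org.apache.hcatalog.pig.HCatLoader). The relation name is hypothetical, and this is not necessarily the exact command the recipe uses.

```pig
-- Assumption: the shell was started with `pig -useHCatalog`,
-- so the HCatalog JARs are already on the classpath.
-- Assumption: a Hive table named 'users' exists; prefix it with a
-- database name (for example 'mydb.users') if it is not in the default database.
users = LOAD 'users' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- No AS (...) schema clause and no file path are needed:
-- HCatalog supplies the column names, types, and storage location.
DESCRIBE users;
```

Note that the LOAD statement names only the table. The schema, file format, and location all come from the Hive metastore, which is what lets the same table be shared between Hive and Pig.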

Getting ready

Follow the "Hive batch mode - using a query file" recipe from Chapter 6, "Hadoop Ecosystem – Apache Hive", to create the Hive table that we'll be using in this recipe.

How to do it...

This section demonstrates how to access a Hive table from Pig. Proceed with the following steps:

  1. Start Pig's Grunt shell with the -useHCatalog flag, as follows. This loads the HCatalog JARs that are necessary to access HCatalog-managed tables in Hive:

    $ pig -useHCatalog
    
  2. Use the following command in the Grunt shell to load the users table...