Sometimes we want to access the same dataset from both Hive and Pig, or to use Pig to process the results of a Hive query that are stored in a Hive table. In such cases, we can take advantage of the HCatalog integration in Pig to access HCatalog-managed Hive tables from Pig without worrying about the data definition, data storage format, or storage location.
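As a rough illustration of what this integration looks like in practice, the following sketch writes a two-line Pig script and runs it in batch mode. It assumes that a Hive table named users already exists in the default database and that the installed HCatalog ships its loader as org.apache.hive.hcatalog.pig.HCatLoader (older standalone HCatalog releases used the org.apache.hcatalog.pig package instead). The file name load_users.pig is hypothetical.

```shell
# Write a small Pig script that reads a Hive table through HCatalog.
# Assumption: a Hive table named 'users' exists in the default database.
cat > load_users.pig <<'EOF'
-- No schema, storage format, or location is given here;
-- HCatLoader obtains all of them from the Hive metastore.
users = LOAD 'users' USING org.apache.hive.hcatalog.pig.HCatLoader();
DESCRIBE users;
EOF

# Run the script with the HCatalog JARs on Pig's classpath.
# The guard only skips this step on machines where Pig is not installed.
if command -v pig >/dev/null 2>&1; then
  pig -useHCatalog load_users.pig
else
  echo "pig not found on PATH; wrote load_users.pig only"
fi
```

Note that the LOAD statement carries no AS clause and no file path: the column names and types reported by DESCRIBE come from the table definition that HCatalog manages.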
Follow the "Hive batch mode - using a query file" recipe from Chapter 6, "Hadoop Ecosystem – Apache Hive", to create the Hive table that we'll be using in this recipe.
This section demonstrates how to access a Hive table from Pig. Proceed with the following steps:
Start Pig's Grunt shell with the -useHCatalog flag, as follows. This loads the HCatalog JARs that are necessary to access HCatalog-managed tables in Hive:
$ pig -useHCatalog
Use the following command in the Grunt shell to load the users table...