We can query the datasets that have been mapped to Hive tables using HiveQL, which is similar to SQL. These queries can be simple data-exploration operations such as counts, order by, and group by, as well as complex joins, summarizations, and analytic operations. In this recipe, we'll explore simple data-exploration Hive queries. The subsequent recipes in this chapter present some of the advanced querying use cases.
Install Hive and follow the earlier recipe, Creating databases and tables using Hive CLI.
This section demonstrates how to perform a simple SQL-style query using Hive.
Start Hive by issuing the following command:
$ hive
Issue the following query in the Hive CLI to inspect up to 10 users who are older than 18 and younger than 34. Both comparisons are strict, so users aged exactly 18 or 34 are excluded. Hive uses a MapReduce job in the background to perform this data-filtering operation:
hive> SELECT user_id, location, age FROM users WHERE age > 18 AND age < 34 LIMIT 10;
Total...
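The other simple exploration operations mentioned at the start of this recipe (counts, group by, and order by) can be tried against the same table. The following is a minimal sketch, assuming only that the users table exposes the user_id, location, and age columns used in the query above. The alias user_count is introduced here for illustration, and the results depend entirely on the data you loaded:

```sql
-- Count the users in the same age range as the query above (strict bounds).
SELECT COUNT(*) FROM users WHERE age > 18 AND age < 34;

-- Inclusive variant: BETWEEN also matches users aged exactly 18 or 34.
SELECT COUNT(*) FROM users WHERE age BETWEEN 18 AND 34;

-- Number of users per location, largest groups first.
-- user_count is an illustrative alias, not a column of the table.
SELECT location, COUNT(*) AS user_count
FROM users
GROUP BY location
ORDER BY user_count DESC
LIMIT 10;
```

Each statement can be issued at the hive> prompt in the same way as the previous query. As with the filter query, expect Hive to report job progress before printing the result rows.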