Apache Hive Essentials
Another aspect of manipulating data in Hive is ordering or sorting the data or result sets so that important facts, such as the top N values, the maximum, and the minimum, are easy to identify.
Hive uses the following keywords to order and sort data:
ORDER BY (ASC|DESC): This is similar to the RDBMS ORDER BY statement. It maintains a sorted order across the entire output. Because it performs the global sort using only one reducer, it takes longer to return the result. Using LIMIT with ORDER BY is strongly recommended. When hive.mapred.mode = strict is set (by default, hive.mapred.mode = nonstrict) and LIMIT is not specified, the query fails with an exception. This can be used as follows:
jdbc:hive2://> SELECT name FROM employee ORDER BY NAME DESC;
+----------+
|   name   |
+----------+
| Will     |
| Shelley  |
| Michael  |
| Lucy     |
+----------+
4 rows selected (57.057 seconds)
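The recommendation to pair ORDER BY with LIMIT can be sketched as follows. This is a minimal illustration rather than output captured from a real session. It assumes the same employee table and name column as the example above, and the LIMIT value of 2 is chosen arbitrarily.

```sql
-- Assumption: the employee table with a name column, as in the example above.
-- Turn on strict mode for the current session (the text gives nonstrict as the default).
SET hive.mapred.mode=strict;

-- In strict mode, an ORDER BY without LIMIT is expected to fail with an exception:
-- SELECT name FROM employee ORDER BY name DESC;

-- Adding LIMIT turns the global sort into a top-N query and satisfies strict mode.
SELECT name FROM employee ORDER BY name DESC LIMIT 2;
```

Given the four rows shown earlier, this query should return Will and Shelley, the first two names in descending order.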
SORT BY (ASC|DESC): This indicates which columns to sort when ordering...