Book Image

Apache Hive Essentials

By : Dayong Du
Book Image

Apache Hive Essentials

By: Dayong Du

Overview of this book

Table of Contents (17 chapters)
Apache Hive Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Advanced aggregation – ROLLUP and CUBE


The ROLLUP statement enables a SELECT statement to calculate multiple levels of aggregations across a specified group of dimensions. The ROLLUP statement is a simple extension to the GROUP BY clause with high efficiency and minimal overhead to a query. Compared to GROUPING SETS that creates specified levels of aggregations, ROLLUP creates n+1 levels of aggregations, where n is the number of grouping columns. First, it calculates the standard aggregate values specified in the GROUP BY clause. Then, it creates higher-level subtotals, moving from right to left through the list of combinations of grouping columns, as shown in the following example:

GROUP BY a,b,c WITH ROLLUP

This is equivalent to the following:

GROUP BY a,b,c GROUPING SETS ((a,b,c),(a,b),(a),())

The CUBE statement takes a specified set of grouping columns and creates aggregations for all of their possible combinations. If n columns are specified for CUBE, there will be 2n combinations of...