What is the use of GROUPING SETS clause in Hive queries?
This is little bit rarely used clause but it has great advantages. Knowing the functionality of this is very important because you come across any situation in which you might need to use this. Along with this clause we need to know about few more clauses like Grouping__ID, Grouping function, Cubes and Rollups.
What is GROUPING SETS clause?
Grouping Sets clause in Group By clauase allows the user to perform more than one Group By operation in the same record set. Any GROUPING SET can be expressed as a multiple Group By operation connected by UNION operation.
What is Grouping__ID?
(Please observe the double underscore in the above function.)
This is a function in SQL that is used to identify in which group the row is used in aggregation. You will understand this function when you look at an example that is mentioned below.
ROLLUP is a built in function that was built on GROUPING SETS. A typical query with ROLLUP looks like below:
SELECT dept_id, location, category, SUM(salary) FROM employee
GROUP BY dept_id, location, category ROLLUP;
This query is equivalent to below using GROUPING SETS: