In this post we will see how we can extract unique records from a Hive table. This can be achieved…
This is one of the most frequently asked questions in Data Engineering interviews. These are called ranking functions in Hive. These are the…
This post will focus on calculating a moving average or sum using Hive queries. We might have come across this question…
Hive CLI and Beeline can both be used to interact with the Hive execution engine, but there are a few differences between…
If you are an aspiring Data Engineer reading this article, you have landed on a wonderful website.…
Map join in Hive goes by several names, such as Auto Map join, Map-side join, and Broadcast join. It is…
Adaptive Query Execution (AQE). Spark is one of the most widely used frameworks in Data Engineering for processing large volumes of data. As…
What is the use of the GROUPING SETS clause in Hive queries? This is a rarely used clause, but it…
How can you handle skewed data in Hive or Spark? How can we get rid of data skew in Spark…
What is Skewness in Data? Data skew means data is distributed unevenly or asymmetrically. Let’s try to understand this in…