Adaptive Query Execution (AQE) Spark is one of the most widely used frameworks in data engineering for processing large volumes of data. As…
What is the use of the GROUPING SETS clause in Hive queries? This is a rarely used clause, but it…
How can you handle skewed data in Hive or Spark? How can we get rid of data skewness in Spark…
What is Skewness in Data? Data skew means data is distributed unevenly or asymmetrically. Let’s try to understand this in…
Big data developers always need some knowledge of the internal workings of all the components. That’s where we get to…
How to create a dataframe using a custom schema in Spark? This is one of the most common interview questions.…
This post will explain the difference between the SQL pseudocolumns ROWNUM and ROWID. ROWID: 1) ROWID is a…
In most of your interviews, you might have come across the request to write a word count program in MapReduce or…
What is a Scala Monad? A monad is neither a data type nor a class/trait. A monad is a concept. There are a lot of ways…
What are SCDs, or Slowly Changing Dimensions? Slowly changing dimensions are a concept related to data warehousing. They track the…