Big Data Interview
The Interview Hacker and Technical guide

Category: Hive

How to set configuration to start Reduce jobs after completion of certain proportion of the Map jobs in Hive or Hadoop?

June 8, 2021 admin

Within the MapReduce framework in Platform Symphony, you can specify the proportion of the total number of map tasks in a…

Posted in: Big Data, Hive, MapReduce

How to process JSON data or file in HIVE without using JsonSerDe?

December 20, 2020 admin

Using Hive with JSON is rare, but sometimes business requirements might force developers to…

Posted in: Big Data, Hive Filed under: HIVEJSON

How to perform minus operation in Hive using joins?

December 4, 2020 admin

What is the minus operation? Below is a picture that shows a Venn diagram of the result of a minus operation between two tables…

Posted in: Big Data, Hive Filed under: MinuOperationusingjoins

Difference between IN operator and EXISTS operator in HIVE or SQL.

October 26, 2020 admin

EXISTS: The EXISTS operator is used when we need to check whether any row exists that satisfies a condition.…

Posted in: Hive, Spark SQL Filed under: EXISTS operator, IN and EXISTS in SQL, IN Operator, SQL

How to delete duplicate records in Hive (or) How to extract unique records in Hive using analytical functions.

July 14, 2020 admin

In this post we will see how we can extract unique records from a Hive table. This can be achieved…

Posted in: Hive

RANK() vs DENSE_RANK() vs ROW_NUMBER() in Hive (or) Differences between RANK(), DENSE_RANK() and ROW_NUMBER() (or) Ranking window functions in Hive

admin

This is one of the most frequent questions in data engineering interviews. These are called ranking functions in Hive. These are the…

Posted in: Hive Filed under: DENSE_RANK() in Hive(), RANK() in Hive, RANK() vs DENSE_RANK() vs ROW_NUMBER() in Hive, ROW_NUMBER() in Hive

How to calculate moving sum or moving average in Hive?

admin

This post will focus on calculating a moving average or moving sum using Hive queries. We might have come across this question…

Posted in: Hive Filed under: Moving Average in Hive, Moving sum in Hive

Hive CLI vs Beeline (or) Difference between Hive CLI and Beeline

July 13, 2020 admin

Hive CLI and Beeline can both be used to interact with the Hive execution engine, but there are a few differences between…

Posted in: Hive Filed under: Beeline, Hive CLI, Hive CLI vs Beeline

Map join in Hive (or) Map side join in Hive (or) Auto Map join in Hive (or) Broadcast join in Hive

admin

Map join in Hive goes by several different names, such as Auto Map join, Map side join and Broadcast join. It is…

Posted in: Hive Filed under: Joins in Hive, Map join in Hive

Explain about Grouping Sets in Hive (or) Grouping Sets in SQL?

May 14, 2020 admin

What is the use of the GROUPING SETS clause in Hive queries? This is a rarely used clause, but it…

Posted in: Hive Filed under: CUBE, GROUPING SETS, GROUPING__ID(), HIVE QL, ROLLUP, SQL

Copyright © 2023 Big Data Interview