Big Data Interview
The Interview Hacker and Technical guide

Month: June 2019

Currying Function in Scala

June 25, 2019 admin

Currying in Scala is a technique of transforming a function that takes multiple arguments into a function that takes only…

Posted in: Uncategorized
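
The currying teaser above can be illustrated with a minimal sketch. The names `CurryingSketch`, `add`, `addCurried`, `multiply`, `addFive` and `double` are hypothetical and chosen only for this illustration; the full post may use different examples.

```scala
object CurryingSketch {
  // An ordinary function value that takes two arguments at once.
  val add: (Int, Int) => Int = (a, b) => a + b

  // The curried form: it takes one argument and returns a function
  // that waits for the next one. `curried` is defined on Function2.
  val addCurried: Int => Int => Int = add.curried

  // A method declared directly with multiple parameter lists.
  def multiply(a: Int)(b: Int): Int = a * b

  // Supplying only the first argument yields a reusable one-argument function.
  val addFive: Int => Int = addCurried(5)
  val double: Int => Int = multiply(2)

  def main(args: Array[String]): Unit = {
    println(addCurried(2)(3)) // 5
    println(addFive(10))      // 15
    println(double(21))       // 42
  }
}
```

Fixing the first argument, as `addFive` and `double` do, is the usual reason to curry: it produces a specialised function without writing a new definition.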

Higher Order Functions in Scala

June 24, 2019 admin

Higher-order functions take other functions as parameters or return a function as a result, i.e., passing functions as parameters to other…

Posted in: Uncategorized
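
As a rough sketch of the higher-order function idea in the teaser above, the snippet below shows one function that accepts a function and one that returns a function. The names `HigherOrderSketch`, `applyTwice` and `multiplier` are hypothetical, introduced only for this illustration.

```scala
object HigherOrderSketch {
  // Takes a function as a parameter and applies it twice to x.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Returns a function as its result: a multiplier bound to `factor`.
  def multiplier(factor: Int): Int => Int = (x: Int) => x * factor

  def main(args: Array[String]): Unit = {
    val triple = multiplier(3)

    // Passing a function value to our own higher-order function.
    println(applyTwice(triple, 2)) // 18

    // Standard collection methods such as map are higher-order too.
    println(List(1, 2, 3).map(triple)) // List(3, 6, 9)
  }
}
```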

Difference between createOrReplaceTempView and registerTempTable (or) createOrReplaceTempView vs registerTempTable.

June 18, 2019 admin

All the functions mentioned below are functionally more or less the same, but there are very minor differences among them. createOrReplaceTempView createTempView…

Posted in: Spark, Spark SQL

map vs mapValues

June 16, 2019 admin

mapValues – This function works with pair RDDs only, so it always requires an RDD of type RDD[(a, b)]. mapValues functions…

Posted in: Scala, Spark

Hive – Order By vs Sort By vs Cluster By vs Distribute By

June 14, 2019 admin 2 Comments

Hive has several ordering and distribution clauses, such as Order By and Sort By. Each clause has its own uses, advantages and disadvantages.…

Posted in: Hive

What are the differences between Spark 1.x and Spark 2.x?

June 9, 2019 admin

Even though Spark is much faster than Hadoop, Spark 1.6.x has some performance issues that are corrected in Spark…

Posted in: Uncategorized

Hive query to get sum of all the positive values and negative values of a column into two different columns

admin

Assume that we have a table as below: Column_name: 1, -2, 3, -4, 5. We need to write a query to…

Posted in: Hive, Spark Filed under: Hive

LATERAL VIEW in Hive

admin 1 Comment

Sometimes you will be asked a Hive question such as: we have a table in which one of the columns…

Posted in: Hadoop Filed under: Hive, Lateral View

Difference between SparkContext and SparkSession (or) SparkSession vs SparkContext

admin

This is one of the most commonly asked interview questions. If you are a mid-level professional, you should expect to be asked it. In…

Posted in: Spark

Why do companies prefer Cloud platforms in Bigdata processing?

admin

Nowadays almost all companies use cloud platforms, regardless of the technology involved. There are a lot of reasons behind it.…

Posted in: Cloud computing, EMR, Hadoop, S3, Spark
