Big Data Interview

The Interview Hacker and Technical guide

Blog

Save action in Spark takes too long / Save operation spills huge amounts of data to disk and fails with the error “No space left on device”

August 12, 2021 admin

In Big Data environments it is very common to deal with terabytes of data. We might sometimes observe…

Posted in: Spark Filed under: SparkOptimizations, SparkPerformanceEnhancementTechniques

How to configure Reduce jobs to start after a certain proportion of the Map jobs have completed in Hive or Hadoop?

June 8, 2021 admin

Within the MapReduce framework in Platform Symphony, you can specify the proportion of the total number of map tasks in a…

Posted in: Big Data, Hive, MapReduce

HDFS commands

May 31, 2021 admin

HDFS commands interview questions: 1) Difference between the commands hadoop dfs and hadoop fs? hadoop dfs – This is…

Posted in: HDFSCommands Filed under: HDFSCommands, HDFSInterviewQuestions

Lead and Lag using Spark Scala

May 13, 2021 admin

Sometimes while processing data we come across situations where we need to find the difference of price a…

Posted in: Scala, Spark Filed under: Lag, Lead

Option, Some, None in Scala (OR) How to handle null values in Scala?

January 19, 2021 admin

Functional programming is like writing a series of algebraic equations, and because you don’t use null values in algebra, you…

Posted in: Scala

What is a Singleton object in Scala?

January 17, 2021 admin

What is a Singleton object? Scala doesn’t have a concept called static. Instead of static, Scala has something called a Singleton object.…

Posted in: Scala

How to process JSON data or files in Hive without using JsonSerDe?

December 20, 2020 admin

Hive is rarely used with JSON. But sometimes business requirements might force the developers to…

Posted in: Big Data, Hive Filed under: HIVEJSON

How to add a unique index or unique row number to each row of a DataFrame?

December 7, 2020 admin

There are multiple ways to do this in Spark. Here we discuss two of the approaches to accomplish this task.…

Posted in: Big Data, Spark

Advanced performance enhancement techniques in Spark.

December 6, 2020 admin

Design choices: Language choice. This is impossible to answer and depends heavily on your requirements. If you want to perform some…

Posted in: Big Data, Spark Filed under: Spark, SparkPerformanceEnhancementTechniques

zip, zipWithIndex and zipWithUniqueId in Spark

admin

These functions are rarely used in Spark, as they can be used only with RDDs, and RDDs are…

Posted in: Big Data, Spark Filed under: zip, zipWithIndex, zipWithUniqueId

Contact Us

  • Email
    sparkandbigdatainterview@gmail.com
Copyright © 2023 Big Data Interview