
Big Data Interview

The Interview Hacker and Technical guide

Blog


What do you know about ORC file format?

May 11, 2019 admin

What is the ORC file format? How is ORC better than the RC file format? ORC stands for Optimized Row Columnar…


Posted in: Uncategorized

What is Avro file in Spark/Hadoop?

admin

Tell me something about the Avro file format. What do you know about Avro files? These are the common question…


Posted in: Uncategorized

What is Sequence file in Spark/Hadoop?

May 10, 2019 admin

You may have come across questions like the ones below during your Spark interviews. So to get full knowledge on…


Posted in: Uncategorized

Is Java needed for Big Data/Spark interview?

May 8, 2019 admin

Many of us might be wondering whether Java is really required for a Big Data/Spark/Data engineer interview. If yes, what all…


Posted in: Uncategorized

What is difference between cache() and persist in Spark?

May 7, 2019 admin

Similar and related questions: How do you cache a dataset in Spark? How many ways are there to cache data in Spark?…


Posted in: Uncategorized

How to set number of reducers for a Sqoop job?

admin

How can you set the number of reducers for a Sqoop job? How many reducers did you use for your Sqoop job?…


Posted in: Uncategorized

What is meant by shared variable? What are the shared variables available in spark?

May 6, 2019 admin

What is a shared variable? A variable that is available on all of the executors or nodes that work on…


Posted in: Uncategorized

How to prepare for Spark interview?

May 5, 2019 admin

How to prepare a resume for a Big Data interview? A Spark interview includes a lot of other big data technologies like Hadoop, Hive, Sqoop, Flume…


Posted in: Hadoop, Spark

Why/How Spark is faster than Hadoop?

admin

What is Hadoop? Hadoop is a framework that is used to process large data sets using a programming paradigm called…


Posted in: Spark Filed under: Bigdata, Hadoop, Spark

What is the difference between repartition and coalesce?

admin 5 Comments

I’m not sure how many of us use these two operations frequently in our projects, but these two operations are…


Posted in: Coalesce, Repartition, Shuffling, Spark Filed under: Coalesce, Partitioning, Repartition, Spark


Contact Us

  • Email
    sparkandbigdatainterview@gmail.com
Copyright © 2023 Big Data Interview