Why doesn't a MapReduce task run when we perform select * from table in Hive?

While running Hive queries, you may have noticed that no MapReduce task starts when you run a plain select * from table query with no other clauses. Hive does this to improve query performance. In technical terms, Hive completes such a query as a FetchTask rather than as a MapReduce task: it reads the table's files directly and returns the rows, so no MapReduce task is needed.
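The fetch behavior is controlled by a Hive configuration property, hive.fetch.task.conversion, which accepts the values none, minimal, and more. The snippet below is a minimal sketch of how you might inspect and change it in a Hive session; the table name employees is hypothetical, and the default value depends on your Hive version, so check it before relying on it.

```sql
-- Print the current value of the property in this session
SET hive.fetch.task.conversion;

-- Turn the optimization off: even a plain select * should now be
-- submitted as a job instead of being served by a FetchTask
SET hive.fetch.task.conversion=none;
SELECT * FROM employees;   -- "employees" is a hypothetical table

-- minimal: FetchTask for select *, filters on partition columns, and limit
SET hive.fetch.task.conversion=minimal;

-- more: FetchTask also for simple column selections and simple filters
SET hive.fetch.task.conversion=more;
SELECT * FROM employees;
```

With the value set to more, some simple where filters can also be answered without a MapReduce task, so whether a job starts depends on this setting as well as on the shape of the query.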

A simple analogy: a map task roughly corresponds to a where clause in Hive, and a reduce task corresponds to an aggregation such as group by. A select * from table query has neither a where clause nor a group by, so there is nothing for a MapReduce task to do. To see what Hive does behind the scenes, run the EXPLAIN command on the select * from table query; its output lists the tasks that make up the query.
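As a rough illustration of the EXPLAIN suggestion, the two statements below compare a plain select * with an aggregation. The table employees and the column department are hypothetical names used only for this sketch, and the exact plan text varies by Hive version and execution engine.

```sql
-- Plan for a plain select *: the output typically lists a single
-- stage containing a Fetch Operator and no map or reduce work
EXPLAIN
SELECT * FROM employees;

-- Plan for an aggregation: the output should include a stage with
-- map and reduce work in addition to the final fetch stage
EXPLAIN
SELECT department, COUNT(*)
FROM employees
GROUP BY department;
```

Comparing the two plans side by side makes the analogy concrete: the extra stage appears only when the query has work, such as grouping, that a fetch alone cannot do.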

 
