What happens to the number of partitions when we union two RDDs?

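Assume that we have one RDD with m partitions and another RDD with n partitions. When we union the first RDD with the second, we get a new RDD with m + n partitions. Union keeps the partitions of both inputs side by side and does not shuffle any data. The one exception is when both RDDs share the same partitioner: Spark then preserves that partitioning instead of adding the counts.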
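To make the counting concrete, here is a small toy model in plain Python. It is not Spark code: the `split` and `union` helpers and the sample data are made up for illustration, and an "RDD" is represented as nothing more than a list of partitions.

```python
from typing import List

# Toy stand-in for an RDD: a list of partitions, each a list of records.
Partitions = List[List[int]]


def split(data: List[int], n: int) -> Partitions:
    """Distribute the records round-robin across n partitions."""
    return [data[i::n] for i in range(n)]


def union(left: Partitions, right: Partitions) -> Partitions:
    """Keep every input partition untouched: left's first, then right's."""
    return left + right


a = split(list(range(10)), 3)       # m = 3 partitions
b = split(list(range(10, 16)), 2)   # n = 2 partitions
u = union(a, b)

print(len(a), len(b), len(u))  # 3 2 5
```

No record moves between partitions in this model, which is why the partition counts simply add up.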

Let's work through an example. We start with an RDD that has some number of partitions. I repartitioned this RDD to 1 partition and stored the result in a variable. I also repartitioned the same RDD to 4 partitions. Then I performed a union of these two RDDs and got a new RDD with 5 partitions (1 + 4).

Look at the screenshot of the operations performed.

 

If you liked this post, or if you feel anything in it can be improved, please let us know.
