Within the MapReduce framework in Platform Symphony, you can specify the proportion of a job's map tasks that must complete before any reduce tasks are scheduled.
Specify this ratio using the parameter specific to your Hadoop version:
- 2.4.x: mapreduce.job.reduce.slowstart.completedmaps
- 1.1.1: mapred.reduce.slowstart.completed.maps
Configure reducer start either on the command line during job submission or in a configuration file. The default value is 0.05, so reducer tasks start when 5% of the map tasks are complete. You can set any value from 0 to 1. For example, at 0, the reducer tasks start as soon as the map tasks start; at 0.75, they start when 75% of the map tasks are complete.
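To make the ratio concrete, the following sketch estimates how many map tasks must finish before reducers become eligible for a given job size. The function name `maps_before_reduce` is hypothetical, and rounding up to a whole task is an assumption: the exact rounding rule that Platform Symphony applies is not stated here.

```shell
#!/bin/sh
# maps_before_reduce RATIO TOTAL_MAPS
# Prints the approximate number of map tasks that must complete before
# reduce tasks can start. Rounding up is an assumption, not a documented rule.
maps_before_reduce() {
  awk -v r="$1" -v n="$2" 'BEGIN {
    # Reject ratios outside the documented 0..1 range.
    if (r < 0 || r > 1) {
      print "ratio must be between 0 and 1" > "/dev/stderr"
      exit 1
    }
    t = r * n
    c = int(t)
    if (c < t) c++      # round up to a whole task
    print c
  }'
}

maps_before_reduce 0.05 200   # default ratio: prints 10
maps_before_reduce 0.75 200   # prints 150
maps_before_reduce 0 200      # prints 0 (reducers may start immediately)
```

A higher ratio delays reducers and leaves more slots for map tasks early in the job; a lower ratio lets reducers begin fetching map output sooner.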
To configure reducer start through the command line:
- Hadoop 2.4.x:
- Hadoop 1.1.1:
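The exact submission commands are not reproduced here. As a rough sketch, assuming the job accepts Hadoop's generic `-D property=value` option, the snippet below builds the option string for each Hadoop version. The helper names `slowstart_param` and `slowstart_option` are hypothetical, and the submission command in the comment is a placeholder rather than a documented invocation.

```shell
#!/bin/sh
# Map a Hadoop version string to the matching slow-start parameter name.
slowstart_param() {
  case "$1" in
    2.4*)  echo "mapreduce.job.reduce.slowstart.completedmaps" ;;
    1.1.1) echo "mapred.reduce.slowstart.completed.maps" ;;
    *)     echo "unsupported Hadoop version: $1" >&2; return 1 ;;
  esac
}

# Build the generic -D option for a given version and ratio.
slowstart_option() {
  param=$(slowstart_param "$1") || return 1
  printf '%s\n' "-D${param}=$2"
}

slowstart_option 2.4.1 0.75
slowstart_option 1.1.1 0.75

# Placeholder usage (command name and arguments are assumptions):
#   <submit-command> jar <your.jar> <main-class> \
#       "$(slowstart_option 2.4.1 0.75)" <input> <output>
```

Generic options such as `-D` are normally placed before the application's own arguments, which is why the option precedes the input and output paths in the placeholder.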
To configure reducer start through a configuration file:
- Open the mapred-site.xml configuration file from the $HADOOP_HOME/conf directory.
- Add the property that matches your Hadoop version. For example:
  - Hadoop version 2.4.x:
  - Hadoop version 1.1.1:
- If you did not set HADOOP_HOME to your Hadoop configuration before installing Platform Symphony, or if you did not set PMR_EXTERNAL_CONFIG_PATH to your Hadoop configuration after installing Platform Symphony, copy the mapred-site.xml file to the $PMR_HOME/conf directory.
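As an illustration of the property added in the steps above, a mapred-site.xml entry might look like the following. This is a sketch in the standard Hadoop property format; the value 0.75 is only an example, and you would include just the property that matches your Hadoop version, inside the file's existing `<configuration>` element.

```xml
<!-- Hadoop 2.4.x -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.75</value> <!-- example ratio; any value from 0 to 1 -->
</property>

<!-- Hadoop 1.1.1 -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.75</value> <!-- example ratio; any value from 0 to 1 -->
</property>
```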