Abstract: In this paper, we propose a novel algorithm that addresses the starvation of small jobs and reduces their processing time on the Hadoop platform. Current MapReduce/Hadoop schedulers are quite successful at achieving data locality, and they schedule reduce tasks with a greedy algorithm. Some jobs have hundreds of map tasks but only a few reduce tasks; in that case, the reduce tasks of the large jobs spend more time waiting, which causes the small jobs to starve. Because map tasks and reduce tasks are scheduled separately, we can change how the scheduler launches reduce tasks without affecting the map phase. We therefore develop an optimized algorithm that schedules reduce tasks according to the shortest remaining time (SRT) of the map tasks. We apply our algorithm to the fair scheduler and the capacity scheduler, both of which are widely used in production environments. The evaluation results show that the SRT algorithm effectively decreases the processing time of small jobs.
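
The SRT idea can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a simplified model in which each job reports its unfinished map tasks, an observed average map-task duration, and the number of map slots it currently holds. The `Job` class, its fields, the wave-based estimate in `remaining_map_time`, and the function `pick_job_for_reduce_slot` are all hypothetical names introduced here for illustration; none of them are Hadoop APIs. When a reduce slot becomes free, the sketch hands it to the job whose map phase is estimated to finish soonest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    """Hypothetical, simplified view of a MapReduce job (not a Hadoop class)."""
    name: str
    pending_maps: int        # map tasks that have not finished yet
    avg_map_seconds: float   # observed mean duration of one map task
    map_slots: int           # map slots currently assigned to this job
    pending_reduces: int     # reduce tasks that have not been launched yet

    def remaining_map_time(self) -> float:
        """Assumed estimate: remaining map 'waves' times the mean task duration."""
        if self.pending_maps == 0:
            return 0.0
        slots = max(self.map_slots, 1)            # avoid division by zero
        waves = -(-self.pending_maps // slots)    # ceiling division
        return waves * self.avg_map_seconds


def pick_job_for_reduce_slot(jobs: List[Job]) -> Optional[Job]:
    """Return the job with unlaunched reduces and the shortest remaining map time."""
    candidates = [job for job in jobs if job.pending_reduces > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda job: job.remaining_map_time())


if __name__ == "__main__":
    jobs = [
        Job("large", pending_maps=400, avg_map_seconds=30.0, map_slots=20, pending_reduces=4),
        Job("small", pending_maps=6, avg_map_seconds=30.0, map_slots=3, pending_reduces=1),
    ]
    chosen = pick_job_for_reduce_slot(jobs)
    print(chosen.name if chosen else "no job needs a reduce slot")
```

In this example the large job has an estimated 600 seconds of map work left and the small job 60 seconds, so the free reduce slot goes to the small job. A greedy policy that served jobs in submission order would instead give the slot to the large job, whose reduce task would then wait for the long map phase to finish.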