Abstract: In many steel plate stockyards, plates are stacked in piles because the yard area is limited, and they can only be removed from the top of a pile one at a time. Picking up the target plates for sorting therefore requires turning over (moving aside) many other plates. The workload of turning over plates depends on the order in which the plates are picked up, so planning that order can reduce the cost of delivering the plates. This planning problem is a large-scale multi-decision problem that includes hierarchical decisions. For this reason, it is difficult to optimize with conventional optimization techniques such as tabu search and genetic algorithms. In this paper, the planning problem is modeled as a Markov decision process, and optimal planning is achieved by hierarchical reinforcement learning. The proposed method is evaluated in several numerical experiments against a method that uses a simple heuristic. The results of the numerical experiments show that the proposed method reduces costs by more than 10% compared with the heuristic method.
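To illustrate why the pick-up order affects the turning-over workload, the following is a minimal sketch and not the paper's model. The function name `count_relocations`, the list-based pile representation, and the rule of placing a blocking plate on the shortest other pile are all assumptions made for illustration only.

```python
def count_relocations(piles, order):
    """Count how many plates must be moved aside to pick plates in `order`.

    Each pile is a list ordered bottom to top (last element is the top).
    Assumes at least two piles, so a blocking plate always has a destination.
    """
    piles = [list(p) for p in piles]  # work on a copy of the yard state
    moves = 0
    for target in order:
        # Find the pile that currently holds the target plate.
        src = next(i for i, p in enumerate(piles) if target in p)
        # Move every plate above the target out of the way.
        while piles[src][-1] != target:
            blocker = piles[src].pop()
            # Assumed rule: put the blocker on the shortest other pile.
            dst = min(
                (i for i in range(len(piles)) if i != src),
                key=lambda i: len(piles[i]),
            )
            piles[dst].append(blocker)
            moves += 1
        piles[src].pop()  # the target is now on top; pick it up
    return moves


yard = [["A", "B", "C"], ["D", "E"]]
print(count_relocations(yard, ["A", "C"]))  # picks the bottom plate first
print(count_relocations(yard, ["C", "A"]))  # picks the top plate first
```

In this toy yard, picking plate A before plate C needs three relocations, while picking C first needs only one. This is the kind of order-dependent cost that the planning problem aims to reduce.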