
Basic article information

  • Title: Neural networks and Monte-Carlo method usage in multi-agent systems for sudoku problem solving
  • Authors: Katerina Poloziuk; Vadym Yaremenko
  • Journal: Technology Audit and Production Reserves
  • Electronic ISSN: 2706-5448
  • Publication year: 2020
  • Volume: 6
  • Issue: 2
  • Pages: 38-41
  • DOI: 10.15587/2706-5448.2020.218427
  • Language: English
  • Publisher: PC Technology Center
  • Abstract: The object of research is multi-agent systems based on Deep Reinforcement Learning algorithms, together with an analysis of ways to establish interaction between intelligent agents within such a system. Part of the paper also covers ways to organize the management and administration of agents at the meta-level: external controllers and tools to optimize their work, and architectural solutions intended to accelerate agent training. The studied multi-agent system is meant to be flexible to expansion and to improve both training speed and problem-solving quality. The following neural network models were considered: DQN, DDQN, PPO, TD (methods based on Q-Learning), and an approach using a neural network with Monte-Carlo tree search. The models were tested on a Sudoku problem with a dataset of 5039 combinations and dimensions 2x2, 4x4, and 9x9. Several sets of agent rewards were used. The presentation of data during the learning and problem-solving process is described. A multi-agent system was also built on the model that uses Monte-Carlo tree search. According to the study results, the models based on Q-Learning are practically ineffective for tasks in a complex environment (plots support the statement), and training these models is quite demanding on workstation hardware. The Monte-Carlo tree search method performed well: even with a small number of iterations, it shows better results than the other Deep Learning methods (45–50% accuracy for 9x9). However, a significant drawback is the complexity of training the model, and the hardware requirements are too high for this kind of research.
  • Keywords: DQN; DDQN; TD; PPO; neural network; deep learning; reinforcement learning; multi-agent system; MCTS; Q-Learning
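
The abstract names Monte-Carlo tree search (MCTS) but does not say how the search chooses which move to explore. As a rough illustration only, the sketch below shows the standard UCT (UCB1) selection rule that plain MCTS commonly uses. It is an assumption, not the authors' method: a neural-network-guided search may well use a different, prior-weighted rule. The function names `uct_score` and `select_child` and the exploration constant `c` are hypothetical.

```python
import math


def uct_score(value_sum, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child node (assumed rule, not taken from the paper).

    value_sum:     total reward accumulated through this child
    visits:        number of times this child was visited
    parent_visits: number of times the parent was visited
    c:             exploration constant (hypothetical default)
    """
    if visits == 0:
        # Unvisited children are tried before any visited one.
        return math.inf
    exploit = value_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, parent_visits):
    """Return the index of the child with the highest UCT score.

    children is a list of (value_sum, visits) tuples.
    """
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits),
    )
```

For a Sudoku search, each child would correspond to placing one digit in one empty cell. The score trades off the average reward seen so far against how rarely that placement has been tried.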