Article Information

Title: The Missing Link Between Memory and Reinforcement Learning
Authors: Balkenius, Christian; Tjøstheim, Trond A.; Johansson, Birger; et al.
Journal: Frontiers in Psychology
Electronic ISSN: 1664-1078
Publication year: 2020
Volume: 11
Pages: 3446
DOI: 10.3389/fpsyg.2020.560080
Publisher: Frontiers Media
Abstract: Reinforcement learning systems usually assume a value function, defined over all states (or state-action pairs), that can immediately give the value of a particular state or action. A selection mechanism then uses these values to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have earlier developed a model of memory processing that includes semantic, episodic, and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can be used to support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead, we first imagine and then evaluate a possible future that will result from making a decision. The extended model supports decision making that depends on accumulating evidence over time, whether that information comes from sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains simple immediate choices, choices that depend on multiple sensory factors, and complicated selections between alternatives that require forward-looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice over time. We argue that a system like this forms the "missing link" between more traditional ideas of semantic and episodic memory and the associative nature of reinforcement learning.
Keywords: memory model; decision making; accumulator model; episodic memory; semantic memory