We present exact algorithms for identifying deterministic-actions' effects
and preconditions in dynamic partially observable domains. They apply when one
does not know the action model(the way actions affect the world) of a domain and
must learn it from partial observations over time. Such scenarios are common in
real world applications. They are challenging for AI tasks because traditional
domain structures that underly tractability (e.g., conditional independence)
fail there (e.g., world features become correlated). Our work departs from
traditional assumptions about partial observations and action models. In
particular, it focuses on problems in which actions are deterministic of simple
logical structure and observation models have all features observed with some
frequency. We yield tractable algorithms for the modified problem for such
domains.
Our algorithms take sequences of partial observations over time
as input, and output deterministic action models that could have lead to those
observations. The algorithms output all or one of those models (depending on our
choice), and are exact in that no model is misclassified given the observations.
Our algorithms take polynomial time in the number of time steps and state
features for some traditional action classes examined in the AI-planning
literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs
and Reinforcement Learning are inexact and exponentially intractable for such
domains. Our experiments verify the theoretical tractability guarantees, and
show that we identify action models exactly. Several applications in planning,
autonomous exploration, and adventure-game playing already use these results.
They are also promising for probabilistic settings, partially observable
reinforcement learning, and diagnosis.