Sufficient Markov Decision Processes with Alternating Deep Neural
Networks
Longshaokan Wang; Eric B. Laber; Katie Witkiewitz
arXiv 2017
30
witkiewitz2017sufficient
Abstract
Advances in mobile computing technologies have made it possible to monitor
complex systems and apply data-driven interventions to them in real time. Markov
decision processes (MDPs) are the primary model for sequential decision
problems with a large or indefinite time horizon. Choosing a representation of
the underlying decision process that is both Markov and low-dimensional is
non-trivial. We propose a method for constructing a low-dimensional
representation of the original decision process for which (1) the MDP model
holds, and (2) a decision strategy that maximizes mean utility when applied to the
low-dimensional representation also maximizes mean utility when applied to the
original process. We use a deep neural network to define a class of potential
process representations and estimate the process of lowest dimension within
this class. The method is illustrated using data from a mobile study on heavy
drinking and smoking among college students.