VI2N: A Network for Planning Under Uncertainty based on Value of Information
Samantha Johnson · Michael Buice · Koosha Khalvati
Event URL: https://openreview.net/forum?id=7mNtXCY5UDU »
Planning under uncertainty is an important problem in both neuroscience and computer science that has not been solved. Representing Reinforcement Learning (RL) problems as Partially Observable Markov Decision Processes (POMDPs) allows them to be addressed from a theoretical perspective. While solving POMDPs is known to be NP-Hard, recent advances in deep learning have produced impressive neural network solvers, namely the Value Iteration Network (VIN) and the QMDP-Net. These solvers allow for improved learning and generalization to novel domains, but are not complete solutions to the RL problem. In this paper, we propose a new architecture, the VI$^2$N, a POMDP-solving neural network with a built-in Pairwise Heuristic that is capable of imitation and reinforcement learning in novel domains where information gathering is necessary. This study shows the VI$^2$N to be at least as good as the state-of-the-art model on the tested environments.
Author Information
Samantha Johnson (University of Chicago)
Michael Buice (Allen Institute)
Koosha Khalvati (University of Washington)
More from the Same Authors
-
2022 : Using Sum-Product Networks to estimate neural population structure in the brain »
Koosha Khalvati · Samantha Johnson · Stefan Mihalas · Michael Buice -
2022 Poster: Learning dynamics of deep linear networks with multiple pathways »
Jianghong Shi · Eric Shea-Brown · Michael Buice -
2021 Poster: Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex »
Colin Conwell · David Mayo · Andrei Barbu · Michael Buice · George Alvarez · Boris Katz -
2021 Poster: Tensor decompositions of higher-order correlations by nonlinear Hebbian plasticity »
Gabriel Ocker · Michael Buice -
2019 : Poster Session »
Pravish Sainath · Mohamed Akrout · Charles Delahunt · Nathan Kutz · Guangyu Robert Yang · Joseph Marino · L F Abbott · Nicolas Vecoven · Damien Ernst · andrew warrington · Michael Kagan · Kyunghyun Cho · Kameron Harris · Leopold Grinberg · John J. Hopfield · Dmitry Krotov · Taliah Muhammad · Erick Cobos · Edgar Walker · Jacob Reimer · Andreas Tolias · Alexander Ecker · Janaki Sheth · Yu Zhang · Maciej Wołczyk · Jacek Tabor · Szymon Maszke · Roman Pogodin · Dane Corneil · Wulfram Gerstner · Baihan Lin · Guillermo Cecchi · Jenna M Reinen · Irina Rish · Guillaume Bellec · Darjan Salaj · Anand Subramoney · Wolfgang Maass · Yueqi Wang · Ari Pakman · Jin Hyung Lee · Liam Paninski · Bryan Tripp · Colin Graber · Alex Schwing · Luke Prince · Gabriel Ocker · Michael Buice · Benjamin Lansdell · Konrad Kording · Jack Lindsey · Terrence Sejnowski · Matthew Farrell · Eric Shea-Brown · Nicolas Farrugia · Victor Nepveu · Jiwoong Im · Kristin Branson · Brian Hu · Ramakrishnan Iyer · Stefan Mihalas · Sneha Aenugu · Hananel Hazan · Sihui Dai · Tan Nguyen · Doris Tsao · Richard Baraniuk · Anima Anandkumar · Hidenori Tanaka · Aran Nayebi · Stephen Baccus · Surya Ganguli · Dean Pospisil · Eilif Muller · Jeffrey S Cheng · Gaël Varoquaux · Kamalaker Dadi · Dimitrios C Gklezakos · Rajesh PN Rao · Anand Louis · Christos Papadimitriou · Santosh Vempala · Naganand Yadati · Daniel Zdeblick · Daniela M Witten · Nicholas Roberts · Vinay Prabhu · Pierre Bellec · Poornima Ramesh · Jakob H Macke · Santiago Cadena · Guillaume Bellec · Franz Scherr · Owen Marschall · Robert Kim · Hannes Rapp · Marcio Fonseca · Oliver Armitage · Jiwoong Im · Thomas Hardcastle · Abhishek Sharma · Wyeth Bair · Adrian Valente · Shane Shang · Merav Stern · Rutuja Patil · Peter Wang · Sruthi Gorantla · Peter Stratton · Tristan Edwards · Jialin Lu · Martin Ester · Yurii Vlasov · Siavash Golkar -
2019 Poster: Comparison Against Task Driven Artificial Neural Networks Reveals Functional Organization of Mouse Visual Cortex »
Jianghong Shi · Eric Shea-Brown · Michael Buice -
2019 Poster: A Bayesian Theory of Conformity in Collective Decision Making »
Koosha Khalvati · Saghar Mirbagheri · Seongmin A. Park · Jean-Claude Dreher · Rajesh PN Rao -
2016 Poster: A Probabilistic Model of Social Decision Making based on Reward Maximization »
Koosha Khalvati · Seongmin A. Park · Jean-Claude Dreher · Rajesh PN Rao -
2015 Poster: A Bayesian Framework for Modeling Confidence in Perceptual Decision Making »
Koosha Khalvati · Rajesh PN Rao