Timezone: »
State aggregation is a popular model reduction method rooted in optimal control. It reduces the complexity of engineering systems by mapping the system’s states into a small number of meta-states. The choice of aggregation map often depends on the data analysts’ knowledge and is largely ad hoc. In this paper, we propose a tractable algorithm that estimates the probabilistic aggregation map from the system’s trajectory. We adopt a soft-aggregation model, where each meta-state has a signature raw state, called an anchor state. This model includes several common state aggregation models as special cases. Our proposed method is a simple two- step algorithm: The first step is spectral decomposition of empirical transition matrix, and the second step conducts a linear transformation of singular vectors to find their approximate convex hull. It outputs the aggregation distributions and disaggregation distributions for each meta-state in explicit forms, which are not obtainable by classical spectral methods. On the theoretical side, we prove sharp error bounds for estimating the aggregation and disaggregation distributions and for identifying anchor states. The analysis relies on a new entry-wise deviation bound for singular vectors of the empirical transition matrix of a Markov process, which is of independent interest and cannot be deduced from existing literature. The application of our method to Manhattan traffic data successfully generates a data-driven state aggregation map with nice interpretations.
Author Information
Yaqi Duan (Princeton University)
Zheng Tracy Ke (Harvard University)
Mengdi Wang (Princeton University)
Mengdi Wang is interested in data-driven stochastic optimization and applications in machine and reinforcement learning. She received her PhD in Electrical Engineering and Computer Science from Massachusetts Institute of Technology in 2013. At MIT, Mengdi was affiliated with the Laboratory for Information and Decision Systems and was advised by Dimitri P. Bertsekas. Mengdi became an assistant professor at Princeton in 2014. She received the Young Researcher Prize in Continuous Optimization of the Mathematical Optimization Society in 2016 (awarded once every three years).
More from the Same Authors
-
2021 Spotlight: On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method »
Junyu Zhang · Chengzhuo Ni · zheng Yu · Csaba Szepesvari · Mengdi Wang -
2022 : Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization »
Kaixuan Huang · Yu Wu · Xuezhou Zhang · Shenyinying Tu · Qingyun Wu · Mengdi Wang · Huazheng Wang -
2022 : Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization »
Kaixuan Huang · Yu Wu · Xuezhou Zhang · Shenyinying Tu · Qingyun Wu · Mengdi Wang · Huazheng Wang -
2022 : Pareto-Optimal Diagnostic Policy Learning in Clinical Applications via Semi-Model-Based Deep Reinforcement Learning »
zheng Yu · Yikuan Li · Joseph Kim · Kaixuan Huang · Yuan Luo · Mengdi Wang -
2022 : Pareto-Optimal Diagnostic Policy Learning in Clinical Applications via Semi-Model-Based Deep Reinforcement Learning »
zheng Yu · Yikuan Li · Joseph Kim · Kaixuan Huang · Yuan Luo · Mengdi Wang -
2022 : Provable Benefits of Representational Transfer in Reinforcement Learning »
Alekh Agarwal · Yuda Song · Kaiwen Wang · Mengdi Wang · Wen Sun · Xuezhou Zhang -
2022 Poster: Communication Efficient Distributed Learning for Kernelized Contextual Bandits »
Chuanhao Li · Huazheng Wang · Mengdi Wang · Hongning Wang -
2022 Poster: Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks »
Shuoguang Yang · Xuezhou Zhang · Mengdi Wang -
2022 Poster: Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization »
Hui Yuan · Chengzhuo Ni · Huazheng Wang · Xuezhou Zhang · Le Cong · Csaba Szepesvari · Mengdi Wang -
2021 Poster: On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method »
Junyu Zhang · Chengzhuo Ni · zheng Yu · Csaba Szepesvari · Mengdi Wang -
2021 Poster: Sharp Impossibility Results for Hyper-graph Testing »
Jiashun Jin · Zheng Tracy Ke · Jiajun Liang -
2020 Poster: Generalized Leverage Score Sampling for Neural Networks »
Jason Lee · Ruoqi Shen · Zhao Song · Mengdi Wang · zheng Yu -
2020 Poster: High-Dimensional Sparse Linear Bandits »
Botao Hao · Tor Lattimore · Mengdi Wang -
2020 Poster: Variational Policy Gradient Method for Reinforcement Learning with General Utilities »
Junyu Zhang · Alec Koppel · Amrit Singh Bedi · Csaba Szepesvari · Mengdi Wang -
2020 Spotlight: Variational Policy Gradient Method for Reinforcement Learning with General Utilities »
Junyu Zhang · Alec Koppel · Amrit Singh Bedi · Csaba Szepesvari · Mengdi Wang -
2020 Poster: On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces »
Zhuoran Yang · Chi Jin · Zhaoran Wang · Mengdi Wang · Michael Jordan -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 : Unsupervised State Embedding and Aggregation towards Scalable Reinforcement Learning »
Mengdi Wang -
2019 Poster: Learning low-dimensional state embeddings and metastable clusters from time series data »
Yifan Sun · Yaqi Duan · Hao Gong · Mengdi Wang -
2018 : Poster Session 1 »
Kyle H Ambert · Brandon Araki · Xiya Cao · Sungjoon Choi · Hao(Jackson) Cui · Jonas Degrave · Yaqi Duan · Mattie Fellows · Carlos Florensa · Karan Goel · Aditya Gopalan · Ming-Xu Huang · Jonathan Hunt · Cyril Ibrahim · Brian Ichter · Maximilian Igl · Zheng Tracy Ke · Igor Kiselev · Anuj Mahajan · Arash Mehrjou · Karl Pertsch · Alexandre Piche · Nicholas Rhinehart · Thomas Ringstrom · Reazul Hasan Russel · Oleh Rybkin · Ion Stoica · Sharad Vikram · Angelina Wang · Ting-Han Wei · Abigail H Wen · I-Chen Wu · Zhengwei Wu · Linhai Xie · Dinghan Shen -
2018 : Spotlights 1 »
Ming-Xu Huang · Hao(Jackson) Cui · Arash Mehrjou · Yaqi Duan · Sharad Vikram · Angelina Wang · Karan Goel · Jonathan Hunt · Zhengwei Wu · Dinghan Shen · Mattie Fellows -
2018 Poster: Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization »
Minshuo Chen · Lin Yang · Mengdi Wang · Tuo Zhao -
2018 Poster: Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model »
Aaron Sidford · Mengdi Wang · Xian Wu · Lin Yang · Yinyu Ye -
2017 Poster: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2017 Oral: Diffusion Approximations for Online Principal Component Estimation and Global Convergence »
Chris Junchi Li · Mengdi Wang · Tong Zhang -
2016 Poster: Accelerating Stochastic Composition Optimization »
Mengdi Wang · Ji Liu · Ethan Fang