Spotlight
Entropy Rate Estimation for Markov Chains with Large State Space
Yanjun Han · Jiantao Jiao · Chuan-Zheng Lee · Tsachy Weissman · Yihong Wu · Tiancheng Yu
Entropy estimation is one of the prototypical problems in distribution property testing. To consistently estimate the Shannon entropy of a distribution on $S$ elements with independent samples, the optimal sample complexity scales sublinearly with $S$ as $\Theta(\frac{S}{\log S})$, as shown by Valiant and Valiant \cite{Valiant--Valiant2011}. Extending the theory and algorithms for entropy estimation to dependent data, this paper considers the problem of estimating the entropy rate of a stationary reversible Markov chain with $S$ states from a sample path of $n$ observations. We show that
\begin{itemize}
\item Provided the Markov chain mixes not too slowly, \textit{i.e.}, the relaxation time is at most $O(\frac{S}{\ln^3 S})$, consistent estimation is achievable when $n \gg \frac{S^2}{\log S}$.
\item Provided the Markov chain has some slight dependency, \textit{i.e.}, the relaxation time is at least $1+\Omega(\frac{\ln^2 S}{\sqrt{S}})$, consistent estimation is impossible when $n \lesssim \frac{S^2}{\log S}$.
\end{itemize}
Under both assumptions, the optimal estimation accuracy is shown to be $\Theta(\frac{S^2}{n \log S})$. In comparison, the empirical entropy rate requires at least $\Omega(S^2)$ samples to be consistent, even when the Markov chain is memoryless. In addition to synthetic experiments, we also apply the estimators that achieve the optimal sample complexity to estimate the entropy rate of the English language in the Penn Treebank and the Google One Billion Words corpora, which provides a natural benchmark for language modeling and relates it directly to the widely used perplexity measure.
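For orientation, the empirical (plug-in) entropy rate that the abstract uses as a point of comparison can be sketched as follows. This is only the baseline that substitutes empirical transition frequencies into the entropy rate formula $\bar{H} = \sum_i \pi_i \sum_j P_{ij} \log \frac{1}{P_{ij}}$, not the sample-optimal estimator of the paper. The function names `empirical_entropy_rate` and `perplexity` are hypothetical, and the sketch assumes a first-order chain with entropy measured in bits, so that perplexity equals $2^{\bar{H}}$.

```python
import math
from collections import Counter


def empirical_entropy_rate(path):
    """Plug-in entropy rate (in bits) of a first-order Markov chain sample path.

    Illustrative baseline only: empirical transition frequencies are
    substituted directly into the entropy rate formula.
    """
    if len(path) < 2:
        raise ValueError("need at least two observations")
    # Counts of observed transitions (i -> j) and of visits to each source state i.
    pair_counts = Counter(zip(path[:-1], path[1:]))
    state_counts = Counter(path[:-1])
    n_transitions = len(path) - 1
    rate = 0.0
    for (i, _j), c in pair_counts.items():
        # c / n_transitions estimates pi_i * P_ij; c / state_counts[i] estimates P_ij.
        rate -= (c / n_transitions) * math.log2(c / state_counts[i])
    return rate


def perplexity(entropy_rate_bits):
    """Perplexity corresponding to an entropy rate given in bits per symbol."""
    return 2.0 ** entropy_rate_bits


if __name__ == "__main__":
    sample_path = [0, 0, 1, 1, 0]  # toy path over two states
    h = empirical_entropy_rate(sample_path)
    print(h, perplexity(h))
```

With $S$ states there are up to $S^2$ transition probabilities to estimate, which gives some intuition for why this plug-in approach needs on the order of $S^2$ samples, as the abstract states.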
Author Information
Yanjun Han (Stanford University)
Jiantao Jiao (University of California, Berkeley)
Chuan-Zheng Lee (Stanford University)
Tsachy Weissman (Stanford University)
Yihong Wu (Yale University)
Tiancheng Yu (Tsinghua University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: Entropy Rate Estimation for Markov Chains with Large State Space »
  Wed. Dec 5th through Thu the 6th, Room 210 #82
More from the Same Authors
- 2022 Spotlight: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions »
  Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman
- 2022 Spotlight: Lightning Talks 3B-1 »
  Tianying Ji · Tongda Xu · Giulia Denevi · Aibek Alanov · Martin Wistuba · Wei Zhang · Yuesong Shen · Massimiliano Pontil · Vadim Titov · Yan Wang · Yu Luo · Daniel Cremers · Yanjun Han · Arlind Kadra · Dailan He · Josif Grabocka · Zhengyuan Zhou · Fuchun Sun · Carlo Ciliberto · Dmitry Vetrov · Mingxuan Jing · Chenjian Gao · Aaron Flores · Tsachy Weissman · Han Gao · Fengxiang He · Kunzan Liu · Wenbing Huang · Hongwei Qin
- 2022 : Efficient Federated Random Subnetwork Training »
  Francesco Pase · Berivan Isik · Deniz Gunduz · Tsachy Weissman · Michele Zorzi
- 2022 Poster: Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions »
  Wei Zhang · Yanjun Han · Zhengyuan Zhou · Aaron Flores · Tsachy Weissman
- 2021 Poster: Optimal prediction of Markov chains with and without spectral gap »
  Yanjun Han · Soham Jana · Yihong Wu
- 2021 Poster: On the Value of Interaction and Function Approximation in Imitation Learning »
  Nived Rajaraman · Yanjun Han · Lin Yang · Jingbo Liu · Jiantao Jiao · Kannan Ramchandran
- 2020 Poster: Minimax Optimal Nonparametric Estimation of Heterogeneous Treatment Effects »
  Zijun Gao · Yanjun Han
- 2020 Spotlight: Minimax Optimal Nonparametric Estimation of Heterogeneous Treatment Effects »
  Zijun Gao · Yanjun Han
- 2019 Workshop: Information Theory and Machine Learning »
  Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon
- 2019 Poster: Batched Multi-armed Bandits Problem »
  Zijun Gao · Yanjun Han · Zhimei Ren · Zhengqing Zhou
- 2019 Oral: Batched Multi-armed Bandits Problem »
  Zijun Gao · Yanjun Han · Zhimei Ren · Zhengqing Zhou
- 2018 Poster: The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal »
  Jiantao Jiao · Weihao Gao · Yanjun Han
- 2018 Spotlight: The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal »
  Jiantao Jiao · Weihao Gao · Yanjun Han
- 2018 Poster: Data Amplification: A Unified and Competitive Approach to Property Estimation »
  Yi Hao · Alon Orlitsky · Ananda Theertha Suresh · Yihong Wu