We present an asymptotic analysis of Viterbi Training (VT) and contrast it with the more conventional Maximum Likelihood (ML) approach to parameter estimation in Hidden Markov Models. While the ML estimator works by (locally) maximizing the likelihood of the observed data, VT seeks to maximize the probability of the most likely hidden state sequence. We develop an analytical framework based on a generating-function formalism and illustrate it on an exactly solvable HMM with one unambiguous symbol. For this particular model the ML objective function is continuously degenerate, whereas the VT objective is shown to have only finite degeneracy. Furthermore, VT converges faster and yields sparser (simpler) models, thus realizing an automatic Occam's razor for HMM learning. In more general scenarios VT can perform worse than ML, but it is still capable of correctly recovering most of the parameters.
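To make the contrast concrete, below is a minimal sketch (not the authors' code) of Viterbi Training for a discrete HMM: each iteration decodes the single most likely state path and re-estimates the parameters from the resulting hard counts, whereas ML/Baum-Welch would use expected (soft) counts from forward-backward. The initialization, smoothing constant, and function names are illustrative assumptions.

```python
# Illustrative sketch of Viterbi Training (hard EM) for a discrete HMM.
# Assumes integer observation symbols 0..M-1; NumPy only.
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden state path (log domain)."""
    T, N = len(obs), len(pi)
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # scores[i, j]: best path ending in i, then i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

def viterbi_training(obs, N, M, n_iter=50, seed=0, eps=1e-12):
    """Alternate Viterbi decoding and re-estimation from hard counts.
    ML/Baum-Welch would instead average over all paths via forward-backward."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(N))              # initial-state probabilities (assumed init)
    A = rng.dirichlet(np.ones(N), size=N)       # transition matrix
    B = rng.dirichlet(np.ones(M), size=N)       # emission matrix
    for _ in range(n_iter):
        path = viterbi(obs, pi, A, B)
        # Re-estimate from the single decoded path (hard counts, eps-smoothed).
        A_cnt = np.full((N, N), eps)
        B_cnt = np.full((N, M), eps)
        for t in range(len(obs) - 1):
            A_cnt[path[t], path[t + 1]] += 1
        for t, o in enumerate(obs):
            B_cnt[path[t], o] += 1
        pi = (np.eye(N)[path[0]] + eps) / (1 + N * eps)
        A = A_cnt / A_cnt.sum(axis=1, keepdims=True)
        B = B_cnt / B_cnt.sum(axis=1, keepdims=True)
    return pi, A, B
```

Because the re-estimation uses hard counts from a single path, many transition and emission entries are driven toward zero, which is one way to see the sparser (simpler) models and faster convergence attributed to VT in the abstract.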
Author Information
Armen Allahverdyan (Yerevan Physics Institute)
Aram Galstyan (USC Information Sciences Institute)
More from the Same Authors
- 2019 Poster: Fast structure learning with modular regularization
  Greg Ver Steeg · Hrayr Harutyunyan · Daniel Moyer · Aram Galstyan
- 2019 Spotlight: Fast structure learning with modular regularization
  Greg Ver Steeg · Hrayr Harutyunyan · Daniel Moyer · Aram Galstyan
- 2019 Poster: Exact Rate-Distortion in Autoencoders via Echo Noise
  Rob Brekelmans · Daniel Moyer · Aram Galstyan · Greg Ver Steeg
- 2018 Poster: Invariant Representations without Adversarial Training
  Daniel Moyer · Shuyang Gao · Rob Brekelmans · Aram Galstyan · Greg Ver Steeg
- 2016 Poster: Variational Information Maximization for Feature Selection
  Shuyang Gao · Greg Ver Steeg · Aram Galstyan
- 2014 Poster: Discovering Structure in High-Dimensional Data Through Correlation Explanation
  Greg Ver Steeg · Aram Galstyan