NeurIPS Poster Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models

Yihan Zhang · Nir Weinberger

Hall J (level 1) #822

Keywords: [ minimax rate ] [ high-dimensional statistics ] [ parameter estimation ] [ spectral estimator ] [ hidden Markov model ]

Abstract: We consider a high-dimensional mean estimation problem over a binary hidden Markov model, which illuminates the interplay between memory in data, sample size, dimension, and signal strength in statistical inference. In this model, an estimator observes $n$ samples of a $d$-dimensional parameter vector $\theta_{*}\in\mathbb{R}^{d}$, multiplied by a random sign $S_i$ ($1\le i\le n$), and corrupted by isotropic standard Gaussian noise. The sequence of signs $\{S_{i}\}_{i\in[n]}\in\{-1,1\}^{n}$ is drawn from a stationary homogeneous Markov chain with flip probability $\delta\in[0,1/2]$. As $\delta$ varies, this model smoothly interpolates two well-studied models: the Gaussian Location Model for which $\delta=0$ and the Gaussian Mixture Model for which $\delta=1/2$. Assuming that the estimator knows $\delta$, we establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $\|\theta_{*}\|,\delta,d,n$. We then provide an upper bound to the case of estimating $\delta$, assuming a (possibly inaccurate) knowledge of $\theta_{*}$. The bound is proved to be tight when $\theta_{*}$ is an accurately known constant. These results are then combined to an algorithm which estimates $\theta_{*}$ with $\delta$ unknown a priori, and theoretical guarantees on its error are stated.
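
To make the observation model concrete, the following is a minimal simulation sketch, assuming the samples take the form $X_i = S_i\theta_{*} + Z_i$ with $Z_i \sim \mathcal{N}(0, I_d)$ and a sign chain started from its uniform stationary distribution. All function names are hypothetical. The `spectral_estimate` function is a generic top-eigenvector baseline commonly used in the mixture case $\delta=1/2$; it is not claimed to be the estimator analyzed in the paper. Because $\theta_{*}$ is only identifiable up to a global sign when the signs are hidden, the error is measured up to sign.

```python
import numpy as np

def sample_markov_signs(n, delta, rng):
    """Draw S_1..S_n in {-1, +1}; each step flips the sign with probability delta."""
    s0 = rng.choice([-1, 1])                 # uniform start = stationary distribution
    flips = rng.random(n - 1) < delta        # True where the chain flips
    steps = np.where(flips, -1, 1)
    signs = np.empty(n, dtype=int)
    signs[0] = s0
    signs[1:] = s0 * np.cumprod(steps)       # cumulative product of the flips
    return signs

def sample_observations(theta, n, delta, rng):
    """Assumed model: X_i = S_i * theta + Z_i, with Z_i standard Gaussian."""
    signs = sample_markov_signs(n, delta, rng)
    noise = rng.standard_normal((n, theta.shape[0]))
    return signs[:, None] * theta[None, :] + noise, signs

def spectral_estimate(X):
    """Generic baseline: top eigenpair of the sample second moment minus I."""
    n, d = X.shape
    M = X.T @ X / n - np.eye(d)              # has mean theta theta^T
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    lam, v = eigvals[-1], eigvecs[:, -1]
    return np.sqrt(max(lam, 0.0)) * v

def sign_invariant_error(est, theta):
    """Euclidean error up to the unavoidable global sign ambiguity."""
    return min(np.linalg.norm(est - theta), np.linalg.norm(est + theta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([3.0, 0.0, 0.0, 0.0])
    for delta in (0.0, 0.25, 0.5):
        X, _ = sample_observations(theta, 2000, delta, rng)
        err = sign_invariant_error(spectral_estimate(X), theta)
        print(f"delta={delta}: error up to sign = {err:.3f}")
```

Setting `delta=0.0` keeps every sign equal to the first one, which corresponds to the Gaussian Location Model up to a global sign, while `delta=0.5` makes the signs independent and uniform, which corresponds to the Gaussian Mixture Model. The baseline above ignores the ordering of the samples, so it does not exploit the memory in the sign sequence that the abstract highlights.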
