Poster
Distributed estimation of the inverse Hessian by determinantal averaging
Michal Derezinski · Michael Mahoney
East Exhibition Hall B, C #209
Keywords: Optimization; Convex Optimization; Stochastic Optimization
In distributed optimization and distributed numerical linear algebra,
we often encounter an inversion bias: if we want to compute a
quantity that depends on the inverse of a sum of distributed matrices,
then the sum of the inverses does not equal the inverse of the sum.
An example of this occurs in distributed Newton's method, where we
wish to compute (or implicitly work with) the inverse Hessian
multiplied by the gradient.
In this case, locally computed estimates are biased, and so taking a
uniform average will not recover the correct solution.
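As a hedged scalar illustration of this bias (an example added here, not taken from the paper): suppose a local estimate $\hat h$ of the quantity $h = 2$ equals $1$ or $3$ with probability $1/2$ each, so that $\mathbb{E}[\hat h] = h$. Then
$$\mathbb{E}\big[\hat h^{-1}\big] = \tfrac12\Big(1 + \tfrac13\Big) = \tfrac23 \;\neq\; \tfrac12 = \big(\mathbb{E}[\hat h]\big)^{-1},$$
so a uniform average of local inverses systematically overshoots, and the gap does not shrink no matter how many local estimates are averaged.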
To address this, we propose determinantal averaging, a new
approach for correcting the inversion bias.
This approach involves reweighting the local estimates of the Newton
step proportionally to the determinant of the local Hessian estimate,
and then averaging them together to obtain an improved global
estimate. This method provides the first known distributed Newton step that is
asymptotically consistent, i.e., it recovers the exact step in
the limit as the number of distributed partitions grows to infinity.
To show this, we develop new expectation identities and moment bounds
for the determinant and adjugate of a random matrix.
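As a concrete sketch of the combination rule described above (a minimal NumPy illustration; the function name, the synthetic subsampled-Hessian data model, and the added ridge term are assumptions made for this example, not the paper's setup), each machine's local Newton step H_i^{-1} g is averaged with a weight proportional to det(H_i):

```python
import numpy as np

def determinantal_average_newton_step(local_hessians, grad):
    """Combine local Newton-step estimates H_i^{-1} g into one step,
    weighting each by det(H_i) (determinantal averaging)."""
    # Work with log-determinants for numerical stability; shifting by the
    # maximum before exponentiating cancels in the weighted average.
    logdets = np.array([np.linalg.slogdet(H)[1] for H in local_hessians])
    weights = np.exp(logdets - logdets.max())
    steps = np.stack([np.linalg.solve(H, grad) for H in local_hessians])
    return (weights[:, None] * steps).sum(axis=0) / weights.sum()

# Toy usage with synthetic local Hessians (regularized subsampled Gram
# matrices) and a random gradient; this data model is purely illustrative.
rng = np.random.default_rng(0)
d, m, k = 5, 50, 100   # dimension, rows per partition, number of partitions
grad = rng.standard_normal(d)
local_hessians = []
for _ in range(k):
    X = rng.standard_normal((m, d))
    local_hessians.append(X.T @ X / m + np.eye(d))

step = determinantal_average_newton_step(local_hessians, grad)
print(step)
```

Note that, under this combination rule, each machine only needs to send its local step together with a single scalar log-weight, so the reweighting adds little communication on top of plain averaging.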
Determinantal averaging can be applied not only to Newton's method,
but also to computing any quantity that is a linear transformation of a
matrix inverse, e.g., the trace of the inverse covariance matrix,
which is used in data uncertainty quantification.
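Since the trace is a linear function of the inverse, the same determinant weights apply verbatim; a small sketch under the same illustrative assumptions (the function name and setup are hypothetical, not taken from the paper):

```python
import numpy as np

def determinantal_average_inverse_trace(local_covariances):
    """Estimate tr(C^{-1}) by averaging the local values tr(C_i^{-1}),
    each weighted by det(C_i)."""
    logdets = np.array([np.linalg.slogdet(C)[1] for C in local_covariances])
    weights = np.exp(logdets - logdets.max())
    traces = np.array([np.trace(np.linalg.inv(C)) for C in local_covariances])
    return float((weights * traces).sum() / weights.sum())
```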