NeurIPS Poster Improved Algorithms for Convex-Concave Minimax Optimization

Poster

Improved Algorithms for Convex-Concave Minimax Optimization

Yuanhao Wang · Jian Li

Poster Session 2 #656

[ Abstract ] [ Paper PDF ]

[ Paper ]

Abstract: This paper studies minimax optimization problems

min_{\x} max_{\y} f (\x, \y)

$\min_\x \max_\y f(\x,\y)$ , where

f (\x, \y)

$f(\x,\y)$ is

m_{\x}

$m_\x$ -strongly convex with respect to

\x

$\x$ ,

m_{\y}

$m_\y$ -strongly concave with respect to

\y

$\y$ and

(L_{\x}, L_{\x \y}, L_{\y})

$(L_\x,L_{\x\y},L_\y)$ -smooth. Zhang et al. \cite{zhang2019lower} provided the following lower bound of the gradient complexity for any first-order method:

Ω (\sqrt{\frac{L_{\x}}{m_{\x}} + \frac{L_{\x \y}^{2}}{m_{\x} m_{\y}} + \frac{L_{\y}}{m_{\y}}} \ln (1 / ϵ)) .

$\Omega\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L_{\x\y}^2}{m_\x m_\y}+\frac{L_\y}{m_\y}}\ln(1/\epsilon)\Bigr).$ This paper proposes a new algorithm and proved a gradient complexity bound of

\Tilde O (\sqrt{\frac{L_{\x}}{m_{\x}} + \frac{L \cdot L_{\x \y}}{m_{\x} m_{\y}} + \frac{L_{\y}}{m_{\y}}} \ln (1 / ϵ)),

$\Tilde{O}\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L\cdot L_{\x\y}}{m_\x m_\y}+\frac{L_\y}{m_\y}}\ln\left(1/\epsilon\right)\Bigr),$ where

L = max {L_{\x}, L_{\x \y}, L_{\y}}

$L=\max\{L_\x,L_{\x\y},L_\y\}$ . This improves over the best known upper bound

\Tilde O (\sqrt{\nicefrac L^{2} m_{\x} m_{\y}} \ln^{3} (1 / ϵ))

$\Tilde{O}\left(\sqrt{\nicefrac{L^2}{m_\x m_\y}} \ln^3\left(1/\epsilon\right)\right)$ by Lin et al. \cite{lin2020near}. Our bound achieves linear convergence rate and tighter dependency on condition numbers, especially when

L_{\x \y} ≪ L

$L_{\x\y}\ll L$ (i.e., the weak interaction regime). Via simple reduction, our new bound also implies improved bounds for strongly convex-concave problems and convex-concave problems. When

f

$f$ is quadratic, we can further improve the bound to

O (\sqrt{\frac{L_{\x}}{m_{\x}} + \frac{L_{\x \y}^{2}}{m_{\x} m_{\y}} + \frac{L_{\y}}{m_{\y}}} {(\frac{L^{2}}{m_{\x} m_{\y}})}^{o (1)} \ln (1 / ϵ))

$O\Bigl(\sqrt{\frac{L_\x}{m_\x}+\frac{L_{\x\y}^2}{m_\x m_\y}+\frac{L_\y}{m_\y}}\left(\frac{L^2}{m_\x m_\y}\right)^{o(1)}\ln(1/\epsilon)\Bigr)$ , which matches the lower bound up to a sub-polynomial factor.

Chat is not available.