Classical and modern machine learning rely on solving a variety of nonconvex optimization problems, for instance:
\begin{eqnarray*}
\mathrm{(ERM)}\quad &&\min_x\ f(x) := \frac{1}{n}\sum_{i=1}^n f_i(x),\\
\mathrm{(Stochastic)}\quad &&\min_x\ f(x) := \mathbb{E}_\xi[F(x,\xi)],
\end{eqnarray*}
where all the functions involved may be nonconvex (e.g., as in deep learning).
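As a concrete toy illustration of the (ERM) setting, here is a minimal SGD loop on a nonconvex finite sum; the loss $f_i$, the data $a_i$, and the step size are hypothetical placeholders chosen for this sketch, not taken from the publications below.

```python
import numpy as np

def sgd_nonconvex_erm(grad_fi, x0, n, lr=0.05, iters=2000, seed=0):
    """Plain SGD on f(x) = (1/n) sum_i f_i(x), where grad_fi(x, i) = grad f_i(x)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        i = rng.integers(n)          # sample one component uniformly at random
        x = x - lr * grad_fi(x, i)   # stochastic gradient step
    return x

# Toy nonconvex losses f_i(x) = (x - a_i)^2 + sin(3x); the data a_i are made up.
a = np.linspace(-1.0, 1.0, 10)
grad_fi = lambda x, i: 2.0 * (x - a[i]) + 3.0 * np.cos(3.0 * x)
print(sgd_nonconvex_erm(grad_fi, x0=[2.0], n=len(a)))
```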
My work on solving these problems focuses on the following key aspects:
- Design, analysis, and implementation of algorithms that provably obtain approximate critical points fast.
- Methods that can provably escape saddle points rapidly, e.g., by mixing first- and second-order information (see the sketch after this list).
- General nonconvex problems with other types of structure that enable global optimality.
- Design and development of distributed and parallel methods for the above problems.
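As referenced above, one simple first-order device for escaping strict saddle points is to add a small random perturbation whenever the gradient is nearly zero, in the spirit of perturbed gradient descent. The sketch below illustrates the idea on a toy function; all constants are placeholders, and it is not the exact method of any paper listed here.

```python
import numpy as np

def perturbed_gd(grad_f, x0, lr=0.01, eps=1e-3, radius=1e-2, iters=2000, seed=0):
    """Gradient descent that adds a small random kick whenever the gradient is
    tiny, so that strict saddle points are escaped with high probability."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:                        # near a critical point
            x = x + radius * rng.standard_normal(x.shape)  # random perturbation
        else:
            x = x - lr * g                                 # plain gradient step
    return x

# f(x, y) = (x^2 - 1)^2 + y^2 has a strict saddle at (0, 0) and minima at (+-1, 0).
grad_f = lambda v: np.array([4.0 * v[0] * (v[0] ** 2 - 1.0), 2.0 * v[1]])
print(perturbed_gd(grad_f, x0=[0.0, 0.0]))  # ends near one of the two minima
```

Note that this naive check also perturbs near minima, so the iterate settles only to within the kick radius; practical methods use more careful tests before perturbing.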
Selected publications
- S. Sra. Scalable nonconvex inexact proximal splitting. NIPS, 2012.
- S. Reddi, S. Sra, B. Poczos, A. Smola. Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. NIPS, 2016.
- S. Reddi, A. Hefny, S. Sra, B. Poczos, A. Smola. Stochastic variance reduction for nonconvex optimization. ICML, 2016.
- S. Reddi, S. Sra, B. Poczos, A. Smola. Stochastic Frank-Wolfe methods for nonconvex optimization. Allerton CC, 2016.
- S. Reddi, S. Sra, B. Poczos, A. Smola. Fast incremental method for smooth nonconvex optimization. IEEE CDC, 2016.