A control-theoretic perspective on optimal high-order optimization

Abstract

We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $\varPhi : \mathbb{R}^d \rightarrow \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators $\nabla \varPhi$ and $\nabla^2 \varPhi$ together with a feedback control law $\lambda(\cdot)$ satisfying the algebraic equation $(\lambda(t))^p \Vert \nabla \varPhi(x(t)) \Vert^{p-1} = \theta$ for some $\theta \in (0, 1)$. Our first contribution is to prove the existence and uniqueness of a local solution to this system via the Banach fixed-point theorem. We present a simple yet nontrivial Lyapunov function that allows us to establish the existence and uniqueness of a global solution under certain regularity conditions and analyze the convergence properties of trajectories. The rate of convergence is $O(1/t^{(3p+1)/2})$ in terms of objective function gap and $O(1/t^{3p})$ in terms of squared gradient norm. Our second contribution is to provide two algorithmic frameworks obtained from discretization of our continuous-time system, one of which generalizes the large-step A-HPE framework of Monteiro and Svaiter (SIAM J Optim 23(2):1092–1125, 2013) and the other of which leads to a new optimal $p$-th order tensor algorithm. While our discrete-time analysis can be seen as a simplification and generalization of Monteiro and Svaiter (2013), it is largely motivated by the aforementioned continuous-time analysis, demonstrating the fundamental role that the feedback control plays in optimal acceleration and the clear advantage that the continuous-time perspective brings to algorithmic design.
A highlight of our analysis is that we show that all of the $p$-th order optimal tensor algorithms that we discuss minimize the squared gradient norm at a rate of $O(k^{-3p})$, which complements the recent analysis in Gasnikov et al. (in: COLT, PMLR, pp 1374–1391, 2019), Jiang et al. (in: COLT, PMLR, pp 1799–1801, 2019) and Bubeck et al. (in: COLT, PMLR, pp 492–507, 2019).
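
As a small illustration of the feedback control law stated in the abstract, the algebraic equation $(\lambda(t))^p \Vert \nabla \varPhi(x(t)) \Vert^{p-1} = \theta$ can be solved for $\lambda$ in closed form whenever the gradient norm is positive. The sketch below is not taken from the paper; the function name `feedback_lambda` and the sample values are hypothetical and chosen only to show the arithmetic.

```python
def feedback_lambda(grad_norm: float, p: int, theta: float) -> float:
    """Solve lambda**p * grad_norm**(p - 1) = theta for lambda > 0.

    Hypothetical helper for illustration only; it assumes a positive
    gradient norm, an integer order p >= 1, and theta in (0, 1).
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    if grad_norm <= 0.0:
        raise ValueError("grad_norm must be positive")
    # Rearranging the algebraic equation gives
    # lambda = (theta / grad_norm**(p - 1)) ** (1 / p).
    return (theta / grad_norm ** (p - 1)) ** (1.0 / p)


# Sample values (assumed): p = 2, theta = 0.5, gradient norm 2.
lam = feedback_lambda(grad_norm=2.0, p=2, theta=0.5)
# Residual of the algebraic equation at the computed lambda.
residual = lam ** 2 * 2.0 ** (2 - 1) - 0.5
```

Under this rearrangement, for $p \ge 2$ a smaller gradient norm yields a larger $\lambda$, which is the sense in which the control law adapts to the state of the trajectory.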

Document Details

Document Type: Pub Defense Publication
Publication Date: Oct 22, 2021
Source ID: 10.1007/s10107-021-01721-3

Entities

People

Michael I. Jordan
Tianyi Lin

Organizations

United States Naval Research Laboratory
