Proximal gradient method
Proximal gradient methods are a generalized form of projection used to solve non-differentiable convex optimization problems.
Many interesting problems can be formulated as convex optimization problems of form:
where are convex functions defined from where some of the functions are non-differentiable, this rules out our conventional smooth optimization techniques like Steepest descent method, conjugate gradient method etc. Proximal gradient methods can be used instead. These methods proceed by splitting, in that the functions are used individually so as to yield an easily implementable algorithm. They are called proximal because each non smooth function among is involved via its proximity operator. Iterative Shrinkage thresholding algorithm,[1] projected Landweber, projected gradient, alternating projections, alternating-direction method of multipliers, alternating split Bregman are special instances of proximal algorithms.[2] For the theory of proximal gradient methods from the perspective of and with applications to statistical learning theory, see proximal gradient methods for learning.
Notations and terminology
Let , the -dimensional Euclidean space, be the domain of the function . Suppose is a non-empty convex subset of . Then, the indicator function of is defined as
- -norm is defined as ( )
The distance from to is defined as
If is closed and convex, the projection of onto is the unique point such that .
The subdifferential of at is given by
Projection onto convex sets (POCS)
One of the widely used convex optimization algorithms is projections onto convex sets (POCS). This algorithm is employed to recover/synthesize a signal satisfying simultaneously several convex constraints. Let be the indicator function of non-empty closed convex set modeling a constraint. This reduces to convex feasibility problem, which require us to find a solution such that it lies in the intersection of all convex sets . In POCS method each set is incorporated by its projection operator . So in each iteration is updated as
However beyond such problems projection operators are not appropriate and more general operators are required to tackle them. Among the various generalizations of the notion of a convex projection operator that exist, proximity operators are best suited for other purposes.
Definition
The proximity operator of a convex function at is defined as the unique solution to
and is denoted .
Note that in the specific case where is the indicator function of some convex set
showing that the proximity operator is indeed a generalisation of the projection operator.
The proximity operator of is characterized by inclusion
If is differentiable then above equation reduces to
Examples
Special instances of Proximal Gradient Methods are
See also
Notes
- Daubechies, I; Defrise, M; De Mol, C (2004). "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint". Communications on Pure and Applied Mathematics. 57 (11): 1413–1457. arXiv:math/0307152. Bibcode:2003math......7152D. doi:10.1002/cpa.20042.
- Details of proximal methods are discussed in Combettes, Patrick L.; Pesquet, Jean-Christophe (2009). "Proximal Splitting Methods in Signal Processing". arXiv:0912.3522 [math.OC].
References
- Rockafellar, R. T. (1970). Convex analysis. Princeton: Princeton University Press.
- Combettes, Patrick L.; Pesquet, Jean-Christophe (2011). Springer's Fixed-Point Algorithms for Inverse Problems in Science and Engineering. 49. pp. 185–212.
External links
- Stephen Boyd and Lieven Vandenberghe Book, Convex optimization
- EE364a: Convex Optimization I and EE364b: Convex Optimization II, Stanford course homepages
- EE227A: Lieven Vandenberghe Notes Lecture 18
- ProximalOperators.jl: a Julia package implementing proximal operators.
- ProximalAlgorithms.jl: a Julia package implementing algorithms based on the proximal operator, including the proximal gradient method.
- Proximity Operator repository: a collection of proximity operators implemented in Matlab and Python.