Convex functions#

This section is devoted to convex sets and convex functions, which have useful properties for optimization algorithms.

Some special sets#

Lines and line segments#

Points $y$ on a line through two points $x_1 \neq x_2$ in $\mathbb{R}^{n}$ can be expressed by

\[ y = \theta x_1 + (1 - \theta) x_2, \]

\[ y = x_2 + \theta (x_1 - x_2), \]

where $\theta \in \mathbb{R}$.

For a line segment, $\theta$ is limited to $0 \leq \theta \leq 1$.

x1 = [0.0; 0.5];
x2 = [1.0; 0.0];
theta = linspace (-0.4, 1.4, 100);
y = @(t) x2 + t .* (x1 - x2);
yy = y(theta);
plot (yy(1,:), yy(2,:), 'k');
hold on;
tprops = {'FontSize', 18};
mprops = {'MarkerSize', 9, ...
          'MarkerFaceColor', 'black'};
for t = [1.2, 1, 0.6, 0, -0.2]
  yy = y(t);
  plot (yy(1), yy(2), 'kx', mprops{:});
  text (yy(1) - 0.22, yy(2) - 0.05, ...
        sprintf ('\\theta = %.1f', t), tprops{:});
end
plot (x1(1), x1(2), 'ko', mprops{:});
text (x1(1), x1(2) + 0.1, 'x1', tprops{:});
plot (x2(1), x2(2), 'ko', mprops{:});
text (x2(1), x2(2) + 0.1, 'x2', tprops{:});
axis equal;
axis off;

_images/nlo-lecture-05-convex-functions_2_0.png

Affine sets#

An affine set contains the line through any two distinct points in the set.

For example the solution set of linear equations $\{x | Ax = b\}$ is affine.

Conversely, every affine set can be expressed as solution set of a system of linear equations.

Convex sets#

A convex set $C$ contains the line segment through any two distinct points in the set.

\[ x_1, x_2 \in C, \quad 0 \leq \theta \leq 1 \quad\implies\quad \theta x_1 + (1 - \theta) x_2 \in C. \]

Examples (from [BV04] (p. 24): first set is convex, the latter two not)

convex_sets

Hyperplanes#

Affine and convex sets of the form $\{ x \in \mathbb{R}^{n} \colon a^{T}x = b \}$ with normal vector $a \neq 0$.

x1 = [0, 0.7];
x2 = [1, 0.0];
x0 = x1 + 0.5 .* (x2 - x1);
x = 0.25 .* (x2 - x1);
a = [0.15, 0.15 / x1(2)];
plot ([x1(1), x2(1)], [x1(2), x2(2)], 'k-');
hold on;
plot (x0(1), x0(2), 'ko');
quiver (x0(1), x0(2), a(1), a(2), 'k', 'LineWidth', 2);
quiver (x0(1), x0(2), x(1), x(2), 'k', 'LineWidth', 2);
tprops = {'FontSize', 18};
text (x0(1) - 0.04, x0(2) - 0.04, 'x_0', tprops{:});
text (x0(1) + a(1) + 0.02, x0(2) + a(2) + 0.02, 'a', tprops{:});
text (x0(1) + x(1) + 0.02, x0(2) + x(2) + 0.02, 'x', tprops{:});
text (x2(1) - 0.2, x2(2) + 0.02, 'a^{T}x = b', tprops{:});
axis equal;
axis off;

_images/nlo-lecture-05-convex-functions_6_0.png

Halfspaces#

Convex sets of the form $\{ x \in \mathbb{R}^{n} \colon a^{T}x \leq b \}$ with normal vector $a \neq 0$.

x1 = [0, 0.7];
x2 = [1, 0.0];
x0 = x1 + 0.5 .* (x2 - x1);
a = [0.15, 0.15 / x1(2)];
x3 = x1 - a;
x4 = x2 - a;
fill ([x1(1), x2(1), x4(1), x3(1)], [x1(2), x2(2), x4(2), x3(2)], ...
      [220, 220, 220] / 256, 'EdgeColor', 'none');
hold on;
plot ([x1(1), x2(1)], [x1(2), x2(2)], 'k-', 'LineWidth', 2);
plot (x0(1), x0(2), 'ko');
quiver (x0(1), x0(2), a(1), a(2), 'k', 'LineWidth', 2);
tprops = {'FontSize', 18};
text (x0(1) - 0.04, x0(2) - 0.04, 'x_0', tprops{:});
text (x0(1) + a(1) + 0.02, x0(2) + a(2) + 0.02, 'a', tprops{:});
text (x2(1) - 0.2, x2(2) + 0.2, 'a^{T}x \geq b', tprops{:});
text (x2(1) - 0.3, x2(2),       'a^{T}x \leq b', tprops{:});
axis equal;
axis off;

_images/nlo-lecture-05-convex-functions_8_0.png

Ellipsoid#

Convex sets of the form

\[ \{ x \in \mathbb{R}^{n} \colon (x - x_{c})^{T} P^{-1} (x - x_{c}) \leq 1 \}, \]

with center $x_{c} \in \mathbb{R}^{n}$ and $P \succ 0$ (symmetric positive definite) or

\[ \{ x_{c} + Au \in \mathbb{R}^{n} \colon \lVert u \rVert \leq 1 \}, \]

with $A$ square and non-singular.

rx = 2;
ry = 0.7;
rz = 1;
[x, y, z] = ellipsoid (0, 0, 0, rx, ry, rz, 200);
colormap ('gray');
surf (x, y, z);
hold on;
plot3 ([0, rx], [0, 0], [1, 1], 'k-', 'LineWidth', 2);
plot3 ([0, 0], [0, ry], [1, 1], 'k-', 'LineWidth', 2);
plot3 (0, 0, 1, 'ko', 'MarkerSize', 9);
text  (-0.1, -0.1, 1, 'x_{c}', 'FontSize', 18);
shading flat;
view (0, 90);  % reduce to x-y-plane
xlim ([-rx, rx]);
ylim ([-1, 1]);
zlim ([-rz, rz]);
axis off;

_images/nlo-lecture-05-convex-functions_10_0.png

Polyhedra#

A Polyhedron is an intersection of a finite number of halfspaces and hyperplanes.

x1 = [ 0.0, 0.0];
x2 = [ 0.3, 0.6];
x3 = [-0.2, 0.9];
x4 = [-0.6, 0.5];
x5 = [-0.5, 0.2];
x = [x1; x2; x3; x4; x5];
fill (x(:,1), x(:,2), ...
      [220, 220, 220] / 256, ...
      'EdgeColor', 'none');
hold on;

tprops = {'FontSize', 18};
for i = 1:5
  j = mod (i, 5) + 1;
  p = @(t) x(i,:)' + t .* (x(j,:)' - x(i,:)');
  p0 = p(0.5);  % Start of normal vector
  p1 = p(-0.2);
  p2 = p(1.2);
  % Normal vector a
  d = 0.3;  % direction and scale
  if (x(i,1) < 0)
    d = -d;
  end
  a = (x(j,:)' - x(i,:)');
  a = [1; -a(1)/a(2)];
  a = a / norm (a) * d;
  plot ([p1(1), p2(1)], [p1(2), p2(2)], 'k-', 'LineWidth', 2);
  quiver (p0(1), p0(2), a(1), a(2), 'k', 'LineWidth', 2);
  t = p0 + 1.2 .* a;
  text (t(1), t(2), sprintf ('a_{%d}', i), tprops{:});
end

text (-0.2, 0.45, 'P', tprops{:});
axis equal;
axis off;

_images/nlo-lecture-05-convex-functions_12_0.png

The intersection of convex sets is convex.

Proof.

Let $x_{1}, x_{2} \in C \cap D$, where $C$ and $D$ are convex sets. Then $\theta x_{1} + (1 - \theta) x_{2} \in C \cap D$.

\[\tag*{$\Box$}\]

Solution set of finitely many linear inequalities and equalities is convex

\[ Ax \leq b, \qquad Cx = d, \]

where $A \in \mathbb{R}^{m \times n}$, $C \in \mathbb{R}^{p \times n}$, and $\leq$ is component-wise inequality.

Convex functions#

Let $X \subset \mathbb{R}^{n}$ be a convex set. A function $f \colon \mathbb{R}^{n} \to \mathbb{R}$ is convex on $X$, if

\[ f( \theta x + (1 - \theta) y ) \;\leq\; \theta f(x) + (1 - \theta)f(y) \]

for all $x,y \in X$ and $0 \leq \theta \leq 1$.

x = linspace (-0.7, 0.9, 50);
y = x.^2;
plot (x, y, 'k');
hold on;
plot ([-0.5, 0.7], [-0.5, 0.7].^2, 'ko-', 'LineWidth', 2);
text (-0.75, 0.2, '(x, f(x))', 'FontSize', 18);
text ( 0.75, 0.5, '(y, f(y))', 'FontSize', 18);
axis off;

_images/nlo-lecture-05-convex-functions_15_0.png

$f$ is concave if $-f$ is convex.
$f$ is strictly convex if $X \subset \mathbb{R}^{n}$ is a convex set and

\[ f( \theta x + (1 - \theta) y ) \;<\; \theta f(x) + (1 - \theta)f(y), \]

for all $x,y \in X$, $x \neq y$, and $0 < \theta < 1$.

Examples on $\mathbb{R}$#

Convex:

affine: $ax + b$ on $\mathbb{R}$, for any $a, b \in \mathbb{R}$
exponential: $e^{ax}$, for any $a \in \mathbb{R}$
powers: $x^{\alpha}$ on $\mathbb{R}_{++}$, for $\alpha \geq 1$ or $\alpha \leq 0$
powers of absolute value: $\lvert x \rvert^{p}$ on $\mathbb{R}$, for $p \geq 1$
negative entropy: $x\log(x)$ on $\mathbb{R}_{++}$

Concave:

affine: $ax + b$ on $\mathbb{R}$, for any $a, b \in \mathbb{R}$
powers: $x^{\alpha}$ on $\mathbb{R}_{++}$, for $0 \leq \alpha \leq 1$
logarithm: $\log(x)$ on $\mathbb{R}_{++}$

Examples on $\mathbb{R}^{n}$ (vectors)#

affine function $f(x) = a^{T}x + b$ are convex and concave
norms are convex: $\lVert x \rVert_{p} = (\sum_{i = 1}^{n} \lvert x_i \rvert^{p} )^{1/p}$ for $p \geq 1$; $\lVert x \rVert_{\infty} = \max(\lvert x_1 \rvert, \ldots, \lvert x_n \rvert)$

Examples on $\mathbb{R}^{m \times n}$ ($m \times n$ matrices)#

affine function

\[ f(X) = \operatorname{tr} \left( A^{T}X \right) + b = \left( \sum_{i = 1}^{m}\sum_{j = 1}^{n} A_{ij} X_{ij} \right) + b \]
spectral (maximum singular value) norm

\[ f(X) = \lVert X \rVert_{2} = \sigma_{\max}(X) = \sqrt{\lambda_{\max}(X^{T}X)}. \]

Theorem 6: Unique global minimum

Let $X \subset \mathbb{R}^{n}$ be a convex set and $f \colon X \to \mathbb{R}$ a convex function, then each local minimum is a global minimum.

Furthermore, if $f \in C^{1}$ (continuously differentiable) over an open and convex set $X \subset \mathbb{R}^{n}$, then every stationary point is a global minimum of $f$ over $X$.

1st order convex function condition#

Theorem 7: 1st order convex function condition

Let $X \subset \mathbb{R}^{n}$ be a non-empty, convex set and $f \in C^{1}(\mathbb{R}^{n}, \mathbb{R})$ is convex over $X$, then

\[ f(y) \;\geq\; f(x) + \nabla f(x)^{T}(y - x) \]

for all $x,y \in X$.

x = linspace (-0.7, 0.9, 50);
y = x.^2;
plot (x, y, 'k');
hold on;
plot ([0.1, 0.9], 0.8 .* [0.1, 0.9] - 0.16, 'k-', 'LineWidth', 2);
plot (0.4, 0.4^2, 'ko-', 'LineWidth', 2);
text (-0.6, 0.5, 'f(y)', 'FontSize', 18);
text ( 0.4, 0.1, '(x, f(x))', 'FontSize', 18);
text ( 0.9, 0.5, 'f(x) + \nabla f(x)^{T}(y - x)', 'FontSize', 18);
xlim ([-0.7, 1.3]);
axis off;

_images/nlo-lecture-05-convex-functions_24_0.png

The first-order approximation of $f$ is global underestimator.

Theorem 7 can be extended to strict convexity, which is omitted here.

2nd order convex function condition#

Theorem 8: 2nd order convex function condition

Let $X \subset \mathbb{R}^{n}$ be a open, non-empty, convex set and $f \in C^{2}(\mathbb{R}^{n}, \mathbb{R})$ is convex over $X$, then

\[ \nabla^{2} f(x) \succeq 0 \quad \text{(positive semi-definite)} \]

for all $x \in X$.

Again, the extension of Theorem 8 to strict convexity is omitted here.

Examples convex function condition#

Quadratic function: $f(x) = \frac{1}{2}x^{T}Ax + b^{T}x + c$, where $A$ is a symmetric $n \times n$ matrix, $x,b \in \mathbb{R}^{n}$, and $c \in \mathbb{R}$.

\[ \nabla f(x) = Ax + b \quad\text{and}\quad \nabla^{2} f(x) = A. \]

If $A \succeq 0$, then $f$ is a convex function.

Least-squares objective function: $f(x) = \lVert Ax - b \rVert_{2}^{2}$, with $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^{n}$, and $b \in \mathbb{R}^{m}$.

\[ \nabla f(x) = 2A^{T}(Ax - b) \quad\text{and}\quad \nabla^{2} f(x) = 2A^{T}A \]

is a convex function for any $A$!

Quadratic-over-linear: $f(x,y) = x^2 / y$.

\[\begin{split} \nabla^{2} f(x,y) = \frac{2}{y^3} \begin{pmatrix} y \\ -x \end{pmatrix} \begin{pmatrix} y \\ -x \end{pmatrix}^{T} \succeq 0 \end{split}\]

is a convex function for $y > 0$.

% Following equivalent to (x,y,z := x^2/y) for plotting.
[y, z] = meshgrid (0:.08:2);
x = sqrt(y.*z);
f = @(y, z) sqrt (x .* z);
mesh (f(y,z), y, z);
hold on;
mesh (-f(y,z), y, z);
grid on;
xlabel ('x');
ylabel ('y');
zlabel ('f(x,y)');
title ('f(x,y) = x^2 / y');

_images/nlo-lecture-05-convex-functions_32_0.png

Operations that preserve convexity#

Non-negative multiple: $\alpha f$ is convex, if $f$ is convex and $\alpha \geq 0$.
Sum: $f_1 + f_2$ is convex, if $f_1$ and $f_2$ are convex. This extends to infinite sums and integrals.
Composition with affine function: f(Ax + b) is convex, if $f$ is convex.
Point-wise maximum: $f(x) = \max\{ f_1(x), \ldots, f_m(x) \}$ is convex, if $f_1, \ldots, f_m$ are convex.

More examples for convex functions:

(Any) norm of an affine function: $f(x) = \lVert Ax + b \rVert$.
Log-barrier-function for expressing linear inequalities:

\[ f(x) = -\sum_{i = 1}^{m} \log\left( b_i - a_i^{T}x \right), \quad \operatorname{dom}(f) = \left\{ x \colon a_i^{T}x < b_i, i = 1, \ldots, m \right\}. \]
Piecewise-linear-function: $f(x) = \max_{i = 1, \ldots, m} \left( b_i - a_i^{T}x \right)$.

For many more examples, see [BV04] (chapters 2 and 3).

Selected Topics in Mathematical Optimization

Convex functions

Contents

Convex functions#

Some special sets#

Lines and line segments#

Affine sets#

Convex sets#

Hyperplanes#

Halfspaces#

Ellipsoid#

Polyhedra#

Convex functions#

Examples on \(\mathbb{R}\)#

Examples on \(\mathbb{R}^{n}\) (vectors)#

Examples on \(\mathbb{R}^{m \times n}\) (\(m \times n\) matrices)#

1st order convex function condition#

2nd order convex function condition#

Examples convex function condition#

Operations that preserve convexity#