From Surf Wiki (app.surf) — the open knowledge base

Hilbert projection theorem

On closed convex subsets in Hilbert space

Summary

On closed convex subsets in Hilbert space

In mathematics, the Hilbert projection theorem is a famous result of convex analysis that says that for every vector x in a Hilbert space H and every nonempty closed convex C \subseteq H, there exists a unique vector m \in C for which |c - x| is minimized over the vectors c \in C; that is, such that |m - x| \leq |c - x| for every c \in C.

Finite dimensional case

Some intuition for the theorem can be obtained by considering the first order condition of the optimization problem.

Consider a finite dimensional real Hilbert space H with a subspace C and a point x. If m \in C is a minimizer or minimum point of the function N : C \to \R defined by N(c) := |c - x| (which is the same as the minimum point of c \mapsto |c - x|^2), then derivative must be zero at m.

In matrix derivative notation: \begin{aligned} \partial \lVert x - c \rVert^2 &= \partial \langle c - x, c - x \rangle \ &= 2 \langle c - x, \partial c\rangle \end{aligned} Since \partial c is a vector in C that represents an arbitrary tangent direction, it follows that m - x must be orthogonal to every vector in C.

Statement

If the closed subset C is also a vector subspace of H then this minimizer m is the unique element in C such that x - m is orthogonal to C.

Detailed elementary proof

Let \delta := \inf_{c \in C} |x - c| be the distance between x and C, \left(c_n\right)_{n=1}^{\infty} a sequence in C such that the distance squared between x and c_n is less than or equal to \delta^2 + 1/n. Let n and m be two integers, then the following equalities are true: \left|c_n - c_m\right|^2 = \left|c_n - x\right|^2 + \left|c_m - x\right|^2 - 2 \left\langle c_n - x , , , c_m - x\right\rangle and 4 \left|\frac{c_n + c_m}2 - x\right|^2 = \left|c_n - x\right|^2 + \left|c_m - x\right|^2 + 2 \left\langle c_n - x , , , c_m - x\right\rangle Therefore \left|c_n - c_m\right|^2 = 2 \left|c_n - x\right|^2 + 2\left|c_m - x\right|^2 - 4\left|\frac{c_n + c_m}2 - x\right|^2 (This equation is the same as the formula a^2 = 2 b^2 + 2 c^2 - 4 M_a^2 for the length M_a of a median in a triangle with sides of length a, b, and c, where specifically, the triangle's vertices are x, c_m, c_n).

By giving an upper bound to the first two terms of the equality and by noticing that the midpoint of c_n and c_m belong to C and has therefore a distance greater than or equal to \delta from x, it follows that: |c_n - c_m|^2 ; \leq ; 2\left(\delta^2 + \frac{1}{n}\right) + 2\left(\delta^2 + \frac{1}{m}\right) - 4\delta^2 = 2\left(\frac{1}{n} + \frac{1}{m}\right)

The last inequality proves that \left(c_n\right)_{n=1}^{\infty} is a Cauchy sequence. Since C is complete, the sequence is therefore convergent to a point m \in C, whose distance from x is minimal. \blacksquare

Let m_1 and m_2 be two minimum points. Then: |m_2 - m_1|^2 = 2|m_1 - x|^2 + 2|m_2 - x|^2 - 4 \left|\frac{m_1 + m_2}2 - x\right|^2

Since \frac{m_1 + m_2}2 belongs to C, we have \left|\frac{m_1 + m_2} 2 - x\right|^2 \geq \delta^2 and therefore |m_2 - m_1|^2 \leq 2 \delta^2 + 2 \delta^2 - 4 \delta^2 = 0.

Hence m_1 = m_2, which proves uniqueness. \blacksquare

Assume that C is a closed vector subspace of H. It must be shown the minimizer m is the unique element in C such that \langle m - x, c \rangle = 0 for every c \in C.

Proof that the condition is sufficient: Let z \in C be such that \langle z - x, c \rangle = 0 for all c \in C. If c \in C then c - z \in C and so |c-x|^2 = |(z-x) + (c-z)|^2 = |z-x|^2 + |c-z|^2 + 2 \langle z-x, c-z \rangle = |z-x|^2 + |c-z|^2 which implies that |z-x|^2 \leq |c-x|^2. Because c \in C was arbitrary, this proves that |z-x| = \inf_{c \in C} |c - x| and so z is a minimum point.

Proof that the condition is necessary: Let m \in C be the minimum point. Let c \in C and t \in \R. Because m + t c \in C, the minimality of m guarantees that |m-x| \leq |(m + t c) - x|. Thus |(m + t c) - x|^2 - |m-x|^2 = 2t\langle m-x, c\rangle + t^2 |c|^2 is always non-negative and \langle m-x, c\rangle must be a real number. If \langle m - x, c\rangle \neq 0 then the map f(t) := 2t\langle m - x, c\rangle + t^2 |c|^2 has a minimum at t_0 := - \frac{\langle m - x, c\rangle}{|c|^2} and moreover, f\left(t_0\right) which is a contradiction. Thus \langle m - x, c\rangle = 0. \blacksquare

Proof by reduction to a special case

It suffices to prove the theorem in the case of x = 0 because the general case follows from the statement below by replacing C with C - x.

Furthermore, letting d := \inf_{c \in C} | c |, if \left(c_n\right){n=1}^{\infty} is any sequence in C such that \lim{n \to \infty} \left|c_n\right| = d in \RBecause the norm | \cdot | : H \to \R is continuous, if \lim_{n \to \infty} x_n converges in H then necessarily \lim_{n \to \infty} \left|x_n\right| converges in \R. But in general, the converse is not guaranteed. However, under this theorem's hypotheses, knowing that \lim_{n \to \infty} \left|c_n\right| = d in \R is sufficient to conclude that \lim_{n \to \infty} c_n converges in H. then \lim_{n \to \infty} c_n = m in H.

Let C be as described in this theorem and let d := \inf_{c \in C} | c |. This theorem will follow from the following lemmas.

^2.</math>

Because C is convex, if m, n \in \N then \frac{1}{2}\left(c_m + c_n\right) \in C so that by definition of the infimum, d \leq \left| \frac{1}{2}\left(c_m + c_n\right) \right|, which implies that 4d^2 \leq \left|c_m + c_n\right|^2. By the parallelogram law, \left|c_m + c_n\right|^2 + \left|c_m - c_n\right|^2 = 2 \left|c_m\right|^2 + 2 \left|c_n\right|^2 where 4d^2 \leq \left|c_m + c_n\right|^2 now implies 4 d^2 + \left|c_m - c_n\right|^2 ~~\leq~~ 2 \left|c_m\right|^2 + 2 \left|c_n\right|^2 and so \begin{alignat}{4} \left|c_m - c_n\right|^2 ~~\leq~~ 2 \left|c_m\right|^2 + 2 \left|c_n\right|^2 - 4 d^2 \end{alignat} The assumption \lim_{n \to \infty} \left|c_n\right| = d implies that the right hand side (RHS) of the above inequality can be made arbitrary close to 0 by making m and n sufficiently large.Explicitly, this means that given any \epsilon 0 there exists some integer N 0 such that "the quantity" is ,\leq \epsilon whenever m, n \geq N. Here, "the quantity" refers to the inequality's right hand side 2 \left|c_m\right|^2 + 2 \left|c_n\right|^2 - 4 d^2 and later in the proof, "the quantity" will also refer to \left|c_m - c_n\right|^2 and then \left|c_m - c_n\right|. By definition of "Cauchy sequence," \left(c_n\right){n=1}^{\infty} is Cauchy in H if and only if "the quantity" \left|c_m - c_n\right| satisfies this aforementioned condition. The same must consequently also be true of the inequality's left hand side \left|c_m - c_n\right|^2 and thus also of \left|c_m - c_n\right|, which proves that \left(c_n\right){n=1}^{\infty} is a Cauchy sequence in H.

Since H is complete, there exists some c \in H such that \lim_{n \to \infty} c_n = c in H. Because every c_n belongs to C, which is a closed subset of H, their limit c must also belongs to this closed subset, which proves that c \in C. Since the norm | ,\cdot, | : H \to \R is a continuous function, \lim_{n \to \infty} c_n = c in H implies that \lim_{n \to \infty} \left|c_n\right| = |c| in \R. But \lim_{n \to \infty} \left|c_n\right| = d also holds (by assumption) so that |c| = d (because limits in \R are unique). \blacksquare

A reader of this article who is looking at the proof of this Lemma might be newer to the subject, which is why this proof is so very detailed.

The existence of the sequence follows from the definition of the infimum, as is now shown. The set S := { | c | : c \in C } is a non-empty subset of non-negative real numbers and d := \inf_{c \in C} | c | = \inf S. Let n \geq 1 be an integer. Because \inf S there exists some s_n \in S such that s_n Since s_n \in S, d = \inf S \leq s_n holds (by definition of the infimum). Thus d \leq s_n and now the squeeze theorem implies that \lim_{n \to \infty} s_n = d in \R. (This first part of the proof works for any non-empty subset of S \subseteq \R for which d := \inf_{s \in S} s is finite).

For every n \in \N, the fact that s_n \in S = { | c | : c \in C } means that there exists some c_n \in C such that s_n = \left| c_n \right|. The convergence \lim_{n \to \infty} s_n = d in \R thus becomes \lim_{n \to \infty} \left|c_n\right| = d in \R. \blacksquare

Lemma 2 and Lemma 1 together prove that there exists some c \in C such that |c| = d. Lemma 1 can be used to prove uniqueness as follows. Suppose b \in C is such that |b| = d and denote the sequence b, c, b, c, b, c, \ldots by \left(c_n\right){n=1}^{\infty} so that the subsequence \left(c{2n}\right){n=1}^{\infty} of even indices is the constant sequence c, c, c, \ldots while the subsequence \left(c{2n - 1}\right){n=1}^{\infty} of odd indices is the constant sequence b, b, b, \ldots. Because \left|c_n\right| = d for every n \in \N, \lim{n \to \infty} \left|c_n\right| = \lim_{n \to \infty} d = d in \R, which shows that the sequence \left(c_n\right){n=1}^{\infty} satisfies the hypotheses of Lemma 1. Lemma 1 guarantees the existence of some x \in C such that \lim{n \to \infty} c_n = x in H. Because \left(c_n\right)_{n=1}^{\infty} converges to x, so do all of its subsequences. In particular, the subsequence c, c, c, \ldots converges to x, which implies that x = c (because limits in H are unique and this constant subsequence also converges to c). Similarly, x = b because the subsequence b, b, b, \ldots converges to both x and b. Thus b = c, which proves the theorem. \blacksquare

Consequences

H = C \oplus C^{\bot}.

If c \in C \cap C^{\bot} then 0 = \langle ,c, ,c, \rangle = |c|^2, which implies c = 0. \blacksquare

Proof that C^{\bot} is a closed vector subspace of H:

Let P := \prod_{c \in C} \mathbb{F} where \mathbb{F} is the underlying scalar field of H and define \begin{alignat}{4} L : ,& H && \to ,&& P \ & h && \mapsto,&& \left(\langle ,h, ,c, \rangle\right)_{c \in C} \ \end{alignat} which is continuous and linear because this is true of each of its coordinates h \mapsto \langle h, c \rangle. The set C^{\bot} = L^{-1}(0) = L^{-1}\left({ 0 }\right) is closed in H because { 0 } is closed in P and L : H \to P is continuous. The kernel of any linear map is a vector subspace of its domain, which is why C^{\bot} = \ker L is a vector subspace of H. \blacksquare

Let x \in H. The Hilbert projection theorem guarantees the existence of a unique m \in C such that |x - m| \leq |x - c| \text{ for all } c \in C (or equivalently, for all x - c \in x - C). Let p := x - m so that x = m + p \in C + p and it remains to show that p \in C^{\bot}. The inequality above can be rewritten as: |p| \leq |z| \quad \text{ for all } z \in x - C. Because m \in C and C is a vector space, m + C = C and C = - C, which implies that x - C = x + C = p + m + C = p + C. The previous inequality thus becomes |p| \leq |z| \quad \text{ for all } z \in p + C. or equivalently, |p| \leq |p + c| \quad \text{ for all } c \in C. But this last statement is true if and only if \langle ,p, c, \rangle = 0 every c \in C. Thus p \in C^{\bot}. \blacksquare

Properties

Expression as a global minimum

The statement and conclusion of the Hilbert projection theorem can be expressed in terms of global minimums of the following functions. Their notation will also be used to simplify certain statements.

Given a non-empty subset C \subseteq H and some x \in H, define a function d_{C,x} : C \to 0, \infty) \quad \text{ by } c \mapsto |x - c|. A [global minimum point of d_{C,x}, if one exists, is any point m in ,\operatorname{domain} d_{C,x} = C, such that d_{C,x}(m) ,\leq, d_{C,x}(c) \quad \text{ for all } c \in C, in which case d_{C,x}(m) = |m - x| is equal to the global minimum value of the function d_{C, x}, which is: \inf_{c \in C} d_{C,x}(c) = \inf_{c \in C} |x - c|.

Effects of translations and scalings

When this global minimum point m exists and is unique then denote it by \min(C, x); explicitly, the defining properties of \min(C, x) (if it exists) are: \min(C, x) \in C \quad \text { and } \quad \left|x - \min(C, x)\right| \leq |x - c| \quad \text{ for all } c \in C. The Hilbert projection theorem guarantees that this unique minimum point exists whenever C is a non-empty closed and convex subset of a Hilbert space. However, such a minimum point can also exist in non-convex or non-closed subsets as well; for instance, just as long is C is non-empty, if x \in C then \min(C, x) = x.

If C \subseteq H is a non-empty subset, s is any scalar, and x, x_0 \in H are any vectors then ,\min\left(s C + x_0, s x + x_0\right) = s \min(C, x) + x_0 which implies: \begin{alignat}{6} \min&(s C, s x) &&= s &&\min(C, x) \ \min&(- C, - x) &&= - &&\min(C, x) \ \end{alignat} \begin{alignat}{6} \min\left(C + x_0, x + x_0\right) &= \min(C, x) + x_0 \ \min\left(C - x_0, x - x_0\right) &= \min(C, x) - x_0 \ \end{alignat} \begin{alignat}{6} \min&(C, - x) {} &&= \min(C + x, 0) - x \ \min&(C, 0) ;+; x;;;; &&= \min(C + x, x) \ \min&(C - x, 0) {} &&= \min(C, x) - x \ \end{alignat}

Examples

The following counter-example demonstrates a continuous linear isomorphism A : H \to H for which ,\min(A(C), A(x)) \neq A(\min(C, x)). Endow H := \R^2 with the dot product, let x_0 := (0, 1), and for every real s \in \R, let L_s := { (x, s x) : x \in \R } be the line of slope s through the origin, where it is readily verified that \min\left(L_s, x_0\right) = \frac{s}{1+s^2}(1, s). Pick a real number r \neq 0 and define A : \R^2 \to \R^2 by A(x, y) := (r x, y) (so this map scales the x-coordinate by r while leaving the y-coordinate unchanged). Then A : \R^2 \to \R^2 is an invertible continuous linear operator that satisfies A\left(L_s\right) = L_{s/r} and A\left(x_0\right) = x_0, so that ,\min\left(A\left(L_s\right), A\left(x_0\right)\right) = \frac{s}{r^2 + s^2} (1, s) and A\left(\min\left(L_s, x_0\right)\right) = \frac{s}{1 + s^2} \left(r, s\right). Consequently, if C := L_s with s \neq 0 and if (r, s) \neq (\pm 1, 1) then ,\min(A(C), A\left(x_0\right)) \neq A\left(\min\left(C, x_0\right)\right).

Iterated projections

For any closed convex nonempty subset C \subset H, let P_C: H \to C be the projection function.

If there are multiple closed convex subsets C_1, C_2, \dots, C_n, then one can approximate the projection operator P_{C_1 \cap \dots \cap C_n} by applying P_{C_1}, P_{C_2}, \dots, P_{C_n} in sequence, then do it again and again. That is, one can approximate (P_{C_n} \dots P_{C_2} P_{C_1})^k \to P_{C_1 \cap \dots \cap C_n} as k \to \infty. The Kaczmarz method is a commonly used special case. Such methods can be computationally effective. For example, if C is a complicated shape, then projecting directly to C may be difficult. However, C can be approximated as an intersection of simple objects like half-spaces, hyperplanes, finite-dimensional subspaces, or cones.

If C is a closed subspace, then it is convex. In this case, the projection function P: H \to C is an orthogonal projection (a continuous linear operator that is self-adjoint). A classic theorem states that, if C_1, \dots, C_n are closed subspaces, then \lim_{k \to \infty}|(P_{C_1} \cdots P_{C_n})^kx - P_{C_1 \cap \dots \cap C_n} x| = 0, \quad \forall x \in H

Notes

References

Bibliography

References

"The Matrix Cookbook".
Deutsch, Frank. (2001). "The Method of Alternating Projections". Springer New York.
Netyanun, Anupan. (August 2006). "Iterated Products of Projections in Hilbert Space". The American Mathematical Monthly.

Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.

convex-analysis theorems-in-functional-analysis

Want to explore this topic further?

Ask Mako anything about Hilbert projection theorem — get instant answers, deeper analysis, and related topics.

Research with Mako

Free with your Surf account

Content sourced from Wikipedia, available under CC BY-SA 4.0.

This content may have been generated or modified by AI. CloudSurf Software LLC is not responsible for the accuracy, completeness, or reliability of AI-generated content. Always verify important information from primary sources.

Report