The Stone-Weierstraß theorem#
( Part of the ‘23 lecture Building Blocks of Modern Machine Learning: (Self-)Attention and Diffusion Models. )
(Stone-Weierstraß)
Let \(K \subset \mathbb R^d\) be a compact set. Consider the \(\mathbb R\)-algebra \(C(K,\mathbb R)\) of continuous, real-valued functions on it. Let \(\mathcal A \subset C(K,\mathbb R)\) be a subalgebra, satisfying
\(\mathcal A\) contains the constant function \(1\).
\(\mathcal A\) separates points, that is, for every \(x,y \in K\), \(x\not= y\) there exists \(\phi \in \mathcal A\) such that
Then: \(\mathcal A\) is dense in \(C(K,\mathbb R)\) with respect to the supremum norm
That is, for every \(f \in C(K,\mathbb R)\), every \(\epsilon > 0\) there exists a \(g \in \mathcal A\) such that
We will follow the beautiful proof of [BD81]. A key ingredient is the following inequality.
(Bernoulli’s inequality)
For \(z \ge -1\) and \(m \in \mathbb N_{\ge 1}\), we have
Proof
For \(z \ge 0\) the inequality follows from
For \(z=-1\) the inequality reads as
which is true for \(m \ge 1\). To prove the inequality on \([-1,0]\) it is now sufficient to show that the derivative of
does not change sign on \([-1,0]\). And, indeed
Using this lemma we can show that, on \([0,1]\), we can approximate a certain step-function arbitrarily close by a polynomial.
Given \(\delta \in (0,1)\), \(\epsilon > 0\), there exists a polynomial \(\Phi\) in one variable such that
\(\Phi(y) \in [0,1]\) for \(y \in [0,1]\).
\(\Phi(y) \in [1-\epsilon,1]\) for \(y \in [0,\frac\delta2]\).
\(\Phi(y) \in [0,\epsilon]\) for \(y \in [\delta,1]\).
Proof
Pick \(k \in \mathbb N_{\ge 1}\) such that
Given \(m \in \mathbb N_{\ge 1}\) define
Then
Further, for \(y \in [0,\frac\delta2]\) and using Bernoulli’s inequality
We observe that since \(k\delta/2 < 1\), this tends to \(1\) (independent of \(y \in [0,\frac\delta2]\)).
For \(y \in [\delta,1]\)
Here we used Bernoulli’s inequality for the second line. We observe that since \(k\delta > 1\), this tends to \(0\) (independent of \(y \in [\delta,1]\)).
Together, we see that we can choose \(m\) large enough such that \(\Phi := \Phi_m\) satisfies the required bounds.
Under the assumptions of the theorem, for every point \(x_0 \in K\), every open neighborhood \(U \subset K\) of \(x_0\) there is a smaller neighborhood \(U' \subset U\) of \(x_0\) such that for every \(\epsilon > 0\) there is a \(\phi \in \mathcal A\) with
Proof
For every \(x \in K\) there is \(\psi_x \in \mathcal A\) with
Define \(\psi_x' := \psi_x - \psi_x(x_0) \in \mathcal A\). Then
Define
Then
Define
Then \(V_x\) is an open neighborhood of \(x\). Hence
Seince \(K\setminus U\) is compact, there exists a finite family of points \(x_1, \dots, x_n \in K\) with
Define
Then
Define
By compactness of \(K\setminus U\) there is a \(\delta > 0\) such that
Define
Then \(U' \subset U\) and
Using Lemma 2, there is a Polynomial \(\Phi\) such that
Hence, \(\phi := \Phi(p) \in \mathcal A\) does the job.
We can now prove the theorem.
Proof of Stone-Weierstraß.
Claim: for all closed subsets \(A,B \subset K\), \(A \cap B = \emptyset\), for every \(0< \epsilon <1\) there is \(\phi \in \mathcal A\) such that
Indeed, let \(U := K \setminus B\). For \(x \in A\) consider the neighborhood \(U' = U'_x\) of \(x\) (Lemma 3). Since \(A\) is compact, there exist \(x_1, \dots, x_m \in A\) such that
By the choice of \(U'_{x_i}\) ther exists \(\phi_i \in \mathcal A\) such that
Define
Then
This proves the claim.
Let now \(f \in C(K,\mathbb R)\) be given. By considering \(f + ||f||_\infty\) we can assume that \(f \ge 0\). Let \(0 < \epsilon < 1/3\) be given. Choose \(n \in \mathbb N_{\ge 1}\) with
Define for \(j=0,\dots,n\)
Then
Using the claim, we can take \(\phi_i \in \mathcal A\) with
Set
For \(x \in K\) there exists a \(j \in \mathbb N_{\ge 1}\) such that
In particular
and
Moreover, \(x \in B_i\) for every \(i \ge j-2\) which implies
Then
If \(j \ge 2\), then
If \(j=1\), then
is trivially true. Hence
as desired.
Bruno Brosowski and Frank Deutsch. An elementary proof of the stone-weierstrass theorem. Proceedings of the American Mathematical Society, 81(1):89–92, 1981.