CL003 – Continuity

May 17, 2025

Table of Contents

Definition of Continuity
Lipschitz Continuity
References

When is a function said to be continuous? What conditions must it satisfy?

Definition of Continuity

A function \(f: \mathbb{R}^n \to \mathbb{R}^m\) is continuous at \(\mathbf{x}_0 \in \text{dom } f\) if the following three conditions hold:

\(f\) is defined at \(\mathbf{x}_0\).
The limit \(\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x})\) exists.
The limit equals the function value, \(\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = f(\mathbf{x}_0)\).

A function is continuous if it is continuous at every point of its domain. An alternate definition of continuity is as follows.

\(\delta-\epsilon\) definition of continuity:

A function \(f: \mathbb{R}^n \to \mathbb{R}^m\) is continuous at \(\mathbf{x}_0 \in \text{dom } f\) if for all \(\epsilon > 0\) there exists a \(\delta > 0\) such that

\[\mathbf{x} \in \text{dom } f, \hspace{1cm} \|\mathbf{x} - \mathbf{x}_0\|_2 \leq \delta \implies \|f(\mathbf{x}) - f(\mathbf{x}_0) \|_2 \leq \epsilon\]
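As a small illustration of this definition, the sketch below takes \(f(x) = 2x + 3\) (an illustrative choice) and uses \(\delta = \epsilon / 2\), which works because \(|f(x) - f(x_0)| = 2|x - x_0|\). The helper names `delta_for` and `check` are hypothetical, and random sampling can only fail to find a counterexample; it is not a proof.

```python
import random

def f(x):
    return 2.0 * x + 3.0

def delta_for(eps):
    # |f(x) - f(x0)| = 2 |x - x0|, so delta = eps / 2 is enough for this f.
    return eps / 2.0

def check(x0, eps, trials=1000, seed=0):
    """Sample points with |x - x0| <= delta and test |f(x) - f(x0)| <= eps."""
    rng = random.Random(seed)
    delta = delta_for(eps)
    for _ in range(trials):
        x = x0 + rng.uniform(-delta, delta)
        # Small slack absorbs floating-point rounding.
        if abs(f(x) - f(x0)) > eps + 1e-12:
            return False
    return True

print(check(1.0, 0.1), check(-3.0, 1e-3))
```

The same \(\delta\) works at every \(x_0\) here, which is a special feature of this function rather than a requirement of the definition.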

Continuity can also be described in terms of sequences: whenever a sequence \(\mathbf{x}_1, \mathbf{x}_2, \dots \in \text{dom } f\) converges to a point \(\mathbf{x}_0 \in \text{dom } f\), the sequence \(f(\mathbf{x}_1), f(\mathbf{x}_2), \dots\) converges to \(f(\mathbf{x}_0)\), i.e.,

\[f(\lim_{n \to \infty} \mathbf{x}_n) = \lim_{n \to \infty} f(\mathbf{x}_n)\]

A function \(f\) is continuous at \(\mathbf{x}_0 \in \text{dom } f\) iff for every sequence \(\{\mathbf{x}_n\} \subset \text{dom } f\) with \(\{\mathbf{x}_n\} \rightarrow \mathbf{x}_0\), we have \(\{f(\mathbf{x}_n)\}\rightarrow f(\mathbf{x}_0)\).

Take any sequence \(\{\mathbf{x}_n\}\) in \(\text{dom } f\) that converges to this \(\mathbf{x}_0\).
For each element of the sequence, compute the corresponding function value. This gives a sequence \(\{f(\mathbf{x}_n)\}\) of vectors in \(\mathbb{R}^m\) (real numbers when \(m = 1\)), which should converge to \(f(\mathbf{x}_0)\). If this happens for every sequence \(\{\mathbf{x}_n\}\) converging to \(\mathbf{x}_0\), then we say the function \(f\) is continuous at \(\mathbf{x}_0\).
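The sequence test can be tried numerically. The sketch below uses one particular sequence, \(x_k = x_0 + 2^{-k}\), on two illustrative functions that are assumptions of this example: `square`, which is continuous, and `step`, which jumps at 0. A single failing sequence is enough to rule out continuity at \(x_0\), whereas a single converging one proves nothing on its own.

```python
def sequence_images(f, x0, num_terms=8):
    """Return f(x_k) along x_k = x0 + 2**-k, a sequence converging to x0."""
    return [f(x0 + 2.0 ** -k) for k in range(1, num_terms + 1)]

def square(x):
    return x * x

def step(x):
    # Jumps at 0: step(0) = 0, but step(x) = 1 for every x > 0.
    return 1.0 if x > 0 else 0.0

# Continuous case: the distance |f(x_k) - f(x0)| shrinks toward 0.
gap_square = [abs(v - square(1.0)) for v in sequence_images(square, 1.0)]

# Discontinuous case: the distance stays at 1, so f(x_k) does not reach f(0).
gap_step = [abs(v - step(0.0)) for v in sequence_images(step, 0.0)]

print(gap_square[-1], gap_step[-1])
```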

Lipschitz Continuity

This is a stronger notion of continuity: all Lipschitz continuous functions are continuous, but not all continuous functions are Lipschitz continuous. A function \(f: \mathbb{R}^n \rightarrow \mathbb{R}^m\) is \(L\)-Lipschitz continuous iff

\[\| f(\mathbf{x}) - f(\mathbf{y}) \| \leq L \| \mathbf{x} - \mathbf{y} \| \hspace{1cm} \forall \mathbf{x}, \mathbf{y} \in \text{dom } f\]

where \(L\) is a non-negative real number, \(0 \leq L < \infty\). Note that the norms on the LHS and RHS need not be the same, since the domain and codomain may be different spaces.

Take any two points \(\mathbf{x}, \mathbf{y} \in \text{dom } f\) and look at the difference of their function values. If this difference is never too large relative to the distance between the points, i.e., if we can bound the change in the function by a constant multiple of the distance between the points, then the function is Lipschitz continuous.

The constant \(L\) bounds how fast the function can change: the larger \(L\) is, the steeper the function is allowed to be. If \(L\) is zero, the function does not change at all, i.e., it is constant.

Example 01:

Let \(f(x) = 2x+3\). To show that the function is Lipschitz continuous, we need to find a bound \(L\).

\[|f(x) - f(y) | = |2x+3 - 2y-3| = |2x-2y| = 2|x-y|\]

Then \(|f(x) - f(y)| \leq 2|x-y|\), so the function is 2-Lipschitz continuous. As we can observe, the Lipschitz constant of a simple linear function of one variable is just the absolute value of its slope.

Note: Any larger constant also satisfies the inequality, e.g., \(|f(x) - f(y)| \leq 4|x-y|\) or \(|f(x) - f(y)| \leq 5|x-y|\). But here we are looking for the least \(L\).
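A rough numerical counterpart to this calculation is to sample pairs of points and record the largest observed ratio \(|f(x) - f(y)| / |x - y|\). The function name `empirical_lipschitz`, the sampling interval and the number of pairs below are assumptions made for illustration. In general such sampling only gives a lower bound on the least Lipschitz constant.

```python
import random

def empirical_lipschitz(f, low, high, pairs=10_000, seed=0):
    """Largest observed |f(x) - f(y)| / |x - y| over random pairs in [low, high].

    This never exceeds the least Lipschitz constant (up to rounding),
    so it is a lower bound, not a certificate.
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(pairs):
        x, y = rng.uniform(low, high), rng.uniform(low, high)
        if x != y:
            best = max(best, abs(f(x) - f(y)) / abs(x - y))
    return best

estimate = empirical_lipschitz(lambda x: 2 * x + 3, -10.0, 10.0)
print(estimate)
```

For \(f(x) = 2x + 3\) every pair gives the same ratio, so the estimate should agree with the slope up to floating-point rounding.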

Example 02:

Let \(f(\mathbf{x}) = \mathbf{w}^\top \mathbf{x}\), which is a linear function that takes an \(n\)-dimensional input and outputs a real number. To show that the function is Lipschitz continuous, we need to find a bound \(L\).

\[|f(\mathbf{x}) - f(\mathbf{y})| = | \mathbf{w}^\top \mathbf{x} - \mathbf{w}^\top \mathbf{y}| = | \mathbf{w}^\top (\mathbf{x-y})| \leq \|\mathbf{w}\| \|\mathbf{x-y}\|\]

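The last inequality is the Cauchy-Schwarz inequality, with \(\|\cdot\|\) the Euclidean norm. Hence \(f\) is Lipschitz continuous with constant \(L = \|\mathbf{w}\|_2\).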
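The bound can be checked on random inputs. In the sketch below the weight vector `w` is a hypothetical choice with \(\|\mathbf{w}\|_2 = 5\), and `max_ratio` is an illustrative helper; the observed ratios should stay at or below \(\|\mathbf{w}\|_2\), with equality when \(\mathbf{x} - \mathbf{y}\) is parallel to \(\mathbf{w}\).

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm2(u):
    return math.sqrt(dot(u, u))

def max_ratio(w, trials=5000, seed=0):
    """Largest observed |w.x - w.y| / ||x - y||_2 over random pairs."""
    rng = random.Random(seed)
    n = len(w)
    best = 0.0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        d = [a - b for a, b in zip(x, y)]
        best = max(best, abs(dot(w, x) - dot(w, y)) / norm2(d))
    return best

w = [3.0, -4.0]                      # hypothetical weights, ||w||_2 = 5
ratio = max_ratio(w)                 # stays at or below ||w||_2

# Equality case: x = w and y = 0, so x - y is parallel to w.
tight = abs(dot(w, w)) / norm2(w)
print(ratio, tight, norm2(w))
```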

Example 03:

Let \(f(\mathbf{x}) = \mathbf{A} \mathbf{x}\), which is a linear function that takes an \(n\)-dimensional input and outputs an \(m\)-dimensional vector, so \(\mathbf{A} \in \mathbb{R}^{m \times n}\). The argument below works for any such matrix; it does not need \(\mathbf{A}\) to be symmetric. To show that the function is Lipschitz continuous, we need to find a bound \(L\), which is a scalar.

\[\|\mathbf{Ax} - \mathbf{Ay} \| = \|\mathbf{A} (\mathbf{x} - \mathbf{y}) \|\]

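By the SVD, we can write \(\mathbf{A} = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^\top\), where \(r\) is the rank of \(\mathbf{A}\), the \(\sigma_i\) are its nonzero singular values, and the \(\mathbf{u}_i\) and \(\mathbf{v}_i\) are orthonormal left and right singular vectors. Extend \(\mathbf{v}_1, \dots, \mathbf{v}_r\) to an orthonormal basis \(\mathbf{v}_1, \dots, \mathbf{v}_n\) of \(\mathbb{R}^n\).
\((\mathbf{x} - \mathbf{y})\) is a vector in the input space, \(\mathbb{R}^n\). So it can be written as \(\mathbf{x} - \mathbf{y} = \sum_{j=1}^n \lambda_j \mathbf{v}_j\).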

\[\begin{align*} \mathbf{A} (\mathbf{x} - \mathbf{y}) & = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^\top \sum_{j=1}^n \lambda_j \mathbf{v}_j \\ & = \sum_{i=1}^r \sum_{j=1}^n \lambda_j \sigma_i \mathbf{u}_i \mathbf{v}_i^\top \mathbf{v}_j \\ & = \sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i && \text{ as the } \mathbf{v}_j \text{ are orthonormal } \\ \end{align*}\]

This is just a linear combination of the \(\mathbf{u}_i\)'s. Since the \(\mathbf{u}_i\) are orthonormal as well,

\[\begin{align*} \| \mathbf{A} (\mathbf{x} - \mathbf{y}) \| & = \|\sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i \| \\ & = \sqrt{ \langle \sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i, \sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i \rangle} = \sqrt{ (\sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i)^\top (\sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i )} \\ & = \sqrt{ \sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i^\top (\sum_{i=1}^r \lambda_i \sigma_i \mathbf{u}_i )} = \sqrt{ \sum_{i=1}^r \lambda_i^2 \sigma_i^2 } \\ \end{align*}\]

Here the \(\lambda_i\) depend on the vector \((\mathbf{x} - \mathbf{y})\), while the \(\sigma_i\) are fixed by \(\mathbf{A}\). Bounding every \(\sigma_i\) by the largest one, \(\sigma_{max}\), gives

\[\begin{align*} \| \mathbf{A} (\mathbf{x} - \mathbf{y}) \| & \leq \sqrt{ \sum_{i=1}^r \lambda_i^2 \sigma^2_{max} } \\ & = \sqrt{ \sigma^2_{max} \sum_{i=1}^r \lambda_i^2} = \sigma_{max} \sqrt{ \sum_{i=1}^r \lambda_i^2} \end{align*}\]

where \(\sqrt{ \sigma^2_{max}} = \sigma_{max}\) because singular values are always non-negative.

Note: For example, \(\sqrt{3\lambda_1^2 + 4\lambda_2^2} \leq \sqrt{4\lambda_1^2 + 4\lambda_2^2} = \sqrt{4} \sqrt{\lambda_1^2 + \lambda_2^2}\).

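Finally, because the \(\mathbf{v}_j\) are orthonormal, \(\|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_{j=1}^n \lambda_j^2} \geq \sqrt{\sum_{i=1}^r \lambda_i^2}\). Hence, we are able to show \(\| \mathbf{A} (\mathbf{x} - \mathbf{y}) \| \leq \sigma_{max} \|\mathbf{x} - \mathbf{y} \|\). Thus the Lipschitz constant is the largest singular value of the matrix \(\mathbf{A}\).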
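To see this bound numerically without a linear-algebra library, one option is to estimate \(\sigma_{max}\) by power iteration on \(\mathbf{A}^\top \mathbf{A}\) and compare it with sampled ratios \(\|\mathbf{A}\mathbf{d}\| / \|\mathbf{d}\|\), where \(\mathbf{d}\) plays the role of \(\mathbf{x} - \mathbf{y}\). This is a sketch under assumptions: power iteration is a swapped-in technique (the text uses the SVD directly), the helper names are hypothetical, and the \(3 \times 2\) matrix is an illustrative choice whose singular values are 4 and 3.

```python
import math
import random

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm2(x):
    return math.sqrt(sum(v * v for v in x))

def sigma_max(A, iters=500, seed=0):
    """Estimate the largest singular value by power iteration on A^T A."""
    rng = random.Random(seed)
    At = transpose(A)
    v = [rng.gauss(0, 1) for _ in range(len(A[0]))]
    for _ in range(iters):
        v = matvec(At, matvec(A, v))
        scale = norm2(v)
        v = [c / scale for c in v]       # keep v at unit length
    return norm2(matvec(A, v))           # ||A v|| for the unit vector v

def worst_ratio(A, trials=2000, seed=1):
    """Largest observed ||A d|| / ||d|| over random directions d."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        d = [rng.gauss(0, 1) for _ in range(n)]
        best = max(best, norm2(matvec(A, d)) / norm2(d))
    return best

A = [[3.0, 0.0], [0.0, 4.0], [0.0, 0.0]]   # hypothetical 3x2 example
L = sigma_max(A)
print(L, worst_ratio(A))
```

Power iteration converges slowly when the two largest singular values are close, so the iteration count here is a convenience for this small example rather than a general recommendation.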

Example 04:

Let \(f(x) = x^2\) on \(\mathbb{R}\). To check whether the function is Lipschitz continuous, we look for a bound \(L\).

\(|x^2 -y^2| = |x+y| |x-y|\). No single constant \(L\) can bound this quantity by \(L|x-y|\) for all \(x, y\), i.e., in general

\[|x+y| |x-y| \not \leq L | x-y|\]

For any given \(L\), we can pick \(x \neq y\) such that \(|x+y| > L\) (for example \(x = L\), \(y = L + 1\)). So, by contradiction, this function \(f\) is not Lipschitz continuous on \(\mathbb{R}\). This happens because the function grows too steeply compared to how \(x\) grows.

But if we restrict the domain of the function to \(x \in [-1,1]\), then the function is Lipschitz continuous because \(|x+y|\) is upper bounded by 2. So the Lipschitz constant is 2.
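The contrast between the two domains shows up directly in the ratio \(|x^2 - y^2| / |x - y| = |x + y|\). The sketch below evaluates it at a few points far from the origin and on a grid over \([-1, 1]\); the helper name `slope_ratio`, the sample points and the grid spacing are illustrative assumptions.

```python
def slope_ratio(x, y):
    """|x^2 - y^2| / |x - y|, which equals |x + y| for x != y."""
    return abs(x * x - y * y) / abs(x - y)

# On the whole real line the ratio keeps growing as the points move outward.
unbounded = [slope_ratio(t, t + 1.0) for t in (1.0, 10.0, 100.0, 1000.0)]

# On [-1, 1] the ratio never exceeds 2.
grid = [i / 50.0 - 1.0 for i in range(101)]        # 101 points from -1 to 1
bounded = max(slope_ratio(x, y) for x in grid for y in grid if x != y)

print(unbounded, bounded)
```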

Theorem that helps us find the Lipschitz constant:

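If \(f\) is differentiable on a convex domain (for example, an interval) and the magnitude of its derivative is bounded by some number, \(|f'(\mathbf{x})| \leq L\) for all \(\mathbf{x} \in \text{dom } f\) (for a real-valued function of several variables, read this as \(\|\nabla f(\mathbf{x})\|_2 \leq L\)), then \(f\) is \(L\)-Lipschitz continuous, i.e., \(L\) is a Lipschitz constant. Conversely, if \(f\) is differentiable and \(L\)-Lipschitz continuous, then the magnitude of its derivative is at most \(L\) everywhere.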
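As a quick check of this idea against Example 04, the sketch below approximates \(\sup |f'(x)|\) for \(f(x) = x^2\) on \([-1, 1]\), where \(f'(x) = 2x\), by evaluating the derivative on a uniform grid. The helper name `sup_abs_derivative` and the grid size are assumptions; a grid maximum only approximates the supremum from below, although here the maximum sits at the endpoints, which the grid includes.

```python
def sup_abs_derivative(df, low, high, samples=10_001):
    """Approximate sup |f'(x)| on [low, high] using a uniform grid."""
    step = (high - low) / (samples - 1)
    return max(abs(df(low + i * step)) for i in range(samples))

# f(x) = x^2 on [-1, 1] has derivative f'(x) = 2x.
L = sup_abs_derivative(lambda x: 2.0 * x, -1.0, 1.0)
print(L)
```

The result should agree, up to rounding, with the constant 2 found in Example 04.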

Example 05:

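Let \(f(x) = x \log x\) where \(x >0\). Then \(f'(x) = 1 + \log x\). For \(f\) to be Lipschitz continuous we would need a constant \(L\) such that \(|1 + \log x| \leq L\) for all \(x>0\), because a differentiable \(L\)-Lipschitz function has \(|f'(x)| \leq L\) everywhere. We cannot find such an \(L\), since \(|1 + \log x| \to \infty\) both as \(x \to 0^+\) and as \(x \to \infty\). Hence the function \(f(x)\) is not Lipschitz continuous.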
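The unbounded derivative is easy to see numerically. The sketch below evaluates \(|1 + \log x|\) at points approaching 0 and at points growing large; it assumes \(\log\) is the natural logarithm, and the helper name `abs_derivative` and the sample points are illustrative.

```python
import math

def abs_derivative(x):
    """|f'(x)| for f(x) = x log x, i.e. |1 + log x| with the natural log."""
    return abs(1.0 + math.log(x))

# Approaching 0 from the right and moving out toward infinity.
near_zero = [abs_derivative(10.0 ** -k) for k in (2, 4, 8, 16)]
far_out = [abs_derivative(10.0 ** k) for k in (2, 4, 8, 16)]

print(near_zero)
print(far_out)
```

Both lists keep increasing, so no finite \(L\) can bound the derivative on all of \(x > 0\).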

References

Boyd, S. P., & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.
