# Concentration of measure

## Abstract discussion

Concentration of measure can be loosely described as the following principle.

• If $f$ is a function that depends sufficiently continuously on random variables $X_1,\dots,X_n$ that are sufficiently bounded and sufficiently independent, then the probability that $f-\mathbb{E}f$ is large is very small.

## Examples

Here are some examples, which give an idea of the principle in action.

#### Example 1: sums of independent Bernoulli variables

Let $X_1,\dots,X_n$ be independent Bernoulli random variables, each equal to 1 with probability $p$. Let $f(X_1,\dots,X_n)=n^{-1}(X_1+\dots+X_n).$ Then the expectation of $f$ is $p$, and the distribution of $f$ is binomial (up to the scaling factor of $n^{-1}$). The variance of $f$ is $n^{-1}p(1-p),$ so we expect deviations from the mean of order of magnitude $n^{-1/2}.$ It turns out that deviations significantly larger than this are extremely unlikely: it can be shown that the probability that $n^{-1}(X_1+\dots+X_n)$ differs from $p$ by more than $tn^{-1/2}$ is at most $\exp(-ct^2)$ for some absolute constant $c\gt0.$ In particular, the probability that $n^{-1}(X_1+\dots+X_n)$ differs from $p$ by more than a constant amount $\delta$ is at most $\exp(-c\delta^2n).$
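To see the phenomenon numerically, here is a quick simulation (an illustrative sketch, not part of the original discussion; the function names and parameter choices are mine). It compares the empirical tail probability with the bound from Hoeffding's inequality, which gives a statement of the type above with the explicit constant $c=2$ (and a harmless factor of 2 in front for the two-sided bound):

```python
import math
import random

def deviation_probability(n, p, delta, trials=10000):
    """Estimate P(|n^{-1}(X_1+...+X_n) - p| > delta) by simulation."""
    count = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if random.random() < p)
        if abs(s / n - p) > delta:
            count += 1
    return count / trials

random.seed(0)
n, p, delta = 400, 0.5, 0.1
empirical = deviation_probability(n, p, delta)
# Hoeffding's inequality: P(|mean - p| > delta) <= 2 exp(-2 delta^2 n).
hoeffding = 2 * math.exp(-2 * delta**2 * n)
print(f"empirical tail: {empirical:.5f}, Hoeffding bound: {hoeffding:.5f}")
```

Here the true tail probability is far below even the Hoeffding bound, since $\delta=0.1$ amounts to four standard deviations when $n=400$.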

Many of the comments have implicitly used this fact. For example, if you pick a random point of $\{1,2,3\}^n,$ then the number of 1s will be a sum of $n$ independent Bernoulli variables with $p=1/3,$ as will the number of 2s and the number of 3s. Therefore, if you choose a random sequence, then by the union bound applied to these three events, the probability is at most $3\exp(-c\delta^2n)$ that at least one of these numbers will differ from $n/3$ by more than $\delta n.$
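A small simulation of this union-bound argument (an illustrative sketch; the parameter choices are arbitrary): we draw random sequences from $\{1,2,3\}^n$ and check how often some symbol count strays more than $\delta n$ from $n/3$.

```python
import random
from collections import Counter

random.seed(1)
n, delta, trials = 3000, 0.05, 1000
bad = 0
for _ in range(trials):
    counts = Counter(random.choice("123") for _ in range(n))
    # "Bad" event: at least one of the three counts deviates by more than delta*n.
    if any(abs(counts[s] - n / 3) > delta * n for s in "123"):
        bad += 1
print(f"fraction of bad sequences: {bad / trials}")
```

With these parameters $\delta n = 150$ is nearly six standard deviations of each count, so bad sequences essentially never occur.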

A considerably more general fact than this follows from Azuma's inequality.
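For reference, a standard form of the inequality (the Azuma–Hoeffding inequality): if $Z_0,Z_1,\dots,Z_n$ is a martingale whose increments satisfy $|Z_k-Z_{k-1}|\leq c_k$ for each $k$, then

```latex
\mathbb{P}\bigl[\,|Z_n - Z_0| \geq t\,\bigr]
  \;\leq\; 2\exp\!\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).
```

Example 1 is recovered, up to constants, by applying this to the martingale $Z_k=\mathbb{E}[X_1+\dots+X_n\mid X_1,\dots,X_k]$, whose increments are bounded by 1.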

#### Example 2: Lipschitz functions on the sphere

Let $x$ be a random point on the unit sphere of $\mathbb{R}^n,$ and let $f$ be a 1-Lipschitz function on the sphere (with respect to the spherical distance, say). Then it is a consequence of Lévy's isoperimetric inequality on the sphere that $\mathbb{P}[|f(x)-\mathbb{E}f|\geq\delta]\leq 2\exp(-c\delta^2n).$
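A simulation of this (an illustrative sketch, using the simplest 1-Lipschitz function $f(x)=x_1$, whose mean is 0 by symmetry): we sample uniform points on the sphere by normalising Gaussian vectors and watch the tail probability collapse as $n$ grows.

```python
import math
import random

def random_sphere_point(n):
    """Uniform random point on the unit sphere of R^n: normalise a standard Gaussian vector."""
    g = [random.gauss(0, 1) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def empirical_tail(n, delta, trials=2000):
    """Estimate P(|f(x) - Ef| >= delta) for f(x) = x_1, which is 1-Lipschitz with Ef = 0."""
    bad = sum(1 for _ in range(trials)
              if abs(random_sphere_point(n)[0]) >= delta)
    return bad / trials

random.seed(2)
results = {n: empirical_tail(n, 0.3) for n in (10, 100, 1000)}
for n, tail in results.items():
    print(f"n = {n:4d}: empirical P(|x_1| >= 0.3) = {tail}")
```

The coordinate $x_1$ has variance $1/n$, so the fixed deviation $\delta=0.3$ becomes ever more unlikely as $n$ grows, in line with the bound $2\exp(-c\delta^2n)$.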

We can relate this to the general principle above by thinking of $f$ as a function of the coordinates $X_1,X_2,\dots,X_n$ of $x$. These coordinates are by no means independent, but they do enjoy a high degree of approximate independence.

A quick sketch of how this statement relates to the isoperimetric inequality: the set of $x$ such that $f(x)$ is at most the median of $f$ has measure at least 1/2. If we now expand this set by $\delta$ (that is, we take all points within distance $\delta$ of the set), then, since $f$ is 1-Lipschitz, we obtain a set on which $f(x)$ is at most $\delta$ above the median. The isoperimetric inequality tells us that among sets of measure at least 1/2, a hemisphere has the smallest expansion, and a simple calculation shows that the measure of the complement of the $\delta$-expansion of a hemisphere is at most $\exp(-c\delta^2n).$ Therefore, with very high probability $f$ is at most its median plus $\delta.$ A similar argument gives a bound in the other direction. And if $f$ is concentrated about its median, then its mean must be close to its median.
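The last step of the sketch can be made quantitative: if $M$ is the median of $f$ and $\mathbb{P}[|f-M|\geq t]\leq 2\exp(-ct^2n)$ for every $t\geq 0$, then integrating the tail gives

```latex
|\mathbb{E}f - M| \;\leq\; \mathbb{E}|f - M|
  \;=\; \int_0^\infty \mathbb{P}\bigl[\,|f - M| \geq t\,\bigr]\,dt
  \;\leq\; \int_0^\infty 2e^{-ct^2 n}\,dt
  \;=\; \sqrt{\pi/(cn)},
```

so the mean and the median differ by at most $O(n^{-1/2})$, and concentration about the median transfers to concentration about the mean with a slightly worse constant.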