# Fourier-analytic proof of Sperner

## Fourier analysis

We identify $^n$ with the power set $2^{[n]}$ of [n]. Given any function $f: 2^{[n]} \to {\Bbb R}$, we define the Fourier transform $\hat f: 2^{[n]} \to {\Bbb R}$ by the formula

$\hat f(A) := {\Bbb E}_{X \in 2^{[n]}} (-1)^{|A \cap X|} f(X)$;

we have the Plancherel formula

$\sum_{A \in 2^{[n]}} |\hat f(A)|^2 = {\Bbb E}_{X \in 2^{[n]}} |f(X)|^2$ (1)

and the inversion formula

$f(X) = \sum_{A \in 2^{[n]}} (-1)^{|A \cap X|} \hat f(A)$. (2)

For any scale $1 \lt r \lt n$, we consider the bilinear form

$B_r( f, g ) := {\Bbb E} f(X) g(Y)$ (3)

where X is drawn randomly from $2^{[n]}$, and Y is formed from X by setting $j \in Y$ whenever $j \in X$, and when $j \not \in X$, setting $j \in Y$ with probability $r/n$. To avoid significant distortion in the measure, we will be working in the regime $1 \leq r \ll \sqrt{n}$. Note that if $B_r(1_A, 1_A)$ is large, and r is not too small, then A will contain many combinatorial lines (since Y will be strictly larger than X with high probability); thus it is of interest to get lower bounds on $B_r$.

We also define the influence Inf(f) as

$Inf(f) := {\Bbb E} |f(X) - f(X')|^2$ (4)

where X' is formed from X by randomly flipping one bit (from 0 to 1, or vice versa). In Fourier space, we have

$Inf(f) = 4\sum_{A \in 2^{[n]}} \frac{|A|}{n} |\hat f(A)|^2$. (5)

## Dichotomy between structure and randomness

For any $f: 2^{[n]} \to {\Bbb R}$, define the uniformity norm $\|f\|_{U(r)}$ at scale r by the formula

$\|f\|_{U(r)} := \sup_g \max( |B_r(f,g)|, |B_r(g,f)| )$ (6)

where g ranges over all functions $g: 2^{[n]} \to [-1,1]$. Functions which are small in the uniform norm are thus "negligible" for the purposes of computing $B_r$ and can be discarded. (This is analogous to the theory of counting arithmetic progressions of length three, in which functions with small Fourier coefficients (or small Gowers $U^2$ norm) can be discarded.)

Lemma 1 Let $f: 2^{[n]} \to [-1,1]$ be such that $\|f\|_{U(r)} \geq \eta$, where $1 \leq r \ll \sqrt{n}$ and $\eta \gt 0$. Then there exists a function $F: 2^{[n]} \to [-1,1]$ of influence $Inf(F) \ll_\eta \frac{1}{r}$ such that
$\langle f, F \rangle := {\Bbb E}_{X \in 2^{[n]}} f(X) F(X) \gg_\eta 1.$

It seems that this is more or less proven in Ryan's notes. (May be worth redoing it here on the wiki. Note that the scale r here basically corresponds to $\epsilon n$ in Ryan's notes.)

## Structure theorem

Now we use the general "energy increment" argument (see e.g. Terry's paper on this) to obtain a structural theorem.

Corollary 2 (Koopman-von Neumann type theorem) Let $f: 2^{[n]} \to [0,1]$ and $0 \lt \eta \lt 1$. Then there exists $k = k(\eta)$ such that whenever $\sqrt{n} \gg r_k \gt \ldots \gt r_1 \gt 1$ be a sequence of scales. Then there exists $1 \leq i \leq k$ and a decomposition
$f = f_{U^\perp} + f_U$ (7)
where $f_{U^\perp}, f_U$ take values in [-1,1],
$Inf( f_{U^\perp} ) \ll_{\eta} 1/r_i$ (8)
and
$\| f_U \|_{U(r_{i+1})} \leq \eta.$ (9)
Furthermore, $f_{U^\perp}$ is non-negative and
${\Bbb E} f = {\Bbb E} f_{U^\perp}.$ (10)

Proof (sketch) It's likely that there is now a slick proof of this type of thing using the duality arguments of Gowers or of Reingold-Trevisan-Tulsiani-Vadhan. But here is the "old school" approach from my long AP paper with Ben, running the energy increment algorithm:

1. Initialise i=1, and let ${\mathcal B}$ be the trivial partition of $^n$.
2. Let $f_{U^\perp} := {\Bbb E}(f|{\mathcal B})$ be the conditional expectation of f with respect to the partition; initially, this is just the constant function ${\Bbb E} f$, but becomes more complicated as the partition gets finer. Set $f_U := f - f_{U^\perp}$
3. If $\| f_U \|_{U(r_{i+1})} \leq \eta$ then STOP.
4. Otherwise, we apply Lemma 1 to find a function F of influence $O_\eta(1/r_{i+1})$ that $f_U$ correlates with. Partition F into level sets (discretising in multiples of $\eta$ or so, randomly shifting the cut-points to avoid edge effects) and use this to augment the partition ${\mathcal B}$. In doing so, the energy ${\Bbb E} |f_{U^\perp}|^2$ increases by some non-negligible amount $c(\eta) \gt 0$, thanks to Pythagoras' theorem: this is why this process will stop before k steps.
A certain tedious amount of verification is needed to show that all the hypotheses are satisfied. One technical step is that one needs to use the Weierstrass approximation theorem to approximate $f_{U^\perp}$ by a polynomial combination of low-influence functions, and one needs to show that polynomial combinations of low-influence functions are low-influence. The complexity of the polynomial is controlled by some function of $\eta$ and this explains the losses in that parameter in (8). $\Box$
Now we prove DHJ(2). Let $f = 1_A$ have density $\delta$. We let $\eta$ be a small number depending on $\delta$. Now we pick some scales $1 \leq r_1 \lt \ldots \lt r_k$ between 1 and $\sqrt{n}$ (the exact choices are not so important, so long as the scales are widely separated), and apply the above corollary, to decompose $f = f_{U^\perp} + f_U$, where $f_U$ is uniform at some scale $r_i$ and $f_{U^\perp}$ is low-influence at a coarser scale $r_{i+1}$.
We then look at $B_{r_i}(f, f)$. Decomposing into four pieces and using the uniformity of $f_U$, this expression is equal to $B_{r_i}(f_{U^\perp}, f_{U^\perp}) + O(\eta)$. But the low influence of $f_{U^\perp}$ means that when one randomly flips a bit from a 0 to a 1, $f_{U^\perp}$ changes by $O_\eta(1/r_{i+1})$ on the average. Iterating this $r_i$ times, we see that $B_{r_i}(f_{U^\perp}, f_{U^\perp}) = {\Bbb E}( f_{U^\perp}^2 ) + O_\eta( r_i / r_{i+1} )$. But $f_{U^\perp}$ has density $\delta$, and so the main term is at least $\delta^2$. Choosing $\eta$ small enough and the $r_i$ widely separated enough we obtain a nontrivial lower bound on $B_{r_i}(f_{U^\perp}, f_{U^\perp})$, which gives DHJ(2).