# Difference between revisions of "An outline of a density-increment argument"

(→Notation and definitions) |
|||

(33 intermediate revisions by 3 users not shown) | |||

Line 10: | Line 10: | ||

In general, if X is a finite set and A and B are subsets of X, we define ''the density of'' A ''in'' B to be <math>|A\cap B|/|B|.</math> | In general, if X is a finite set and A and B are subsets of X, we define ''the density of'' A ''in'' B to be <math>|A\cap B|/|B|.</math> | ||

+ | |||

+ | A ''sequence fragment'' in <math>[3]^n</math> is a subset <math>Z\subset[n]</math> and a function <math>\sigma</math> from Z to <math>\{1,2,3\}.</math> An ''extension'' of the sequence fragment <math>(Z,\sigma)</math> is a sequence <math>x\in[3]^n</math> such that the coordinates of the restriction of x to Z are given by <math>\sigma.</math> For example, if n=5, then <math>**3*2</math> is a sequence fragment, and <math>12332</math> is one of its eight extensions. | ||

==A very broad overview of the argument== | ==A very broad overview of the argument== | ||

Line 19: | Line 21: | ||

However, a different principle comes to the rescue. It is the observation that there are ''two'' kinds of density increment that will be good enough for us. One is a density increment on a subspace, as we already know. The other is a density increment on a 12-set (obtained after passing to a subspace if necessary). Let us spell this out in more detail. Suppose that <math>\mathcal{A}</math> is a line-free set of density <math>\delta</math> and we want to obtain a contradiction. We know two things. First, we can find a 12-subset of a subspace in which <math>\mathcal{A}</math> has increased density <math>\delta+\eta</math>, and secondly, if we can ever prove that there is increased density <math>\delta+\eta/2</math> (say) on a subspace then we are done. However, we can add a third principle, which is that if we can find a 12-subset of a subspace on which the density goes up from <math>\delta+\eta</math> then we are again done. Why? Because we are in the same position as we were in before, but with a slightly higher density, so we can iterate. | However, a different principle comes to the rescue. It is the observation that there are ''two'' kinds of density increment that will be good enough for us. One is a density increment on a subspace, as we already know. The other is a density increment on a 12-set (obtained after passing to a subspace if necessary). Let us spell this out in more detail. Suppose that <math>\mathcal{A}</math> is a line-free set of density <math>\delta</math> and we want to obtain a contradiction. We know two things. First, we can find a 12-subset of a subspace in which <math>\mathcal{A}</math> has increased density <math>\delta+\eta</math>, and secondly, if we can ever prove that there is increased density <math>\delta+\eta/2</math> (say) on a subspace then we are done. However, we can add a third principle, which is that if we can find a 12-subset of a subspace on which the density goes up from <math>\delta+\eta</math> then we are again done. Why? Because we are in the same position as we were in before, but with a slightly higher density, so we can iterate. | ||

− | To be | + | Just to be explicit about this, let us briefly describe how the iteration would go. We start with a line-free set <math>\mathcal{A}</math> of density <math>\delta</math> and hope ultimately to obtain a contradiction. To do this, we adopt as a more immediate aim to find a subspace S inside which <math>\mathcal{A}</math> has density at least <math>\delta+\eta(\delta)/2.</math> As a first step, we find a 12-subset of a subspace, inside which <math>\mathcal{A}</math> has density at least <math>\delta+\eta.</math> We now focus our attention on the subproblem of trying to use this to obtain a subspace inside which <math>\mathcal{A}</math> has density at least <math>\delta+\eta/2.</math> But here again we can use the same sort of idea. We do not have to find such a subspace directly: instead, we could prove that if no such subspace exists (so the density of <math>\mathcal{A}</math> inside all subspaces is at most <math>\delta+\eta/2,)</math> then we can find a 12-subset of some further subspace, inside which <math>\mathcal{A}</math> has density at least <math>\delta+\eta+\theta(\delta).</math> And then we can try again to find a subspace where the density is at least <math>\delta+\eta/2.</math> Repeating this argument leads to an iteration within the main iteration, which will eventually terminate with a density increase on a subspace (since the density within a 12-set cannot exceed 1). And then we have completed another step of the main iteration, so we are done. |

+ | |||

+ | We shall refer to the main iteration (where we aim for a density increase on a subspace) as the ''outer iteration'' and the other iteration (where we aim for a density increase on a 12-set) as the ''inner iteration''. | ||

+ | |||

+ | ==A brief description of the main steps== | ||

+ | |||

+ | '''Step 1.''' Let <math>\mathcal{A}</math> be a line-free set of uniform density <math>\delta.</math> Then there is a subspace S of dimension tending to infinity with n and a 12-subset <math>\mathcal{U}\otimes\mathcal{V}</math> of S such that the [[equal-slices_measure|equal-slices density]] of <math>\mathcal{A}</math> inside <math>\mathcal{U}\otimes\mathcal{V}</math> is at least <math>\delta+2\eta,</math> where <math>\eta=\eta(\delta).</math> Moreover, the equal-slices density of <math>\mathcal{U}\otimes\mathcal{V}</math> is at least <math>\eta(\delta).</math> (The [[Line-free_sets_correlate_locally_with_complexity-1_sets|proof on this wiki]] gives <math>\eta(\delta)=\delta^2/27.)</math> | ||

+ | |||

+ | '''Step 2.''' By passing to a further subspace, we can obtain the same conclusion except that now the density increase drops from <math>\eta</math> to | ||

+ | <math>\eta/2</math> and the density of <math>\mathcal{U}\otimes\mathcal{V}</math> drops to <math>\theta(\delta).</math> | ||

+ | |||

+ | '''Step 3.''' Consider the following distribution on points <math>(U,V,W)\in[3]^n.</math> You first choose a random pair <math>(U_0,V_0)</math> of disjoint subsets of [n] (uniformly from all <math>3^n</math> such pairs) and some random small wildcard sets <math>Z_1,\dots,Z_m</math> that are disjoint from <math>U_0</math> and <math>V_0.</math> You assign values to the wildcards, fill <math>U_0</math> with 1s, <math>V_0</math> with 2s and put 3s everywhere else. Then the resulting point is almost exactly uniformly distributed. (One should think of the size of each <math>Z_i</math> as being randomly chosen and bounded above by a large constant depending on <math>\delta.</math> One should also think of m as a large constant, though eventually it will replaced by a slow-growing function of n, such as <math>\log\log n.</math>) | ||

+ | |||

+ | '''Step 4.''' Let <math>Z=Z_1\cup\dots\cup Z_m.</math> When we randomly choose <math>Z_1,\dots,Z_m</math> and the assignment of their values, we obtain a random sequence fragment <math>(Z,\sigma).</math> If the probability that <math>(U,V,W)\in\mathcal{A}</math> given <math>(Z,\sigma)</math> and also given that <math>U\in\mathcal{U}</math> and <math>V\in\mathcal{V}</math> is ever more than <math>\delta+\eta</math> (plus a tiny tiny amount that can depend on m) then we've got a density increase on a 12-set in a subspace, at least if the density of <math>\mathcal{U}\otimes\mathcal{V}</math> given <math>(Z,\sigma)</math> is not too small. This allows us to do an inner iteration (see the broad overview above). Therefore, we shall try to prove that this must happen. | ||

+ | |||

+ | '''Step 5.''' One potential way of showing that it happens is to prove that there is a positive probability that when we fix <math>(Z,\sigma)</math> then with positive probability <math>U\in\mathcal{U}</math> and <math>V\in\mathcal{V},</math> and furthermore the probability that <math>(U,V,W)\in\mathcal{A}</math> given that <math>U\in\mathcal{U}</math> and <math>V\in\mathcal{V}</math> is ''less'' than <math>\delta+\eta</math> by some positive amount. This follows by averaging: if with positive probability we get a density decrease on a 12-set then with positive probability we get a density increase on a 12-set. | ||

+ | |||

+ | '''Step 6.''' If we randomly choose <math>(U_0,V_0)</math> and <math>Z_1,\dots,Z_m</math> as in Step 3, then there is a positive probability, depending only on m and the density of <math>\mathcal{U}\otimes\mathcal{V},</math> that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}</math> for every sequence <math>(\eta_1,\dots,\eta_m)\in[3]^m.</math> Here, <math>U_0+2V_0+\sum_{i=1}^m\eta_iZ_i</math> stands for the sequence that is 1 on <math>U_0,</math> 2 on <math>V_0,</math> <math>\eta_i</math> on <math>Z_i,</math> and 3 everywhere else. This follows from the [[Sperner's theorem|multidimensional Sperner theorem]], or possibly from very similar arguments rather than from the theorem itself. | ||

+ | |||

+ | '''Step 7.''' For any given choice of <math>Z_1,\dots,Z_m,</math> the set of all <math>(U_0,V_0)</math> such that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}</math> for every sequence <math>(\eta_1,\dots,\eta_m)\in[3]^m</math> is a 12-set, which we denote by <math>\mathcal{U}(Z_1,\dots,Z_m)\otimes\mathcal{V}(Z_1,\dots,Z_m).</math> | ||

+ | |||

+ | '''Step 8.''' By averaging, we can pick <math>Z_1,\dots,Z_m</math> such that <math>\mathcal{U}(Z_1,\dots,Z_m)\otimes\mathcal{V}(Z_1,\dots,Z_m)</math> is dense in <math>[3]^{[n]\setminus Z}.</math> Let us fix this choice of <math>Z_1,\dots,Z_m</math> and set <math>\mathcal{U}'=\mathcal{U}(Z_1,\dots,Z_m)</math> and <math>\mathcal{V}'=\mathcal{V}(Z_1,\dots,Z_m).</math> | ||

+ | |||

+ | '''Step 9.''' Define a bipartite graph <math>G_1</math> by joining a pair of disjoint sets <math>(U_0,V_0)</math> to an m-tuple <math>(\eta_1,\dots,\eta_m)\in[3]^m</math> if and only if <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}.</math> Define another bipartite graph <math>G_2</math> by joining <math>(U_0,V_0)</math> to <math>(\eta_1,\dots,\eta_m)</math> if and only if <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}.</math> | ||

+ | |||

+ | '''Step 10.''' If any vertex <math>(U_0,V_0)</math> has degree at least <math>(\delta+\eta/2)3^m</math> in <math>G_1,</math> then <math>\mathcal{A}</math> has density <math>\delta+\eta/2</math> in the subspace <math>U_0+2V_0+\langle Z_1,\dots,Z_m\rangle,</math> which gives us our density increase on a subspace and completes the outer iteration. (Recall that eventually m is allowed to tend to infinity.) | ||

+ | |||

+ | '''Step 11.''' Therefore, we may assume that the maximum degree, and hence the average degree, of the vertices <math>(U_0,V_0)\in\mathcal{U}'\otimes\mathcal{V}'</math> in <math>G_1</math> is at most <math>(\delta+\eta/2)3^m,</math> which implies, by averaging, that the average <math>G_1</math>-degree of the <math>(\eta_1,\dots,\eta_m)</math> inside <math>\mathcal{U}'\otimes\mathcal{V}'</math> is at most <math>(\delta+\eta/2)|\mathcal{U}'\otimes\mathcal{V}'|.</math> | ||

+ | |||

+ | '''Step 12.''' But ''every'' vertex in <math>\mathcal{U}'\otimes\mathcal{V}'</math> is joined to every sequence <math>(\eta_1,\dots,\eta_m).</math> Therefore, on average, the probability that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}</math> given that <math>(U_0,V_0)\in\mathcal{U}'\otimes\mathcal{V}'</math> is at most <math>\delta+\eta/2.</math> | ||

+ | |||

+ | '''Step 13.''' By Step 5, we are free to assume that the probability that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}</math> given that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}</math> is at least <math>\delta+\eta</math> minus a very tiny amount. | ||

+ | |||

+ | '''Step 14.''' It is possible to split the set of <math>(U_0,V_0)</math> such that <math>U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}</math> into four subsets, each of which is a 12-set. | ||

+ | |||

+ | '''Step 15.''' Therefore, by averaging, we can get a small density increase on a 12-set, thereby completing the inner iteration. | ||

+ | |||

+ | ==More details== | ||

+ | |||

+ | The arguments below are presented in more detail in the article on [[passing between measures]]. | ||

+ | |||

+ | ===A general averaging argument=== | ||

+ | |||

+ | We shall keep finding ourselves in the following situation. We have a finite set X and a partition of X into subsets <math>X_1,\dots,X_r.</math> We also have a subset <math>A\subset X</math> of density <math>\delta.</math> We can of course conclude that there exists i such that the density of A in <math>X_i</math> is at least <math>\delta,</math> but we need more than this: we want <math>X_i</math> not to be too small. For this we have to make a small sacrifice: we go for density at least <math>\delta-\theta,</math> and that then allows us to ensure that <math>X_i</math> has density at least <math>\theta r^{-1}.</math> (Proof: if not, then call <math>X_i</math> ''small'' if it has density less than <math>\theta r^{-1}.</math> The union of all small sets has density less than <math>\theta,</math> and the density of the part of A that lives in large sets is at most <math>\delta-\theta,</math> contradicting the fact that A has density <math>\delta.</math>) | ||

+ | |||

+ | ===A lemma about equal-slices measure=== | ||

+ | |||

+ | We would like to prove that if a subset <math>\mathcal{A}</math> of <math>[3]^n</math> has density <math>\delta</math> in equal-slices measure, then one can pass to a subspace with dimension tending to infinity such that the density of <math>\mathcal{A}</math> in the subspace is <math>\delta-o(1).</math> Here we give an informal argument. | ||

+ | |||

+ | The basic idea is to show that the equal slices measure can be approximated (in <math>L_1,</math> or equivalently in total variation distance) by a positive linear combination of uniform measures on subspaces. Here is a sketch of the argument. Consider first the measure you get by fixing r, choosing a random point x, choosing m places randomly where x=0 and randomly deciding whether to change each of the chosen coordinates to a 1. Then by symmetry the distribution of the resulting point y will be a linear combination of uniform distributions on the layers within m of r. Furthermore, the probability that you end up in layer r+k depends only on k. So now if you choose a random point in equal-slices measure and then do the above process, you will end up randomly choosing a slice and randomly moving to a slice just above it, in the same way whatever slice you start with -- with the exception that if you start with r>n-m then the above process cannot be carried out. But that is very unlikely if m is small, so the resulting distribution is very close to the equal-slices density since the measure of each slice is roughly the convolution of some measure on a small interval with a uniform measure on <math>[n].</math> | ||

+ | |||

+ | '''Corollary.''' Let m be small compared with n and let <math>\mathcal{A}</math> have equal-slices density <math>\delta.</math> Then there is a combinatorial subspace on which <math>\mathcal{A}</math> has uniform density at least <math>\delta-o(1).</math> | ||

+ | |||

+ | '''Proof.''' Equal-slices measure <math>\mu</math> can be approximated in total variation distance by a convex combination <math>\sum_{i=1}^t\lambda_i\alpha_i</math> of uniform measures <math>\alpha_i</math> on subspaces. Therefore, if <math>\mu(\mathcal{A})=\delta,</math> then <math>\sum_i\lambda_i\alpha_i(\mathcal{A})\geq\delta-o(1),</math> from which it follows by averaging that there exists i such that <math>\alpha_i(\mathcal{A})\geq\delta-o(1).</math> | ||

+ | |||

+ | ===A sketch proof of Step 2=== | ||

+ | |||

+ | Continuing with the notation of the proof just above, we are given that <math>\mu(\mathcal{A})\geq(\delta+\eta)\mu(\mathcal{U}\otimes\mathcal{V})</math> and that <math>\mu(\mathcal{U}\otimes\mathcal{V})\geq\gamma.</math> We would like to find i such that <math>\alpha_i(\mathcal{U}\otimes\mathcal{V})</math> is not too small and <math>\alpha_i(\mathcal{A})\geq(\delta+\eta/2)\alpha_i(\mathcal{U}\otimes\mathcal{V}).</math> We are assuming for convenience that <math>\mathcal{A}\subset\mathcal{U}\otimes\mathcal{V},</math> since for the time being it is the intersection of the two sets that interests us. | ||

+ | |||

+ | We know that <math>\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V})\geq\gamma-o(1),</math> and that <math>\sum_i\lambda_i\alpha_i(\mathcal{A})\geq(\delta+\eta-o(1))\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}).</math> | ||

+ | |||

+ | We are now in a situation similar to that of the general averaging argument above, but not quite similar enough for it to be directly applicable. (At some point I'll write a more general version that ''is'' directly applicable---this is easy to do.) Let B (for "bad") be the set of all i such that <math>\alpha_i(\mathcal{U}\otimes\mathcal{V})<\gamma\eta/3.</math> Then <math>\sum_{i\in B}\lambda_i\alpha_i(\mathcal{A})<\gamma\eta/3\leq (2\eta/5)\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}).</math> Therefore, <math>\sum_{i\notin B}\lambda_i\alpha_i(\mathcal{A})> (\delta+\eta/2)\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}),</math> and from this it follows that we can find a good i (that is, one such that <math>\alpha_i(\mathcal{U}\otimes\mathcal{V})\geq\gamma\eta/3</math>) such that <math>\alpha_i(\mathcal{A})\geq(\delta+\eta/2)\alpha_i(\mathcal{U}\otimes\mathcal{V}).</math> | ||

+ | |||

+ | ===A sketch proof of Step 3=== | ||

+ | |||

+ | It would be good to have general and very streamlined arguments for these kinds of statements. Here's a rough argument though. Let r and s be three small numbers and consider the following distribution: you choose a random sequence x and then randomly change r of its 3s to 1s and s of its 3s to 2s. Since the original point was random, with high probability it belongs to a [[slice]] <math>\Gamma_{a,b,c}</math> for which a, b and c are all close to n/3. The final point is then a random element of the slice <math>\Gamma_{a+r,b+s,c-r-s}.</math> Since a, b and c are close to n/3 and r and s are small, the sizes of the slices <math>\Gamma_{a,b,c}</math> and <math>\Gamma_{a+r,b+s,c-r-s}</math> are roughly equal. Also, the map <math>(a,b,c)\mapsto(a+r,b+s,c-r-s)</math> is invertible almost everywhere, and certainly when a, b and c are near the centre. Therefore, the new distribution is approximately uniform. | ||

− | + | The distribution we care about in Step 3 is an average of distributions of the kind defined in the previous paragraph, and therefore it too is approximately uniform. |

## Latest revision as of 04:32, 9 March 2009

## Contents

## Introduction

One of the proof strategies we have considered seriously is the density-increment method. The idea here is to prove that if [math]\mathcal{A}[/math] is a subset of [math][3]^n[/math] of density [math]\delta[/math] with no combinatorial line then there is a combinatorial subspace S of dimension tending to infinity with n such that the density of [math]\mathcal{A}[/math] inside S is larger than [math]\delta[/math] by an amount that depends on [math]\delta[/math] only. If we can prove such a result, then we can drop down to S and repeat the argument. If the initial dimension n is large enough, then we push the density up to 1 before we run out of dimensions, and thereby obtain a contradiction.

## Notation and definitions

Let x be an element of [math][3]^n.[/math] Write [math]U(x),V(x)[/math] and [math]W(x)[/math] for the sets of i such that [math]x_i=1,2[/math] and 3, respectively. These are called the *1-set*, the *2-set* and the *3-set* of x. The map [math]x\mapsto(U(x),V(x),W(x))[/math] is a one-to-one correspondence between [math][3]^n[/math] and the set of all triples [math](U,V,W)[/math] of sets that partition [math][n].[/math] We shall use the notation x and also the notation [math](U,V,W)[/math] and will pass rapidly between them.

Let [math]\mathcal{U}[/math] and [math]\mathcal{V}[/math] be collections of subsets of [math][n].[/math] We shall write [math]\mathcal{U}\otimes\mathcal{V}[/math] for the collection of all sequences [math]x[/math] such that [math]U(x)\in\mathcal{U}[/math] and [math]V(x)\in\mathcal{V}.[/math] A set of the form [math]\mathcal{U}\otimes\mathcal{V}[/math] will be called a *12-set*.

In general, if X is a finite set and A and B are subsets of X, we define *the density of* A *in* B to be [math]|A\cap B|/|B|.[/math]

A *sequence fragment* in [math][3]^n[/math] is a subset [math]Z\subset[n][/math] and a function [math]\sigma[/math] from Z to [math]\{1,2,3\}.[/math] An *extension* of the sequence fragment [math](Z,\sigma)[/math] is a sequence [math]x\in[3]^n[/math] such that the coordinates of the restriction of x to Z are given by [math]\sigma.[/math] For example, if n=5, then [math]**3*2[/math] is a sequence fragment, and [math]12332[/math] is one of its eight extensions.

## A very broad overview of the argument

One of the main features of the argument is that it is *local* in character. In other words, it does not try to understand fully the obstructions to uniformity that cause a set to fail to have the expected number of combinatorial lines. Instead, we are content to find non-quasirandom behaviour inside small subspaces of [math][3]^n.[/math] This happens right from the start. A conjecture that one might hope to be true is that if a set [math]\mathcal{A}[/math] is line-free, then there is a dense 12-set [math]\mathcal{U}\otimes\mathcal{V}[/math] such that the density of [math]\mathcal{A}[/math] inside [math]\mathcal{U}\otimes\mathcal{V}[/math] is a bit larger than the density of [math]\mathcal{A}[/math] in [math][3]^n.[/math] However, this is not in fact our first step. Instead, we argue that this kind of correlation happens when one restricts to an appropriate subspace. This part of the argument is given in some detail on a different page, (where what we are calling 12-sets are called special sets of complexity 1).

Once we have correlation with a 12-set (even if we have had to pass to a subspace to get it) it is natural to try to prove that a 12-set can be partitioned into combinatorial subspaces, since then by averaging we could show that [math]\mathcal{A}[/math] correlated with one of those subspaces, and we would be done. An argument similar to the proof of the multidimensional Sperner theorem can be used to prove that a dense 12-set contains many combinatorial subspaces. However, it is not at all obvious that one can partition the 12-set into such subspaces (even if small errors are allowed).

However, a different principle comes to the rescue. It is the observation that there are *two* kinds of density increment that will be good enough for us. One is a density increment on a subspace, as we already know. The other is a density increment on a 12-set (obtained after passing to a subspace if necessary). Let us spell this out in more detail. Suppose that [math]\mathcal{A}[/math] is a line-free set of density [math]\delta[/math] and we want to obtain a contradiction. We know two things. First, we can find a 12-subset of a subspace in which [math]\mathcal{A}[/math] has increased density [math]\delta+\eta[/math], and secondly, if we can ever prove that there is increased density [math]\delta+\eta/2[/math] (say) on a subspace then we are done. However, we can add a third principle, which is that if we can find a 12-subset of a subspace on which the density goes up from [math]\delta+\eta[/math] then we are again done. Why? Because we are in the same position as we were in before, but with a slightly higher density, so we can iterate.

Just to be explicit about this, let us briefly describe how the iteration would go. We start with a line-free set [math]\mathcal{A}[/math] of density [math]\delta[/math] and hope ultimately to obtain a contradiction. To do this, we adopt as a more immediate aim to find a subspace S inside which [math]\mathcal{A}[/math] has density at least [math]\delta+\eta(\delta)/2.[/math] As a first step, we find a 12-subset of a subspace, inside which [math]\mathcal{A}[/math] has density at least [math]\delta+\eta.[/math] We now focus our attention on the subproblem of trying to use this to obtain a subspace inside which [math]\mathcal{A}[/math] has density at least [math]\delta+\eta/2.[/math] But here again we can use the same sort of idea. We do not have to find such a subspace directly: instead, we could prove that if no such subspace exists (so the density of [math]\mathcal{A}[/math] inside all subspaces is at most [math]\delta+\eta/2,)[/math] then we can find a 12-subset of some further subspace, inside which [math]\mathcal{A}[/math] has density at least [math]\delta+\eta+\theta(\delta).[/math] And then we can try again to find a subspace where the density is at least [math]\delta+\eta/2.[/math] Repeating this argument leads to an iteration within the main iteration, which will eventually terminate with a density increase on a subspace (since the density within a 12-set cannot exceed 1). And then we have completed another step of the main iteration, so we are done.

We shall refer to the main iteration (where we aim for a density increase on a subspace) as the *outer iteration* and the other iteration (where we aim for a density increase on a 12-set) as the *inner iteration*.

## A brief description of the main steps

**Step 1.** Let [math]\mathcal{A}[/math] be a line-free set of uniform density [math]\delta.[/math] Then there is a subspace S of dimension tending to infinity with n and a 12-subset [math]\mathcal{U}\otimes\mathcal{V}[/math] of S such that the equal-slices density of [math]\mathcal{A}[/math] inside [math]\mathcal{U}\otimes\mathcal{V}[/math] is at least [math]\delta+2\eta,[/math] where [math]\eta=\eta(\delta).[/math] Moreover, the equal-slices density of [math]\mathcal{U}\otimes\mathcal{V}[/math] is at least [math]\eta(\delta).[/math] (The proof on this wiki gives [math]\eta(\delta)=\delta^2/27.)[/math]

**Step 2.** By passing to a further subspace, we can obtain the same conclusion except that now the density increase drops from [math]\eta[/math] to
[math]\eta/2[/math] and the density of [math]\mathcal{U}\otimes\mathcal{V}[/math] drops to [math]\theta(\delta).[/math]

**Step 3.** Consider the following distribution on points [math](U,V,W)\in[3]^n.[/math] You first choose a random pair [math](U_0,V_0)[/math] of disjoint subsets of [n] (uniformly from all [math]3^n[/math] such pairs) and some random small wildcard sets [math]Z_1,\dots,Z_m[/math] that are disjoint from [math]U_0[/math] and [math]V_0.[/math] You assign values to the wildcards, fill [math]U_0[/math] with 1s, [math]V_0[/math] with 2s and put 3s everywhere else. Then the resulting point is almost exactly uniformly distributed. (One should think of the size of each [math]Z_i[/math] as being randomly chosen and bounded above by a large constant depending on [math]\delta.[/math] One should also think of m as a large constant, though eventually it will replaced by a slow-growing function of n, such as [math]\log\log n.[/math])

**Step 4.** Let [math]Z=Z_1\cup\dots\cup Z_m.[/math] When we randomly choose [math]Z_1,\dots,Z_m[/math] and the assignment of their values, we obtain a random sequence fragment [math](Z,\sigma).[/math] If the probability that [math](U,V,W)\in\mathcal{A}[/math] given [math](Z,\sigma)[/math] and also given that [math]U\in\mathcal{U}[/math] and [math]V\in\mathcal{V}[/math] is ever more than [math]\delta+\eta[/math] (plus a tiny tiny amount that can depend on m) then we've got a density increase on a 12-set in a subspace, at least if the density of [math]\mathcal{U}\otimes\mathcal{V}[/math] given [math](Z,\sigma)[/math] is not too small. This allows us to do an inner iteration (see the broad overview above). Therefore, we shall try to prove that this must happen.

**Step 5.** One potential way of showing that it happens is to prove that there is a positive probability that when we fix [math](Z,\sigma)[/math] then with positive probability [math]U\in\mathcal{U}[/math] and [math]V\in\mathcal{V},[/math] and furthermore the probability that [math](U,V,W)\in\mathcal{A}[/math] given that [math]U\in\mathcal{U}[/math] and [math]V\in\mathcal{V}[/math] is *less* than [math]\delta+\eta[/math] by some positive amount. This follows by averaging: if with positive probability we get a density decrease on a 12-set then with positive probability we get a density increase on a 12-set.

**Step 6.** If we randomly choose [math](U_0,V_0)[/math] and [math]Z_1,\dots,Z_m[/math] as in Step 3, then there is a positive probability, depending only on m and the density of [math]\mathcal{U}\otimes\mathcal{V},[/math] that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}[/math] for every sequence [math](\eta_1,\dots,\eta_m)\in[3]^m.[/math] Here, [math]U_0+2V_0+\sum_{i=1}^m\eta_iZ_i[/math] stands for the sequence that is 1 on [math]U_0,[/math] 2 on [math]V_0,[/math] [math]\eta_i[/math] on [math]Z_i,[/math] and 3 everywhere else. This follows from the multidimensional Sperner theorem, or possibly from very similar arguments rather than from the theorem itself.

**Step 7.** For any given choice of [math]Z_1,\dots,Z_m,[/math] the set of all [math](U_0,V_0)[/math] such that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}[/math] for every sequence [math](\eta_1,\dots,\eta_m)\in[3]^m[/math] is a 12-set, which we denote by [math]\mathcal{U}(Z_1,\dots,Z_m)\otimes\mathcal{V}(Z_1,\dots,Z_m).[/math]

**Step 8.** By averaging, we can pick [math]Z_1,\dots,Z_m[/math] such that [math]\mathcal{U}(Z_1,\dots,Z_m)\otimes\mathcal{V}(Z_1,\dots,Z_m)[/math] is dense in [math][3]^{[n]\setminus Z}.[/math] Let us fix this choice of [math]Z_1,\dots,Z_m[/math] and set [math]\mathcal{U}'=\mathcal{U}(Z_1,\dots,Z_m)[/math] and [math]\mathcal{V}'=\mathcal{V}(Z_1,\dots,Z_m).[/math]

**Step 9.** Define a bipartite graph [math]G_1[/math] by joining a pair of disjoint sets [math](U_0,V_0)[/math] to an m-tuple [math](\eta_1,\dots,\eta_m)\in[3]^m[/math] if and only if [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}.[/math] Define another bipartite graph [math]G_2[/math] by joining [math](U_0,V_0)[/math] to [math](\eta_1,\dots,\eta_m)[/math] if and only if [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}.[/math]

**Step 10.** If any vertex [math](U_0,V_0)[/math] has degree at least [math](\delta+\eta/2)3^m[/math] in [math]G_1,[/math] then [math]\mathcal{A}[/math] has density [math]\delta+\eta/2[/math] in the subspace [math]U_0+2V_0+\langle Z_1,\dots,Z_m\rangle,[/math] which gives us our density increase on a subspace and completes the outer iteration. (Recall that eventually m is allowed to tend to infinity.)

**Step 11.** Therefore, we may assume that the maximum degree, and hence the average degree, of the vertices [math](U_0,V_0)\in\mathcal{U}'\otimes\mathcal{V}'[/math] in [math]G_1[/math] is at most [math](\delta+\eta/2)3^m,[/math] which implies, by averaging, that the average [math]G_1[/math]-degree of the [math](\eta_1,\dots,\eta_m)[/math] inside [math]\mathcal{U}'\otimes\mathcal{V}'[/math] is at most [math](\delta+\eta/2)|\mathcal{U}'\otimes\mathcal{V}'|.[/math]

**Step 12.** But *every* vertex in [math]\mathcal{U}'\otimes\mathcal{V}'[/math] is joined to every sequence [math](\eta_1,\dots,\eta_m).[/math] Therefore, on average, the probability that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}[/math] given that [math](U_0,V_0)\in\mathcal{U}'\otimes\mathcal{V}'[/math] is at most [math]\delta+\eta/2.[/math]

**Step 13.** By Step 5, we are free to assume that the probability that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{A}[/math] given that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}[/math] is at least [math]\delta+\eta[/math] minus a very tiny amount.

**Step 14.** It is possible to split the set of [math](U_0,V_0)[/math] such that [math]U_0+2V_0+\sum_i\eta_iZ_i\in\mathcal{U}\otimes\mathcal{V}[/math] into four subsets, each of which is a 12-set.

**Step 15.** Therefore, by averaging, we can get a small density increase on a 12-set, thereby completing the inner iteration.

## More details

The arguments below are presented in more detail in the article on passing between measures.

### A general averaging argument

We shall keep finding ourselves in the following situation. We have a finite set X and a partition of X into subsets [math]X_1,\dots,X_r.[/math] We also have a subset [math]A\subset X[/math] of density [math]\delta.[/math] We can of course conclude that there exists i such that the density of A in [math]X_i[/math] is at least [math]\delta,[/math] but we need more than this: we want [math]X_i[/math] not to be too small. For this we have to make a small sacrifice: we go for density at least [math]\delta-\theta,[/math] and that then allows us to ensure that [math]X_i[/math] has density at least [math]\theta r^{-1}.[/math] (Proof: if not, then call [math]X_i[/math] *small* if it has density less than [math]\theta r^{-1}.[/math] The union of all small sets has density less than [math]\theta,[/math] and the density of the part of A that lives in large sets is at most [math]\delta-\theta,[/math] contradicting the fact that A has density [math]\delta.[/math])

### A lemma about equal-slices measure

We would like to prove that if a subset [math]\mathcal{A}[/math] of [math][3]^n[/math] has density [math]\delta[/math] in equal-slices measure, then one can pass to a subspace with dimension tending to infinity such that the density of [math]\mathcal{A}[/math] in the subspace is [math]\delta-o(1).[/math] Here we give an informal argument.

The basic idea is to show that the equal slices measure can be approximated (in [math]L_1,[/math] or equivalently in total variation distance) by a positive linear combination of uniform measures on subspaces. Here is a sketch of the argument. Consider first the measure you get by fixing r, choosing a random point x, choosing m places randomly where x=0 and randomly deciding whether to change each of the chosen coordinates to a 1. Then by symmetry the distribution of the resulting point y will be a linear combination of uniform distributions on the layers within m of r. Furthermore, the probability that you end up in layer r+k depends only on k. So now if you choose a random point in equal-slices measure and then do the above process, you will end up randomly choosing a slice and randomly moving to a slice just above it, in the same way whatever slice you start with -- with the exception that if you start with r>n-m then the above process cannot be carried out. But that is very unlikely if m is small, so the resulting distribution is very close to the equal-slices density since the measure of each slice is roughly the convolution of some measure on a small interval with a uniform measure on [math][n].[/math]

**Corollary.** Let m be small compared with n and let [math]\mathcal{A}[/math] have equal-slices density [math]\delta.[/math] Then there is a combinatorial subspace on which [math]\mathcal{A}[/math] has uniform density at least [math]\delta-o(1).[/math]

**Proof.** Equal-slices measure [math]\mu[/math] can be approximated in total variation distance by a convex combination [math]\sum_{i=1}^t\lambda_i\alpha_i[/math] of uniform measures [math]\alpha_i[/math] on subspaces. Therefore, if [math]\mu(\mathcal{A})=\delta,[/math] then [math]\sum_i\lambda_i\alpha_i(\mathcal{A})\geq\delta-o(1),[/math] from which it follows by averaging that there exists i such that [math]\alpha_i(\mathcal{A})\geq\delta-o(1).[/math]

### A sketch proof of Step 2

Continuing with the notation of the proof just above, we are given that [math]\mu(\mathcal{A})\geq(\delta+\eta)\mu(\mathcal{U}\otimes\mathcal{V})[/math] and that [math]\mu(\mathcal{U}\otimes\mathcal{V})\geq\gamma.[/math] We would like to find i such that [math]\alpha_i(\mathcal{U}\otimes\mathcal{V})[/math] is not too small and [math]\alpha_i(\mathcal{A})\geq(\delta+\eta/2)\alpha_i(\mathcal{U}\otimes\mathcal{V}).[/math] We are assuming for convenience that [math]\mathcal{A}\subset\mathcal{U}\otimes\mathcal{V},[/math] since for the time being it is the intersection of the two sets that interests us.

We know that [math]\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V})\geq\gamma-o(1),[/math] and that [math]\sum_i\lambda_i\alpha_i(\mathcal{A})\geq(\delta+\eta-o(1))\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}).[/math]

We are now in a situation similar to that of the general averaging argument above, but not quite similar enough for it to be directly applicable. (At some point I'll write a more general version that *is* directly applicable---this is easy to do.) Let B (for "bad") be the set of all i such that [math]\alpha_i(\mathcal{U}\otimes\mathcal{V})\lt\gamma\eta/3.[/math] Then [math]\sum_{i\in B}\lambda_i\alpha_i(\mathcal{A})\lt\gamma\eta/3\leq (2\eta/5)\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}).[/math] Therefore, [math]\sum_{i\notin B}\lambda_i\alpha_i(\mathcal{A})\gt (\delta+\eta/2)\sum_i\lambda_i\alpha_i(\mathcal{U}\otimes\mathcal{V}),[/math] and from this it follows that we can find a good i (that is, one such that [math]\alpha_i(\mathcal{U}\otimes\mathcal{V})\geq\gamma\eta/3[/math]) such that [math]\alpha_i(\mathcal{A})\geq(\delta+\eta/2)\alpha_i(\mathcal{U}\otimes\mathcal{V}).[/math]

### A sketch proof of Step 3

It would be good to have general and very streamlined arguments for these kinds of statements. Here's a rough argument though. Let r and s be three small numbers and consider the following distribution: you choose a random sequence x and then randomly change r of its 3s to 1s and s of its 3s to 2s. Since the original point was random, with high probability it belongs to a slice [math]\Gamma_{a,b,c}[/math] for which a, b and c are all close to n/3. The final point is then a random element of the slice [math]\Gamma_{a+r,b+s,c-r-s}.[/math] Since a, b and c are close to n/3 and r and s are small, the sizes of the slices [math]\Gamma_{a,b,c}[/math] and [math]\Gamma_{a+r,b+s,c-r-s}[/math] are roughly equal. Also, the map [math](a,b,c)\mapsto(a+r,b+s,c-r-s)[/math] is invertible almost everywhere, and certainly when a, b and c are near the centre. Therefore, the new distribution is approximately uniform.

The distribution we care about in Step 3 is an average of distributions of the kind defined in the previous paragraph, and therefore it too is approximately uniform.