Difference between revisions of "Line-free sets correlate locally with complexity-1 sets"

From Polymath Wiki
Jump to: navigation, search
(Step 5)
(Old stuff, probably to be junked: Removed junk (but it can be retrieved from page history).)
Line 72: Line 72:
==Old stuff, probably to be junked==
For convenience we shall use [[equal-slices measure]] but this is not fundamental to the argument.
The model of equal-slices measure we use is this. If p, q and r are non-negative real numbers with p+q+r=1, and  <math>(X_1,\dots,X_n)</math> are independent random variables with probabilities p, q and r of equalling 1, 2 and 3, respectively, then we define <math>\mu_{p,q,r}(\mathcal{A})</math> to be the probability that <math>(X_1,\dots,X_n)</math> lies in <math>\mathcal{A}.</math> We then define the ''density'' of <math>\mathcal{A}</math> to be the average of <math>\mu_{p,q,r}(\mathcal{A})</math> over all possible triples p,q,r.
Now let us do some averaging. Let us write <math>\delta_{p,q,r}</math> for <math>\mu_{p,q,r}(\mathcal{A}).</math> Let us also use the notation (U,V,W) for the <math>x\in[3]^n</math> that has 1-set U, 2-set V and 3-set W.
First, we prove two similar lemmas  that are very simple, but also rather useful.
'''Lemma 1.''' ''The probability distribution of (U,V,W) conditioned on W is the equal-slices measure of (U,V) with ground set <math>[n]\setminus W.</math>''
'''Proof.''' We are asking for the distribution of the random variable <math>(X_1,\dots,X_n)</math> when we condition on the event that <math>W_i=3</math> for every <math>i\in W.</math> Let us condition further on the value of r. Then for each fixed p, q such that p+q=1-r, and each <math>i\notin W,</math> we have that <math>X_i=1</math> with probability p/(1-r) and <math>X_i=2</math> with probability q/(1-r). When we average over p and q, the numbers p/(1-r) and q/(1-r) are uniformly distributed over pairs of positive reals that add up to 1. For each r, we therefore obtain precisely the equal-slices probability distribution on the random variables <math>X_i</math>with <math>i\notin W,</math> so the same is true when we average over r.<math>\Box</math>
It is obviously not the case that the set W in a random triple (U,V,W) is distributed according to equal-slices measure: rather, we choose r with density 2(1-r) and then choose elements of W independently with probability r. When we refer to a random set W or discuss probabilities of events associated with W, it will be this measure that we refer to. (In other words, we take the marginal distribution on W, just as we should.)
To be continued, but possibly not for a while as I have a lot to do in the near future.

Revision as of 17:16, 25 February 2009

Warning: I think I can prove something rigorously, but will not be sure until it is completely written up. The writing up will continue when I have the time to do it.

The aim of this page is to present a proof that if [math]\mathcal{A}[/math] is a dense subset of [math][3]^n[/math] that contains no combinatorial line, then there is a combinatorial subspace X of [math]\mathcal{A}[/math] with dimension tending to infinity and a dense subset [math]\mathcal{B}[/math] of X of complexity 1. It is written in a slightly unconventional way, with first a short sketch, then a longer one that fleshes out a few details, and then a longer one still. That way, even while it is incomplete it should be understandable to some extent, and if I get stuck then it will be clearer where the problem lies.

Short sketch of argument


Throughout this sketch, [math]\mathcal{A}[/math] refers to a subset of [math][3]^n[/math] of density [math]\delta[/math] in the uniform distribution on [math][3]^n.[/math] We shall sometimes use letters such as x, y and z for elements of [math][3]^n[/math] and we shall sometimes write them as triples (U,V,W) of sets that partition [n]. A triple of sets corresponds to the 1-set, the 2-set and the 3-set of a sequence. We shall pass freely between the two ways of thinking about [math][3]^n,[/math] at each stage using whichever is more convenient.

If (U,V,W) is an element of [math][3]^n[/math] and (U',V',W') is an arbitrary triple of disjoint sets (not necessarily partitioning [n]), we shall write (U,V,W)++(U',V',W') for the sequence obtained from (U,V,W) by changing everything in U' to 1, everything in V' to 2, and everything in W' to 3. For example, writing § for an unspecified coordinate, we have 331322311++§§§1§22§3=331122213. (We think of (U',V',W') as "overwriting" (U,V,W).) If Z is a subset of [n], we shall also write [math](U,V,W)++[3]^Z[/math] for the combinatorial subspace consisting of all [math](U,V,W)++(U',V',W')[/math] with [math](U',V',W')\in[3]^Z,[/math] and [math](U,V,W)++[2]^Z[/math] for the subset of this combinatorial subspace consisting of all points with [math]W'=\emptyset.[/math]

An unexpected aspect of the proof is that we shall use both equal-slices measure and uniform measure. This decision was not arbitrary: it turns out that either measure on its own has inconvenient features that make the proof difficult, but that these difficulties can be be dealt with by passing from one to the other. (Roughly speaking, uniform measure is better for averaging arguments over subspaces, but equal-slices measure is better when we want Varnavides-type statements.) For this we need a tighter version of the statement that the versions DHJ(3) for the two measures are equivalent. We need that any set of density [math]\delta[/math] in one of the measures can be restricted to a combinatorial subspace where its density is at least [math]\delta-\eta[/math] in the other. I'm fairly sure that the argument for the equivalence of the two versions (given here) can be strengthened to give this conclusion, and will in due course make absolutely sure.

The main steps

Step 1. If a, b and c are all within [math]C\sqrt n[/math] of n/3 and a+b+c=n, and if r, s and t are three integers that add up to 0 and are all at most [math]m=o(\sqrt{n})[/math] in modulus, then the size of the slice [math]\Gamma_{a,b,c}[/math] is 1+o(1) times the size of the slice [math]\Gamma_{a+r,b+s,c+t}.[/math]

Step 2. Let [math]\mu[/math] be some probability distribution on combinatorial subspaces S of [math][3]^n[/math] and for each S let [math]\sigma_S[/math] be a probability distribution on S. (We shall abbreviate [math]\sigma_S[/math] to [math]\sigma[/math] if S is clear from the context.) Let [math]\nu[/math] be the distribution on [math][3]^n[/math] that results if you choose a subspace S at random according to [math]\mu[/math] and then a random point x of S according to [math]\sigma[/math]. Suppose that the distribution [math]\nu[/math] is approximately uniform and the distributions [math]\sigma_S[/math] are reasonably nice. Then we may assume that for [math]\mu[/math]-almost all subpaces [math]S\subset[3]^n[/math] the [math]\sigma[/math]-density of [math]\mathcal{A}\cap S[/math] is at least [math](\delta-\eta).[/math]

Step 3. By 1,2 and an averaging argument, we find [math](U,V,W)[/math] and [math]Z\subset U\cup V[/math] of size [math]o(\sqrt{n})[/math] (but not much smaller than [math]\sqrt{n}[/math]) with two properties. First, out of all pairs [math](U',V')\in[2]^Z,[/math] the equal-slices proportion such that [math](U,V,W)++(U',V',\emptyset)[/math] belongs to [math]\mathcal{A}[/math] is at least [math]\delta/3.[/math] Secondly, out of all triples [math](U',V',W')\in[3]^Z,[/math] the equal-slices proportion such that [math](U,V,W)++(U',V',W')[/math] belongs to [math]\mathcal{A}[/math] is at least [math]\delta-\eta.[/math]

Step 4. Fixing such (U,V,W) and Z, let us write (U',V',W') instead of (U,V,W)++(U',V',W'). Then if [math]U_1\subset U_2[/math] and [math](U_1,Z\setminus U_1,\emptyset)[/math] and [math](U_2,Z\setminus U_2,\emptyset)[/math] both belong to [math]\mathcal{A},[/math] then, writing [math]V_i[/math] for [math]Z\setminus U_i,[/math] we have that [math](U_1,V_2,Z\setminus(U_1\cup V_2))[/math] does not belong to [math]\mathcal{A}.[/math]

Step 5. Let [math]\mathcal{U}[/math] be the set of all U such that [math](U,Z\setminus U,\emptyset)[/math] belongs to [math]\mathcal{A},[/math] and let [math]\mathcal{V}=\{Z\setminus U:U\in\mathcal{U}\}.[/math] Then the set of all pairs [math](U_1,V_2)[/math] such that [math]U_1\in\mathcal{U}[/math] and [math]V_2\in\mathcal{V}[/math] is equal-slices dense (this follows from the proof of Sperner's theorem). It follows that [math]\mathcal{A}[/math] is disjoint from an equal-slices-dense set of complexity 1.

Step 6. We can partition the set of all disjoint pairs [math](U_1,V_2)[/math] according to which of the sets [math]\mathcal{U}\times\mathcal{V},[/math] [math]\mathcal{U}\times\mathcal{V}^c,[/math] [math]\mathcal{U}^c\times\mathcal{V}[/math] or [math]\mathcal{U}^c\times\mathcal{V}^c[/math] they belong to. There must be at least one of the three sets other than [math]\mathcal{U}\times\mathcal{V}[/math] in which [math]\mathcal{A}[/math] has a density increment. Thus, we have a local equal-slices density increment on a set of complexity 1.

Further details

Step 1

This one is easy. First let us prove the comparable result in [math][2]^n.[/math] That is, let us prove that if a is within [math]O(\sqrt{n})[/math] of n/2 and [math]r=o(\sqrt{n},[/math] then [math]\binom na=(1+o(1))\binom n{a+r}.[/math] This is because the ratio of [math]\binom nk[/math] to [math]\binom n{k+1}[/math] is (k+1)/(n-k), so if [math]k=n/2+O(\sqrt{n}),[/math] then the ratio is [math]1+O(n^{-1/2}).[/math] If we now multiply [math]r=o(\sqrt{n})[/math] such ratios together we get [math]1+o(1).[/math]

To get from there to a comparable statement about the sizes of slices in [math][3]^n,[/math] note that we can get from [math](a,b,c)[/math] to [math](a+r,b+s,c+t)[/math] by two operations where we add [math]o(\sqrt n)[/math] to one coordinate and subtract [math]o(\sqrt{n})[/math] from another. Each time we do so, we multiply by [math]1+o(1),[/math] by the result for [math][2]^n[/math] (but applied to [math][2]^p[/math] with p close to 2n/3).

Step 2

First let us make the statement more precise. Let us say that a probability distribution [math]\nu[/math] on a finite set X is [math]\epsilon[/math]-uniform if [math]\nu(A)[/math] never differs from [math]|A|/|X|[/math] by more than [math]\epsilon.[/math] (A probabilist would say that the total variation distance between [math]\nu[/math] and the uniform distribution is at most [math]\epsilon.[/math]) Then the precise claim is the following. Let [math]\epsilon,\eta\gt0.[/math] Suppose that [math]\mu[/math] is a probability distribution on some collection [math]\Sigma[/math] of combinatorial subspaces S of [math][3]^n.[/math] Now choose a point x randomly by first choosing a subspace S [math]\mu[/math]-randomly from [math]\Sigma[/math] and then choosing [math]x[/math] [math]\sigma_S[/math]-randomly from S. Suppose that the resulting distribution [math]\nu[/math] is [math]\epsilon[/math]-uniform. Then either we can find a combinatorial subspace [math]S\in\Sigma[/math] such that [math]\sigma_S(\mathcal{A}\cap S)\geq\delta+\epsilon[/math] or, when you choose S randomly according to the distribution [math]\mu,[/math] the probability that [math]\sigma_S(\mathcal{A}\cap S)\leq\delta-\eta[/math] is at most [math]2\epsilon/\eta.[/math]

Proof. Let us first work out a lower bound for the expectation of [math]\delta(S):=\sigma_S(\mathcal{A}\cap S).[/math] This expectation is [math]\sum_{S\in\Sigma}\mu(S)\delta(S),[/math] which is precisely the probability that you obtain a point in [math]\mathcal{A}[/math] if you first pick a [math]\mu[/math]-random S and then pick a [math]\sigma_S[/math]-random point in S. In other words, it is [math]\nu(\mathcal{A}),[/math] which by hypothesis is within [math]\epsilon[/math] of [math]\delta,[/math] and is therefore at least [math]\delta-\epsilon.[/math] If the probability that [math]\delta(S)\lt\delta-\eta[/math] is p and [math]\delta(S)[/math] is bounded above by [math]\delta+\epsilon,[/math] then the expectation of [math]\delta(S)[/math] is at most [math]p(\delta-\eta)+(1-p)(\delta+\epsilon),[/math] which equals [math]\delta+\epsilon-p(\eta+\epsilon).[/math] If [math]p\gt2\epsilon/\eta,[/math] then this is less than [math]\delta+\epsilon-2\epsilon,[/math] which is a contradiction. [math]\Box[/math]

In the informal statement of Step 2 above, we said "we may assume" that almost all densities are at least [math]\delta-\eta.[/math] The reason is that the above argument shows that the only thing that could go wrong is if there exists a subspace [math]S\in\Sigma[/math] such that [math]\delta(S)=\sigma_S(\mathcal{A}\cap S)\geq\delta+\epsilon.[/math] But we shall choose the measures [math]\sigma_S[/math] in such a way that if this happens then we can pass to a further subspace inside which the uniform density is at least [math]\delta+\epsilon/2.[/math] And if we can do that, then we have our desired density increment.

Step 3

Now let us pick a random point [math](U,V,W)[/math] and a random set [math]Z\subset[n][/math] of size [math]m=o(\sqrt{n}).[/math] We claim first that the distribution of an equal-slices-random point in the combinatorial subspace [math]S=(U,V,W)++[3]^Z[/math] is approximately uniform, and also that the distribution of an equal-layers random point in the set [math]T=(U,V,W)++[2]^Z[/math] is approximately uniform. (For the sake of clarity, I'll say "equal-layers" for [math][2]^n[/math] and "equal-slices" for [math][3]^n.[/math]) Just in case there is any doubt, the equal-slices measures on the subspaces are not the restrictions of equal-slices measure on [math][3]^n[/math] to those subspaces: rather, they are what you get when you think of the subspaces as copies of [math][3]^m.[/math]

To prove this assertion (which is essentially already proved in the discussion of the equivalence of DHJ(3) for the two measures), let us first fix three non-negative integers [math]a,b,c[/math] that add up to m, and then examine the distribution of the point [math]x[/math] chosen by first picking a random [math](U,V,W),[/math] then picking a random triple [math](U',V',W')[/math] belonging to the slice [math]\Gamma_{a,b,c}[/math] of [math][3]^Z,[/math] and finally taking the point [math]x=(U,V,W)++(U',V',W').[/math] This is equivalent to choosing [math](U',V',W')[/math] first and then filling up the rest of the sequence randomly. Since [math]U', V'[/math] and [math]W'[/math] are random sets of size a, b and c, the effect of this is to change very slightly the density associated with each slice. More precisely, the densities of near-central slices are hardly affected, while the densities of outlying slices are irrelevant because their total measure is tiny.

Once we've done that for a single triple (a,b,c) we can average over all of them (with appropriate weights) and get the result. For now, I will not give this argument in any more detail.

A similar argument (in fact, almost exactly the same argument) proves that if you choose an equal-layers random point in [math](U,V,W)++[2]^Z,[/math] then it too will have a distribution that is [math]\epsilon[/math]-uniform.

Now let us find the particular [math](U,V,W)[/math] and Z that we are looking for. Because the distribution of an equal-slices random point in [math]S=(U,V,W)++[3]^Z[/math] is [math]\epsilon[/math]-uniform, the hypotheses of Step 2 are satisfied for the uniform measure on the subspaces S of this form and the equal-slices measure inside. Therefore, we are free to assume that the proportion of such subspaces S inside which the equal-slices density is less than [math]\delta-\eta[/math] is at most [math]2\epsilon/\eta.[/math] But we also know that if we choose a random point from a random set of the form [math](U,V,W)++[2]^Z,[/math] then it is [math]\epsilon[/math]-uniform, so its probability of being in [math]\mathcal{A}[/math] is at least [math]\delta-\epsilon.[/math] It follows that with probability at least [math]\delta/3[/math] the density of [math]\mathcal{A}[/math] inside [math](U,V,W)++[2]^Z[/math] is at least [math]\delta/3.[/math] So provided we choose [math]\epsilon[/math] and [math]\eta[/math] so that [math]2\epsilon/\eta[/math] is less than [math]\delta/3,[/math] we can find [math](U,V,W)[/math] and [math]Z[/math] such that both statements hold. This proves Step 3.

Step 4

In one way this is trivial, and in another it is the observation that drives the whole argument (and has been mentioned in different guises and by various people several times on the blog threads). If [math]U_1\subset U_2[/math] and [math](U_1,Z\setminus U_1,\emptyset)[/math] and [math](U_2,Z\setminus U_2,\emptyset)[/math] both belong to [math]\mathcal{A},[/math] then, writing [math]V_i[/math] for [math]Z\setminus U_i,[/math] the claim is that [math](U_1,V_2,Z\setminus(U_1\cup V_2))[/math] does not belong to [math]\mathcal{A}.[/math] But that is because the points [math](U_1,V_1,\emptyset), (U_2,V_2,\emptyset)[/math] and [math](U_1,V_2,Z\setminus(U_1\cup V_2))[/math] form a combinatorial line, the first two points of which belong to [math]\mathcal{A}.[/math]

Step 5

The set of all pairs [math](U_1,V_2)[/math] such that [math]U_1\in\mathcal{U}[/math] and [math]V_2\in\mathcal{V}[/math] is in one-to-one correspondence with the set of all pairs [math]U_1\subset U_2[/math] such that [math]U_1,U_2\in\mathcal{U}.[/math] From Step 3 we know that the equal-layers density of [math]\mathcal{U}[/math] is at least [math]\delta/3.[/math] Therefore, if we choose a random permutation [math]\pi[/math] of [math][n],[/math] the expected density of initial segments that lie in [math]\mathcal{U}[/math] is at least [math]\delta/3.[/math] It follows from Cauchy-Schwarz that the expected density of pairs of initial segments is at least [math]\delta^2/9.[/math] Therefore, the set of all disjoint pairs [math](U_1,V_2)[/math] that belong to [math]\mathcal{U}\times\mathcal{V}[/math] has density at least [math]\delta^2/9[/math] in the set of all disjoint pairs (where the density of pairs is given by first choosing their cardinalities randomly and then choosing the sets given the cardinalities).

It remains to deduce from this that the collection of points [math](U,V,W)[/math] such that [math]U\in\mathcal{U}[/math] and [math]V\in\mathcal{V}[/math] is equal-slices dense. Hang on, I've just shown precisely the statement that the equal-slices density of this set is at least [math]\delta^2/9.[/math]

Step 6

This one is very simple. We have partitioned [math][3]^Z[/math] into four special sets of complexity 1. [math]\mathcal{A}[/math] is disjoint from one of those sets, which has density at least [math]\delta^2/9.[/math] Therefore, of at least one of the other three we must be able to say that its density is [math]\alpha[/math] but it contains at least [math]\alpha\delta+\delta^2/27)3^m[/math] points of [math]\mathcal{A}[/math] (since otherwise the density of [math]\mathcal{A}[/math] would not be [math]\delta[/math]). This gives us a density increment of at least [math]\delta^2/27[/math] on some special set of complexity 1, which itself must have density at least [math]\delta^2/27[/math] in [math][3]^Z[/math].