Finding narrow admissible tuples

From Polymath Wiki

For any natural number [math]k_0[/math], an admissible [math]k_0[/math]-tuple is a finite set [math]{\mathcal H}[/math] of integers of cardinality [math]k_0[/math] which avoids at least one residue class modulo [math]p[/math] for each prime [math]p[/math]. (Note that one only needs to check those primes [math]p[/math] of size at most [math]k_0[/math], so this is a finitely checkable condition.) Let [math]H(k_0)[/math] denote the minimal diameter [math]\max {\mathcal H} - \min {\mathcal H}[/math] of an admissible [math]k_0[/math]-tuple. As part of the Polymath8 project, we would like to find as good an upper bound on [math]H(k_0)[/math] as possible for given values of [math]k_0[/math]. To a lesser extent, we would also be interested in lower bounds on this quantity. There is some scattered numerical evidence that the optimal value of H is roughly of size [math]k_0 \log k_0 + k_0[/math] for [math]k_0[/math] in the range of interest.
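The admissibility condition above is finitely checkable, and a minimal Python sketch makes this concrete. The helper names `primes_up_to`, `is_admissible` and `diameter` are illustrative choices, not part of the Polymath8 code.

```python
def primes_up_to(n):
    """Return the list of primes <= n using a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            # Zero out the multiples of i starting from i*i.
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, f in enumerate(flags) if f]


def is_admissible(tup):
    """True if tup avoids at least one residue class mod p for every prime p.

    Only primes p <= k0 can have all p classes occupied by k0 integers,
    so only those primes are tested.
    """
    k0 = len(set(tup))
    for p in primes_up_to(k0):
        if len({h % p for h in tup}) == p:
            return False
    return True


def diameter(tup):
    """max(H) - min(H)."""
    return max(tup) - min(tup)


print(is_admissible([0, 4, 6, 10, 12, 16]), diameter([0, 4, 6, 10, 12, 16]))
```

Python's `%` operator returns a non-negative residue for negative integers, so the same check works for tuples containing negative elements.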

Upper bounds

Upper bounds are primarily constructed through various "sieves" that delete one residue class modulo [math]p[/math] from an interval for a lot of primes [math]p[/math]. Examples of sieves, in roughly increasing order of efficiency, are listed below.

Zhang sieve

The Zhang sieve uses the tuple

[math]{\mathcal H} = \{p_{m+1}, \ldots, p_{m+k_0}\}[/math]

where [math]m[/math] is minimized subject to staying admissible. Any [math]m[/math] with [math]p_{m+1} \gt k_0[/math] yields an admissible tuple; in particular, one can just take [math]{\mathcal H}[/math] to be the first [math]k_0[/math] primes past [math]k_0[/math], but this is not optimal. Applying the prime number theorem then gives the upper bound [math]H \leq (1+o(1)) k_0\log k_0[/math].
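As a rough illustration, the search for the minimal [math]m[/math] can be sketched as follows. The function name `zhang_tuple` and the doubling strategy for the prime table are assumptions made for this example only.

```python
def primes_up_to(n):
    """Primes <= n by trial division (adequate for a small illustration)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def is_admissible(tup):
    """True if tup misses a residue class mod p for every prime p <= len(tup)."""
    return all(len({h % p for h in tup}) < p for p in primes_up_to(len(tup)))


def zhang_tuple(k0):
    """Return (m, [p_{m+1}, ..., p_{m+k0}]) for the least admissible m."""
    limit = 64
    while True:
        primes = primes_up_to(limit)
        # Any m with p_{m+1} > k0 is admissible, so m never exceeds pi(k0).
        m_max = sum(1 for p in primes if p <= k0)
        if len(primes) >= m_max + k0:
            break
        limit *= 2  # not enough primes yet: enlarge the table
    for m in range(m_max + 1):
        tup = primes[m:m + k0]
        if is_admissible(tup):
            return m, tup


print(zhang_tuple(3))  # -> (2, [5, 7, 11])
```

The loop always terminates because the tuple with [math]m = \pi(k_0)[/math], i.e. the first [math]k_0[/math] primes past [math]k_0[/math], is admissible.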

Hensley-Richards sieve

The Hensley-Richards sieve [HR1973], [HR1973b], [R1974] uses the tuple

[math]{\mathcal H} = \{-p_{m+\lfloor k_0/2\rfloor - 1}, \ldots, -p_{m+1}, -1, +1, p_{m+1},\ldots, p_{m+\lfloor k_0/2+1/2\rfloor-1}\}[/math]

where [math]m[/math] is optimised to minimize the diameter while staying admissible.

Asymmetric Hensley-Richards sieve

The asymmetric Hensley-Richards sieve uses the tuple

[math]{\mathcal H} = \{-p_{m+\lfloor k_0/2\rfloor - 1-i}, \ldots, -p_{m+1}, -1, +1, p_{m+1},\ldots, p_{m+\lfloor k_0/2+1/2\rfloor-1+i}\}[/math]

where [math]i[/math] is an integer and [math]i,m[/math] are optimised to minimize the diameter while staying admissible.
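The symmetric and asymmetric Hensley-Richards tuples can be generated by one small routine. The sketch below is only illustrative: the names `hr_tuple` and `best_hr`, the parameter `max_shift` bounding [math]|i|[/math], and the size of the prime table are assumptions, not taken from the project code.

```python
def primes_up_to(n):
    """Primes <= n by trial division (adequate for a small illustration)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def is_admissible(tup):
    """True if tup misses a residue class mod p for every prime p <= len(tup)."""
    return all(len({h % p for h in tup}) < p for p in primes_up_to(len(tup)))


def hr_tuple(k0, m, i, primes):
    """Build {-p_{m+n_neg}, ..., -p_{m+1}, -1, 1, p_{m+1}, ..., p_{m+n_pos}}."""
    n_neg = k0 // 2 - 1 - i
    n_pos = (k0 + 1) // 2 - 1 + i
    if n_neg < 0 or n_pos < 0:
        return None
    neg = [-p for p in primes[m:m + n_neg]]
    pos = primes[m:m + n_pos]
    return sorted(neg + [-1, 1] + pos)


def best_hr(k0, max_shift=0):
    """Narrowest admissible tuple over m and |i| <= max_shift (0 = symmetric)."""
    primes = primes_up_to(max(64, 4 * k0 * k0.bit_length()))
    m_max = sum(1 for p in primes if p <= k0)  # m = pi(k0) is always admissible
    best = None
    for i in range(-max_shift, max_shift + 1):
        for m in range(m_max + 1):
            tup = hr_tuple(k0, m, i, primes)
            if tup is not None and len(tup) == k0 and is_admissible(tup):
                if best is None or tup[-1] - tup[0] < best[-1] - best[0]:
                    best = tup
                break  # for fixed i, a larger m only widens the tuple
    return best


print(best_hr(4), best_hr(4, max_shift=1))
```

For [math]k_0 = 4[/math] the symmetric search returns a tuple of diameter 10, while allowing [math]|i| \le 1[/math] reduces the diameter to 8.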

Schinzel sieve

Given [math]0 \lt y \lt z \lt x[/math], the Schinzel sieve (discussed in [S1961], [HR1973], [GR1998], [CJ2001]) sieves the interval [math][1,x][/math] by [math]1 \bmod p[/math] for primes [math]p \le y[/math] and by [math]0\bmod p[/math] for primes [math]y \lt p \le z[/math]. Provided that [math]z[/math] is large enough ([math]z=k_0[/math] clearly suffices), the first [math]k_0[/math] survivors form an admissible [math]k_0[/math]-tuple (but not necessarily the narrowest one in the interval). The case [math]y=1[/math] corresponds to a sieve of Eratosthenes; if one minimizes [math]z[/math] and takes the first [math]k_0[/math] survivors greater than 1, this yields the same admissible [math]k_0[/math]-tuple as Zhang, with the minimal possible value of [math]m[/math].

Shifted Schinzel sieve

As a generalization of the Schinzel sieve, one may instead sieve shifted intervals [math][s,s+x][/math]. Subtracting [math]s[/math] from every element, this is equivalent to sieving the interval [math][0,x][/math] by the residue classes [math]1-s\ \bmod\ p[/math] for primes [math]p\le y[/math] and [math]-s\ \bmod\ p[/math] for primes [math]y \lt p\le z[/math].
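A direct, unoptimized sketch of the (shifted) Schinzel sieve is given below. It works with the integers of [math][s,s+x][/math] themselves, removing [math]1 \bmod p[/math] for [math]p \le y[/math] and [math]0 \bmod p[/math] for [math]y \lt p \le z[/math]. The function name `schinzel_survivors` is an assumption for this example.

```python
def primes_up_to(n):
    """Primes <= n by trial division (adequate for a small illustration)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def schinzel_survivors(s, x, y, z):
    """Survivors of [s, s+x] after the Schinzel sieve with parameters y < z."""
    primes = primes_up_to(int(z))
    small = [p for p in primes if p <= y]  # sieved by 1 mod p
    large = [p for p in primes if p > y]   # sieved by 0 mod p
    survivors = []
    for n in range(s, s + x + 1):
        if any(n % p == 1 for p in small):
            continue
        if any(n % p == 0 for p in large):
            continue
        survivors.append(n)
    return survivors


# Unshifted interval [1, 20] with y = 2, z = 3: even numbers not divisible by 3.
print(schinzel_survivors(1, 19, 2, 3))  # -> [2, 4, 8, 10, 14, 16, 20]
```

Taking the first [math]k_0[/math] survivors then gives the tuple; with [math]k_0 = 3[/math] and [math]z = 3[/math] in the call above this is [math]\{2, 4, 8\}[/math], of diameter 6.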

Greedy sieve

Within a given interval, one sieves a single residue class [math]a \bmod p[/math] for increasing primes [math]p=2,3,5,\ldots[/math], with [math]a[/math] chosen to maximize the number of survivors. Ties can be broken in a number of ways: minimize [math]a\in[0,p-1][/math], maximize [math]a\in [0,p-1][/math], minimize [math]|a-\lfloor p/2\rfloor|[/math], or randomly. If not all residue classes modulo [math]p[/math] are occupied by survivors, then [math]a[/math] will be chosen so that no survivors are sieved. This necessarily occurs once [math]p[/math] exceeds the number of survivors but typically happens much sooner. One then chooses the narrowest [math]k_0[/math]-tuple [math]{\mathcal H}[/math] among the survivors (if there are fewer than [math]k_0[/math] survivors, retry with a wider interval).
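The greedy sieve can be sketched in a few lines. This version breaks ties by minimizing [math]a\in[0,p-1][/math], which is only one of the rules listed above, and the name `greedy_sieve` is chosen for this example.

```python
from collections import Counter


def primes_up_to(n):
    """Primes <= n by trial division (adequate for a small illustration)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def greedy_sieve(lo, hi, k0):
    """Greedy sieve on [lo, hi]; return the narrowest k0-tuple, or None."""
    survivors = list(range(lo, hi + 1))
    for p in primes_up_to(hi - lo + 1):
        if p > len(survivors):
            break  # fewer survivors than classes: some class is already empty
        counts = Counter(n % p for n in survivors)
        if len(counts) < p:
            continue  # an unoccupied class exists, so nothing is sieved
        # Remove the least-populated class; ties go to the smallest a.
        a = min(range(p), key=lambda r: (counts[r], r))
        survivors = [n for n in survivors if n % p != a]
    if len(survivors) < k0:
        return None  # caller should retry with a wider interval
    # Every subset of an admissible set is admissible, so it suffices to
    # scan windows of k0 consecutive survivors for the smallest diameter.
    windows = (survivors[j:j + k0] for j in range(len(survivors) - k0 + 1))
    return min(windows, key=lambda w: w[-1] - w[0])


print(greedy_sieve(0, 10, 3))  # -> [2, 4, 8]
```

On [math][0,10][/math] the sieve removes [math]1 \bmod 2[/math], then [math]0 \bmod 3[/math] (a three-way tie broken downward), leaving [math]\{2,4,8,10\}[/math].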

Greedy-Schinzel hybrid

Heuristically, the performance of the greedy sieve is significantly improved by starting with a shifted Schinzel sieve on [math][s,\ s+x][/math] using [math]y = 2[/math] and [math]z = \sqrt{x}[/math] and then continuing in a greedy fashion, as proposed by Sutherland. One first optimizes the shift value [math]s[/math] over some larger interval (e.g. [math][-k_0\log\ k_0,\ k_0\log\ k_0][/math]) and then continues the sieving over primes [math]p \gt z[/math] greedily choosing the best residue class for each prime according to a chosen tie-breaking rule (in Sutherland's original implementation, ties are broken downward in [math][0,\ p-1][/math]).

Seeded greedy sieve

Given an initial sequence [math]{\mathcal S}[/math] that is known to contain an admissible [math]k_0[/math]-tuple, one can apply greedy sieving to the minimal interval containing [math]{\mathcal S}[/math] until an admissible sequence of survivors remains, and then choose the narrowest [math]k_0[/math]-tuple it contains. The sieving methods above can be viewed as the special case where [math]{\mathcal S}[/math] is the set of integers in some interval. The main difference is that the choice of [math]{\mathcal S}[/math] affects when ties occur and how they are broken with greedy sieving. One approach is to take [math]{\mathcal S}[/math] to be the union of two [math]k_0[/math]-tuples that lie in roughly the same interval (see Iterated merging below).

Iterated merging

Given an admissible [math]k_0[/math]-tuple [math]\mathcal{H}_1[/math], one can attempt to improve it using an iterated merging approach suggested by Castryck. One first uses a greedy (or greedy-Schinzel) sieve to construct an admissible [math]k_0[/math]-tuple [math]\mathcal{H}_2[/math] in roughly the same interval as [math]\mathcal{H}_1[/math], then performs a randomized greedy sieve using the seed set [math]\mathcal{S} = \mathcal{H}_1 \cup \mathcal{H}_2[/math] to obtain an admissible [math]k_0[/math]-tuple [math]\mathcal{H}_3[/math]. If [math]\mathcal{H}_3[/math] is narrower than [math]\mathcal{H}_2[/math], replace [math]\mathcal{H}_2[/math] with [math]\mathcal{H}_3[/math], otherwise try again with a new [math]\mathcal{H}_3[/math]. Eventually the diameter of [math]\mathcal{H}_2[/math] will become less than or equal to that of [math]\mathcal{H}_1[/math]. As long as [math]\mathcal{H}_1\ne \mathcal{H}_2[/math], one can continue to attempt to improve [math]\mathcal{H}_2[/math], but in practice one stops after some number of retries.

As described by Sutherland, one can then replace [math]\mathcal{H}_1[/math] with [math]\mathcal{H}_2[/math] and begin the process anew, yielding a randomized algorithm that can be run indefinitely. Key parameters to this algorithm are the choice of the interval used when constructing [math]\mathcal{H}_2[/math], which is typically made wider than the minimal interval containing [math]\mathcal{H}_1[/math] by a small factor [math]\delta[/math] on each side (Sutherland suggests [math]\delta = 0.0025[/math]), and the number of failed attempts allowed while attempting to improve [math]\mathcal{H}_2[/math].

Eventually this process will tend to converge to a particular [math]\mathcal{H}_1[/math] that it cannot improve (or more generally, a set of similar [math]\mathcal{H}_1[/math]'s with the same diameter). Interleaving iterated merging with the local optimizations described below often allows the algorithm to make further progress.

Iterated merging can be viewed as a form of simulated annealing. The set [math]\mathcal{S}[/math] initially contains at least two admissible [math]k_0[/math]-tuples (typically many more), and as the algorithm proceeds the set [math]\mathcal{S}[/math] converges toward [math]\mathcal{H}_1[/math] and the number of admissible [math]k_0[/math]-tuples it contains declines. One can regard the cardinality of the difference between [math]\mathcal{S}[/math] and [math]\mathcal{H}_1[/math] as a measure of the "temperature" of a gradually cooling system, since the number of choices available to the algorithm declines as this cardinality is reduced (more precisely, one may consider the entropy of the possible sequence of tie-breaking choices available for a given [math]\mathcal{S}[/math]).

Local optimizations

Let [math]\mathcal H = \{h_1,\ldots, h_{k_0}\}[/math] be an admissible [math]k_0[/math]-tuple with endpoints [math]h_1[/math] and [math]h_{k_0}[/math], and let [math]\mathcal I[/math] be the interval [math][h_1,h_{k_0}][/math]. If there exists an integer [math]h\in\mathcal I[/math] such that removing one of [math]\mathcal H[/math]'s endpoints and inserting [math]h[/math] yields an admissible [math]k_0[/math]-tuple [math]\mathcal H'[/math], then call [math]\mathcal H[/math] contractible; if not, say that [math]\mathcal H[/math] is non-contractible. Note that [math]\mathcal H'[/math] necessarily has smaller diameter than [math]\mathcal H[/math]. Any of the sieving methods described above may produce admissible [math]k_0[/math]-tuples that are contractible, so it is worth testing for contractibility as a post-processing step after sieving and replacing [math]\mathcal H[/math] by [math]\mathcal H'[/math] if this test succeeds.
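A brute-force contractibility test might look like the sketch below. The name `contract` and the search order (left endpoint first, candidates [math]h[/math] in increasing order) are arbitrary choices for illustration.

```python
def is_admissible(tup):
    """True if tup misses a residue class mod p for every prime p <= len(tup)."""
    primes = [p for p in range(2, len(tup) + 1)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    return all(len({h % p for h in tup}) < p for p in primes)


def contract(tup):
    """Return a narrower admissible tuple H' if tup is contractible, else None."""
    tup = sorted(tup)
    members = set(tup)
    for rest in (tup[1:], tup[:-1]):  # drop the left, then the right endpoint
        # Candidates lie strictly inside the interval and outside the tuple.
        for h in range(tup[0] + 1, tup[-1]):
            if h in members:
                continue
            candidate = sorted(rest + [h])
            if is_admissible(candidate):
                return candidate
    return None


print(contract([0, 2, 8]))  # -> [2, 4, 8]
print(contract([0, 2, 6]))  # -> None
```

Each call costs on the order of the diameter times an admissibility check, so in practice one would update residue-class counts incrementally rather than rechecking from scratch.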

We can also shift [math]\mathcal H [/math] to the left by removing its right end point [math]h_{k_0}[/math] and replacing it with the greatest integer [math]h_0 \lt h_1[/math] that yields an admissible [math]k_0[/math]-tuple [math]\mathcal H'[/math], and we can similarly shift [math]\mathcal H[/math] to the right. The diameter of [math]\mathcal H'[/math] need not be less than [math]\mathcal H[/math], but if it is, it provides a useful replacement. More generally, by shifting [math]\mathcal H[/math] repeatedly we can produce a sequence of admissible [math]k_0[/math]-tuples that lie successively further to the left or right. In general the diameter of these tuples may grow as we do so, but it will also occasionally decline, and we may be able to find a shifted [math]\mathcal H'[/math] with smaller diameter than [math]\mathcal H[/math].
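The shifting step can be written as a short loop. The names `shift_left` and `shift_right` are hypothetical, and the admissibility check is the naive one.

```python
def is_admissible(tup):
    """True if tup misses a residue class mod p for every prime p <= len(tup)."""
    primes = [p for p in range(2, len(tup) + 1)
              if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    return all(len({h % p for h in tup}) < p for p in primes)


def shift_left(tup):
    """Drop the right endpoint and add the greatest admissible h0 < min(tup)."""
    tup = sorted(tup)
    rest = tup[:-1]
    h0 = tup[0] - 1
    while not is_admissible([h0] + rest):
        h0 -= 1
    return [h0] + rest


def shift_right(tup):
    """Drop the left endpoint and add the least admissible h > max(tup)."""
    tup = sorted(tup)
    rest = tup[1:]
    h = tup[-1] + 1
    while not is_admissible(rest + [h]):
        h += 1
    return rest + [h]


print(shift_left([0, 2, 6]))   # -> [-4, 0, 2]
print(shift_right([0, 2, 6]))  # -> [2, 6, 8]
```

Both loops terminate, since a new endpoint congruent to an existing element modulo every prime up to [math]k_0[/math] occupies no new residue class.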

A more sophisticated local optimization involves a process of "adjustment" proposed by Savitt. Let [math]\mathcal H [/math] be an admissible [math]k_0[/math]-tuple. For a prime [math]p[/math] and an integer [math]a[/math], let [math][a;p][/math] denote the residue class [math]a\bmod p[/math], i.e. the set of integers [math]\{ x : x \equiv a \bmod p\}[/math]. Call [math][a;p][/math] occupied if it contains an element of [math]\mathcal H [/math].

Suppose that [math][a;p][/math] and [math][b;q][/math] are occupied residue classes, for some distinct primes [math]p[/math] and [math]q[/math], and that [math][a';p][/math] and [math][b';q][/math] are unoccupied. Let [math]\mathcal U[/math] be the intersection of [math]\mathcal H[/math] with [math][a;p] \cup [b;q][/math], and let [math]\mathcal V[/math] be a subset of the integers that lie in the intersection of the interval [math]\mathcal I[/math] containing [math]\mathcal H[/math] and the set [math][a';p] \cup [b';q][/math] such that the set [math]\mathcal H' [/math] formed by removing the elements of [math]\mathcal U[/math] from [math]\mathcal H [/math] and adding the elements of [math]\mathcal V [/math] is admissible. A necessary (and often sufficient) condition for an integer [math]v[/math] to lie in [math]\mathcal V[/math] is that [math]v[/math] must not lie in a residue class [math][c;r][/math] that is the unique unoccupied residue class modulo [math]r[/math] for any prime [math]r[/math] other than [math]p[/math] or [math]q[/math].

The admissible set [math]\mathcal H' [/math] lies in the interval [math]\mathcal I[/math] containing [math]\mathcal H[/math], so its diameter is no greater than that of [math]\mathcal H[/math]; however, its cardinality may differ. If it happens that [math]\mathcal H' [/math] contains more elements than [math]\mathcal H [/math], then by eliminating points at either end of [math]\mathcal H' [/math] we obtain an admissible [math]k_0[/math]-tuple that is narrower than [math]\mathcal H[/math], and we may "adjust" [math]\mathcal H [/math] by replacing it with this narrower tuple. The process of adjustment can often be applied repeatedly, yielding a sequence of successively narrower admissible [math]k_0[/math]-tuples.

Further refinements

Lower bounds

There is a substantial amount of literature on bounding the quantity [math]\pi(x+y)-\pi(x)[/math], the number of primes in a shifted interval [math][x+1,x+y][/math], where [math]x,y[/math] are natural numbers. As a general rule, whenever a bound of the form

[math] \pi(x+y) - \pi(x) \leq F(y) [/math] (*)

is established for some function [math]F(y)[/math] of [math]y[/math], the method of proof also gives a bound of the form

[math] k_0 \leq F( H(k_0)+1 ).[/math] (**)

Indeed, if one assumes the prime tuples conjecture, any admissible [math]k_0[/math]-tuple of diameter [math]H[/math] can be translated into an interval of the form [math][x+1,x+H+1][/math] for some [math]x[/math] so that all of its elements are prime, and applying (*) with [math]y = H+1[/math] then gives (**). In the opposite direction, all known bounds of the form (*) proceed by using the fact that for [math]x \gt y[/math], the set of primes between [math]x+1[/math] and [math]x+y[/math] is admissible, so the method of proof of (*) invariably gives (**) as well.

Examples of lower bounds are as follows:

Brun-Titchmarsh inequality

The Brun-Titchmarsh theorem gives

[math] \pi(x+y) - \pi(x) \leq (1 + o(1)) \frac{2y}{\log y}[/math]

which then gives the lower bound

[math] H(k_0) \geq (\frac{1}{2}-o(1)) k_0 \log k_0[/math].

Montgomery and Vaughan removed the [math]o(1)[/math] error term from the Brun-Titchmarsh theorem [MV1973, Corollary 2], giving the more precise inequality

[math] k_0 \leq 2 \frac{H(k_0)+1}{\log (H(k_0)+1)}.[/math]
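To turn this inequality into an explicit lower bound on [math]H(k_0)[/math], one can search for the smallest [math]H[/math] that satisfies it. The sketch below does this by bisection; the function names are illustrative, and it assumes [math]k_0 \ge 2[/math] and uses the fact that [math]2n/\log n[/math] is increasing for [math]n \ge 3[/math].

```python
import math


def smallest_H(k0, bound):
    """Smallest integer H >= 2 with bound(H) >= k0, for nondecreasing bound."""
    lo, hi = 2, 4
    while bound(hi) < k0:
        lo, hi = hi, 2 * hi
    if bound(lo) >= k0:
        return lo
    # Invariant: bound(lo) < k0 <= bound(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if bound(mid) >= k0:
            hi = mid
        else:
            lo = mid
    return hi


def brun_titchmarsh_lower_bound(k0):
    """Smallest H with k0 <= 2 (H + 1) / log(H + 1)."""
    return smallest_H(k0, lambda H: 2 * (H + 1) / math.log(H + 1))


print(brun_titchmarsh_lower_bound(342), brun_titchmarsh_lower_bound(1000))
```

For [math]k_0 = 342[/math] and [math]k_0 = 1000[/math] this gives 1,214 and 4,167, in agreement with the Brun-Titchmarsh row of the table below.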

First Montgomery-Vaughan large sieve inequality

The large sieve inequality (in the sharp form of Selberg) [IK2004, Theorem 7.14] gives

[math] k_0 (\sum_{q \leq Q} \frac{\mu^2(q)}{\phi(q)}) \leq H(k_0) + Q^2[/math]

for any [math]Q \gt 1[/math], which is a parameter that one can optimise over (the optimal value is comparable to [math]H(k_0)^{1/2}[/math]).
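Optimizing over [math]Q[/math] is easy to do numerically. The sketch below restricts to integer [math]Q \ge 2[/math] (for real [math]Q[/math] the sum only changes at integers, so nothing is lost) and stops at [math]Q = k_0[/math], beyond which [math]k_0 \sum_{q\le Q}\mu^2(q)/\phi(q) - Q^2 \le Q(k_0 - Q)[/math] is non-positive. The function names are assumptions for this example, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from math import ceil


def mu2_over_phi(q):
    """mu(q)^2 / phi(q): zero unless q is squarefree, else 1 / prod(p - 1)."""
    phi, p = 1, 2
    while p * p <= q:
        if q % p == 0:
            q //= p
            if q % p == 0:
                return Fraction(0)  # p^2 divided the original q
            phi *= p - 1
        p += 1
    if q > 1:
        phi *= q - 1  # one remaining prime factor
    return Fraction(1, phi)


def large_sieve_lower_bound(k0):
    """max over integers 2 <= Q <= k0 of ceil(k0 * S(Q) - Q^2)."""
    best = 0
    S = Fraction(1)  # the q = 1 term
    for Q in range(2, k0 + 1):
        S += mu2_over_phi(Q)
        best = max(best, ceil(k0 * S - Q * Q))
    return best


print(large_sieve_lower_bound(10))  # -> 16
```

The rounding convention here (taking the ceiling, since [math]H(k_0)[/math] is an integer) is a choice made for this sketch and may differ from the one used to produce the table.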

Second Montgomery-Vaughan large sieve inequality

The second Montgomery-Vaughan large sieve inequality [MV1973, Corollary 1] gives

[math] k_0 \leq (\sum_{q \leq z} (H(k_0)+1+cqz)^{-1} \mu(q)^2 \prod_{p|q} \frac{1}{p-1})^{-1}[/math]

for any [math]z \gt 1[/math], which is a parameter similar to [math]Q[/math] in the previous inequality, and [math]c[/math] is an absolute constant. In the original paper of Montgomery and Vaughan, [math]c[/math] was taken to be [math]3/2[/math]; this was then reduced to [math]\sqrt{22}/\pi[/math] [B1995, p.162] and then to [math]3.2/\pi[/math] [M1978]. It is conjectured that [math]c[/math] can in fact be taken to be [math]1[/math].


Efforts to fill in the blank fields in this table are very welcome.

| [math]k_0[/math] | 3,500,000 | 341,640 | 181,000 | 34,429 | 26,024 | 23,283 | 22,949 | 10,719 | 7,140 | 6,329 | 5,453 | 5,000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Upper bounds | | | | | | | | | | | | |
| First [math]k_0[/math] primes past [math]k_0[/math] | 59,874,594 | 5,005,362 | 2,530,338 | 420,878 | 310,134 | 275,082 | 270,698 | 117,714 | 75,222 | 65,924 | 55,892 | 50,840 |
| Zhang sieve | 59,093,364 | 4,923,060 | 2,486,370 | 411,946 | 303,558 | 268,544 | 264,460 | 114,814 | 73,448 | 64,182 | 54,516 | 49,586 |
| Hensley-Richards sieve | 57,554,086 | 4,802,222 | 2,422,558 | 402,790 | 297,454 | 262,794 | 258,780 | 112,868 | 72,538 | 63,708 | 53,654 | 48,634 |
| Asymmetric Hensley-Richards | 57,480,832 | 4,788,240 | 2,418,054 | 401,700 | 296,154 | 262,286 | 258,302 | 112,562 | 72,062 | 62,900 | 53,278 | 48,484 |
| Shifted Schinzel sieve | 56,789,070 | 4,740,846 | 2,396,594 | 399,248 | 294,810 | 260,714 | 256,702 | 112,200 | 71,930 | 62,892 | 53,236 | 48,472 |
| Greedy-Schinzel hybrid | 55,233,744 | 4,603,276 | 2,326,458 | 388,076 | 286,308 | 253,968 | 249,992 | 108,694 | 69,564 | 60,942 | 51,688 | 46,968 |
| Best known tuple | 55,233,504 | 4,597,926 | 2,323,344 | 386,344 | 285,102 | 252,720 | 248,816 | 108,440 | 69,280 | 60,726 | 51,526 | 46,806 |
| [math]\lfloor k_0 \log k_0 + k_0 \rfloor[/math] | 56,238,957 | 4,694,650 | 2,372,231 | 394,096 | 290,604 | 257,404 | 253,380 | 110,188 | 70,496 | 61,726 | 52,370 | 47,585 |
| Lower bounds | | | | | | | | | | | | |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 19) | | | | 304,704 | 226,104 | 200,852 | 197,874 | 87,690 | 56,726 | 49,794 | 42,494 | 38,710 |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 17) | | 3,379,776 | 1,739,850 | 301,864 | 224,100 | 198,998 | 195,962 | 86,940 | 56,238 | 49,344 | 42,114 | 38,342 |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 13) | 35,926,668 | 3,298,126 | 1,703,774 | 297,726 | 221,266 | 196,562 | 193,578 | 85,954 | 55,614 | 48,858 | 41,648 | 37,920 |
| Partitioning ([math]p_\text{exh}[/math] = 7) | | 2,365,090 | 1,252,938 | 238,264 | 180,094 | 161,092 | 158,802 | 74,160 | 49,320 | 43,688 | 37,630 | 34,590 |
| Partitioning ([math]p_\text{exh}[/math] = 5) | 24,226,450 | 2,364,700 | 1,252,726 | 238,222 | 180,064 | 161,062 | 158,776 | 74,150 | 49,312 | 43,684 | 37,630 | 34,590 |
| MV with [math]c=1[/math] (conjectural) | 32,503,908 | 2,751,677 | 1,395,694 | 234,872 | 173,420 | 153,691 | 151,298 | 66,314 | 42,551 | 37,274 | 31,644 | 28,781 |
| MV with [math]c=3.2/\pi[/math] | 32,469,985 | 2,748,330 | 1,393,869 | 234,529 | 173,140 | 153,447 | 151,056 | 66,211 | 42,471 | 37,207 | 31,584 | 28,737 |
| MV with [math]c=\sqrt{22}/\pi[/math] | 31,765,216 | 2,677,851 | 1,357,096 | 227,078 | 167,860 | 148,719 | 146,393 | 63,917 | 40,946 | 35,903 | 30,478 | 27,708 |
| Montgomery-Vaughan | 31,756,667 | 2,676,967 | 1,356,644 | 226,987 | 167,793 | 148,656 | 146,338 | 63,886 | 40,929 | 35,887 | 30,463 | 27,696 |
| Brun-Titchmarsh | 30,137,225 | 2,517,690 | 1,272,083 | 211,046 | 155,555 | 137,756 | 135,599 | 58,863 | 37,610 | 32,916 | 27,910 | 25,351 |
| Large sieve | 28,080,008 | 2,342,970 | 1,184,955 | 197,097 | 145,712 | 128,972 | 126,932 | 55,179 | 35,236 | 30,983 | 26,389 | 24,038 |

| [math]k_0[/math] | 4,000 | 3,405 | 3,000 | 2,000 | 1,783 | 1,000 | 672 | 632 | 603 | 342 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Upper bounds | | | | | | | | | | |
| First [math]k_0[/math] primes past [math]k_0[/math] | 39,660 | 33,222 | 28,972 | 18,386 | 16,174 | 8,424 | 5,406 | 5,028 | 4,800 | 2,472 |
| Zhang sieve | 38,596 | 32,296 | 28,008 | 17,766 | 15,620 | 8,218 | 5,216 | 4,860 | 4,634 | 2,416 |
| Hensley-Richards sieve | 38,498 | 31,820 | 27,806 | 17,726 | 15,756 | 8,258 | 5,314 | 4,918 | 4,688 | 2,446 |
| Asymmetric Hensley-Richards | 37,932 | 31,762 | 27,638 | 17,676 | 15,470 | 8,168 | 5,220 | 4,876 | 4,672 | 2,424 |
| Shifted Schinzel sieve | 38,006 | 31,910 | 27,600 | 17,554 | 15,484 | 8,072 | 5,196 | 4,868 | 4,610 | 2,416 |
| Greedy-Schinzel hybrid | 36,756 | 30,750 | 26,754 | 17,054 | 15,036 | 7,854 | 5,030 | 4,710 | 4,452 | 2,350 |
| Engelsma data | 36,622 | 30,606 | 26,622 | 16,978 | 14,958 | 7,802 | 4,998 | 4,680 | 4,422 | 2,328 |
| Best known tuple | 36,610 | 30,600 | 26,606 | 16,978 | 14,950 | 7,802 | 4,998 | 4,680 | 4,422 | 2,328 |
| [math]\lfloor k_0 \log k_0 + k_0 \rfloor[/math] | 37,176 | 31,097 | 27,019 | 17,201 | 15,130 | 7,907 | 5,046 | 4,707 | 4,463 | 2,337 |
| Lower bounds | | | | | | | | | | |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 23) | 30,560 | 25,734 | 22,432 | 14,410 | 12,678 | 6,696 | 4,374 | 4,104 | 3,912 | 2,110 |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 19) | 30,366 | 25,566 | 22,284 | 14,332 | 12,614 | 6,672 | 4,344 | 4,080 | 3,870 | 2,096 |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 17) | 30,132 | 25,328 | 22,086 | 14,176 | 12,522 | 6,660 | 4,310 | 4,020 | 3,828 | 2,072 |
| Inclusion-exclusion ([math]p_\text{exh}[/math] = 13) | 29,824 | 25,058 | 21,838 | 14,046 | 12,408 | 6,594 | 4,278 | 3,976 | 3,792 | 2,046 |
| Partitioning ([math]p_\text{exh}[/math] = 7) | 27,556 | 23,524 | 20,704 | 13,724 | 12,244 | 6,810 | 4,574 | 4,276 | 4,052 | 2,328 |
| Partitioning ([math]p_\text{exh}[/math] = 5) | 27,556 | 23,524 | 20,704 | 13,722 | 12,244 | 6,808 | 4,574 | 4,276 | 4,052 | 2,328 |
| MV with [math]c=1[/math] (conjectural) | 22,564 | 18,898 | 16,456 | 10,500 | 9,253 | 4,858 | 3,124 | 2,919 | 2,771 | 1,454 |
| MV with [math]c=3.2/\pi[/math] | 22,523 | 18,866 | 16,428 | 10,480 | 9,236 | 4,847 | 3,118 | 2,913 | 2,765 | 1,450 |
| MV with [math]c=\sqrt{22}/\pi[/math] | 21,701 | 18,153 | 15,758 | 10,061 | 8,850 | 4,648 | 2,979 | 2,778 | 2,633 | 1,361 |
| Montgomery-Vaughan | 21,690 | 18,143 | 15,751 | 10,056 | 8,845 | 4,645 | 2,977 | 2,776 | 2,631 | 1,360 |
| Brun-Titchmarsh | 19,785 | 16,536 | 14,358 | 9,118 | 8,013 | 4,167 | 2,648 | 2,468 | 2,338 | 1,214 |
| Large sieve | 18,860 | 15,784 | 13,697 | 8,616 | 7,548 | 3,960 | 2,559 | 2,393 | 2,273 | 1,192 |

The bold number indicates the best currently known result for a twin-prime-like theorem.

For the Zhang tuples the minimal [math]m[/math] that produced an admissible [math]k_0[/math]-tuple was used. In some cases one can achieve a smaller diameter using an [math]m[/math] that is slightly larger than the minimal admissible value, as noted here.

The shifted Schinzel tuples were generated with [math]y=2[/math] using an optimally chosen interval contained in [math][-k_0\log k_0, 2k_0\log k_0][/math] (the interval is not in every case guaranteed to be optimal, particularly for larger values of [math]k_0[/math], but it is believed to be so).

The greedy-Schinzel tuples were generated using Sutherland's original algorithm, breaking ties downward in every case (and the optimal interval in [math][-k_0\log k_0, 2k_0\log k_0][/math] was selected on this basis). As noted by Castryck, breaking ties upward may produce better results in some cases.

The lower bounds listed in the partitioning and inclusion-exclusion rows were computed as described by Avishay in Sections 1 resp. 2 of this document (the case [math]k_0[/math]=342 corresponds to the trivial partition). The partitioning method was strengthened by using [math]H(343) \geq 2334[/math], [math]H(370) \geq 2530[/math] and [math]H(385) \geq 2656[/math] (a complete list of bounds for [math]k_0[/math] up to 4,000,000 can be found here), and (for [math]k_0 \leq[/math] 341640) by combining the partition method with sieving for primes up to [math]p_\text{exh}[/math], as described here. The inclusion-exclusion involved an exhaustive search (along these lines) up to [math]p_\text{exh}[/math], using the inclusion-exclusion set of primes greater than [math]p_\text{exh}[/math] and less than the first prime where the depth-2 inclusion-exclusion bound is no longer positive.