Tao’s Proof of (logarithmically averaged) Chowla’s conjecture for two point correlations

The Liouville function is the completely multiplicative function {\lambda} with {\lambda(p)=-1} for every prime {p}. The Chowla conjecture predicts that this function behaves randomly. Here is a version of this conjecture.

Conjecture 1 (Chowla) For every finite set {F\subset\{0,1,\dots\}},

\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\prod_{i\in F}\lambda(n+i)=0.

This conjecture received a lot of attention in the last decade, since Peter Sarnak derived from it an orthogonality criterion for the Liouville (or, similarly, the Mobius function), now known as the “Sarnak conjecture”. Both Chowla’s and Sarnak’s conjectures remain open, however many weaker versions and special cases have been derived (see, for instance, this survey). Perhaps the most impressive result so far is the following theorem of Tao, which was in turn the starting point for many of the more recent developments.

Theorem 2 (Tao) For every (distinct) {a,b\in{\mathbb N}},

\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log{(N/M)}}\sum_{n=M}^N\frac1n\lambda(n+a)\lambda(n+b)=0.

Tao’s main theorem in his paper is actually significantly more general than Theorem 2, for it allows more general affine maps inside the argument of {\lambda}, and in fact replaces {\lambda} with a much wider class of multiplicative functions. To see why Theorem 2 follows from Conjecture 1, observe that any sequence {x(n)} satisfying

\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nx(n)=0

must also satisfy

\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log{N/M}}\sum_{n=M}^N\frac1n x(n)=0.

In the terminology and notation of my earlier post, the conclusion of this theorem states that the uniform logarithmic average of the sequence {n\mapsto\lambda(n+a)\lambda(n+b)} is {0}, i.e.,

\displaystyle \underset{n\rightarrow\infty}{UW\text{-}\lim}~\lambda(n+a)\lambda(n+b)=0

for {W(n)=\log n}.

In this post I will present Tao’s proof restricted to the following special case. (Most of the proof remains unchanged in the general case, but the presentation becomes easier when we avoid general {a,b}).

Theorem 3 Let {\lambda} denote the Liouville function. Then

\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log(N/M)}\sum_{n=M}^N\frac1n\lambda(n)\lambda(n+1)=0

We will use a correspondence principle to reformulate the problem in the language of dynamical systems to, among other things, simplify the notation. This approach was also carried out by Tao in his blog, and the main motivation for me to write this post was to try to use an easier construction of a Furstenberg system, which would still keep track of the “congruence classes of the points” in the system, as detailed below the fold.

Continue reading

Posted in Number Theory, Probability | Tagged , , , , | Leave a comment

A viewpoint on Katai’s orthogonality criterion

The Liouville function, defined as the completely multiplicative function {\lambda} which sends every prime {p} to {-1}, encodes several important properties of the primes. For instance, the statement that

\displaystyle \frac1N\sum_{n=1}^N\lambda(n)=o(1)

is equivalent to the prime number theorem, while the improved (and essentially best possible) rate of convergence

\displaystyle \frac1N\sum_{n=1}^N\lambda(n)=O_\epsilon\left(\frac1{N^{1/2-\epsilon}}\right)

for every positive {\epsilon} is equivalent to the Riemann hypothesis.

Of particular interest are statements involving correlations (or lack thereof) between {\lambda} and certain “structured” functions. In this direction, Green and Tao established “orthogonality” of {\lambda} with any nilsequence. More precisely, they showed that for any nilpotent Lie group {G}, any discrete subgroup {\Gamma} for which {X:=G/\Gamma} is compact, any Lipschitz function {F:X\rightarrow{\mathbb C}}, any {A>0} and any {b\in G},

\displaystyle \frac1N\sum_{n=1}^N\lambda(n)F(b^n\Gamma)=O_A\left(\frac1{\log^AN}\right),

which was an important ingredient in their proof of the “finite complexity” version of the Dickson/Hardy-Littlewood prime tuple conjecture. Perhaps inspired by this result, Sarnak then conjectured that in fact the Liouville function (actually, the closely related Möbius function) is orthogonal to every deterministic sequence:

Conjecture 1 (Sarnak conjecture for the Liouville function) Let {X} be a compact metric space and let {T:X\rightarrow X} be a homeomorphism. If the topological dynamical system {(X,T)} has {0} topological entropy, then for every continuous {F\in C(X)} and every {x\in X},

\displaystyle \frac1N\sum_{n=1}^N\lambda(n)F(T^nx)=o(1).

Sarnak’s conjecture is still open, despite some remarkable progress (see the recent survey of Ferenczi, Kułaga-Przymus and Lemańczyk for information on the progress made so far).

A useful tool to establish partial cases of Sarnak’s conjecture (and other results not implied by the Sarnak conjecture) is the following orthogonality criterion of Katai.

Theorem 2 (Katai’s orthogonality criterion) Let {u:{\mathbb N}\rightarrow {\mathbb C}} be bounded and let {v:{\mathbb N}\rightarrow S^1} be a completely multiplicative function. If for every distinct primes {h,h'}

\displaystyle  \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nu(nh)\overline{u(nh')}=0 \ \ \ \ \ (1)


\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nu(n)\overline{v(n)}=0.

Remark 1 One can in fact relax the condition on {v} to be only a multiplicative function. This allows the criterion to be applied, for instance, with {v} being the Möbius function.

The statement of Theorem 2 is reminiscent of the van der Corput trick, which is a ubiquitous tool in the fields of uniform distribution and ergodic Ramsey theory, was the protagonist of a survey Vitaly Bergelson and I wrote and has been mentioned repeatedly in this blog. In fact one can prove both results (Katai’s orthogonality criterion and van der Corput’s trick) using the same simple functional analytic principle. In this post I present this principle and use it to motivate proofs for the two theorems.

Continue reading

Posted in Classic results, Number Theory, Tool | Tagged , , , , | Leave a comment

Entropy of measure preserving systems

A measure preserving system is a quadruple {(X,{\mathcal B},\mu,T)} where {X} is a set, {{\mathcal B}} is a {\sigma}-algebra, {\mu:{\mathcal B}\rightarrow[0,1]} is a probability measure and {T:X\rightarrow X} is a measurable map satisfying {\mu(T^{-1}A)=\mu(A)} for every {A\in{\mathcal B}}.

The notion of isomorphism in the category of measure preserving systems (defined, for instance, in this earlier post) is fairly flexible, as it allows {0} measure perturbations. For instance, the system {([0,1],{\mathcal B},\lambda,T_2)} where {\lambda} is the Lebesgue measure and {T_2x=2x\bmod1} is isomorphic to the Bernoulli system {(\{0,1\}^{\mathbb N},\mu,\sigma)}, where {\sigma:\{0,1\}^{\mathbb N}\rightarrow\{0,1\}^{\mathbb N}} is the left shift and {\mu} is the infinite product of the measure {p} on {\{0,1\}} defined so that {p(\{0\})=p(\{1\})=1/2}.

However, this flexibility can make it quite difficult to decide whether two given systems are isomorphic or not. While several spectral invariants can be used, such as ergodicity, weakly mixing or strongly mixing, they are not sufficient to distinguish even between two Bernoulli shifts. To address this problem, Kolmogorov and Sinai developed the notion of entropy in the late 1950’s. Since then, the entropy of measure preserving systems, together with its sister in topological dynamics, has became a fundamental tool in understanding the structure of dynamical systems, with applications far beyond distinguishing non-isomorphic systems.

In this post I will define and then briefly survey some of the basic properties of entropy. I’ll mostly follow the exposition in the upcoming book of Einsiedler, Lindenstrauss and Ward, whose draft is available here; other good sources are Walter’s book and, in a somewhat different perspective, Austin’s recent notes.

Continue reading

Posted in Classic results, Ergodic Theory, Tool | Tagged | 1 Comment

Optimal intersectivity

In ergodic Ramsey theory, one often wants to prove that certain dynamically defined sets in a probability space intersect (or “recur”) in non-trivial ways. Typically, this is achieved by studying the long term behavior of the sets as the dynamics flow. However, in certain situations, one can establish the desired intersection (or recurrence) using purely combinatorial arguments, and without using the fact that the sets are dynamically defined. In such cases, one ends up obtaining a “static” (as opposed to dynamical) statement. An instance of this situation is the following intersectivity result of Bergelson, first used in this paper, and which I have mentioned before in this blog.

Lemma 1 Let {A_1,A_2,\dots} be sets in a probability space {(X,\mu)} with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite set {I\subset{\mathbb N}} with density {d(I)>\delta}, such that for every non-empty finite set {F\subset I} we have

\displaystyle \mu\left(\bigcap_{n\in F}A_n\right)>0

A different kind of intersection property is the following static modification of Poincaré’s recurrence theorem.

Lemma 2 Let {(X,\mu)} be a probability space and let {A_1, A_2,\dots} be sets with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite subset {I\subset{\mathbb N}} such that for every {n,m\in I},

\displaystyle \mu\left(A_n\cap A_m\right)>\delta^2

Observe that if the events {A_1,A_2,\dots} are independent, then the lower bound {\delta^2} is essentially achieved, and so this lemma is that sense optimal. The purpose of this post is to present a proof of the following common strengthening of Lemmas 1 and 2.

Theorem 3 Let {A_1,A_2,\dots} be sets in a probability space {(X,\mu)} with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite set {I\subset{\mathbb N}} such that for every non-empty finite set {F\subset I} we have

\displaystyle \mu\left(\bigcap_{n\in F}A_n\right)>\delta^{|F|}

Observe that the bound {\delta^{|F|}} is optimal, again by considering the case of independent sets.

I learned this strengthening, as well as its proof from Konstantinos Tyros last December, when we were in Lyon, France attending the conference Ultrafilters, Ramsey Theory and Dynamics.

One could ask whether something more can be said about the set {I} in Theorem 3 other than being infinite . Something one can not hope for is that the set {I} has positive density, as showed in my previous post, using Forest’s theorem that not all sets of recurrence are sets of nice recurrence. On the other hand, one can indeed obtain certain combinatorial structures inside {I}. For instance, assuming the index set of the sets {(A_n)} is given the structure of a homogeneous tree, one can choose {I} to be a strong subtree; this is Theorem 3 in this paper of Dodos, Kanellopoulos and Tyros. Results of this kind are related to density versions of the Hales-Jewett and the Carlson-Simpson theorems due respectively to Furstenberg and Katznelson, and to Dodos, Kanellopoulos and Tyros. Continue reading

Posted in Combinatorics, Probability, Ramsey Theory, Tool | Tagged , , , , | Leave a comment

A proof a sumset conjecture of Erdős

Florian Richter, Donald Robertson and I have uploaded to the arXiv our paper entitled A proof a sumset conjecture of Erdős. The main goal of the paper is to prove the following theorem, which verifies a conjecture of Erdős discussed in this previous post.

Theorem 1 Let {A\subset{\mathbb N}} have positive upper density. Then it contains a sumset {B+C=\{b+c:b\in B, c\in C\}} for infinite sets {B} and {C}.

As usual, the upper density of a set {A\subset{\mathbb N}} is the quantity

\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N.

More generally, given a sequence {\Phi:N\mapsto\Phi_N=\{a_N,a_N+1,\dots,b_N\}} of intervals of integers with lengths {|\Phi_N|} tending to infinity (or even more generally, given any Følner sequence {\Phi:N\mapsto\Phi_N} in {{\mathbb N}}), we define the upper density with respect to {\Phi} as

\displaystyle \bar d_\Phi(A):=\limsup_{N\rightarrow\infty}\frac{\big|A\cap\Phi_N\big|}{|\Phi_N|}.

When the limit exists we say that the density of {A} with respect to {\Phi} exists and denote it by {d(A)} (and whenever we write {d(A)} we are implicitly stating that the limit exists).

As explained in my earlier post mentioned above (see Proposition 12), one can adapt an argument of Di Nasso, Goldbring, Jin, Leth, Lupini and Mahlburg to reduce Theorem 1 to the following statement.

Theorem 2 If {A\subset{\mathbb N}} has positive upper density, then there exist a sequence {\Phi} of intervals, a set {L\subset{\mathbb N}} and {\epsilon>0} such that for every finite subset {F\subset L}, the intersection

\displaystyle  \bigcap_{\ell\in F}(A-\ell)\cap\big\{m\in{\mathbb N}:d_\Phi(A-m\cap L)>0\big\} \ \ \ \ \ (1)

is infinite.

In my previous post I described how Theorem 2 can then be easily checked for weak mixing sets and for Bohr sets, whose definitions we now recall.

Definition 3 A set {A\subset{\mathbb N}} is:

  • a Bohr set if there exist {d\in{\mathbb N}}, a vector {\alpha\in{\mathbb T}^d} and an open set {U\subset{\mathbb T}^d} such that {A\supset\{n\in{\mathbb N}:n\alpha\in U\}\neq\emptyset}.
  • a weak mixing set with respect to the sequence of intervals {\Phi} if the set {\big\{n:\big|d_\Phi((A-n)\cap A)-d_\Phi(A)^2\big|<\epsilon\big\}} has density {1}.

It is well known in the realm of ergodic theory that weak mixing and almost periodicity are extreme points of a dichotomy (see for instance, this post) and so it was natural to try to patch these two observations into a proof of Theorem 2 in general. While this is ultimately how we prove Theorem 2, there were several problems with implementing this idea. Below the fold I explain the problems that arise and (briefly) sketch how we address them. Continue reading

Posted in Combinatorics, paper | Tagged , , , , , , , | Leave a comment

Single and multiple recurrence along non-polynomial sequences

Vitaly Bergelson, Florian Richter and I have recently uploaded to the arXiv our new paper “Single and multiple recurrence along non-polynomial sequences”. In this paper we address the question of what combinatorial structure is present in the set of return times of a non-polynomial sequence, i.e., sets of the form

\displaystyle   R=\big\{n\in{\mathbb N}:\mu(A\cap T^{-f(n)}A)>\mu^2(A)-\epsilon\big\} \ \ \ \ \ (1)

where {(X,\mu,T)} is a measure preserving system, {A\subset X}, {\epsilon>0} and {f:{\mathbb N}\rightarrow{\mathbb N}} is some non-polynomial function.

The departure point for this paper was the observation that when {f(n)} is a polynomial with integer coefficients and {f(0)=0}, set {R} in (1) is syndetic, i.e. has bounded gaps (for the special case {f(n)=n} this is the content of Khintchine’s theorem, and the general case was obtained by Furstenberg, see a proof in my earlier post), but the same is no longer true in general for non-polynomial sequences {f(n)}.

For instance when {f(n)=\lfloor n^a\rfloor} for some positive {a\in{\mathbb R}\setminus{\mathbb Z}} (where {\lfloor \cdot\rfloor} denotes the floor function) it is known that the set {R} from (1) is nonempty, and in fact has positive density but is not (in general) syndetic.

Our first theorem shows that nevertheless, the set {R} has an interesting (and a-priori surprising) combinatorial property, it is thick, i.e., contains arbitrarily long intervals of integers. One reason why this is surprising is that the notions of thick and syndetic sets are dual in the sense that a set is thick if and only if its complement is not syndetic and vice-versa. See my previous post for some discussion of the basic properties of thick and syndetic sets.

Theorem 1 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then the set {R} defined in (1) for {f(n)=\lfloor n^a\rfloor} is thick.

Theorem 1 implies in particular that for every {k\in{\mathbb N}} there exists some {n\in{\mathbb N}} such that each of the sets {T^{-f(n)}A}, {T^{-f(n+1)}A}, … , {T^{-f(n+k)}A} have a large intersection with {A}. Our second theorem proves that in fact all of these sets intersect together.

Theorem 2 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then for every {k\in{\mathbb N}} there exist (many) {n\in{\mathbb N}} such that

\displaystyle \mu\Big(A\cap T^{-f(n)}A\cap T^{-f(n+1)}A\cap\cdots\cap T^{-f(n+k)}A\Big)>0

where {f(n)=\lfloor n^a\rfloor}.

Note that both Theorems 1 and 2 fail when {f(n)} is a polynomial.

In the paper we deal with a considerably more general class of non-polynomial functions (see Definition 4 below) but for now I will restrict attention to the example {f(n)=\lfloor n^a\rfloor}. Continue reading

Posted in Ergodic Theory, paper | Tagged , , , , , | 1 Comment

Erdős Sumset conjecture

Hindman’s finite sums theorem is one of the most famous and useful theorems in Ramsey theory. It states that for any finite partition of the natural numbers, one of the cells {C} of this partition contains an IP-set, i.e., there exists an infinite set {I\subset{\mathbb N}} such that for any finite subset {F\subset I} the sum {\sum_{x\in F}x\in C}.

This result is a partition result with no obvious density counterpart. Indeed it is easy to see that the set {2{\mathbb N}-1} of odd numbers (which has density {1/2}) can not have an IP-set, since for any {x,y\in{\mathbb N}}, either {x}, {y} or {x+y} must be even. One could still hope that any set with positive density can be shifted in order to contain an IP-set. Indeed, for any set of positive Banach density {A} there is a shift {A-k} which contains a triple of the form {\{x,y,x+y\}}, and often configurations which are partition regular are also in some shift of any set with positive density (however this is not always true, even for configurations with {2} elements, as shown by Kriz’s theorem).

Nevertheless, E. Straus constructed an example of a set {A} with density

\displaystyle d(A):=\lim_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N

arbitrarily close to {1} and such that no shift {A-k} with {k\in{\mathbb N}} contains an IP-set. (Straus never published this result, see Theorem 11.6 in this chapter by Hindman or ) or Theorem 2.20 in this more recent paper.

Observe that by splitting an infinite set {I} into two infinite pieces {B,C}, it follows that any IP-set contains the sum {B+C:=\{b+c:b\in B,c\in C\}} for infinite sets {B} and {C}. Therefore the following fact can be seen as a weakening of Hindman’s theorem:

Proposition 1 For any finite partition {{\mathbb N}=C_1\cup\dots\cup C_r} of the natural numbers there exists a color {A\in\{C_1,\dots,C_r\}} and infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

Unlike IP-sets, the configuration {B+C} is invariant under shifts, so it makes sense to ask whether any set with positive density contains such configuration.

Conjecture 2 (Erdős sumset conjecture) Let {A\subset{\mathbb N}} and assume that the Banach density

\displaystyle d^*(A):=\lim_{N\rightarrow\infty}\max_{M\in{\mathbb N}}\frac{|A\cap\{M+1,M+2,\dots,M+N\}|}N

is positive. Then there exist infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

This past week I was at AIM in San Jose for a workshop on Nonstandard methods in combinatorial number theory and there was some interesting discussion surrounding this conjecture. A few years ago there was some significant progress in this direction, in a paper by Di Nasso, Golbring, Jin, Leth, Lupini and Mahlburg using nonstandard analysis, and perhaps because all the authors were present at this workshop there was some hope of extending the ideas developed in the paper to address the general case. In this post I present a standard rendering of the proof from the DGJLLM paper as well as some of the observations made during the workshop. Continue reading

Posted in Combinatorics, Number Theory, State of the art | Tagged , , , , , , , | 4 Comments