## Chowla’s conjecture and the Riemann hypothesis via the Liouville function

Last week I was at the Bernoulli center in Lausanne, Switzerland to participate in the Young Researchers in Mathematics program organized by Florian Richter, and since the topic of the program was in number theory, we had some interesting conversations about prime number theory from an ergodic theory viewpoint. I decided to record some amusing conclusions that seem to connect Chowla’s conjecture to the Riemann hypothesis. To be clear, I am not suggesting any kind of progress in either of these problems, and in fact nothing in this post is particularly deep.

This also seemed to be an appropriate place to spell out some basic facts and conjectures from prime number theory and the relations between them as seen from my personal (i.e. ergodic) perspective.

Posted in Classic results, Number Theory, Probability | Tagged , , , , | Leave a comment

## A proof of Erdos’ B+B+t conjecture

Bryna Kra, Florian Richter, Donald Robertson and I have just uploaded to the arXiv our article entitled “A proof of Erdös’s ${B+B+t}$ conjecture”. This (until now open) conjecture states that any set with positive density contains a configuration of the form ${B\oplus B+t}$ for some ${t\in{\mathbb N}}$ and some infinite subset ${B\subset A}$. Here we use the notation ${B\oplus B}$ to denote the restricted sumset ${\{b_1+b_2:b_1,b_2\in B,\ b_1\neq b_2\}}$. Our result establishes this property for sets having positive upper Banach density, which is a slightly weaker assumption than, say, having positive natural density.

Our main theorem is the following.

Theorem 1 Let ${A\subset{\mathbb N}}$ have positive upper Banach density, i.e. satisfying

$\displaystyle d^*(A):=\limsup_{N-M\rightarrow\infty}\frac1{N-M}\big|A\cap\{M+1,\dots,N\}\big|>0.$

Then there exist an infinite set ${B\subset A}$ and a shift ${t\in{\mathbb N}}$ such that ${A-t}$ contains the restricted sumset ${B\oplus B:=\{b_1+b_2:b_1,b_2\in B,\ b_1\neq b_2\}}$.

Our proof of Theorem 1 uses some of the ergodic theoretic tools developed in our earlier paper, where we proved the following result.

Theorem 2 Let ${A\subset{\mathbb N}}$ with ${d^*(A)>0}$ and let ${k\in{\mathbb N}}$. Then ${A}$ contains a sumset of the form ${B_1+\cdots+B_k}$, where ${B_1,\dots,B_k}$ are infinite sets.

Both Theorems 1 and 2 extend (in different directions) a result of myself, Florian and Donald obtained a few years ago that also resolved a sumset conjecture of Erdös:

Theorem 3 If ${A\subset{\mathbb N}}$ has positive upper Banach density, then there exist infinite sets ${B,C\subset{\mathbb N}}$ such that ${A\supset B+C=\{b+c:b\in B,c\in C\}}$.

To see how Theorem 3 follows from Theorem 1 note that if ${\tilde B\subset{\mathbb N}}$ is an infinite set such that ${\tilde B\oplus\tilde B\subset A-t}$ for some shift ${t\in{\mathbb N}}$, then letting ${B\subset\tilde B}$ be any infinite subset whose complement ${\tilde B\setminus B}$ is also infinite, and letting ${C:=t+\tilde B\setminus B}$ yields ${B+C\subset A}$.

A big part of the innovation in our earlier paper proving Theorem 2 was to obtain a different proof of Theorem 3 (which is the case ${k=2}$) that could be generalized to higher values of ${k}$. It turns out that this new proof can also be adapted to yield the stronger result in Theorem 1, although there are a few annoying issues that need to be handled. Continue reading

Posted in Ergodic Theory, paper, Ramsey Theory | Tagged , , , , | Leave a comment

## Infinite Sumsets in Sets with Positive Density

Bryna Kra, Florian Richter, Donald Robertson and I have just uploaded to the arXiv our article entitled “Infinite Sumsets in Sets with Positive Density”. The main result of the paper is the following.

Theorem 1 Let ${A\subset{\mathbb N}}$ have positive upper density, i.e. satisfying

$\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac1N\big|A\cap\{1,\dots,N\}\big|>0.$

Then for any ${k\in{\mathbb N}}$ there exist infinite sets ${B_1,\dots,B_k\subset{\mathbb N}}$ such that ${B_1+\cdots+B_k\subset A}$.

One can in fact replace the positive upper density condition with the weaker condition of positive upper Banach density. Theorem 1 extends the Erdös sumset conjecture discussed in these previous posts, and answers a question (Question 6.3) raised in our previous paper with Richter and Robertson. Continue reading

| Tagged , , , , , | 1 Comment

## Affine Images of Infinite sets

— 1. Szemerédi’s theorem as affine images —

Szemerédi’s theorem is usually stated as “every set ${A\subset{\mathbb N}}$ with positive upper density contains arbitrarily long arithmetic progressions”, but it can also be formulated without explicit mention of arithmetic progressions:
Theorem 1 (Szemerédi’s theorem, reformulated) Let ${A\subset{\mathbb N}}$ have positive upper density, i.e.

$\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N>0.$

Then for any finite set ${F\subset{\mathbb N}}$ there is an affine transformation ${\phi:x\mapsto ax+b}$ with coefficients ${a,b\in{\mathbb N}}$ such that ${\phi(F)\subset A}$.

To see how this result implies Szemerédi’s theorem, note that for any affine transformation ${\phi}$, the set ${\phi(\{1,2,\dots,k\})}$ is a ${k}$-term arithmetic progression.

Conversely, assuming Szemerédi’s theorem, given ${A\subset{\mathbb N}}$ with ${\bar d(A)>0}$ and a finite set ${F\subset{\mathbb N}}$ let ${k=\max F}$, so that ${F\subset\{1,\dots,k\}}$. Then there is a arithmetic progression of length ${k+1}$ contained in ${A}$, say ${\{b,b+a,\dots,b+ka\}\subset A}$. Letting ${\phi(x)=ax+b}$ it follows that ${\phi(i)\in A}$ for every ${i\leq k}$ and in particular ${\phi(F)\subset A}$.

The advantage of formulating Szemerédi’s theorem in this form is that it readily allows for analogues in other settings. For instance, the multidimensional Szemerédi theorem of Furstenberg and Katznelson can be formulated as follows.

Theorem 2 (Multidimensional Szemerédi theorem, reformulated) Let ${d\in{\mathbb N}}$ and let ${A\subset{\mathbb N}^d}$ have positive upper density, i.e.

$\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}^d|}{N^d}>0.$

Then for any finite set ${F\subset{\mathbb N}^d}$ there is an affine transformation ${\phi:x\mapsto ax+b}$ with coefficients ${a\in{\mathbb N}}$, ${b\in{\mathbb N}^d}$ such that ${\phi(F)\subset A}$.

## Additive transversality of fractal sets in the reals and the integers

Daniel Glasscock, Florian Richter and I have just uploaded to the arXiv our new paper entitled “Additive transversality of fractal sets in the reals and the integers”. In this paper we introduce a new notion of fractal sets of natural numbers in analogy with ${\times r}$-invariant subsets of ${{\mathbb T}={\mathbb R}/{\mathbb Z}}$, and obtain the analogues of several results pertaining ${\times r}$-invariant subsets of ${{\mathbb T}={\mathbb R}/{\mathbb Z}}$.

The prototypical example of the kind of fractal subsets of ${{\mathbb N}_0=\{0,1,2,\dots\}}$ that we deal with is the integer middle thirds Cantor set consisting of all ${n\in{\mathbb N}_0}$ that when represented in base ${3}$ use only the digits ${0}$ and ${2}$. Symbolically

$\displaystyle C_{\mathbb Z}:=\left\{\sum_{i=0}^na_i3^i:n\in{\mathbb N}_0, (a_0,\dots,a_n)\in\{0,2\}^{n+1}\right\}.$

For comparison, the classical middle thirds Cantor set consists of all ${x\in[0,1]}$ that when written in base ${3}$ use only the digits ${0}$ and ${2}$, and can be described symbolically as

$\displaystyle C_{\mathbb R}:=\left\{\sum_{i=1}^na_i3^{-i}:n\in{\mathbb N}_0, (a_1,\dots,a_n)\in\{0,2\}^{n+1}\right\}.$

## Multiple ergodic averages along functions from a Hardy field: convergence, recurrence and combinatorial applications

Vitaly Bergelson, Florian Richter and I have recently uploaded to the arXiv our new paper entitled “Multiple ergodic averages along functions from a Hardy field: convergence, recurrence and combinatorial applications”. We establish a new multiple recurrence result (and consequentially a new generalization of Szemerédi’s theorem) which contains several previously known results as special cases and many new results.

An example of a result that follows as a special case from our main theorem but was currently unknown is that for any positive real numbers ${a,b>0}$, any set ${E\subset{\mathbb N}}$ with positive upper density, i.e.

$\displaystyle \bar d(E):=\limsup_{N\rightarrow\infty}\frac{|E\cap\{1,\dots,N\}|}N>0,$

contains a configuration ${\{x,x+\lfloor n^a\rfloor,x_\lfloor n^b\rfloor\}}$ for some ${x,n\in{\mathbb N}}$. Here, as usual, we denote by ${\lfloor\cdot\rfloor}$ the floor function. This result was known when both ${a}$ and ${b}$ are integers, and when both ${a}$ and ${b}$ are non-integers, but the mixed case was, somewhat surprisingly, still unknown. Below the fold I briefly mention the history of related results and present the main steps in the proof, highlighting the main novelties.

## Distal systems and Expansive systems

A topological dynamical system is a pair ${(X,T)}$ where ${X}$ is a compact metric space and ${T:X\rightarrow X}$ is a continuous map. This gives rise to an action of the semigroup ${({\mathbb N},+)}$ on ${X}$, where ${n\in{\mathbb N}}$ acts via the iterated map ${T^n=T\circ T^{n-1}}$ (and as usual ${T^0}$ denotes the identity map). If ${T}$ is invertible (i.e. a homeomorphism), then in fact it induces a ${({\mathbb Z},+)}$-action. More generally, one may consider an action ${T=(T_g)_{g\in G}}$ of any semigroup ${G}$ on ${X}$ by continuous functions. This means that for each ${g\in G}$ we have a continuous map ${T_g:X\rightarrow X}$ which satisfy the semigroup law ${T_g\circ T_h=T_{gh}}$. In this case we say that ${(X,T)}$ is a ${G}$-system. Throughout this post I will denote by ${d_X}$ (or ${d}$ if the underlying space is clear) a compatible metric on ${X}$.

Two important classes of topological dynamical systems are the class of distal systems and the class of expansive systems.

Definition 1

• A ${G}$-system ${(X,T)}$ is distal if

$\displaystyle \forall x,y\in X, \ x\neq y,\ \inf\big\{d(T_g x,T_g y):g\in G\big\}>0.$

• A ${G}$-system ${(X,T)}$ is expansive if there exists ${\delta>0}$ such that

$\displaystyle \forall x,y\in X, \ x\neq y,\ \sup\big\{d(T_g x,T_g y):g\in G\big\}>\delta.$

While not entirely trivial, both properties are preserved by topological conjugacy (i.e., isomorphism in the category of ${G}$-systems), and in particular don’t depend on the choice of the (compatible) metric ${d}$.

At a first glance at the definition it looks like the two conditions are quite similar. They both fall into the loose statement that “if two points are distinct, then they are far apart in the future” (at least if the acting semigroup is ${{\mathbb N}}$). However, it turns out that the two properties are very much different, and in some sense actually incompatible.

In this post I will mention some examples and properties of distal and expansive systems which illustrate this difference between the two classes, and present a proof of the incompatibility result:

Theorem 2 An ${{\mathbb Z}}$-system which is both distal and expansive must be finite. The same is true for ${{\mathbb N}}$-systems.

It is clear that any finite system (i.e., where ${X}$ is a finite set) is both distal and expansive.

Posted in Classic results, Topological Dynamics | | 2 Comments

## Structure of multicorrelation sequences with integer part polynomial iterates along primes

Andreas Koutsogiannis, Anh Le, Florian Richter and I have just uploaded to the arXiv a new paper entitled “Structure of multicorrelation sequences with integer part polynomial iterates along primes”. This short paper establishes a natural common extension of our earlier result with Le and Richter and a theorem of Koutsogiannis. I decided this was a good opportunity to write more generally about multicorrelation sequences and their relation to nilsequences.

— 1. Single transformation —

Given an invertible measure preserving system ${(X,\mu,T)}$ and bounded functions ${f_0,\dots,f_k\in L^\infty(X)}$ one can form the multicorrelation sequence

$\displaystyle \alpha(n)=\int_Xf_0\cdot T^n f_1\cdots T^{kn}f_k\,\mathrm{d}\mu. \ \ \ \ \ (1)$

Here, as usual, we denote by ${T^nf}$ the function ${f\circ T^n}$. This is a natural extension of the concept of single correlation sequences, which correspond to the case ${k=1}$ and control, for instance, spectral and mixing properties of the system. Multicorrelation sequences (or multiple correlation sequences) appear naturally in the theory of multiple recurrence. For instance, if ${f_0=f_1=\cdots=f_k=1_A}$ for some ${A\subset X}$ with ${\mu(A)>0}$, then the fact that ${\alpha(n)>0}$ for some ${n}$ is the statement of Furstenberg’s famous multiple recurrence theorem, which, via the correspondence principle, is equivalent to the celebrated Szemerédi theorem on arithmetic progressions.

— 1.1. The Bergelson-Host-Kra theorem —

It turns out that, despite the large number of degrees of freedom involved, multicorrelation sequences are quite rigid. This was first captured in a beautiful result of Bergelson Host and Kra in 2005, which shows that multicorrelation sequences are always only a small perturbation away from a very structured kind of sequence they named nilsequence. Since their debut in the aforementioned paper, nilsequences have became surprisingly important in additive combinatorics and number theory, due largely to the Inverse Theorem for Gowers norms.

Definition 1 Let ${G}$ be a nilpotent Lie group and let ${\Gamma\subset G}$ be a discrete and co-compact subgroup. The (compact) homogeneous space ${X=G/\Gamma}$ is called a nilmanifold and ${G}$ acts naturally on ${X}$ via left multiplication. Given a continuous function ${F:X\rightarrow{\mathbb C}}$, a point ${x\in X}$ and ${g\in G}$, the sequence ${\phi:{\mathbb N}\rightarrow{\mathbb C}}$ defined via ${\phi(n)=F(g^nx)}$ is called a basic nilsequence. A sequence ${\psi}$ is called a nilsequence if for every ${\epsilon>0}$ there exists a basic nilsequence ${\phi}$ such that ${\|\phi-\psi\|_{\ell^\infty}<\epsilon}$.

There are several variants of the notion of nilsequence. In this post I chose to use the definition given in the original paper of Bergelson Host and Kra (in our paper we use the more common terminology to call a nilsequence to what I defined above as a basic nilsequence). In some other variants ${F}$ is allowed different degrees of regularity (Riemann integrability, or Lipschitz or even ${C^\infty}$).

It will also be convenient to use the Weyl and the Besicovitch seminorms on ${\ell^\infty}$, defined for ${f\in\ell^\infty}$, respectively, by the formulas

$\displaystyle \|f\|_W:=\limsup_{N-M\rightarrow\infty}\frac1{N-M}\sum_{n=M}^N\big|f(n)\big|\qquad \|f\|_B:=\limsup_{N\rightarrow\infty}\frac1{N}\sum_{n=1}^N\big|f(n)\big|$

Theorem 2 (Bergelson-Host-Kra, 2005) Let ${\alpha}$ be a multicorrelation sequence defined by (1) where ${(X,\mu,T)}$ is an ergodic system. Then there exists a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W=0}$.

— 1.2. Averaging along subsequences —

Several enhancements of Szemerédi’s theorem involve restrictions on the common difference of the arithmetic progression to be found inside an arbitrary set with positive density. This leads to the study of averages of multicorrelation sequences along subsequences of natural numbers. It turns out that the structure of multicorrelation sequences survives even when we restrict attention to moderately sparse subsequences. For instance, given ${\alpha}$ defined by (1) and letting ${\psi}$ be given by Theorem 2, we have also that

$\displaystyle \lim_{N-M\rightarrow\infty}\frac1{N-M}\sum_{n=M}^N\big|\alpha(n^2)-\psi(n^2)\big|=0.$

One can write the previous equation as ${\|(\alpha-\psi)\circ p\|_W=0}$ where ${p(n)=n^2}$. In fact, the same is true when ${p}$ is any polynomial with integer coefficients (see Theorem 3 below).

On a related direction, when ${p(n)}$ is the increasing enumeration of the prime numbers, or when ${p(n)=\lfloor n^c\rfloor}$ for an arbitrary ${c\in{\mathbb R}^{>0}}$ (where ${\lfloor\cdot\rfloor}$ denotes the floor function), then Le proved that ${\|(\alpha-\psi)\circ p\|_B=0}$. In this case one can not replace the Besicovitch seminorm with the Weyl seminorm.

Back to polynomials, a far reaching extension of Theorem 1 was obtained by Leibman in two papers published in 2010 and 2015. In particular, Leibman was able to remove the hypothesis that the system defining the multicorrelation sequence is ergodic.

Theorem 3 (Leibman, 2010-2015) Let ${(X,\mu,T)}$ be a measure preserving system and let ${f_0,\dots,f_k\in L^\infty(X)}$. Let ${p_1,\dots,p_k\in{\mathbb Z}[x]}$ and let

$\displaystyle \alpha(n)=\int_Xf_0\cdot T^{p_1(n)} f_1\cdots T^{p_k(n)}f_k\,\mathrm{d}\mu. \ \ \ \ \ (2)$

Then there exists a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W=0}$.

By taking ${p_i(n)=in}$, we see that the class of multicorrelations defined by (1) is contained in the class defined by (2). On the other hand, multicorrelations of the form (2) appear naturally when considering polynomial extensions of Szemerédi’s theorem or of Furstenberg’s multiple recurrence theorem.

Part of the aforementioned result of Le also applies to multicorrelations ${\alpha}$ of the form (2).

Theorem 4 (Le, 2017) Let ${\alpha}$ be defined by (2), let ${\psi}$ be the nilsequence given by Theorem 3 and let ${p(n)}$ be the increasing enumeration of the prime numbers. Then ${\big\|(\alpha-\psi)\circ p\big\|_B=0}$.

— 2. Commuting transformations —

Multidimensional extensions of Szemerédi’s theorem (which in turn are a significant special case of the density Hales-Jewett theorem) can also be obtained via ergodic theoretic methods but require the use of measure preserving systems with several commuting transformations. This in turn leads to a more general class of multicorrelation sequences:

$\displaystyle \alpha(n)=\int_Xf_0\cdot T_1^n f_1\cdots T_k^nf_k\,\mathrm{d}\mu, \ \ \ \ \ (3)$

where ${T_1,\dots,T_\ell}$ are commuting measure preserving transformations on the probability space ${(X,\mu,T)}$ and ${f_0,\dots,f_k\in L^\infty(X)}$. Letting ${T_i=T^i}$ we see that the class of multicorrelation sequences defined by (3) contains the class of sequences defined by (1). In view of Theorem 2, the following is a natural conjecture.

Conjecture 5 Let ${\alpha}$ be defined by (3). Does there exist a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W=0}$?

Unfortunately, the proofs of the results mentioned in the previous section all rely on the structure theory of Host and Kra, which allows one to understand multiple ergodic averages by studying characteristic nilfactors, but which is unavailable for commuting transformations. In particular, this conjecture is still open. Nevertheless, in 2014, Frantzikinakis was able to obtain the following

Theorem 6 (Frantzikinakis) Let ${\alpha}$ be defined by (3). Then for every ${\epsilon>0}$ there exists a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W<\epsilon}$.

Remark 1 Since ${\|a\|_W\leq\|a\|_{\ell^\infty}}$ for any ${a\in\ell^\infty}$, in Theorem 6 one can require ${\psi}$ to be basic nilsequence. To streamline the presentation I will keep the slightly weaker formulation in this and other results below.

A convenient way to encode a tuple of ${\ell}$ commuting measure preserving transformations ${T_1,\dots,T_\ell}$ on a probability space ${(X,\mu)}$ is via a single measure preserving action ${T=(T^v)_{v\in{\mathbb Z}^\ell}}$ of ${{\mathbb Z}^\ell}$, defined by ${T^{e_i}=T_i}$, where ${e_i}$ is the ${i}$-th vector in the canonical basis of ${{\mathbb Z}^\ell}$. Given a measure preserving ${{\mathbb Z}^\ell}$-action ${T}$ on a probability space and vector polynomials ${p_1,\dots,p_k:{\mathbb Z}\rightarrow{\mathbb Z}^\ell}$ (i.e. each ${p_i=(p_{i,1},\dots,p_{i,\ell})}$ for some ${p_{i,j}\in{\mathbb Z}[x]}$) one can consider the multicorrelation sequence

$\displaystyle \alpha(n)=\int_Xf_0\cdot T^{p_1(n)} f_1\cdots T^{p_k(n)} f_k\,\mathrm{d}\mu. \ \ \ \ \ (4)$

The class of multicorrelation sequences defined by (4) contains all the other classes of multicorrelation sequences mentioned before in this post and appear naturally when studying multidimensional polynomial extensions of Szemerédi’s theorem. Theorem 6 was proved by Frantzikinakis also in the more general situation where ${\alpha}$ is given by (4).

Earlier this year, Le, Richter and I obtained a refinement of Theorem 6, which stands to it in the same way that Theorem 4 stands to Theorem 2.

Theorem 7 Let ${\alpha}$ be defined by (4) and let ${p(n)}$ be the increasing enumeration of the prime numbers. Then for every ${\epsilon>0}$ there exists a nilsequence ${\psi}$ satisfying simultaneously ${\|\alpha-\psi\|_W<\epsilon}$ and ${\big\|(\alpha-\psi)\circ p\|_B<\epsilon}$.

— 3. Integer part of real polynomials —

Given a measure preserving ${{\mathbb Z}^\ell}$-action ${T}$ on a probability space ${(X,\mu)}$, vector polynomials ${p_1,\dots,p_k:{\mathbb R}\rightarrow{\mathbb R}^\ell}$ (i.e., each ${p_i=(p_{i,1},\dots,p_{i,\ell})}$ with ${p_{i,j}\in{\mathbb R}[x]}$) and functions ${f_0,\dots,f_k\in L^\infty(X)}$, one can construct the multicorrelation sequence

$\displaystyle \alpha(n)=\int_Xf_0\cdot T^{\lfloor p_1(n)\rfloor} f_1\cdots T^{\lfloor p_k(n)\rfloor} f_k\,\mathrm{d}\mu, \ \ \ \ \ (5)$

where for a vector ${x=(x_1,\dots,x_\ell)\in{\mathbb R}^\ell}$ we denote ${\lfloor x\rfloor=(\lfloor x_1\rfloor,\dots,\lfloor x_\ell\rfloor)}$. If all polynomials ${p_i}$ have only integer coefficients, then (5) becomes (4), so the class of multicorrelation sequences defined via (5) contains all the other multicorrelation sequences defined in this post.

The extension of Frantzikinakis theorem to multicorrelation sequences of the form (5) was obtained by Koutsogiannis.

Theorem 8 (Koutsogiannis) Let ${\alpha}$ be defined by (5). Then for every ${\epsilon>0}$ there exists a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W<\epsilon}$.

The main result of our paper is the common extension of Theorems 7 and 8.

Theorem 9 Let ${\alpha}$ be defined by (5) and let ${p(n)}$ be the increasing enumeration of the prime numbers. Then for every ${\epsilon>0}$ there exists a nilsequence ${\psi}$ satisfying simultaneously ${\|\alpha-\psi\|_W<\epsilon}$ and ${\big\|(\alpha-\psi)\circ p\|_B<\epsilon}$.

— 4. Proofs —

As already mentioned above, the proofs of the results listed in Section 1 all require the structure theory. This involves two steps: first is a reduction to nilsystems and then an analysis of the equidistribution properties of orbits in nilsystems. I plan to write another post about these ideas in the near future.

The breakthrough idea of Frantzikinakis, leading to his proof of Theorem 6 can be described in rough terms as follows. Using a common van der Corput-type procedure, one can show that a multicorrelation sequence ${\alpha}$ of the form (3) is orthogonal, with respect to the ${\|\cdot\|_W}$ seminorm, to any sequence ${\beta\in\ell^\infty}$ which is uniform (in a sense analogous to Gowers uniformity in finite groups.) The same is true for any nilsequence. Then, letting ${\gamma}$ be the orthogonal projection of ${\alpha}$ onto the space of all nilsequences (w.r.t. ${\|\cdot\|_W}$) it follows that the difference ${\alpha-\gamma}$ is also orthogonal to every uniform sequence ${\beta}$. By (a version of) the inverse theorem for uniformity seminorms, if ${\alpha-\gamma}$ were not negligible, then it would have to correlate with some nilsequence. This is not possible, since ${\gamma}$ is the orthogonal projection of ${\alpha}$ to the space of nilsequences, and it follows that ${\|\alpha-\gamma\|_W=0}$. Unfortunately, the space of nilsequences is not closed in the ${\|\cdot\|_W}$ seminorm, so ${\gamma}$ itself is not a nilsequence. Nevertheless it belongs to the ${\|\cdot\|_W}$-closure of the space of nilsequences, so for every ${\epsilon>0}$ there is a nilsequence ${\psi}$ such that ${\|\alpha-\psi\|_W<\epsilon}$.

When proving Theorem 7, because the primes are involved, we need to employ the ${W}$-trick. This in turn requires some control on the nilsequences ${\psi}$ that we pick. In order to obtain the necessary additional properties on ${\psi}$ we have to first improve Theorem 6. We do this by describing the (topological) Furstenberg system of a multicorrelation sequence ${\alpha}$.

Finally, the proof of Theorem 9 follows closely the proof of Theorem 8. The idea is to approximate ${\alpha}$ as defined in (5) by a multicorrelation sequence as defined in (4). To do this we pass to (a multidimensional analogue of) the suspension flow and use the fact that, if ${T}$ is a ${{\mathbb R}}$-action and ${q(n)=a_0+a_1n+\cdots+a_mn^m\in{\mathbb R}[n]}$ then

$\displaystyle T^{q(n)}=T^{a_0}\big(T^{a_1}\big)^n\big(T^{a_2}\big)^{n^2}\cdots\big(T^{a_m}\big)^{n^m}.$

Since the measure preserving transformations ${T^{a_1},\dots,T^{a_m}}$ commute, by extending the original ${{\mathbb Z}^\ell}$-action to a ${{\mathbb Z}^{\ell m}}$-action one can represent a real polynomial by polynomials with integer coefficients. There are of course some difficulties with implementing this strategy, since one also needs to handle the floor function, but this is ultimately taken care of by the fact that polynomials, and polynomials evaluated along primes, equidistribute mod ${1}$ in a well understood subset of the ${[0,1)}$ interval.

## Tao’s Proof of (logarithmically averaged) Chowla’s conjecture for two point correlations

The Liouville function is the completely multiplicative function ${\lambda}$ with ${\lambda(p)=-1}$ for every prime ${p}$. The Chowla conjecture predicts that this function behaves randomly. Here is a version of this conjecture.

Conjecture 1 (Chowla) For every finite set ${F\subset\{0,1,\dots\}}$,

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\prod_{i\in F}\lambda(n+i)=0.$

This conjecture received a lot of attention in the last decade, since Peter Sarnak derived from it an orthogonality criterion for the Liouville (or, similarly, the Mobius function), now known as the “Sarnak conjecture”. Both Chowla’s and Sarnak’s conjectures remain open, however many weaker versions and special cases have been derived (see, for instance, this survey). Perhaps the most impressive result so far is the following theorem of Tao, which was in turn the starting point for many of the more recent developments.

Theorem 2 (Tao) For every (distinct) ${a,b\in{\mathbb N}}$,

$\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log{(N/M)}}\sum_{n=M}^N\frac1n\lambda(n+a)\lambda(n+b)=0.$

Tao’s main theorem in his paper is actually significantly more general than Theorem 2, for it allows more general affine maps inside the argument of ${\lambda}$, and in fact replaces ${\lambda}$ with a much wider class of multiplicative functions. To see why Theorem 2 follows from Conjecture 1, observe that any sequence ${x(n)}$ satisfying

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nx(n)=0$

must also satisfy

$\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log{N/M}}\sum_{n=M}^N\frac1n x(n)=0.$

In the terminology and notation of my earlier post, the conclusion of this theorem states that the uniform logarithmic average of the sequence ${n\mapsto\lambda(n+a)\lambda(n+b)}$ is ${0}$, i.e.,

$\displaystyle \underset{n\rightarrow\infty}{UW\text{-}\lim}~\lambda(n+a)\lambda(n+b)=0$

for ${W(n)=\log n}$.

In this post I will present Tao’s proof restricted to the following special case. (Most of the proof remains unchanged in the general case, but the presentation becomes easier when we avoid general ${a,b}$).

Theorem 3 Let ${\lambda}$ denote the Liouville function. Then

$\displaystyle \lim_{N/M\rightarrow\infty}\frac1{\log(N/M)}\sum_{n=M}^N\frac1n\lambda(n)\lambda(n+1)=0$

We will use a correspondence principle to reformulate the problem in the language of dynamical systems to, among other things, simplify the notation. This approach was also carried out by Tao in his blog, and the main motivation for me to write this post was to try to use an easier construction of a Furstenberg system, which would still keep track of the “congruence classes of the points” in the system, as detailed below the fold.

Posted in Number Theory, Probability | | 1 Comment

## A viewpoint on Katai’s orthogonality criterion

The Liouville function, defined as the completely multiplicative function ${\lambda}$ which sends every prime ${p}$ to ${-1}$, encodes several important properties of the primes. For instance, the statement that

$\displaystyle \frac1N\sum_{n=1}^N\lambda(n)=o(1)$

is equivalent to the prime number theorem, while the improved (and essentially best possible) rate of convergence

$\displaystyle \frac1N\sum_{n=1}^N\lambda(n)=O_\epsilon\left(\frac1{N^{1/2-\epsilon}}\right)$

for every positive ${\epsilon}$ is equivalent to the Riemann hypothesis.

Of particular interest are statements involving correlations (or lack thereof) between ${\lambda}$ and certain “structured” functions. In this direction, Green and Tao established “orthogonality” of ${\lambda}$ with any nilsequence. More precisely, they showed that for any nilpotent Lie group ${G}$, any discrete subgroup ${\Gamma}$ for which ${X:=G/\Gamma}$ is compact, any Lipschitz function ${F:X\rightarrow{\mathbb C}}$, any ${A>0}$ and any ${b\in G}$,

$\displaystyle \frac1N\sum_{n=1}^N\lambda(n)F(b^n\Gamma)=O_A\left(\frac1{\log^AN}\right),$

which was an important ingredient in their proof of the “finite complexity” version of the Dickson/Hardy-Littlewood prime tuple conjecture. Perhaps inspired by this result, Sarnak then conjectured that in fact the Liouville function (actually, the closely related Möbius function) is orthogonal to every deterministic sequence:

Conjecture 1 (Sarnak conjecture for the Liouville function) Let ${X}$ be a compact metric space and let ${T:X\rightarrow X}$ be a homeomorphism. If the topological dynamical system ${(X,T)}$ has ${0}$ topological entropy, then for every continuous ${F\in C(X)}$ and every ${x\in X}$,

$\displaystyle \frac1N\sum_{n=1}^N\lambda(n)F(T^nx)=o(1).$

Sarnak’s conjecture is still open, despite some remarkable progress (see the recent survey of Ferenczi, Kułaga-Przymus and Lemańczyk for information on the progress made so far).

A useful tool to establish partial cases of Sarnak’s conjecture (and other results not implied by the Sarnak conjecture) is the following orthogonality criterion of Katai.

Theorem 2 (Katai’s orthogonality criterion) Let ${u:{\mathbb N}\rightarrow {\mathbb C}}$ be bounded and let ${v:{\mathbb N}\rightarrow S^1}$ be a completely multiplicative function. If for every distinct primes ${h,h'}$

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nu(nh)\overline{u(nh')}=0 \ \ \ \ \ (1)$

then

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^Nu(n)\overline{v(n)}=0.$

Remark 1 One can in fact relax the condition on ${v}$ to be only a multiplicative function. This allows the criterion to be applied, for instance, with ${v}$ being the Möbius function.

The statement of Theorem 2 is reminiscent of the van der Corput trick, which is a ubiquitous tool in the fields of uniform distribution and ergodic Ramsey theory, was the protagonist of a survey Vitaly Bergelson and I wrote and has been mentioned repeatedly in this blog. In fact one can prove both results (Katai’s orthogonality criterion and van der Corput’s trick) using the same simple functional analytic principle. In this post I present this principle and use it to motivate proofs for the two theorems.