## A proof of the Erdős sumset conjecture

Florian Richter, Donald Robertson and I have uploaded to the arXiv our paper entitled A proof of the Erdős sumset conjecture. The main goal of the paper is to prove the following theorem, which verifies a conjecture of Erdős discussed in this previous post.

Theorem 1 Let ${A\subset{\mathbb N}}$ have positive upper density. Then it contains a sumset ${B+C=\{b+c:b\in B, c\in C\}}$ for infinite sets ${B}$ and ${C}$.

As usual, the upper density of a set ${A\subset{\mathbb N}}$ is the quantity

$\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N.$

More generally, given a sequence ${\Phi:N\mapsto\Phi_N=\{a_N,a_N+1,\dots,b_N\}}$ of intervals of integers with lengths ${|\Phi_N|}$ tending to infinity (or even more generally, given any Følner sequence ${\Phi:N\mapsto\Phi_N}$ in ${{\mathbb N}}$), we define the upper density with respect to ${\Phi}$ as

$\displaystyle \bar d_\Phi(A):=\limsup_{N\rightarrow\infty}\frac{\big|A\cap\Phi_N\big|}{|\Phi_N|}.$

When the limit exists we say that the density of ${A}$ with respect to ${\Phi}$ exists and denote it by ${d(A)}$ (and whenever we write ${d(A)}$ we are implicitly stating that the limit exists).

As explained in my earlier post mentioned above (see Proposition 12), one can adapt an argument of Di Nasso, Goldbring, Jin, Leth, Lupini and Mahlburg to reduce Theorem 1 to the following statement.

Theorem 2 If ${A\subset{\mathbb N}}$ has positive upper density, then there exist a sequence ${\Phi}$ of intervals, a set ${L\subset{\mathbb N}}$ and ${\epsilon>0}$ such that for every finite subset ${F\subset L}$, the intersection

$\displaystyle \bigcap_{\ell\in F}(A-\ell)\cap\big\{m\in{\mathbb N}:d_\Phi(A-m\cap L)>0\big\} \ \ \ \ \ (1)$

is infinite.

In my previous post I described how Theorem 2 can then be easily checked for weak mixing sets and for Bohr sets, whose definitions we now recall.

Definition 3 A set ${A\subset{\mathbb N}}$ is:

• a Bohr set if there exist ${d\in{\mathbb N}}$, a vector ${\alpha\in{\mathbb T}^d}$ and an open set ${U\subset{\mathbb T}^d}$ such that ${A\supset\{n\in{\mathbb N}:n\alpha\in U\}\neq\emptyset}$.
• a weak mixing set with respect to the sequence of intervals ${\Phi}$ if the set ${\big\{n:\big|d_\Phi((A-n)\cap A)-d_\Phi(A)^2\big|<\epsilon\big\}}$ has density ${1}$.

It is well known in the realm of ergodic theory that weak mixing and almost periodicity are extreme points of a dichotomy (see for instance, this post) and so it was natural to try to patch these two observations into a proof of Theorem 2 in general. While this is ultimately how we prove Theorem 2, there were several problems with implementing this idea. Below the fold I explain the problems that arise and (briefly) sketch how we address them.

— 1. Generalizing from sets to functions —

The first problem is that one can not decompose an arbitrary set ${A\subset{\mathbb N}}$ into a Bohr set and a weak mixing set; instead one can only hope to decompose the indicator function ${1_A}$ of an arbitrary set ${A\subset{\mathbb N}}$ with positive density as the sum of an almost periodic and a weak mixing components. However, it is not obvious how to interpret (1) if ${A}$ has been replaced with a function.

To overcome this issue we make use of the notion of ultrafilters. An ultrafilter is a collection ${p}$ of subsets of ${{\mathbb N}}$ which is closed under finite intersections and has the properties that if ${E\in p}$ and ${E\subset I}$, then ${I\in p}$; and if ${E\notin p}$, then ${{\mathbb N}\setminus E\in p}$. One should think of ultrafilters in this context as elements of the Stone-Čech compactification ${\beta{\mathbb N}}$ of ${{\mathbb N}}$, so that ${{\mathbb N}}$ is a discrete but dense subset of ${\beta{\mathbb N}}$. Given a set ${A\subset{\mathbb N}}$ and an ultrafilter ${p\in\beta{\mathbb N}}$, we denote by ${A-p}$ the set ${\{n:A-n\in p\}}$. One can interpret ${A-p}$ as a generalized shift of ${A}$, which has some, but not all, properties of “principal” shifts ${A-m}$ with ${m\in{\mathbb N}}$. For more background on ultrafilters, see also this previous post.

With all the notions in place, one can now easily reduce Theorem 2 to the following statement (see Corollary 13 in my previous post).

Theorem 4 Let ${A\subset{\mathbb N}}$ satisfy ${\bar d_\Phi(A)>0}$ for some sequence ${\Phi}$ of intervals. Then there exist a subsequence ${\Psi}$ of ${\Phi}$ and a non-principal ultrafilter ${p\in\beta{\mathbb N}}$ such that

$\displaystyle \lim_{m\rightarrow p}d_\Psi\big(A-m\cap A-p\big)>0 \ \ \ \ \ (2)$

Here, as usual, the expression ${\lim_{m\rightarrow p}f(m)}$ denotes the number ${x}$ with the property that for every ${\epsilon>0}$ the set ${\{m:|f(m)-x|<\epsilon\}}$ belongs to ${p}$. It can be shown that such ${x}$ exists and is unique whenever ${f}$ is bounded.

The main advantage of Theorem 4 over Theorem 2 is that it is easy to write (2) in terms of only the indicator function ${1_A}$ of ${A}$ as

$\displaystyle \lim_{m\rightarrow p}\lim_{N\rightarrow\infty}\frac1{|\Psi_N|}\sum_{n\in\Psi_N}T^m1_A(n)T^p1_A(n)$

where ${T^mf(n):=f(n+m)}$ and ${\displaystyle T^pf(n)=f(n+p)=\lim_{m\rightarrow p}f(n+m)}$ for every ${m\in{\mathbb N}}$, ${p\in\beta{\mathbb N}}$ and ${f\in\ell^\infty({\mathbb N})}$. To make the notation even more suggestive of the functional analytic intuition we want to use, let’s define, for every ${f,g\in\ell^\infty({\mathbb N})}$ and sequence of intervals ${\Phi}$,

$\displaystyle \langle f,g\rangle_\Phi:=\lim_{N\rightarrow\infty}\frac1{|\Phi_N|}\sum_{n\in\Phi_N}f(n)\overline{g(n)}$

whenever the limit exists, and

$\displaystyle \|f\|_\Phi^2:=\limsup_{N\rightarrow\infty}\frac1{|\Phi_N|}\sum_{n\in\Phi_N}|f(n)|^2.$

With this notation we can briefly formulate a functional version of Theorem 4.

Theorem 5 Let ${f:{\mathbb N}\rightarrow[0,1]}$ and let ${\Phi}$ be a sequence of intervals with ${\|f\|_\Phi>0}$. Then there exists a subsequence ${\Psi}$ of ${\Phi}$ and a non-principal ultrafilter ${p}$ such that

$\displaystyle \lim_{m\rightarrow p}\big\langle T^mf, T^pf\big\rangle_\Psi>0 \ \ \ \ \ (3)$

It is not hard to show that Theorem 5 holds for weak mixing and for almost periodic functions.

Definition 6 Let ${f:{\mathbb N}\rightarrow{\mathbb C}}$ be a function and let ${\Phi}$ be a sequence of intervals.

• ${f}$ is Bohr almost periodic if for every ${\epsilon>0}$ there exists a trigonometric polynomial ${t(n)=\sum_{j=1}^kc_je^{2\pi i\theta_j n}}$, where ${k\in{\mathbb N}}$, ${c_j\in{\mathbb C}}$ and ${\theta_j\in{\mathbb R}}$, such that

$\displaystyle \big\|f(n)-t(n)\big\|_{\ell^\infty}<\epsilon$

• ${f}$ is Besicovitch almost periodic along ${\Phi}$ if for every ${\epsilon>0}$ there exists a trigonometric polynomial ${t(n)=\sum_{j=1}^kc_je^{2\pi i\theta_j n}}$, where ${k\in{\mathbb N}}$, ${c_j\in{\mathbb C}}$ and ${\theta_j\in{\mathbb R}}$, such that

$\displaystyle \big\|f(n)-t(n)\big\|_\Phi<\epsilon.$

• ${f}$ is weak mixing along ${\Phi}$ if for every ${\epsilon>0}$, the set

$\displaystyle \Big\{m\in{\mathbb N}:\big|\langle T^mf,f\rangle_\Phi\big|<\epsilon\Big\}$

has full density.

To establish Theorem 5 for a Bohr almost periodic function one needs to fix a small ${\epsilon}$ (say ${\epsilon=\|f\|_\Phi^2/3}$), then take any ultrafilter containing the infinite set ${R:=\{n:\|T^mf-f\|_{\ell^\infty}<\epsilon\}}$. It follows that ${\|T^pf-f\|_{\ell^\infty}<\epsilon}$ and that ${\|T^m-f\|_{\ell^\infty}\leq\epsilon}$ for all ${m\in R}$, whence (3) follows. Observe that the same argument does not quite work for a Besicovitch almost periodic function, although it can be combined with an application of the ergodic theorem to work in this case.

Strictly speaking, Theorem 5 is vacuous for weak mixing functions, since a function ${f:{\mathbb N}\rightarrow[0,1]}$ is weak mixing along ${\Phi}$ if and only if ${\|f\|_\Phi=0}$; but it is also not too hard to establish Theorem 5 when ${f-c}$ is weak mixing for some constant ${c}$ (this is the case when ${f=1_A}$ and ${A}$ is a weak mixing set). Indeed, one can take any ultrafilter ${p}$ which contains the (full density) set ${\Big\{m\in{\mathbb N}:\big|\langle T^mf_{wm},f_{wm}\rangle_\Phi\big|, where ${f_{wm}=f-c}$, and satisfies ${\langle T^pf,1\rangle\geq c}$. That such an ultrafilter exists can be established by modifying the proof of the intersectivity lemma of Bergelson.

— 2. The decompositions —

Next we face the second problem: it is not true that every function ${f:{\mathbb N}\rightarrow[0,1]}$ can be decomposed into the sum of a Besicovitch almost periodic function and a weak mixing function. Instead we need to resort to two decompositions of an arbitrary function into a pseudo-random and an almost periodic components.

These decompositions are on their own a nice outcome of the proof, and explain to some extent how to decompose a function on ${{\mathbb N}}$ in a way which mimics dynamical intuition. It is crucial for us to be able to pass to a subsequence of our initial sequence of intervals, as there are examples which can not be decomposed into reasonable pieces otherwise.

The next definition is heavily influenced by the corresponding notion in ergodic theory (see, for instance Definition 1 in this previous post)

Definition 7 Let ${\Phi}$ be a sequence of intervals. A function ${f:{\mathbb N}\rightarrow{\mathbb C}}$ is compact with respect to ${\Phi}$ if for every ${\epsilon>0}$ there exists ${K\in{\mathbb N}}$ such that

$\displaystyle \sup_{m\in{\mathbb N}}\ \min_{1\leq k\leq K}\|T^kf-T^mf\|_\Phi<\epsilon.$

Observe that every Besicovitch almost periodic function along ${\Phi}$ is compact with respect to ${\Phi}$; however the converse does not hold. One can show that ${f}$ is compact if it has a Furstenberg system where it is represented by a compact function. Similarly, a function is Besicovitch almost periodic if it has a Furstenberg system where it is represented by a function in the ${L^2}$ closure of the space spanned by the continuous eigenfunctions.

Our first decomposition is the analog of the Jacobs-de Leeuw-Glicksberg decomposition in ${{\mathbb N}}$.

Theorem 8 Let ${f:{\mathbb N}\rightarrow{\mathbb C}}$ be bounded and let ${\Phi}$ be a sequence of intervals. Then there exists a subsequence ${\Psi}$ of ${\Phi}$ and a decomposition

$\displaystyle f=f_c+f_{wm}$

such that ${f_c}$ is compact with respect to ${\Psi}$ and ${f_{wm}}$ is weak mixing with respect to ${\Psi}$.

The proof of Theorem 8 consists of applying Furstenberg’s correspondence principle to ${f}$, then using the usual Jacobs-de Leeuw-Glicksberg decomposition and then bringing back the two components from the measure preserving system into functions over ${{\mathbb N}}$. There is a subtle issue regarding the fact that the two components will not be continuous functions (as the decomposition is only a measurable gadget) and to bring the functions back to ${{\mathbb N}}$ requires some effort.

Unfortunately, it is not clear how to prove Theorem 5 for a compact function, so we need to also use a second decomposition.

Theorem 9 Let ${f:{\mathbb N}\rightarrow{\mathbb C}}$ be bounded and let ${\Phi}$ be a sequence of intervals. Then there exists a subsequence ${\Psi}$ of ${\Phi}$ and a decomposition

$\displaystyle f=f_b+f_\perp$

such that ${f_b}$ is Besicovitch almost periodic with respect to ${\Psi}$ and ${\langle f_\perp,t\rangle_\Psi=0}$ for every trigonometric polynomial ${t}$.

With both decompositions at hand, in order to prove Theorem 5 we can decompose ${f=f_{wm}+f_c=f_b+f_\perp}$ and then rewrite the left hand side of (3) as

$\displaystyle \lim_{m\rightarrow p}\langle T^mf_{wm},T^pf\rangle_\Psi+\langle T^m f_c,T^pf_b\rangle_\Psi+\langle T^mf_c,T^pf_\perp\rangle_\Psi. \ \ \ \ \ (4)$

The first term ${\langle T^mf_{wm},T^pf\rangle_\Psi}$ of (4) is very small for a full density set of ${m}$ (in view of the fact that ${f_{wm}}$ is a weak mixing function) and so for “most” ultrafilters ${\lim_{m\rightarrow p}\langle T^mf_{wm},T^pf\rangle_\Psi=0}$ (most here means with respect to a probability measure on ${\beta{\mathbb N}}$ which is related with the sequence ${\Phi}$).

The middle term ${\langle T^m f_c,T^pf_b\rangle_\Psi}$ of (4) can be easily managed when ${f_b}$ is Bohr almost periodic: just find a Bohr set ${B}$ with ${\|T^mf_c-f_c\|_\Psi<\epsilon}$ (for some small ${\epsilon>0}$) for every ${m\in B}$, and ${\|T^mf_b-f_b\|_{\ell^\infty}<\epsilon}$ for every ${m\in B}$. The second condition implies that whenever ${p}$ is an ultrafilter with ${B\in p}$ we have ${\|T^pf_b-f_b\|_{\ell^\infty}<\epsilon}$, and then it is not hard to show that ${\lim_{m\rightarrow p}\langle T^m f_c,T^pf_b\rangle_\Psi\geq\langle f,1\rangle^2-2\epsilon}$ which is positive when ${\epsilon}$ is small enough. It is not so easy to deal with the second term when ${f_b}$ is a general Besicovitch almost periodic function but one can use the approximation of Besicovitch almost periodic functions by trigonometric polynomials, together with the pointwise ergodic theorem to obtain a similar result.

However, it is not clear how to deal with the third term ${\langle T^mf_c,T^pf_\perp\rangle_\Psi}$ of (4). Observe that this is the term we know the least about, in the sense that ${f_c}$ is compact but it may not be Besicovitch almost periodic (which would be a stronger condition) and ${f_\perp}$ is orthogonal to trigonometric polynomials but it is not necessarily weak mixing (which again would be a stronger condition).

— 3. The final step —

At the end of last section we sketched the argument showing that for almost every ultrafilter containing a certain Bohr set ${B}$, the sum of the first two terms in (4) is non-negative. The last step is to show that for “many” ultrafilter containing ${B}$, the last term ${\langle T^mf_c,T^pf_\perp\rangle_\Psi}$ of (4) is not too negative as ${m\rightarrow p}$. Recall that ${B}$ contained only numbers ${m}$ for which ${\|T^mf_c-f_c\|<\epsilon}$, so we can essentially replace ${T^mf_c}$ with ${f_c}$. Thus we need to show:

Theorem 10 Let ${\Phi}$ be a sequence of intervals in ${{\mathbb N}}$, let ${f_c:{\mathbb N}\rightarrow{\mathbb R}}$ be bounded, let ${f_\perp}$ satisfy ${\langle f_\perp,t\rangle_\Psi=0}$ for every trigonometric polynomial ${t}$, let ${B}$ be a Bohr set and let ${\epsilon>0}$. Then there is a subsequence ${\Psi}$ of ${\Phi}$ and an ultrafilter ${p}$ containing ${B}$ such that

$\displaystyle \langle f_c,T^pf_\perp\rangle_\Psi\geq-\epsilon$

Our proof of Theorem 10 is inspired by an argument of Beiglböck described in my previous post. One can rephrase the statement of Beiglböck’s result as stating that for any sets ${A,B\subset{\mathbb N}}$ one can find an ultrafilter ${p}$ such that ${\langle 1_A,T^p1_B\rangle_\Psi\geq d(A)_\Psi d(B)_\Psi}$. In fact its proof is not hard to modify in order to show that for any functions ${f,g:{\mathbb N}\rightarrow[0,1]}$ there exists an ultrafilter ${p}$ such that

$\displaystyle \langle f,R^pg\rangle_\Psi\geq\langle f,1\rangle_\Psi\cdot\langle 1,g\rangle_\Psi$

However, our condition that ${p}$ must contain the set ${B}$ is crucial and makes the argument more complicated. In particular, we need crucially that ${f_\perp}$ is orthogonal to every trigonometric polynomial, and hence also orthogonal to the indicator function ${1_B}$ of the Bohr set ${B}$.