In 1977 Furstenberg gave a new proof of Szemerédi’s theorem using ergodic theory. The first step in that proof was to turn the combinatorial statement into a statement in ergodic theory. Thus Furstenberg created what is now known as Furstenberg’s correspondence principle. While this was not (by far) the most difficult part of the proof of Szemerédi’s theorem, it was this principle that allowed many generalizations of Szemerédi’s theorem to be proved via ergodic theoretical arguments. Most of those generalizations had to wait a long time before seeing a combinatorial proof, and for some, no combinatorial proof was ever found (yet).

In this post I will state and prove the correspondence principle and then I will use it to give another proof of Sárközy’s theorem, discussed in my previous post.

** — 1. Motivation and a more general setup — **

The original correspondence principle was stated for subsets of , but I will consider more general countable commutative semigroups.

This is useful when we want to use this technology to deal with combinatorial properties of (say) the multiplicative semigroup of integers, as well as additive structure on the higher dimension groups .

Definition 1 (Følner sequence)Let be a countable semigroup. A (left) Følner sequence in is a sequence where each is a finite subset of and such that for any the set satisfies

The standard example is the sequence of intervals in the group (note that the group operation is the addition). We note that not every countable semigroup has a Følner sequence. For instance the free group on two generators has no Følner sequence.

All solvable groups (in particular abelian groups) are amenable. The existence of a Følner sequence in a semigroup allows us to define the notion of density:

Definition 2 (Upper Density)Let be a countable semigroup admiting a Følner sequence . Then for any subset we define its upper density (with respect to ) by

Note that the limit doesn’t exist in general. Also, a set can have different densities with respect to different Følner sequences. As an example consider the set . With respect to the Følner sequence the density of is , but with respect to the standard Følner sequence the density is .

The idea of the Correspondence principle is to transfer combinatorial statements into statements about measure preserving systems. For instance, if , to say that contains an arithmetic progression of length is the same as to say that for some positive integer the intersection is non-empty. To see this note that is in that intersection if and only if the arithmetic progression is contained in .

Now if we introduce the shift map defined by , then contains an arithmetic progression of length if and only if . In many situations we prove that the intersection is non-empty by proving it’s actually big (has positive upper density). Now this formulation suggests an ergodic theoretical formulation of the Szemerédi’s theorem. First we need to introduce some concepts.

Let be a probability space and a probability preserving map, i.e. is a measurable function such that for each we have (here as usual we define ). The quadruple is called a measure preserving system. An ergodic theoretical formulation of Szemerédi’s theorem is

Theorem 3 (Furstenberg’s Multiple Recurrence theorem)Let be a measure preserving system and let have positive measure. Then for all positive integer there is some positive integer such that

More generally one can consider, for any semigroup , an action of on the probability space by probability preserving transformations.

** — 2. Furstenberg’s Correspondence principle — **

We now state the general correspondence principle that, in the particular case , allows one to derive Szemerédi’s theorem from Theorem 3.

Theorem 4 (Furstenberg’s Correspondence Principle)Let be a countable commutative semigroup (with identity) and admitting a Følner sequence . Let be a subset with positive upper density. Then there exists a probability space , a measure preserving -action on and a set such that and for any

*Proof:* Consider the compact space and let be the characteristic function of (i.e. if and otherwise). Consider the action of given by for any and . Let be the orbit closure of , i.e. where the topology on is the product topology.

If then for any we also have . Let be the set of all sequences such that . We have . Let be a sequence such that

Now consider in the induced topology and the Borel -algebra . For each positive integer we define the measure as the average of the point mass measure at the points for . In other words, for a subset we set where is the number of elements of such that . We have thus that . On the other hand, all measures are probability Borel measures, and thus live in the weak-* unit ball of the dual of the space of continuous functions on . Thus by compactness of that unit ball we have that (possibly after passing to a subsequence) the sequence converges (in the weak-* topology). Let be the limit measure. Then and for any we have

I now show how to use this result to convert the multiple recurrence theorem (theorem 3) into Szemerédi’s theorem. Recall that Szemerédi’s theorem states that a subset with positive upper Banach density contains arbitrarily long arithmetic progressions. If is such a set then it has positive upper density with respect to some Følner sequence consisting of intervals (cf. definition 2). We can now apply the correspondence theorem to get a measure preserving system and a subset with .

By theorem 3for all there is some such that , and from the correspondence theorem we know that

In particular there is some and so is the desired arithmetic progression (since was arbitrary contains arbitrarily long arithmetic progressions). This finishes the proof of Szemerédi’s theorem (assuming theorem 3).

As another simple example of an application of the correspondence principle, I now give the proof that a recurrent set is also an intersective set, as mentioned in my previous post. Let be a recurrent set and be a set of positive upper density (with respect to some Følner sequence). We want to prove that the intersection is non-empty. Applying the correspondence principle we find a measure preserving system and with . Since is recurrent, there is some such that . But then we have and if then , so both and thus .

** — 3. Sárközi’s theorem, revisited — **

In my last post, I proved that any divisible polynomial (I define this below) is a recurrent set. I also observed that that result is a consequence of Sárközi’s theorem (which states that divisible polynomials are intersective), because intersective sets are recurrent. In the last section I showed that recurrent sets are intersective, so Sárközi’s theorem can be stated as “divisible polynomials are recurrent”. More precisely

Theorem 5 (Sárközi/Furstenberg/Kamae-Mendès France)Let be non-zero polynomial such that for any integer the image is also an integer. Assume also that is a divisible polynomial, i.e.Then the set is recurrent, i.e. for any measure preserving system and any set with there is such that .

Sárközi proved that for polynomials with a zero (i.e. an integer such that ) the set is intersective (he first proved the case ) and independently Furstenberg, using his correspondence principle proved the same result for squares (so he actually proved that is a recurrent set, and then from the correspondence principle concluded that it must also be an intersective set). Later Kamae and Mendès France proved the general case when is just divisible (note that if has a zero then it is automatically divisible).

I will now provide a proof based on Furstenberg’s original argument but using a version of the van der Corput trick instead of the spectral theorem which would be a (slightly) softer way.

Proposition 6 (van der Corput trick in Hilbert spaces)Let be a bounded sequence in a Hilbert space . If for each positive integer we havethen as .

The proof of theorem 5 illustrates the basic idea behind recurrence theorems (which, thanks to the correspondence theorem can be used to deduce combinatorial results). The first step is to consider the characteristic function of instead of the set itself. Then we want to decompose the space (in this case ) into an orderly component and a pseudo-random component. This is of course a very vague statement, and in each case, to determine what is the orderly component (and what is the pseudo-random) is the important part. Elements in the pseudo-random component should have pseudo-random orbits, in the sense that they somehow average down to , and elements in the orderly component can be controlled.

In this case those component are orthogonal subspaces of . The van der Corput trick will be used to prove that the pseudo-random component does average down to , then the orderly part is easy to control.

*Proof:* of theorem 5

Let and let be given by . Then is a unitary operator (this is called the Koopman operator). Let be the characteristic function of (i.e. if and if ). Then and we want to prove that this is positive for some . We will do this by showing that the sum over some subset of is non-zero.

By hypothesis, the set is non-empty for all , and if then also (think) and so for any integer . It turns out that the average gets close to the projection of in a subspace (which will be the orderly subspace) if is large enough. The subspace is the rational subspace .

We can decompose where and is orthogonal to . Now we will prove that converges to , using proposition 6. First note that the function is also a polynomial, we will actually prove that

holds for any non-constant polynomial , using induction on the degree of . If has degree , say , then the mean ergodic theorem applied to says that the limit in (1) is the projection of onto the invariant subspace of . Since is invariant under , is orthogonal to which contains the invariant subspace of . This proves (1) when the degree of is .

Now assume by induction that (1) holds for (which is a polynomial of degree less than ). Let . Then by the induction hypothesis we have for each positive integer that

Thus by proposition 6 we conclude that (1) holds.

We can now prove that gets arbitrarily close to : fix and let be such that for some non-zero integer and (this can be done because ). Note that if is multiple of then still and so (because is unitary). Let , taking averages we get and so for large enough, using (1), we also have that .

Since . Putting this together with the last equation we get

and so for some we have , and so putting small enough we conclude that .

EDIT(17/01/2020): I’ve edited the post, restricting the Furstenberg correspondence principle (Theorem 3) only to commutative semigroups admitting a Følner sequence. Formerly the post contained the (wrong) claim that if a countable semigroup is amenable, then it admits a Følner sequence. Moreover, the proof of Theorem 3 as presented didn’t work for non-commutative semigroups.

The Furstenberg Correspondence principle is valid for arbitrary amenable semigroups, but a different proof is required.

Pingback: YAMB

Pingback: Recurrence theorems | YAMB

Pingback: Proof of Roth’s Theorem using Ergodic Theory | YAMB

Pingback: Convergence along ultrafilters – part II | YAMB

Pingback: Sets of nice recurrence | YAMB

Pingback: YAMB

Pingback: On {x+y,xy} patterns in large sets of countable fields | I Can't Believe It's Not Random!

Pingback: Applications of the coloring trick | I Can't Believe It's Not Random!

Pingback: Convergence and Recurrence of Z actions | I Can't Believe It's Not Random!

Pingback: Szemerédi’s Theorem Part I – Equivalent formulations | I Can't Believe It's Not Random!

Reblogged this on Human Mathematics.

Pingback: Measure preserving actions of affine semigroups and {x+y,xy} patterns | I Can't Believe It's Not Random!

Pingback: I Can't Believe It's Not Random!

Hi Joel, thanks for your post! It’s very informative. I was just wondering if your definition of “amenable” (that there exists a Folner sequence/net) is really sufficient for most working definitions of amenability. For example, commutative semigroups ought to be amenable, and yet the multiplicative semigroup does not admit a Folner sequence because is an absorbing element.

Even if you’re given a Folner sequence (or net), constructing a finitely-additive, invariant probability measure on the power set of doesn’t seem to work unless satisfies some cancellation condition. Absorbing elements seem to screw things up. Even the action seems to be more of an “anti” action in the sense that ; for groups you could fix this by setting where denotes left-translation. So amenability of semigroups in terms of Folner sequences doesn’t seem to be as easy to work with as it is for groups.

Dear E, thank you for your comment!

You’re right of course that not every countable amenable semigroup has a Folner sequence (this was a misconception I had at the time I wrote this post), and indeed general (even amenable) semigroups can be much weirder than those embedable into groups.

And you’re also right in that the proof presented of the Furstenberg Correspondence Principle only works for commutative semigroups.

I’ve edited the post to reflect this (perhaps I will later write a new post with a more general and modern proof of the Correspondence principle).

However, strictly speaking, absorbing elements do not prevent invariant means/Folner sequences from existing. Indeed, in the semigroup endowed with multiplication, a “Dirac measure” at is an invariant mean (and we can take as Folner sequence a constant sequence where every consists only of the singleton ). This is of course quite degenerate…

Thanks Joel for the quick feedback. I appreciate it! I found recently that if $G$ admits a Folner net, then in fact it admits a Folner net $(F_\lambda)$ in which each member is “cancellable”, meaning $gx=gy\implies x=y$ for all $g\in G$ and $x,y\in F_\lambda$. This is a clever theorem of Gray–Kambites. Once you have this sequence, you can construct a finitely-additive invariant probability measure on the power set of $G$ in the way one would usually do for groups (using an ultralimit in place of a limsup in the definition of the upper density). Unfortunately, without the cancellation property, the ultralimit construction may fail to result in an *invariant* measure, as far as I can tell.