In 1977 Furstenberg gave a new proof of Szemerédi’s theorem using ergodic theory. The first step in that proof was to turn the combinatorial statement into a statement in ergodic theory. Thus Furstenberg created what is now known as Furstenberg’s correspondence principle. While this was not (by far) the most difficult part of the proof of Szemerédi’s theorem, it was this principle that allowed many generalizations of Szemerédi’s theorem to be proved via ergodic theoretical arguments. Most of those generalizations had to wait a long time before seeing a combinatorial proof, and for some, no combinatorial proof was ever found (yet).
In this post I will state and prove the correspondence principle and then I will use it to give another proof of Sárközy’s theorem, discussed in my previous post.
— 1. Motivation and a more general setup —
The original correspondence principle was stated for subsets of , but I will go a little more generally and consider any countable amenable semigroup. This is useful when we want to use this technology to deal with combinatorial properties of (say) the multiplicative semigroup of integers, as well as additive structure on the higher dimension groups .
Definition 1 (Følner sequence)Let be a countable semigroup. A (left) Følner sequence in is a sequence where each is a finite subset of and such that for any the set satisfies
The standard example is the sequence of intervals in the group (note that the group operation is the addition). We note that not every countable semigroup has a Følner sequence. For instance the free group on two generators has no Følner sequence.
If a countable semigroup has a Følner sequence we say that is an amenable semigroup. All solvable groups (in particular abelian groups) are amenable. The existence of the Følner sequence allows us to define the notion of density:
Note that the limit doesn’t exist in general. Also, a set can have different densities with respect to different Følner sequences. As an example consider the set . With respect to the Følner sequence the density of is , but with respect to the standard Følner sequence the density is .
The idea of the Correspondence principle is to transfer statements about measure preserving systems to combinatorial statements. For instance, if , to say that contains an arithmetic progression of length is the same as to say that for some positive integer the intersection is non-empty. To see this note that is in that intersection if and only if the arithmetic progression is contained in .
Now if we introduce the shift map defined by , then contains an arithmetic progression of length if and only if . In many situations we prove that the intersection is non-empty by proving it’s actually big (has positive upper density). Now this formulation suggests an ergodic theoretical formulation of the Szemerédi’s theorem. First we need to introduce some concepts.
Let be a probability space and a probability preserving map, i.e. is a measurable function such that for each we have (here as usual we define ). The quadruple is called a measure preserving system. An ergodic theoretical formulation of Szemerédi’s theorem is
More generally one can consider, for any semigroup , an action of on the probability space by probability preserving transformations.
— 2. Furstenberg’s Correspondence principle —
We now state the general correspondence principle that, in the particular case and is the shift defined above, allows one to actually deduce Szemerédi’s theorem from theorem 3.
Theorem 4 (Furstenberg’s Correspondence Principle)Let be a countable amenable semigroup (with identity) and let be a Følner sequence. Let be a subset with positive upper density. Then there exists a probability space , a measure preserving -action on and a set such that and for any
Proof: Consider the compact space and let be the characteristic function of (i.e. if and otherwise). Consider the action of given by for any and . Let be the orbit closure of , i.e. where the topology on is the product topology.
If then for any we also have . Let be the set of all sequences such that . We have . Let be a sequence such that
Now consider in the induced topology and the Borel -algebra . For each positive integer we define the measure as the average of the point mass measure at the points for . In other words, for a subset we set where is the number of elements of such that . We have thus that . On the other hand, all measures are probability Borel measures, and thus live in the weak-* unit ball of the dual of the space of continuous functions on . Thus by compactness of that unit ball we have that (possibly after passing to a subsequence) the sequence converges (in the weak-* topology). Let be the limit measure. Then and for any we have
I now show how to use this result to convert the multiple recurrence theorem (theorem 3) into Szemerédi’s theorem. Recall that Szemerédi’s theorem states that a subset with positive upper Banach density contains arbitrarily long arithmetic progressions. If is such a set then it has positive upper density with respect to some Følner sequence consisting of intervals (cf. definition 2). We can now apply the correspondence theorem to get a measure preserving system and a subset with .
By theorem 3for all there is some such that , and from the correspondence theorem we know that
In particular there is some and so is the desired arithmetic progression (since was arbitrary contains arbitrarily long arithmetic progressions). This finishes the proof of Szemerédi’s theorem (assuming theorem 3).
As another simple example of an application of the correspondence principle, I now give the proof that a recurrent set is also an intersective set, as mentioned in my previous post. Let be a recurrent set and be a set of positive upper density (with respect to some Følner sequence). We want to prove that the intersection is non-empty. Applying the correspondence principle we find a measure preserving system and with . Since is recurrent, there is some such that . But then we have and if then , so both and thus .
— 3. Sárközi’s theorem, revisited —
In my last post, I proved that any divisible polynomial (I define this below) is a recurrent set. I also observed that that result is a consequence of Sárközi’s theorem (which states that divisible polynomials are intersective), because intersective sets are recurrent. In the last section I showed that recurrent sets are intersective, so Sárközi’s theorem can be stated as “divisible polynomials are recurrent”. More precisely
Then the set is recurrent, i.e. for any measure preserving system and any set with there is such that .
Sárközi proved that for polynomials with a zero (i.e. an integer such that ) the set is intersective (he first proved the case ) and independently Furstenberg, using his correspondence principle proved the same result for squares (so he actually proved that is a recurrent set, and then from the correspondence principle concluded that it must also be an intersective set). Later Kamae and Mendès France proved the general case when is just divisible (note that if has a zero then it is automatically divisible).
I will now provide a proof based on Furstenberg’s original argument but using a version of the van der Corput trick instead of the spectral theorem which would be a (slightly) softer way.
then as .
The proof of theorem 5 illustrates the basic idea behind recurrence theorems (which, thanks to the correspondence theorem can be used to deduce combinatorial results). The first step is to consider the characteristic function of instead of the set itself. Then we want to decompose the space (in this case ) into an orderly component and a pseudo-random component. This is of course a very vague statement, and in each case, to determine what is the orderly component (and what is the pseudo-random) is the important part. Elements in the pseudo-random component should have pseudo-random orbits, in the sense that they somehow average down to , and elements in the orderly component can be controlled.
In this case those component are orthogonal subspaces of . The van der Corput trick will be used to prove that the pseudo-random component does average down to , then the orderly part is easy to control.
Proof: of theorem 5
Let and let be given by . Then is a unitary operator (this is called the Koopman operator). Let be the characteristic function of (i.e. if and if ). Then and we want to prove that this is positive for some . We will do this by showing that the sum over some subset of is non-zero.
By hypothesis, the set is non-empty for all , and if then also (think) and so for any integer . It turns out that the average gets close to the projection of in a subspace (which will be the orderly subspace) if is large enough. The subspace is the rational subspace .
We can decompose where and is orthogonal to . Now we will prove that converges to , using proposition 6. First note that the function is also a polynomial, we will actually prove that
holds for any non-constant polynomial , using induction on the degree of . If has degree , say , then the mean ergodic theorem applied to says that the limit in (1) is the projection of onto the invariant subspace of . Since is invariant under , is orthogonal to which contains the invariant subspace of . This proves (1) when the degree of is .
Now assume by induction that (1) holds for (which is a polynomial of degree less than ). Let . Then by the induction hypothesis we have for each positive integer that
We can now prove that gets arbitrarily close to : fix and let be such that for some non-zero integer and (this can be done because ). Note that if is multiple of then still and so (because is unitary). Let , taking averages we get and so for large enough, using (1), we also have that .
Since (where is the constant function) and we have that and so . Putting this together with the last equation we get
and so for some we have , and so putting small enough we conclude that .