Theorem 1 (van der Waerden) Let be a positive integer. Consider a partition (or coloring) of the integers in finitely many colors. Then there is some color that contains arbitrarily long (but finite) arithmetic progressions.
An arithmetic progression is a set of the form . It is easy to construct an example of a partition of into two subsets (or colors) such that no one contains an infinite arithmetic progression. For instance if is any increasing sequence such that is also strictly increasing, than coloring the intervals with one color and the intervals with the other color, there is no monochromatic infinite arithmetic progression, although both colors have arbitrarily long arithmetic progressions.
Van der Waerden first proof of this theorem in 1927 used only combinatorial methods. Since then, other combinatorial proofs have been found. In this post I will prove vdW with methods from topological dynamics, to illustrate how one can use somewhat heavy machinery to prove combinatorial statements about integers. The approach here mainly follows the one in Furstenberg’s book Recurrence in Ergodic Theory and Combinatorial Number Theory
2. Compactness and vdW numbers
As said before, there are many equivalent formulations of vdW theorem. I present now a stronger looking version and prove the equivalence between this and the previous formulation. This section is independent of the rest of the post.
This is called the finitistic version of vdW. It is an easy exercise to deduce Theorem 1 from Theorem 2. We now present a standard compactness argument, explored in great detail by Terence Tao in his blog, to derive the other implication.
While solving mathematical olympiad problems in combinatorics I got used to see them solved by “considering” some set, usually to double count its cardinality or to create some bijection/injection. In the same spirit, in this case, thinking about the set of all colorings of is the main step to solve this question (this shouldn’t be seen as a magic step though. Since we want to prove that all colorings have some property it is natural that we consider the set of all colorings. However one may be tempted to try to think about van der Waerden theorem by fixing the coloring since that’s the way the theorem is formulated, so considering this set is a natural but non-trivial step). It is now important to look at this set, let’s call it , from another angle. A coloring of can be viewed as a function that to each assigns its color. Thus an element of is a function from to , or as a point in the space .
A second step, maybe not so natural but easier to come up with, is to endow with a topology. After all a set with no structure isn’t that useful. We give the discrete topology and in we consider the product topology which makes into a metrizable space. A key observation now is that, by Tychonoff’s theorem, is a compact set.
After this set up let’s now prove Theorem 1 using Theorem 2: suppose Theorem 1 isn’t true. Then there is some sequence going to infinity and such that there is some coloring of in colors with no monochromatic arithmetic progression. Extend the colorings (functions) to all arbitrarily. We have now a sequence in our compact metric space , so we can extract a convergent subsequence. Let be the limit of that subsequence. Then, by Theorem 2 we have a monochromatic arithmetic progression of length , say , in . On other words, is constant. We note that, in the topology on , convergence is the pointwise convergence, and since has the discrete topology this means that there is some arbitrarily large such that is also constant. But when (which happens for sufficiently large ) then has the same monochromatic arithmetic progression, contradicting the construction of . This contradiction proves Theorem 1.
With this finitistic formulation, a question arises, namely how does the function depends on and . The proof of vdW presented below gives no answer to this question but finding tight bounds for is famous open question. The best bounds known were found by Gowers in his proof of Szemerédi’s theorem:
This upper bounds are still far from the best know lower bounds.
3. Furstenberg-Weiss multiple recurrence theorem
In this section we will build the tools we need to prove vdW theorem. Let us first introduce some notation. A topological dynamical system is a pair where is a topological space and is a continuous functions. A good example is and or any topological group with being the multiplication by some fixed element. For a point in we define its orbit as . In our first example we have .
A point is said to be recurrent if , in other words if there is some sequence of positive integers such that . If we have several continuous maps we say that is multiply recurrent if there is some sequence such that for each we have . We will see that vdW is about existence of certain multiply recurrent point in a certain topological space. This topological statement, due to Furstenberg and Weiss, is a generalization of a theorem of Birkoff, curiously first proved about the same time van der Waerden proved his theorem on arithmetic progressions.
The proof is somewhat lengthy and technical. I present it here in full detain primarily for my own benefit, so the reader is encourage to skip it on a first reading.
Proof: We will split the proof into several parts. First we prove that we can reduce the statement to the case of invertible systems, i.e. we can assume that each is not only continuous but is actually a homeomorphism. We also reduce the theorem to the case of a minimal system. Then the argument follows by induction on . The case follows then easily, and for we reduce the statement to find a recurrent point in the diagonal on . Using the induction hypothesis we get that the diagonal is a homogeneous and recurrent subset (this terms will be made precise later) and then we use several technical steps to eventually find a recurrent point.
- Reduction to an Invertible System.
We will use the standard construction to replace by , the set of functions from to with the product topology. For each we define Now let . To prove that this set is non-empty, we use the compactness of (given again by Tychonoff Theorem) to find a limit point of the sequence defined by: where is arbitrary. We have that if . Thus a limit point of the sequence is in .It is also clear that is in variant. We now verify that if is a multiple recurrent point in for all , then is multiple recurrent for all . Let the sequence be such that and an open neighborhood of . Then is a neighborhood of in , by the definition of product topology. Since is multiple recurrent we will have for all large enough and all that , so and so is indeed multiple recurrent for all .
- Reduction to a Minimal system.
From now on we assume that all are homeomorphisms of . Now consider the family of all closed invariant sets, i.e. if is closed and for all . Note that can be partially ordered by inclusion and for a totally ordered family in , the intersection is still in . Applying Zorn’s lemma (ultimately this could be avoided, but to do so one needs to introduce a measure on so I will use this method) we can find a minimal element . If we find a multiple recurrent point in for all , then clearly that point is also multiple recurrent on . Thus we can assume from now on that is minimal, i.e. it contains no closed invariant subsets.
- The Induction process.
We use induction on . For , if , we have for its orbit closure, is a closed -invariant subset of , so by minimality of we have and so is recurrent for .If , construct as the diagonal, i.e. . We note that is compact, since it is the diagonal of a Hausdorff space. Let , and note that . Note that if is a recurrent point for , then is a multiple recurrent point for .Define, for the homeomorphism and, using the induction hypothesis, let be a multiple recurrent point for all , . Also let be a sequence such that as for all . Let and for . Then for all , if is sufficiently large we have . It is here that the commutative hypothesis on the is used.
- is Homogeneous and Recurrent.
A subset of a compact metric space is called homogeneous for a continuous function if it is the minimal invariant set of some group of transformations commuting with ; and it is called recurrent if such that . is homogeneous in with respect to because it is a minimal invariant set for the continuous functions defined by . Indeed is clearly invariant and if is invariant for all , by the minimality of , we have , so and is minimal. Also since the commute, also commute with .To prove that is recurrent, let and . Let be a finite covering of by ball of diameter . Let be the group generated by , so acts on and fixes but no closed proper subset of . Therefore there is a finite subset such that covers for each (note that there are only finitely many such . Therefore we can find such that for , for all .We can now find such that . Also for large enough we have that , so (remember that commutes with ). Putting both together we get . From this we conclude that is recurrent.
- For each there is some such that .
Let . Let . For each find and such that and then such that . Such can be found because is recurrent and the exist because is continuous.Let be such that (they exist by compactness). Then clearly so and continuing with this we get . Finally this implies that .
- has a Recurrent Point.
Let be defined by . By the last point we have that gets arbitrarily close to . Also it is easy to see that is upper semi-continuous, so there is a point of continuity (actually the points of discontinuity form a meager set which, by the Baire Category Theorem, can’t be the whole space).Let be a point where is continuous. If we are done (as is our recurrent point in ). Otherwise there is some neighborhood of , say , where for all . But then there is some finite subset such that . Let be such that , for all and .
Let be such that . Let be such that , so that . Then for some we have and so which implies that , but this contradicts the fact that , and this contradiction means that actually and is our recurrent point for which can be converted as referred to a multiple recurrent point for .
Now that we have the right tools from topological dynamics we have to reduce vdW theorem to a statement in that language. Let’s fix the number of colors. Now consider the space of all functions from to , i.e. the space of all colorings of by colors. In we can define a distance:
It is easy to verify that is indeed a metric on . To illustrate the behavior of this metric we note that for we have . Lets now consider a ball in that metric:
We now have that two colorings are close to each other if they start the same way. However, arithmetic progressions are not about the beginning of coloring, but about repetition of the same color along some constant jumps or period. A crucial idea here is to introduce the shift map defined by . Notice for instance that if for some , then is periodic, and we have actually infinite monochromatic arithmetic progressions. Even if we only know that and are close, then they have the same beginning, and so . Notice also that is continuous, since if then the first values of and coincide, so .
Now suppose a coloring has a monochromatic arithmetic progression of length , say (the is just there for notational convenience). Then , or in other words, the set has diameter less than . This is actually equivalent to be monochromatic, and in fact we have the following statement:
Proof: } Let and consider its orbit closure . From Theorem 4 there is some and such that diam, which is equivalent to . Since we can find such that . Then the first values of are the same as those of , so also and as seen above this is equivalent to say that has a monochromatic arithmetic progression.
Proof: Let for . This functions clearly commute. From Theorem 3 we can find a multiple recurrent point for . Let be such that for all . Then satisfies the conditions.
This finishes the proof of van der Waerden theorem on arithmetic progressions.