— 1. Introduction —
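In the study of measurable dynamics, the basic object of study is a measure preserving system: a quadruple $(X,\mathcal{B},\mu,T)$, where $X$ is a set, $\mathcal{B}$ is a $\sigma$-algebra over $X$, $\mu:\mathcal{B}\to[0,1]$ is a probability measure on $X$ and $T:X\to X$ is a measurable map such that, for each $A\in\mathcal{B}$, we have $\mu(T^{-1}A)=\mu(A)$, where $T^{-1}A=\{x\in X:Tx\in A\}$. If there exists a set $A\in\mathcal{B}$ such that $T^{-1}A=A$ and $0<\mu(A)<1$, then we can consider the measure preserving system $(A,\mathcal{B}_A,\mu_A,T_A)$, where $\mathcal{B}_A=\{B\in\mathcal{B}:B\subset A\}$, $\mu_A(B)=\mu(B)/\mu(A)$ and $T_A=T|_A$. This system is a piece of the original system $(X,\mathcal{B},\mu,T)$, and thus can be studied separately. If there is no such set $A$ then we say that the system is ergodic.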
Analogous to the way primes are the building blocks of the integers, ergodic systems are the building blocks of measure preserving systems. When we want to prove certain statements about general measure preserving systems (such as Furstenberg’s multiple recurrence theorem, which is equivalent to the celebrated theorem of Szemerédi on arithmetic progressions) it can be useful to reduce them to the case when the system is ergodic. The tool that allows for this reduction is called the ergodic decomposition, and it plays the role of the fundamental theorem of arithmetic in our analogy between ergodic measure preserving systems and the prime numbers. I have used this method before on this blog, when presenting the ergodic theoretic proof of Roth’s theorem.
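Before I state the theorem I need to establish some notation. Throughout this post, $(X,\mathcal{B})$ will usually denote a measurable space and $T:X\to X$ will be a $\mathcal{B}$-measurable map. A probability $\mu$ is invariant under $T$ (or $T$-invariant) if for all $A\in\mathcal{B}$ we have $\mu(T^{-1}A)=\mu(A)$, equivalently if $(X,\mathcal{B},\mu,T)$ is a measure preserving system. The probability $\mu$ is ergodic if for every $A\in\mathcal{B}$ with $T^{-1}A=A$ we have $\mu(A)=0$ or $\mu(A)=1$, equivalently if the system $(X,\mathcal{B},\mu,T)$ is ergodic.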
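Theorem 1 (Ergodic Decomposition) Let $X$ be a compact metric space, let $\mathcal{B}$ be the Borel $\sigma$-algebra and let $T:X\to X$ be $\mathcal{B}$-measurable. Then there exists a set $Y\subset X$ and a map $y\mapsto\mu_y$ that associates with every $y\in Y$ a $T$-invariant $T$-ergodic probability measure $\mu_y$ on $(X,\mathcal{B})$ such that for every bounded $\mathcal{B}$-measurable function $f:X\to\mathbb{R}$, the map $y\mapsto\int_X f\,d\mu_y$ is $\mathcal{B}$-measurable and invariant under $T$ and for every $T$-invariant probability $\mu$, after completing the $\sigma$-algebra $\mathcal{B}$ with respect to $\mu$, we have:
- The set $Y$ is measurable and $\mu(Y)=1$.
- For every $f\in L^1(\mu)$ we have
$$\int_X f\,d\mu=\int_Y\int_X f\,d\mu_y\,d\mu(y)$$
The last condition can be informally stated as $\mu=\int_Y\mu_y\,d\mu(y)$, i.e., any $T$-invariant probability $\mu$ is a convex combination of the ergodic measures $\mu_y$.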
In this post I will discuss and eventually give a full proof of Theorem 1 using the technology of disintegration of measures. I posted about this topic recently, and all the background can be found in that post.
— 2. Alternative approach —
Before giving a rigorous proof of Theorem 1 I will briefly describe an alternative way to think about this theorem. This can be formalized to give a full proof of the ergodic decomposition theorem. Let $X$ be a compact metric space and let $\mathcal{B}$ be the Borel $\sigma$-algebra. Let $T:X\to X$ be a map measurable with respect to $\mathcal{B}$. Let $\mathcal{M}_T$ be the set of all $T$-invariant probability measures over $(X,\mathcal{B})$. To see that $\mathcal{M}_T$ is non-empty, let $x\in X$ be arbitrary and let $\mu_N$ be the probability measure over $X$ defined by $\mu_N=\frac1N\sum_{n=1}^N\delta_{T^nx}$. Since the space of probability measures over $X$ is weak$^*$ compact, there exists some weak$^*$ limit point $\mu$ for the sequence $(\mu_N)_{N\in\mathbb{N}}$ and it is not hard to see, at least when $T$ is continuous, that $\mu\in\mathcal{M}_T$.
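The averaging construction above can be tried out on a finite state space, where a measure is just a vector of point masses. The following is a minimal sketch under that assumption; the map `T` and the helper names are hypothetical and not taken from the post. It builds the averaged empirical measures $\frac1N\sum_{n=1}^N\delta_{T^nx}$ and measures how far each one is from being invariant.

```python
from fractions import Fraction

def pushforward(mu, T):
    """Return the measure A -> mu(T^{-1}A) for a map T on {0,...,n-1}."""
    out = [Fraction(0)] * len(mu)
    for x, mass in enumerate(mu):
        out[T[x]] += mass
    return out

def empirical_measure(T, x, N):
    """Averaged point masses (1/N) * sum_{n=1}^{N} delta_{T^n x}."""
    mu = [Fraction(0)] * len(T)
    for _ in range(N):
        x = T[x]
        mu[x] += Fraction(1, N)
    return mu

def invariance_defect(mu, T):
    """Sum over points x of |mu(T^{-1}{x}) - mu({x})|; zero iff mu is invariant."""
    return sum(abs(a - b) for a, b in zip(pushforward(mu, T), mu))

# Hypothetical example map: 0 -> 1 -> 2 -> 3 -> 4 -> 2 (a tail feeding a 3-cycle).
T = [1, 2, 3, 4, 2]
defects = [invariance_defect(empirical_measure(T, 0, N), T) for N in (10, 100, 1000)]

# Candidate limit: the uniform measure on the cycle {2, 3, 4}.
limit = [Fraction(0), Fraction(0), Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
```

In this toy case the defect shrinks like $2/N$, because pushing the empirical measure forward only replaces the first point of the orbit segment by the point following the last one. That telescoping is the reason a limit point of the averages is invariant.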
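Observe that $\mathcal{M}_T$ is a convex set. In other words, if $\mu,\nu\in\mathcal{M}_T$ and $t\in[0,1]$ then $t\mu+(1-t)\nu\in\mathcal{M}_T$. Recall that an extreme point of a set $C$ in a linear space is a point $x\in C$ such that whenever $x$ is written as a convex combination $x=ty+(1-t)z$ of points $y,z$ in $C$ and $0<t<1$, then $x=y=z$.
Proposition 2 A measure $\mu\in\mathcal{M}_T$ is an extreme point of the set $\mathcal{M}_T$ if and only if $\mu$ is $T$-ergodic.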
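Proof: First let $\mu$ be an extreme point. Let $A\in\mathcal{B}$ be an invariant set such that $\mu(A)>0$ (so we want to show that $\mu(A)=1$). Let $\mu_A$ be the probability measure defined by $\mu_A(B)=\mu(A\cap B)/\mu(A)$ for any $B\in\mathcal{B}$. Since $T^{-1}A=A$ and $\mu$ is invariant under $T$ we have
$$\mu_A(T^{-1}B)=\frac{\mu(A\cap T^{-1}B)}{\mu(A)}=\frac{\mu\big(T^{-1}(A\cap B)\big)}{\mu(A)}=\frac{\mu(A\cap B)}{\mu(A)}=\mu_A(B)$$
Hence $\mu_A\in\mathcal{M}_T$. If $\mu(A)<1$, then $\mu(X\setminus A)>0$ and $X\setminus A$ is also $T$-invariant. Thus we can create a $T$-invariant measure $\mu_{X\setminus A}$ defined by $\mu_{X\setminus A}(B)=\mu(B\setminus A)/\mu(X\setminus A)$ and we have $\mu=\mu(A)\mu_A+\big(1-\mu(A)\big)\mu_{X\setminus A}$. Since $\mu$ is an extreme point in $\mathcal{M}_T$ this can’t happen, and hence $\mu(A)=1$ as desired.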
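Now we prove the converse. Let $\mu$ be ergodic, and write $\mu=t\nu_1+(1-t)\nu_2$ with $\nu_1,\nu_2\in\mathcal{M}_T$ and $0<t<1$. For any $T$-invariant set $A$ we have
$$t\nu_1(A)+(1-t)\nu_2(A)=\mu(A)\in\{0,1\}$$
Since $t>0$ and $1-t>0$ (strict inequalities!) we deduce that $\nu_1(A)=\nu_2(A)=\mu(A)$. This implies that both $\nu_1$ and $\nu_2$ are ergodic measures.
Now let $f\in C(X)$ be arbitrary. By the pointwise ergodic theorem there exists a set $X_0\in\mathcal{B}$ such that $\mu(X_0)=1$ and for each $x\in X_0$ we have
$$\int_X f\,d\mu=\lim_{N\to\infty}\frac1N\sum_{n=1}^N f(T^nx)$$
By the previous remark, also $\nu_1(X_0)=1$, and hence, again by the ergodic theorem, we have that for $\nu_1$-almost every point $x$ in $X_0$ we have
$$\int_X f\,d\nu_1=\lim_{N\to\infty}\frac1N\sum_{n=1}^N f(T^nx)$$
Since the right hand side of the two previous displays is the same, we conclude that $\int_X f\,d\mu=\int_X f\,d\nu_1$. Since $f$ was arbitrary, we conclude that $\nu_1=\mu$, and then it follows that $\nu_2=\mu$ as well. Therefore $\mu$ is an extreme point in $\mathcal{M}_T$.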
Denote by $\mathcal{E}_T\subset\mathcal{M}_T$ the subset of $T$-ergodic measures. We now recall Choquet’s theorem, which, in this case, says that for any $\mu\in\mathcal{M}_T$ there exists some probability measure $\lambda$ on $\mathcal{E}_T$ (yes, this is a measure on a space whose points are measures!) such that $\mu=\int_{\mathcal{E}_T}\nu\,d\lambda(\nu)$. Note that this is an equality between measures on $X$; it can be made more precise as $\mu(A)=\int_{\mathcal{E}_T}\nu(A)\,d\lambda(\nu)$ for every $A\in\mathcal{B}$.
This conclusion follows the same spirit as Theorem 1 and is also called the Ergodic Decomposition. For most (if not all) applications, this is enough, although the statement and proof of Theorem 1 arguably give a better understanding.
— 3. Examples —
I will try to give some intuition about Theorem 1 by exploring some examples first.
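Example 1 Let $X=\{0,1,2\}$ be given the discrete topology and let $\mu$ be the uniform measure (more precisely, $\mu(A)=|A|/3$). Let $T0=1$, $T1=0$ and $T2=2$. The set $\{0,1\}$ is invariant under $T$ and $\mu(\{0,1\})=2/3$, hence the system $(X,\mathcal{B},\mu,T)$ is not ergodic.
However, if we restrict $\mu$ to $\{0,1\}$ and renormalize it, we obtain a probability measure which makes the system ergodic. More precisely, let $\mu_0(\{0\})=\mu_0(\{1\})=1/2$ and $\mu_0(\{2\})=0$. Then $\mu_0$ is an ergodic measure, in other words, the system $(X,\mathcal{B},\mu_0,T)$ is ergodic.
Also, if $\mu_2$ is the point mass at $2$ (so that $\mu_2(\{2\})=1$ and $\mu_2(\{0,1\})=0$), then the system $(X,\mathcal{B},\mu_2,T)$ is also ergodic (one can also think of $\mu_2$ as the normalized restriction of $\mu$ to the invariant set $\{2\}$).
Finally, observe that we can write $\mu$ as the convex combination $\mu=\frac23\mu_0+\frac13\mu_2$ of the ergodic measures $\mu_0$ and $\mu_2$. If we let $\mu_1=\mu_0$, then we can write informally $\mu=\int_X\mu_x\,d\mu(x)$.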
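The decomposition in Example 1 can be checked mechanically. Here is a minimal sketch, assuming the three-point space is labelled $\{0,1,2\}$ with $T$ swapping $0$ and $1$ and fixing $2$ (the labels and the helper names `cycles` and `ergodic_measures` are illustrative assumptions). It attaches to each point the uniform measure on its cycle and recombines these measures by integrating against the uniform measure.

```python
from fractions import Fraction

def cycles(T):
    """Orbits of a permutation T of {0,...,n-1}, each as a sorted tuple."""
    seen, out = set(), []
    for x in range(len(T)):
        if x not in seen:
            orbit, y = [], x
            while y not in seen:
                seen.add(y)
                orbit.append(y)
                y = T[y]
            out.append(tuple(sorted(orbit)))
    return out

def ergodic_measures(T):
    """Map each point x to the uniform probability vector on the cycle through x."""
    n = len(T)
    mu = {}
    for orbit in cycles(T):
        vec = [Fraction(1, len(orbit)) if z in orbit else Fraction(0) for z in range(n)]
        for x in orbit:
            mu[x] = vec
    return mu

T = [1, 0, 2]                       # assumed labelling: 0 <-> 1, 2 fixed
uniform = [Fraction(1, 3)] * 3      # the measure mu of the example
mu_x = ergodic_measures(T)

# Discrete version of the integral of mu_x against mu: sum_x mu({x}) * mu_x({z}).
recombined = [sum(uniform[x] * mu_x[x][z] for x in range(3)) for z in range(3)]
```

For a permutation of a finite set the ergodic measures are exactly the uniform measures on cycles, which is why the sketch only needs the cycle structure of `T`.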
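Example 2 Let $\mathbb{T}=\mathbb{R}/\mathbb{Z}$ be the torus group and let $X=\mathbb{T}^2$ be the unit square with the usual topology and the Borel $\sigma$-algebra. Let $\mu$ be the Lebesgue measure on $X$ and let $T(x,y)=(x,y+\alpha)$ where $\alpha$ is some irrational number. Any set of the form $A\times\mathbb{T}$, where $A\subset\mathbb{T}$ is a Borel set, is invariant under $T$. Therefore the measure preserving system $(X,\mathcal{B},\mu,T)$ is not ergodic.
To try to mimic the previous example, we can take some Borel set $A\subset\mathbb{T}$ such that $\mu(A\times\mathbb{T})>0$, and let $\mu_A(B)=\mu\big(B\cap(A\times\mathbb{T})\big)/\mu(A\times\mathbb{T})$. The probability $\mu_A$ is $T$-invariant but, unlike in the first example, is not ergodic (for any choice of $A$).
Regardless, it is still quite intuitive what we need to do. Let $\lambda$ denote the (one dimensional) Lebesgue measure on $\mathbb{T}$. For each $x\in\mathbb{T}$, let $\mu_x$ be the measure defined as $\mu_x=\delta_x\times\lambda$. It is not hard to see that $\mu_x$ is $T$-invariant and ergodic (verifying ergodicity is a not completely trivial exercise; one can do it, for instance, using Fourier analysis). Moreover it follows from Fubini’s theorem that
$$\mu(B)=\int_{\mathbb{T}}\mu_x(B)\,d\lambda(x)$$
for any $B\in\mathcal{B}$. To make this decomposition compatible with the notation of Theorem 1, let $\mu_{(x,y)}=\mu_x$ for all $(x,y)\in X$. Observe that the function $(x,y)\mapsto\mu_{(x,y)}(B)$ does not depend on $y$. Thus, applying Fubini’s theorem again we have
$$\mu(B)=\int_{\mathbb{T}}\mu_x(B)\,d\lambda(x)=\int_{\mathbb{T}}\int_{\mathbb{T}}\mu_{(x,y)}(B)\,d\lambda(y)\,d\lambda(x)=\int_X\mu_{(x,y)}(B)\,d\mu(x,y)$$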
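The ergodicity of the fibre measures in Example 2 can be illustrated numerically: along an orbit, time averages of a function should approach its integral over the fibre through the starting point. The sketch below assumes the map is $T(x,y)=(x,y+\alpha)$ with the hypothetical choice $\alpha=\sqrt2-1$, and uses the illustrative test function $f(x,y)=x+\cos(2\pi y)$, whose integral over the fibre $\{x\}\times\mathbb{T}$ equals $x$. It is a floating point experiment, not a proof.

```python
import math

ALPHA = math.sqrt(2) - 1  # hypothetical irrational rotation number

def T(point):
    """Assumed form of the map: fix x and rotate y by ALPHA modulo 1."""
    x, y = point
    return (x, (y + ALPHA) % 1.0)

def birkhoff_average(f, point, N):
    """Time average (1/N) * sum_{n=1}^{N} f(T^n(point))."""
    total = 0.0
    for _ in range(N):
        point = T(point)
        total += f(*point)
    return total / N

def f(x, y):
    """Illustrative observable; its integral over the fibre {x} x T is x."""
    return x + math.cos(2 * math.pi * y)

start = (0.3, 0.7)
avg = birkhoff_average(f, start, 10_000)
fibre_integral = start[0]
```

Starting from a different first coordinate changes the limit of the averages, which reflects the fact that Lebesgue measure on the square is not ergodic while each fibre measure is.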
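Example 3 Let again $X=\mathbb{T}^2$ with the usual topology and let $\mu$ be the Lebesgue measure. Let $T(x,y)=(x,y+x)$. Again, any set of the form $A\times\mathbb{T}$, where $A\subset\mathbb{T}$ is a Borel set, is invariant under $T$ and hence the measure preserving system $(X,\mathcal{B},\mu,T)$ is not ergodic.
However, unlike the previous example, not all the $T$-invariant measures $\mu_x$ (defined by $\mu_x=\delta_x\times\lambda$) are ergodic. Indeed, the set $\{0\}\times[0,1/2]$ is invariant under $T$ but $\mu_0\big(\{0\}\times[0,1/2]\big)=1/2$. This shows that the measure $\mu_0$ is not ergodic.
In fact the measures $\mu_x$ are ergodic exactly when $x$ is irrational (again, this can be proved with some Fourier analysis). Since the set of irrational $x$ has full measure in $\mathbb{T}$, the ergodic decomposition of $\mu$ is the same as the one in the previous example, using only the irrational values for $x$.
However, in this example there are more ergodic measures. Indeed let $x\in\mathbb{T}$ be some rational point and let $y\in\mathbb{T}$ be arbitrary. Write $x=p/q$ in lowest terms. Then the probability measure $\mu_{(x,y)}$ defined by
$$\mu_{(x,y)}=\frac1q\sum_{k=0}^{q-1}\delta_{(x,\,y+kx)}$$
is $T$-ergodic. We have now found all ergodic measures for this system, so, setting $\mu_{(x,y)}=\mu_x$ when $x$ is irrational, any $T$-invariant measure $\nu$ can be decomposed as
$$\nu(B)=\int_X\mu_{(x,y)}(B)\,d\nu(x,y)$$
for every $B\in\mathcal{B}$.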
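The extra ergodic measures over rational points can be computed exactly with rational arithmetic. This is a small sketch, assuming the map of Example 3 is $T(x,y)=(x,y+x)$ on the torus; the starting point $(3/7,1/5)$ and the helper names are hypothetical. It builds the uniform measure on a finite orbit and checks that pushing it forward by $T$ leaves it unchanged.

```python
from fractions import Fraction

def T(point):
    """Assumed form of the map: (x, y) -> (x, y + x mod 1), in exact arithmetic."""
    x, y = point
    return (x, (y + x) % 1)

def orbit(point):
    """Finite orbit of a point whose first coordinate is rational."""
    seen = []
    while point not in seen:
        seen.append(point)
        point = T(point)
    return seen

def orbit_measure(point):
    """Uniform probability measure on the orbit, as a dict point -> mass."""
    pts = orbit(point)
    return {p: Fraction(1, len(pts)) for p in pts}

def pushforward(mu):
    """The measure B -> mu(T^{-1}B) for a finitely supported measure mu."""
    out = {}
    for p, mass in mu.items():
        q = T(p)
        out[q] = out.get(q, Fraction(0)) + mass
    return out

start = (Fraction(3, 7), Fraction(1, 5))   # hypothetical rational starting point
mu = orbit_measure(start)
```

With $x=3/7$ the orbit has exactly $7$ points, matching the denominator $q$ in the formula for $\mu_{(x,y)}$. The loop in `orbit` only terminates when the first coordinate is rational, so the sketch should not be called with irrational data approximated by floats.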
— 4. Proof of Theorem 1 —
Example 3 hints that in order to find all the ergodic measures of a given system, one should look at the invariant sets (observe, however, that not all $T$-invariant sets give an ergodic measure: the set $\{0\}\times\mathbb{T}$ is invariant for the system of Example 3 and yet no ergodic measure has $\{0\}\times\mathbb{T}$ as its support).
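Proposition 3 Let $(X,\mathcal{B},\mu,T)$ be a probability preserving system and let
$$\mathcal{I}=\{A\in\mathcal{B}:T^{-1}A=A\}$$
Then $\mathcal{I}$ is a $\sigma$-algebra.
Proof: Clearly $X\in\mathcal{I}$. Let $A\in\mathcal{I}$ and let $B=X\setminus A$. Then
$$T^{-1}B=X\setminus T^{-1}A=X\setminus A=B$$
and hence $\mathcal{I}$ is closed under complements. Now let $(A_n)_{n\in\mathbb{N}}$ be a sequence of sets in $\mathcal{I}$ and let $A=\bigcup_{n\in\mathbb{N}}A_n$. Then
$$T^{-1}A=\bigcup_{n\in\mathbb{N}}T^{-1}A_n=\bigcup_{n\in\mathbb{N}}A_n=A$$
and hence $\mathcal{I}$ is closed under countable unions and therefore it is a $\sigma$-algebra.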
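On a finite set the statement of Proposition 3 can be verified by brute force. The sketch below enumerates every subset $A$ with $T^{-1}A=A$ for a hypothetical, non-injective map on $\{0,1,2,3\}$ and checks closure under complements and unions; on a finite set this is the same as being a $\sigma$-algebra. The map and the function names are illustrative assumptions.

```python
from itertools import combinations

def preimage(T, A):
    """T^{-1}A for a map T on {0,...,n-1} given as a list."""
    return frozenset(x for x in range(len(T)) if T[x] in A)

def invariant_sets(T):
    """All subsets A of {0,...,n-1} with T^{-1}A == A."""
    n = len(T)
    subsets = (frozenset(c) for r in range(n + 1) for c in combinations(range(n), r))
    return {A for A in subsets if preimage(T, A) == A}

def is_algebra(family, n):
    """Check that the family contains X and is closed under complement and union."""
    X = frozenset(range(n))
    return (X in family
            and all(X - A in family for A in family)
            and all(A | B in family for A in family for B in family))

# Hypothetical map: 0 <-> 1, 2 -> 2, 3 -> 2 (not injective).
T = [1, 0, 2, 2]
I = invariant_sets(T)
```

Note that $\{2\}$ is forward invariant under this map but is not in the family, since its preimage is $\{2,3\}$. The definition used here is the strict one, $T^{-1}A=A$.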
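Henceforth we will call $\mathcal{I}$ the $\sigma$-algebra of invariant sets. It turns out that the ergodic measures of a system are the measures that arise from the disintegration of invariant measures with respect to the $\sigma$-algebra of invariant sets.
More precisely, let $(X,\mathcal{B},\mu,T)$ be a measure preserving system, with $\mathcal{B}$ being the Borel $\sigma$-algebra of a compact metric topology on $X$, and let $\mathcal{I}\subset\mathcal{B}$ be the $\sigma$-algebra of $T$-invariant sets. Apply Theorem 7 from my previous post to find a set $Y\subset X$ with full measure and a disintegration $(\mu_y)_{y\in Y}$ of $\mu$ with respect to $\mathcal{I}$. This means that, for every $f\in L^1(X,\mathcal{B},\mu)$, the function $y\mapsto\int_X f\,d\mu_y$ is in $L^1(X,\mathcal{I},\mu)$ (extended outside $Y$ as $0$) and for every $A\in\mathcal{I}$ we have
$$\int_A f\,d\mu=\int_A\int_X f\,d\mu_y\,d\mu(y)$$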
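Lemma 4 For $\mu$-a.e. $y\in Y$, the measure $\mu_y$ is $T$-invariant and ergodic.
Proof: We first prove that $\mu_y$ is $T$-invariant. Let $B\in\mathcal{B}$. Note that the function $y\mapsto\mu_y(B)$ is $\mathcal{I}$-measurable. Hence it is $T$-invariant, in the sense that $\mu_{Ty}(B)=\mu_y(B)$. Moreover, for every $A\in\mathcal{I}$ we have
$$\int_A\mu_y(T^{-1}B)\,d\mu(y)=\mu(A\cap T^{-1}B)$$
Since $A$ is $T$-invariant and $T$ preserves $\mu$ we have
$$\mu(A\cap T^{-1}B)=\mu\big(T^{-1}(A\cap B)\big)=\mu(A\cap B)=\int_A\mu_y(B)\,d\mu(y)$$
and hence $\int_A\big(\mu_y(T^{-1}B)-\mu_y(B)\big)\,d\mu(y)=0$ for all $A\in\mathcal{I}$. Since $y\mapsto\mu_y(T^{-1}B)-\mu_y(B)$ is $\mathcal{I}$-measurable we conclude that $\mu_y(T^{-1}B)-\mu_y(B)=0$ a.e. and hence $\mu_y(T^{-1}B)=\mu_y(B)$ for a.e. $y$. Since $\mathcal{B}$ is countably generated we conclude that $\mu_y$ is $T$-invariant for a.e. $y$.
We now show that $\mu_y$ is ergodic: Observe that, by construction, the invariant sets in $\mathcal{B}$ are exactly the sets in $\mathcal{I}$. But if $A\in\mathcal{I}$ then, for a.e. $y$, we have $\mu_y(A)=1$ if $y\in A$ and $\mu_y(A)=0$ if $y\notin A$. To see why this is true, let $f=1_A$. We have that $f$ is $\mathcal{I}$-measurable and $\int_B f\,d\mu=\int_B\mu_y(A)\,d\mu(y)$ for any set $B\in\mathcal{I}$. Thus $\mu_y(A)=1_A(y)$ for a.e. $y$, as desired. This shows that for every $T$-invariant set $A$ and a.e. $y$ we have $\mu_y(A)=0$ or $\mu_y(A)=1$. We conclude that $\mu_y$ is ergodic.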
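Lemma 5 Let $\mu$ and $\nu$ be $T$-invariant probability measures and let $(\mu_y)$ and $(\nu_y)$ be their disintegrations with respect to $\mathcal{I}$. Then $\mu_y=\nu_y$ at every point $y$ where both are defined.
Proof: Let $Y_\mu$ be the set where $\mu_y$ is defined (and thus $\mu(Y_\mu)=1$), and define $Y_\nu$ analogously.
For any $y\in Y_\mu\cap Y_\nu$ we claim that $\mu_y=\nu_y$. By Lemma 4 both $\mu_y$ and $\nu_y$ are ergodic. Hence it follows from the pointwise ergodic theorem that for any $f\in C(X)$ there exists a set $X_f\subset X$ with $\mu_y(X_f)=1$ such that for all $x\in X_f$ we have
$$\lim_{N\to\infty}\frac1N\sum_{n=1}^N f(T^nx)=\int_X f\,d\mu_y$$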
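The proof of Theorem 1 is now almost complete. Let $\mathcal{M}_T$ denote the set of all $T$-invariant probability measures on $(X,\mathcal{B})$ and for each $\mu\in\mathcal{M}_T$ let $Y_\mu\subset X$ be a set such that $\mu(Y_\mu)=1$ and $(\mu_y)_{y\in Y_\mu}$ is the disintegration of $\mu$ with respect to $\mathcal{I}$. Let $Y=\bigcup_{\mu\in\mathcal{M}_T}Y_\mu$. By Lemma 5 the measures $\mu_y$ are uniquely defined for $y\in Y$. By Lemma 4 each of the measures $\mu_y$ is $T$-invariant and $T$-ergodic.
By the properties of the disintegration of measures we have that for every bounded $\mathcal{B}$-measurable function $f:X\to\mathbb{R}$, the map $y\mapsto\int_X f\,d\mu_y$ is $\mathcal{I}$-measurable, and hence it is $\mathcal{B}$-measurable and $T$-invariant.
For any $T$-invariant measure $\mu$ we have that $Y_\mu\subset Y$ and since $\mu(Y_\mu)=1$ it follows that $Y$ is measurable with respect to the completion of the Borel $\sigma$-algebra and $\mu(Y)=1$. For any $f\in L^1(\mu)$ it follows from the properties of the disintegration of measures that
$$\int_X f\,d\mu=\int_Y\int_X f\,d\mu_y\,d\mu(y)$$
and this finishes the proof.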