## Weak Mixing

— 1. Introduction —

When studying measure preserving systems (defined below), there are many important classes that are worth studying separately. One way to distinguish between different classes is the level of “mixing” or “randomness” of the system. In this post, a measure preserving system is a quadruple ${\textbf{X}=(X,{\cal B},\mu,T)}$ where ${X}$ is a set, ${{\cal B}}$ is a ${\sigma}$-algebra on ${X}$, ${\mu}$ is a probability measure on ${{\cal B}}$ and ${T:X\rightarrow X}$ is an invertible bi-measurable map that preserves the measure, i.e., ${\mu(TA)=\mu(A)}$ for all ${A\in{\cal B}}$ (where, as usual, ${TA=\{Tx:x\in A\}}$). When they are clear from context, we may omit the reference to ${{\cal B}}$ or ${\mu}$.

For instance, if there exists a nontrivial set ${A}$ (nontrivial means that neither ${\mu(A)=0}$ nor ${\mu(A)=1}$) which is preserved under ${T}$, then the system is as far from mixing as possible: the set ${A}$ doesn’t mix with the rest of the system. A system where this does not happen is called ergodic. Hence ergodic systems are in some sense more “chaotic” than non-ergodic systems, because for any set ${A}$ with positive measure we have that ${A\cup TA\cup T^2A\cup\cdots}$ is the whole space ${X}$ (up to a set of measure ${0}$).

One of the best ways to express the mixing behavior of an ergodic system is the Ergodic Theorem:

Proposition 1 (von Neumann’s Ergodic Theorem) Let ${(X,{\cal B},\mu,T)}$ be an ergodic probability preserving system and let ${f,g\in L^2(X)}$. Then

$\displaystyle \lim_{n\rightarrow\infty}\frac1n\sum_{k=1}^n\int_Xf\circ T^k\cdot g\,d\mu=\int_Xf\,d\mu\cdot\int_Xg\,d\mu$

In particular, for ${A,B\in{\cal B}}$ we have

$\displaystyle \lim_{n\rightarrow\infty}\frac1n\sum_{k=1}^n\mu(T^kA\cap B)=\mu(A)\mu(B)$

This shows that ergodicity, which is easily seen to be equivalent to the property that for any two sets ${A,B\in{\cal B}}$ with ${\mu(A)>0,\mu(B)>0}$ there exists some ${n\in{\mathbb N}}$ such that ${\mu(T^nA\cap B)>0}$, implies that the measures of the intersections ${\mu(T^nA\cap B)}$ actually behave rather regularly. This also suggests some weak form of asymptotic independence (in a probabilistic sense) between any set ${B}$ and the orbit of a set ${A}$. In fact, a much stronger level of mixing is characterized precisely by this asymptotic independence:

Definition 2 (Strong mixing) A measure preserving system ${\textbf{X}=(X,{\cal B},\mu,T)}$ is called strongly mixing if for any two sets ${A,B\in{\cal B}}$ we have

$\displaystyle \lim_{n\rightarrow\infty}\mu(T^nA\cap B)=\mu(A)\mu(B)$

It is immediate that a strongly mixing system is always ergodic; however, the converse does not hold. An example to have in mind here is the family of circle rotations. Let ${\mathbb T={\mathbb R}/{\mathbb Z}}$ be the circle group and let ${\alpha\in{\mathbb R}}$. Consider the map ${T:\mathbb T\rightarrow\mathbb T}$ defined by ${Tx=x+\alpha}$ (the sum is taken modulo ${1}$). Then the system ${(\mathbb T,T)}$, with the Lebesgue measure, is ergodic if and only if ${\alpha\notin{\mathbb Q}}$. This can be proved directly, by analyzing the behavior of intervals, or by using Fourier analysis. However, this system is not strongly mixing! To see this, take ${A}$ and ${B}$ to be small intervals; it should be clear that there are arbitrarily large ${n}$ for which ${T^nA}$ does not intersect ${B}$. Intuitively speaking, this happens because the image of an interval is still an interval, so it does not spread out over ${\mathbb T}$.
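The behavior of the circle rotation can be explored numerically. The following is only a minimal sketch and not part of the argument: the helper `arc_overlap`, the angle ${\alpha=\sqrt2-1}$ (a floating point stand-in for an irrational number) and the intervals ${A=[0,0.1)}$, ${B=[0.5,0.6)}$ are illustrative assumptions. It computes ${\mu(T^nA\cap B)}$ as the length of the overlap of two arcs, then looks at the Cesàro average and at how often the overlap is empty.

```python
import math

def arc_overlap(a_start, a_len, b_start, b_len):
    """Length of the intersection of the arcs [a_start, a_start + a_len)
    and [b_start, b_start + b_len) in R/Z.
    Assumes 0 <= a_len <= 1 and 0 <= b_len <= 1."""
    s = (a_start - b_start) % 1.0   # start of arc A measured from the start of B
    # overlap of B = [0, b_len) with the part of A lying in [s, 1)
    first = max(0.0, min(s + a_len, b_len) - s)
    # overlap of B with the part of A that wrapped around to [0, s + a_len - 1)
    second = max(0.0, min(s + a_len - 1.0, b_len))
    return first + second

alpha = math.sqrt(2) - 1            # hypothetical choice of irrational angle
N = 20000
# mu(T^n A and B) for A = [0, 0.1), B = [0.5, 0.6), n = 1, ..., N
overlaps = [arc_overlap(n * alpha, 0.1, 0.5, 0.1) for n in range(1, N + 1)]

cesaro_average = sum(overlaps) / N  # compare with mu(A) * mu(B) = 0.01
fraction_disjoint = sum(1 for v in overlaps if v == 0.0) / N

print(cesaro_average, fraction_disjoint)
```

In this sketch the Cesàro average should be close to ${\mu(A)\mu(B)=0.01}$, as ergodicity predicts, while ${T^nA}$ misses ${B}$ entirely for a large proportion of the ${n}$, so the individual terms ${\mu(T^nA\cap B)}$ cannot converge to ${\mu(A)\mu(B)}$.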

In this post we will analyze a notion called weak mixing. A weakly mixing system is always ergodic, but not always strongly mixing. I will show that several properties of a measure preserving system are equivalent to weak mixing; the fact that so many distinct properties are equivalent gives some evidence that weak mixing is in some ways a better notion than strong mixing. Moreover, the notion of weak mixing plays a very important role in Furstenberg’s proof of Szemerédi’s Theorem. Instead of giving a definition of weak mixing, I will present a Theorem stating that several properties of a measure preserving system are equivalent. A system satisfying one (and hence all) of those properties is called weakly mixing.

Before I state the Theorem, I need to define the product system: given two measure preserving systems (m.p.s.) ${\textbf{X}=(X,{\cal B},\mu,T)}$ and ${\textbf{Y}=(Y,{\cal C},\nu,S)}$ we can form the product ${\textbf{X}\times\textbf{Y}=(X\times Y, {\cal B}\otimes{\cal C},\mu\otimes\nu,T\times S)}$ where ${{\cal B}\otimes{\cal C}}$ is the ${\sigma}$-algebra on the cartesian product ${X\times Y}$ generated by the rectangles (sets of the form ${B\times C}$ where ${B\in {\cal B}}$ and ${C\in {\cal C}}$) and ${\mu\otimes\nu}$ is the product measure, so that in particular ${(\mu\otimes\nu)(B\times C)=\mu(B)\nu(C)}$. Also, for a subset ${J\subset{\mathbb N}}$ we define its upper density by ${\displaystyle\bar d(J)=\limsup_{n\rightarrow\infty}\frac{|J\cap[1,n]|}n}$. Here, as usual, for a finite set ${E}$ we use the notation ${|E|}$ to denote its cardinality.

Theorem 3 Let ${\textbf{X}=(X,{\cal B},\mu,T)}$ be an m.p.s. Then the following are equivalent:

1. For any two sets ${A,B\in {\cal B}}$ we have ${\displaystyle\lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N|\mu(A\cap T^nB)-\mu(A)\mu(B)|=0}$
2. For any ${f,g\in L^2}$ we have ${\displaystyle\lim_{n\rightarrow\infty}\frac1n\sum_{k=1}^n\left|\int_Xf\circ T^k\cdot g\,d\mu-\int_Xf\,d\mu\int_Xg\,d\mu\right|=0}$
3. ${\textbf{X}\times\textbf{X}}$ is ergodic.
4. For every ergodic m.p.s. ${\textbf{Y}}$, the product ${\textbf{X}\times\textbf{Y}}$ is ergodic.
5. For any ${A,B\in {\cal B}}$ there exists a subset ${E\subset {\mathbb N}}$ with upper density ${\bar d(E)=0}$ such that ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^nA\cap B)=\mu(A)\mu(B)}$ for ${n\notin E}$. Moreover if ${{\cal B}}$ is separable we can choose ${E}$ independent of ${A,B}$.
6. For any ${A,B,C\in{\cal B}}$ with ${\mu(A)\mu(B)\mu(C)>0}$ there is ${n\in {\mathbb N}}$ such that ${\mu(A\cap T^nB)\mu(A\cap T^nC)>0}$.
7. If ${f\in L^2}$ and the orbit closure ${\overline{\{f\circ T^n:n\in{\mathbb N}\}}}$ is compact (both the closure and compactness are in the strong ${L^2}$ topology) then ${f}$ is a constant.
8. If ${f(Tx)=\lambda f(x)}$ a.e. for some function ${f\in L^2(X)}$ and some ${\lambda\in{\mathbb C}}$ then ${f}$ is a constant function.

In the next section I prove this Theorem; for now I will just present some remarks about it. The first condition is a strengthening of the conclusion of the ergodic theorem and is a clear consequence of the strongly mixing property. This explains why we call such systems weakly mixing. The second property is just a restatement of the first, with the characteristic functions replaced by general ${L^2}$ functions.

The third condition is more surprising, and it may be the easiest way to check that the circle rotations ${(\mathbb T,T)}$, defined above, are not weak mixing. This is also the easiest way to define weak mixing, and when extending the notion of weak mixing to other settings (relative weak mixing, weakly mixing unitary operators on Hilbert spaces, weakly mixing actions of groups other than ${{\mathbb Z}}$, etc.) this seems to be the best property to use. The fourth condition is even more surprising, as the implication (3)${\Rightarrow}$(4) seems quite unlikely a priori. This condition also implies that if ${\textbf{X}}$ is weak mixing, then ${\textbf{X}\times\textbf{X}\times\textbf{X}}$ is also ergodic, and so is ${\textbf{X}\times\textbf{X}\times\textbf{X}\times\textbf{X}}$. Therefore we get the amusing property that ${\textbf{X}\times\textbf{X}}$ is ergodic if and only if it is weak mixing!
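One possible way to see that the product of a circle rotation with itself is not ergodic is to exhibit a non-constant function on ${\mathbb T\times\mathbb T}$ that is invariant under ${T\times T}$. The sketch below checks this numerically for ${F(x,y)=e^{2\pi i(x-y)}}$; the function names `F` and `rotate`, the angle and the sample points are assumptions made only for this illustration.

```python
import cmath
import math

alpha = math.sqrt(2) - 1   # hypothetical irrational rotation angle

def F(x, y):
    """F(x, y) = exp(2*pi*i*(x - y)), a non-constant function on the torus."""
    return cmath.exp(2j * math.pi * (x - y))

def rotate(x):
    """The rotation T x = x + alpha (mod 1)."""
    return (x + alpha) % 1.0

sample_points = [(0.1, 0.7), (0.25, 0.25), (0.9, 0.3)]
for x, y in sample_points:
    # invariance under T x T: F(Tx, Ty) = F(x, y), since alpha cancels in x - y
    assert abs(F(rotate(x), rotate(y)) - F(x, y)) < 1e-12

# F takes different values at different points, so it is not constant
assert abs(F(0.1, 0.7) - F(0.25, 0.25)) > 0.5
```

The invariance holds because the shift by ${\alpha}$ cancels in the difference ${x-y}$, so the level sets of ${F}$ give nontrivial invariant sets for ${T\times T}$ and condition (3) fails.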

The fifth condition is another immediate weakening of the strongly mixing condition and should be compared with (1). To be completely clear, to say that ${\displaystyle\lim_{n\rightarrow\infty}a_n=L}$ for ${n\notin E}$ means that for all ${\epsilon>0}$ there is some ${N\in{\mathbb N}}$ such that for all ${n>N}$, ${n\notin E}$ we have ${|a_n-L|<\epsilon}$.
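To illustrate convergence outside a set of zero density, here is a small sketch with a made-up sequence: ${a_n=1}$ when ${n}$ is a perfect square and ${a_n=1/n}$ otherwise. The sequence, the exceptional set ${E}$ of perfect squares and the cutoff ${N=10^5}$ are assumptions chosen for illustration only. The sequence does not converge in the usual sense, but it converges to ${0}$ for ${n\notin E}$, and its Cesàro averages also tend to ${0}$.

```python
import math

def is_square(n):
    """True when n is a perfect square."""
    r = math.isqrt(n)
    return r * r == n

def a(n):
    """Hypothetical sequence: 1 on perfect squares, 1/n elsewhere."""
    return 1.0 if is_square(n) else 1.0 / n

N = 10**5
# |E intersected with [1, N]| for E = the set of perfect squares
E_count = sum(1 for n in range(1, N + 1) if is_square(n))
density_proxy = E_count / N          # finite-N proxy for the upper density of E

# Cesaro average of |a_n - 0| over n = 1, ..., N
cesaro = sum(abs(a(n)) for n in range(1, N + 1)) / N

# largest value of a_n on the second half of [1, N], ignoring n in E
tail_sup_outside_E = max(a(n) for n in range(N // 2, N + 1) if not is_square(n))

print(density_proxy, cesaro, tail_sup_outside_E)
```

The same mechanism is behind the equivalence of (1) and (5): a bounded sequence has Cesàro averages of absolute deviations tending to ${0}$ exactly when it converges outside a set of zero density.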

Condition (6) reflects the mixing philosophy as a strengthening of the ergodicity assumption. Moreover, it hints at why condition (3) is equivalent to the other properties. Finally, conditions (7) and (8) seem unrelated to the others a priori, but provide a very useful spectral characterization of weakly mixing systems.

— 2. Proof of Theorem 3 —

• (1)${\Rightarrow}$(5)

Fix ${m\in{\mathbb N}}$ and set ${A_m:=\{n\in{\mathbb N}:|\mu(T^nA\cap B)-\mu(A)\mu(B)|>1/m\}}$. Observe that

$\displaystyle \frac1N\sum_{n=1}^N|\mu(T^nA\cap B)-\mu(A)\mu(B)|\geq \frac1m\frac{|A_m\cap[1,N]|}N$

Taking the limit as ${N\rightarrow\infty}$ we conclude that ${\bar d(A_m)=0}$ for all ${m\in{\mathbb N}}$. For each ${m\in{\mathbb N}}$ let ${N_m}$ be the smallest positive integer such that for all ${N>N_m}$ we have ${|A_m\cap[1,N]|\leq N/m}$, and set

$\displaystyle E=\bigcup_{m=1}^\infty\left(A_m\cap[N_m+1,N_{m+1}]\right)$

Now observe that ${A_k\subset A_{k+1}}$ for all ${k\in{\mathbb N}}$, hence for each ${N\in{\mathbb N}}$, choosing ${m}$ such that ${N\in[N_m+1,N_{m+1}]}$ we have ${E\cap[1,N]\subset A_m\cap[1,N]}$ and hence ${|E\cap[1,N]|\leq N/m}$. Taking ${N\rightarrow\infty}$ (note that also ${m\rightarrow\infty}$ because all ${A_m}$ have ${0}$ density) we conclude that ${\bar d(E)=0}$.

Finally, fix ${m\in{\mathbb N}}$ and let ${N>N_m}$. If ${N\notin E}$ then we also have ${N\notin A_m}$ (because ${N\in[N_k+1,N_{k+1}]}$ for some ${k\geq m}$ and ${A_m\subset A_k}$), and so ${|\mu(T^NA\cap B)-\mu(A)\mu(B)|\leq1/m}$, concluding the proof.

In the case when ${{\cal B}}$ is separable, let ${\{B_n\}_{n=1}^\infty}$ be a countable dense family. For each ${m=(m_1,m_2)\in{\mathbb N}^2}$ let ${E_m\subset{\mathbb N}}$ be such that ${\bar d(E_m)=0}$ and ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^{-n}B_{m_1}\cap B_{m_2})=\mu(B_{m_1})\mu(B_{m_2})}$ for ${n\notin E_m}$. As above we construct a set ${E}$ of ${0}$ density such that for all ${m\in{\mathbb N}^2}$ there exists ${N=N(m)\in{\mathbb N}}$ such that ${E_m\setminus[1,N]\subset E}$. It is not hard to check that this set ${E}$ satisfies the conditions; we omit the details.

• (5)${\Rightarrow}$(6)

Let ${A,B,C\in{\cal B}}$ be such that ${\mu(A)\mu(B)\mu(C)>0}$, let ${E_1,E_2\subset{\mathbb N}}$ be such that ${\bar d(E_1)=0}$, ${\bar d(E_2)=0}$ and

$\displaystyle \lim_{n\rightarrow\infty}\mu(T^nB\cap A)=\mu(A)\mu(B)\text{ for }n\notin E_1;\qquad\lim_{n\rightarrow\infty}\mu(T^nC\cap A)=\mu(A)\mu(C)\text{ for }n\notin E_2$

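Since ${E_1\cup E_2}$ has upper density ${0}$, its complement in ${{\mathbb N}}$ is infinite. Hence we can find ${n\notin E_1\cup E_2}$ large enough that both ${\mu(T^nB\cap A)>\mu(A)\mu(B)/2}$ and ${\mu(T^nC\cap A)>\mu(A)\mu(C)/2}$; in particular ${\mu(A\cap T^nB)\mu(A\cap T^nC)>0}$.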

• (6)${\Rightarrow}$(8)

We proceed by contradiction. Assume that ${\textbf{X}}$ satisfies (6) but not (8). Then ${\textbf{X}}$ is ergodic and there is some non-constant eigenfunction ${f\in L^2(X)}$ with ${Tf=\lambda f}$ for some ${\lambda\in{\mathbb C}}$. Since ${T}$ preserves the measure we conclude that ${|\lambda|=1}$, and because the system is ergodic this implies that ${|f|}$ is a constant and hence we can assume ${|f|=1}$.

Let ${\theta\in(0,1)}$ be such that ${\lambda=\exp(\theta)}$, where ${\exp(t)=e^{2\pi it}}$. Since ${f}$ is not constant, we can find two disjoint intervals ${I_1,I_2\subset[0,1)}$, both of length smaller than some ${\epsilon}$ and that are more than ${\epsilon}$ apart (in the circle metric, so that ${1}$ and ${0}$ are identified as the same point) and such that both ${B:=\{x:f(x)\in\exp(I_1)\}}$ and ${C:=\{x:f(x)\in\exp(I_2)\}}$ have positive measure. Also make ${A=B}$. We now have that

$\displaystyle A\cap T^nB=\{x:f(x)\in\exp(I_1)\cap\exp(I_1+n\theta)\}=\{x:f(x)\in\exp[I_1\cap(I_1+n\theta)]\}$

$\displaystyle A\cap T^nC=\{x:f(x)\in\exp(I_1)\cap\exp(I_2+n\theta)\}=\{x:f(x)\in\exp[I_1\cap(I_2+n\theta)]\}$

But because the length of ${I_1}$ is smaller than ${\epsilon}$ and ${I_1}$ is more than ${\epsilon}$ apart from ${I_2}$ (and hence also ${I_1+n\theta}$ is more than ${\epsilon}$ apart from ${I_2+n\theta}$), we conclude that ${I_1}$ cannot intersect both ${I_1+n\theta}$ and ${I_2+n\theta}$ at the same time. In other words, for every ${n}$ either ${\mu(A\cap T^nB)=0}$ or ${\mu(A\cap T^nC)=0}$, contradicting (6) as desired.

• (7)${\Rightarrow}$(8)

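This is trivial: if ${f\circ T=\lambda f}$ with ${f\neq0}$ then ${|\lambda|=1}$ (because ${T}$ preserves the measure), so the orbit ${\{\lambda^nf:n\in{\mathbb N}\}}$ is contained in the compact set ${\{zf:|z|=1\}}$ and hence its closure is compact. By (7), ${f}$ is constant.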

• (2) ${\Rightarrow}$(7)

We proceed by contradiction. Assume that (2) holds but there exists some non-constant function ${f\in L^2}$ such that the orbit closure ${\overline{\{f\circ T^n:n\in{\mathbb N}\}}}$ is compact. Observe that replacing ${f}$ with ${f-\int_Xfd\mu}$ if needed, we can assume that ${\int_Xfd\mu=0}$. We want to prove that ${f=0}$, so for the sake of a contradiction let’s assume otherwise.

Let ${\epsilon<\|f\|/4}$ be positive and let ${f_1,...,f_m}$ be such that the balls ${B(f_i,\epsilon)}$ cover the orbit of ${f}$. Then for each ${n\in{\mathbb N}}$ there is some ${i}$ such that ${\|f\circ T^n-f_i\|<\epsilon}$. We have

$\displaystyle \begin{array}{rcl} \|f\|^2&=&\displaystyle\|f\circ T^n\|^2=\langle f\circ T^n,f_i\rangle+\langle f\circ T^n,f\circ T^n-f_i\rangle\\&\leq&\displaystyle\sum_{i=1}^m\left|\langle f\circ T^n,f_i\rangle\right|+\epsilon\|f\| \end{array}$

Thus, averaging over ${n=1,2,...,N}$ we obtain

$\displaystyle \|f\|^2=\frac1N\sum_{n=1}^N\|f\circ T^n\|^2\leq\sum_{i=1}^m\frac1N\sum_{n=1}^N\left|\langle f\circ T^n,f_i\rangle\right|+\epsilon\|f\|$

Now using (2) we can find ${N}$ such that the right hand side of the above equation is smaller than ${2\epsilon\|f\|<\|f\|^2/2}$ which is a contradiction.

• (8)${\Rightarrow}$(2)

This is essentially the hard part of the Koopman-von Neumann Decomposition. I posted before on this blog about this theorem, and so I will not present a proof here. Using the notation from that post, the Proposition 5 there says that every function in ${H_{wm}}$ satisfies our condition (2). Moreover, if there are no non-constant eigenfunctions for ${T}$, then ${H_c}$ is just the subspace of constant functions.

• (2)${\Rightarrow}$(4)

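Let ${f(x,y)=f_1(x)f_2(y)\in L^2(X\times Y)}$ and ${g(x,y)=g_1(x)g_2(y)\in L^2(X\times Y)}$, and assume first that ${\int_Xf_1d\mu=0}$. Applying Cauchy-Schwarz with ${f_2,g_2}$ and using (2) we get

$\displaystyle \begin{array}{rcl} \limsup_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\left|\langle f\circ(T\times S)^n,g\rangle\right|&=&\displaystyle\limsup_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\left|\langle f_1\circ T^n,g_1\rangle\langle f_2\circ S^n,g_2\rangle\right|\\&\leq&\displaystyle\|f_2\|\cdot\|g_2\|\limsup_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\left|\langle f_1\circ T^n,g_1\rangle\right|\\&=&0 \end{array}$

If instead ${f_1}$ is a constant ${c}$, then ${\langle f\circ(T\times S)^n,g\rangle=c\langle1,g_1\rangle\langle f_2\circ S^n,g_2\rangle}$, and since ${\textbf{Y}}$ is ergodic, Proposition 1 shows that the Cesàro averages of this expression converge to ${c\langle1,g_1\rangle\langle f_2,1\rangle\langle1,g_2\rangle=\langle f,1\rangle\langle1,g\rangle}$. Writing a general ${f_1}$ as ${(f_1-\int_Xf_1d\mu)+\int_Xf_1d\mu}$ we conclude that

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\langle f\circ(T\times S)^n,g\rangle=\langle f,1\rangle\langle1,g\rangle$

for all ${f,g}$ of the above form. By linearity we get the same result replacing ${f}$ and ${g}$ with finite linear combinations of functions of the form ${f_1(x)f_2(y)}$; in other words, we can choose ${f,g\in L^2(X)\otimes L^2(Y)}$. Since ${L^2(X)\otimes L^2(Y)}$ is dense in ${L^2(X\times Y)}$ and ${|\langle f\circ(T\times S)^n,g\rangle|\leq\|f\|\cdot\|g\|}$, we conclude that the same result holds for any ${f,g\in L^2(X\times Y)}$.

Finally, if ${A\subset X\times Y}$ is invariant under ${T\times S}$, taking ${f=g=1_A}$ gives ${(\mu\otimes\nu)(A)=(\mu\otimes\nu)(A)^2}$, so ${A}$ has measure ${0}$ or ${1}$. Hence ${\textbf{X}\times\textbf{Y}}$ is ergodic.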

• (4)${\Rightarrow}$(3)

It suffices to show that if (4) holds, then ${\textbf{X}}$ is ergodic, because we can then apply (4) with ${\textbf{Y}=\textbf{X}}$. To see this, assume that ${\textbf{X}}$ is not ergodic and let ${A\in{\cal B}}$ be an invariant set such that ${0<\mu(A)<1}$. Let ${\textbf{Y}=(Y,S)}$ be the (ergodic) one point system. Then ${A\times Y}$ is invariant for ${T\times S}$, and so ${\textbf{X}\times\textbf{Y}}$ would not be ergodic, contradicting (4).

• (3)${\Rightarrow}$(1)

Applying the von Neumann’s Ergodic Theorem (Proposition 1) to the characteristic functions ${1_{A\times A}\in L^2(X\times X)}$ and ${1_{B\times B}\in L^2(X\times X)}$ we obtain

${\displaystyle\lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\int_{X\times X}1_{A\times A}\circ(T\times T)^n.1_{B\times B}d(\mu\otimes\mu)=}$

${\displaystyle=\int_{X\times X}1_{A\times A}d(\mu\otimes\mu)\int_{X\times X}1_{B\times B}d(\mu\otimes\mu)}$

The left hand side of the above equation is

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\mu(T^nA\cap B)^2$

and the right hand side is ${\mu(A)^2\mu(B)^2}$. Since ${\textbf{X}\times\textbf{X}}$ is ergodic, so is ${\textbf{X}}$ (if ${A}$ is invariant under ${T}$ then ${A\times X}$ is invariant under ${T\times T}$), and hence Proposition 1 also shows that the Cesàro averages of ${\mu(T^nA\cap B)}$ converge to ${\mu(A)\mu(B)}$. Thus we can conclude (applying the Cauchy-Schwarz inequality and the ergodic theorem):

${\displaystyle\limsup_{N\rightarrow\infty}\left(\frac1N\sum_{n=1}^N\left|\mu(T^nA\cap B)-\mu(A)\mu(B)\right|\right)^2\leq}$

${\displaystyle\leq \limsup_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\left[\mu(T^nA\cap B)-\mu(A)\mu(B)\right]^2}$

${\displaystyle=\lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\left[\mu(T^nA\cap B)^2-2\mu(A)\mu(B)\mu(T^nA\cap B)+\mu(A)^2\mu(B)^2\right]=0}$

— 3. Some more equivalent notions —

In this section we give three more properties of a measure preserving system that happen to be equivalent to weak mixing. The first proposition shows that weak mixing and “totally” weak mixing (in analogy with total ergodicity) turn out to be the same concept for ${{\mathbb Z}}$ actions.

Proposition 4 Let ${\textbf{X}=(X,{\cal B},\mu, T)}$ be a measure preserving system and let ${k\in{\mathbb N}}$ be a positive integer. Then ${\textbf{X}}$ is weak mixing if and only if the system ${(X,{\cal B},\mu, T^k)}$ is.

Proof: We will use condition (5) from Theorem 3. Assume first that ${\textbf{X}}$ is weak mixing and let ${A,B\in{\cal B}}$ be measurable sets. Let ${E\subset{\mathbb N}}$ be a set with ${0}$ density such that ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^nA\cap B)=\mu(A)\mu(B)}$ for ${n\notin E}$. Then the set ${E/k:=\{n\in{\mathbb N}:nk\in E\}}$ also has ${0}$ density (because ${|(E/k)\cap[1,N]|\leq|E\cap[1,kN]|}$) and clearly ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^{kn}A\cap B)=\mu(A)\mu(B)}$ for ${n\notin E/k}$.

Now assume that the system ${(X,{\cal B},\mu, T^k)}$ is weak mixing and let ${A,B\in{\cal B}}$ be measurable sets. For each ${i=0,...,k-1}$, let ${E_i\subset{\mathbb N}}$ be a set with ${0}$ density such that ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^{kn}T^iA\cap B)=\mu(A)\mu(B)}$ for ${n\notin E_i}$ (recall that ${\mu(T^iA)=\mu(A)}$). Then the set ${E:=\bigcup_{i=0}^{k-1}(kE_i+i)}$, where ${kE_i+i=\{kn+i:n\in E_i\}}$, still has ${0}$ density and, writing each ${n}$ as ${n=kj+i}$ with ${0\leq i<k}$, we see that ${\displaystyle \lim_{n\rightarrow\infty}\mu(T^nA\cap B)=\mu(A)\mu(B)}$ for ${n\notin E}$. $\Box$

For the next two theorems we will need to use the following generalization of the classical van der Corput Lemma due to Bergelson:

Proposition 5 [vdC generalized] Let ${u_n}$ be a bounded sequence in a Hilbert space ${{\cal H}}$. If

$\displaystyle \lim_{H\rightarrow\infty}\frac1H\sum_{h=1}^H\limsup_{N\rightarrow\infty}\left|\frac1N\sum_{n=1}^N\langle u_{n+h},u_n\rangle\right|=0$

then ${\frac1N\sum_{n=1}^N u_n\rightarrow0}$ in norm as ${N\rightarrow\infty}$.

I have used this version before on this blog; its proof can be found in this post.
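As a sanity check of how Proposition 5 is used, here is a rough numerical sketch in the simplest Hilbert space, the complex numbers. The sequence ${u_n=e^{2\pi in^2\alpha}}$, the value ${\alpha=\sqrt2}$ and the cutoff ${N}$ are hypothetical choices for illustration and do not come from the post. For this sequence ${\langle u_{n+h},u_n\rangle=e^{2\pi i(2nh+h^2)\alpha}}$, whose averages over ${n}$ are geometric sums and tend to ${0}$ for each ${h\geq1}$, so the proposition predicts that the averages of ${u_n}$ tend to ${0}$ as well.

```python
import cmath
import math

alpha = math.sqrt(2)      # hypothetical irrational number
N = 50000                 # finite cutoff standing in for the limit N -> infinity

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def u(n):
    """Hypothetical bounded sequence u_n = e(n^2 * alpha), viewed in the Hilbert space C."""
    return e((n * n * alpha) % 1.0)

def correlation_average(h):
    """|(1/N) * sum_{n=1}^{N} <u_{n+h}, u_n>|, with <z, w> = z * conj(w)."""
    total = sum(u(n + h) * u(n).conjugate() for n in range(1, N + 1))
    return abs(total) / N

# size of the Cesaro average of u_n itself
norm_of_average = abs(sum(u(n) for n in range(1, N + 1))) / N

print([correlation_average(h) for h in (1, 2, 3)], norm_of_average)
```

At a finite ${N}$ this is only suggestive, but both the correlation averages and the average of ${u_n}$ should come out small, in line with the statement of the proposition.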

The next theorem states that a system is weak mixing if and only if the sets ${A,T^nA,T^{2n}A}$ are asymptotically independent, in the same spirit as the ergodic theorem, which states that a system is ergodic if and only if the sets ${A,T^nA}$ are asymptotically independent.

Theorem 6 Let ${(X,{\cal B},\mu,T)}$ be an invertible measure preserving system. Then the system is weak-mixing if and only if for any set ${A\in{\cal B}}$ we have

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\mu(A\cap T^{-n} A\cap T^{-2n}A)=\mu(A)^3$

Proof: We only prove one implication (that weak mixing implies this property). The derivation of the other implication is more complicated.

Let ${f=1_A}$ and for each ${n\geq1}$ let ${f_n=f\circ T^n=1_{T^{-n}A}}$. Also let ${g=f-\mu(A)}$ (so that ${\int_Xgd\mu=0}$) and let ${g_n=g\circ T^n}$. Note that ${f_n-g_n}$ is the constant function ${\mu(A)}$ for any ${n}$.

Note that ${\mu(A\cap T^{-n} A\cap T^{-2n}A)=\int_Xff_nf_{2n}d\mu}$. Since ${T}$ is weak mixing, ${T^2}$ is still ergodic, so we have

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\int_Xff_{2n}d\mu=\int_Xfd\mu\int_Xfd\mu=\mu(A)^2$

so ${\int_Xf(f_n-g_n)f_{2n}d\mu}$ converges in the Cesàro sense to ${\mu(A)^3}$. Therefore to conclude the proof of the first implication it suffices to show that

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\int_Xfg_nf_{2n}d\mu=0$

This will be achieved if we prove that ${g_nf_{2n}}$ converges to ${0}$ in the Cesàro sense and the weak topology. We will use the van der Corput trick. Let ${u_n=g_nf_{2n}}$. Thus from invariance of ${\mu}$ under ${T}$

$\displaystyle \begin{array}{rcl} \langle u_{n+h},u_n\rangle&=&\displaystyle\int_Xg_{n+h}f_{2n+2h}g_nf_{2n}d\mu=\int_Xg_hf_{n+2h}gf_nd\mu\\ &=&\displaystyle\int_X(g_hg)((f_{2h}f)\circ T^n)d\mu \end{array}$

Applying again the ergodic theorem we thus conclude that

$\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\langle u_{n+h},u_n\rangle= \left(\int_Xg_hgd\mu\right)\left(\int_Xf_{2h}fd\mu\right)$

Since ${\displaystyle\int_Xf_{2h}fd\mu\leq1}$ for any ${h}$, we have

$\displaystyle \limsup_{N\rightarrow\infty}\left|\frac1N\sum_{n=1}^N\langle u_{n+h},u_n\rangle\right|=\left|\left(\int_Xg_hgd\mu\right)\left(\int_Xf_{2h}fd\mu\right)\right|\leq\left|\int_Xg_hgd\mu\right|$

and so, since ${T}$ is weak mixing, we conclude that

$\displaystyle \lim_{H\rightarrow\infty}\frac1H\sum_{h=1}^H\limsup_{N\rightarrow\infty}\left|\frac1N\sum_{n=1}^N\langle u_{n+h},u_n\rangle\right|=0$

Now the hypotheses of the van der Corput trick (Proposition 5) are satisfied, and we can conclude that indeed ${g_nf_{2n}}$ converges to ${0}$ in the Cesàro sense which, as we saw above, implies the desired convergence. $\Box$

The next theorem states that the sets ${A,T^nA,T^mA,T^{n+m}A}$ become asymptotically independent as ${n,m\rightarrow\infty}$.

Theorem 7 Let ${(X,{\cal B},\mu,T)}$ be an invertible measure preserving system. Then the system is weak-mixing if and only if for any set ${A\in{\cal B}}$ we have

$\displaystyle \lim_{N,M\rightarrow\infty}\frac1{NM}\sum_{n=1}^N\sum_{m=1}^M\mu(A\cap T^{-n} A\cap T^{-m}A\cap T^{-n-m}A)=\mu(A)^4$

Proof: Again we only prove the easy implication: we assume that the system is weak mixing.

Fix ${A\in{\cal B}}$ and fix ${\epsilon>0}$. For each ${n}$ let ${B_n=A\cap T^{-n}A}$. Let ${N_0}$ be such that, for ${N>N_0}$ we have

$\displaystyle \frac1N\sum_{n=1}^N\left|\mu(B_n)-\mu(A)^2\right|<\frac\epsilon4$

Note that for such ${N}$ we get

$\displaystyle \begin{array}{rcl} \left|\frac1N\sum_{n=1}^N\mu(B_n)^2-\mu(A)^4\right|&\leq&\displaystyle\frac1N\sum_{n=1}^N\left|\mu(B_n)^2-\mu(A)^4\right|\\&=& \displaystyle\frac1N\sum_{n=1}^N\left|\mu(B_n)-\mu(A)^2\right|.\left|\mu(B_n)+\mu(A)^2\right|\\&\leq& \displaystyle2\frac1N\sum_{n=1}^N\left|\mu(B_n)-\mu(A)^2\right|\\ &<&\displaystyle\frac\epsilon2 \end{array}$

For each ${N}$ let ${M_N}$ be such that for ${M>M_N}$ we have, for each ${n\leq N}$:

$\displaystyle \frac1M\sum_{m=1}^M\left|\mu(B_n\cap T^{-m}B_n)-\mu(B_n)^2\right|<\frac\epsilon2$

Therefore, if ${N>N_0}$ and ${M>M_N}$, using that ${A\cap T^{-n} A\cap T^{-m}A\cap T^{-n-m}A=B_n\cap T^{-m}B_n}$, we get

$\displaystyle \begin{array}{rcl} &&\displaystyle\left|\frac1{NM}\sum_{n=1}^N\sum_{m=1}^M\mu(A\cap T^{-n} A\cap T^{-m}A\cap T^{-n-m}A)-\mu(A)^4\right|\\ &\leq&\displaystyle\left|\frac1N\sum_{n=1}^N\mu(B_n)^2-\mu(A)^4\right|+\frac1N\sum_{n=1}^N\left|\frac1M\sum_{m=1}^M\mu(B_n\cap T^{-m}B_n)-\mu(B_n)^2\right|\\ &\leq &\epsilon\end{array}$

$\Box$

Our final criterion for weak mixing is related to limits along ultrafilters. We recall that given an ultrafilter ${p}$ and a sequence ${\{x_n\}}$ taking values in some Hausdorff space, we define the limit of ${\{x_n\}}$ along ${p}$ to be the point ${x}$ (if it exists) such that for every neighborhood ${U}$ of ${x}$ the set ${\{n:x_n\in U\}}$ is in ${p}$. We recall that if the space is compact, then every sequence converges along ${p}$ (to a unique limit). I posted about this before and I gave a proof of that fact there.

We will be interested in a special class of ultrafilters, called minimal idempotent ultrafilters. A nice introduction to this topic is given in this survey by Bergelson. I explored some properties of general idempotent ultrafilters in a previous post. Minimal idempotents are a subclass of idempotent ultrafilters; for us it will suffice to know that if ${p}$ is such an ultrafilter and ${A\in p}$ then ${A}$ is piecewise syndetic, and in particular, the set ${A-A}$ is syndetic.

Theorem 8 Let ${(X,{\cal B},\mu,T)}$ be a measure preserving system and let ${p}$ be a minimal idempotent. Then the system is weak mixing if and only if for every ${f\in L^2}$ we have

$\displaystyle p\lim f\circ T^n=\int_Xfd\mu\qquad\text{weakly in }L^2$

Proof: Assume first that the system is not weak mixing. By the property (8) of the Theorem 3 there exists some non-constant ${f\in L^2}$ such that ${f(Tx)=\lambda f(x)}$ a.e. for some ${\lambda\in{\mathbb C}}$. We can assume that ${\int_Xfd\mu=0}$. But then

$\displaystyle p\lim f\circ T^n=p\lim\lambda^nf=f.p\lim\lambda^n$

We recall that ${|\lambda|=1}$ because ${T}$ preserves the measure. Since ${p}$ is idempotent we get

$\displaystyle p\lim\lambda^n=(p+p)\lim\lambda^n=p\lim_np\lim_m\lambda^{n+m}=p\lim_n\lambda^np\lim_m\lambda^m=(p\lim\lambda^n)^2$

Therefore ${p\lim\lambda^n=1}$ and hence ${p\lim f\circ T^n=f\neq0=\int_Xfd\mu}$.

Now we assume that the system is weak mixing. Let ${f\in L^2}$ and let ${g=p\lim f\circ T^n}$ (note that the limit always exists in the weak topology because the ball with radius ${\|f\|}$ is compact). Then because ${p}$ is idempotent we get that

$\displaystyle p\lim g\circ T^n=p\lim_np\lim_m f\circ T^m\circ T^n=p\lim_np\lim_m f\circ T^{n+m}=p\lim f\circ T^n=g$

We observe that

$\displaystyle \|g-g\circ T^n\|^2=2\|g\|^2-2\langle g,g\circ T^n\rangle=2\langle g,g-g\circ T^n\rangle$

so, taking the limit along ${p}$ and using that ${p\lim g\circ T^n=g}$ weakly, in this case ${p\lim g\circ T^n=g}$ in the strong topology. Now fix ${\epsilon>0}$ and note that ${A:=\{n\in{\mathbb N}:\|g-g\circ T^n\|<\epsilon/2\}\in p}$. But then, for any ${n,m\in A}$ we have ${\|g\circ T^m-g\circ T^n\|<\epsilon}$ and thus for any ${n\in A-A}$ we have that ${\|g-g\circ T^n\|<\epsilon}$. Because ${A-A}$ is syndetic, there is some ${N\in{\mathbb N}}$ such that any ${n\in{\mathbb N}}$ can be decomposed as ${n=a+i}$ where ${a\in A-A}$ and ${1\leq i\leq N}$. Therefore the balls ${B(g\circ T^i,\epsilon)}$ with ${i=1,...,N}$ cover the orbit of ${g}$ and hence the orbit closure of ${g}$ is compact. Since the system is weak mixing we conclude by condition (7) of Theorem 3 that ${g}$ must be constant.

To conclude that ${g}$ is indeed ${\int_Xfd\mu}$ just note that the inner product ${\langle f\circ T^n,1\rangle}$ (where ${1}$ is the constant function equal to ${1}$) is always ${\int_Xfd\mu}$. $\Box$
