Single and multiple recurrence along non-polynomial sequences

Vitaly Bergelson, Florian Richter and I have recently uploaded to the arXiv our new paper “Single and multiple recurrence along non-polynomial sequences”. In this paper we address the question of what combinatorial structure is present in the set of return times of a non-polynomial sequence, i.e., sets of the form

\displaystyle   R=\big\{n\in{\mathbb N}:\mu(A\cap T^{-f(n)}A)>\mu^2(A)-\epsilon\big\} \ \ \ \ \ (1)

where {(X,\mu,T)} is a measure preserving system, {A\subset X}, {\epsilon>0} and {f:{\mathbb N}\rightarrow{\mathbb N}} is some non-polynomial function.

The departure point for this paper was the observation that when {f(n)} is a polynomial with integer coefficients and {f(0)=0}, set {R} in (1) is syndetic, i.e. has bounded gaps (for the special case {f(n)=n} this is the content of Khintchine’s theorem, and the general case was obtained by Furstenberg, see a proof in my earlier post), but the same is no longer true in general for non-polynomial sequences {f(n)}.

For instance when {f(n)=\lfloor n^a\rfloor} for some positive {a\in{\mathbb R}\setminus{\mathbb Z}} (where {\lfloor \cdot\rfloor} denotes the floor function) it is known that the set {R} from (1) is nonempty, and in fact has positive density but is not (in general) syndetic.

Our first theorem shows that nevertheless, the set {R} has an interesting (and a-priori surprising) combinatorial property, it is thick, i.e., contains arbitrarily long intervals of integers. One reason why this is surprising is that the notions of thick and syndetic sets are dual in the sense that a set is thick if and only if its complement is not syndetic and vice-versa. See my previous post for some discussion of the basic properties of thick and syndetic sets.

Theorem 1 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then the set {R} defined in (1) for {f(n)=\lfloor n^a\rfloor} is thick.

Theorem 1 implies in particular that for every {k\in{\mathbb N}} there exists some {n\in{\mathbb N}} such that each of the sets {T^{-f(n)}A}, {T^{-f(n+1)}A}, … , {T^{-f(n+k)}A} have a large intersection with {A}. Our second theorem proves that in fact all of these sets intersect together.

Theorem 2 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then for every {k\in{\mathbb N}} there exist (many) {n\in{\mathbb N}} such that

\displaystyle \mu\Big(A\cap T^{-f(n)}A\cap T^{-f(n+1)}A\cap\cdots\cap T^{-f(n+k)}A\Big)>0

where {f(n)=\lfloor n^a\rfloor}.

Note that both Theorems 1 and 2 fail when {f(n)} is a polynomial.

In the paper we deal with a considerably more general class of non-polynomial functions (see Definition 4 below) but for now I will restrict attention to the example {f(n)=\lfloor n^a\rfloor}. Continue reading

Posted in Ergodic Theory, paper | Tagged , , , , , | 2 Comments

Erdős Sumset conjecture

Hindman’s finite sums theorem is one of the most famous and useful theorems in Ramsey theory. It states that for any finite partition of the natural numbers, one of the cells {C} of this partition contains an IP-set, i.e., there exists an infinite set {I\subset{\mathbb N}} such that for any finite subset {F\subset I} the sum {\sum_{x\in F}x\in C}.

This result is a partition result with no obvious density counterpart. Indeed it is easy to see that the set {2{\mathbb N}-1} of odd numbers (which has density {1/2}) can not have an IP-set, since for any {x,y\in{\mathbb N}}, either {x}, {y} or {x+y} must be even. One could still hope that any set with positive density can be shifted in order to contain an IP-set. Indeed, for any set of positive Banach density {A} there is a shift {A-k} which contains a triple of the form {\{x,y,x+y\}}, and often configurations which are partition regular are also in some shift of any set with positive density (however this is not always true, even for configurations with {2} elements, as shown by Kriz’s theorem).

Nevertheless, E. Straus constructed an example of a set {A} with density

\displaystyle d(A):=\lim_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N

arbitrarily close to {1} and such that no shift {A-k} with {k\in{\mathbb N}} contains an IP-set. (Straus never published this result, see Theorem 11.6 in this chapter by Hindman or ) or Theorem 2.20 in this more recent paper.

Observe that by splitting an infinite set {I} into two infinite pieces {B,C}, it follows that any IP-set contains the sum {B+C:=\{b+c:b\in B,c\in C\}} for infinite sets {B} and {C}. Therefore the following fact can be seen as a weakening of Hindman’s theorem:

Proposition 1 For any finite partition {{\mathbb N}=C_1\cup\dots\cup C_r} of the natural numbers there exists a color {A\in\{C_1,\dots,C_r\}} and infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

Unlike IP-sets, the configuration {B+C} is invariant under shifts, so it makes sense to ask whether any set with positive density contains such configuration.

Conjecture 2 (Erdős sumset conjecture) Let {A\subset{\mathbb N}} and assume that the Banach density

\displaystyle d^*(A):=\lim_{N\rightarrow\infty}\max_{M\in{\mathbb N}}\frac{|A\cap\{M+1,M+2,\dots,M+N\}|}N

is positive. Then there exist infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

This past week I was at AIM in San Jose for a workshop on Nonstandard methods in combinatorial number theory and there was some interesting discussion surrounding this conjecture. A few years ago there was some significant progress in this direction, in a paper by Di Nasso, Golbring, Jin, Leth, Lupini and Mahlburg using nonstandard analysis, and perhaps because all the authors were present at this workshop there was some hope of extending the ideas developed in the paper to address the general case. In this post I present a standard rendering of the proof from the DGJLLM paper as well as some of the observations made during the workshop. Continue reading

Posted in Combinatorics, Number Theory, State of the art | Tagged , , , , , , , | 4 Comments

An arithmetic van der Corput trick and the polynomial van der Waerden theorem

The van der Corput difference theorem (or van der Corput trick) was developed (unsurprisingly) by van der Corput, and deals with uniform distribution of sequences in the torus.

Theorem 1 (van der Corput trick) Let {(x_n)_{n\in{\mathbb N}}} be a sequence in a torus {{\mathbb T}^d:={\mathbb R}^d/{\mathbb Z}^d}. If for every {h\in{\mathbb N}} the sequence {(x_{n+h}-x_n)_{n\in{\mathbb N}}} is uniformly distributed on {{\mathbb T}^d}, then so is {(x_n)_{n\in{\mathbb N}}}.

In 1987, Bergelson observed that a similar trick works in Hilbert spaces and it is this version which turned out to be amazingly useful in ergodic theory. Here is a consequence of the “Hilbertian” van der Corput trick of Bergelson:

Theorem 2 Let {(X,mu,T)} be a measure preserving system, let {k\in{\mathbb N}} and let {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}}. If

\displaystyle \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\int_Xf_0\cdot\prod_{p\in{\mathcal P}}T^{p(n)}f_pd\mu=0\qquad \text{ for any }f_p\in L^\infty(X) \ \ \ \ \ (1)

holds with {{\mathcal P}=\Big\{p_\ell(n+h)-p_1(n):\ell\in[k]\Big\}\cup\Big\{p_\ell(n)-p_1(n):\ell\in[k]\Big\}} for every {h\in{\mathbb N}}, then (1) holds with {{\mathcal P}=\{p_1,\dots,p_k\}}.

Then in 1996, Bergelson and Leibman adapted the same ideology to topological dynamics. Unfortunately, they didn’t explicitly state an analogue of the difference theorem in topological dynamics (such an explicit statement is Lemma 8.5 in this paper). While it is quite a stretch to call such an analogue as a van der Corput trick (so far from uniform distribution it is) it still expresses the same idea of exploiting the “derivatives” in order to obtain a result about the “original function”.

Here I take this analogy one step further, and using the connection between topological dynamics and arithmetic Ramsey theory powered by piecewise syndetic sets, I will present an arithmetic analogue of the van der Corput trick in {{\mathbb N}}. First it will be convenient to have the following definition:

Definition 3 Let {k\in{\mathbb N}} and {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}}. We say that the family {\{p_1,\dots,p_k\}} is a family of multiple recurrence if for any finite coloring (i.e. partition) of the natural numbers there exists {a,n\in{\mathbb N}} such that the set

\displaystyle \big\{a,a+p_1(n),\dots,a+p_k(n)\big\}

is monochromatic.

Theorem 4 (Arithmetic van der Corput trick) Let {k\in{\mathbb N}} and {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}} be such that {p_i(0)=0}. Assume that for any finite set {F\subset{\mathbb N}\cup\{0\}} the family

\displaystyle \Big\{x\mapsto p_i(x+h)-p_1(x)-p_i(h)\mid h\in F,i\in\{1,\dots k\}\Big\}

is a family of multiple recurrence. Then the family {\{p_1,\dots,p_k\}} is a family of multiple recurrence as well.

In this post I will give a proof of Theorem 4 and then I will illustrate its power by deriving from it a short proof of the polynomial van der Waerden theorem of Bergelson and Leibman. Continue reading

Posted in Combinatorics, Ramsey Theory, Tool | Tagged , , , , , , , , , | Leave a comment

Piecewise syndetic sets, topological dynamics and ultrafilters

In this post I explore the notion of piecewise syndeticity and its relation to topological dynamical systems and the Stone-Čech compactification. I restrict attention to the additive semigroup {({\mathbb N},+)} but most results presented are true in much bigger generality (and I tried to present the proofs in a way that does’t depend crucially on the fact that the semigroup is {({\mathbb N},+)}). These results and observations are collected from several sources, and are scattered in the literature so I thought it would be nice to collect them all in one place. Probably the union of Furstenberg’s book with the book by Hindman and Strauss contains most (if not all) of the things in here; these surveys of Bergelson are also a good source.

— 1. Introduction —

There are several notions of largeness for subsets of {{\mathbb N}}, each notion with its advantages and disadvantages. One such notion is that of syndetic set, i.e., a set {S\subset{\mathbb N}} with bounded gaps. More precisely, {S} is syndetic if there exists a finite set {F} such that any natural number can be written as {a-b} with {a\in S} and {b\in F}.

Example 1 The following are syndetic sets:

  • The set {{\mathbb N}} of all natural numbers,
  • Any co-finite subset {S\subset{\mathbb N}},
  • The set {2{\mathbb N}} of even numbers,
  • Any infinite progression of the form {a{\mathbb N}+b},
  • The set {\{n\in{\mathbb N}:d(n\alpha,{\mathbb N})<\epsilon\}} for any {\alpha,\epsilon>0}, where {d(x,{\mathbb N})} is the distance between the positive real number {x} and the lattice of natural numbers.
  • The set {\{n\in{\mathbb N}:d(f(n),{\mathbb N})<\epsilon\}} for any {\epsilon>0} and any {f\in{\mathbb R}[x]} with an irrational coefficient (other than the constant term).

One problem with syndeticity is that this notion is not partition regular, i.e. if {S} is a syndetic set and we decompose {S=A\cup B}, it may happen that neither {A} nor {B} are syndetic (take {S={\mathbb N}} and {A} and {B} consisting of increasingly long alternating sequences of intervals). One way to fix this is to consider its dual family and then its intersection family (spoiler alert: we end up with piecewise syndetic sets).

Continue reading

Posted in Classic results, Combinatorics, Tool, Topological Dynamics | Tagged , , | 3 Comments

Measure preserving actions of affine semigroups and {x+y,xy} patterns

Vitaly Bergelson and I have recently submitted to the arXiv our paper entitled `Measure preserving actions of affine semigroups and {\{x+y,xy\}} patterns’. The main purpose of this paper is to extend the results of our previous paper, establishing some partial progress towards the following (still open) conjecture:

Conjecture 1 Let {{\mathbb N}=C_1\cup\cdots\cup C_r} be an arbitrary finite partition of the natural numbers. Then there exists {i\in\{1,\dots,r\}} and infinitely many {x,y\in{\mathbb N}} such that

\displaystyle \{x+y,xy\}\subset C_i

This conjecture is the simplest unknown case of a much stronger conjecture pertaining various monochromatic polynomial configurations, which is itself only a piece in the wide open problem of classifying which polynomial configurations are partition regular over {{\mathbb N}}; see this post for another recent result in this direction.

Continue reading

Posted in Combinatorics, paper, Ramsey Theory | Tagged , , , , , , , | 2 Comments

Szemerédi Theorem Part VI – Dichotomy between weak mixing and compact extension

This is the sixth and final post in a series about Szemerédi’s theorem. In this post I complete the proof of the Multiple Recurrence Theorem, which I showed in a previous post of this series to be equivalent to Szemerédi’s theorem. The last step missing (cf. third point of Theorem 7 in my previous post) is the following:

Proposition 1 Let {(X,{\cal B},\mu,T)} be a measure preserving system and let {{\mathcal A}\subset{\mathcal B}} be factor. If the extension {{\cal A}\subset{\cal B}} is not weak mixing, then there exists a non-trivial extension {{\cal A}\subset{\cal D}\subset{\cal B}} of which is compact.

The definitions are all precisely given in this earlier post from this series. Most of this post is adapted from the Chapter 7.8 of the book by Einsiedler and Ward. Continue reading

Posted in Ergodic Theory, Ramsey Theory | Tagged , , , | Leave a comment

Gaussian systems

Examples of measure preserving systems with varied behaviours are vital in ergodic theory, to understand the general properties and to have counter examples to false statements. One classical method to craft examples with specific properties is the so-called Gaussian construction. In this post I define and give simple examples of applications of this construction. I took most of the material for this post from the book of Cornfeld, Fomin and Sinai.

— 1. Definition of Gaussian systems —

Definition 1

  1. A probability measure {\nu} on the Borel sets of {{\mathbb R}} is called Gaussian if there exists a pair {(\bar\nu,\sigma)\in{\mathbb R}\times{\mathbb R}^{\geq0}} such that

    \displaystyle \forall A\subset{\mathbb R}\text{ Borel }\qquad\nu(A)=\frac1{\sigma\sqrt{2\pi}}\int_A\exp\left(\frac{-(x-\bar\nu)^2}{2\sigma^2}\right)dx \ \ \ \ \ (1)

  2. Given a probability space {(X,{\mathcal B},\mu)} and a Borel measurable function {f:X\rightarrow{\mathbb R}}, we say that {f} is a Gaussian function if the pushforward {\nu:=f_*\mu} is a Gaussian measure.

Using probabilistic terminology, {f} is a random variable, {\bar\nu=\int_Xfd\mu} is the expectation (or mean or first moment) of {f} and {\sigma=\|f-\bar\nu\|_{L^2}} is the standard deviation. Since by subtracting a constant from {f} we obtain {\int fd\mu=\bar\nu=0}, we will from now on assume that every Gaussian function has mean {0}. The following lemma is well known in probability theory and the proof will not be provided.

Lemma 2 If {H\subset L^2(X)} is a (real) subspace and every {f\in H} is a Gaussian function, then any {f\in\overline{H}} in the {L^2} closure of {H} is also a Gaussian function.

Next we define Gaussian measure preserving systems.

Definition 3 Let {(X,{\mathcal B},\mu,T)} be a measure preserving system and let {f\in L^2(X)}.

  1. We say that {f} is Gauss element if any function in the linear span {H_0} of {\{T^nf:n\in{\mathbb Z}\}} is Gaussian.
  2. We say that {(X,{\mathcal B},\mu,T)} is a Gaussian system if there exists a Gauss element {f} in {L^2_0(X)} whose orbit generates the whole {\sigma}-algebra, i.e.

    \displaystyle {\mathcal B}=\sigma\big(\big\{(T^nf)^{-1}(A):n\in{\mathbb Z},~A\subset{\mathbb R}\text{ is a Borel set}\big\}\big)

    Such {f} is called a generating Gaussian element.

We want to stress the obvious fact that {(X,{\mathcal B},\mu,T)} being a Gaussian system does not imply that every {f\in L^2(X)} is a Gaussian function (consider for instance the indicator function of any measurable set) but rather that it contains a closed invariant subspace {H}, generating the whole {\sigma}-algebra, such that every {f\in H} is a Gaussian function.

— 2. Positive definite sequences and Gaussian systems —

In the same way that a Gaussian measure (with mean {0}) is uniquely determined by its standard deviation, a Gaussian system is uniquely determined by the correlation sequence {\gamma(n)=\langle T^nf,f\rangle}.

It is well known that for any measure preserving system {(X,\mu,T)} and any {f\in L^2(X)}, the sequence {n\mapsto\langle T^nf,f\rangle} is positive definite:

Definition 4 A function {\gamma:{\mathbb Z}\rightarrow{\mathbb R}} is called positive definite if for every {f:{\mathbb Z}\rightarrow{\mathbb C}} with finite support

\displaystyle \sum_{n,m\in{\mathbb Z}}f(n)\overline{f(m)}\gamma(n-m)\geq0

In particular, the correlation sequence {\gamma} of a Gaussian system is positive definite. It is not so obvious that for every positive definite sequence {\gamma} there exist a Gaussian system with correlation sequence {\gamma}, or indeed a measure preserving system {(X,{\mathcal B},\mu,T)} and {f\in L^2(X)} with {\gamma(n)=\int_XT^nf\cdot fd\mu}.

Theorem 5 Every positive definite sequence {\gamma} is the correlation sequence of a Gaussian system.

While so far we restricted our attention to {{\mathbb Z}}-actions, Theorem 5 holds in far greater generality. Indeed we can replace the role of {{\mathbb Z}} with any group (in fact the result goes even outside the scope of groups, by considering, instead of positive definite sequences, so-called reproducing kernels (which are essentially functions {K} with domain in the cartesian square of any set {S} satisfying certain properties. In the case of groups the reproducing kernel takes the shape of {K(n,m)=\gamma(m^{-1}n)=\langle T_nf,T_mf\rangle}).) The proof of Theorem 5 has two steps, both of which are non-trivial. The first step is to establish a halfway result (it can be found in this formulation as Theorem 5.20 in these notes of Paulsen):

Theorem 6 (Naimark’s Dilation Theorem) For every positive definite sequence {\gamma:G\rightarrow{\mathbb R}} of a group {G} there exists unitary representation of the group {G} on a Hilbert space {H} and a vector {f\in H} such that {\gamma(n)=\langle U_nf,f\rangle} for all {n\in G}.

Proof: (sketch)

Let {H_0} be the set of all functions {f:G\rightarrow{\mathbb C}} with finite support. Define a inner product in {H_0} by letting

\displaystyle \langle f,g\rangle=\sum_{n,m\in G}f(n)\overline{g(m)}\gamma(m^{-1}n)

Quotient out the subspace {f:\langle f,f\rangle=0} and then let {H} be the completion of {H_0} with respect to the norm {\|f\|^2=\langle f,f\rangle}; the extension of the inner product turns {H} into a Hilbert space. Define {U_gf} for {f\in U_0} as the function with finite support {(U_gf)(n)=f(ng)} and extend it by continuity to {H}; this becomes a unitary representation of {G} in {H}. Finally take {f=1_{1_G}} to be the indicator function of the singleton containing the identity of {G}; it is clear that {\gamma(n)=\langle U_nf,f\rangle}. \Box

We can now prove Theorem 5:

Proof: Let {X={\mathbb R}^G}, let {\nu} be the Gaussian measure on {{\mathbb R}} with mean {\bar\nu=0} and standard deviation {\sigma=1} and let {\mu=\nu^G} be the product measure on the Borel sets of {X}. Let {H, (U_g)_{g\in G}} and {f\in H} be the Hilbert space, unitary representation and vector given by Naimark’s dilation theorem. Let {(e_n)_{n\in G}} be an orthonormal basis on {H} and define {\phi:H\rightarrow L^2(X)} by mapping {e_n} to {\pi_n}, where {\pi_n:X\rightarrow{\mathbb R}} is the projection onto the {n}-th coordinate. Extend {\phi} to {H}, it follows from Lemma 2 that every function {h\in\phi(H)} is a Gaussian function.

Next let {h=\phi(f)}, it is a Gaussian element. Since {\phi} is an isometric isomorphism, it follows that {\gamma(n)=\langle U_nf,f\rangle=\langle T_nh,h\rangle}, so {\gamma} is the correlation function of the Gaussian system {(X,{\mathcal B},\mu,T,h)}. \Box

— 3. Spectral properties of Gaussian systems —

In this section we return to {{\mathbb Z}}-actions (in fact, most of this section applies to actions of locally compact abelian groups). Let {(X,{\mathcal B},\mu,T)} be a Gaussian system and let {f\in X} be a generating Gaussian element with correlation sequence {\gamma}. Bochner-Herglotz theorem states that there exists a measure {\rho} on {{\mathbb T}:={\mathbb R}/{\mathbb Z}\cong\{z\in{\mathbb C}:|z|=1\}} such that {\gamma(n)=\int_{\mathbb T} e_nd\rho}, where {e_n:x\mapsto2^{2\pi i x}}. The measure {\rho} is called the spectral measure of the Gaussian system. Since {\gamma(n)=\gamma(-n)} it follows that {\rho(A)=\rho(-A)} for any Borel {A\subset{\mathbb T}}.

Let {H\subset L^2(X)} denote the smallest closed invariant subspace containing {f} . One can define the linear map {\theta:H\rightarrow L^2({\mathbb T},\rho)} by sending {T^kf} to {\theta(T^kf):=e_k} and extending by linearity and continuity to {H}. Observe that for a real valued function {h\in H}, the image under {\theta} satisfies {(\theta h)(-t)=\overline{(\theta h)(t)}}.

Proposition 7 The map {\theta:H\rightarrow L^2({\mathbb T},\rho)} is an isometric isomorphism such that

\displaystyle \theta(T^nh)=e_n\cdot\theta(h)

This proposition, whose proof is routine and will be omitted, already allows us to deduce the first non-trivial fact about Gaussian systems:

Proposition 8 If the spectral measure {\rho} has atoms, then the Gaussian system {(X,{\mathcal B},\mu,T)} is not ergodic.

Proof: Let {x\in{\mathbb T}} be such that {\rho(\{x\})>0} and let {\varphi\in L^2({\mathbb T},\rho)} be the indicator function of the singleton {\{x\}}. Observe that {e_n\cdot\varphi=e_n(x)\varphi} (to be completely clear, the first member of the equation has a multiplication of functions, the second a multiplication of a scalar with a function). Let {h=\theta^{-1}\varphi}, it follows from Proposition 7 that

\displaystyle T^nh=T^n(\theta^{-1}\varphi)=\theta^{-1}(e_n\varphi)=\theta^{-1}(e_n(x)\varphi)=e_n(x)\theta^{-1}(h)=e_n(x)h

Therefore {|h|\in L^2(X)} is a {T}-invariant function. Moreover, since {h\in H}, both its real and imaginary part are Gaussian functions, therefore {h} can not be a constant, and this shows that the Gaussian system {(X,{\mathcal B},\mu,T)} is not ergodic. \Box

To show the converse (in a strong sense) to Proposition 8 we need a more fine understanding of the geometry of {L^2(X)}. Denote by {H_0\subset L^2(X)} the one dimensional space of constant functions and let {H_1} be the space denote {H} above. Then, for each {m\geq2} we let {H_m} be the orthogonal complement of {H_0\oplus\cdots\oplus H_{m-1}} in the closed linear span of functions of the form {f_1\cdot f_2\cdots f_m} for any {f_j\in H\cup\{1\}}. Since {H} generates the full {\sigma}-algebra, we have the orthogonal decomposition

\displaystyle L^2(X)=\bigoplus_{m=0}^\infty H_m\ \ \ \ \ (2)


Observe that each {H_m} is a closed invariant subspace of {L^2(X)}. For any {m\geq1} one can define a map {\theta_m} analogous to the map {\theta} defined above. Unfortunately, the construction is significantly more complicated , so I will just state the relevant properties without proof.

Theorem 9 For each {m\in{\mathbb N}} there exists a map {\theta_m:H_m\rightarrow L^2({\mathbb T}^m,\rho^m)} (where {\rho^m} is just the product measure of {\rho} with itself {m} times) satisfying:

  1. The image of {\theta_m} is the set of functions {\varphi\in L^2({\mathbb T}^m,\rho^m)} which are invariant under permutation of coordinates.
  2. The map {\theta_m} is an isometric isomorphism between {H_m} and its image.
  3. Let {e\Delta:{\mathbb T}^m\rightarrow{\mathbb T}} denote the map {{\bf t}\mapsto e^{2\pi i(t_1+\cdots+t_m)}}. Then for every {h\in H_m}

    \displaystyle \theta_m(Th)=e\Delta\cdot(\theta h)

We can now prove the converse of Proposition 8.

Proposition 10 Let {(X,{\mathcal B},\mu,T)} be a Gaussian system . If the spectral measure {\rho} has no atoms, then the system {(X,{\mathcal B},\mu,T)} is weak mixing.

Proof: Let {h\in L^2(X)} be an eigenvector satisfying {Th=\lambda h}; we need to show that it is constant. Using (2), decompose {h=\sum_{m=0}^\infty h_m}. Since the spaces {H_m} are orthogonal and invariant, each {h_m} is itself an eigenvector with {Th_m=\lambda h_m}. Next, fix {m\geq1} and let {\varphi_m=\theta(h_m)}. Using Theorem 9 we deduce that {e\Delta\cdot\varphi_m=\lambda\cdot\varphi_m} in {L^2({\mathbb T}^m,\rho^m)}. More precisely

\displaystyle \int_{{\mathbb T}^m}\left|e\Delta({\bf t})-\lambda\right|^2\cdot|\varphi_m({\bf t})|^2d\rho^m({\bf t})=0\ \ \ \ \ (3)


Since {\rho} is non-atomic, the codimension {1} shifted subtorus {S=\{{\bf t}\in{\mathbb T}^m:e\Delta({\bf t})=\lambda\}} has measure {\rho^m(S)=0}. Hence it follows from (3) that {\|\varphi\|_{L^2}=0}. Since {m\geq1} was arbitrary, we conclude that {h=h_0}, i.e. it is a constant as desired. \Box

Corollary 11 A Gaussian system is weak mixing if and only if it is ergodic.

Another property which can be easily deduced from the decomposition (2) characterizes strong mixing of a Gaussian system in terms of properties of the spectral measure {\rho}.

Proposition 12 Let {(X,{\mathcal B},\mu,T)} be a Gaussian system. The system is strongly mixing if and only if {\lim_{|n|\rightarrow\infty}\hat\rho(n)=0}.

Proof: Let {f\in L^2(X)} be a generating Gaussian element. If the system is strongly mixing then {\lim_{|n|\rightarrow\infty}\hat\rho(n)=\lim_{|n|\rightarrow\infty}\langle T^nf,f\rangle=0}.

Now assume that {\lim_{|n|\rightarrow\infty}\hat\rho(n)=0} and let {h\in H_m} for some {m\geq1}. Let {{\mathcal A}} be the pullback {\sigma}-algebra of the Borel {\sigma}-algebra on {{\mathbb T}} by the map {{\bf t}\mapsto t_1+\cdots+t_m}. Let {\varphi=\theta h}, let {\rho^{*m}} be the convolution of {\rho} with itself {m} times (equivalently, {\rho^{*m}} is the pushforward of {\rho^m} through the map {{\bf t}\mapsto t_1+\cdots+t_m}) and let {\psi=\mathop{\mathbb E}[|\varphi|^2\mid{\mathcal A}]\in L^2({\mathbb T},\rho^{*m})} be the conditional expectation of {\varphi} on {{\mathcal A}}. We have

\displaystyle \langle T^nh,h\rangle=\int_{{\mathbb T}^m}(e\Delta)^n\cdot|\varphi|^2d\rho^m=\int_{\mathbb T} e_n\cdot\psi d\rho^{*m}

Since {\hat{\rho^{*m}}(n)=(\hat\rho(n)\big)^m}, we have that {\lim_{|n|\rightarrow\infty}\hat{\rho^{*m}}(n)=0}. Therefore the generalized Riemann-Lebesgue lemma implies that {\lim_{|n|\rightarrow\infty}\langle T^nh,h\rangle=0}, finishing the proof. \Box

It is a known fact that there exist continuous singular measures {\rho} on {{\mathbb T}} such that {\hat\rho(n)} does not go to {0} as {|n|\rightarrow\infty}. This implies that there exist (Gaussian) systems which are weak mixing but not strongly mixing.

— 4. Rigid {{\mathbb Z}^d}-action mixing of all shapes —

One interesting construction using Gaussian systems is of a rigid {{\mathbb Z}^d}-action mixing of all shapes.

Definition 13 Let {G} be an abelian group and {F\subset G} be a finite set.

  1. A probability preserving action {(X,{\mathcal B},\mu,(T_g)_{g\in G})} of {G} is {F}-mixing if

    \displaystyle \lim_{k\rightarrow\infty}\mu\left(\bigcap_{n\in F}T_{kn}A_n\right)=\prod_{n\in F}\mu(A_n)

    for any {A_n\in{\mathcal B}} for {n\in F}.

  2. The system {(X,{\mathcal B},\mu,(T_g)_{g\in G})} is mixing of all shapes if it is {F}-mixing for every finite {F\subset G}.
  3. The system {(X,{\mathcal B},\mu,(T_g)_{g\in G})} is rigid if there exists a sequence {(n_k)_{k\in{\mathbb N}}} in {G} such that {\lim_{k\rightarrow\infty}\mu(A\cap T_{n_k}A)=\mu(A)} for all {A\in{\mathcal B}}.

Observe that if a system is mixing of order {m}, then it is {F}-mixing for every {F\subset G} with {|F|\leq m}. However, the converse is not necessarily true.

Theorem 14 (Ferenczi and Kamiński) There exists a probability preserving action of {{\mathbb Z}^d} which is rigid and mixing of all shapes.

Proof: (Sketch)

In view of Theorem 5, it suffices to construct a positive definite sequence with certain properties. Indeed one can show that if {\gamma:{\mathbb Z}^d\rightarrow{\mathbb R}} is a positive definite sequence such that {\lim_{k\rightarrow\infty}\gamma(m+kn)=0} for all {m,n\in{\mathbb Z}^d} with {n\neq0}, then the Gaussian system induced by {\gamma} is mixing of all shapes. Similarly, if {\gamma({\bf 0})=1} and there exists a sequence {(n_k)_{k\in{\mathbb N}}} in {{\mathbb Z}^d} along which {\lim_{k\rightarrow\infty}\gamma(n_k)=1}, then the induced Gaussian system is rigid (exactly along {(n_k)}). Taking {1,\beta_1,\dots,\beta_d\in{\mathbb R}} linearly independent over {{\mathbb Q}}, the sequence

\displaystyle \gamma({\bf n})=\frac{\sin\big(2\pi(n_1\beta_1+\cdots+n_d\beta_d)\big)}{2\pi(n_1\beta_1+\cdots+n_d\beta_d)}

satisfies the conditions. Indeed, it is clear to show that {\lim_{k\rightarrow\infty}\gamma(m+kn)=0} for any {m,n\in{\mathbb Z}^d} with {n\neq0}. It is also not hard to find a sequence {(n_k)_{k\in{\mathbb N}}} in {{\mathbb Z}^d} such that {\sum n_k^{(j)}\beta_j\rightarrow0}, which implies that {\gamma(n_k)\rightarrow1}. \Box

Posted in Classic results, Ergodic Theory | Leave a comment

Alternative proofs of two classical lemmas

Two of the most fundamental tools in ergodic Ramsey theory are the mean ergodic theorem and the van der Corput trick. Both have a classical and fairly simple proof, which I have presented before in the blog. Recently I came across alternative proofs for both results which seem to “not use anything” other than very well known general facts (such as Banach-Alaoglu theorem). Although both facts are related to Cesàro limits along Følner sequences in arbitrary amenable groups, the proofs presented in this post are not related. I don’t think either of the proofs is new, but I don’t remember ever seing them written out; so I decided to post them here.

After writing down the alternative proof of the van der Corput trick, I realized one actually needs somewhat advanced results to obtain the full generality… (However if we care only about {{\mathbb Z}}, or even abelian groups (or the weak version with only one average), then the proof is softer.) Continue reading

Posted in Number Theory | 1 Comment

The horocycle flow is mixing of all orders

— 1. Introduction —

The main purpose of this post is to present a proof, due to Brian Marcus, that the horocycle flow is mixing of all orders. The precise definition of mixing of all orders for {{\mathbb R}}-actions is given below in Definition 2; we begin by describing the horocycle flow. Let {SL(2,{\mathbb R})} denote the set of all {2\times2} matrices with real entries and determinant {1}, endowed with the usual topology and the (left and right invariant) Haar measure {\lambda}. Given a discrete subgroup {\Gamma\subset SL(2,{\mathbb R})}, the quotient {X:=SL(2,{\mathbb R})/\Gamma} is given the quotient topology and `quotient Haar measure’ {\mu}. The latter can be described by

\displaystyle \int_X\sum_{\gamma\in\Gamma}f(x\gamma)~\mathtt{d}\mu(x)=\int_{SL(2,{\mathbb R})}f(g)~\mathtt{d}\lambda(g)

for any {f\in C_c(SL(2,{\mathbb R}))}. If the total measure {\mu(X)} of {X} is finite (in which case we normalize it so that {\mu(X)=1}), we say that {\Gamma} is a lattice. The classical example is when {\Gamma=SL(2,{\mathbb Z})}.

The horocycle flow is the (continuous) {{\mathbb R}}-action on {X} defined by {s\cdot(g\Gamma)=(h_sg)\Gamma}, where {h_s=\left[\begin{array}{cc} 1 & s \\ 0 & 1 \end{array}\right]} for {s\in{\mathbb R}} and {g\in SL(2,{\mathbb R})}. We will also need the geodesic flow, corresponding in this case to {g_t=\left[\begin{array}{cc} e^{t/2} & 0 \\ 0 &e^{-t/2} \end{array}\right]}.

The theorem we will prove actually deals with a more general situation:

Theorem 1 Let {X} be a (finite dimensional) manifold, let {\mu} a Borel probability measure on {X} and let {(h_s)_{s\in{\mathbb R}}} and {(g_t)_{t\in{\mathbb R}}} be continuous {\mu}-preserving {{\mathbb R}}-actions. If {(h_s)_{s\in{\mathbb R}}} is ergodic and there exists {\lambda>1} such that

\displaystyle  g_th_sg_t^{-1}=h_{s\lambda^t}, \ \ \ \ \ (1)

then {(h_s)_{s\in{\mathbb R}}} is mixing of all orders.

The first step of the proof of Theorem 1 is to show that {(h_s)_{s\in{\mathbb R}}} satisfies a weaker property, called property {P(N)} (see Definition 3) which resembles a property equivalent to weakly mixing of order {N} (at least in the case of {{\mathbb Z}}-actions). For this step, all we need is that {(h_s)_{s\in{\mathbb R}}} is a continuous probability preserving mixing {{\mathbb R}}-action .

Continue reading

Posted in Analysis, Classic results, Ergodic Theory | Tagged , , , , | Leave a comment

Large subsets of discrete hypersurfaces in Z^d contain arbitrarily many collinear points

— 1. Introduction —

Recently, Florian Richter and I uploaded to the arXiv our paper titled `Large subsets of discrete hypersurfaces in {{\mathbb Z}^d} contain arbitrarily many collinear points’.

This was the outcome of a fun project which started when we learned about Pomerance’s theorem:

Theorem 1 (Pomerance’s theorem) For every {B,k\in{\mathbb N}} there exists {n\in{\mathbb N}} such that for any {u_0,\dots,u_n\in{\mathbb Z}^2} with average gap bounded by {B}, i.e.

\displaystyle \frac1n\sum_{i=1}^n\|u_i-u_{i-1}\|\leq B,

there exist {k} points among {u_1,\dots,u_n} which are collinear.

I have presented the proof of Pomerance’s theorem in this post. A weakening of Pomerance’s theorem was proved earlier by L. T. Ramsey (it is Lemma 1 in that paper) as a combinatorial intermediate step in a problem in Fourier analysis.

Lemma 2 (Ramsey’s lemma) For every {B,k\in{\mathbb N}} there exists {n\in{\mathbb N}} such that for any {u_0,\dots,u_n\in{\mathbb Z}^2} with gaps bounded by {B}, i.e.

\displaystyle \forall i\in\{1,\dots,n\}\qquad\qquad\|u_i-u_{i-1}\|\leq B

there exist {k} points among {u_1,\dots,u_n} which are collinear.

Even when {B=1} Lemma 2 is nontrivial (and in fact it is easy to derive Lemma 2 from the case when {B=1}).

The question we asked (and eventually answered) was whether these results could be extended to higher dimensions. The first problem here is how to even formulate such a generalization!
Continue reading

Posted in Analysis, Combinatorics, paper | Tagged , , , , , | 2 Comments