Entropy of measure preserving systems

A measure preserving system is a quadruple {(X,{\mathcal B},\mu,T)} where {X} is a set, {{\mathcal B}} is a {\sigma}-algebra, {\mu:{\mathcal B}\rightarrow[0,1]} is a probability measure and {T:X\rightarrow X} is a measurable map satisfying {\mu(T^{-1}A)=\mu(A)} for every {A\in{\mathcal B}}.

The notion of isomorphism in the category of measure preserving systems (defined, for instance, in this earlier post) is fairly flexible, as it allows {0} measure perturbations. For instance, the system {([0,1],{\mathcal B},\lambda,T_2)} where {\lambda} is the Lebesgue measure and {T_2x=2x\bmod1} is isomorphic to the Bernoulli system {(\{0,1\}^{\mathbb N},\mu,\sigma)}, where {\sigma:\{0,1\}^{\mathbb N}\rightarrow\{0,1\}^{\mathbb N}} is the left shift and {\mu} is the infinite product of the measure {p} on {\{0,1\}} defined so that {p(\{0\})=p(\{1\})=1/2}.

However, this flexibility can make it quite difficult to decide whether two given systems are isomorphic or not. While several spectral invariants can be used, such as ergodicity, weakly mixing or strongly mixing, they are not sufficient to distinguish even between two Bernoulli shifts. To address this problem, Kolmogorov and Sinai developed the notion of entropy in the late 1950’s. Since then, the entropy of measure preserving systems, together with its sister in topological dynamics, has became a fundamental tool in understanding the structure of dynamical systems, with applications far beyond distinguishing non-isomorphic systems.

In this post I will define and then briefly survey some of the basic properties of entropy. I’ll mostly follow the exposition in the upcoming book of Einsiedler, Lindenstrauss and Ward, whose draft is available here; other good sources are Walter’s book and, in a somewhat different perspective, Austin’s recent notes.

Continue reading

Posted in Classic results, Ergodic Theory, Tool | Tagged | Leave a comment

Optimal intersectivity

In ergodic Ramsey theory, one often wants to prove that certain dynamically defined sets in a probability space intersect (or “recur”) in non-trivial ways. Typically, this is achieved by studying the long term behavior of the sets as the dynamics flow. However, in certain situations, one can establish the desired intersection (or recurrence) using purely combinatorial arguments, and without using the fact that the sets are dynamically defined. In such cases, one ends up obtaining a “static” (as opposed to dynamical) statement. An instance of this situation is the following intersectivity result of Bergelson, first used in this paper, and which I have mentioned before in this blog.

Lemma 1 Let {A_1,A_2,\dots} be sets in a probability space {(X,\mu)} with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite set {I\subset{\mathbb N}} with density {d(I)>\delta}, such that for every non-empty finite set {F\subset I} we have

\displaystyle \mu\left(\bigcap_{n\in F}A_n\right)>0

A different kind of intersection property is the following static modification of Poincaré’s recurrence theorem.

Lemma 2 Let {(X,\mu)} be a probability space and let {A_1, A_2,\dots} be sets with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite subset {I\subset{\mathbb N}} such that for every {n,m\in I},

\displaystyle \mu\left(A_n\cap A_m\right)>\delta^2

Observe that if the events {A_1,A_2,\dots} are independent, then the lower bound {\delta^2} is essentially achieved, and so this lemma is that sense optimal. The purpose of this post is to present a proof of the following common strengthening of Lemmas 1 and 2.

Theorem 3 Let {A_1,A_2,\dots} be sets in a probability space {(X,\mu)} with {\inf_n\mu(A_n)>\delta}. Then there exists an infinite set {I\subset{\mathbb N}} such that for every non-empty finite set {F\subset I} we have

\displaystyle \mu\left(\bigcap_{n\in F}A_n\right)>\delta^{|F|}

Observe that the bound {\delta^{|F|}} is optimal, again by considering the case of independent sets.

I learned this strengthening, as well as its proof from Konstantinos Tyros last December, when we were in Lyon, France attending the conference Ultrafilters, Ramsey Theory and Dynamics.

One could ask whether something more can be said about the set {I} in Theorem 3 other than being infinite . Something one can not hope for is that the set {I} has positive density, as showed in my previous post, using Forest’s theorem that not all sets of recurrence are sets of nice recurrence. On the other hand, one can indeed obtain certain combinatorial structures inside {I}. For instance, assuming the index set of the sets {(A_n)} is given the structure of a homogeneous tree, one can choose {I} to be a strong subtree; this is Theorem 3 in this paper of Dodos, Kanellopoulos and Tyros. Results of this kind are related to density versions of the Hales-Jewett and the Carlson-Simpson theorems due respectively to Furstenberg and Katznelson, and to Dodos, Kanellopoulos and Tyros. Continue reading

Posted in Combinatorics, Probability, Ramsey Theory, Tool | Tagged , , , , | Leave a comment

A proof of the Erdős sumset conjecture

Florian Richter, Donald Robertson and I have uploaded to the arXiv our paper entitled A proof of the Erdős sumset conjecture. The main goal of the paper is to prove the following theorem, which verifies a conjecture of Erdős discussed in this previous post.

Theorem 1 Let {A\subset{\mathbb N}} have positive upper density. Then it contains a sumset {B+C=\{b+c:b\in B, c\in C\}} for infinite sets {B} and {C}.

As usual, the upper density of a set {A\subset{\mathbb N}} is the quantity

\displaystyle \bar d(A):=\limsup_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N.

More generally, given a sequence {\Phi:N\mapsto\Phi_N=\{a_N,a_N+1,\dots,b_N\}} of intervals of integers with lengths {|\Phi_N|} tending to infinity (or even more generally, given any Følner sequence {\Phi:N\mapsto\Phi_N} in {{\mathbb N}}), we define the upper density with respect to {\Phi} as

\displaystyle \bar d_\Phi(A):=\limsup_{N\rightarrow\infty}\frac{\big|A\cap\Phi_N\big|}{|\Phi_N|}.

When the limit exists we say that the density of {A} with respect to {\Phi} exists and denote it by {d(A)} (and whenever we write {d(A)} we are implicitly stating that the limit exists).

As explained in my earlier post mentioned above (see Proposition 12), one can adapt an argument of Di Nasso, Goldbring, Jin, Leth, Lupini and Mahlburg to reduce Theorem 1 to the following statement.

Theorem 2 If {A\subset{\mathbb N}} has positive upper density, then there exist a sequence {\Phi} of intervals, a set {L\subset{\mathbb N}} and {\epsilon>0} such that for every finite subset {F\subset L}, the intersection

\displaystyle  \bigcap_{\ell\in F}(A-\ell)\cap\big\{m\in{\mathbb N}:d_\Phi(A-m\cap L)>0\big\} \ \ \ \ \ (1)

is infinite.

In my previous post I described how Theorem 2 can then be easily checked for weak mixing sets and for Bohr sets, whose definitions we now recall.

Definition 3 A set {A\subset{\mathbb N}} is:

  • a Bohr set if there exist {d\in{\mathbb N}}, a vector {\alpha\in{\mathbb T}^d} and an open set {U\subset{\mathbb T}^d} such that {A\supset\{n\in{\mathbb N}:n\alpha\in U\}\neq\emptyset}.
  • a weak mixing set with respect to the sequence of intervals {\Phi} if the set {\big\{n:\big|d_\Phi((A-n)\cap A)-d_\Phi(A)^2\big|<\epsilon\big\}} has density {1}.

It is well known in the realm of ergodic theory that weak mixing and almost periodicity are extreme points of a dichotomy (see for instance, this post) and so it was natural to try to patch these two observations into a proof of Theorem 2 in general. While this is ultimately how we prove Theorem 2, there were several problems with implementing this idea. Below the fold I explain the problems that arise and (briefly) sketch how we address them. Continue reading

Posted in Combinatorics, paper | Tagged , , , , , , , | Leave a comment

Single and multiple recurrence along non-polynomial sequences

Vitaly Bergelson, Florian Richter and I have recently uploaded to the arXiv our new paper “Single and multiple recurrence along non-polynomial sequences”. In this paper we address the question of what combinatorial structure is present in the set of return times of a non-polynomial sequence, i.e., sets of the form

\displaystyle   R=\big\{n\in{\mathbb N}:\mu(A\cap T^{-f(n)}A)>\mu^2(A)-\epsilon\big\} \ \ \ \ \ (1)

where {(X,\mu,T)} is a measure preserving system, {A\subset X}, {\epsilon>0} and {f:{\mathbb N}\rightarrow{\mathbb N}} is some non-polynomial function.

The departure point for this paper was the observation that when {f(n)} is a polynomial with integer coefficients and {f(0)=0}, set {R} in (1) is syndetic, i.e. has bounded gaps (for the special case {f(n)=n} this is the content of Khintchine’s theorem, and the general case was obtained by Furstenberg, see a proof in my earlier post), but the same is no longer true in general for non-polynomial sequences {f(n)}.

For instance when {f(n)=\lfloor n^a\rfloor} for some positive {a\in{\mathbb R}\setminus{\mathbb Z}} (where {\lfloor \cdot\rfloor} denotes the floor function) it is known that the set {R} from (1) is nonempty, and in fact has positive density but is not (in general) syndetic.

Our first theorem shows that nevertheless, the set {R} has an interesting (and a-priori surprising) combinatorial property, it is thick, i.e., contains arbitrarily long intervals of integers. One reason why this is surprising is that the notions of thick and syndetic sets are dual in the sense that a set is thick if and only if its complement is not syndetic and vice-versa. See my previous post for some discussion of the basic properties of thick and syndetic sets.

Theorem 1 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then the set {R} defined in (1) for {f(n)=\lfloor n^a\rfloor} is thick.

Theorem 1 implies in particular that for every {k\in{\mathbb N}} there exists some {n\in{\mathbb N}} such that each of the sets {T^{-f(n)}A}, {T^{-f(n+1)}A}, … , {T^{-f(n+k)}A} have a large intersection with {A}. Our second theorem proves that in fact all of these sets intersect together.

Theorem 2 Let {(X,\mu,T)} be a measure preserving system, let {A\subset X}, and {a,\epsilon>0}, {a\notin{\mathbb N}}. Then for every {k\in{\mathbb N}} there exist (many) {n\in{\mathbb N}} such that

\displaystyle \mu\Big(A\cap T^{-f(n)}A\cap T^{-f(n+1)}A\cap\cdots\cap T^{-f(n+k)}A\Big)>0

where {f(n)=\lfloor n^a\rfloor}.

Note that both Theorems 1 and 2 fail when {f(n)} is a polynomial.

In the paper we deal with a considerably more general class of non-polynomial functions (see Definition 4 below) but for now I will restrict attention to the example {f(n)=\lfloor n^a\rfloor}. Continue reading

Posted in Ergodic Theory, paper | Tagged , , , , , | Leave a comment

Erdős Sumset conjecture

Hindman’s finite sums theorem is one of the most famous and useful theorems in Ramsey theory. It states that for any finite partition of the natural numbers, one of the cells {C} of this partition contains an IP-set, i.e., there exists an infinite set {I\subset{\mathbb N}} such that for any finite subset {F\subset I} the sum {\sum_{x\in F}x\in C}.

This result is a partition result with no obvious density counterpart. Indeed it is easy to see that the set {2{\mathbb N}-1} of odd numbers (which has density {1/2}) can not have an IP-set, since for any {x,y\in{\mathbb N}}, either {x}, {y} or {x+y} must be even. One could still hope that any set with positive density can be shifted in order to contain an IP-set. Indeed, for any set of positive Banach density {A} there is a shift {A-k} which contains a triple of the form {\{x,y,x+y\}}, and often configurations which are partition regular are also in some shift of any set with positive density (however this is not always true, even for configurations with {2} elements, as shown by Kriz’s theorem).

Nevertheless, E. Straus constructed an example of a set {A} with density

\displaystyle d(A):=\lim_{N\rightarrow\infty}\frac{|A\cap\{1,\dots,N\}|}N

arbitrarily close to {1} and such that no shift {A-k} with {k\in{\mathbb N}} contains an IP-set. (Straus never published this result, see Theorem 11.6 in this chapter by Hindman or ) or Theorem 2.20 in this more recent paper.

Observe that by splitting an infinite set {I} into two infinite pieces {B,C}, it follows that any IP-set contains the sum {B+C:=\{b+c:b\in B,c\in C\}} for infinite sets {B} and {C}. Therefore the following fact can be seen as a weakening of Hindman’s theorem:

Proposition 1 For any finite partition {{\mathbb N}=C_1\cup\dots\cup C_r} of the natural numbers there exists a color {A\in\{C_1,\dots,C_r\}} and infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

Unlike IP-sets, the configuration {B+C} is invariant under shifts, so it makes sense to ask whether any set with positive density contains such configuration.

Conjecture 2 (Erdős sumset conjecture) Let {A\subset{\mathbb N}} and assume that the Banach density

\displaystyle d^*(A):=\lim_{N\rightarrow\infty}\max_{M\in{\mathbb N}}\frac{|A\cap\{M+1,M+2,\dots,M+N\}|}N

is positive. Then there exist infinite sets {B,C\subset{\mathbb N}} such that {A\supset B+C}.

This past week I was at AIM in San Jose for a workshop on Nonstandard methods in combinatorial number theory and there was some interesting discussion surrounding this conjecture. A few years ago there was some significant progress in this direction, in a paper by Di Nasso, Golbring, Jin, Leth, Lupini and Mahlburg using nonstandard analysis, and perhaps because all the authors were present at this workshop there was some hope of extending the ideas developed in the paper to address the general case. In this post I present a standard rendering of the proof from the DGJLLM paper as well as some of the observations made during the workshop. Continue reading

Posted in Combinatorics, Number Theory, State of the art | Tagged , , , , , , , | 4 Comments

An arithmetic van der Corput trick and the polynomial van der Waerden theorem

The van der Corput difference theorem (or trick) was develop (unsurprisingly) by van der Corput, and deals with uniform distribution of sequences in the torus.

Theorem 1 (van der Corput trick) Let {(x_n)_{n\in{\mathbb N}}} be a sequence in a torus {{\mathbb T}^d:={\mathbb R}^d/{\mathbb Z}^d}. If for every {h\in{\mathbb N}} the sequence {(x_{n+h}-x_n)_{n\in{\mathbb N}}} is uniformly distributed on {{\mathbb T}^d}, then so is {(x_n)_{n\in{\mathbb N}}}.

In 1987, Bergelson observed that a similar trick works in Hilbert spaces and it is this version which turned out to be amazingly useful in ergodic theory (unfortunately for Bergelson, following his own example, everyone keeps calling it the van der Corput trick). Here is a consequence of the “Hilbertian” van der Corput trick of Bergelson:

Theorem 2 Let {(X,mu,T)} be a measure preserving system, let {k\in{\mathbb N}} and let {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}}. If

\displaystyle  \lim_{N\rightarrow\infty}\frac1N\sum_{n=1}^N\int_Xf_0\cdot\prod_{p\in{\mathcal P}}T^{p(n)}f_pd\mu=0\qquad \text{ for any }f_p\in L^\infty(X) \ \ \ \ \ (1)

holds with {{\mathcal P}=\Big\{p_\ell(n+h)-p_1(n):\ell\in[k]\Big\}\cup\Big\{p_\ell(n)-p_1(n):\ell\in[k]\Big\}} for every {h\in{\mathbb N}}, then (1) holds with {{\mathcal P}=\{p_1,\dots,p_k\}}.

Then in 1996, Bergelson and Leibman adapted the same ideology to topological dynamics. Unfortunately, they didn’t explicitly state an analogue of the difference theorem in topological dynamics (such an explicit statement is Lemma 8.5 in this paper). While it is quite a stretch to call such an analogue as a van der Corput trick (so far from uniform distribution it is) it still expresses the same idea of exploiting the “derivatives” in order to obtain a result about the “original function”.

Here I take this analogy one step further, and using the connection between topological dynamics and arithmetic Ramsey theory powered by piecewise syndetic sets, I will present an arithmetic analogue of the van der Corput trick in {{\mathbb N}}. First it will be convenient to have the following definition:

Definition 3 Let {k\in{\mathbb N}} and {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}}. We say that the family {\{p_1,\dots,p_k\}} is a family of multiple recurrence if for any finite coloring (i.e. partition) of the natural numbers there exists {a,n\in{\mathbb N}} such that the set

\displaystyle \big\{a,a+p_1(n),\dots,a+p_k(n)\big\}

is monochromatic.

Theorem 4 (Arithmetic van der Corput trick) Let {k\in{\mathbb N}} and {p_1,\dots,p_k:{\mathbb N}\rightarrow{\mathbb Z}} be such that {p_i(0)=0}. Assume that for any finite set {F\subset{\mathbb N}\cup\{0\}} the family

\displaystyle \Big\{x\mapsto p_i(x+h)-p_1(x)-p_i(h)\mid h\in F,i\in\{1,\dots k\}\Big\}

is a family of multiple recurrence. Then the family {\{p_1,\dots,p_k\}} is a family of multiple recurrence as well.

In this post I will give a proof of Theorem 4 and then I will illustrate its power by deriving from it a short proof of the polynomial van der Waerden theorem of Bergelson and Leibman. Continue reading

Posted in Combinatorics, Ramsey Theory, Tool | Tagged , , , , , , , , , | Leave a comment

Piecewise syndetic sets, topological dynamics and ultrafilters

In this post I explore the notion of piecewise syndeticity and its relation to topological dynamical systems and the Stone-Čech compactification. I restrict attention to the additive semigroup {({\mathbb N},+)} but most results presented are true in much bigger generality (and I tried to present the proofs in a way that does’t depend crucially on the fact that the semigroup is {({\mathbb N},+)}). These results and observations are collected from several sources, and are scattered in the literature so I thought it would be nice to collect them all in one place. Probably the union of Furstenberg’s book with the book by Hindman and Strauss contains most (if not all) of the things in here; these surveys of Bergelson are also a good source.

— 1. Introduction —

There are several notions of largeness for subsets of {{\mathbb N}}, each notion with its advantages and disadvantages. One such notion is that of syndetic set, i.e., a set {S\subset{\mathbb N}} with bounded gaps. More precisely, {S} is syndetic if there exists a finite set {F} such that any natural number can be written as {a-b} with {a\in S} and {b\in F}.

Example 1 The following are syndetic sets:

  • The set {{\mathbb N}} of all natural numbers,
  • Any co-finite subset {S\subset{\mathbb N}},
  • The set {2{\mathbb N}} of even numbers,
  • Any infinite progression of the form {a{\mathbb N}+b},
  • The set {\{n\in{\mathbb N}:d(n\alpha,{\mathbb N})<\epsilon\}} for any {\alpha,\epsilon>0}, where {d(x,{\mathbb N})} is the distance between the positive real number {x} and the lattice of natural numbers.
  • The set {\{n\in{\mathbb N}:d(f(n),{\mathbb N})<\epsilon\}} for any {\epsilon>0} and any {f\in{\mathbb R}[x]} with an irrational coefficient (other than the constant term).

One problem with syndeticity is that this notion is not partition regular, i.e. if {S} is a syndetic set and we decompose {S=A\cup B}, it may happen that neither {A} nor {B} are syndetic (take {S={\mathbb N}} and {A} and {B} consisting of increasingly long alternating sequences of intervals). One way to fix this is to consider its dual family and then its intersection family (spoiler alert: we end up with piecewise syndetic sets).

Continue reading

Posted in Classic results, Combinatorics, Tool, Topological Dynamics | Tagged , , | 2 Comments