One of the most fundamental notions in modern mathematics is the notion of limit. To define the limit of a sequence, the only structure needed is a topology, which we take to be Hausdorff if we want the limit to be unique. One definition of the limit of a sequence is:

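Definition 1 (Limit) Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in a Hausdorff space $X$. Then we say that the limit of the sequence is $x\in X$ if for any open neighborhood $V$ of $x$ there is a finite set $A\subset\mathbb{N}$ such that $x_n\in V$ for all $n\in\mathbb{N}\setminus A$.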

This may not be exactly the definition used most of the time, but it is clearly equivalent. One issue that arises with the limit is that it may not exist, so any time one wants to say something about the limit of a sequence, one first has to prove that the limit exists. Several classical techniques are used to do this. For instance, if it is a bounded sequence of real numbers, both the $\limsup$ and the $\liminf$ of the sequence are defined, and they coincide if and only if the sequence has a limit. Another crucial idea, for sequences in complete metric spaces, is that one can tell whether a sequence is convergent using only information inside the sequence, namely the property of being a Cauchy sequence.

However, sometimes having a limit as defined above is too much to ask of a sequence. One weaker statement that has, perhaps surprisingly, a lot of useful applications is to require that some subsequence converges. This is, of course, the case if the space is sequentially compact (for instance, a compact metric space). Another way to weaken the notion of limit is to consider weaker topologies (with fewer open sets). This has been extensively used in applied functional analysis and in the study of differential equations.

We now turn our attention to sequences taking values in normed vector spaces (over $\mathbb{R}$ or $\mathbb{C}$). If $x_n\to x$ then it is easy to conclude that $\mathbb{E}_{n\le N}\,x_n\to x$ as $N\to\infty$, where $\mathbb{E}_{n\le N}$ is the expectation over $n\in\{1,\dots,N\}$, which in this case can be replaced by $\frac1N\sum_{n=1}^N$. If $x$ satisfies the second condition, we say that $x$ is the Cesàro limit of the sequence. A sequence can be Cesàro convergent without being convergent in the usual sense: for instance, $x_n=(-1)^n$ is clearly not convergent, but it is convergent in the Cesàro sense to $0$. This idea of a weaker limit is used, for instance, in Fourier analysis. A perhaps more famous example is the Law of Large Numbers, which we can state as follows: let $(X_n)_{n\in\mathbb{N}}$ be independent and identically distributed random variables on some probability space $(\Omega,\mu)$, with expectation $E$ and bounded variance. Then for almost all $\omega\in\Omega$ the sequence $(X_n(\omega))_{n\in\mathbb{N}}$ converges to $E$ in the Cesàro sense. This illustrates the fact that with Cesàro limits one can find some order in random (or chaotic) behavior in the long run.
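As a quick numerical sanity check of the two examples in this paragraph, here is a minimal Python sketch. The helper name `cesaro_mean`, the cutoff `N = 1000`, the fair coin flips and the seed are all my own choices for illustration, not part of the discussion above.

```python
import random
from fractions import Fraction

def cesaro_mean(terms):
    """Average of a finite list of terms: (1/N) * (sum of the N terms)."""
    return sum(terms) / len(terms)

# x_n = (-1)^n for n = 1..N, kept exact with Fraction.
N = 1000
alternating = [Fraction((-1) ** n) for n in range(1, N + 1)]
print(cesaro_mean(alternating))        # average of the first 1000 terms
print(cesaro_mean(alternating[:-1]))   # average of the first 999 terms

# Law of Large Numbers flavour: fair coin flips with expectation 1/2.
rng = random.Random(0)                 # fixed seed, so the run is reproducible
flips = [rng.randint(0, 1) for _ in range(10_000)]
print(cesaro_mean(flips))              # expected to be close to 0.5
```

The averages of $(-1)^n$ are exactly $0$ for even $N$ and $-1/N$ for odd $N$, so they tend to $0$ even though the sequence itself keeps oscillating.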

We now give a definition of Cesàro convergence, which is clearly equivalent to the one discussed above.

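Definition 2 (Cesàro limit) Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in a normed vector space. We say that $(x_n)$ converges in the Cesàro sense to $x$ if $\lim_{N\to\infty}\left\|\frac1N\sum_{n=1}^N(x_n-x)\right\|=0$.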

This definition was written in this form to suggest a slight strengthening, which we will call strong Cesàro convergence:

Definition 3 (Strong Cesàro limit) Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in a normed vector space. We say that $(x_n)$ converges in the strong Cesàro sense to $x$ if

$$\lim_{N\to\infty}\frac1N\sum_{n=1}^N\|x_n-x\|=0.$$

It’s easy to see that the sequence $x_n=(-1)^n$, which converges in the Cesàro sense to $0$, does not converge in the strong Cesàro sense, so this is indeed a stronger notion.
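The difference between the two notions can be seen numerically. The sketch below assumes that the Cesàro sense asks for $\left\|\frac1N\sum_{n\le N}(x_n-x)\right\|\to0$ and the strong Cesàro sense for $\frac1N\sum_{n\le N}\|x_n-x\|\to0$, and tests both on $x_n=(-1)^n$ with candidate limit $0$. The function names are hypothetical.

```python
def norm_of_average(seq, x, N):
    """|(1/N) * sum_{n=1}^{N} (seq(n) - x)|: norm taken after averaging."""
    return abs(sum(seq(n) - x for n in range(1, N + 1)) / N)

def average_of_norms(seq, x, N):
    """(1/N) * sum_{n=1}^{N} |seq(n) - x|: norm taken before averaging."""
    return sum(abs(seq(n) - x) for n in range(1, N + 1)) / N

alternating = lambda n: (-1) ** n

for N in (10, 100, 1000):
    print(N, norm_of_average(alternating, 0, N), average_of_norms(alternating, 0, N))
```

In the first quantity the signs cancel; in the second every term contributes $1$, so the average stays at $1$ for every $N$.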

In a Hilbert space $H$, let $U$ be a unitary operator with no nonzero fixed points (i.e. $Uf\neq f$ for all $f\in H\setminus\{0\}$). Then for all $f\in H$ we have that $U^nf$ converges to $0$ in the (regular) Cesàro sense (and clearly if this holds then $U$ fixes no nonzero point). This result is called the (mean) Ergodic Theorem, and so such an operator is called ergodic. The name comes from the fact that if $(X,\mathcal{B},\mu,T)$ is a probability preserving system (meaning that $\mu(X)=1$ and for all $A\in\mathcal{B}$ we have $\mu(T^{-1}A)=\mu(A)$), then the Koopman operator $U_T$ (i.e. the operator $U_T:L^2(X)\to L^2(X)$ defined by $U_Tf=f\circ T$; notice that $U_T$ is unitary, at least when $T$ is invertible, since $T$ preserves the measure) is ergodic in $L^2_0(X)$ (which is the set of functions $f\in L^2(X)$ such that $\int_X f\,d\mu=0$) if and only if the system is ergodic, in the classic sense that if $T^{-1}A=A$ then $\mu(A)=0$ or $\mu(A)=1$.

Now assume that the unitary operator $U$ on the Hilbert space $H$ has no nontrivial finite dimensional invariant subspace (and so in particular no nonzero fixed points). This is equivalent to the fact that $\langle U^nf,g\rangle$ converges to $0$ in the strong Cesàro sense, for all $f,g\in H$. (Note how two weaker notions of convergence are being used here: if $\langle U^nf,g\rangle\to0$ for all $g\in H$ then $U^nf\to0$ in the weak topology; here we request that to happen only in the strong Cesàro sense.) Such an operator is called weakly mixing, again related to the ergodic theory definition.
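A one dimensional toy case may help separate the two notions. The sketch below takes $H=\mathbb{C}$ and lets $U$ be multiplication by $e^{i\theta}$, with a hypothetical angle $\theta=\sqrt2$ that I picked only because it is not a multiple of $2\pi$. This $U$ has no nonzero fixed point, but every nonzero vector is an eigenvector, so one expects ergodicity without weak mixing.

```python
import cmath
import math

THETA = math.sqrt(2)            # hypothetical angle; any non-multiple of 2*pi works
u = cmath.exp(1j * THETA)       # U acts on H = C by multiplication: U z = u * z

def norm_of_average(f, N):
    """|(1/N) * sum_{n=1}^{N} U^n f|, the quantity in the mean ergodic theorem."""
    return abs(sum(u ** n * f for n in range(1, N + 1)) / N)

def average_of_abs_inner(f, g, N):
    """(1/N) * sum_{n=1}^{N} |<U^n f, g>|, using <a, b> = a * conj(b) on C."""
    return sum(abs(u ** n * f * g.conjugate()) for n in range(1, N + 1)) / N

f = g = 1 + 0j
for N in (10, 100, 1000):
    print(N, norm_of_average(f, N), average_of_abs_inner(f, g, N))
```

The first column of values decays like $1/N$ (a geometric sum divided by $N$), while the second stays at $1$ because $|\langle U^nf,g\rangle|=|f||g|$ for every $n$.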

Furthermore, given a unitary operator $U$ defined in the Hilbert space $H$, one can decompose the space as $H=H_{inv}\oplus H_{erg}$, where $H_{inv}=\{f\in H:Uf=f\}$ and its orthogonal complement is $H_{erg}$ (and thus $U$ is ergodic if $H_{inv}=\{0\}$). Another decomposition is $H=H_c\oplus H_{wm}$, where $H_c$ is the closed linear span of the eigenvectors of $U$ ($c$ stands for compact, as the closure of the orbit $\{U^nf:n\in\mathbb{N}\}$ is compact for $f\in H_c$) and its orthogonal complement is $H_{wm}$.

These decompositions are simple examples of the fruitful idea of separating a system into a structured component (such as $H_{inv}$ or $H_c$) and a noisy (or pseudo-random) component (such as $H_{erg}$ or $H_{wm}$). In the long run the noisy components cancel out and become negligible, and the behavior of the structured component, which is easier to control and predict, becomes dominant.

Another nice feature of strong Cesàro convergence is that if a sequence converges in the strong Cesàro sense, then it also converges in density, as defined by:

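Definition 4 (Convergence in density) Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in a Hausdorff space $X$. Then we say that the sequence converges to $x\in X$ in density if for every open neighborhood $V$ of $x$ there exists a subset $A\subset\mathbb{N}$ of density $0$ such that $x_n\in V$ for all $n\in\mathbb{N}\setminus A$.

Here $A$ having density $0$ means that $\lim_{N\to\infty}\frac{|A\cap\{1,\dots,N\}|}{N}=0$.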
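To get a feeling for density, here is a small sketch that computes the proportion $|A\cap\{1,\dots,N\}|/N$ for two sets of my own choosing: the perfect squares (which have density $0$) and the even numbers (which have density $1/2$). The helper `density_up_to` is a hypothetical name.

```python
import math

def density_up_to(indicator, N):
    """|A ∩ {1, ..., N}| / N for the set A = {n : indicator(n) is True}."""
    return sum(1 for n in range(1, N + 1) if indicator(n)) / N

is_square = lambda n: math.isqrt(n) ** 2 == n
is_even = lambda n: n % 2 == 0

for N in (100, 10_000, 1_000_000):
    print(N, density_up_to(is_square, N), density_up_to(is_even, N))
```

The proportion of squares up to $N$ is about $1/\sqrt N$, so it shrinks to $0$, while the proportion of even numbers stays at $1/2$.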

Compare this definition with the first definition of (usual) limit. We notice that this notion is independent of (regular) Cesàro convergence, in the sense that $x_n=(-1)^n$ converges in the Cesàro sense but doesn’t converge in density, while the sequence defined by $x_n=n$ if $n=2^k$ for some $k\in\mathbb{N}$ and $x_n=0$ otherwise converges in density to $0$ but not in the Cesàro sense.
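One concrete sequence of the second kind can be checked by hand or by machine. The sketch below assumes the sequence $x_n=n$ when $n$ is a power of $2$ and $x_n=0$ otherwise (the specific sequence is an assumption of this sketch), and evaluates its Cesàro averages along two different subsequences of $N$.

```python
def x(n):
    """Assumed example: x_n = n if n is a power of 2 (including 1), else 0."""
    return n if (n & (n - 1)) == 0 else 0

def cesaro_mean(N):
    """(1/N) * sum_{n=1}^{N} x_n."""
    return sum(x(n) for n in range(1, N + 1)) / N

def nonzero_density(N):
    """Proportion of n in {1, ..., N} with x_n != 0."""
    return sum(1 for n in range(1, N + 1) if x(n) != 0) / N

for k in (8, 12, 16):
    at_power = cesaro_mean(2 ** k)                # N = 2^k
    before_next = cesaro_mean(2 ** (k + 1) - 1)   # N = 2^(k+1) - 1
    print(k, at_power, before_next, nonzero_density(2 ** k))
```

The sum of the powers of $2$ up to $2^k$ is $2^{k+1}-1$, so the averages are $2-2^{-k}$ at $N=2^k$ and exactly $1$ at $N=2^{k+1}-1$. They oscillate between values near $2$ and $1$, while the set where $x_n\neq0$ has density tending to $0$.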

I will now prove the claimed relation between strong Cesàro limit and density limit.

Proposition 5 Let $(x_n)_{n\in\mathbb{N}}$ converge strong Cesàro to $x$. Then there is a subset $A\subset\mathbb{N}$ of density $0$ such that $\lim_{n\to\infty,\ n\notin A}x_n=x$.

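For $\epsilon>0$, the set $A_\epsilon:=\{n\in\mathbb{N}:\|x_n-x\|>\epsilon\}$ has density $0$. Indeed, suppose its upper density is $\delta>0$. Then there is some sequence $(M_j)$ increasing to $\infty$ such that $|A_\epsilon\cap\{1,\dots,M_j\}|>\delta M_j/2$ for $j$ large enough. Hence $\frac{1}{M_j}\sum_{n=1}^{M_j}\|x_n-x\|>\epsilon\delta/2$ for such $j$, and this contradicts the strong Cesàro convergence.

Let $A_k:=A_{1/k}$ and for each $k\in\mathbb{N}$ let $N_k$ be such that if $N\ge N_k$ then $|A_k\cap\{1,\dots,N\}|<N/k$; we may choose the $N_k$ to be strictly increasing. Then let $A=\bigcup_{k\in\mathbb{N}}\left(A_k\cap[N_k,N_{k+1})\right)$. We claim that $A$ satisfies the required conditions. Indeed $A\cap[1,N_{k+1})\subset A_k$, since for $j\le k$ we have $A_j\subset A_k$. Therefore, if $N_k\le N<N_{k+1}$,

$$\frac{|A\cap\{1,\dots,N\}|}{N}\le\frac{|A_k\cap\{1,\dots,N\}|}{N}<\frac1k,$$

so $A$ has density $0$. Also if $n\notin A$ and $N_k\le n<N_{k+1}$ then $n\notin A_k$, so $\|x_n-x\|\le1/k$. Since $k\to\infty$ as $n\to\infty$, this shows that $x_n\to x$ along $n\notin A$.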
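The first step of the proof is a Markov type inequality: every $n$ with $\|x_n-x\|>\epsilon$ contributes more than $\epsilon$ to the sum, so the proportion of such $n$ up to $N$ is at most $\frac{1}{\epsilon N}\sum_{n\le N}\|x_n-x\|$. The sketch below checks this numerically on a hypothetical real sequence of my own (a spike of height $1$ on the perfect squares plus a decaying term $1/\sqrt n$), with limit $0$ and $\epsilon=0.1$.

```python
import math

def x(n):
    """Hypothetical sequence: spike of height 1 on perfect squares, plus 1/sqrt(n)."""
    spike = 1.0 if math.isqrt(n) ** 2 == n else 0.0
    return spike + 1.0 / math.sqrt(n)

def strong_cesaro_mean(N, limit=0.0):
    """(1/N) * sum_{n=1}^{N} |x_n - limit|."""
    return sum(abs(x(n) - limit) for n in range(1, N + 1)) / N

def exceptional_density(eps, N, limit=0.0):
    """Proportion of n in {1, ..., N} with |x_n - limit| > eps."""
    return sum(1 for n in range(1, N + 1) if abs(x(n) - limit) > eps) / N

eps = 0.1
for N in (100, 10_000, 1_000_000):
    s, d = strong_cesaro_mean(N), exceptional_density(eps, N)
    print(N, s, d, d <= s / eps)
```

Both columns shrink as $N$ grows, and the last column reports whether the inequality holds at that $N$.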

Looking at Definitions 1 and 4, we can formulate a general notion of limit by saying that $x_n\to x$ if for every open neighborhood $V$ of $x$ there is a co-null subset $B$ of $\mathbb{N}$ such that $x_n\in V$ for all $n\in B$. In my previous post on co-null sets I didn’t discuss co-null sets in $\mathbb{N}$; however, it is quite acceptable that finite sets and sets of density $0$ can be classified as null, since the union of finitely many sets in one of those collections is still in that collection, and because $\mathbb{N}$ is a countable set we can’t expect to do better than finite unions.

A somewhat artificial way to introduce a notion of co-null sets in $\mathbb{N}$ is through the use of a non-principal ultrafilter, i.e. a collection $p$ of subsets of $\mathbb{N}$ (which will be our co-null sets) satisfying the properties one would expect co-null sets to satisfy, namely:

- $\mathbb{N}\in p$ and $\emptyset\notin p$,
- If $A\in p$ and $A\subset B\subset\mathbb{N}$ then $B\in p$,
- If $A$ and $B$ are in $p$, then $A\cap B$ is also in $p$,
and two more properties to ensure we don’t get trivial issues

- If $A\notin p$ then $\mathbb{N}\setminus A\in p$,
- No finite set is in $p$.

A collection $p$ satisfying the first four conditions is called simply an ultrafilter. It’s easy to see that for each $n\in\mathbb{N}$, the family of all sets containing $n$ is an ultrafilter, and such ultrafilters are called principal. The fifth condition (no finite set is in $p$) prevents that from happening, and so if $p$ also satisfies it, $p$ is called a non-principal ultrafilter. The existence of such a $p$ requires the axiom of choice.
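The claim about principal ultrafilters can be checked by brute force on a finite stand-in for $\mathbb{N}$. The sketch below is only a finite analogue under my own assumptions: it uses the ground set $\{0,\dots,4\}$ in place of $\mathbb{N}$, and reads the first four conditions as non-triviality ($\emptyset\notin p$ and the whole set is in $p$), upward closure, closure under intersections, and the dichotomy between a set and its complement. On a finite set every ultrafilter is principal, so the fifth condition has no analogue here.

```python
from itertools import combinations

GROUND = frozenset(range(5))     # finite stand-in for N (an assumption of this sketch)
SUBSETS = [frozenset(c) for r in range(len(GROUND) + 1)
           for c in combinations(sorted(GROUND), r)]

def principal(m):
    """The family of all subsets of GROUND that contain the point m."""
    return {A for A in SUBSETS if m in A}

def is_ultrafilter(p):
    """Check the four ultrafilter conditions for a family p of subsets of GROUND."""
    nontrivial = frozenset() not in p and GROUND in p
    upward = all(B in p for A in p for B in SUBSETS if A <= B)
    intersections = all((A & B) in p for A in p for B in p)
    dichotomy = all((GROUND - A) in p for A in SUBSETS if A not in p)
    return nontrivial and upward and intersections and dichotomy

print([is_ultrafilter(principal(m)) for m in sorted(GROUND)])
print(is_ultrafilter({GROUND}))        # a filter, but the dichotomy fails
```

The family `{GROUND}` is included as a contrast: it satisfies the first three conditions but neither $\{0\}$ nor its complement belongs to it.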

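Definition 6 ($p$-limit) Let $p$ be an ultrafilter and $X$ a Hausdorff space. Given a sequence $(x_n)_{n\in\mathbb{N}}$ in $X$ we define its limit along $p$ by $p\text{-}\lim x_n=x$ if for every neighborhood $V$ of $x$ there is some $A\in p$ such that $x_n\in V$ for all $n\in A$.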

One nice feature of this notion is that sequences in compact spaces always have a limit along $p$, without the need to pass to a subsequence.

Hi, Joel. Do you think one could replace N in Def. 1, 2, 3, 4 by a more general space? I think at least we have Def. 1 for general top. space X, with A compact set. Cesaro sum looks like normalized integral to me, with Dirac measure in your particular case.

With all these, do we still have Prop. 5(with appropriate definition of density) ?

One can indeed define limits for a function $f$ whose domain is a general topological space $X$ (instead of just functions with domain $\mathbb{N}$, which are sequences) when $x$ goes to infinity (and now $f(x)$ can be in any topological space), replacing in Definition 1 the condition that $A$ is finite by the condition that $A$ is a compact set.

In definitions 2 and 3 we take the average over some set, and then we change the set and consider the limit of those averages. A way I can see to make a more general definition in the spirit of definition 2 is the following:

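Let $(X,\mu)$ be a measure space, and for each $n\in\mathbb{N}$ let $A_n\subset X$ be a measurable set with $0<\mu(A_n)<\infty$ such that the sequence $(A_n)$ is either increasing with $\bigcup_nA_n=X$ or decreasing with $\bigcap_nA_n=\{x\}$ for some $x\in X$. Then for a function $f:X\to\mathbb{R}$ its Cesàro limit along $(A_n)$ is the limit of the averages $\frac{1}{\mu(A_n)}\int_{A_n}f\,d\mu$ of the function on $A_n$ (if this limit exists).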

An application of this formulation (in the case when $X=\mathbb{R}^d$ with Lebesgue measure and the $A_n$ are shrinking balls around some fixed point $x$) is the Lebesgue Differentiation Theorem.
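Here is a rough numerical illustration of the shrinking balls case in dimension one. Everything specific in it is an assumption of mine: the function $f(t)=t^2$, the point $1$, the radii $10^{-n}$, and the use of a midpoint rule to approximate the integral.

```python
def average_on_ball(f, center, radius, steps=1000):
    """Midpoint-rule estimate of the average of f over [center - radius, center + radius]."""
    h = 2 * radius / steps
    total = sum(f(center - radius + (i + 0.5) * h) for i in range(steps))
    return total / steps   # equals (total * h) / (2 * radius)

f = lambda t: t * t

for n in range(1, 6):
    r = 10.0 ** (-n)
    print(r, average_on_ball(f, 1.0, r))
```

For this $f$ the exact average over the ball of radius $r$ around $1$ is $1+r^2/3$, so the printed values should approach $f(1)=1$ as the balls shrink.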
