Cesàro Convergence via L² Bounds #
This file implements Kallenberg's L² approach to proving convergence of Cesàro averages for exchangeable sequences. The key result is that block averages form a Cauchy sequence in L², using only elementary variance bounds.
Main results #
blockAvg_cauchy_in_L2: Block Cesàro averages form a Cauchy sequence in L²- Supporting lemmas for L² convergence analysis
References #
- Kallenberg (2005), Probabilistic Symmetries and Invariance Principles, Chapter 1, Lemma 1.2 and "Second proof"
Kallenberg's L² Approach (Lemma 1.2 + Second Proof) #
This section implements Kallenberg's "second proof" of de Finetti's theorem using elementary L² bounds. The key is Lemma 1.2: for exchangeable sequences, weighted averages satisfy a simple variance bound that makes Cesàro averages Cauchy in L².
No ergodic theory is used - only:
- Exchangeability → constant pairwise second moments
- Algebraic identity for variance of weighted sums
- Completeness of L²
This is the lightest-dependency route to de Finetti.
References:
- Kallenberg (2005), Probabilistic Symmetries, Chapter 1, pp. 27-28
- Lemma 1.2 (L² bound for exchangeable weighted sums)
- "Second proof of Theorem 1.1" (the L² route to de Finetti)
Kallenberg's L² bound (Lemma 1.2) - Core of the elementary proof.
For an exchangeable sequence and centered variables Z_i := f(X_i) - E[f(X_1)], the L² distance between any two weighted averages satisfies:
‖∑ p_i Z_i - ∑ q_i Z_i‖²_L² ≤ C_f · sup_i |p_i - q_i|
where C_f := E[(Z_1 - Z_2)²].
Key application: For uniform block averages of length n, ‖A_{m,n} - A_{m',n}‖_L² ≤ √(C_f/n)
making the family {A_{m,n}}_m Cauchy in L² as n→∞.
Proof: Pure algebra + exchangeability:
- Expand ‖∑ c_i Z_i‖² = ∑ c_i² E[Z_i²] + ∑_{i≠j} c_i c_j E[Z_i Z_j]
- By exchangeability: E[Z_i²] = E[Z_1²], E[Z_i Z_j] = E[Z_1 Z_2] for i≠j
- For c_i = p_i - q_i (differences of probability weights): ∑ c_i = 0
- Algebraic bound: ∑ c_i² ≤ (∑|c_i|) · sup|c_i| ≤ 2 · sup|c_i|
- Substitute and simplify to get the bound
This is exactly Kallenberg's Lemma 1.2. No ergodic theory needed!
Why this proof uses l2_contractability_bound instead of kallenberg_L2_bound #
The Circularity Problem:
The de Finetti theorem we're proving establishes: Contractable ↔ Exchangeable
contractable_of_exchangeable(✓ proved in Contractability.lean): Exchangeable → Contractablecesaro_to_condexp_L2(this file): Contractable → Exchangeable (via conditionally i.i.d.)
Since we're trying to prove Contractable → Exchangeable, we cannot assume exchangeability in this proof - that would be circular!
Why kallenberg_L2_bound requires exchangeability:
kallenberg_L2_bound needs Exchangeable μ Z to establish uniform second moments:
- E[Z_i²] = E[Z_0²] for all i (uniform variance)
- E[Z_i Z_j] = E[Z_0 Z_1] for all i≠j (uniform pairwise covariance)
Exchangeability gives this via permutation invariance: swapping indices doesn't change the distribution.
Why contractability is insufficient for kallenberg_L2_bound:
Contractability only tells us about increasing subsequences:
- For any increasing k : Fin m → ℕ, the subsequence (Z_{k(0)}, ..., Z_{k(m-1)}) has the same distribution as (Z_0, ..., Z_{m-1})
This is weaker than exchangeability:
- ✓ We know (Z_0, Z_1) has same distribution as (Z_1, Z_2), (Z_2, Z_3), etc.
- ✗ We DON'T know (Z_0, Z_1) has same distribution as (Z_1, Z_0) - contractability doesn't give permutation invariance!
However: contractability DOES give uniform covariance!
Even though contractability ≠ exchangeability, contractability is sufficient for:
- E[Z_i²] = E[Z_0²] for all i (from the increasing subsequence {i})
- E[Z_i Z_j] = E[Z_0 Z_1] for all i<j (from the increasing subsequence {i,j})
This is exactly the covariance structure needed by l2_contractability_bound from
L2Helpers.lean, which doesn't assume full exchangeability.
Note: By the end of this proof, we'll have shown Contractable → Exchangeable, so contractable sequences ARE exchangeable. But we can't use that equivalence while proving it - that would be begging the question!
Performance wrappers to stop unfolding blockAvg inside eLpNorm #
Frozen alias for blockAvg f X 0 n. Regular def (not @[irreducible])
but we provide helper lemmas to avoid unfolding in timeout-prone contexts.
This wrapper prevents expensive elaboration timeouts when blockAvg appears
inside eLpNorm goals, by using pre-proved lemmas instead of unfolding.
Equations
Instances For
blockAvg is measurable w.r.t. the m-th tail family.
The block average blockAvg f X m n only depends on X m, X (m+1), ..., X (m+n-1),
which are all measurable w.r.t. tailFamily X m.
blockAvg is AEStronglyMeasurable w.r.t. tailFamily X m.
Direct consequence of being Measurable w.r.t. a sub-σ-algebra.
Cesàro averages converge in L² to a tail-measurable limit.
This is the elementary L² route to de Finetti (Kallenberg's "second proof"):
- Kallenberg L² bound → Cesàro averages are Cauchy in L²
- Completeness of L² → limit α_f exists
- Block averages A_{N,n} are σ(X_{>N})-measurable → α_f is tail-measurable
- Tail measurability + L² limit → α_f = E[f(X_1) | tail σ-algebra]
No Mean Ergodic Theorem, no martingales - just elementary L² space theory!
L¹ version via L² → L¹ conversion.
For bounded functions on probability spaces, L² convergence implies L¹ convergence (by Cauchy-Schwarz: ‖f‖₁ ≤ ‖f‖₂ · ‖1‖₂ = ‖f‖₂).
This gives the L¹ convergence needed for the rest of the ViaL2 proof.
THEOREM (Indicator integral continuity at fixed threshold):
If Xₙ → X a.e. and each Xₙ, X is measurable, and t is a continuity set
(meaning μ(X⁻¹'{t}) = 0), then ∫ 1_{(-∞,t]}(Xₙ) dμ → ∫ 1_{(-∞,t]}(X) dμ.
This is the Dominated Convergence Theorem: indicator functions are bounded by 1, and converge pointwise a.e. The continuity set assumption ensures we avoid the boundary case where convergence can fail (when X ω = t and Xn oscillates around t).
Shifted Cesàro Convergence #
The following lemmas extend Cesàro convergence from the n=0 case to arbitrary shifts n. The key insight is that shifting the Cesàro window by n changes at most 2n terms (n removed from front, n added at back), each bounded by 1, giving an O(n/m) error.
Cesàro convergence for shifted sequences: For contractable ℝ-valued sequences, the Cesàro average starting at position n converges to the same conditional expectation as the unshifted average.
This resolves the circular import issue: the proof uses cesaro_to_condexp_L1 (n=0 case)
from this file, combined with a deterministic shift bound.