Documentation

Exchangeability.DeFinetti.ViaL2.CesaroConvergence

Cesàro Convergence via L² Bounds #

This file implements Kallenberg's L² approach to proving convergence of Cesàro averages for exchangeable sequences. The key result is that block averages form a Cauchy sequence in L², using only elementary variance bounds.

Main results #

References #

Kallenberg's L² Approach (Lemma 1.2 + Second Proof) #

This section implements Kallenberg's "second proof" of de Finetti's theorem using elementary L² bounds. The key is Lemma 1.2: for exchangeable sequences, weighted averages satisfy a simple variance bound that makes Cesàro averages Cauchy in L².

No ergodic theory is used - only:

  1. Exchangeability → constant pairwise second moments
  2. Algebraic identity for variance of weighted sums
  3. Completeness of L²

This is the lightest-dependency route to de Finetti.

References:

theorem Exchangeability.DeFinetti.ViaL2.kallenberg_L2_bound {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (Z : Ω) (hZ_exch : Exchangeable μ Z) (hZ_meas : ∀ (i : ), Measurable (Z i)) (p q : ) (s : Finset ) (hs : s.Nonempty) (hp_prob : s.sum p = 1 is, 0 p i) (hq_prob : s.sum q = 1 is, 0 q i) (hZ_L2 : is, MeasureTheory.MemLp (Z i) 2 μ) :
(ω : Ω), (∑ is, (p i - q i) * Z i ω) ^ 2 μ ( (ω : Ω), (Z 0 ω - Z 1 ω) ^ 2 μ) * s.sup' hs fun (i : ) => |p i - q i|

Kallenberg's L² bound (Lemma 1.2) - Core of the elementary proof.

For an exchangeable sequence and centered variables Z_i := f(X_i) - E[f(X_1)], the L² distance between any two weighted averages satisfies:

‖∑ p_i Z_i - ∑ q_i Z_i‖²_L² ≤ C_f · sup_i |p_i - q_i|

where C_f := E[(Z_1 - Z_2)²].

Key application: For uniform block averages of length n, ‖A_{m,n} - A_{m',n}‖_L² ≤ √(C_f/n)

making the family {A_{m,n}}_m Cauchy in L² as n→∞.

Proof: Pure algebra + exchangeability:

  1. Expand ‖∑ c_i Z_i‖² = ∑ c_i² E[Z_i²] + ∑_{i≠j} c_i c_j E[Z_i Z_j]
  2. By exchangeability: E[Z_i²] = E[Z_1²], E[Z_i Z_j] = E[Z_1 Z_2] for i≠j
  3. For c_i = p_i - q_i (differences of probability weights): ∑ c_i = 0
  4. Algebraic bound: ∑ c_i² ≤ (∑|c_i|) · sup|c_i| ≤ 2 · sup|c_i|
  5. Substitute and simplify to get the bound

This is exactly Kallenberg's Lemma 1.2. No ergodic theory needed!

Why this proof uses l2_contractability_bound instead of kallenberg_L2_bound #

The Circularity Problem:

The de Finetti theorem we're proving establishes: Contractable ↔ Exchangeable

  • contractable_of_exchangeable (✓ proved in Contractability.lean): Exchangeable → Contractable
  • cesaro_to_condexp_L2 (this file): Contractable → Exchangeable (via conditionally i.i.d.)

Since we're trying to prove Contractable → Exchangeable, we cannot assume exchangeability in this proof - that would be circular!

Why kallenberg_L2_bound requires exchangeability:

kallenberg_L2_bound needs Exchangeable μ Z to establish uniform second moments:

  • E[Z_i²] = E[Z_0²] for all i (uniform variance)
  • E[Z_i Z_j] = E[Z_0 Z_1] for all i≠j (uniform pairwise covariance)

Exchangeability gives this via permutation invariance: swapping indices doesn't change the distribution.

Why contractability is insufficient for kallenberg_L2_bound:

Contractability only tells us about increasing subsequences:

  • For any increasing k : Fin m → ℕ, the subsequence (Z_{k(0)}, ..., Z_{k(m-1)}) has the same distribution as (Z_0, ..., Z_{m-1})

This is weaker than exchangeability:

  • ✓ We know (Z_0, Z_1) has same distribution as (Z_1, Z_2), (Z_2, Z_3), etc.
  • ✗ We DON'T know (Z_0, Z_1) has same distribution as (Z_1, Z_0) - contractability doesn't give permutation invariance!

However: contractability DOES give uniform covariance!

Even though contractability ≠ exchangeability, contractability is sufficient for:

  • E[Z_i²] = E[Z_0²] for all i (from the increasing subsequence {i})
  • E[Z_i Z_j] = E[Z_0 Z_1] for all i<j (from the increasing subsequence {i,j})

This is exactly the covariance structure needed by l2_contractability_bound from L2Helpers.lean, which doesn't assume full exchangeability.

Note: By the end of this proof, we'll have shown Contractable → Exchangeable, so contractable sequences ARE exchangeable. But we can't use that equivalence while proving it - that would be begging the question!

Performance wrappers to stop unfolding blockAvg inside eLpNorm #

def Exchangeability.DeFinetti.ViaL2.blockAvgFrozen {Ω : Type u_3} (f : ) (X : Ω) (n : ) :
Ω

Frozen alias for blockAvg f X 0 n. Regular def (not @[irreducible]) but we provide helper lemmas to avoid unfolding in timeout-prone contexts.

This wrapper prevents expensive elaboration timeouts when blockAvg appears inside eLpNorm goals, by using pre-proved lemmas instead of unfolding.

Equations
Instances For
    theorem Exchangeability.DeFinetti.ViaL2.blockAvgFrozen_def {Ω : Type u_3} (f : ) (X : Ω) (n : ) (ω : Ω) :
    blockAvgFrozen f X n ω = blockAvg f X 0 n ω
    theorem Exchangeability.DeFinetti.ViaL2.blockAvgFrozen_measurable {Ω : Type u_3} [MeasurableSpace Ω] (f : ) (X : Ω) (hf : Measurable f) (hX : ∀ (i : ), Measurable (X i)) (n : ) :
    theorem Exchangeability.DeFinetti.ViaL2.blockAvgFrozen_abs_le_one {Ω : Type u_3} [MeasurableSpace Ω] (f : ) (X : Ω) (hf_bdd : ∀ (x : ), |f x| 1) (n : ) (ω : Ω) :
    theorem Exchangeability.DeFinetti.ViaL2.blockAvgFrozen_diff_memLp_two {Ω : Type u_3} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (f : ) (X : Ω) (hf : Measurable f) (hX : ∀ (i : ), Measurable (X i)) (hf_bdd : ∀ (x : ), |f x| 1) (n n' : ) :
    MeasureTheory.MemLp (fun (ω : Ω) => blockAvgFrozen f X n ω - blockAvgFrozen f X n' ω) 2 μ
    theorem Exchangeability.DeFinetti.ViaL2.blockAvg_measurable_tailFamily {Ω : Type u_3} [MeasurableSpace Ω] {f : } (hf : Measurable f) {X : Ω} (hX : ∀ (i : ), Measurable (X i)) (m n : ) :

    blockAvg is measurable w.r.t. the m-th tail family.

    The block average blockAvg f X m n only depends on X m, X (m+1), ..., X (m+n-1), which are all measurable w.r.t. tailFamily X m.

    theorem Exchangeability.DeFinetti.ViaL2.blockAvg_aestronglyMeasurable_tailFamily {Ω : Type u_3} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} {f : } (hf : Measurable f) {X : Ω} (hX : ∀ (i : ), Measurable (X i)) (m n : ) :

    blockAvg is AEStronglyMeasurable w.r.t. tailFamily X m.

    Direct consequence of being Measurable w.r.t. a sub-σ-algebra.

    theorem Exchangeability.DeFinetti.ViaL2.cesaro_to_condexp_L2 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {X : Ω} (hX_contract : Contractable μ X) (hX_meas : ∀ (i : ), Measurable (X i)) (f : ) (hf_meas : Measurable f) (hf_bdd : ∀ (x : ), |f x| 1) :
    ∃ (α_f : Ω), MeasureTheory.MemLp α_f 2 μ MeasureTheory.AEStronglyMeasurable α_f μ Filter.Tendsto (fun (n : ) => MeasureTheory.eLpNorm (blockAvg f X 0 n - α_f) 2 μ) Filter.atTop (nhds 0) α_f =ᵐ[μ] μ[f X 0|TailSigma.tailSigma X]

    Cesàro averages converge in L² to a tail-measurable limit.

    This is the elementary L² route to de Finetti (Kallenberg's "second proof"):

    1. Kallenberg L² bound → Cesàro averages are Cauchy in L²
    2. Completeness of L² → limit α_f exists
    3. Block averages A_{N,n} are σ(X_{>N})-measurable → α_f is tail-measurable
    4. Tail measurability + L² limit → α_f = E[f(X_1) | tail σ-algebra]

    No Mean Ergodic Theorem, no martingales - just elementary L² space theory!

    theorem Exchangeability.DeFinetti.ViaL2.cesaro_to_condexp_L1 {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] {X : Ω} (hX_contract : Contractable μ X) (hX_meas : ∀ (i : ), Measurable (X i)) (f : ) (hf_meas : Measurable f) (hf_bdd : ∀ (x : ), |f x| 1) (ε : ) :
    ε > 0∃ (M : ), mM, (ω : Ω), |1 / m * i : Fin m, f (X (↑i) ω) - μ[f X 0|TailSigma.tailSigma X] ω| μ < ε

    L¹ version via L² → L¹ conversion.

    For bounded functions on probability spaces, L² convergence implies L¹ convergence (by Cauchy-Schwarz: ‖f‖₁ ≤ ‖f‖₂ · ‖1‖₂ = ‖f‖₂).

    This gives the L¹ convergence needed for the rest of the ViaL2 proof.

    theorem Exchangeability.DeFinetti.ViaL2.tendsto_integral_indicator_Iic {Ω : Type u_3} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (Xn : Ω) (X : Ω) (t : ) (hXn_meas : ∀ (n : ), Measurable (Xn n)) (hX_meas : Measurable X) (hae : ∀ᵐ (ω : Ω) μ, Filter.Tendsto (fun (n : ) => Xn n ω) Filter.atTop (nhds (X ω))) (h_cont : μ (X ⁻¹' {t}) = 0) :
    Filter.Tendsto (fun (n : ) => (ω : Ω), (Set.Iic t).indicator (fun (x : ) => 1) (Xn n ω) μ) Filter.atTop (nhds ( (ω : Ω), (Set.Iic t).indicator (fun (x : ) => 1) (X ω) μ))

    THEOREM (Indicator integral continuity at fixed threshold): If Xₙ → X a.e. and each Xₙ, X is measurable, and t is a continuity set (meaning μ(X⁻¹'{t}) = 0), then ∫ 1_{(-∞,t]}(Xₙ) dμ → ∫ 1_{(-∞,t]}(X) dμ.

    This is the Dominated Convergence Theorem: indicator functions are bounded by 1, and converge pointwise a.e. The continuity set assumption ensures we avoid the boundary case where convergence can fail (when X ω = t and Xn oscillates around t).

    Shifted Cesàro Convergence #

    The following lemmas extend Cesàro convergence from the n=0 case to arbitrary shifts n. The key insight is that shifting the Cesàro window by n changes at most 2n terms (n removed from front, n added at back), each bounded by 1, giving an O(n/m) error.

    theorem Exchangeability.DeFinetti.ViaL2.cesaro_convergence_all_shifts {Ω : Type u_1} [MeasurableSpace Ω] {μ : MeasureTheory.Measure Ω} [MeasureTheory.IsProbabilityMeasure μ] (X : Ω) (hX_contract : Contractable μ X) (hX_meas : ∀ (i : ), Measurable (X i)) (f : ) (hf_meas : Measurable f) (hf_bdd : ∀ (x : ), |f x| 1) (n : ) (ε : ) :
    ε > 0∃ (M : ), mM, (ω : Ω), |1 / m * k : Fin m, f (X (n + k) ω) - μ[f X 0|TailSigma.tailSigma X] ω| μ < ε

    Cesàro convergence for shifted sequences: For contractable ℝ-valued sequences, the Cesàro average starting at position n converges to the same conditional expectation as the unshifted average.

    This resolves the circular import issue: the proof uses cesaro_to_condexp_L1 (n=0 case) from this file, combined with a deterministic shift bound.