Regularity Lemma for Graphons #
This file proves the regularity lemma for graphons, which says that any graphon can be approximated by a step graphon in cut norm.
Main results #
Graphon.regularity- For any ε > 0, there exists a partition with bounded number of parts such that the stepified graphon is ε-close in cut norm.
Implementation notes #
The Frieze–Kannan weak regularity lemma is one of the central results in graphon theory. It provides a step-function approximation in cut norm with single-exponential bounds on the number of parts (better than the tower-type bounds from Szemerédi's strong regularity lemma).
The number of parts in the partition depends only on ε, not on the graphon. This is crucial for applications to graph limits.
References #
- [L. Lovász, Large Networks and Graph Limits][lovasz2012], Corollary 9.13
- Frieze, A. & Kannan, R. (1999). Quick approximation to matrices and applications.
Energy function #
The energy of a graphon with respect to a partition.
E(P, W) = Σ_{S,T ∈ P.parts} μ(S) μ(T) (rectAverage W S T)²
This measures how much of the L² norm of W is captured by its stepification. The energy is always in [0, 1] and increases under refinement.
Equations
Instances For
The energy is non-negative.
The energy is at most 1.
Since rectAverage W S T ∈ [0,1] and Σ_{S,T} μ(S)μ(T) ≤ 1.
Energy increment lemma #
The T-average of a graphon: for fixed T, W_T(x) = (1/μ(T)) ∫_T W(x,y) dμ(y).
This is the conditional expectation of W(x,·) given that y ∈ T.
Equations
Instances For
The T-average is measurable.
The T-average takes values in [0, 1] for a.e. x when W is a graphon.
This uses that W ∈ [0,1] a.e. on the product measure. By Fubini (ae_ae_of_ae_prod), for a.e. x, the function y ↦ W(x,y) is in [0,1] for a.e. y. Then the integral is bounded: 0 ≤ ∫_T W(x,y) ≤ μ(T).
The average of W_T over S equals rectAverage W S T.
This is a consequence of Fubini-Tonelli: (1/μS) ∫_S W_T dx = (1/μS) ∫_S (1/μT) ∫T W(x,y) dy dx = (1/μS μT) ∫{S×T} W = rectAverage W S T
Helper lemmas for energy increment #
Weighted variance identity: when splitting a weighted average, the energy difference equals a weighted difference of squares.
If c = (ac1 + bc2) / (a+b) is the weighted average, then: ac1² + bc2² - (a+b)c² = (ab/(a+b)) * (c1 - c2)²
This identity is key for the energy increment lemma. When a rectangle S×T is split into S₁×T and S₂×T with respective averages c1 and c2, the energy change comes from this weighted variance formula.
Cauchy-Schwarz for set integrals: (∫_T f)² ≤ μ(T) · ∫_T f².
This is Jensen's inequality applied to the convex function x². The key is: (average f)² ≤ average(f²), which rearranges to give this bound.
Pointwise bound: the T-average deviation squared is bounded by the average squared deviation over T.
For fixed x: (W_T(x) - c)² ≤ (μT)⁻¹ · ∫_T (W(x,y) - c)² dy
This follows from sq_setIntegral_le_measure_mul_setIntegral_sq.
Jensen's inequality gives an UPPER bound for the T-average squared deviation.
By Jensen: (∫_T f)² / μ(T) ≤ ∫_T f² Applied pointwise: (W_T(x) - c)² = ((1/μT) ∫_T (W(x,y) - c))² ≤ (1/μT) ∫_T (W(x,y) - c)² Integrating over S: ∫S (W_T - c)² ≤ (1/μT) ∫{S×T} (W - c)²
Note: This is an UPPER bound. For a lower bound, we need a different approach.
For a.e. x, ∫_T (W(x,·) - tAverage W T x) = 0 by definition of tAverage.
Pointwise variance decomposition: for fixed x, the integral of (f - c)² over T equals the integral of (f - m)² plus μ(T) * (m - c)², when ∫_T (f - m) = 0.
This is the key algebraic identity: (f - c)² = (f - m)² + 2(f - m)(m - c) + (m - c)² and the cross term vanishes when integrated because ∫_T (f - m) = 0.
Variance decomposition along T: total defect = within-slice + between-slice variance.
∫_{S×T} (W - c)² = ∫_S (∫_T (W - tAvg)²) + μ(T) * ∫_S (tAvg - c)²
where tAvg = tAverage W T and c = rectAverage W S T.
The key insight is that ∫_T (W(x,·) - tAvg(x)) = 0 by definition of tAvg, so the cross term vanishes when expanding (W - c) = (W - tAvg) + (tAvg - c).
Frieze-Kannan median cut lemma (key for energy increment).
If ∫_S (f - m)² ≥ ε² μ(S) where m = (1/μS) ∫_S f (the mean), then there exists a measurable subset S₁ ⊆ S with μ(S₁), μ(S \ S₁) ≥ μ(S)/4 such that the averages of f on S₁ and S \ S₁ differ from m by at least ε/2.
This is the key step that produces a "good cut" for the energy increment. The proof uses Chebyshev's inequality: if variance is ε², then the probability of deviating by ε from the mean is at least some constant.
Standard argument:
- By Chebyshev: μ{x : |f(x) - m| ≥ ε/2} ≥ constant · ε²/ε² = constant
- Split by the median: S₁ = {x ∈ S : f(x) ≤ median}, S₂ = S \ S₁
- If variance is high, at least one of S₁ or S₂ has average differing from m
Proof: Uses a Chebyshev-type argument to find a measurable cut.
The "defect" of a partition: measures how far W is from being stepwise constant.
For a partition P and rectangle S × T, the defect is: ∫_{S×T} |W(x,y) - rectAverage W S T|² dμ(x) dμ(y)
The total defect is the sum over all partition rectangles. When W is close to stepified P W in cut norm, the defect is small.
Equations
Instances For
The defect is non-negative.
The defect on S × T equals the defect on T × S, by symmetry of the graphon.
This follows from the change-of-variables formula (swapping integration domain) and the graphon symmetry W(x,y) = W(y,x) a.e.
Variance decomposition on a rectangle: ∫{S×T} W² = ∫{S×T} (W - c)² + c² · μ(S×T) where c = rectAverage W S T.
The key fact is that the cross term vanishes: ∫{S×T} 2(W - c)·c = 2c · (∫{S×T} W - c·μ(S×T)) = 0 since c·μ(S×T) = ∫_{S×T} W by definition of rectAverage.
Key identity: L² norm = energy + defect.
‖W‖₂² = E(P, W) + D(P, W)
where E(P, W) is the energy and D(P, W) is the defect. This follows from the Pythagorean theorem for L² orthogonal projections.
The energy difference when replacing S with S₁ ∪ S₂ = S in the energy sum. For a fixed T, the contribution from S × T becomes the sum of S₁ × T and S₂ × T contributions. The difference is μ(T) * μ(S₁)μ(S₂)/μ(S) * (c₁ - c₂)² where c_i = rectAverage W S_i T.
Energy increases when splitting a part with different sub-averages.
When we split S into S₁ and S\S₁ in a partition, the energy of the new partition
is at least the old energy plus the positive contribution from energy_rect_split
applied to each T ∈ P.parts. More precisely:
energy W Q ≥ energy W P + Σ_T μ(T) * μ(S₁) * μ(S\S₁) / μ(S) * (c₁_T - c₂_T)²
where c₁_T = rectAverage W S₁ T and c₂_T = rectAverage W (S\S₁) T.
In particular, if c₁_T ≠ c₂_T for some T, the energy strictly increases.
Energy increment lemma (Frieze-Kannan style).
If W has large defect on some rectangle S × T of P, then refining that rectangle increases the energy.
More precisely: if there exist S, T ∈ P such that ∫_{S×T} |W - rectAverage W S T|² ≥ ε² μ(S) μ(T), then splitting S (or T) into two parts by an appropriate cut strictly increases the energy.
This is the key step that drives the regularity iteration.
Regularity lemma #
If the total defect exceeds ε², then there exists a "bad" rectangle (S,T) where the defect per unit area is at least ε². This is the contrapositive: if every rectangle has defect < ε² μ(S) μ(T), then total defect < ε².
This lemma is needed to convert the regularity iteration stopping condition (defect ≤ ε²) to finding a bad rectangle for energy_increment.
Pair energy increment: If cutNormDiff U (stepify P U) > δ, then there exists
a refinement Q of P with at most 4 times as many parts such that:
energy U Q ≥ energy U P + δ²(the graphon that triggered the refinement gains energy)energy V Q ≥ energy V Pfor any other graphon V (energy is monotone under refinement)
This is the key lemma for simultaneous regularity: we can refine for one graphon while ensuring the other's energy does not decrease.
The regularity function: given ε, returns an upper bound on the number of parts needed in a partition to achieve ε-approximation.
The bound is exponential in 1/ε², following the Frieze-Kannan approach which gives single-exponential bounds (better than the tower-type bounds from Szemerédi's proof).
Uses base 4 because each FK step doubles in both row and column dimensions.
Instances For
The Frieze-Kannan weak regularity lemma.
For any ε > 0 and any graphon W, there exists a measurable partition P with bounded number of parts such that the step graphon approximation of W on P has small cut norm difference from W.
Proof (Frieze-Kannan [1999]):
- Start with trivial partition P₀ = {α}
- While
cutNormDiff W (stepify P W) > ε:- Apply energy_increment_quantitative to get P_{i+1}
- Energy increases by ≥ ε²
- Since energy ≤ 1, at most ⌈1/ε²⌉ iterations
- Each iteration at most quadruples parts (double split): final count ≤ 4^(iterations+1)
Equitable partitions #
A partition is ε-equitable if all parts have measure within ε of 1/k, where k is the number of parts.
Instances For
IVT for atomless measures: any measurable set can be split at any prescribed measure.
This is a standard result (Sierpinski's theorem). Given [StandardBorelSpace α],
we use a countable separating sequence to greedily construct the desired subset.
Any partition can be refined to an equitable one with controlled part count.
Proof idea: Each part S of P is split into m = ⌈1/ε⌉₊ equal-measure pieces using the intermediate value theorem for atomless measures. The resulting partition has at most n * m parts (where n = P.parts.card) and each part has measure within ε of the average 1/(n*m).
The intermediate value theorem for atomless measures (Sierpinski's theorem) is
proved above as exists_measurable_subset_of_measure.
The [NoAtoms μ] hypothesis ensures the measure has no atoms, which is
necessary for the existence of subsets with prescribed measure.