Hybrid Zeckendorf Arithmetic
Can a formally verified number system beat GMP?
GMP is the gold standard for arbitrary-precision arithmetic: decades of engineering, hand-optimized assembly, deployed everywhere from cryptography to computer algebra. Beating it at anything, ever, is not supposed to happen—and certainly not with a formally verified implementation.
Vladimir Veselov's Hybrid Zeckendorf paper proposes an alternative: instead of flat bit strings, decompose integers into a doubly-exponential weight hierarchy where each level carries a small Fibonacci-encoded coefficient. A million-bit number uses only ~11 levels. For sparse numbers (few nonzero Fibonacci indices), addition is O(K) where K is the total index count—versus GMP's O(N/64) scan of every machine word.
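The Fibonacci-encoded coefficients rest on Zeckendorf's theorem: every positive integer is a unique sum of non-consecutive Fibonacci numbers. A minimal greedy sketch of that decomposition (illustrative only, not the repository's actual encoder):

```rust
// Greedy Zeckendorf decomposition: repeatedly subtract the largest
// Fibonacci number that fits. The result is a sum of non-consecutive
// Fibonacci numbers, unique by Zeckendorf's theorem.
fn zeckendorf(mut n: u64) -> Vec<u64> {
    let mut fibs = vec![1u64, 2]; // F(2), F(3), ... (skip the duplicate 1)
    while *fibs.last().unwrap() <= n {
        let k = fibs.len();
        fibs.push(fibs[k - 1] + fibs[k - 2]);
    }
    let mut parts = Vec::new();
    for &f in fibs.iter().rev() {
        if f <= n {
            parts.push(f);
            n -= f;
        }
    }
    parts // e.g. zeckendorf(100) == [89, 8, 3]
}
```

The K in the O(K) addition bound counts exactly these Fibonacci indices, summed across all levels of the hierarchy.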
We formalized the complete theory in Lean 4 (68 theorems, zero sorry) and built a rigorous Rust benchmark suite with anti-hack constraints. The result: a 55x speedup over GMP on sparse addition at 1M bits. To our knowledge, this is the first empirically demonstrated case of a formally verified number representation outperforming GMP at any operation, at any scale, under fair benchmarking conditions.
Why this is Ruliology
Veselov's paper proposed the representation. Our Lean formalization proved its mathematical properties. The ruliology—the discovery that emerged through the verification process—changed how we understand the system.
Formalizing the normalization algorithm found a bug in the paper. Algorithm 3 specifies descending-level carry (reverse=True), but the carry semantics require ascending order. Our Lean proofs use independent carryAt calls with fuel; our Rust implementation uses bottom-up iteration. Both are correct despite the pseudocode error. This is exactly the kind of finding that only emerges when you force every claim through a type checker.
The closure operator result was unexpected. NucleusBridge proves that normalization is idempotent (normalizing a normal form returns itself), which sounds obvious but requires careful proof. The negative result—carryAt_not_pointwise_monotone—was a surprise: a carry at one level can decrease the coefficient at another level. This subtlety is invisible in informal reasoning.
The deepest ruliology finding: formal verification and computational advantage are not in tension. The structural properties that make the Lean proofs work (evaluation preservation, canonical uniqueness, carry soundness) are the same properties that make the computation fast. The proof is the optimization.
Lean 4 Formalization (HeytingLean/Bridge/Veselov/HybridZeckendorf/)
WeightSystem.lean ───── w(k+2) = w(k+1)², closed form, strict mono
HybridNumber.lean ──── core types, evaluation maps
FibIdentities.lean ─── Fibonacci identities, Zeckendorf helpers
Normalization.lean ─── normalize_sound, normalize_canonical, carry
Addition.lean ──────── addLazy_eval, add_correct
Multiplication.lean ── multiplyBinary_correct
DensityBounds.lean ─── active_levels_bound ≤ log₁₀₀₀(n) + 2
Uniqueness.lean ────── canonical_eval_injective
NucleusBridge.lean ─── normalize_is_closure_operator (idempotent)
DragStability.lean ─── transport stability
TransportCoherence ─── round-trip coherence
│
CAB Pipeline (Lean → C → Rust)
│
Rust Benchmark Suite (bench/hybrid_zeckendorf/, 2,415 lines)
weight.rs ─────── weight hierarchy (// Mirrors: WeightSystem)
zeckendorf.rs ─── Zeckendorf encoding (// Mirrors: FibIdentities)
flat_hybrid.rs ── HybridNumber types (// Mirrors: HybridNumber)
normalization.rs normalize + carry (// Mirrors: Normalization)
arithmetic.rs ─── add, multiply (// Mirrors: Addition/Multiplication)
sparse.rs ─────── sparse number construction + E4/E5/E7
density.rs ────── density measurement (// Mirrors: DensityBounds)
│
Experiments (7 total)
E1: modular exponentiation    │ E5: lazy accumulation
E2: polynomial multiplication │ E6: sparse modular exponentiation
E3: random density            │ E7: crossover determination
E4: sparse addition (CORE RESULT)

Sparse Addition: HZ vs GMP (E4)
100,000 bits (30 samples)
| ρ | K | HZ (ns) | GMP (ns) | Speedup |
|---|---|---|---|---|
| 10⁻⁵ | 1 | 48 | 784 | 15.2x |
| 10⁻⁴ | 14 | 112 | 840 | 7.0x |
| 10⁻³ | 144 | 272 | 912 | 3.4x |
| 10⁻² | 1,440 | 688 | 920 | 1.2x |
1,000,000 bits (30 samples)
| ρ | K | HZ (ns) | GMP (ns) | Speedup |
|---|---|---|---|---|
| 10⁻⁵ | 14 | 112 | 6,310 | 55.4x |
| 10⁻&sup4; | 144 | 272 | 6,192 | 22.8x |
| 10⁻³ | 1,440 | 784 | 6,088 | 8.2x |
| 10⁻² | 14,404 | 3,312 | 6,272 | 1.9x |
SCALING
Advantage grows with input size: 2.4–3.6x per 10x increase in N. GMP is flat at ~6,200 ns (scans all words). HZ scales only with K.
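The O(K) behavior follows from the lazy addition rule: concatenate per-level payloads and defer normalization. A schematic sketch — the LazyHybrid type and its field names here are illustrative stand-ins, not the bench's flat_hybrid.rs types:

```rust
// Illustrative lazy hybrid number: levels[l] holds the Fibonacci indices
// active at level l. Work is proportional to the total index count K,
// never to the bit-length of the operands.
#[derive(Clone, Debug, Default)]
struct LazyHybrid {
    levels: Vec<Vec<u32>>,
}

// Lazy addition = per-level concatenation, mirroring addLazy_eval:
// the (possibly non-canonical) result evaluates to the sum.
fn add_lazy(a: &LazyHybrid, b: &LazyHybrid) -> LazyHybrid {
    let n = a.levels.len().max(b.levels.len());
    let mut levels = vec![Vec::new(); n];
    for l in 0..n {
        if let Some(xs) = a.levels.get(l) { levels[l].extend_from_slice(xs); }
        if let Some(xs) = b.levels.get(l) { levels[l].extend_from_slice(xs); }
    }
    LazyHybrid { levels }
}
```

GMP, by contrast, must touch every one of the ~N/64 machine words regardless of how few bits are set, which is why its timing stays flat as density drops.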
CROSSOVER (E7)
ρ* ≈ 5.6×10⁻³ at 100k bits. ρ* ≈ 3.2×10⁻² at 1M bits. Advantage regime expands 6x with input size.
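As a sanity check (not the bench's E7 procedure), the 1M-bit crossover can be estimated from the E4 table by fitting t_HZ ≈ a + b·K through the extreme rows and solving against GMP's flat scan time; all constants below are read off the table above:

```rust
// Back-of-envelope crossover estimate from the 1M-bit E4 rows.
fn main() {
    let (k0, t0) = (14.0_f64, 112.0);   // ρ = 10⁻⁵ row
    let (k1, t1) = (14_404.0, 3_312.0); // ρ = 10⁻² row
    let b = (t1 - t0) / (k1 - k0);      // ns per Fibonacci index
    let a = t0 - b * k0;                // fixed overhead
    let t_gmp = 6_200.0;                // GMP's flat word-scan time at 1M bits
    let k_star = (t_gmp - a) / b;       // index count where HZ = GMP
    let rho_star = k_star / 1.0e6;
    println!("ρ* ≈ {rho_star:.1e}");    // same order as the measured 3.2×10⁻²
}
```

The linear fit lands within a small factor of the measured ρ*, consistent with HZ cost being overhead plus a per-index term.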
LAZY ACCUM (E5)
1000 sparse additions: 50–593x faster than eager. R5–R8 normalizer campaign resolved: split-update kernel passes S9 scaling gate at ρ=10⁻⁵ (32x→20x speedup, ratio 0.61). Normalization cost reduced 64x per pass vs original Vec kernel.
Normalization Campaign: R5–R8
Lazy addition is O(K), but the deferred normalization cost matters for materialized arithmetic (add + normalize). The R5–R8 campaign diagnosed and fixed the sparse normalizer’s scaling bottleneck through four rounds of hostile-audited implementation and measurement.
ρ = 10⁻⁵ (very sparse, 1M bits)
| Count | Speedup | norm (ns) | S9 |
|---|---|---|---|
| 10 | 32.2x | 1,232 | — |
| 100 | 29.3x | 13,488 | — |
| 500 | 23.9x | 92,272 | — |
| 1000 | 19.6x | 244,833 | PASS (0.61) |
Normalizer Kernel Evolution
| Kernel | norm (ns) | fuel | ns/pass |
|---|---|---|---|
| R5 Vec::insert | 244,762k | 5 | 48,952k |
| R7 BTreeMap | 45,847k | 9 | 5,094k |
| R8 split-update | 6,831k | 9 | 759k |
At ρ=10⁻⁴, count=1000. R8 is 64x cheaper per pass than Vec, 6.7x cheaper than BTreeMap.
RESOLVED
R8 split-update sorted Vec: binary-search existing entries for O(1) update, buffer new entries for single O(K+D) merge after each pass. Eliminates Vec::insert's O(K) per-element shifting while preserving cache-friendly sorted storage. S9 scaling gate passes at ρ=10⁻&sup5;.
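A sketch of that pattern as described — the types, names, and the duplicate-free assumption on new levels are ours, not normalization.rs:

```rust
// Split-update: in-place updates for existing levels via binary search,
// buffered appends for new levels, one merge per pass. Avoids Vec::insert's
// O(K) element shifting on every new entry.
// Assumes each new level appears at most once per update batch.
fn apply_updates(entries: &mut Vec<(u32, u64)>, updates: &[(u32, u64)]) {
    let mut fresh: Vec<(u32, u64)> = Vec::new();
    for &(level, delta) in updates {
        match entries.binary_search_by_key(&level, |e| e.0) {
            Ok(i) => entries[i].1 += delta,       // existing: O(log K) find, O(1) update
            Err(_) => fresh.push((level, delta)), // new: defer to the merge
        }
    }
    if !fresh.is_empty() {
        // One merge per pass; a two-pointer merge of the two sorted runs
        // would make this strictly O(K + D) — re-sorting keeps the sketch short.
        entries.extend(fresh);
        entries.sort_unstable_by_key(|e| e.0);
    }
}
```

The sorted Vec keeps storage contiguous and cache-friendly, which is where the 6.7x advantage over the pointer-chasing BTreeMap kernel comes from.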
OPEN QUESTION
Deferred merge prevents within-pass cascading, adding a 1.8x fuel tax at denser regimes (ρ=10⁻⁴). S9 fails there (ratio 0.24). Restoring partial cascading or a split-phase normalization algorithm could close this gap.
What is proved vs. what is measured
FORMALLY VERIFIED (Lean 4)
- • w(i+1) = 1000^(2^i) (weight closed form)
- • Normalization preserves evaluation
- • Canonical forms are unique (injective)
- • Lazy addition preserves value
- • Multiplication correct
- • Active levels ≤ log₁₀₀₀(n) + 2
- • Normalization is idempotent closure operator
- • Carry is NOT pointwise monotone (negative result)
EMPIRICALLY MEASURED (Rust bench)
- • Sparse addition 15–55x faster than GMP
- • Dense arithmetic 5–48x slower than GMP
- • Crossover density ρ* ≈ 5.6×10⁻³ – 3.2×10⁻²
- • Lazy accumulation 50–593x faster than eager
- • R5–R8 normalizer campaign: split-update kernel (64x cheaper per pass)
- • Fuel profiling: Vec vs BTreeMap vs split-update convergence dynamics
- • Materialized arithmetic: 32x at count=10, 20x at count=1000 (ρ=10⁻⁵)
- • End-to-end pipeline with deferred normalization
Weight System & Core Types
Super-exponential weight hierarchy w(0)=1, w(1)=1000, w(k+2)=w(k+1)². Closed-form: w(i+1) = 1000^(2^i). Strict monotonicity. Core HybridNumber and LazyHybridNumber types with evaluation maps. Fibonacci double identity and Zeckendorf encoding helpers.
Key Results
- • weight_closed (w(i+1) = 1000^(2^i))
- • weight_strict_mono (i < j ⟹ w(i) < w(j))
- • eval_fromNat (evaluation map correctness)
- • fib_double_identity, lazyEvalFib_append
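The recurrence and closed form can be checked directly for the first few levels — only levels up to w(4) = 1000⁸ fit in u128, so this is a sketch, not WeightSystem.lean or weight.rs:

```rust
// w(0)=1, w(1)=1000, w(k+2)=w(k+1)² — doubly exponential growth.
fn main() {
    let mut w: Vec<u128> = vec![1, 1000];
    for k in 0..3 {
        let prev = w[k + 1];
        w.push(prev * prev);
    }
    // Closed form w(i+1) = 1000^(2^i), for the levels that fit in u128.
    for i in 0..4u32 {
        assert_eq!(w[(i + 1) as usize], 1000u128.pow(2u32.pow(i)));
    }
    println!("{w:?}"); // strictly monotone: each level squares the last
}
```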
Normalization & Carry
The central algorithmic content: inter-level carry that transforms any lazy representation into canonical form. normalize_sound: normalization preserves evaluation. normalize_canonical: the result is unique. carryAt_preserves_eval: each carry step preserves value. Errata: paper's Algorithm 3 specifies descending carry, but the semantics require ascending order.
Key Results
- • normalize_sound (evaluation preservation)
- • normalize_canonical (unique canonical form)
- • carryAt_preserves_eval (carry soundness)
- • interCarry_idempotent_of_canonical
- • Errata: carry must be ascending, not descending
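The errata's ordering argument is visible even with plain positional digits as a stand-in for the Fibonacci-coefficient carry (the real carryAt rule differs; only the ascending-vs-descending point carries over):

```rust
// Bottom-up carry: overflow produced at level l lands at l+1 *before*
// l+1 is visited, so one ascending pass normalizes. A descending pass
// visits l+1 first and misses the carry it later receives — the bug in
// the paper's Algorithm 3 pseudocode.
fn carry_ascending(digits: &mut [u64], base: u64) {
    for l in 0..digits.len() - 1 {
        let c = digits[l] / base;
        digits[l] %= base;
        digits[l + 1] += c; // may itself overflow; handled when l+1 is visited
    }
}
```

With digits [99, 9, 0] in base 10, one ascending pass yields [9, 8, 1] (value 189 preserved); a single descending pass would leave digits[1] = 18 unnormalized.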
Arithmetic Operations
Lazy addition: concatenate payload vectors at each level without normalizing. addLazy_eval proves this preserves value. add_correct proves the normalized pipeline (lazy + normalize) is correct. Binary multiplication via shift-and-add. multiplyBinary_correct proves the multiplication algorithm correct.
Key Results
- • addLazy_eval (lazy addition preserves value)
- • add_correct (normalized addition preserves value)
- • multiplyBinary_correct
- • add_comm_repr (addition commutes at repr level)
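The shape of the verified multiplication, shown over machine integers — the actual multiplyBinary operates on hybrid representations, with doubling realized as a lazy self-addition:

```rust
// Double-and-add: scan the bits of b, doubling an accumulator of a.
// multiplyBinary_correct verifies this scheme over hybrid numbers;
// here u64 stands in for the hybrid type.
fn multiply_binary(a: u64, mut b: u64) -> u64 {
    let mut acc = 0u64;
    let mut shifted = a;
    while b > 0 {
        if b & 1 == 1 {
            acc += shifted; // lazy addition in the hybrid setting
        }
        shifted <<= 1; // doubling = shifted + shifted, also a lazy add
        b >>= 1;
    }
    acc
}
```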
Uniqueness, Density & Nucleus
canonical_eval_injective: canonical forms are unique — equal evaluation implies identical representation. active_levels_bound: support cardinality ≤ log₁₀₀₀(n) + 2. NucleusBridge: normalization is an idempotent closure operator (sheaf-glue extension). Subtle negative result: carryAt_not_pointwise_monotone — carry is NOT monotone per-level.
Key Results
- • canonical_eval_injective (unique representation)
- • active_levels_bound (≤ log₁₀₀₀(n) + 2 levels)
- • density_upper_bound
- • normalize_is_closure_operator (idempotent)
- • carryAt_not_pointwise_monotone (negative result)
Benchmark: Dense Regime (E1–E3)
Establishing the baseline: HZ loses in the dense regime, as expected. E1 (modular exponentiation): 5–48x slower than GMP. E2 (polynomial multiplication): millions of times slower. E3 (random integer density): ρ ≈ 0.27 for random numbers — dense. The proved level-count bound holds with significant slack. These results confirm the theory: HZ advantage requires sparse inputs.
Key Results
- • E1: modular exponentiation 5–48x slower than GMP
- • E2: polynomial multiplication millions of times slower
- • E3: random density ρ ≈ 0.27 (dense regime)
- • Level-count bound holds with slack
- • Confirms: HZ advantage requires sparse inputs
Benchmark: Sparse Regime (E4–E7)
The core positive result. E4 (sparse addition): 15.2x at 100k bits/ρ=10⁻⁵, 55.4x at 1M bits/ρ=10⁻⁵. GMP timing is flat (~6,200 ns at 1M) because it scans every word; HZ timing scales only with K. Scaling factor 2.4–3.6x per 10x input size. E5 (lazy accumulation): 50–593x faster than eager for 1000 sparse additions. E7 (crossover): ρ* ≈ 5.6×10⁻³ at 100k, ρ* ≈ 3.2×10⁻² at 1M — advantage regime expands 6x with input size.
Key Results
- • E4: 15.2x speedup at 100k/ρ=10⁻⁵, 55.4x at 1M/ρ=10⁻⁵
- • GMP flat at ~6,200 ns (scans all words); HZ scales with K only
- • Scaling factor 2.4–3.6x per 10x increase in N
- • E5: lazy accumulation 50–593x faster than eager
- • E7: crossover regime expands 6x from 100k to 1M bits
