Human / AI Collaborative Theorem Proving
Context engineering for tool-mediated LLM assistance in formal mathematics. Transform probabilistic AI exploration into verified, auditable proof artifacts through structured interaction with Lean 4.
The LLM becomes a guided search process; Lean becomes the acceptance boundary; and audits turn correctness into a reviewable ledger.
Extending LeanDojo for Lean 4
Building on LeanDojo (Yang et al., NeurIPS 2023), our lean_ai_dojo toolkit provides LLMs with programmatic access to Lean 4 through a unified API combining automated tactic execution, retrieval-augmented premise selection, and proof state management.
Tool-Mediated Proof Development
The model does not "write a proof" as free-form text. It proposes tactic actions through structured tools, and every action is executed inside a verified Lean session.
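The loop above can be sketched in miniature. This is a hypothetical illustration, not the actual lean_ai_dojo API: the names TacticAction, TacticResult, and run_tactic, and the mock acceptance check, are all assumptions standing in for the real round-trip through a Lean session.

```python
from dataclasses import dataclass

@dataclass
class TacticAction:
    goal_id: int   # which open goal the tactic targets
    tactic: str    # tactic text proposed by the model, e.g. "simp [mul_comm]"

@dataclass
class TacticResult:
    success: bool
    remaining_goals: list  # goals still open after execution

def run_tactic(action: TacticAction, goals: list) -> TacticResult:
    """Execute a proposed tactic inside a (mocked) verified Lean session.

    In the real system this round-trips through Lean; the session, not the
    model, decides acceptance. Here acceptance is mocked for illustration.
    """
    accepted = action.tactic in {"rfl", "simp", "ring"}  # mock checker
    if accepted:
        remaining = [g for i, g in enumerate(goals) if i != action.goal_id]
    else:
        remaining = goals  # a rejected tactic leaves the proof state unchanged
    return TacticResult(accepted, remaining)

goals = ["R (R a) = R a"]
res = run_tactic(TacticAction(goal_id=0, tactic="rfl"), goals)
print(res.success, res.remaining_goals)
```

The key design point: the model only ever emits structured actions, and the proof state advances only when the verified session accepts one.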
Tiered Parallel Tactic Execution
Automated discharge of routine goals through tiered solver cascades with early termination. Within each tier, tactics run in parallel; the first success terminates remaining attempts.
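A minimal sketch of the cascade, assuming a stubbed try_tactic in place of real Lean execution; the tier contents shown are illustrative, not the system's actual tactic lists. Within a tier, attempts run concurrently and the first success cancels the rest; only if a whole tier fails does the next (more expensive) tier start.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

TIERS = [
    ["rfl", "trivial"],         # tier 1: cheap closers
    ["simp", "ring", "omega"],  # tier 2: decision procedures
    ["aesop", "exact?"],        # tier 3: search-based
]

def try_tactic(tactic: str, goal: str) -> bool:
    # stub: pretend only simp closes this goal
    return tactic == "simp"

def tiered_solve(goal: str):
    for tier in TIERS:
        with ThreadPoolExecutor(max_workers=len(tier)) as pool:
            futures = {pool.submit(try_tactic, t, goal): t for t in tier}
            pending = set(futures)
            while pending:
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                for f in done:
                    if f.result():
                        for p in pending:
                            p.cancel()  # early termination: first success wins
                        return futures[f]
    return None  # no tier discharged the goal

print(tiered_solve("R (R a) = R a"))  # → simp
```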
Beyond Lexical Similarity
The core challenge in proof retrieval: embeddings must capture semantic structure, not just textual similarity. A proof about commutativity should be close to related algebraic lemmas, even with different variable names and surface syntax. We evaluate embeddings by their ability to retrieve actual dependencies—not just visually similar proofs.
Lean Expr-derived AST contexts. Full type information preserved. Path contexts from value tree with type fallback.
Canonicalized operator-level view. Variables normalized to VAR, module prefixes stripped. Captures computational "shape" invariant to naming.
C function AST from Lean-emitted code. Validates that semantic structure survives compilation to executable form.
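The canonicalized operator-level view described above can be sketched as follows. The tokenizer and the prefix-stripping rule here are assumptions for illustration, not the actual implementation; the point is that alpha-variant statements collapse to one "shape".

```python
import re

def strip_module_prefix(name: str) -> str:
    # "HeytingLean.LoF.Reentry" -> "Reentry"
    return name.rsplit(".", 1)[-1]

def lambda_view(term: str, bound_vars: set) -> str:
    # crude tokenizer: dotted identifiers or single non-space characters
    tokens = re.findall(r"[A-Za-z_][\w.]*|\S", term)
    out = []
    for tok in tokens:
        if tok in bound_vars:
            out.append("VAR")                    # normalize bound variables
        elif re.match(r"[A-Za-z_]", tok):
            out.append(strip_module_prefix(tok)) # strip module prefixes
        else:
            out.append(tok)
    return " ".join(out)

# Two alpha-variant statements canonicalize to the same operator-level shape.
a = lambda_view("Nucleus.R (Nucleus.R x) = Nucleus.R x", {"x"})
b = lambda_view("Nucleus.R (Nucleus.R y) = Nucleus.R y", {"y"})
print(a == b, a)  # → True R ( R VAR ) = R VAR
```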
Embeddings are evaluated by dependency retrieval: can the embedding of theorem T retrieve the lemmas that T actually uses in its proof? This tests genuine semantic understanding, not surface-level pattern matching. We measure MRR (Mean Reciprocal Rank) and hit@k across the full HeytingLean corpus.
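The two metrics are standard and easy to state precisely. The toy data below is illustrative, not drawn from the HeytingLean corpus: each query pairs a ranked retrieval list with the set of lemmas the proof actually uses.

```python
def mrr(ranked: list, relevant: set) -> float:
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / i  # reciprocal rank of the first true dependency
    return 0.0

def hit_at_k(ranked: list, relevant: set, k: int) -> float:
    return 1.0 if set(ranked[:k]) & relevant else 0.0

# query -> (ranked retrievals, true proof dependencies); toy data
queries = {
    "Reentry.idempotent": (["mul_comm", "Eq.mpr", "rfl"], {"Eq.mpr", "rfl"}),
    "Reentry.mono":       (["le_trans", "mul_comm"],      {"le_refl"}),
}
n = len(queries)
mean_mrr = sum(mrr(r, rel) for r, rel in queries.values()) / n
mean_h1  = sum(hit_at_k(r, rel, 1) for r, rel in queries.values()) / n
print(round(mean_mrr, 2), mean_h1)  # → 0.25 0.0
```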
Two declarations are R-equivalent if they have the same lambda_view canonical form. This implements a computable quotient: orbit members share representations in the "R-quotiented" embedding space.
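The quotient is computable because the canonical form is: group declarations by their lambda_view form and the orbits are the fibers of that map. The canonical() lookup below is a stand-in for the real lambda_view computation, and the declarations are toy examples.

```python
from collections import defaultdict

def canonical(decl: str) -> str:
    # assumption: precomputed canonical forms, keyed per declaration
    forms = {
        "Reentry.idempotent": "R ( R VAR ) = R VAR",
        "Nucleus.idem":       "R ( R VAR ) = R VAR",   # alpha-variant
        "Reentry.mono":       "VAR <= VAR -> R VAR <= R VAR",
    }
    return forms[decl]

def quotient(decls):
    orbits = defaultdict(list)
    for d in decls:
        orbits[canonical(d)].append(d)  # same canonical form => same orbit
    return dict(orbits)

orbits = quotient(["Reentry.idempotent", "Nucleus.idem", "Reentry.mono"])
for form, members in orbits.items():
    print(len(members), form)
```

In the embedding space, every member of an orbit shares the representation computed from the canonical form, which is what "R-quotiented" means operationally.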
Vector similarity alone misses exact matches. We fuse dense (FAISS) with sparse (BM25) retrieval using Reciprocal Rank Fusion, with OOD warnings when lexical coverage is low.
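Reciprocal Rank Fusion itself is a few lines; the rankings and the coverage threshold below are toy assumptions (k=60 is the conventional RRF constant). A document high in either the dense or the sparse list scores well, and exact lexical matches surfaced by BM25 are no longer lost.

```python
def rrf(rankings: list, k: int = 60) -> dict:
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

dense  = ["lemma_A", "lemma_B", "lemma_C"]  # vector-similarity order
sparse = ["lemma_B", "lemma_D"]             # BM25 order; exact-match hits

fused = rrf([dense, sparse])
print(next(iter(fused)))  # lemma_B leads: it ranks high in both lists

# OOD heuristic: warn when few query tokens appear in the sparse index at all.
def lexical_coverage(query_tokens, index_vocab):
    hits = sum(1 for t in query_tokens if t in index_vocab)
    return hits / max(len(query_tokens), 1)

cov = lexical_coverage(["nucleus", "frobnicate"], {"nucleus", "lemma"})
if cov < 0.6:  # threshold is an illustrative assumption
    print("OOD warning: lexical coverage", cov)
```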
Verification Boundaries
Because model output is probabilistic, guardrails are non-negotiable. If something is proven through this system, it is proven in a way that can be audited, replayed, and independently verified.
Proof Extraction for ML
Extract proof structures into machine-readable formats for training, similarity search, and visualization. The proof2vec pipeline supports AST, telescope, and Lambda Core representations.
{
"name": "Reentry.idempotent",
"module": "HeytingLean.LoF.Nucleus",
"type": { "node": "forallE", ... },
"value": { "node": "lam", ... },
"lambda_core_type_tree": { ... },
"lambda_core_value_tree": { ... },
"consts_type": ["Reentry", "PrimaryAlgebra"],
"consts_value": ["rfl", "Eq.mpr"],
"type_pretty": "... (R : Reentry α) ...",
"node_count_type": 42,
"node_count_value": 156
}

{
"decl": "Reentry.idempotent",
"binders": [
{ "binder_info": "implicit",
"name": "α",
"type_pretty": "Type u" },
{ "binder_info": "instImplicit",
"name": "inst",
"type_pretty": "PrimaryAlgebra α" },
...
],
"target_pretty": "R (R a) = R a",
"is_prop": true
}Proof Latent Space
Diagrams as cognitive leverage. The Heyting Viz dashboard displays Lean declarations embedded in 2D/3D latent space via UMAP, enabling humans to identify structural patterns—trust bottlenecks, reuse opportunities, isolated clusters—invisible in raw source code.
