Apoth3osis
<_RESEARCH/MOLECULAR_COMPLEXITY

Assembly Theory Program

VERIFIED3 PHASES10 MODULES0 SORRY4 MENTAT CERTSLean 4 + C + Rust

The Central Question

Can molecular complexity be rigorously measured by the minimum number of construction steps required to build a molecule from basic building blocks? Assembly Theory proposes exactly this: the assembly index — the shortest path through an assembly space — as a universal complexity measure. Objects with high assembly index are overwhelmingly unlikely to arise without selection, making this a candidate biosignature detectable by mass spectrometry alone.

Original Researchers

Stuart Walker
Researcher
Assembly Theory mathematical foundations. Syntax tree models, DAG reuse semantics, quotient monotonicity.
Lee Cronin
Regius Professor of Chemistry
University of Glasgow. Pioneer of Assembly Theory as a framework for detecting selection and evolution in molecular systems via mass spectrometry.

School of Chemistry, University of Glasgow. Nature 2023, 2024.

Scientific Context

Assembly Theory addresses a fundamental question in astrobiology and origin-of-life research: how can we distinguish molecules produced by living systems from those arising through random chemistry? The key insight is that complex molecules require many construction steps, and the probability of assembling them decreases exponentially with complexity — unless a selection process is at work.

What formalization reveals: The assembly index equals the DAG join count — the number of distinct intermediate products in the most efficient construction. This is proved as assemblyIndex_eq_dagJoinCount in Lean 4. The tight bounds log₂(size) ≤ AI ≤ size - 1 are proved as dagJoinCount_bounds. Quotient monotonicity ensures the measure is well-behaved under symmetry.

What we built: A complete 10-module Lean 4 formalization plus verified C and safe Rust transpilations. The formalization extends the core theory to hypergraphs, molecular bond graphs, copy number selection, and string permutation spaces — covering the full breadth of the Assembly Theory program.

Layer 1: The Mathematical Ground Truth

An assembly space is a triple (Ω, U, J) where Ω is a set of objects, U ⊆ Ω is the set of primitives (basic building blocks), and J: Ω × Ω → Ω is a binary join operation. Every object is either a primitive or constructed by joining two sub-objects:

  • Obj.Base(a): a primitive atom from the universe U
  • Obj.Join(l, r): the join of two sub-objects
  • size(o): number of primitive leaves in the syntax tree
  • joinCount(o): number of internal join nodes (without reuse)

The assembly index is defined as the minimum number of distinct join operations needed to construct the object — tracking which intermediates have already been built:

  • assemblyIndex(o) = dagJoinCount(o) — count distinct joins in the DAG
  • Lower bound: log₂(size(o)) ≤ assemblyIndex(o)
  • Upper bound: assemblyIndex(o) ≤ size(o) - 1
  • Primitive iff zero: assemblyIndex(o) = 0 ↔ o ∈ U

AssemblySpace.lean — Obj inductive type, size, joinCount, structural lemmas

AssemblyIndex.lean — dagJoinCount via Finset tracking, assemblyIndex_eq_dagJoinCount

AssemblyBounds.lean — dagJoinCount_bounds: log₂ lower bound, size-1 upper bound

AssemblyQuotient.lean — quotient_assemblyIndex_le: monotonicity under equivalence

Layer 2: Extensions & Applications

The formalization extends Assembly Theory to several applied domains:

MOLECULAR SPACE

Molecules as bond graphs with atom types and bond multiplicities. The assembly index of a molecule measures the minimum number of distinct bond-forming operations needed to synthesize it — directly measurable by tandem mass spectrometry.

HYPERGRAPH SPACE

Extension from binary join to k-ary hyperedge join. Models chemical reactions with multiple reactants. Preserves all core bounds and monotonicity results.

COPY NUMBER SELECTION

Copy number constraints model biological replication: how many copies of each intermediate are available at each step. Introduces resource-bounded assembly complexity.

STRING PERMUTATION SPACE

Assembly over strings (polymers, DNA sequences). Permutation equivalence classes capture rotational/reflective symmetry. Assembly index on strings models sequence complexity.

Project Phases

PHASE 1VERIFIEDLean 4 Formal Proofs

Complete formalization of Assembly Theory: 10 modules covering assembly spaces, paths, index computation, tight bounds, quotient monotonicity, molecular semantics, hypergraph extension, copy number selection, and string permutation spaces. All theorems proved without sorry/admit.

  • assemblyIndex_eq_dagJoinCount: AI = DAG join count
  • dagJoinCount_bounds: log₂(size) ≤ AI ≤ size - 1
  • quotient_assemblyIndex_le: monotonicity under equivalence
  • size_eq_joinCount_add_one: structural identity
PHASE 2VERIFIEDC & Rust Transpilation

Faithful translation to verified C (gcc -Wall -Wextra -Werror: 0 warnings per unit) and safe Rust (cargo test: 16/16 pass). Syntax tree model, DAG join count with hash-table reuse tracking, and bounds verification functions.

  • gcc -std=c11 -Wall -Wextra -Werror: PASS (per unit)
  • cargo build: 0 warnings
  • cargo test: 16/16 pass
  • Hash-based reuse tracking matches Lean HashSet semantics
PHASE 3PENDINGIPFS Permanent Storage

All artifacts content-addressed and pinned to IPFS. Lean proofs, verified C, safe Rust archives with CIDv1 identifiers. Immutable, globally retrievable via any IPFS gateway.

  • Lean archive: 10 source files (23KB)
  • C archive: 3 translation units (2.6KB)
  • Rust archive: 5 source files + Cargo.toml (2.6KB)
  • SHA-256 content hashes for all archives

MENTAT Contribution Certificates

MENTAT-CA-002|MCR-AT-001
2024-01-01

MENTAT Contribution Record

IDEA

Conceptual Contribution

CONTRIBUTION LEVEL: IDEA

Ontological Engineer

Assembly Theory: Quantifying Molecular Complexity via Shortest Paths

Contributor

Stuart Walker, Lee Cronin

University of Glasgow

Core conceptual insight: the complexity of a molecule (or any combinatorial object) can be quantified by the minimum number of joining operations needed to construct it from a set of basic building blocks, counting reuse. This assembly index (AI) captures selection and evolution — objects with high AI are statistically unlikely without selection processes, making AI a candidate biosignature for astrobiology.

MENTAT · Mesh-Encrypted Network for Trusted Autonomous TransactionsImmutable · Content-Addressed · Tamper-Proof
MENTAT-CA-002|MCR-AT-002
2024-01-01

MENTAT Contribution Record

THEORY

Mathematical Foundation

CONTRIBUTION LEVEL: THEORY

Ontological Engineer

Assembly Spaces as Syntax Tree Models with DAG Reuse Semantics

Contributor

Stuart Walker, Lee Cronin

University of Glasgow

Full mathematical framework: (1) Objects as binary syntax trees with primitive leaves, (2) Assembly paths as sequences of join operations, (3) DAG join count as the reuse-aware complexity measure equivalent to assembly index, (4) Quotient spaces preserving monotonicity under symmetry, (5) Extension to hypergraphs, molecular spaces with bond graphs, and string permutation spaces.

Builds Upon

MCR-AT-001
MENTAT · Mesh-Encrypted Network for Trusted Autonomous TransactionsImmutable · Content-Addressed · Tamper-Proof
MENTAT-CA-002|MCR-AT-003
2026-03-13

MENTAT Contribution Record

PROOF

Formally Verified

CONTRIBUTION LEVEL: PROOF

Lean 4 Formalization — 10 Modules, Complete Assembly Theory

Contributor

Apoth3osis Labs

R&D Division

Machine-checked Lean 4 formalization of the full Assembly Theory framework. 10 modules covering assembly spaces, paths, index computation, tight bounds, quotient monotonicity, molecular semantics, hypergraph extension, copy number selection, and string permutation spaces. All theorems proved without sorry/admit. Key results: assemblyIndex_eq_dagJoinCount, dagJoinCount_bounds (log₂ lower, size-1 upper), quotient_assemblyIndex_le.

Builds Upon

MCR-AT-001MCR-AT-002
MENTAT · Mesh-Encrypted Network for Trusted Autonomous TransactionsImmutable · Content-Addressed · Tamper-Proof
MENTAT-CA-002|MCR-AT-004
2026-03-13

MENTAT Contribution Record

KERNEL

Computationally Verified

CONTRIBUTION LEVEL: KERNEL

Assembly Theory Verified Kernel — C & Rust Transpilation

Contributor

Apoth3osis Labs

R&D Division

Verified C and safe Rust transpilations of the core Assembly Theory modules. C: gcc -std=c11 -Wall -Wextra -Werror, 0 warnings on each translation unit. Rust: cargo build 0 warnings, cargo test 16/16 pass. Assembly space (Obj type), assembly index (DAG join count with hash tracking), and bounds verification (both upper and lower bounds) faithfully translated.

Builds Upon

MCR-AT-003
MENTAT · Mesh-Encrypted Network for Trusted Autonomous TransactionsImmutable · Content-Addressed · Tamper-Proof

Governed by MENTAT-CA-001 v1.0 · March 2026