The Soul and the Hands, Part III: Proof of Concept

Empirical evidence from Harmonic's Aristotle

1 February 2026

In Part I, I argued that AI safety requires attention to both intentions and capabilities. In Part II, I sketched how formal methods from operating systems and hardware verification might provide provable bounds on what AI agents can do.

A reader might reasonably ask: is this practical? Can AI systems actually generate the proofs we'd need?

I decided to find out. I submitted capability theorems to Harmonic's Aristotle, an AI system that generates machine-verified proofs in Lean 4. The results are instructive.


The Experiment

Aristotle is a theorem-proving AI developed by Harmonic. It achieved gold-medal performance on the 2025 International Mathematical Olympiad, with proofs that are machine-verified in Lean 4, not just answers in natural language.

The proofs Aristotle generates are checked by Lean's type-theoretic kernel, a small, trusted verifier that is independent of the AI. This matters: we don't need to trust Aristotle's reasoning, only that the kernel correctly implements the Calculus of Inductive Constructions. The proofs use Mathlib, the community-maintained library of formalised mathematics for Lean 4.

I submitted four theorems related to capability bounds:

  1. Capability delegation preserves bounds (natural language input)
  2. Capability monotonicity (Lean 4 with sorry placeholder)
  3. Capability composition (multi-agent bounds)
  4. No amplification (delegation chains)

All four theorems were proven. The proofs are short, correct, and illuminating.


Theorem 1: Delegation Preserves Bounds

Input (natural language):

Prove that capability delegation preserves bounds: If an agent A has capabilities that are a subset of set S, and agent A delegates some of its capabilities to agent B, then agent B's capabilities are also a subset of S.

Aristotle's response:

theorem capability_delegation_preserves_bounds
    {α : Type} (A B S : Set α)
    (hA : A ⊆ S) (hB : B ⊆ A) : B ⊆ S := by
  exact?

Aristotle correctly formalised the English description into a Lean 4 theorem statement. The exact? tactic is a proof search command that queries Mathlib for a matching lemma; in this case, it finds Set.Subset.trans, the transitivity of subset.
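
For concreteness, here is the same theorem with the lemma that exact? reports spliced in. This completed form is my reconstruction of what the search resolves to, not additional Aristotle output:

import Mathlib

theorem capability_delegation_preserves_bounds
    {α : Type} (A B S : Set α)
    (hA : A ⊆ S) (hB : B ⊆ A) : B ⊆ S := by
  exact Set.Subset.trans hB hA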

This demonstrates autoformalization: the ability to convert natural language into formal specifications. This matters because domain experts, whether security engineers, lawyers, or policy-makers, don't write Lean.


Theorem 2: Capability Monotonicity

Input (Lean 4 with placeholder):

theorem capability_monotonicity
    {α : Type*} (S A B : Set α)
    (h1 : A ⊆ S)   -- Agent has capabilities bounded by S
    (h2 : B ⊆ A)   -- Agent delegates subset to another
    : B ⊆ S := by  -- Delegated capabilities still bounded by S
  sorry

Aristotle's proof (complete output with headers):

/-
This file was generated by Aristotle.

Lean version: leanprover/lean4:v4.24.0
Mathlib version: f897ebcf72cd16f89ab4577d0c826cd14afaafc7
This project request had uuid: 1c73b9af-4f0c-47db-9604-1e52e30e624b

Co-authored-by: Aristotle (Harmonic) <aristotle-harmonic@harmonic.fun>
-/

theorem capability_monotonicity
    {α : Type*} (S A B : Set α)
    (h1 : A ⊆ S) (h2 : B ⊆ A) : B ⊆ S := by
  exact Set.Subset.trans h2 h1

One line of actual proof. Let me unpack what this line does.

Set.Subset.trans is a lemma from Mathlib stating that subset is transitive: given h2 : B ⊆ A and h1 : A ⊆ S, we can derive B ⊆ S. The exact tactic tells Lean to use this lemma directly as the proof.
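
For readers who want to see what the lemma amounts to, here is the same statement proved by unfolding ⊆ to its element-level definition. This is an illustrative alternative, not part of Aristotle's output:

import Mathlib

-- B ⊆ S means: every x in B is also in S. h2 puts x in A, then h1 puts it in S.
example {α : Type*} (S A B : Set α) (h1 : A ⊆ S) (h2 : B ⊆ A) : B ⊆ S :=
  fun x hx => h1 (h2 hx)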

Crucially, Aristotle didn't just generate this proof; Lean's kernel verified it. The kernel is a small, trusted piece of code (roughly 6,000 lines) that checks every proof step against the type theory's inference rules. If Aristotle had produced an incorrect proof, the kernel would reject it. This separation between proof generation (complex, AI-powered) and proof checking (simple, deterministic) is what makes the approach trustworthy.

This is the core security property of capability-based systems. CHERI calls it "reachable capability monotonicity." seL4 calls it "authority confinement." The mathematical content is the same: you cannot amplify authority through delegation.


Theorem 3: Capability Composition

Input:

theorem capability_composition
    {α : Type*} (S₁ S₂ : Set α) (C₁ C₂ : Set α)
    (h1 : C₁ ⊆ S₁)  -- Agent 1's capabilities bounded by S₁
    (h2 : C₂ ⊆ S₂)  -- Agent 2's capabilities bounded by S₂
    : C₁ ∪ C₂ ⊆ S₁ ∪ S₂ := by  -- Combined capabilities bounded by union
  sorry

Aristotle's proof:

theorem capability_composition
    {α : Type*} (S₁ S₂ : Set α) (C₁ C₂ : Set α)
    (h1 : C₁ ⊆ S₁) (h2 : C₂ ⊆ S₂)
    : C₁ ∪ C₂ ⊆ S₁ ∪ S₂ := by
  grind

The grind tactic is Lean 4's automated reasoning procedure for goals that can be solved by a combination of simplification, case splitting, and congruence closure. For set-theoretic facts like this one, it automatically derives the proof by reasoning about union and subset definitions.
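
One way to see what grind is deriving here is to close the same goal by hand. A minimal sketch, using the Mathlib lemma Set.union_subset_union (monotonicity of union with respect to ⊆):

import Mathlib

-- The same goal, closed with a specific lemma rather than automation.
example {α : Type*} (S₁ S₂ C₁ C₂ : Set α)
    (h1 : C₁ ⊆ S₁) (h2 : C₂ ⊆ S₂) : C₁ ∪ C₂ ⊆ S₁ ∪ S₂ :=
  Set.union_subset_union h1 h2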

This theorem matters for multi-agent systems. If Agent 1 has capabilities C₁ bounded by S₁, and Agent 2 has capabilities C₂ bounded by S₂, then their combined capabilities are bounded by S₁ ∪ S₂. No new capabilities emerge from composition.


Theorem 4: No Amplification

Input:

def CapabilitySystem (α : Type*) := Set α

theorem no_amplification {α : Type*}
    (initial : CapabilitySystem α)
    (bound : Set α)
    (h_bounded : initial ⊆ bound)
    (delegations : List (Set α))
    (h_valid : ∀ d ∈ delegations, d ⊆ initial)
    : ∀ d ∈ delegations, d ⊆ bound := by
  sorry

Aristotle's proof:

theorem no_amplification {α : Type*}
    (initial : CapabilitySystem α)
    (bound : Set α)
    (h_bounded : initial ⊆ bound)
    (delegations : List (Set α))
    (h_valid : ∀ d ∈ delegations, d ⊆ initial)
    : ∀ d ∈ delegations, d ⊆ bound := by
  intros d hd
  apply Set.Subset.trans (h_valid d hd) h_bounded

This is the most sophisticated of the four theorems. It models a sequence of delegations and proves that no delegation in the sequence can exceed the original bound. The proof introduces the delegation d, applies the validity hypothesis to get d ⊆ initial, then uses transitivity to conclude d ⊆ bound.
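
The same transitivity argument iterates over chains of any length. As an illustrative aside (not one of the submitted theorems), a three-step delegation chain looks like this:

import Mathlib

-- A delegates within S, B receives a subset of A, C receives a subset of B:
-- the end of the chain is still bounded by S, by chaining transitivity.
example {α : Type*} (S A B C : Set α)
    (hA : A ⊆ S) (hB : B ⊆ A) (hC : C ⊆ B) : C ⊆ S :=
  Set.Subset.trans hC (Set.Subset.trans hB hA)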


Why Trivial Proofs Are Good News

All four proofs reduce to transitivity of the subset relation. A sceptic might say: "These are trivial. Of course Aristotle can prove them."

But triviality is exactly what we want from security properties.

Consider seL4's integrity theorem. The statement is simple: if the security policy says process A cannot write to memory region X, then A cannot write to X. The proof is complex (thousands of lines in Isabelle/HOL), but the property itself is simple and auditable.

The capability theorems I submitted are foundational. They're the mathematical axioms on which more complex properties are built. We want these axioms to have short, obvious proofs. That makes them easy to audit, easy to trust, and cheap to re-verify whenever the system changes.

The complexity in a real capability system lives elsewhere: in the specification of what capabilities mean, in the mapping from abstract sets to concrete actions, in the runtime that enforces bounds. The mathematical layer should be boring. Aristotle has shown it can be.


The Harder Objection

A sceptic might raise a sharper objection: "This is just set theory. What does transitivity of ⊆ have to do with AI safety?"

The objection is fair, and I want to address it directly.

The connection is this: if we can model AI agent capabilities as sets, and if we can prove that our runtime correctly enforces set membership, then the abstract theorems translate to concrete guarantees. The proof B ⊆ S becomes "agent B cannot perform actions outside set S."
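
To make the translation concrete, here is a hypothetical instantiation: the element type is a small vocabulary of agent actions (the names webSearch, fileRead, and shellExec are illustrative, not an existing API), and the abstract theorem applies unchanged.

import Mathlib

-- A toy vocabulary of concrete agent actions (illustrative names only).
inductive Action where
  | webSearch
  | fileRead (path : String)
  | shellExec (cmd : String)

-- A bound S and an agent's capability set are just sets of actions, so
-- capability monotonicity specialises to them with no extra work.
example (S A B : Set Action) (h1 : A ⊆ S) (h2 : B ⊆ A) : B ⊆ S :=
  Set.Subset.trans h2 h1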

The gap is real. We don't yet have a formal model that maps these abstract sets onto the concrete actions an AI agent can take, nor a runtime that provably enforces set membership.

But this is precisely how seL4 and CHERI work. They connect abstract properties (capability monotonicity in a mathematical model) to concrete implementations (machine code, silicon). The abstract theorems are the specification; the engineering is in the refinement proof that connects specification to implementation.

What Aristotle demonstrates is that the specification layer is tractable. The refinement layer is the research problem that remains.

There's a deeper lesson here about the structure of verification. The proofs Aristotle generated are one-liners because the properties are simple. Transitivity of ⊆ is a foundational fact of set theory. But this simplicity is by design: we chose to model capability bounds using subset relations precisely because subset has good mathematical properties.

This is how embedded DSLs for verification work. You don't verify arbitrary programs; you design a restricted language whose programs are verifiable by construction.

Consider Lava, the hardware description language embedded in Haskell. In Lava, circuits are not arbitrary programs but structured compositions of primitive gates. The embedding in Haskell means that:

  1. Circuit descriptions are first-class values that can be manipulated, composed, and transformed
  2. The Haskell type system enforces structural invariants (you can't connect signals of incompatible widths)
  3. The same circuit description can be interpreted multiple ways: as a simulation, as VHDL output, or as input to a theorem prover

This multi-interpretation property is key. A Lava circuit isn't just code; it's a mathematical object that can be reasoned about formally. Properties like "this adder is correct" or "this pipeline has no timing hazards" can be stated and proven because the DSL restricts what can be expressed to what can be verified.

The same approach could work for AI capabilities. A capability specification embedded in Lean 4 would be:

  1. A first-class value that can be composed and delegated
  2. Type-checked for structural invariants (you can't grant capabilities you don't hold)
  3. Interpretable as both runtime enforcement logic and proof obligations

The proofs Aristotle generated are simple because the capability algebra is simple. That's not a limitation; it's the point. We're not trying to verify arbitrary agent behaviour. We're designing a restricted language where the security properties are structural, not emergent.
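
As a sketch of what such an embedding might look like (entirely hypothetical, not an existing library), a capability value could carry both its set-level meaning and an executable check, with a proof field forcing the two to agree:

import Mathlib

-- Hypothetical capability type: one value, two interpretations.
structure Cap (α : Type*) where
  allowed : Set α                              -- interpretation 1: proof obligations
  check   : α → Bool                           -- interpretation 2: runtime enforcement
  sound   : ∀ a, check a = true → a ∈ allowed  -- the two interpretations must agree

-- A capability that admits everything, as a minimal inhabitant of the type.
def Cap.top (α : Type*) : Cap α where
  allowed := Set.univ
  check   := fun _ => true
  sound   := fun a _ => Set.mem_univ a

Delegation would then be an operation that can only shrink allowed (and correspondingly tighten check), which is exactly the monotonicity property proven above.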


What This Proves (and What It Doesn't)

What we've demonstrated:

  1. AI can generate capability proofs. Aristotle produced correct, machine-verified proofs for all four theorems.
  2. Autoformalization works. Natural language descriptions can be correctly converted to formal specifications.
  3. The mathematical foundations are tractable. The core properties of capability systems have short proofs.
  4. Proof generation is practical. Aristotle returned results quickly, making interactive development feasible.

What remains unproven:

  1. Connection to real systems. These proofs operate on abstract sets. They don't yet connect to actual AI agent architectures.
  2. Capability vocabulary. We used Set α with no interpretation. A real system needs a concrete vocabulary: webSearch, fileWrite, codeExec.
  3. Runtime enforcement. Who checks the proofs? When? How does the runtime refuse actions that fail verification?
  4. Scaling. These theorems are simple. Will Aristotle handle the complexity of real capability specifications?

The Missing Middle

There's a gap between abstract mathematics and running systems:

┌─────────────────────┐           ┌─────────────────────┐
│  Abstract Proofs    │    ???    │ Runtime Enforcement │
│  B ⊆ A, A ⊆ S       │  ──────>  │ Block unsafe calls  │
│  ∴ B ⊆ S            │           │ in real systems     │
└─────────────────────┘           └─────────────────────┘

seL4 filled this gap for operating system kernels. They proved properties about C code, not just mathematical abstractions. The proof connects all the way from high-level specification to machine code.

CHERI filled it for hardware. The capability checks are implemented in silicon, verified against a formal model.

For AI agents, the gap remains. We can prove theorems about capability sets. We don't yet have a verified mapping from those sets to concrete agent actions, a runtime that refuses actions outside the proven bound, or tooling that generates and checks the required proofs automatically.

This is the engineering work that remains.


A Research Agenda

The Aristotle results suggest a concrete path forward:

Near-term (provably achievable):

1. Capability DSL. Design a domain-specific language for expressing AI agent capabilities, following the Lava pattern: embed the DSL in Lean 4 (or another dependently-typed host), so that capability specifications are first-class values with multiple interpretations. A single capability grant c : Cap {webSearch, fileRead "/tmp/*"} would be, at once, a first-class value that can be composed and delegated, a grant whose structural invariants are type-checked, and a term interpretable as both runtime enforcement logic and a proof obligation.

This mirrors how Lava circuit descriptions can be simulated, compiled to VHDL, or fed to theorem provers, all from the same source.

2. Proof obligation generator. Given an agent action and a capability grant, automatically generate the theorem that must be proven (e.g., action.required ⊆ granted). A sketch of what such a generated obligation might look like follows this list.

3. Certificate format. Define a standard format for capability proof certificates that can accompany agent actions.
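
To make item 2 concrete, here is a hypothetical shape for generated obligations; the AgentAction structure and its field names are illustrative, not an existing framework:

import Mathlib

-- An action declares the capabilities it needs; a grant is a set of capabilities.
structure AgentAction (α : Type*) where
  required : Set α

-- The generated theorem for one action against one grant is a plain subset goal,
-- exactly the kind of statement Aristotle discharged above.
def obligation {α : Type*} (act : AgentAction α) (granted : Set α) : Prop :=
  act.required ⊆ granted

-- Discharging an obligation is exactly proving a subset relation.
example {α : Type*} (act : AgentAction α) (granted : Set α)
    (h : act.required ⊆ granted) : obligation act granted := h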

A Concrete Starting Point

What would the simplest integration look like? Consider a LangChain agent with tool access:

# Current: runtime permission check
from langchain_core.tools import tool   # LangChain's tool decorator
import subprocess

@tool
def shell_command(cmd: str) -> str:
    if not user_approved(cmd):           # user_approved: application-supplied check
        raise PermissionError("Not approved")
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

# Proposed: capability-verified execution
@capability_bounded(requires={"shell"})  # hypothetical decorator, not an existing API
def shell_command(cmd: str) -> str:
    # Only callable if caller has shell capability
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

The decorator @capability_bounded(requires={"shell"}) generates a proof obligation: any call site must prove it holds the shell capability. The proof is generated at compile time (or by an AI prover at deployment time), checked by a trusted verifier, and cached. If the proof doesn't exist or doesn't verify, the call is rejected before execution.

This isn't science fiction. It's the pattern that Rust's borrow checker uses for memory safety, that effect systems use for I/O tracking, and that capability-secure operating systems use for authority control. The novelty is applying it to AI agent actions and using AI to generate the proofs.

Medium-term (research required):

4. Trusted verifier. Build a minimal, auditable verifier that checks proof certificates before action execution. The verifier must be simpler than the AI generating the proofs.

5. Capability vocabulary. Develop a practical vocabulary for AI agent capabilities that balances expressiveness with verifiability.

6. Integration with agent frameworks. Modify existing agent frameworks (LangChain, AutoGPT, etc.) to generate and check capability proofs. The concrete starting point above shows how this might look.

Long-term (open research):

7. Semantic capabilities. Extend beyond syntactic properties ("doesn't call shell_exec") to semantic ones ("doesn't acquire resources beyond X").

8. Hardware enforcement. Explore CHERI-like hardware support for AI capability bounds.

9. Compositional certification. Develop methods for certifying multi-agent systems from proofs about individual agents.


An Invitation

This essay is, in part, an invitation to the teams building AI theorem provers.

Harmonic has demonstrated that Aristotle can prove mathematical theorems at the level of IMO gold medallists. The capability theorems shown here are simpler than IMO problems. But they're the right theorems for AI safety: the foundational properties that capability systems must satisfy.

The natural next step is to test whether Aristotle can handle more realistic capability specifications. Can it prove properties of richer capability structures: concrete action vocabularies, delegation chains with attenuation, revocation, and semantic bounds on resource use?

These are harder than transitivity of ⊆. But they're the properties a practical capability system would need. If Aristotle (or systems like it) can prove them, the path from formal methods to deployed AI safety becomes clearer.

I'd welcome collaboration on designing these experiments.


From Possibility to Engineering

In Part I, I argued that we should verify capabilities, not just intentions. In Part II, I sketched how existing formal methods might apply. In this essay, I've shown that AI systems can already generate the proofs we need.

The question has shifted from "is this possible?" to "how do we build it?"

The mathematical layer is solved. Aristotle proves capability theorems correctly. The proofs are simple, auditable, and machine-verified.

What remains is engineering: connecting abstract proofs to concrete systems, building runtimes that enforce verified bounds, developing the tooling that makes capability verification practical.

This is not a small task. But it's a tractable one. We have AI systems that can generate machine-verified proofs, a mature proof assistant and mathematics library in Lean 4 and Mathlib, and decades of engineering precedent from seL4 and CHERI.

The soul matters. But now we know we can verify the hands.

The path from possibility to deployed safety runs through formal methods, embedded DSLs, and verified runtimes. It's the path that gave us trustworthy operating systems and secure hardware. The question is no longer whether AI can help us walk it, but how quickly we can move.

I'm grateful to Harmonic for making Aristotle available and for their pioneering work on AI-powered theorem proving. The Aristotle responses shown here were generated on 1 February 2026. The proofs are exactly as returned, with only formatting adjusted for readability.

This essay also builds on decades of work by the seL4 team at Data61/CSIRO, the CHERI project at Cambridge and SRI, and the Mathlib community. Their efforts to make formal verification practical laid the groundwork for what's now possible.

Co-authored-by: Aristotle (Harmonic) <aristotle-harmonic@harmonic.fun>


References

  1. Harmonic. "Aristotle: IMO-level Automated Theorem Proving." 2025. arXiv:2510.01346.
  2. Klein, G., et al. "seL4: Formal Verification of an OS Kernel." SOSP 2009.
  3. Watson, R., et al. "CHERI: Capability Hardware Enhanced RISC Instructions." University of Cambridge and SRI International.
  4. Dalrymple, D., et al. "Towards Guaranteed Safe AI." 2024. arXiv:2405.06624.
  5. ARIA. "Safeguarded AI Programme Thesis."
  6. Leijen, D. "Koka: Programming with Row-polymorphic Effect Types." 2014. arXiv:1406.2061.
  7. Bjesse, P., Claessen, K., Sheeran, M., and Singh, S. "Lava: Hardware Design in Haskell." ICFP 1998.