My fork of https://github.com/leanprover-community/mathlib4. Currently working on computability theory, trying to formalize the classes P and NP and prove some facts about them. See my notes at https://pad.abstractnonsen.se/mathlib-computability for "documentation / progress" of what I'm doing.

Computability in Mathlib

:::success
The source code of this project can be found at git.abstractnonsen.se/max/mathlib4.
:::

As part of the course on Formalized Mathematics in Lean, taught by Floris van Doorn and Michael Rothgang at the University of Bonn in winter term 2025, we have to do a student project: formalizing some part of mathematics of our choice that is not yet in mathlib. I chose to work on computability, and these are my notes on what is and is not in mathlib, and on what could be formalized to make further progress. They serve both as my personal scratchpad and as a record of the overall progress of the project.

As my main goal, I would like to be able to prove Cook's theorem, which states that the Boolean SATisfiability problem is NP-complete. However, as of now, the class NP is not defined in mathlib.

State of the project

Results formalized

I focused mostly on Cook's theorem. The main results fully formalized are:

  • The Turing machine models TM0 and TM1 (which were in mathlib already) are polynomially equivalent
  • Definitions of typeclasses for Turing machines, making it possible to define P and NP, together with a proof that P is well-defined for polynomially equivalent Turing machine models.
  • (A version of) Cook's theorem: for any problem in NP and any instance I of that problem, there is a (polynomial-size) SAT instance that is feasible if and only if I is a yes-instance.
    • This is the "hard" step (as found in textbooks) towards proving that SAT is NP-complete.
    • The "easy" step is showing that this SAT instance is (polynomially) computable from the instance I, which is "obvious", but not easy to formalize without more powerful Turing machine models.

Overview of files / things done

I worked directly with (a fork of) mathlib, and edited only files in Mathlib/Computability. The following summarizes my changes by file (files are listed by logical / import order):

  • Encoding.lean: Defines an identity encoding
  • RelationIn.lean: This does not necessarily belong to the computability section. It contains generalizations of ReflTransGen and TransGen that can track the number of transitivity applications used to deduce a relation, as well as the standard lemmas about these (a small illustrative sketch of such a step-indexed relation follows after this file list)
  • TuringBase.lean: Uses the framework from RelationIn.lean to build a notion of execution from step functions σ → Option σ. The important definition is the notion of a step function f₂ : σ₂ → Option σ₂ respecting a step function f₁ : σ₁ → Option σ₁: there is a relation R : σ₁ → σ₂ → Prop describing "matching" states, and performing one step of f₁ can be modelled by performing a finite number of steps of f₂, yielding a matching pair of states again. Based on this, one can prove theorems about whole execution sequences being analogous. This is the foundation for the proofs of polynomial equivalence. A version of this code was already in mathlib, but it did not track execution steps (so no polynomial equivalence could be shown), since it worked with ReflTransGen. Using the variant ReflTransGenIn allowed me to generalize these results
  • TMComputableBase.lean: Here, the typeclass TM of general Turing machines is introduced. The typeclass depends on the input and output alphabet of the Turing machine, i.e. [TM τ Γ₀ Γ₁] means that elements (tm : τ) can be interpreted as Turing machines with input alphabet Γ₀ and output alphabet Γ₁. This was not present in mathlib before. Essentially, a type fulfills TM if it has input/output functions (from/to lists over some alphabet) as well as a step function on some state type (the second sketch after this list illustrates one possible shape of such a typeclass). Based on this, we can already define what it means for a Turing machine to produce certain output within a certain amount of time, or to compute a function. We can also define what it means for two Turing machines (on the same input/output alphabets) to be (polynomially) equivalent, and prove the important theorem PolynomiallyEquivalent_of_Respects, which yields that two Turing machines with a respecting step function (see TuringBase.lean) are polynomially equivalent.
  • TuringModel.lean: Since we want to speak about the collection of Turing machines "built in a similar fashion" but with arbitrary input and output alphabets, the class TuringModel is introduced. A TuringModel is of type Type ⊕ (Type × Type) → Type*: for any single type (or pair of types), it yields the type of Turing machines with the given input/output alphabet (specifying a single type means input alphabet = output alphabet). We also define what it means for two such models to be (polynomially) equivalent
  • PostTuringMachine.lean: Existed in mathlib before and contains the definitions of the Turing machine models TM0 and TM1. Theorems have been generalized to track execution steps and work towards polynomial equivalence.
  • TM0Computable.lean: This did not exist before; it adapts the existing definition of a FinTM2 (present in TMComputable.lean in mathlib) to TM0 and TM1 and defines the corresponding typeclass instances for TM and TModel. Here, the main theorem FinSupportTM0_polyEquiv_FinSupportTM1 is deduced, which states that these two models are polynomially equivalent
  • NP.lean: Defines P and gives a framework of verifier Turing machines to define NP. Proves that the definition of P is well-defined up to polynomial equivalence of the given model and applies this to the TM0 <-> TM1 equivalence.
  • Tape.lean: Added the notion of Support and matching extensionality lemmas.
  • TMCook.lean: Preparations for the proof of Cook's theorem: a variant of the TM0 model is defined in which execution can continue beyond halting by "doing nothing". This is done by introducing a designated HALT state. We want this model since it is easier to work with when we encode the execution semantics into a SAT instance.
    • Also, the notion of "outputs in at most n steps" now becomes "after n steps, the state is HALT and the tape reads the output", which avoids case distinctions on when exactly the Turing machine halts.
    • There is a natural projection from our Cfg' state to the existing Cfg state of the TM0 model, and we prove that this projection commutes with step / step' as well as with initialization / taking output, ultimately proving that the notion of computation agrees exactly between these models. We don't work with the TM typeclass here, since we want to stay "bare-hands" for the Cook proof.
    • There are several technical lemmas expressing the semantics of the step' function as a big quantification over simple logical expressions. These will be needed to prove the equivalence with the SAT instance and essentially mimic its clause structure already
    • An important detail of our model is that we track the position of the read/write head relative to its starting position. This allows us to talk about absolute indices on the tape: only one position of the tape changes at a time, and we retain the rest. In the existing TM0 model, we can only refer to tape positions relative to the current head position, so in some sense we would need to "update" all tape contents on every move instruction
  • Cook.lean: This contains the SAT definition used for Cook's theorem. We define how to construct truth assignments from existing executions, as well as a function to recover a state of execution from (part of) a truth assignment.
    • While the SAT instance is (of course) finite, we work with an infinite variable type and talk about "suitable" truth assignments. "Suitable" here means that the non-occurring variables (which are in principle irrelevant to the SAT instance itself) have good default values. Every truth assignment can be made into a suitable one while preserving whether (or not) it satisfies the SAT instance. Overall, this approach allows us to express lemmas more easily, since we avoid a lot of restrictions / assumptions that would otherwise be needed in definitions everywhere.
    • Ultimately, we prove Cook_SAT_equiv, stating that the given SAT instance is feasible if and only if the given instance belongs to the language.
    • There is a technical requirement v.time_ge_tape_size: the assumption that the time bound of the verifier is at least linear in the input size. We need this for the equivalence to hold. While it is not itself part of the definition of a verifier, every NP-language has such a verifier, obtained by taking an existing one and using a bigger time bound (since it is a bound and not an exact running time, this is fine). See TMVerifiesInPolyTime.normalize for a definition of such a normalization.
  • TMExamples.lean: This was me toying around with the library initially. It contains no interesting results and could be removed.
  • Other files only have minor modifications to them
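
To make the step-counting idea from RelationIn.lean and TuringBase.lean more concrete, here is a minimal sketch of what a step-indexed variant of Relation.ReflTransGen could look like. This is purely illustrative: ReflTransGenIn and ReflTransGenOf below are my guesses at a reasonable shape, not the actual definitions from those files.

```lean
import Mathlib

variable {α : Type*}

/-- Sketch only: a variant of `Relation.ReflTransGen` whose extra `ℕ` index counts
how many single steps of `r` were chained together. The actual `ReflTransGenIn`
in `RelationIn.lean` may be set up differently. -/
inductive ReflTransGenIn (r : α → α → Prop) (a : α) : ℕ → α → Prop
  | refl : ReflTransGenIn r a 0 a
  | tail {n : ℕ} {b c : α} :
      ReflTransGenIn r a n b → r b c → ReflTransGenIn r a (n + 1) c

/-- Forgetting the step count recovers the usual reflexive-transitive closure. -/
def ReflTransGenOf (r : α → α → Prop) (a c : α) : Prop :=
  ∃ n, ReflTransGenIn r a n c
```

Keeping the step count as an index is what makes it possible to bound the number of simulation steps, and hence to state polynomial equivalence later on.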
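
Similarly, the following is a rough sketch of the kind of typeclass described for TMComputableBase.lean above. All names here (TMSketch, State, init, step, output) are placeholders I made up for illustration; the real TM class differs in detail.

```lean
import Mathlib

universe u

/-- Sketch only: terms of `τ` can be read as Turing machines with input alphabet
`Γ₀` and output alphabet `Γ₁`. -/
class TMSketch (τ : Type*) (Γ₀ Γ₁ : Type*) where
  /-- the state (configuration) type of a machine `tm : τ` -/
  State : τ → Type u
  /-- encode an input word as an initial state -/
  init : (tm : τ) → List Γ₀ → State tm
  /-- one computation step; `none` means the machine has halted -/
  step : (tm : τ) → State tm → Option (State tm)
  /-- read the output word off a (halting) state -/
  output : (tm : τ) → State tm → List Γ₁
```

With an interface of this shape, "tm produces a given output within t steps" and "tm computes f" can be phrased uniformly for every machine model that provides an instance, which is what the P / NP definitions then build on.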

Future work

This could be done

  • Clean up code / do some code golfing / performance tweaking
  • Prove equivalence of TM1 and TM2 models
  • Introduce more powerful Turing machine models that resemble programming languages more closely. Prove their equivalence to the existing ones
  • Prove that SAT is in NP (by giving a certificate checker)
  • Prove that the SAT from Cook's theorem is computable by a turing machine (by giving one)

Use of resources

I made extensive use of

  • loogle.lean-lang.org
  • leansearch.net
  • mathlib4 documentation
  • grepping the mathlib4 repository for usage of certain definitions / lemmata
  • reading the existing mathlib4 code, mostly on computability
  • Regarding the mathematical foundations of Turing machines and computability
    • Combinatorial Optimization by [Korte, Vygen]
    • Introduction to the theory of computation by [Sipser]
    • Introduction to automata theory, languages and computation by [Hopcroft, Motwani, Ullman]
  • Asking questions to fellow students

Use of AI

AI has not been used to generate code for this project; all code is my own creation. AI has been used in the following ways:

  • Ask whether mathlib provides a certain "functionality", for example:
    • Me: In the lean4 programming language, is there short notation for Set.Icc ?
    • AI: Yes — Lean 4 does have short notation for Set.Icc, and it's exactly what you'd hope. The short notation [a, b]
  • "Searching" the mathlib4 library for relevant statemnts, for example
    • Me: In mathlib, there is List.product. Is there a similar version for the product of 3 lists?
    • AI: Short answer: there is no dedicated List.product₃ (or similar) in mathlib4, and that is by design.
    • Me: I am working with the type ENat. I have an (n : ENat), and I know n \neq \top. I would like to replace it by a cast of (m : Nat) everywhere. How do I do that?
    • AI: In ENat (the extended natural numbers), the key fact you want is: coeNat_ne_top and ne_top_iff_finite
  • Ask for the idiomatic pattern to handle something:
    • Me: I have a term involving Quotient.liftOn l f h. I would like to replace this by just f l. how do i do this?
    • AI: To rewrite Quotient.liftOn l f h as simply f l, you need to change the type of l: ➡️ Quotient.liftOn l f h only reduces to f x when l is actually Quotient.mk _ x.

:::warning
From here on, you will find my personal notes that documented the progress of the project. These are not "good documentation".
:::

Goal of the project

Even though I would like this to be "complete" eventually, realistically there is no such thing: the landscape of computability is much bigger than a 2-month student project, especially because mathlib is not very well-developed in this area right now. I want to get a picture of what is (and is not) in mathlib, and organize it to get a feeling for which chains of dependencies I could formalize to reach certain goals. Since I don't know how fast I will be and how much effort things will take (I have never done a Lean project on this scale before), the concrete outcome is still somewhat open.

What currently is in mathlib

Turing-Machine Models

TM0

  • works with one tape of symbols Γ, a set of labels Λ
    • Γ is inhabited: Tape is filled with default symbol "to infinity"
    • Λ is inhabited: default value is starting state
  • Instructions (type Stmt) are
    • move <Dir>
    • write <Symbol>
  • Machine / Program is Λ → Γ → Option (Λ × Instruction), halts when giving none
  • State (type Cfg) is
    • current label q : Λ
    • current tape Tape : Tape Γ
  • Output: (inclusive) right side of the tape with respect to the current position of machine
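
For reference, the core TM0 definitions in mathlib look roughly as follows. This is a lightly paraphrased sketch of the definitions in the namespace Turing.TM0 (in the PostTuringMachine file mentioned below), renamed into a scratch namespace; consult the mathlib source for the exact form (there, Λ is additionally required to be inhabited, the default label being the start state).

```lean
import Mathlib

open Turing

namespace TM0Sketch

/-- A TM0 instruction: move the head one cell, or write a symbol. -/
inductive Stmt (Γ : Type*) where
  | move : Dir → Stmt Γ
  | write : Γ → Stmt Γ

/-- A program: from the current label and the symbol under the head, produce the
next label and an instruction, or `none` to halt. -/
def Machine (Γ Λ : Type*) :=
  Λ → Γ → Option (Λ × Stmt Γ)

/-- A configuration: the current label together with the tape (which records the
head position and is padded with the default "blank" symbol on both sides). -/
structure Cfg (Γ : Type*) [Inhabited Γ] (Λ : Type*) where
  q : Λ
  Tape : Tape Γ

end TM0Sketch
```

The step function then looks up the program at the current label and read symbol, executes the returned instruction on the tape, and halts when the lookup returns none, matching the bullet points above.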

TM1

TM2

Finiteness

All TM models work with (potentially) infinite sets of instructions, i.e. infinite programs, which simplifies formalization. There is the notion of a subset S ⊂ Λ supporting the execution of a program, which means that execution starting inside S never leaves S. So, a finite TM can be recovered from a TM whose support is finite.
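
A hedged sketch of what such a support predicate amounts to (the real definitions are the Supports predicates of the various models in mathlib, e.g. Turing.TM1.Supports; the version below is my own paraphrase of the idea for TM0, not the literal code):

```lean
import Mathlib

open Turing

variable {Γ Λ : Type*} [Inhabited Γ] [Inhabited Λ]

/-- Sketch: a set `S` of labels supports the machine `M` if the default (starting)
label lies in `S` and no transition taken from a label in `S` leads out of `S`. -/
def SupportsSketch (M : TM0.Machine Γ Λ) (S : Set Λ) : Prop :=
  default ∈ S ∧ ∀ q a q' s, (q', s) ∈ M q a → q ∈ S → q' ∈ S
```

If such an S is finite, the machine only ever visits finitely many labels, which is what makes it possible to pass to a genuinely finite machine.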

Reductions between Turing-Machine Models

The following reductions are currently formalized:

  • TM1 <-> TM0 (with arbitrary but same alphabet)
  • TM1(Γ) → TM1(Bool) (i.e. we can switch alphabets)
  • TM2 → TM1

All of them carry over finite support.

Definition of computability

Currently, computability of functions is only defined with respect to the TM2 model. To be precise, there is the notion of a FinTM2 machine, which is a structure bundling a TM2 together with finiteness guarantees. Computability (possibly with a time bound) is then formalized as being achieved by some FinTM2.
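
As a toy illustration of the "within a time bound" part (not the actual FinTM2-based definitions from TMComputable.lean), halting within m steps can be phrased roughly like this; stepOr and HaltsInTime are made-up names for the sketch. The "step or stay put" trick is also the idea the TMCook.lean variant described earlier uses to let execution continue past halting.

```lean
import Mathlib

/-- Sketch only: one step of a machine with step function `f`, where `none` means
"halted", realised as "step, or stay put if already halted". -/
def stepOr {σ : Type*} (f : σ → Option σ) (s : σ) : σ :=
  (f s).getD s

/-- Sketch only: starting from `a`, the machine has halted in state `b` after at
most `m` steps. The actual definitions in mathlib's TMComputable.lean are phrased
differently, via bundled `FinTM2` machines and evaluation. -/
def HaltsInTime {σ : Type*} (f : σ → Option σ) (a b : σ) (m : ℕ) : Prop :=
  (stepOr f)^[m] a = b ∧ f b = none
```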

Questions about mathlib

Style

  • Lots of variable declarations, which sometimes make files hard to read. Is this good style?

What could be formalized:

Note that the goal is not necessarily to do all of this. I just wanted some overview of which things are missing (in terms of dependencies) to achieve certain main goals.

  • Reductions (should both be easy):
    • Reduction TM1 -> TM2 (the direction missing from the existing TM2 → TM1 reduction)
    • Reduction TM1(Bool) -> TM1(Γ), for Γ ≠ Unit
  • Notion of computability:
    • Define Computability with respect to TM0 model
      • Define FinTM0
      • Define time functions, computability similar to TM2 versions
    • Define Computability with respect to TM1 model
      • Define FinTM1
      • Define time functions, computability similar to TM2 versions
    • Show equivalence of these three computability models
      • This should be doable by just using the existing reductions
      • Can also be inferred as a corollary once we have the polynomial-time version of this
    • Show that equivalence is up to polynomial-time factor
      • Need to show this for each individual reduction
      • TM0 → TM1
        • Generalize tr_respects lemma to include bounds
      • TM1 → TM0
      • TM1(Γ) → TM1(Bool)
      • TM1(Bool) → TM1(Γ)
      • TM2 → TM1
      • TM1 → TM2
  • Well-behavedness of computability:
    • Show that the composition of (polynomial-time) computable functions is again (polynomial-time) computable
      • Maybe this is a project on its own, but it would be very useful and interesting (mathlib has a proof_wanted declaration specifically for this theorem; a toy example of that command is shown at the end of these notes)
      • Do this using the TM2 model.
      • If computability notion is developed as above, then this holds for TM0 and TM1 as well, so translate this.
  • Turing machines of type TM0 viewed with different input / output alphabets:
    • Want: This is analogous to the definition of TM2, where we require strict equality of the input / output alphabet
  • Finiteness of TMs
    • None of the TM models require finiteness. Instead, they work with the notion of a supported set of states, that is, a subset S ⊂ Λ of state labels together with the assertion that execution of the TM does not leave this set S (when starting in it). Then, there is an equivalence FinTM ~ TM with finite support.
    • Computability (in polynomial time or not) is defined with respect to bundled, finite TMs. The reductions are stated for general Turing machines and have accompanying lemmata showing that they translate finite-support machines to finite-support machines, meaning we can restrict each reduction using the equivalence from the previous point
    • Recover FinTM0 from TM0 with finite support
    • Recover FinTM1 from TM1 with finite support
    • Recover FinTM2 from TM2 with finite support
  • P / NP things
    • Define class P with respect to the three TM models and show equality
      • Define P with respect to any model
      • Show independence of model if models are polynomially equivalent
      • Specialize theorem for the three (six with finite / finite support distinction) models
        • This would require more equivalences of models; currently only the FinSupportTM0 <-> FinSupportTM1 equivalence is proven
    • Define NP class
    • Define NP-completeness
    • Prove P ⊂ NP
      • This is a bit annoying, since we need to explicitly give a TM and prove things about it. Probably a project on its own
    • Prove Cook's Theorem (Cook_SAT_equiv in Cook.lean)
      • Develop slightly different model for TM0, where "execution" can continue after halt (in TMCook.lean)
        • Define separate Cfg' type and step, run functions on it
        • Prove equivalence of "concrete" TM0Outputs predicate for this model, and general FinTM0 notion of output
        • Describe step' function in terms of SAT-like constraints
      • Define Encoding of SAT
      • Prove equivalence of TM-execution and valid SAT assignments
        • Recover description of TM from well-defined truth assignment
        • prove equivalence of subparts:
          • initialization
          • step
          • output
        • put together parts (should only be collecting results)
      • Deduce "raw" Cook theorem from equivalence
    • Further work: Define the language of satisfiable SAT-problems and prove that this language is NP-complete:
      • "raw" Cook theorem
      • show that SAT is in NP
        • This is essentially "trivial", but one needs to give an actual Turing machine checking a certificate (a valid truth assignment). This could be done, but is tedious with the current foundations in mathlib. It would be better to first develop more general machinery (equivalence of TM1 and TM2, but also a more powerful TM3 model which can handle numbers and computation naturally)
      • show that the reduction given in the "raw" Cook proof is computable and hence a polynomial transformation
        • Again, this needs a specific Turing machine and should be done at a later time
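
For reference (regarding the remark above that mathlib already has a proof_wanted declaration for the composition of polynomial-time computable functions): proof_wanted is a command from the Batteries library, available in mathlib, that records a desired statement without proving it. The statement below is only a toy example of the syntax, not the actual composition statement.

```lean
import Mathlib

-- Toy example of the `proof_wanted` command: the statement is elaborated and
-- type-checked, but no proof is given and nothing is added to the environment.
proof_wanted add_self_eq_two_mul (n : ℕ) : n + n = 2 * n
```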