Dependent Type theory¶
Introduction¶
Dependent type theory is a way to set up the foundations of mathematics. What does that even mean? Some mathematicians perhaps learn in a logic course that ZFC set theory is a way of setting up the foundations of mathematics; dependent type theory is a slightly different way. But many mathematicians do not even worry about foundations, and of course people like Gauss were perfectly capable of doing mathematics without any. What do we need from foundations? Let’s think about some basic undergraduate-level definitions and try and figure out what is going on.
Let’s assume that we have already defined the integers, with addition and multiplication and all the basic facts proved. To go from here to the rationals we need to take the field of fractions. Formally, we consider pairs \((a,b)\) of integers with \(b\not=0\), modulo the equivalence relation \((a,b)\sim(c,d)\iff ad=bc\) (that is, iff \(a/b=c/d\), the point being that this does not yet make sense), and we can define the rationals to be the equivalence classes. Then, either by taking either Dedekind cuts or considering Cauchy sequences of rationals modulo null sequences, we can construct the reals, and finally by taking pairs \((x,y)\) of reals (thought of as representing \(x+iy\)) and defining an appropriate multiplication so that \((0,1)\times(0,1)=(-1,0)\) we can build the complexes.
All of this could be formalised in ZFC set theory, but when looking at these constructions it’s clear that all we need is some sort of system strong enough to be able to formalise basic notions such as ordered pairs, infinite sequences, basic concepts like \(+\) and \(<\) being applied to number systems, and equivalence relations/classes. ZFC set theory certainly offers this, but dependent type theory (as implemented by the Lean theorem prover) does too.
Foundations: Set theory vs Type Theory¶
Set theory¶
Everything is a set in set theory, including all elements of all sets. All the interesting things mathematicians like to think about, like groups and functions and complex manifolds and categories are all somehow shoehorned into being sets of 4-tuples or whatever, satisfying some axioms. For example, when a mathematician says “let \(G\) be a group”, what they mean is that have four sets. The first set is the set \(G\) (this is an abuse of notation – \(G\) denotes both the group and the underlying set). The second set is a function \(G\times G\to G\) called “multiplication”, which we will shorten to mul. The third set is a function inv from \(G\) to \(G\), and the fourth set is an element of \(G\) called id. There are also various unwritten rules about what you can and can’t do with \(G\). For example, looking at elements of the set \(G\) would be considered perfectly fine, however looking at elements of the set id would not be “correct” somehow, because questions about the elements of id are not invariant under group isomorphism: an isomorphism of groups \(G_1\) and \(G_2\) involves bijecting the sets \(G_1\) and \(G_2\) but not bijecting the sets id1 and id2: the sets \(G\) and id are playing different roles. Mathematicians do not however need to formally make this distinction because there is an implicit “interface” – some kind of secret society agreement which mathematicians have, where they will only do things to the group \(G\) which are “isomorphism-invariant” in some way. Basically the rule is that you can look at elements of the set \(G\) and of the sets which are functions, but you mustn’t look at elements of the elments of \(G\); these should be deemed inaccessible. These sorts of issues apparently cause problems for computer programs which try and solve maths problems in ZFC – although I am no expert at this.
In ZFC set theory mathematicians are able to formalise definitions of new structures. For example they might want to let \(X\) be a set which is also a topological space and a group at the same time, and then they might want to build some more complex notions such as the notion of a sheaf on \(X\) and then perhaps pull back this sheaf from some map coming from the group structure. It is kind of tedious to check that all of this can be formalised in ZFC set theory, but those of us who care know it can be, and with each new concept comes a new secret promise about which sets within it we will not look into.
Dependent type theory¶
In Lean there are more kinds of things, giving us a bit more space. There are universes, terms and types. The group \(G\) above is a type, and the element id of \(G\) is a term. Everything has a type in Lean. For example id has type G, and G has type Type, and Type is a universe. Note that not everything is a type – for example id is not a type, and nothing has type id. If you want to think in terms of sets, you can think of x having type alpha as meaning that \(x\in\alpha\), and this distinction between types and terms means that if x is a term then there is no way of looking at its elements.
Things like the abstract concept of a group, or a function f from type alpha to type beta (by which I mean a function taking as input any term a of type alpha and giving as output a term f a of type beta) – all of these things are types. But a number like 3, which in set theory is commonly represented as \(\{0,1,2\}\), is just a term of type nat in type theory, it doesn’t have any elements.
Type theory is just the same – there are three axioms enabling the user to construct new types from old, namely the axioms asserting the existence of function types, inductive types, and quotient types. Most of the time mathematicians need to know nothing more than these three axioms. In fact these axioms are far more aligned to what mathematicians do in practice – if you are a working mathematician who knows something about set theory, you might want to consider the last time you actually needed the power set axiom or the axiom of replacement in your work.
– for example there is a universe called Type. There are types, and there are terms. Groups and functions and so on – these are all types. The abstract notion of a group is also a type. If you’re used to set theory,
types, : propositions, proofs, types, terms, and universes. Every object in type theory is one of these five kinds of things. Here are some
Here’s a Universes, everything is either a universe, a type, or a term; the analogue of set theory’s \(x\in A\) is type theory’s x : A, meaning that the term x has type A. Many of the axioms of set theory give ways of constructing new sets from old (for example the power set axiom, or the axiom of replacement). Type theory is just the same – there are three axioms enabling the user to construct new types from old, namely the axioms asserting the existence of function types, inductive types, and quotient types. Most of the time mathematicians need to know nothing more than these three axioms. In fact these axioms are far more aligned to what mathematicians do in practice – if you are a working mathematician who knows something about set theory, you might want to consider the last time you actually needed the power set axiom or the axiom of replacement in your work.
Making types in type theory¶
Function types¶
Let’s see these three type theory axioms being used in Lean. Given two types A and B it is possible to construct the type A → B of functions from A to B whose terms are functions f : A → B (note the use of the colon here, which is standard mathematical notation as well as being standard type theory notation: f has type A → B). The one big typographical difference between standard mathematics and type theory is that when defining terms of type A → B, that is, functions, type theorists use the λ notation as opposed to the \(\mapsto\) notation common in mathematics. For example, the squaring function \(\mathbb{N}\to\mathbb{N}\) is written \(x\mapsto x^2\) in mathematics and as λ x, x ^ 2 in type theory. In full, one would write
definition f : ℕ → ℕ := λ x, x ^ 2
One difference between how type theorists and mathematicians think of functions is something which can look a little bizaarre at first to mathematicians, namely the way functions from a product are treated. For example a mathematician would regard addition on the naturals as a function :math:`+ :\mathbb{N}\times\mathbb{N}\to\mathbb{N}` and this would be the "usual" way of writing this function. However if :math:`n\in\mathbb{N}` then :math:`x\mapsto n+x` is a function :math:`\mathbb{N}\to\mathbb{N}`, so one can think of addition as a function from :math:`\mathbb{N}` to the functions :math:`\mathbb{N}\to\mathbb{N}`, or in other words a function :math:`\mathbb{N}\to\left(\mathbb{N}\to\mathbb{N})`. Computer scientists even write this type as `ℕ → ℕ → ℕ` with the convention that this means `ℕ → (ℕ → ℕ)` -- one needs a convention here because the function arrow is not associative. Mathematicians sometimes write :math:`A^B` to mean the set of functions from :math:`B` to :math:`A`, and computer scientists are exploiting the fact that :math:`\left(A^B\right)^C=A^{B\times C}`. Let's define a function :math:`g` representing addition using this convention, and evaluate :math:`2+3` using this convention.
definition g : ℕ → ℕ → ℕ := λ x y, x + y
#eval g 2 3 -- lean prints 5
Lean prints 5 as the result of the evaluation of the function. Note that everything after the – is a comment.
Functions are actually a special case of a slightly more general type – a Pi type.
Structures¶
Let’s see how Lean defines products of sets (or rather, of types) and how ordered pairs work in dependent type theory. If A and B are types then we want the product type A × B to be a type with the property that to give a term of this type is the same as giving a term of type A and a term of type B. This is done in Lean using a structure.
structure prod (A : Type) (B : Type) :=
(fst : A) (snd : B)
notation A × B := prod A B
In words, this simply says that if A and B are types, then we define a new type prod A B by saying that the way to construct a term of this type is to give as input a term fst of type A and a term snd of type B. We then also define the usual notation for a product. That’s it. The usual definition of ordered pairs in set theory is slightly more painful than this – however this is not an argument for type theory over set theory; any mathematician who has dwelt on this knows that it does not matter that ordered pairs are messy to do in set theory, all that matters is the fact that if \((a_1,b_1)=(a_2,b_2)\) with \(a_1,a_2\in A\) and \(b_1,b_2\in B\) (or rather a₁ a₂ : A and b₁ b₂ : B as a type theorist would say) then \(a_1=a_2\) and \(b_1=b_2\). This lemma is called prod.mk.inj in Lean; the naming of basic lemmas like this follows a standard convention, which it is not hard to learn, and after a while one finds that one can guess the names of basic lemmas of this sort. One also needs access to the projection functions \(A\times B\to A\) and \(A\times B\to B\), and both set theory and type theory offer these; in type theory, if c : A × B then c.fst : A and c:snd : B are the two projections.
structure prod (A : Type) (B : Type) :=
(fst : A) (snd : B)
notation A × B := prod A B
definition c : \N \times \N := \<4,5\>
#eval c.fst -- 4
Quotient types¶
I talk about these in the chapter on equivalence relations.