Section 6 : quotients

A few years ago, I started to understand much better the two completely different ways in which mathematicians construct new objects. One way to define an object is to say what it is. The other way is to say what it does. Let’s take a look at some examples.

The natural numbers

Here is a “concrete” definition of the natural numbers. We can define 0 to be the empty set. We define 1 to be {0}, we define 2 to be {0,1} and so on. An axiom of mathematics tells us that we can put all these things together to make an infinite set {0,1,2,3,...}. One can define n+1 to be the union of n and {n}, and then prove the principle of mathematical induction for this concrete model of the natural numbers.

Another definition of the natural numbers, just as old, goes back to Peano. He defines them via postulates: 0 is a natural, the successor of a natural is a natural, and that’s it. By “that’s it” we mean that “this is the only way to define naturals”; more precisely we mean that if you want to prove a theorem about naturals, then it suffices to prove it for 0, and also to show that if you’ve proved it for n then you can prove it for n+1. In other words, we are stating without proof that the principle of induction holds.

The first definition of a natural is a “what it is” definition; the second is a “what it does” definition. To a mathematician it simply does not matter which approach we use. All that a mathematician cares about is that there is an object called ℕ in which we can do arithmetic and which satisfies the principle of mathematical induction (and, strictly speaking, also the principle of mathematical recursion). Whether induction is an axiom or a theorem is of no interest to us, just as it was of no interest to people like Gauss and Euler. Furthermore, anyone who believes that “secretly they are using the concrete model all the time” will have a big shock when it comes to formalising; there are copies of the natural numbers embedded in the integers, the rationals, the reals, the complexes, the quaternions, the octonions, and the p-adic numbers for all prime numbers p, and anyone who has seen the construction of a sensible concrete model for any of these mathematical ideas will know that, for example, zero is not actually the empty set in any of them; and yet it turns out that we still use induction and recursion on these copies of the naturals (for example, when adding them).

Products

Let X and Y be sets or types or whatever you want to call a “collection of abstract elements”. A crucial construction in mathematics is the product of X and Y, typically denoted X × Y. We are often told what looks like a concrete model for this object; an element of X × Y consists of an ordered pair of elements (x,y), with x an element of X and y an element of Y. But what is an ordered pair? Well, there are two approaches: we can either say what it is, or what it does.

Wikipedia explains three ways to set up the theory of ordered pairs in set theory (due to Wiener, Hausdorff and Kuratowski), and of course there are many more. I will not explain any of them here because clearly we do not care about “what it is” here, we only care about “what it does”, which is that two ordered pairs (x₁, y₁) and (x₂, y₂) are equal if and only if x₁ = y₁ and x₂ = y₂. What it is, is an uninteresting implementation issue rather than a mathematical one.

On the other hand, when it comes to X × Y, what it does is the following: there are projection maps to X and Y, and for any Z we know that to give a map from Z to X × Y is to give a map from Z to X and a map from Z to Y, with the dictionary given via composition with the projections. However, imagine developing the theory of linear maps on \(\mathbb{R}^2\) knowing only this; it’s really helpful to know what it is, namely ordered pairs of reals \((x,y)\). We conclude that sometimes mathematicians do use what it is, but sometimes they only use what it does. We also see that sometimes there is more than one choice for what it is, and in a situation where we only care about what it does, we really don’t care what it actually is.

There are many other situations in more advanced mathematics where we don’t care what it is, but only what it does: examples are tensor products of vector spaces (and more generally of modules), pullbacks of sheaves on topological spaces. Cohomology is a situation where we do sometimes care about what it is, however our definition of what it is depends on what we’re doing; if doing abstract homological algebra our model might involve injective objects, and if doing a calculation it might involve cycles and boundaries (possibly homogeneous, possibly inhomogeneous). Mathematicians are extremely good at switching between models at will, or using no model at all, depending on the situation.

I’ll now explain how quotients work in Lean, using a “what it does” approach.

Quotients

Here’s the set-up. We have X and an equivalence relation ≈ on X. Equivalence relations are some kind of generalisation of equality, and we want a new object Q where ≈ actually becomes equality; in other words we want a surjection ⟦.⟧ : X → Q and the theorem that x₁ ≈ x₂ ↔ ⟦x₁⟧ = ⟦x₂⟧. In Cambridge I learnt what Q is, namely the set of equivalence classes of X, and the function ⟦.⟧ is called “equivalence class of”. Thirty years later Patrick Massot pointed out to me that actually I never once used “what it is”, I only ever used “what it does”, and it did not take me long to realise that he was exactly right. Just like the other examples above, this construction of the quotient Q as a set of sets is not “the answer” but in fact just a model. The reason I am bringing this up is that Lean does not use this model. In fact Lean uses a “secret” model for Q – a type called quotient s where s is a term which bundles together the binary relation ≈ and the proof that it’s an equivalence relation. We cannot “inspect” the terms of this type and ask if they are subsets of X; it is an opaque construction. However we know what quotient s does because Lean provides the API to do it. In particular, the fact that ⟦.⟧ is a surjection and that x₁ ≈ x₂ ↔ ⟦x₁⟧ = ⟦x₂⟧ are available to us as theorems in Lean’s library.

In theory, we could prove everything about quotients using these two facts, however there is a construction in mathematics which is so common that Lean singles it out and gives us extra API for it. We finish by talking about it. In Lean it is called quotient.lift which is a great shame because mathematically it is a descent, not a lift. In mathematics it goes by the far more unwieldy name of “the way to show that a function defined from a quotient using a certain method is well-defined”.

What does it mean to be “well-defined”?

One thing which I’ve seen students struggle with over my decades of teaching is the concept of a function out of a quotient being “well-defined”. We want to define a function f from Q to some other type/set Z and the strategy is as follows. Take an element q of Q and pretend it’s an equivalence class. Now choose a random element of the equivalence class; this is now an element x of X. Using x, construct an element of Z. Now claim that this element of Z is “well-defined”, and decree that f(q) is this element. What does this well-definedness boil down to? It’s the idea that if we had chosen a different element y of the equivalence class, and used that instead of x to construct the element of Z, we would have got the same element.

I find this very confusing to write, so no wonder students find it hard to comprehend. It’s much easier to explain it in the following way. Our “recipe” to make an element of Z from x is just a function F : X → Z. The check that if we’d chosen a different element y in the equivalence class of q instead of x we’d get the same answer, is simply the assertion that F(y)=F(x). Finally, the fact that x and y are in the same equivalence class is just the assertion that x ≈ y.

The upshot of all this is that the mathematician’s “principle of defining a function and then checking it’s well-defined” boils down to finding a term of the following type:

∀ (F : X → Z) (h : ∀ (x y : X), x ≈ y → F x = F y), (Q → Z)

This term is called quotient.lift in Lean. You feed it F : X → Z and a proof h that F is constant on equivalence classes, and it descends F to a function Q → Z called quotient.lift F h.

The only question left is how to prove that (quotient.lift F h) (⟦x⟧) = F x, that is, that quotient.lift F h really is a descent of F down to Q. And in Lean, not only is this true, but the proof is refl.