https://1lab.dev/1Lab.intro.html
Introduction to HoTT
1Lab
---------------------------------------------------------------------
* Introduction
* Type theory
* Functions
* Universes
+ Universes, internally
+ Universes and size issues
* Interlude: Basics of Paths
* Sums
* Dependent products
* What next?
+ Paths, in detail
+ Equivalences
+ Univalence
---------------------------------------------------------------------
Written by Amelia Liao, Jonathan Coates, Julius Marozas, Lautaro
Garavano, Naim Favier, Reed Mullanix and kzvi
module 1Lab.intro where
Introduction
This page aims to present a first introduction to cubical type
theory, from the perspective of a mathematician who has heard about
type theory but has no previous familiarity with it. Specifically,
the kind of mathematician that we are appealing to is one who is
familiar with some of the ideas in category theory and homotopy
theory -- however, the text also presents the concepts syntactically,
in a way that can be read without any prior mathematical knowledge.
Note that whenever code appears, all of the identifiers are
hyperlinked to their definitions, which are embedded in articles that
describe the concepts involved. For instance, in the code block
below, you can click the symbol ≡ to be taken to the page on path
types.
_ : 2 + 2 ≡ 4
_ = refl
For some operations, we use the built-in Agda definitions, which do
not have explanations attached. This is the case for the + operation
above. However, in any case, those definitions are either built-ins
(in which case only the type matters), or defined as functions, in
which case the implementation is visible. The most important
built-ins for Cubical Agda, and those most likely to lead you to a
built-in Agda page, have their own explanations, linked below:
* The 1Lab.Path page explains the primitives I, i0, i1, _∧_, _∨_,
~, PathP, Partial, _[_↦_], and hcomp.
* The 1Lab.Univalence page explains the primitives Glue, glue,
unglue.
With that out of the way, the first thing to do is the aforementioned
refresher on type theory. If you're already familiar with type
theory, feel free to skip to What next?. If you're not familiar
with some of the connections that HoTT has to other areas of
mathematics, like higher category theory and topology, I recommend
that you read the type theory section anyway!
Aside: A note on terminology.
The text below refers to the foundational system that it is
presenting as "homotopy type theory", while it would be technically
more correct to refer to it as a homotopy type theory. HoTT is not a
specific type theory, but a family of type theories that put emphasis
on the homotopy-theoretic interpretation of type theory. In
particular, a homotopy type theory is expected to validate univalence
and have some class of higher inductive types.
Below, we describe cubical type theory, which is one particular
homotopy type theory. In cubical type theory, we interpret types as
the Kan complices in the category of De Morgan cubical sets,
$\mathrm{PSh}(\square)$. This interpretation gives rise to a model
structure on the category of cubical sets, referred to as a
"cubical-type model structure".
Type theory
Warning: The text below is subject to change! It's still very much a
document in flux. In particular, the type theory section doesn't talk
about induction, which is something that will be expanded on in the
future.
Type theory is a foundational system for mathematics which, in
contrast to a set-theoretic foundation like ZFC, formalises
mathematical constructions rather than mathematical proofs. That is,
instead of specifying which logical deductions are valid, and then
giving a set of axioms which characterise the behaviour of
mathematical objects, a type-theoretic foundation specifies directly
which mathematical constructions are valid.
Formally speaking, it is impossible to construct objects in set
theoretic foundations; Rather, one can, by applying the deduction
rules of the logical system and appealing to the axioms of the set
theory, prove that an object with the desired properties exists. As
an example, if we have a way of referring to two objects $x$ and $y$,
we can prove that there exists a set representing the ordered pair
$(x, y)$, using the axiom of pairing. Writing quasi-formally, we
could even say that this set is constructed using the expression
$\{x, \{x, y\}\}$, but this is only a notational convenience.
In contrast, a type-theoretic foundational system, such as HoTT (what
we use here), specifies directly how to construct mathematical
objects, and proving a theorem becomes a special case of constructing
an object. To specify what constructions are well-behaved, we sort
the mathematical objects into boxes called types. Just as, in set
theory, the word "set" is a primitive notion, in type theory, the
word "type" is a primitive notion. However, we can give an intuitive
explanation: "A type is a thing that things can be". For instance,
things can be natural numbers, but things can not be "two" -- hence,
the natural numbers are a type, but two is not. Formally, a type is
specified by giving:
* A formation rule, which tells us which expressions denote valid
types at all. For instance, the formation rule for product types
tells us that $\mathbb{N} \times \mathbb{N}$ denotes a well-formed
type, but "$\mathbb{N} \times \to$" does not. The formation rules
also have the duty of preventing us from running into size issues
like Russell's paradox, by constraining the use of universes.
* A handful of introduction rules, which tell us how to piece
previously-constructed objects together to form an object of a
different type. Sticking with products as an example, if we have
previously constructed objects called $a$ and $b$, we can
construct an ordered pair $(a, b)$.
Since the introduction rules are inseparable from their types, it
is impossible to construct an object without knowing in which
type it lives. If we used the rule which introduces ordered
pairs, we know for sure that we built an object of the product
type. To specify that an object lives in a certain type, we use
the notation $a : A$ -- which can be read as "$a$ inhabits $A$",
or "$a$ is a point of $A$", or "$a$ has type $A$".
The notation for stating inhabitation agrees with mathematical
parlance when both make sense. For example, a function from
the reals to the reals is commonly introduced with the notation
$f : \mathbb{R} \to \mathbb{R}$. In type theory, this is made
formal as a typing declaration: $f$ is a point of the type of
functions $\mathbb{R} \to \mathbb{R}$, hence we know that the rules
of function types can be used with $f$. However, when introducing
a more complex object, we're forced to break into informality and
conventional shorthand: "Let $G$ be a group". In HoTT, we can
directly write $G : \mathrm{Group}$^1, which tells us
that the rules for the type of groups apply to $G$.
* A handful of elimination rules, which say how inhabitants of that
type can be manipulated. Keeping with the running example, the
product type has elimination rules corresponding to projecting
either the first or second component of a pair.
Furthermore, each elimination rule comes with specified behaviour
for when it "interacts" with the introduction rules, a process
which we generally call reduction. Repeatedly applying these
"interactions" tells us how a given construction can have
computational behaviour -- by performing a delicate dance of
elimination and introduction, we can go from $2 + 2$ to $4$.
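This dance can be seen concretely in Agda. Below is a minimal sketch, assuming the pair constructor _,_, the projection fst, the product type _×_ and the path constructor refl from the 1Lab prelude: the elimination rule meets the introduction rule, and the application computes away.

```agda
p : Nat × Nat
p = (2 , 3)       -- introduction rule: pairing

-- elimination (fst) meets introduction (_,_):
-- fst (2 , 3) reduces to 2, so refl proves the equation.
_ : fst p ≡ 2
_ = refl
```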
Now that we have an intuitive motivation for the notions of type
theory, we can more precisely discuss the structure of formal type
theory and how it differs from formal (material) set theory. The
first (and perhaps most important) difference is that formal set
theory is a "two layered" system, but type theory only has "one
layer". In (for example) ZFC, we have the deductive framework of
first-order logic (the "first layer"), and, using the language of
that logical system, we assert the axioms of set theory. This remains
valid if we switch set theories: For instance, IZF would substitute
first-order logic with first-order intuitionistic logic, and remove
the assertion that the axiom of choice holds.
In contrast, type theory only has one layer: The layer of types. What
in formal set theory would be a deduction in FOL, in type theory
becomes the construction of an inhabitant of a certain type, where
propositions are identified with certain types as in the table below.
Note that up to a first approximation, we can read each type former
as denoting a specific set, or as denoting a specific space.
| Types                   | Logic                        | Dependencies                          | Spaces                           |
| ----------------------- | ---------------------------- | ------------------------------- | -------------------------------- |
| $A$                     | proposition                  | set                             | space                            |
| $a : A$                 | proof                        | element                         | point                            |
| $B(x)$                  | predicate                    | family of sets                  | fibration                        |
| $b(x) : B(x)$           | conditional proof            | family of elements              | section                          |
| $\bot, \top$            | $\bot, \top$                 | $\emptyset, \{\emptyset\}$      | $\emptyset, \{*\}$               |
| $A \times B$            | $A \land B$                  | Cartesian product               | product space                    |
| $A \uplus B$            | $A \lor B$                   | disjoint union                  | coproduct space                  |
| $A \to B$               | $A \to B$                    | function set                    | function space                   |
| $\prod_{(x : A)} B(x)$  | $\forall (x : A)\, B(x)$     | n-ary product                   | space of sections                |
| $\sum_{(x : A)} B(x)$   | $\exists (x : A)\, B(x)$     | n-ary disjoint union            | total space                      |
| $x \equiv_A y$          | equality $x = y$             | diagonal $\{(x, x) : x \in A\}$ | path space $A^{\mathbb{I}}$      |
This segues nicely into another difference between type theory and
set theory, which concerns the setup of their deductive systems. A
deductive system is a specification for how to derive judgements
using rules. We can think of the deductive system as a game of sorts,
where the judgements are to be thought of as board positions, and the
rules specify the valid moves.
In the deductive system of first-order logic, there is only one kind
of judgement, stating that a proposition has a proof. If $A$ is a
proposition, then the judgement "$A$ has a proof" is written
"$\vdash A$". Note that the judgement $\vdash A$ is a part of the
metatheory, and the proposition $A$ is a part of the theory itself.
That is, $\vdash A$ is an external notion, and $A$ is an internal
notion.
In type theory, the basic judgement is inhabitation, written
$\vdash a : A$. If $A$ is a type that denotes a proposition, then
this is analogous to the judgement $\vdash A$ of first-order logic,
and we refer to $a$ as a witness of $A$, or simply as a proof of $A$.
Even better, if $A$ is a purely logical proposition such that
$\vdash A$ holds without any appeal to the principle of excluded
middle, then there is a term $a$ for which $\vdash a : A$ holds
in type theory.
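Concretely, under this reading a proof really is a construction. For instance, a witness of the purely logical proposition "$A$ and $B$ implies $A$" is simply the first projection -- a minimal sketch, assuming the product type _×_ and fst from the 1Lab prelude:

```agda
-- A proof of "A and B implies A" is a construction:
-- given a pair, project out its first component.
∧-elim : {A B : Type} → A × B → A
∧-elim = fst
```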
When $A$ is being thought of as a collection of things, we might
imagine that $a : A$ is analogous to $a \in A$. However, this
is not the case: While $a \in A$ is a proposition (an internal
statement), $a : A$ is a judgement, an external statement. That
is, we can not use $a : A$ with the propositional connectives
mentioned in the table above -- we can not make statements of the form
"if it is not the case that $a : A$, then $b : B$". This is
the second difference we will mention between set theory and type
theory: The proposition $a \in A$ is an internal relation between
previously-constructed terms $a$ and $A$, but the external judgement
$\vdash a : A$ is indivisible, and it does not make sense to
talk about $a$ without also bringing up that it's in $A$.
The third and final difference is the treatment of equality, which is
arguably also the most important part of homotopy type theory. In set
theory, given sets $a$ and $b$, we have a proposition $a = b$,
which is given meaning by the axiom of extensionality: $a = b$
whenever $a$ and $b$ behave the same with respect to $\in$.
Correspondingly, in type theory, we have an equality type, written
$a \equiv_A b$ (where $a, b : A$), giving us the
internal notion of equality. But we also have an external notion of
equality, which encodes the aforementioned interactions between
introduction and elimination rules. This is written
$\vdash a = b : A$, and it's meant to indicate that $a$ and $b$ are
the same by definition -- hence we call it definitional equality.
Since definitional equality is a judgement, it's also not subject to
the internal propositional connectives -- we can not prove, disprove,
assume or negate it when working inside type theory, for it does not
make sense to say "if $a$ and $b$ are equal by definition, then [...]".
Correspondingly, in the Agda formalisation, definitional equality is
invisible, since it's represented by the computation rules of the
type theory.
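To make this invisibility tangible: assuming addition on Nat computes by recursion on its first argument (as in the 1Lab prelude), $0 + n$ is definitionally $n$, so the internal equality can be proved by refl with no induction at all; $n + 0 \equiv n$, by contrast, would require an inductive proof.

```agda
-- 0 + n reduces to n by the computation rule of _+_,
-- so the definitional equality is invisible and refl checks.
_ : (n : Nat) → 0 + n ≡ n
_ = λ n → refl
```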
In the rest of this section, we will recall the presentation of the
most ubiquitous types. The idea is that if you've encountered type
theory before, then skimming the explanation here should help snap
your mind back into focus, while if this is your first time
interacting with it, then the content here should be enough for you
to understand the rest of the 1lab development.
Functions
If we have two previously-constructed types $A$ and $B$, we can form
the type of functions from $A$ to $B$, written $A \to B$. Often,
functions will also be referred to as maps. A function is,
intuitively, a rule prescribing how to obtain an inhabitant of BBB
given an inhabitant of AAA. In type theory, this is not only an
intuition, but rather a definition. This is in contrast with set
theory, where functions are defined to be relations satisfying a
certain property.
Given a function $f : A \to B$, we can apply it to an
inhabitant $a : A$, resulting in an inhabitant $f\ a$ of the
codomain $B$, called the value of $f$ at $a$. For clarity, we
sometimes write $f(a)$ for the value of $f$ at $a$, but in
Agda, the parentheses are not necessary.
The most general form for constructing an inhabitant of $A \to B$
is called lambda abstraction, and it looks like
$(\lambda (x : A).\ e)$, though shorthands exist. In this
construction, the subexpression $e$ denotes the body of a function
definition, and $x$ is the parameter. You can, and should, think of
the expression $\lambda x.\ e$ as being the same as the notation
$x \mapsto e$ often used for specifying a map. Indeed, Agda supports
definition of functions using a more traditional, "rule-like"
notation, as shorthand for using $\lambda$ abstraction. For instance,
the following definitions of inhabitants f and g of
$\mathrm{Nat} \to \mathrm{Nat}$ are definitionally the same:
f : Nat → Nat
f x = x * 3      -- Function definition
                 -- with a "rule".

g : Nat → Nat
g = λ x → x * 3  -- Function definition
                 -- with λ abstraction
In general, we refer to a way of constructing an inhabitant of a type
as an introduction rule, and to a way of consuming an inhabitant of a
type as an elimination rule. Concisely, the introduction rule for
$A \to B$ is $\lambda$ abstraction, and the elimination rule is
function application. A general principle of type theory is that
whenever an elimination rule "meets" an introduction rule, we must
describe how they interact.
In the case of $\lambda$-abstraction meeting function application,
this interaction is called $\beta$-reduction^2, and it tells us how
to compute a function application. This is the usual rule for
applying functions defined by rules: The value of
$(\lambda x.\ e)\ e'$ is $e$, but with $e'$ replacing $x$ whenever
$x$ appears in $e$. Using the notation that you'd find in a
typical^3 type theory paper, we can concisely state this as:
$$(\lambda x.\ e)\ e' \longrightarrow e[e'/x]$$
In addition, function types enjoy a definitional uniqueness rule,
which says "any function is a $\lambda$ expression". Symbolically,
this means that if we have $f : A \to B$, the terms $f$ and
$(\lambda x.\ f\ x)$ are equal, definitionally. This
is called $\eta$-equality, and when applied right-to-left as a
reduction rule, $\eta$-reduction.
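Since $\eta$-equality is definitional, it too is invisible to the type-checker: the path between a function and its $\eta$-expansion can simply be written refl. A minimal sketch, assuming the 1Lab path type:

```agda
-- f and λ x → f x are definitionally equal by η, so refl suffices.
η-example : (f : Nat → Nat) → f ≡ (λ x → f x)
η-example f = refl
```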
Functions of multiple arguments are defined by iterating the unary
function type. Since function types are types themselves, we can
define functions whose codomain is another function type -- where we
abbreviate $A \to (B \to C)$ by $A \to B \to C$. By the
uniqueness rule for function types, we have that any inhabitant of
this type is going to be of the form
$\lambda x.\ \lambda y.\ f\ x\ y$, which we abbreviate by
$\lambda x\ y.\ f\ x\ y$. In Agda, we can write this as λ x y → ...,
or by mentioning multiple variables in the left-hand side of a rule,
as shown below.
f : Nat → Nat → Nat
f x y = x + 3 * y
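Because the iterated function type associates to the right, applying f to a single argument is already meaningful: partial application yields a unary function. A sketch reusing the f above (the name g' is ours):

```agda
g' : Nat → Nat
g' = f 2          -- partial application: g' y is 2 + 3 * y

_ : g' 1 ≡ 5      -- 2 + 3 * 1 computes to 5
_ = refl
```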
Universes
Instead of jumping right into the syntactic definition (and
motivation) for universes, I'm going to take a longer route, through
topos theory and eventually higher topos theory, which gives a
meaning explanation for the idea of universes by generalising from a
commonly-understood idea: the correspondence between predicates and
subsets. Initially, we work entirely in the category of sets,
assuming excluded middle for simplicity. Then we pass to an arbitrary
elementary topos, and finally to an arbitrary higher topos. If you
don't want to read about categories, click here.
First, let's talk about subsets. I swear I'm going somewhere with
this, so bear with me. We know that subsets correspond to predicates,
but what even is a subset?
* If we regard sets as material collections of objects, equipped
with a membership relation $\in$, then $S$ is a subset of $T$ if
every element of $S$ is also an element of $T$. However, this
notion does not carry over directly to structural set theory
(where sets are essentially opaque objects of a category
satisfying some properties), or indeed to type theory (where
there is no membership relation) -- but we can salvage it.
Instead of thinking of a subset directly as a subcollection, we
define a subset $S \subseteq T$ to be an injective function
$f : S \hookrightarrow T$. Note that the emphasis is on
the function and not on the index set $S$, since we can have many
distinct injective functions with the same domains and codomains
(exercise: give two distinct subsets of $\mathbb{N}$ with the same
domain^4).
To see that this definition is a generalisation of the material
definition in terms of $\in$, we note that any injective function
$f : S \hookrightarrow T$ can be restricted to a bijective function
$f : S \xrightarrow{\sim} \mathrm{im}(f)$, by swapping the original
codomain for the image of the function. Since the image is a subset
of the codomain, we can interpret any injective
$f : S \hookrightarrow T$ as expressing a particular way that $S$
can be regarded as a subset of $T$.
* Alternatively, we can think of a subset as being a predicate on
the larger set $T$, which holds of a given element $x$ when $x$
is in the subset $S$. Rather than directly thinking of a
predicate as a formula with a free variable, we think of it as a
characteristic function $\chi : T \to \{0,1\}$,
from which we can recover the subset $S$ as
$\{ x \in T : \chi(x) = 1 \}$.
This definition works as-is in structural set theory (where the
existence of solution sets for equations -- called "equalisers" --
is assumed), but in a more general setting (one without excluded
middle) like an arbitrary elementary topos, or indeed type
theory, we would replace the 2-point set $\{0,1\}$ by an
object of propositions, normally denoted $\Omega$.
These two definitions are both appealing in their own right -- the
"subsets are injective functions" notion can be generalised to
arbitrary categories, by replacing the notion of "injective function"
with "monomorphism", while "subsets are predicates" is a very clean
notion conceptually. Fortunately, it is well-known that these notions
coincide; for the purposes of eventually generalising to universes,
I'd like to re-explain this correspondence, in a very categorical
way.
Consider some sets $A, B$ and an arbitrary function $f : A \to B$
between them. We define the fibre of $f$ over $y$ to be the set
$f^*(y) = \{ x \in A : f(x) = y \}$, that is, the set of points of
the domain that $f$ maps to $y$ -- in the context of set theory, this
would be referred to as the preimage of $y$, but we're about to
generalise beyond sets, so the more "topological" terminology will be
useful. We can often characterise properties of $f$ by looking at all
of its fibres. What will be useful here is the observation that the
fibres of an injective function have at most one element.
More precisely, let $f : A \hookrightarrow B$ be injective, and fix
a point $y \in B$ so we can consider the fibre $f^*(y)$ over $y$. If
we have two inhabitants $a, b \in f^*(y)$, then we know (by the
definition of fibre) that $f(a) = y = f(b)$, hence by injectivity of
$f$, we have $a = b$ -- if $f^*(y)$ has any points at all, then it has
exactly one. It turns out that having at most one point in every
fibre precisely characterises the injective functions: Suppose
$f(a) = f(b)$; then we have two inhabitants of $f^*(f(a))$, namely
$a$ and $b$, but since we assumed that $f^*(y)$ has at most one
inhabitant for any $y$, we have $a = b$, so $f$ is injective.
We call a set (type) $X$ with at most one element (at most one point)
a subsingleton, since it is a subset of the singleton set (singleton
type), or a proposition, since a choice of point $x \in X$
doesn't give you any more information than merely telling you "$X$ is
inhabited" would. Since there is a 1-to-1 correspondence between
subsingletons $X$ and propositions "$X$ is inhabited", the assignment
of fibres $y \mapsto f^*(y)$ defines a predicate
$\chi_f : B \to \Omega$, where we have implicitly translated
between a subsingleton and its corresponding proposition.^5
Conversely, if we have such an assignment $\chi : B \to \Omega$, we
can make a map into $B$ with subsingleton fibres by carefully
choosing the domain. We take the domain to be the disjoint union of
the family $\chi$, that is, the set $\sum_{(x : B)} \chi(x)$. The
elements of this set can be described, moving closer to
type-theoretic language, as pairs $(i, p)$ where $i \in B$ and
$p \in \chi(i)$ is a witness that $\chi(i)$ holds.
I claim: the map $\pi_1 : (\sum_{(x : B)} \chi(x)) \to B$ which
takes $(i, p) \mapsto i$ has subsingleton fibres. This is because an
element of $\pi_1^*(y)$ is described as a pair $(i, q)$, and since
it's in the fibre over $y$, we have $i = y$; since $\chi$ (by
assumption) takes values in subsingletons, concluding that $i = y$
is enough to establish that any two $(i, q)$ and
$(i', q') \in \pi_1^*(y)$ must be equal.
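The fibre construction from the preceding paragraphs can be written down directly in type theory, using the $\Sigma$-type appearing in the table above. A sketch (the 1Lab defines an equivalent notion under the name fibre; the primed name here marks ours as illustrative):

```agda
-- The fibre of f over y: the points of A that f sends to y.
fibre′ : {A B : Type} → (A → B) → B → Type
fibre′ {A} f y = Σ A (λ x → f x ≡ y)
```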
The point of the entire preceding discussion is that the notions of
"injections into $B$" and "maps $B \to \Omega$" are equivalent, and,
more importantly, this is enough to characterise the object
$\Omega$^6.
Let me break up this paragraph and re-emphasise, since it'll be the
key idea later: A type of propositions is precisely a classifier for
maps with propositional fibres.
Since the subobjects of $B$ are defined to be injections $X \to B$,
and maps $B \to \Omega$ represent these subobjects (in a coherent
way), we call $\Omega$ a subobject classifier, following in the
tradition of "classifying objects". A subobject classifier is great
to have, because it means you can talk about logical propositions --
and, in the presence of functions, predicates -- as if they were
points of a type.
Aside: More on general classifying objects.
A classifying object for some data is an object -- normally written
$B(\text{data})$ -- such that maps $X \to B(\text{data})$ correspond
to data "over" $X$, in the sense that the data admits some map into
$X$. Above, we were discussing the subobject classifier that one can
find in an elementary topos; maps $X \to \Omega$ (the object of
propositions) correspond to maps into $X$ with "propositional data".
To find another example in category theory, we pass to higher
category theory, i.e. $n$-category theory for high values of $n$,
such as $n = 2$. As a 1-dimensional structure, we can consider the
data of a category $E$ lying over another category $B$ to be a
functor $E \to B$ which has certain lifting properties -- some sort
of categorical fibration.
One would like to find a correspondence between these "categories
over categories" and some data that we already know how to
conceptualise; it turns out that Grothendieck fibrations over $B$ --
i.e., the functors $F : E \to B$ satisfying a certain lifting
property -- correspond to weak 2-functors into $\mathrm{Cat}$.
Conversely, this equivalence lets us think of a family of categories
indexed by $B$ -- a 2-dimensional concept -- as corresponding
precisely to 1-dimensional data (a special class of functors between
ordinary categories). Furthermore, the equivalence is induced by a
construction on (weak 2-)functors -- the Grothendieck construction,
written $\int F$ -- that exactly mirrors the disjoint union $\sum F$
that we used above!
To summarise:
* A classifying object $B(\text{data})$ is an object where maps
$X \to B(\text{data})$ correspond to data over ("pointing
towards") $X$.
* The kind of classifying objects that we care about here are
ubiquitous in all levels of $n$-category theory, but especially
in $(n,1)$-category theory, where they behave like universes of
truncated types.
* Since the objects in an $n$-category act like
$(n-1)$-categories, the "most complex" classifier that an
$n$-topos can have is for the $(n-1)$-category of
$(n-2)$-categories, and this all works better in the case where
we're actually talking about $(n,1)$-topoi.
However, we often want to talk about sets and families of sets as if
they were points of a type, instead of just subsets. We could pass to
a higher category that has a discrete object classifier -- this would
be a "type of sets" -- but then the same infelicity would be repeated
one dimension higher: we'd have a classifier for "sets", but our
other objects would look like "categories", and we wouldn't be able
to classify those!
Take a moment to convince yourself that the interpretation of a
discrete object classifier as a type of sets makes sense.
Note that we defined "proposition" to mean "at most one element". We
can generalise this to a notion which applies to categories by
defining a discrete category to be one that has propositional
hom-sets, i.e. one in which the only morphisms are identities. Note
that a discrete category is automatically a discrete groupoid, since
the identity map is an isomorphism.
A discrete object classifier (say $\mathrm{Set}$) would then be a
classifier for functors which have discrete groupoids as fibres --
the fibres have no categorical information, being nothing more than
a set. Given any set $X$, we'd have a name for it as the
interpretation of the unique map $\mathrm{Disc}(X) \to *$ as a map
$* \to \mathrm{Set}$. The name $\mathrm{Set}$ was chosen because in
the $(2,1)$-category $\mathrm{Grpd}$, the discrete object classifier
is exactly the category of sets.
Thus we might hope that, in addition to a subobject classifier, we
could have a general object classifier, an object $\mathrm{Type}$
such that maps $B \to \mathrm{Type}$ correspond to arbitrary maps
$A \to B$. Let's specify this notion better using the language of
higher category theory:
A subobject classifier is a representing object for the functor
$\mathrm{Sub} : \mathscr{E}^{\mathrm{op}} \to \mathrm{Set}$ which
takes an object to its poset of subobjects. Since the subobjects of
$x$ can be described as (isomorphism classes of) objects of a
subcategory of $\mathscr{E}/x$, we would hope that an object
classifier would classify the entire groupoid
$\mathrm{Core}(\mathscr{E}/x)$, keeping around all of the maps into
$x$, and keeping track of the isomorphisms between them.
Aside: Slice categories and cores, if you haven't heard of them
Giving a precise definition of slice categories and cores would take
us further afield than is appropriate here, so let me give a quick
summary and point you towards the nLab for further details. Fix a
category CCC that we may talk about:
The core of $C$, written.. well, $\mathrm{Core}(C)$, is the largest
groupoid inside $C$. It has all the same objects, but only keeps
around the isomorphisms.
The slice over an object $c$, written $C/c$, is a category where the
objects are arrows $a \to c$ in $C$, and a morphism between
$f : a \to c$ and $g : b \to c$ in the slice over $c$ is a morphism
$a \to b$ in $C$ that makes the following triangle (in $C$) commute:
(diagram: a commutative triangle over $c$)
But for us to be able to represent
$\mathrm{Core}(\mathscr{E}/-) : \mathscr{E}^{\mathrm{op}} \to \mathrm{Grpd}$
as an object $\mathcal{U}$ of $\mathscr{E}$, we need the hom-space
$\mathrm{Hom}_{\mathscr{E}}(x, \mathcal{U})$ to be a groupoid,
instead of just a set; thus $\mathscr{E}$ was actually a 2-category!
We're not in the clear though: since a slice of an $n$-category is
another $n$-category, we now have a 2-category $\mathscr{E}/x$, so
the core of that is a 2-groupoid, so our $\mathscr{E}$ was, in
reality, a 3-category. This process goes on forever, from which we
conclude that the only category that can have a fully general object
classifier is an $\infty$-category. However, since we're always
taking cores, we don't need mapping categories, only mapping
groupoids, so we can weaken this to an $(\infty,1)$-category: an
$\infty$-category where every map above dimension 1 is invertible.
Using the proper higher-categorical jargon, we can define an object
classifier $\mathrm{Type}$ in an $(\infty,1)$-category $\mathscr{H}$
to be a representing object for the $(\infty,1)$-presheaf
$\mathrm{Core}(\mathscr{H}/-) : \mathscr{H}^{\mathrm{op}} \to \infty\text{-}\mathrm{Grpd}$.
Again, the importance of object classifiers is that they let us talk
about arbitrary objects as if they were points of an object
$\mathrm{Type}$. Specifically, any object $B$ has a name, a map
$\ulcorner B \urcorner : * \to \mathrm{Type}$, which represents the
unique map $B \to *$. To make this connection clearer, let's pass
back to a syntactic presentation of type theory, to see what
universes look like "from the inside".
Universes, internally
Inside type theory, object classifiers present themselves as types
which contain types, which we call universes. Every type is contained
in some universe, but it is not the case that there is a universe
containing all types; in fact, if we did have some magical universe
$\mathrm{Type}_\infty : \mathrm{Type}_\infty$, we could reproduce
Russell's paradox, as is done here.
The solution to this is the usual stratification of objects into
"sizes", with the collection of all "small" objects being "large",
and the collection of all "large" objects being "huge", etc. On the
semantic side, an $(\infty,1)$-topos has a sequence of object
classifiers for maps with $\kappa$-compact fibres, where $\kappa$ is
a regular cardinal; syntactically, this is reflected much more simply
as having a tower of universes with
$\mathrm{Type}_i : \mathrm{Type}_{1+i}$. So, in Agda, we have:
_ : Type
_ = Nat

_ : Type₁
_ = Type
In Agda, a "small" object can not automatically be considered a "large" object; when a theory does allow this, we say its universes are cumulative. Instead, we have an explicit Lift construction that lets us treat types as being in higher universes:
_ : Type₂
_ = Lift (lsuc (lsuc lzero)) Type
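For example (a small sketch; the name `lifted-nat` is ours, and we assume the `lift` constructor of `Lift`), a natural number can be regarded as living two universes up:

```agda
-- Nat lives in Type, but Lift lets us place a copy of it in Type₂.
lifted-nat : Lift (lsuc (lsuc lzero)) Nat
lifted-nat = lift 4
```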
To avoid having Lift everywhere, universes are indexed by a (small) type of countable ordinals, called Level, and constructions can be parametrised by the level(s) of the types they deal with. The collection of all $\ell$-level types is $\mathrm{Type}_{\mathrm{lsuc}\ \ell}$, which is itself a type in the next universe up, etc. Since levels are themselves inhabitants of a type, we do in fact have a definable function $\lambda \ell.\ \mathrm{Type}_\ell$. To avoid further Russellian paradoxes, functions out of Level must be very big, so they live in the "first infinite universe", $\mathrm{Type}_\omega$. There is another hierarchy of infinite universes, with $\mathrm{Type}_{\omega+i}$ inhabiting $\mathrm{Type}_{\omega+(1+i)}$. To avoid having even more towers of universes, you can't quantify over the indices $i$ in $\mathrm{Type}_{\omega+i}$. To summarize, we have:
$$\mathrm{Type} : \mathrm{Type}_1 : \mathrm{Type}_2 : \cdots \quad \mathrm{Type}_\omega : \mathrm{Type}_{\omega+1} : \mathrm{Type}_{\omega+2} : \cdots$$
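As a sketch of level polymorphism in action (the name `my-id` is ours), consider an identity function that works at every universe level; because its type quantifies over Level, the definition itself lives in $\mathrm{Type}_\omega$:

```agda
-- A level-polymorphic identity function. Quantifying over ℓ means
-- this type is too big for any finite universe: Agda places it in
-- the first infinite universe, Typeω.
my-id : ∀ {ℓ} {A : Type ℓ} → A → A
my-id x = x
```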
To represent a collection of types varying over a value of $A$, we use a function type into a universe, called a type family for convenience: a type family over $A$ is a function $A \to \mathrm{Type}_i$, for some choice of level $i$. An example of a type family is the finite types, regarded as a family $\mathrm{Fin} : \mathbb{N} \to \mathrm{Type}$, where $\mathrm{Fin}(n)$ is a type with $n$ elements. Type families are also used to model type constructors, which are familiar to programmers as the generic types. And finally, if we have a type $B : \mathrm{Type}_i$, we can consider the constant family at $B$, defined to be the function $\lambda (x : A).\ B$.
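The constant family just mentioned can be written out directly in Agda (a sketch; the name `Constant-at` is ours):

```agda
-- The constant family at B: every point of A is assigned the same type.
Constant-at : {A : Type} → Type → (A → Type)
Constant-at B _ = B
```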
To cap off the correspondence between universes and object classifiers, and thus between type families and maps, we have a theorem stating that the space of maps into $B$ is equivalent to the space of type families indexed by $B$. As a reminder, you can click on the name Fibration-equiv to be taken to the page where it is defined and explained.
_ : ∀ {B : Type} → (Σ[ E ∈ Type ] (E → B)) ≃ (B → Type)
_ = Fibration-equiv
Universes and size issues
Perhaps one of the most famous paradoxes in the history of formal
logic, Russell's paradox reminds us that, for most reasonable kinds
of "collection" -- be they sets, types, or categories -- considering
the collection of all collections that do not contain themselves is a
road that leads to madness. The standard way of getting around these
issues, at least in set-theoretic foundations as applied to category
theory, is to refer to such problematic collections as "classes", and
to only axiomatise the sets which are not too big.
In the branches of mathematics that Alexander Grothendieck influenced, a more common approach is to work instead with Grothendieck universes: a Grothendieck universe is a set $\mathcal{U}$ that forms a transitive model of ZF, closed under $\mathcal{U}$-small indexed products. The classification of models of ZF implies that any Grothendieck universe $\mathcal{U}$ is equivalent to a segment of the von Neumann universe $\mathbf{V}_\kappa$, where $\kappa$ is a strongly inaccessible cardinal -- essentially, an upper bound on the size of the sets in $\mathcal{U}$.
In traditional mathematical writing, it is common to entirely ignore size issues, save perhaps for an offhand mention of Grothendieck universes (or strongly inaccessible cardinals) within the first few sections. Working in a proof assistant forces us to be honest about the size of our constructions; correspondingly, we try to be precise in our prose as well. As mentioned above, Agda universes are stratified into a pair of hierarchies $\mathrm{Type}_\kappa$ and $\mathrm{Type}_{\omega+n}$, where we're using $\kappa$ to stand for a variable of Level type. The Level is a built-in type which contains a constant representing $0$, is closed under taking successors, and is closed under taking binary maxima.
We refer to a structured type (e.g. a category, or a group) as $\kappa$-small when its underlying type inhabits the $\kappa$th universe; for the specific case of categories, we also use the terminology locally $\kappa$-small to refer to the universe in which the family $\mathrm{Hom}(-,-)$ lands. We use subscript Greek letters in prose to index our structures by their universe level. For example, $\mathbf{Sets}_\kappa$ is the ($(1+\kappa)$-small, locally $\kappa$-small) category of $\kappa$-small sets.
Aside: Small (hah!) note on small categories
For those familiar with the notions of internal and enriched categories, we might rephrase the classical definitions of $\kappa$-small and locally $\kappa$-small categories as follows:
* A category $\mathcal{C}$ is $\kappa$-small if it is an internal category in $\mathbf{Sets}_\kappa$;
* A category $\mathcal{C}$ is locally $\kappa$-small if it is enriched over the Cartesian monoidal category $\mathbf{Sets}_\kappa$.
Because every type is contained in some universe, we note that every category that appears in our development is locally $\kappa$-small for some universe level $\kappa$. In particular, we have no "type of categories", but instead a type of "$\kappa$-small, locally $\lambda$-small categories".
Warning: Note that our use of the term "$\kappa$-small" is nonstandard. In set-theoretic foundations, where the only objects are sets anyway, it means "category internal to $\mathbf{Sets}_\kappa$", as mentioned in the infobox above. In the 1Lab, the objects we manipulate are higher groupoids rather than sets, and so very few categories will be internal to a category of sets. Instead, when we describe a category $\mathcal{C}$ as $\kappa$-small, we mean that the type of objects of $\mathcal{C}$ is an inhabitant of the universe $\mathrm{Type}_\kappa$, and that the $\mathrm{Hom}_{\mathcal{C}}(-,-)$ family is valued in $\mathbf{Sets}_\kappa$. Our shorthand for the traditional notion is a "strict $\kappa$-small" category.
Our policy is to keep the use of universes "as precise as possible":
Definitions are polymorphic over as many levels as they can possibly
be, and they are always placed in the smallest universe in which they
fit. This is, however, an iterative process: For example, our
definition of "sheaf topos" was originally parametrised over 5
levels, but after formalising more desirable properties of topoi, we
have found that the definition only really mentions two levels.
Interlude: Basics of Paths
Since the treatment of paths is the most important aspect of homotopy
type theory, rest assured that we'll talk about it in more detail
later. However, before discussing the dependent sum of a type family,
we must discuss the fundamentals of paths, so that the categorical/
homotopical connections can be made clear. Before we even get
started, though, there is something that needs to be made very clear:
Paths are not equality!
The only conceptualisation of paths that is accurate in all cases is that types are $\infty$-groupoids, and hence path spaces are the Hom-$\infty$-groupoids given by the structure of each type. However, this understanding is very bulky, and doesn't actually help understanding paths all that much -- so we present a bunch of other analogies, with the caveat that they only hold for some types, or are only one of the many possible interpretations of paths in those types.
The ideas presented below are grouped into collapsible elements, but not because they aren't important; you should read all of them.
Idea #1: Some types behave like sets, providing an embedding of
"classical" equality behaviour into homotopy type theory. While these
types may not be as "exciting" as others, they aren't any less
important!
Contradicting the big bold warning up there, in a very restricted
class of types, paths do behave like the equality relation! We call
these types sets, and their defining property is that they only have
identity paths -- hence they could also be called "discrete types", by
analogy with "discrete category", but this terminology is not
employed because "discrete" already has a distinct type-theoretic
meaning.^7
Types in this class include the natural numbers, the integers, the
rationals and the reals; Also included is every finite type, and the
powerset of any type at all. The type of functions A-BA \to BA-B,
where BBB is a set, is again a set. We can generalise this and say
that "most" type formers preserve set-ness.
Note that the meaning of "is a set"/"is not a set" in HoTT is
distinct from how that sentence is used in classical mathematics,
where a collection is a set if it's definable in the language of
whatever foundational system you're using. For example, in ZFC, the
collection of all sets is not a set -- because if it were, we could
reproduce Russell's paradox.
In HoTT, being a set has to do with spatial structure: a set has only
trivial paths. Examples of non-sets would then be given by types that
have interesting paths, like the circle, which has a non-trivial
loop. Coincidentally, the type of all sets is not a set, since its
spatial structure is given by the isomorphisms of sets -- and sets can
have non-trivial automorphism groups.
Idea #2: Some types are best understood as representing spaces, with
intrinsic homotopic information, which we can investigate using
type-theoretic methods. This interpretation is the core idea of
"synthetic homotopy theory", a whole field of maths!
We conceptualise some types (like the circle mentioned above) as
being representatives for spaces, though this does not mean that they
have any intrinsically-detectable topology in the sense of "open
sets"-topological spaces. Instead, we think of these types as being
an incarnation of spaces à la homotopy theory, with structure similar
to what you'd find there: Homology groups, cohomology groups, but
most easily seen are the homotopy groups. In these types,
type-theoretic paths quack like topological paths.
Indeed, any type in homotopy type theory has an associated sequence
of homotopy groups, and these behave precisely like the homotopy
groups of classical algebraic topology -- with the difference that the
way we compute them in HoTT has a distinctly more "typey" flavour.
This interpretation of types as spaces leads us to look at the statement "there is a type $X$ with $\pi_1(X) \cong \mathbb{Z}$" and rightly call $X$ "the circle", even though a topologist might object that our circle is not any subset of $\mathbb{R}^2$. Similarly, we might take the integer $2$ and apply the isomorphism above, to get a path in $X$ which "walks clockwise around the circle twice".
Idea #3: HoTT provides a setting for doing category theory where all
constructions are intrinsically isomorphism-invariant. Under this
light, types provide the underlying groupoids of categories.
When doing category theory in HoTT, we have some types that naturally arise as the type of objects in a category, such as the type of groups relative to a universe. In category theory done set-theoretically, the collection of objects in $C$ has two different notions of sameness: the set-theoretic equality $a = b$, with behaviour given by the axiom of extensionality and divorced from the categorical structure, and the class of isomorphisms of $C$, $a \cong b$, which naturally has to do with the categorical structure. A big part of doing category theory in set-theoretic foundations is learning in which situations we should talk about the set-theoretic equality of objects, and when we should talk about isomorphism -- what you end up discovering is that constructions that talk about equality of objects should be done very sparingly, so much so that they're colloquially called "evil".
Conversely, the possibility of talking about equality of objects at
all means that we have to spend effort verifying that constructions
appropriately respect isomorphisms, though of course this
verification can be (and often ends up being) handwaved.
In HoTT, we can do better: We add a condition to the definition of
category that says the paths of the type of objects must correspond
to the categorical isomorphisms -- which we call "univalence". While
this condition might initially seem hard to fulfill, it's
surprisingly common to come by univalent categories, since most
constructions on categories preserve univalence! For instance:
* If $A$ and $B$ are univalent categories, then so are the product $A \times B$ and the disjoint union $A \uplus B$; the initial and terminal categories are trivially univalent.
* If $C$ is a univalent category, then so are the slice $C/c$ and the coslice $c/C$.
* If $C$ is a univalent category, then for any $D$, so is the functor category $[D,C]$; this implies that a very large class of categories is univalent, e.g. the models of any Lawvere theory in a univalent category.
Further, since $\mathbf{Set}$ is a univalent category, the Yoneda lemma implies that every $C$ can be embedded into a univalent category -- the functor category $[C^{\mathrm{op}},\mathbf{Set}]$. If we restrict to the subcategory of representable presheaves, we have that every $C$ admits a full, faithful and essentially surjective functor into a univalent category, called the Rezk completion of $C$; if $C$ was already univalent, this functor is an equivalence.
In addition to the benefit of univalent categories that will be
presented outside the fold, they're also interesting in their own
right because they are better constructively behaved than categories
with a set of objects: For instance, in the last bullet point above
we appealed to the statement "every fully faithful, essentially
surjective functor is an equivalence". For univalent categories, this
is a theorem without any form of choice axiom!
One benefit that HoTT brings to the table is that it very naturally
leads to mixing and matching the interpretations above. For example,
the underlying groupoid of a univalent category can just as well be
interpreted as a space, mixing interpretations #2 and #3. The spatial
interpretation of categories really shines when we pass to higher
categories, where the notion of "unique" behaves very badly.
When thinking nnn-categorically, even for small values of nnn, one
must first weaken "unique" to "unique up to unique isomorphism", and
then to something like "unique up to isomorphism, which is unique up
to unique invertible 2-cell". Reasoning spatially, we see that
"unique up to isomorphism" really means that there is a contractible
space of choices for that object -- a notion which is well-behaved at
any level of higher categorical complexity.
Every type $A$ has an associated family of path types -- to form a path type, we must have two inhabitants of $A$ handy, called the endpoints. The type of paths between $x$ and $y$ in $A$ is written $x \equiv_A y$, where the subscript $A$ is omitted when it can be inferred from context. The name path is meant to evoke the topological idea of path, after which the paths in cubical type theory were modelled. In topology, a path is defined to be a continuous function from the real unit interval, $[0,1] \to X$, so that we can interpret a path as being a point "continuously varying over time".
It would be incredibly difficult (and oversized!) to formalise the
entire real line and its topology to talk about paths, which are
meant to be a fundamental concept in type theory, so we don't bother!
Instead, we look at which properties of the unit interval [0,1][0,1][
0,1] give paths their nice behaviour, and only add those properties
to our type theory. The topological interval is contractible, and it
has two endpoint inclusions, with domain the one-point space. Those
are the properties that characterise it (for defining paths), so
those are the properties we keep!
The interval type, $\mathbb{I}$, has two introduction rules -- but they are so simple that we refer to them as "closed inhabitants" instead. These are written $\mathrm{i0}$ and $\mathrm{i1}$, and they denote the left- and right-hand side of the unit interval. A classic result in the interpretation of constructive mathematics says that any function defined inside type theory is automatically continuous, and so we define the type $x \equiv_A y$ to be the type of functions $f : \mathbb{I} \to A$, restricted to those where $f(\mathrm{i0}) = x$ and $f(\mathrm{i1}) = y$ definitionally.
The introduction rule for paths says that if $e : \mathbb{I} \to A$, with $e(\mathrm{i0}) = x$ and $e(\mathrm{i1}) = y$ (those are definitional equalities), then we can treat it as a path $x \equiv_A y$. Conversely, if we have a term $p : x \equiv_A y$, then we can apply it to an argument $i : \mathbb{I}$ to get a term $p(i)$ of type $A$. The type of paths satisfies the same reduction rule ($\beta$-reduction) as function types, but with an additional rule that says $p(\mathrm{i0})$ is definitionally equal to $x$ (resp. $\mathrm{i1}$ and $y$), even if $p$ has not reduced to a $\lambda$-abstraction.
By considering the constant function at a point $x : A$, we obtain a term $(\lambda i.\ x)$ of type $x \equiv_A x$, called the trivial path, and referred to (for historical reasons) as refl, since for the types that are sets, it expresses reflexivity of equality. In the interpretation of types as $\infty$-groupoids, refl is the identity morphism. The rest of the groupoid structure is here too! Any path $p : x \equiv y$ has an inverse $p^{-1} : y \equiv x$, and for any $p : x \equiv_A y$ and $q : y \equiv_A z$, there is a composite path $p \bullet q : x \equiv_A z$. There are paths between paths which express the groupoid identities (e.g. $p \bullet p^{-1} \equiv \mathrm{refl}$), and those paths satisfy their own identities (up to a path between paths between paths), and so on.
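As a small sketch of how this structure arises in Cubical Agda (the name `inverse` is ours; `~_` is the built-in interval reversal), the inverse of a path is obtained by reversing its interval argument:

```agda
-- Reversing the interval argument swaps the endpoints, since
-- (~ i0) reduces to i1 and (~ i1) reduces to i0.
inverse : {A : Type} {x y : A} → x ≡ y → y ≡ x
inverse p i = p (~ i)
```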
This little introduction to paths was written so that we can make
some of the ideas in the following section, about dependent sums,
precise. It's not meant to be a comprehensive treatment of the path
type; For that, see the section Paths, in detail later on.
Sums
Recall that in the construction of a map into $B$ from a predicate $\chi : B \to \Omega$, we interpreted $\chi$ as a family of sets with at most one element, and then took the disjoint union of that family, written $\sum_{(x : B)} \chi(x)$, which admits a projection onto $B$. That was actually a sneaky introduction of the dependent sum type former, $\sum$! Indeed, the short discussion there also made some mention of the introduction rule, but let's reiterate with more clarity here.
If we have some index type $A : \mathrm{Type}_\ell$, and a family of types $B : A \to \mathrm{Type}_{\ell'}$, then we can form the dependent sum type $\sum_{(x : A)} B$. When the index is clear from context, we let ourselves omit the variable, and write $\sum B$. Since $\sum B$ is a type, it lives in a universe; since it admits a projection onto $A$ and a projection onto a fibre of $B$, that universe must be at least as big as both $\ell$ and $\ell'$, but it need not be any bigger: the dependent sum of $B$ lives in the universe indexed by $\ell \sqcup \ell'$.
The introduction rule for $\sum B$ says that if you have an $a : A$, and a $b : B(a)$, then the pair $(a,b)$ inhabits the type $\sum B$. Thus the inhabitants of the dependent sum are ordered pairs, but where the second coordinate has a type which depends on the value of the first coordinate. If we take $B$ to be a constant type family, we recover a type of ordered pairs, which we call the product of $A$ and $B$, notated $A \times B$.
For the elimination rules, well, pairs would be rather useless if we couldn't somehow get out what we put in, right? The first elimination rule says that if we have an inhabitant $e : \sum_{(x : A)} B(x)$, then we can project out the first component, written $e\mathrm{.fst}$, to get an inhabitant of $A$. The computation rule here says that if you project from a pair, you get back what you put in: $(a,b)\mathrm{.fst} = a$.
Projecting out the second component is trickier, so we start by considering the special case where we're projecting from a literal pair. We know that in $(a,b)$, the terms have types $a : A$ and $b : B(a)$, so in this case, the second projection is $(a,b)\mathrm{.snd} : B(a)$. When we're not projecting from a literal pair, though, we're in a bit of a pickle. How do we state the type of the second projection, when we don't know what the first component is? Well -- we generalise. We know that for a pair, $a = (a,b)\mathrm{.fst}$, so we may replace the aforementioned typing for $\mathrm{.snd}$ with the wordy $(a,b)\mathrm{.snd} : B((a,b)\mathrm{.fst})$. Now we don't use any subterms of the pair in the typing judgement, so we can generalise:

$$e\mathrm{.snd} : B(e\mathrm{.fst})$$
A very common application of the dependent sum type is describing
types equipped with some structure. Let's see an example, now turning
to Agda so that we may see how the dependent sum is notated in the
formalisation. We define a pointed magma structure on a type to be a
point of that type, and a binary function on that type. If we let the
type vary, then we get a type family of pointed magma structures:
Pointed-magma-on : Type → Type
Pointed-magma-on A = A × (A → A → A)
Taking the dependent sum of this family gives us the type of pointed magmas, whose inhabitants can be described as triples $(T,i,m)$, where $T : \mathrm{Type}$ is the underlying type, $i : T$ is the distinguished point, and $m : T \to T \to T$ is the binary operation. Note that, to be entirely precise, the pair would have to be written $(T,(i,m))$, but since by far the most common case of nested pairs is with the nesting on the right, we allow ourselves to omit the inner parentheses in this case.
Pointed-magma : Type₁
Pointed-magma = Σ Type Pointed-magma-on
Note that, since we fixed the family to $\mathrm{Type} \to \mathrm{Type}$, and recalling that $\mathrm{Type} : \mathrm{Type}_1$ are the first two universes, the type of all pointed magmas in $\mathrm{Type}$ lives in the next universe, $\mathrm{Type}_1$. If the type of all pointed magmas were in $\mathrm{Type}$, then we could again reproduce Russell's paradox.
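As a concrete sketch (the name `nat-magma` is ours, and we assume the addition function `_+_` on `Nat` from the prelude), the natural numbers with zero and addition form a pointed magma:

```agda
-- A triple (T, i, m): the carrier Nat, the point 0, and addition.
nat-magma : Pointed-magma
nat-magma = Nat , 0 , _+_
```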
Just like functions, pairs enjoy a uniqueness principle, which (since coming up with names is very hard) also goes by $\eta$-expansion. The rule says that every inhabitant of a $\Sigma$-type is definitionally equal to the pairing of its projections: $p = (p\mathrm{.fst}, p\mathrm{.snd})$. This essentially guarantees us that the inhabitants of a dependent sum type can't have any "extra stuff" hiding from our projection maps; they are exactly the pairs, no more, and no less.
The dependent sum is a construction that goes by many names. In addition to the ones mentioned below (which are the names people actually use), you could call it the Grothendieck construction, denote it by $\int B$, and have a hearty chuckle -- if it weren't for the fact that the Grothendieck construction is generally taken for functors between categories rather than just groupoids.
* $\mathbf{\Sigma}$-type, named after its shape;
* Disjoint union, since if we take $A$ to be the 2-point type, then the dependent sum $\sum B$ of $B : 2 \to \mathrm{Type}$ is exactly the disjoint union $B(0) \uplus B(1)$;
* Thinking homotopically, you can consider a type family $B : A \to \mathrm{Type}$ to be a fibration, since as we shall see later it satisfies the path lifting property; in that sense, $\sum B$ gives the total space, and the actual fibration is the first projection map $x \mapsto x\mathrm{.fst}$.
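The disjoint-union description can be sketched in Agda, taking the 2-point type to be `Bool` (the names `pick` and `_⊎′_` are ours):

```agda
-- A family over the 2-point type, assigning A to one point
-- and B to the other...
pick : Bool → Type → Type → Type
pick true  A B = A
pick false A B = B

-- ...whose dependent sum is exactly the disjoint union of A and B.
_⊎′_ : Type → Type → Type
A ⊎′ B = Σ[ b ∈ Bool ] pick b A B
```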
This last correspondence will be particularly important later, and it inspires much of the terminology we use in HoTT. Since universes are object classifiers, we know that there is an equivalence between type families $F : B \to \mathrm{Type}$ and maps into $B$. Dependent sums and paths let us be more precise about this equivalence: we can turn any type family $F : B \to \mathrm{Type}$ into a map $\mathrm{fst} : \sum F \to B$, and looking at the fibre of $\mathrm{fst}$ over a point $y$ recovers $F(y)$. We thus blur this distinction a bit and refer to $F(x)$, for some $x : B$, as a fibre of $F$, and we say that something happens fibrewise if it happens for each $F(x)$.
One last thing to mention about $\Sigma$-types is that they are the realization of general subsets. This is a special case of the idea of stuff-with-structure, where the structure is merely a witness for the truth of some predicate on the first value. For example, recall that we defined the fibre of $f$ over $y$ as $\{ x \in A : f(x) = y \}$. In type theory, this is rendered with a $\Sigma$-type, as below:
fibre : (A → B) → B → Type
fibre f y = Σ[ x ∈ A ] (f x ≡ y)
Dependent products
Imagine that we have two types, $A$ and $B$. For simplicity, they live in the same universe, which can be any universe of our choice. We know that we can consider $B$ to be a constant type family taking indices in $A$: $\lambda (x : A).\ B$. Furthermore, we know that the total space of this family is the product type $A \times B$, and that it comes equipped with a projection map $\mathrm{fst} : \sum B \to A$.
Given any function $f : A \to B$, we define its graph $\mathrm{graph}(f) : A \to A \times B$ to be the function $\lambda x.\ (x, f(x))$, which takes each point of the input space to an ordered pair where the first coordinate is the input, and the second coordinate is the output of $f$ at that input. The graph of a function is special because of its interaction with the projection map $\mathrm{fst} : A \times B \to A$. In particular, we can see that $\lambda x.\ \mathrm{graph}(f)(x)\mathrm{.fst}$ is the identity function! This property turns out to precisely characterise the functions $A \to A \times B$ which are the graphs of functions $A \to B$.
By rewriting the equality we noticed above in terms of function composition, we obtain $\mathrm{fst} \circ \mathrm{graph}(f) = \mathrm{id}$. A function, like $\mathrm{graph}(f)$, which composes on the right with some other function $\mathrm{fst}$ to give the identity function, is called a section of $\mathrm{fst}$. This lets us characterise the functions $A \to B$ as the sections of $\mathrm{fst} : A \times B \to A$. In this sense, we see that a function $A \to B$ is precisely a choice of values $f(x)$, where $f(x)$ is in the fibre over $x$ -- in this case, the fibres over all points are $B$!
Given a map $g : A \to A \times B$ that we know is a section of $\mathrm{fst}$, we can recover a function $f : A \to B$. Specifically, given a point $x : A$, we know that the second projection $g(x)\mathrm{.snd}$ has type $B$ -- so we can simply define $f$ to be $\lambda x.\ g(x)\mathrm{.snd}$. Since we know that $g$ is a section of $\mathrm{fst}$, we conclude that $\mathrm{graph}(f) = g$.
This idea generalises cleanly from the case where $B$ is just some type to the case where $B$ is a type family indexed by $A$. We define a section of $B$ to be a function $g : A \to \sum B$ such that $\mathrm{fst} \circ g = \mathrm{id}$. Note that this terminology conflates the type family $B$ with the projection map $\sum B \to A$, keeping with the trend of identifying type families with fibrations. Generalising the argument from the previous paragraph, we show that a section $g$ gives a rule for assigning a point in $B(x)$ to each point $x$. We again take the second projection $g(x)\mathrm{.snd}$, but since now $B$ is a type family, this lives in the type $B(g(x)\mathrm{.fst})$ -- but since we know that $g$ is a section of $\mathrm{fst}$, we can correct this to $B(x)$.
We would like to package up such a rule into a function $A \to B$, but this is not a well-formed type, since $B$ is a type family. Moreover, this would "forget" that $x$ is sent to the fibre $B(x)$. Thus, we introduce a new type former, the dependent product type, written $\prod_{(x : A)} B(x)$ or, more directly, $(x : A) \to B(x)$. The inhabitants of this type are precisely the sections of $B$! More importantly, the introduction, elimination, computation and uniqueness rules of this type are exactly the same as for the function type.
The dependent product type takes on another very important role when
universes are involved: Supporting polymorphism. In programming, we
say that a function is polymorphic when it works uniformly over many
types; Dependent products where the index type is a universe reflect
this exactly. For example, using polymorphism, we can write out the
types of the first and second projection maps:
project-first : (A : Type) (B : A → Type) → Σ A B → A
project-first A B x = x .fst

project-second : (A : Type) (B : A → Type)
               → (p : Σ A B) → B (project-first A B p)
project-second A B x = x .snd
In the logical interpretation, the dependent product type implements the universal quantifier $\forall$. For example (keep in mind that in general, we should not read path types as equality), we could read the type $(x\ y : A) \to x \equiv y$ as the proposition "for all pairs of points $x$ and $y$ of $A$, $x$ is equal to $y$". However, an inhabitant of this type has a much more interesting topological interpretation!
Worked example: Interpreting dependent sums as fibrations and dependent products as sections lets us derive topological interpretations for types. We show how to do this for the type $(x\ y : A) \to x \equiv y$, and show that a space $A$ admitting an inhabitant of this type has the property of "being contractible if it is inhabited".
First, we'll shuffle the type so that we can phrase it in terms of a single dependent product, and a single type family. We define a type $A$ to be a subsingleton if there is a term

$$f : (xy : A \times A) \to (xy\mathrm{.fst} \equiv xy\mathrm{.snd})$$
An $f$ of this type can be seen as a section of the type family $F : A \times A \to \mathrm{Type}$ defined by the rule

$$F(x,y) \mapsto (x \equiv y)$$
The total space $\sum F$ of this family is the space of all paths in $A$, which will be written $A^{\mathbb{I}}$ to emphasise its nature as a path space. Since it is the total space of a type family, it comes equipped with a fibration $\mathrm{fst} : A^{\mathbb{I}} \to A \times A$. Given that a term in $A^{\mathbb{I}}$ can be seen to be a pair $((x, y), p)$ where $p : x \equiv y$, we see that this fibration projects both endpoints of a path -- so we will write it $(d_0,d_1) : A^{\mathbb{I}} \to A \times A$, since it is the pairing of the maps which evaluate a path at the left and right endpoints of the interval.
As a very quick aside, there is a map r : λx. (x, x, refl) making the
diagram below commute. This diagram expresses that the diagonal
A → A × A can be factored as a weak equivalence followed by a
fibration through A^𝕀, which is the defining property of a path space
object.

A --r--> A^𝕀 --(d₀, d₁)--> A × A
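The map r can be written down directly; a minimal sketch, assuming
the ambient 1Lab primitives (Σ, _×_, _≡_ and refl):

```agda
-- A sketch of the map r, sending a point x to the pair of
-- endpoints (x , x) together with the constant path at x.
r : {A : Type} → A → Σ (A × A) (λ xy → xy .fst ≡ xy .snd)
r x = (x , x) , refl
```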
A section of this fibration -- that is, a dependent function like f
-- is then a continuous function A × A → A^𝕀, with the property that
(d₀, d₁) ∘ f = id. Now assume that our space A is pointed, say with a
point y₀ : A, and that we have a section f. We can then define the
homotopy (x, t) ↦ f(x, y₀, t), mapping between the identity function
on A and the constant function at y₀, exhibiting the space A as
contractible.
If you assume the law of excluded middle, every space is either
pointed or empty. In this case, we can describe a space A equipped
with a map like f as being "either empty or contractible", but we
prefer the terminology "contractible if inhabited" or subsingleton
instead.
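The argument above can also be phrased in Agda; a minimal sketch,
assuming the 1Lab record is-contr, whose fields centre and paths hold
a point and a family of paths out of it:

```agda
-- "Subsingleton implies contractible if inhabited": given a proof f
-- that any two points are equal, and a point y₀, the centre of
-- contraction is y₀ and the contracting paths are given by f.
subsingleton→contr : {A : Type} → ((x y : A) → x ≡ y) → A → is-contr A
subsingleton→contr f y₀ .centre  = y₀
subsingleton→contr f y₀ .paths x = f y₀ x
```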
What next?
While the text above was meant to serve as a linearly-structured
introduction to the field of homotopy type theory, the rest of the
1Lab is not organised like this. It's meant to be an explorable
presentation of HoTT, where concepts can be accessed in any order,
and everything is hyperlinked together. However, we can highlight the
following modules as being the "spine" of the development, since
everything depends on them, and they're roughly linearly ordered.
Paths, in detail
open import 1Lab.Path
The module 1Lab.Path develops the theory of path types, filling in
the details that were missing in Interlude - Basics of paths above.
In particular, the Path module talks about the structure of the
interval type 𝕀, the definition of dependent paths, squares, and the
symmetry involution on paths. It then introduces the transport
operation, which turns a path between types into an equivalence of
types.
It also explains the structure of types as Kan complexes: in
particular, the Path module defines and talks about partial elements,
which generalise the notion of "horn of a simplex" to any shape of
partial cube; then it defines extensibility, which exhibits a partial
element as being a subshape of some specific cube, and the
composition and filling operations, which express that extensibility
is preserved along paths. Finally, it introduces the Cartesian
coercion operation, which generalises transport.
To reiterate, paths are the most important idea in cubical type
theory, so you should definitely read through this module!
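As a small taste of what the module provides, here is a sketch,
assuming the 1Lab names sym and transport:

```agda
-- `transport` turns a path between types into a function between
-- them, and `sym` inverts a path; composing the two coerces a value
-- backwards along a path of types.
coerce-back : {A B : Type} → A ≡ B → B → A
coerce-back p b = transport (sym p) b
```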
Equivalences
The idea of a subsingleton type, mentioned in passing in the
discussion about universes and expanded upon in the section on
dependent products, generalises to give the notion of h-level, which
measures how much interesting homotopical information a type has. An
h-level that comes up very often is the level of sets, since these
are the types where equality behaves like a logical proposition. Due
to their importance, we provide a module defining equivalent
characterisations of sets.
open import 1Lab.HLevel
Contractible types can be used to give a definition of equivalence
that is well-behaved in the context of higher category theory, but it
is not the only coherent notion. We also have half-adjoint
equivalences and bi-invertible maps. Finally, we have a module that
explains why these definitions are needed in place of the simpler
notion of "isomorphism".
open import 1Lab.Equiv
open import 1Lab.Equiv.Biinv
open import 1Lab.Equiv.HalfAdjoint
open import 1Lab.Counterexamples.IsIso
Univalence
open import 1Lab.Univalence
The module 1Lab.Univalence defines and explains the Glue type former,
which serves as the Kan structure of the universe. It then
concretises this idea by describing how Glue can be used to turn any
equivalence into a path between types, which leads to a proof of the
univalence principle. Afterwards, we prove that the type of equivalences
satisfies the same induction principle as the type of paths, and that
universes are object classifiers.
open import 1Lab.Univalence.SIP
While univalence guarantees that equivalence of types is equivalent
to paths in types, it does not say anything about
types-with-structure, like groups or posets. Fortunately, univalence
implies the structure identity principle, which characterises the
paths in spaces of "structured types" as being homomorphic
equivalences.
---------------------------------------------------------------------
1. As usual, we do not have a type of all groups. Instead, we have a
   family of types of groups that depends on how "big" their
   underlying type is allowed to be: what universe it lives in.
2. Consequently, a term of the form (λx. e) e′ is referred to as a
   β-redex (plural β-redices), short for β-reducible expression, but
   since we're not really interested in the theory of programming
   languages here, this terminology will not come up.
3. A survey paper of POPL proceedings by Guy L. Steele (Steele 2017)
   identified twenty-eight different notations for substitution, so
   the word "typical" is… questionable, at best.
4. Solution: Recall what it means for a map to be injective - that
   f(a) = f(b) → a = b. If the domain of a function is a set with a
   single point, then the consequent is automatically satisfied!
   Since the maps * → ℕ are in 1-to-1 correspondence with the
   elements of ℕ, we can consider any two numbers to be distinct
   subsets with the same domain.
5. In some foundational systems -- including HoTT, the one we are
   discussing -- "subsingleton" is literally taken as the definition
   of "proposition", so this translation isn't merely omitted, it's
   not necessary at all.
6. Let me make this more precise. In any (univalent) category with
   finite limits ℰ, we have a functor Sub : ℰ^op → Set, which takes c
   to the full subcategory of ℰ/c on the injections. We say that
   Ω ∈ ℰ is a subobject classifier if there is a natural isomorphism
   Sub ≅ Hom(−, Ω), i.e., if Ω is a representing object for the
   subobject functor. Since the Yoneda lemma implies that
   representing objects are unique, this characterises Ω.
7. Discreteness, type-theoretically, has to do with whether we can
   decide equality in that type by an algorithm. For example, the
   natural numbers are discrete, because we can compare the numerals
   they represent in time bounded by the size of the number. On the
   other hand, sequences of natural numbers are not discrete, because
   an algorithm cannot run forever. Any discrete type is a set, and
   assuming excluded middle, any set is discrete.
---------------------------------------------------------------------
References
* Steele, Guy L. 2017. "It's Time for a New Old Language." SIGPLAN
Not. 52 (8): 1. https://doi.org/10.1145/3155284.3018773.