Help! I don't believe in the identity elimination rule for Martin-Löf type theory/HoTT!

Question

I was watching this video this video "$\infty$-Category Theory for Undergraduates" by Emily Riehl, and was onboard with everything except the path induction principle for identity types (27:00 minutes in) (using Riehl's color convention):

Given any type family [(context) $\color{gray}{\Gamma}$, (terms) $\color{orange}{x},\color{orange}{y}$ (of type) $:\color{red}{A}$, (and a "proof/witness of equality") $\color{orange}{p} : \color{red}{\text{Id}_A(x,y)}$ $\vdash \color{red}{B(x,y,p)}$ (any other type we form given the data on the LHS of $\vdash$)], to produce a term of type $\color{red}{B(x,y,p)}$ it suffices to do so only in the super special case of: $\color{orange}{y}$ is $\color{orange}{x}$, and $\color{orange}{p}$ is $\color{orange}{\text{refl}_x} : \color{red}{\text{Id}_A(x,y)}$

Riehl admits that "it is confusing if you're seeing it for the first time", and compares the situation to students learning proofs by induction for the first time, where "it's very confusing for a while but after you've done ten exercises writing induction proofs you sort of start to get the hang of it". I do not doubt/dispute that it is similar for path induction. However, the one major difference, is that I believed in the axiom/principle of mathematical induction when I first learned it, but I don't believe in the path induction principle now.

In another Riehl talk "How I became seduced by univalent foundations", at 30:00 minutes in, Riehl introduces this path induction principle (also called the identity elimination rule), and says "it's, uh, ..., magic". So I think Riehl recognizes that this may be a "sticking-point", and something that people do not find very intuitive. In the above talks (in particular the "...Undergrads" one) (and also in e.g. Elimination rule for identity types in Martin-Lof Type Theory), this identity elimination rule can be made more intuitive from the homotopy interpretation. As described at HomotopyTypeTheory.org (which refers to the path induction principle as the "J-rule"),

The interpretation of propositional equalities between paths (i.e. of propositional equalities between propositional equalities) is homotopy (continuous deformation). The intuition for why J holds but K does not is that a PathFrom M has one endpoint free, which gives you additional freedom in how you can deform the path. To deform (N , P) into (M , Refl), you move the endpoint N over to M, while deforming the P into reflexivity.

However, this now begs the question of why this homotopy interpretation of some type theory should have any credence to the wider entirety of mathematics. In this MO post "Homotopy Type Theory: What is it?", Mike Shulman addresses the question of why "people are pushing univalence and constructive type theory as a new foundations for mathematics" with an excellent blogpost "From Set Theory to Type Theory". I am totally on board with what he writes in this article. In particular I think the univalence axiom is quite well motivated:

The solution to this problem is the univalence axiom, which can be concisely described as saying that “isomorphic structures are equal”.

In another n-category cafe post by Riehl, another example is raised: $\mathbb N \simeq \{x \in \mathbb Z: x\geq 0\}$ vs. $\mathbb N =\{x \in \mathbb Z: x\geq 0\}$. Anybody with a semester/year of experience in abstract algebra can appreciate this notion of formalizing "isomorphic things are equal", and moreover the idea of working in a "isomorphism-invariant" environment:

In set theoretic foundations, of course, we could already prove that any two natural numbers objects were isomorphic. In material-set theory, all we can conclude from that is that any “isomorphism-invariant property” of one natural numbers object carries over to another. Then we have to do work (or, as is usually the case, leave work to the reader) proving that any particular property we care about is isomorphism-invariant. The situation in structural-set theory is somewhat better: if a structural-set theory is presented carefully enough, then we can only say isomorphism-invariant things. Both of these approaches, however, eventually lead to dependent type theory anyway: it is the way to characterize the isomorphism-invariant properties, and the careful way to present structural set theories. Moreover, in univalent type theory we can prove the statement “all properties are invariant under isomorphism” directly, rather than only observing it as a meta-property of the theory.

So alright, perhaps I can appreciate how homotopy theory (and how it relates to being able to formalize this philosophy behind the univalence axiom "isomorphic things are equal") can be used in the wider mathematical universe. But I am still hung up on the J-rule, which is not mentioned in Shulman's overview/type-theory advertisement. I do not see what the insight of the J-rule specifically is, in the wider mathematical universe.

Question 1: Since Martin-Löf came up with his type theory decades before the homotical interpretation, there surely is some deeper/more universal mathematical idea he was trying to capture with his definitions. So what were his justifications for introducing this J-rule? ...though, I also heard that there are other rules, a K-rule and an I-rule (also by Martin-Lof?) that are no longer "accepted" today, so perhaps I should not look so hard into Martin-Lof's original motivations...

This nlab post presents an alternative/equivalent definition of the J-rule, in terms of (IIa) "Substitution of identifications preserves computations" (Liebniz's "indiscernibility of identicals"), and (IIb) "An identification identifies itself with a self-identification". I agree with nlab when it says (IIb) is "subtle", but I don't find it "self-evident" at all. What exactly is this (IIb) fundamentally trying to say about different "proofs of identification with $x$"? That they are "homotopic" in some sense? Why is that natural/reasonable? What insight about everyday proofs does this homotopy interpretation reveal/corroborate?

Compare again the situation to the univalence axiom: it corroborates the everyday intuition/experience of mathematicians treating isomorphic things as equal, and the homotopy interpretation is fundamentally saying that the perspective/tools of homotopy and type theory allow us to work in an "isomorphism-invariant" environment, and to formalize the aformentioned intuition.

Perhaps a more concrete phrasing of the question is: from the homoptopy pictures it looks like if we fix $\color{orange}{x},\color{orange}{y}:\color{red}{A}$, the 2 paths/proofs $\color{orange}{p},\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ may not be homotopic. But the J-rule is the statement that fixing only $\color{orange}{x}:\color{red}{A}$ and letting $\color{orange}{y}$ roam free, every $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$ is homotopic to $\color{orange}{\text{refl}_x}:\color{red}{\text{Id}_A(x,x)}$. What phenomenon is this "2-fixed points" vs. "1-fixed point" trying to capture about these "proofs of equality" $\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$? It seems reasonable to me that given a type $\color{red}{B(x,y,p)}$ involving $\color{orange}{x},\color{orange}{y}:\color{red}{A}$,$\color{orange}{p}:\color{red}{\text{Id}_A(x,y)}$, maybe $\color{orange}{p}$ is "nice" in some way that $\color{orange}{q}:\color{red}{\text{Id}_A(x,y)}$ isn't, so a term/proof/witness in $\color{red}{B(x,y,p)}$ may not necessarily lead to a term/proof/witness in $\color{red}{B(x,y,q)}$. But why then should a term/proof/witness in $\color{red}{B(x,x, \text{refl}_x)}$, the "simplest/nicest" scenario, necessarily lead to a term/proof/witness in $\color{red}{B(x,y,p)}$? I do not see what philosophy about "identity" these definitions are trying to capture.

P.S. in video educational content about HoTT online, there's this beautiful series "Introduction to Homotopy Type Theory" by Jacob Neumann which hasn't gotten to identity elimination yet, and the HoTTEST summer school lecture 3, which covered it @19:00 in one breath. Since I am more accepting of (IIa) Leibniz equality, I found "⩧ ≅ ≡: Leibniz Equality is Isomorphic to Martin-Löf Identity, Parametrically", but this video did not discuss the J-rule. I really did look everywhere I could.

I see a lot of push from HoTT/category theory people to introduce these concepts early (e.g. to undergrads), trying to reduce technicalities and abstraction. But I think they are missing one ingredient; it is not the technicalities or abstraction that keep me away, but rather my incredulity :( Sorry if my question is too long.

The path induction rule simply says that everything is invariant under homotopy, up to homotopy. If you take the proof-irrelevant form, it just says that identicals are indiscernible, i.e. equal things have the same properties. The homotopical version is a bit more subtle, certainly, so you should convince yourself of the proof-irrelevant form first. — Zhen Lin, Apr 10 '23 at 22:37
Two paths from $x$ to $y$ may not be homotopic if the endpoints are fixed. But in the totality of paths from $x$ to all, not necessarily fixed $y$, all paths are homotopic; indeed, the path space of any space is contractible. — Арсений Кряжев, Apr 11 '23 at 01:06
@АрсенийКряжевiswithUkraine right I understand that, but I don't see what this is saying beyond/outside of homotopy theory. If there is an analogy between paths in topological spaces and proofs of propositions, what is this fixed endpoint/free endpoint contractible subtlety about paths in topology saying about proofs in logic? — D.R., Apr 11 '23 at 02:22
@ZhenLin I like this deceptively simple but actually profound phrase "everything is invariant under homotopy, up to homotopy". Do you have some "explicit example" showing this philosophy is insightful in general math, like how I mentioned the philosophy of univalence is insightful even in the simple example of $\mathbb N$ vs. $\mathbb Z$? And could you explain how the J-rule captures this slogan?
I also don't know this term "proof-irrelevant". I am already convinced of nlab's (IIa) as I mentioned above in my post, which I think is what you mean by "equal things have the same properties". — D.R., Apr 11 '23 at 02:33
One way to view J is in the context of other induction principles. Loosely, an induction principle for an inductive type tells you that, to map out of the type, it suffices to specify what happens to the constructors. E.g., to map out of a $\sum$ type, you only have to specify what happens to pairs. Similarly, J tells you that to map out of the identity type (family), it suffices to specify what happens to the reflexivity terms. This way, we can see that identity types are plain old inductive types. Whats surprising is that they have "room" for non-trivial paths at all. — IsAdisplayName, Apr 11 '23 at 04:26
In my opinion, the best way to get a sense for path induction is to follow Riehl's advice and solve a bunch of exercises (from, e.g., the HoTT book or the new intro to HoTT book), even if you are not yet convinced of the principle. — IsAdisplayName, Apr 11 '23 at 04:30
Some mathematicians do not believe in the axiom of choice, existence of infinite sets, or the law of the excluded middle. You are not obligated to believe in the "path induction principle". — Somos, Apr 11 '23 at 21:56
@Somos This is true. But in threads about those questions (e.g. https://math.stackexchange.com/questions/1253650/is-the-axiom-of-choice-really-all-that-important, or https://math.stackexchange.com/questions/120925/what-is-the-advantage-of-accepting-the-axiom-of-choice-over-other-axioms), answers have been provided along the lines of e.g. "the negation of AoC is equivalent/leads to this "weird" consequence", which the initial poster (or anybody else) may be more inclined to believe. Such an answer would fit perfectly here, and in fact pretty much what I'm looking for. — D.R., Apr 12 '23 at 03:49
I can't get into details in a brief comment, but those answers are "pro-choice" and don't present the case against AOC. In brief, much of "modern mathematics" beginning with ZFC set theory is mostly non-constructive "imaginary" and it is very convenient to accept AOC to bring some order to this imaginary realm. For example, every vector space has a Hamel basis. This is non-constructive existence. — Somos, Apr 12 '23 at 11:00
@somos sorry yes I mistyped, I meant to say "AoC is equivalent/leads to this totally reasonable sounding thing (e.g. Cartesian product of non-empty sets is non-empty; or a well-behaved notion of cardinality), which the initial poster (or anybody else) may be more inclined to believe". So for my question, a potential answer I could consider good would give some equivalent statement of the path induction principle that may sound more reasonable to believe. — D.R., Apr 13 '23 at 05:10

score 10 · Answer 1 · answered Apr 11 '23 at 23:14

You need to take a few steps back and think about what equality means logically. The fundamental principle is that identicals are indiscernible. That means:

In a context where $x$ and $y$ are variables of the same type and $\phi$ is a proposition where $y$ does not appear, assuming $x = y$ and $\phi$, we can deduce $\phi$ with some (or none, or all) appearances of $x$ replaced by $y$.

For example, imagine $x$ and $y$ are variables of type $\mathbb{N}$ and $\phi$ is the proposition "$x$ is even". Then the rule says: assuming $x = y$ and $x$ is even, we can deduce $y$ is even. Or, imagine $\phi$ is proposition "$x + 1 = x + 1$". Then the rule says: assuming $x = y$ and $x + 1 = x + 1$, we can deduce $x + 1 = y + 1$. This rule is so obvious that we almost never think about it, but if you have never even thought about it once before, it is worth pausing now to reflect on it. Or perhaps you could try using this rule to formally prove some basic facts about equality, like symmetry and transitivity.

It is a bit tricky to formalise replacing only some appearances of a variable in a formula, so we usually do it in reverse when formalising the rule:

In a context where $x$ and $y$ are variables of the same type and $\phi$ is a proposition possibly depending on $x$ and $y$, assuming $x = y$ and $\phi$ with (all appearances of) $y$ replaced by $x$, we can deduce $\phi$.

But this is precisely the proof-irrelevant form of the J rule! There is not really a formal process for "promoting" a proof-irrelevant logical rule to a proof-relevant one, but we could imagine something like this. (This may or may not be how Martin-Löf invented the rule.)

First, we need to introduce a variable $p$ of type $x = y$, because we are working in a propositions-as-types system. Then, we need to allow $p$ (as well as $x$ and $y$) to appear in $\phi$, because we are working in a proof-relevant system. Since $\phi$ is a type now, instead of assuming $\phi$, we are given terms of type $\phi$. Thus:

In a context where $x$ and $y$ are variables of the same type, $p$ is a variable of type $x = y$, and $\phi$ is a type possibly depending on $x, y, p$, given a term $t$ of type $\phi$ with $y$ replaced by $x$ and $p$ replaced by $\textrm{refl}_x$, $J (t; x, y, p)$ is a term of type $\phi$.

We also need a rule to reduce $J (t; x, y, p)$ back to its inputs:

Furthermore, $J (t; x, x, \textrm{refl}_x)$ reduces to $t$.

Voilà: this is the identity induction principle.

Sorry I'm reading this again after a while (it is my favorite answer, but I'm not accepting since I still feel like I want other perspectives), and don't remember what you meant by $J(t; x,y,p)$; I also seemed to forget where else used/introduced that notation. Also to be clear on the other notation, you have a type $\phi= \phi(x,y,p)$, and a term $t$ of type $\phi(x,x, \text{refl}_x)$, right? Do you also have any more thoughts/intuitions on the more "philosophical" side, e.g. on 'What phenomenon is this "2-fixed points" vs. "1-fixed point" trying to capture about these "proofs of equality"'? — D.R., Dec 04 '23 at 06:12

Dan Doel · Answer 2 · 2023-04-14T23:44:09.507

$\newcommand{\J}{\mathsf{J}}$ $\newcommand{\K}{\mathsf{K}}$ $\newcommand{\refl}{\mathsf{refl}}$ $\newcommand{\Id}{\mathsf{Id}}$

The original motivation of the $\J$ rule is essentially the same as motivating any induction rule by thinking about 'standard' elements. For instance, for the natural numbers we are given two ways of building them: zero and successor. Then the standard explanation of induction is that states that every value is built by applying one of these two constructors finitely many times.

The identity type is similar. In this case, one thinks of an inductively generated family of types, rather than a single type. It's most convenient to consider the family to be:

$$\Id_A x : A → \mathsf{Type}$$

Then this family is given a single generator:

$$\refl : \Id_A x\ x$$

which generates a value only in the portion of the family over $x$. So now, the standard explanation is that if we have $e : \Id_A x\ y$, it must actually be $\refl$, and $x$ must be $y$, and this is, allegedly, what the induction principle $J$ is trying to express. So, for some other family $F$, and a value $v : F\ x\ \refl$, we can get a value $F\ y\ e$, because actually $y$ is $x$ and $e$ is $\refl$, so $v$ is such a value.

Of course, the whole point of HoTT is that this is not actually what $\J$ ensures, much like induction on the natural numbers doesn't rule out non-standard models. And it can be fruitful to add principles (like univalence) that are motivated by the 'non-standard' models, rather than the 'standard' one (like $\mathsf{K}$).

Edit: I'm not sure if this helps, but it is potentially another aspect:

Martin-löf was trying to write rules for (constructive) sets (or maybe some kind of realizability underlying them). His rules are (I think) pretty natural from that perspective. The identity type appears to be the definition of a family of types that is trivially inhabited along the diagonal ($\Id_A x\ x$) and empty in the off-diagonal. And the $\J$ rule easily leads to a substitution rule by eliminating into a family that ignores the equality proof.

The remarkable thing is that the rules still make sense when you interpret identity types/proofs in terms of homotopy paths, as long as you leave out later attempts to 'fix' the identity type to actually behave more like set theoretic equality. This is the purpose of the $\mathsf{K}$ rule, for example. By thinking about the 'standard' interpretation of what the $\J$ rule is supposed to mean, you'd probably expect it to entail the $\mathsf{K}$ rule, because, "induction means that only generated values exist," and that is what $\K$ says, too, except specialized to diagonal values. But in fact it doesn't.

This is also part of the philosophy behind teaching things this way, I believe. Part of the 'agenda' is to show that homotopical mathematics is in many ways just as natural as set theoretic mathematics. And one way you can tell this is that someone who was trying to write down a system for set theoretic mathematics ~50 years ago accidentally wrote down rules for homotopical mathematics instead, and you just have to not add rules that force it to be set theoretic.

Right but I don't see any reason to believe that that type family should be generated by a single generator refl. Indeed I think that this way of presenting the material is not good for first time learners, because as you said the focus of HoTT is really on the non-standard models (and the intuitions therein), whereas the entirety of one's intuition about $\mathbb N$ comes from the standard model. Although technically the situation may be similar to that of $\mathbb N$, philosophically it is not. — D.R., Apr 11 '23 at 19:10
You phrase the univalence axiom as something that could be done, if one pursues the "side-road" of non-standard models; but if HoTT is to be thought of as foundation of math, I don't want it to feel like a "side-road", I want it to feel inevitable. Like a new perspective on the "main road" of math, a re-analysis of the earth on which we already tread. Not a "could be done", but a "must be done". — D.R., Apr 11 '23 at 19:10
'The family inductively generated by $\mathsf{refl}$,' is the definition of the identity type, in Martin-löf type theory. I don't think I can help if you "don't believe" the definitions of things. — Dan Doel, Apr 11 '23 at 19:44
As @DanDoel says. If we use the non-based version of the identity type (which is equivalent to the fixed base version), that defines exactly the initial reflexive relation on a type. Its induction principle is equivalent to the statement of initiality and also equivalent to J. — András Kovács, Apr 11 '23 at 23:32
I don't think it can be argued that equality should not be an initial reflexive relation. What we can choose is whether it should be a proof-relevant relation, which is compatible with HoTT, or a proof-irrelevant one, which isn't. Intensional type theory is consistent with both. — András Kovács, Apr 11 '23 at 23:34
@DanDoel it is not that I don't believe that that is the definition; of course it is. But I am asking for help to be convinced that that is the correct definition. For what I mean by "wrong" or "correct" definition, see https://mathoverflow.net/questions/31358/can-a-mathematical-definition-be-wrong. — D.R., Apr 12 '23 at 03:45

score 2 · Answer 3 · answered Apr 11 '23 at 20:37

2

The induction rule for identity types is, I think, quite natural in the context of extensional type theory. See nCatLab extensional type theory article for instance.

Identity types in the context of intensional type theory have a much more complex structure, but possibly the initial intuition of Per Martin-Löf was first inherited from the extensional case ? I have searched a few months ago the initial article where identity types were mentioned to try to get an answer, but this was not successful.

The similarity between the induction rules for natural numbers and identity types is also not very appealing to me. There is of course a similarity, but it remains fuzzy in the details, when for other inductive types, the induction rules are not just similar, they can really be derived from the same meta-theoretical concept.

Identity types in the context of intensional type theory can be complicated objects. So isn't normal that the expected, desired behaviour of such complicated objects is not fully obvious at first? It is the result of experience and research (see for instance equality and dependant type theory), and it becomes natural only when you become more familiar with the objects of the theory.

answered Apr 11 '23 at 20:37

L. Garde

641

Thank you for trying to answer my question on the history/initial days of identity types, even if the search was unsuccessful. I'm glad to hear that you find the "similarity between the induction rules ... is also not very appealing"; as a novice it's hard to know what the professionals deem trivial/natural. – D.R. Apr 12 '23 at 04:05
1

Yes, it is "normal that the expected, desired behaviour of such complicated objects is not fully obvious at first", but as with all math, I think that there should always be hints/foreshadows of this depth. It is the task of the expositor to find the simplest, most natural (in the perspective of the intended audience) non-trivial demonstration of the interesting/new phenomena. – D.R. Apr 12 '23 at 04:05
@D.R. Sorry, I am not a professional, don't infer too much from my answer ! – L. Garde Apr 20 '23 at 19:37
@D.R. From my point of view the identity induction principle is natural for structures (types): you expect that a proof p of a property P(A,B,e) about 2 structures A and B and a proof e that they are equivalent, can be derived automatically by a constructive transport from a proof p' of P(A,A,refl_A). The J-rule gives you this result, but not constructively: it asserts that the expected proof holds, but does not compute p. This intuition about types however does not extend so naturally to terms... – L. Garde Apr 20 '23 at 20:10
The similarity is not fuzzy at all. Identity types and other inductive types are all special cases of inductive families, as developed by Dybjer. They all share a uniform set of rules. – Trebor Apr 29 '23 at 14:40

Help! I don't believe in the identity elimination rule for Martin-Löf type theory/HoTT!

3 Answers3