https://math.stackexchange.com/a/5075056/6708

Skip to main content
 
 

Stack Exchange Network

Stack Exchange network consists of 183 Q&A communities including
Stack Overflow, the largest, most trusted online community for
developers to learn, share their knowledge, and build their careers.

Visit Stack Exchange
[                    ]
Loading...
 1.  
      + Tour Start here for a quick overview of the site
      + Help Center Detailed answers to any questions you might have
      + Meta Discuss the workings and policies of this site
      + About Us Learn more about Stack Overflow the company, and our
        products
 2.  
 3. current community

      +  
        Mathematics
        help chat
      + 
         
        Mathematics Meta
   
    your communities

    Sign up or log in to customize your list.

    more stack exchange communities

    company blog
 4.
 5. Log in
 6. Sign up

Mathematics

 1.
     1.  
        Home
     2.  
        Questions
     3.  
        Unanswered
     4.  
        Tags
     5.
     6.  
        Chat
     7.  
        Users
     8.
 2. Teams
    [teams-promo]

    Ask questions, find answers and collaborate at work with Stack
    Overflow for Teams.

    Try Teams for free Explore Teams
 3.  
    Teams
 4. Ask questions, find answers and collaborate at work with Stack
    Overflow for Teams. Explore Teams

Teams

Q&A for work

Connect and share knowledge within a single location that is
structured and easy to search.

Learn more about Teams

Can PA prove "each Goodstein sequence can be proven in PA to reach
zero"?

Ask Question
Asked 3 days ago
Modified today
Viewed 458 times
6
 
$\begingroup$

This is one of a pair of questions trying to understand this comment
on the xkcd forum contest My number is bigger than yours!. For a
definition of Goodstein sequences, see this question.

Let $G(n)$ be a formula that says "the Goodstein sequence starting
with $n$ eventually reaches $0$". Now, it is known that Peano
arithmetic ($\mathrm{PA}$) can prove any particular instance $G(n)$
where $n$ is a "standard" natural number (e.g. $15$, $268$, etc.),
but it cannot prove $\forall n \in \mathbb{N}: G(n)$.

The proof in $\mathrm{PA}$ for any particular instance isn't
difficult: just calculate enough terms of the sequence until it
reaches $0$. But we only know this always works (and know how many
terms to calculate) thanks to a stronger system (such as $\mathrm{ZF}
$) that can prove Goodstein's theorem.

Now, the comment author claims that in fact, this can be proven using
$\mathrm{PA}$ itself. I find that dubious, since the simple argument
above does not go through. But perhaps there is some way to use parts
of the proof of Goodstein's theorem that are accessible to $\mathrm
{PA}$ (using ordinal numbers up to, but not including, $\
varepsilon_0$).

Just to clarify the question, let us use the notation $P_{\mathrm
{PA}}(p, \ulcorner\phi\urcorner)$ to mean "$p$ codes for the proof of
$\phi$ in $\mathrm{PA}$". Then my question is what system we need to
prove:

$$ \forall n \in \mathbb{N}: \exists p \in \mathbb{N}: P_{\mathrm
{PA}}(p, \ulcorner G(n)\urcorner) $$

Is $\mathrm{PA}$ enough, or do we need a stronger system, e.g. $\
mathrm{ZF}$?

  * ordinals
  * proof-theory
  * peano-axioms

Share
Cite
Follow 
edited Jun 10 at 22:58
 
J.G.'s user avatar
J.G.
118k88 gold badges7979 silver badges146146 bronze badges
asked Jun 10 at 21:55
 
Kotlopou's user avatar
KotlopouKotlopou
75955 silver badges1717 bronze badges
$\endgroup$
19

  * 1
    $\begingroup$ I've given it a try. $\endgroup$
    - J.G.
    Commented Jun 10 at 22:58
  * 1
    $\begingroup$ The answer is that PA is enough. I'm writing up an
    explanation, but it is kind of long. $\endgroup$
    - btilly
    Commented 2 days ago
  * 1
    $\begingroup$ @CarlMummert I think maybe we can distill it down
    to two claims that suffice for a positive result, that PA proves
    each of the following: 1) $\forall k\; PA \vdash TI(\omega_k) $
    and 2) $\forall n \exists k\; PA\vdash (TI(\omega_k) \to G(n)).$
    I think those are both true (though if your point is just that
    it's hard to see if this issue, which is the crux, is adequately
    explained in the answer, then I certainly agree with that). $\
    endgroup$
    - spaceisdarkgreen
    Commented 2 hours ago
  * 1
    $\begingroup$ @CarlMummert $n$ and $k$ are not restricted to
    standard here... I'm claiming (or saying I think so anyway) that
    1) and 2) can be proven in PA. e.g. for 1) I don't see an issue
    with performing the usual inductive proof of this fact inside PA.
    $\endgroup$
    - spaceisdarkgreen
    Commented 2 hours ago
  * 2
    $\begingroup$ @CarlMummert Sorry, maybe I'm off-base here, but
    here's how I'm thinking about it: (1) and (2) are statements in
    the language of arithmetic (I'll refer to this as the
    metalangauge, for clarity) that I'm thinking are provable in PA.
    $\omega_k$ is a term in the language of arithmetic (the object
    langauge) defined in arithmetic (the metalanguage), recursively
    on $k$. (We can also quantify in metalanguage, outside the
    turnstile, over formulas $TI$ ranges over, but I think we only
    need to talk about once instance of that schema here.) $\
    endgroup$
    - spaceisdarkgreen
    Commented 2 hours ago

  |  Show 14 more comments

1 Answer 1

Sorted by: Reset to default
[Highest score (default)     ]
3
 
$\begingroup$

PA is enough, because PA can encode computation.

This is longer than I expected, and was made longer still by some
browser crashes. But I'd been idly thinking of writing these ideas
up. I hadn't for these reasons.

 1. It is a lot of work.
 2. What I have to say is obvious to logicians, and they would
    consider the detour into programming to only be a distraction.
 3. Computer programmers who can appreciate the programming detour,
    are mostly not that interested in logic.

But this question hits the perfect intersection, and so I'll outline
my thoughts in some detail. First I will answer the question in
detail, assuming that anything you can do by a mechanical process
(for example on a computer), can be done in PA. Then I will explain
how PA encodes computation by outlining how to bootstrap Lisp from
PA. If you're curious about bootstrapping Lisp, search for "Prelude".
Otherwise, read on.

To answer this question, it suffices for PA to be able to prove how
long the proof must be for any particular $G(n)$, and to prove that
it can construct that proof. I will show this by demonstrating that
PA can construct a proof of length $O(\log^*(n) \log(\log^*(n)))$,
where the funny function is the very slowly growing iterated
logarithm. In general, it takes more work to write out $n$, than it
does to write out a partial proof of Goodstein's theorem which covers
$n$!

But since larger values of $n$ need longer proofs, this doesn't lead
to a proof in PA of Goodstein's theorem.

First a detour into the roots of Goodstein's theorem.

---------------------------------------------------------------------

What are ordinals?

In John Von Neumann's famous construction, you get ordinals in ZFC
from the following rules.

 1. Every ordinal is a set.
 2. The empty set $\{\}$ is an ordinal, we call it $0$.
 3. If $\text{ord}$ is an ordinal, then so is $\text{ord} \cup \{\
    text{ord}\}$. (In Lisp notation we can call that $(s \, \text
    {ord})$.)
 4. If $X$ is a set of ordinals, then the union of $X$ is also an
    ordinal.

In this construction, the ordinals have an order $<$ created by the
rule that $x < y$ if and only if $x \in y$.

So the ordinals start off with $0 = \{\},\; 1 = \{0\},\; 2 = \{0, 1
\},\; \ldots,\; \omega = \{0, 1, 2, \ldots\},\; \omega + 1 = \{0, 1,
2, \ldots, \omega\}$ and so on. Where "and so on" quickly boggles the
mind.

The Goodstein Sequence is built around a hereditary base notation.
This notation represents an ordinal written out in Cantor normal form
.

Cantor normal form works like this. Any ordinal $\text{ord}$ can be
written out uniquely in the form $((n_1, \text{ord}_1), (n_2, \text
{ord}_2, \ldots, (n_k, \text{ord}_k))$ where each $n_i$ is a positive
natural number, each $\text{ord}_i$ is an ordinal, and $\text{ord}_1
> \text{ord}_2 > \ldots > \text{ord}_k$.

I will not prove this result. Nor will I prove that this notation
always represents an ordinal. But the following example gives you an
idea of how it works out in practice.

In this notation, the empty list, $()$, represents the ordinal $0$.
And $((n_1, \text{ord}_1), (n_2, \text{ord}_2, \ldots, (n_k, \text
{ord}_k))$ represents $n_1 \omega^{\text{ord}_1} + n_2 \omega^{\text
{ord}_2} + \ldots + n_k \omega^{\text{ord}_k}$.

For example, we can write $\omega^{\omega}$ as $((1, \omega))$. Which
in turn is $((1, (1, 1)))$.

In this notation, $<$ is calculated lexicographically as follows.

To compare $((n_1, \text{ord}_1), (n_2, \text{ord}_2, \ldots, (n_k, \
text{ord}_k))$ to $((m_1, \text{ord2}_1), (m_2, \text{ord2}_2, \
ldots, (m_k, \text{ord2}_k))$, first we compare $\text{ord}_1$ to $\
text{ord2}_1$, then we compare $n_1$ to $m_1$, then we compare $\text
{ord}_2$ to $\text{ord2}_2$, then we compare $n_2$ to $m_2$, and so
on. We stop as soon as any comparison shows that one is larger than
the other, or if one ordinal sequence runs out before the other.
(Shorter is smaller.)

Now go back and look at the Goodstein Sequence, and convince yourself
that hereditary base notation represents ordinals written out in
Cantor normal form.

---------------------------------------------------------------------

About induction.

The fifth axiom of PA gives us induction. Here is what it says.

Suppose that $S$ is a statement that we can make about natural
numbers. Suppose further that we can prove two things:

 1. $S$ is true of $0$.
 2. If $S$ is true of $n$, then $S$ is true of $(s \, n)$. (I will
    use Lisp notation a lot in this answer. I'm just getting you used
    to it right now.)

Then $S$ is true of all natural numbers.

From induction, PA can recursively define $<$ and show that it is
true. And it can also prove the theorem that strong induction works:

Strong Induction Suppose that $S$ is a statement about the natural
numbers. Suppose that we can prove that, for every natural number $n$
, if $S$ is true for all natural numbers less than $n$, then it must
be true for $n$.

Then $S$ is true for all natural numbers.

Note that no specific reference is made to $0$. But $0$ works because
there are no natural numbers that are less than $0$, and every
possible statement is true about all members of the empty set. That
said, it is often convenient in proofs to deal with $0$ as a special
case.

In ZFC, we can go further. We can prove that strong induction works
for the ordinals. The result is a kind of induction that goes up to
infinite ordinals, so we call it transfinite induction.

I will demonstrate how transfinite induction works with two proofs
about things written out in Cantor normal form. (Remember, I told you
that these are ordinals, but I didn't prove that they actually are
ordinals. So these theorems do not immediately follow for us from
transfinite induction.)

Theorem 1: Any descending sequence of numbers in Cantor normal form
must be of finite length.

Theorem 2: We can prove statements about numbers in Cantor normal
form with transfinite induction.

Note that the observation that hereditary base notation is something
written out in Cantor normal form, and Theorem 1, we can prove
Goodstein's theorem. We will need Theorem 2 later.

Here are the proofs.

Theorem 1: Proof Suppose that we have a descending sequence of
numbers written in Cantor normal form. If it is an empty sequence,
then the theorem is true, so let's assume that it is a nonempty
sequence. Let's look at the first element in the sequence.

If the first entry is the empty list, then the list cannot have more
members, and the theorem is true of this sequence. Therefore we may
assume that the first entry has a first term of the form $(n, \text
{ord})$ with $n$ a positive integer, and $ord$ an ordinal. And, by
the induction hypothesis, we know that all descending sequences that
start with a smaller ordinal, or with the same ordinal and a smaller
natural number, must be finite in length.

We shall break the original sequence into two descending sequences,
with the second possibly empty. The first sequence is the members of
the sequence whose entries have $(n, \text{ord})$ as their first
term. The second sequence is the members that don't.

All of the members of the first sequence can be mapped to another
sequence that just has that common first entry missing.

The second sequence might be empty, in which case it has length $0$,
which is finite.

Either sequence could start with the element $()$, in which case it
has length $1$, which is finite.

Otherwise each sequence must start with an element whose first term
is of the form $(m, \text{ord2})$, with either $\text{ord2} < \text
{ord}$, or with $\text{ord2} = \text{ord}$ and $m < n$. In which
case, by the induction hypothesis, the sequence must have finite
length.

Since the overall sequence is the combination of two finite
sequences, it must also be finite.

And now that we have the first theorem, we can prove the second
fairly easily.

Theorem 2: Proof Suppose that we have a statement $S$, and we have
shown that if this statement is true for all things written in Cantor
normal form less than a given one, then it is true of that one as
well.

Suppose further that we have $x_1$ in Cantor normal form, and
statement $S$ is false for $x_1$.

In order for $S$ to be false for $x_1$, there must exist a smaller 
$x_2$ for which $S$ is false. And below $x_2$ there must be a smaller
$x_3$, and so on. And so we get a descending sequence of things
written in Cantor normal form.

By theorem 1, this must be a finite sequence. It must wind up like $
(x_1, x_2, \ldots, x_k)$ and be unable to be extended farther. By
construction, the statement $S$ is false for $x_k$. By the induction
hypothesis, $S$ must be true for $x_k$. This is a contradiction, and
so $x_1$ cannot exist.

If $S$ can't be false for anything written in Cantor normal form,
then $S$ is true of all of them. And so the theorem is proven.

---------------------------------------------------------------------

Transfinite Induction in PA.

PA cannot prove that transfinite induction works for all ordinals.
But it can prove it for some ordinals.

First of all, PA can prove that strong induction works. This is
transfinite induction for all members of $\omega$.

If you look at the proofs of theorems 1 and 2, you'll see that from
transfinite induction on $\omega$, PA can prove transfinite induction
for $\omega^{\omega}$.

If you look at the proofs of theorems 1 and 2, you'll see that from
transfinite induction on $\omega^{\omega}$, PA can prove transfinite
induction for $\omega^{\omega^{\omega}}$.

If you look at the proofs of theorems 1 and 2, you'll see that from
transfinite induction on $\omega^{\omega^{\omega}}$, PA can prove
transfinite induction for $\omega^{\omega^{\omega^{\omega}}}$.

And so on. Absolutely nothing changes between these proofs other than
the height of the tower of $\omega$s. Therefore they can be produced
purely mechanically. Because you we are writing out that tower each
time, the $m$th such proof winds up with length $O(m)$. And to get
there we need $m$ proofs. And so we see that a proof of this exists
for a tower of height $m$ in PA, and the proof is of length $O(m^2)$.

These proofs can be trivially shortened with a notation such as $\
omega^{[m]}$ instead of a tower of height $m$. Writing out $m$ can be
done in length $O(\log(m))$, so this makes each proof $O(\log(m))$.
That makes the overall proof only $O(m \log(m))$.

And so for every ordinal in $\varepsilon_0$, there exists a proof of
transfinite induction in PA for that ordinal. But, combined, these
proofs are of infinite length. And so don't give a proof of
transfinite induction for $\varepsilon_0$. Which is a good thing,
because from transfinite induction on $\varepsilon_0$ it is possible
to prove that PA is consistent. So if PA proved transfinite induction
on $\varepsilon_0$, then it would run afoul of Godel's second
incompleteness theorem, and we can't have that!

Note that for any given $n$, we only need to prove it to the height
of the tower that its hereditary base notation has. Which height is 
$O(\log^*(n))$, where that is the iterated logarithm. A function that
PA can easily compute, and which grows extremely slowly. This is
exactly what I promised I would show at the start.

Read on if you want to think about how one could encode Lisp in PA.

---------------------------------------------------------------------

Prelude.

When I talk about "encoding", this is not strictly a mathematical
concept. Concepts exist in our minds, and not in mathematical
symbols. We decide what things will mean. And they can mean anything.
We can, as long as we keep our context straight, even overload
meanings. So the same number can "mean" one thing one time, and
another thing another time.

For a simple example, consider the signal that kicked off Paul
Revere's famous ride. "One if by land, two if by sea." In this way a
collection of lanterns, representing a single number, could mean "by
land" or "by sea". But if Robert Newman had hung 42 lanterns up, it
would have just caused confusion. Because Douglas Adams hadn't yet
told us the answer to life, the universe, and everything. And
computer scientists had not yet decided that in ASCII 42 means *.
Since programmers know that * is the symbol for everything, the
punchline to Douglas Adams' joke actually arrived before the joke was
made. Yes, that is recursive humor. It seemed appropriate given that
I will discuss bootstrapping.

Yes, I am a dad. Why do you ask?

Moving along...

---------------------------------------------------------------------

Why Lisp?

I will be giving all of my programming examples in Lisp. That's
because it is easy to explain and interpret.

Here is the explanation. The important thing that you have to
remember is this. Most Lisp terms look like (command and arguments).
The first word is a command that describes what to do with and
arguments. Usually the command will be a function, so we eval each
argument in turn. But there are some weird, but important,
exceptions. They usually make sense when you see them, and there
aren't that many. For example:

; Any side effects only exist in your mind...
(defvar false 0)
(defvar true 1)

It would make no sense to try to eval variables while we are trying
to define them, so we don't. This is an example of a special form. We
will see others.

Incidentally, we just encoded Booleans as numbers.

Not all terms have to have parentheses. For example 6 is a perfectly
valid Lisp term, and I think that you can guess what its value is.
Now that we've defined the variables, true and false are valid terms
as well.

We might as well throw in basic logic. If you can't figure out how
these work, come back after reading the next section.

(defun not (x)
    (if x
        false
        true)

(defun and (x y)
    (if x
        y
        false))

(defun or (x y)
    (if x
        true
        y))

Strictly speaking, a programming language doesn't really need
comments. But Lisp has them, and they start with a ;.

Lisp isn't just easy to explain to humans. When you're trying to
bootstrap a programming language, one of the things that you have to
do is have the computer parse the language. Consider something as
simple as, 1 * 2 + 3 * 4. To properly get 14 out, your parser has to
figure out operator precedence. That's surprisingly tricky.

But Lisp makes it easy. The equivalent expression is (+ (* 1 2) (* 3
4)). The first time you see this, you may need to work to figure out
what it says. That's just because it is unfamiliar. But it is really
easy to program a computer to look for balanced parentheses, then
always look for the command at the first term.

---------------------------------------------------------------------

What is built into PA?

The Peano axioms give us the following useful foundation.

  * 0
  * The notion of a successor, which gives us the function (s n).
    (You can think of it as (+ n 1), though we actually will define +
    in terms of s and not the other way around.)
  * The notion of "sameness", which lets us define a function (= n m)
    that returns 0 or (s 0) depending on whether the two are
    different or the same.
  * The notion of a predecessor, which is defined for all numbers
    except 0. This gives us the function (p n). (But don't call it on
    0!)
  * The ability to define things recursively, as long as the
    recursion can be proven to work by induction. An example of a
    recursion that doesn't work is (defun growing (n) (growing (s
    n))). As you start trying to evaluate it, the numbers keep
    growing and growing, and you never get to the end!
  * The ability to do something differently when we see a 0 or a 1.
    This lets us build the special form (if test on-true on-false).
    That allows for conditional logic like we used to build boolean
    logic.

And that's it. Is it enough?

---------------------------------------------------------------------

Basic number theory

PA lets us define familiar basic functions, and prove their basic
properties. Here is an example:

; It would be nice to know when one number is larger than another...
(defun < (n m)
    (if (= m 0)
        false
        (if (= n 0)
            true
            (< (p n) (p m)))))

(Exercise for the reader. Prove by induction on n that this is a
well-defined function for any pair of natural numbers (n, m).)

That nested logic gets annoying, quickly. That's why Lisp defines
cond. It lets us write that same function like this:

; It would be nice to know when one number is larger than another...
(defun < (n m)
    (cond
        (= m 0) false
        (= n 0) true
                (< (p n) (p m)))))

Here cond is a macro. It just takes its arguments, rewrites it as the
previous nested if statements, and then executes it with eval. On
more complex chains of conditional logic, this becomes a big saving.

Here are some more basic functions that you would expect from
elementary school. If you work out why these are the usual functions,
you'll be well on your way to reading Lisp.

; These seem useful
(defun min (n m)
    (if (< n m)
        n
        m))

(defun max (n m)
    (if (< n m)
        m
        n))

; We should be able to add numbers...
(defun + (n m)
    (if (= n 0)
        m
        (s (+ (p n) m))))

; After a while, you stop noticing that stack of closing parens.

; Multiplication seems important...
(defun * (n m)
    (if (= n 0)
        0
        (+ (* (p n) m) m))))

; Luckily this code runs at the speed of thought

;Exponentiation anyone?
(defun pow (n m)
    (if (= m 0)
        1
        (* n (pow n (p m)))))

; You may have to think faster...

; % gives you the remainder after division
(defun % (n m)
    (cond
        (= n 0)               0
        (= (% (p n) m) (p m)) 0
                              (s (% (p n) m) m)))

; // is integer division, it drops the remainder
(defun // (n m)
    (cond
        (= n 0)       0
        (= (% n m) 0) (s (// (p n) m))
                      (// (p n) m)))

If you prove why these functions have the properties you expect, then
you'll be well on your way to understanding PA. For example about +
we can prove from the definition the following statements:

  * (= (+ 0 m) m)
  * (= (+ (s n) m) (s (+ n m)))

From those two statements, you can prove by induction that (+ n m) is
always well-defined. (Meaning that for any pair of arguments, there
is always one and only one possible answer.)

By induction, we can then prove the following two statements:

  * (= (+ n 0) n)
  * (= (+ n (s m)) (s (+ n m)))

From the fact that the first pair of statements uniquely defined what
addition must be, the second pair of statements proves that (= (+ n
m) (+ m n)) - ie the commutative law holds.

---------------------------------------------------------------------

From one we get two.

So far we are just doing basic arithmetic. Let's introduce some
programming. Let's see how to get two numbers out of one number, or
one number out of any two numbers.

The whole idea here is this. We can write any number in base 2. If we
just look at the odd bits, we get one number that we can call the
head of the pair. If we just look at the even bits, we get another
number that we can call the tail of the number. We can encode 2
numbers as one just by interleaving their bits.

I've had to be slightly clever, but the following functions do
exactly that. (Unless I have a bug. You know how it goes.)

; I thought about calling these car and cdr...
; ...then decided that I'm not really THAT addicted to Lisp

(defun get-binary-head (n)
    (if (= n 0)
        0
        (+ (% n 2)
           (* (get-binary-head (// n 4)) 2))))

(defun get-binary-tail (n)
    (get-binary-head (// n 2)))

; All pure functional programming - return the new structure.
(defun make-binary-head (n)
    (if (= n 0)
        0
        (+ (% n 2)
           (* (make-binary-head (// n 2)) 4))))

(defun make-binary-tail (n)
    (* (make-binary-head n) 2))

(defun make-binary-pair (n m)
    (+ (make-binary-head n) (make-binary-tail m)))

(defun set-binary-head (pair n)
    (+ (make-binary-head n) (make-binary-tail (get-binary-tail pair))))

(defun set-binary-tail (pair n)
    (+ (make-binary-head (get-binary-head pair)) (make-binary-tail n)))

In order to prove that these work in PA, you'll need to first prove
the theorem of strong induction. This theorem says that to prove that
a statement S is true of all natural numbers, it suffices to prove
the following two statements:

 1. S is true of 0.
 2. If S is true for all n with (< n m), then S is true of m.

(This has to be a theorem, because PA did not start with a notion of
"less than".)

---------------------------------------------------------------------

From two, many.

As soon as a competent programmer sees the ability to make pairs, the
next question is whether they can make linked lists. Because with
linked lists, we can build anything!

The following construction shows that we can do that in PA.

; Calling 0 by the right name can remind us of what we mean.
(defvar nil 0)

; Let's have empty lists
(defun make-empty-binary-list ()
    nil)

; Extend a list by adding a value on the start
; (Note the use of s to ensure that nonempty lists are never nil)
(defun add-binary-list-head (head list)
    (make-binary-pair (s head) tail)

; Be sure to undo the s in the last function!
(defun get-binary-list-head (list)
    (p (get-binary-head list)))

; What's in a name?
(defun get-binary-list-tail (list)
    (get-binary-tail list))

; How long is the list?
(defun binary-list-length (list)
    (if (= list nil) 0
        (s (binary-list-length (get-binary-list-tail list)))))

; Let's get any element!
(defun get-binary-list-at (list n)
    (if (= n 0)
        (get-binary-list-head list)
        (get-binary-list-at (get-binary-tail list) (p n))))

; Let's insert an element anywhere
(defun insert-binary-list-at (list n value)
    (if (= n 0)
        (add-binary-list-head value list)
        (add-binary-list-head
            (get-binary-list-head list)
            (insert-binary-list-at
                (get-binary-list-tail list)
                (p n)
                value))))

; How big are these numbers now...?
; ...doesn't matter. There are always more!

; Let's remove an element anywhere
(defun remove-binary-list-at (list n)
    (if (= n 0)
        (get-binary-list-tail list)
        (add-binary-list-head
            (get-binary-list-head list)
            (remove-binary-list-at list (p n)))))

---------------------------------------------------------------------

From many, any.

At this point, any competent programmer will recognize that it is
game over. Once we have numbers, pairs, and lists, we can build
anything at all that we want. Stacks, queues, trees, text documents,
virtual machines, and so on. All can be represented. All can be
manipulated in any way that we can describe with a computer.

In fact we can build a single encoding that can include all of these,
all at once.

First let's create a basic list of types. Our list might begin
(type-number, type-boolean, type-pair, type-list, ...). We can keep
on adding to this list over time. For example we can use ASCII to
create a type-text. And on it goes.

Second, every number will be mapped to a pair (type, value). Where
value is expected to be of the right type. Some numbers won't
actually map to something that works as a value of that type - that's
OK. We have lots of numbers. But those that do map, will map to a
completely unique thing. Our list starts off with:

n | binary | pair   | interpretation
------------------------------------
0 |    0   | (0, 0) | the number 0
1 |    1   | (1, 0) | false
2 |   10   | (0, 1) | the number 1
3 |   11   | (1, 1) | true
4 |  100   | (2, 0) | the pair (0, 0)
5 |  101   | (3, 0) | the empty list ()
6 |  110   | (2, 1) | the pair (1, 0)
7 |  111   | (3, 1) | the list (0)
8 | 1000   | (0, 2) | the number 2
9 | 1001   | (1, 2) | INVALID BOOLEAN

If we're going to work with this encoding, it makes sense to rename
every function that we've been using so far by adding PA- to the
start of the name. That frees up the functions that we have defined
to work on things with type type-number. And now we can add this
handy convenience function:

(defn type-of (n)
    (make-binary-pair type-type (PA-get-binary-head n))

; Lisp fans might want to implement make-type...
; ...why did you need to make it a macro?

And now it is a tedious but straightforward exercise to build up our
functions to understand Lisp, a Lisp virtual machine, and a Lisp
interpreter that runs on that machine.

And, of course, Lisp is Turing complete. So we can then build any
other kind of virtual machine, for any other language we want, and so
on.

This gives us the ability to encode any possible computable
procedure, any possible state we might encounter following that
procedure, and to work out the exact state of the computation after
any number of steps.

This is what I mean when I say that PA can encode computation.

---------------------------------------------------------------------

PA also encodes proofs in PA

Logicians have created a formal language with a formal set of symbols
called First Order Logic. A proof in this language is just a list of
statements. Each statement is a single reasoning step. You are free
to write down statements that are complete nonsense, but some make
sense. You are allowed to write down a proof with invalid reasoning
steps, but that is an invalid proof.

In other words, we can create a type type-proof. It will consist of a
list of statements. Each statement is a list of characters. You can
write a program that can verify whether any individual statement is
valid. And so on. In short, it is possible to make something like the
following program work.

(defn proof_step_is_valid (proof step)
    ...)

(defn proof-is-valid (proof)
    (check-proof proof 0))

; Meta-programming with a vengeance!
(check-proves (axioms statement n)
    ; This defines a local variable
    (let proof-found
        (if
            (not (= (type-of n) type-proof))
            false ; not of type proof
            ; need another local variable
            (let proof (value-of n)
                ; the code in here knows what proof is
                (cond
                    ; the data structure encoded in the value may be invalid!
                    (not (proof-is-well-formed proof)) false
                    (not (proof-is-valid proof))       false
                    (not (proof-assumes axioms))       false
                    (not (proof-concludes statement))  false
                                                       true)))
        ; The code here knows what proof-found is
        (if proof-found true (check-proves axioms statement (PA-s n)))))

; It may be hard to figure out whether this returns...
(defn proves (axioms statement)
    (check-proves axioms statement PA-0))

In other words if any proof of any statement from any axioms exists,
there is a specific PA number that must encode it. There is a
specific function that can find that proof. And so we see that all of
logic is encoded in PA.

Incidentally, the kind of recursive self-reference that I just
demonstrated in proof is what allows us to embed self-recursion.
Which allow us to construct statements whose meaning is some
variation on, "This statement is a lie." The impossibility of proving
such statements true or false is what gives us things like Godel
incompleteness, the unsolvability of the Halting problem, and so on.

Note, Godel managed to encode logic in PA without having to also
encode computation. To a logician, embedding computation is just a
giant detour. But as a programmer, I find this a very natural way to
understand things.

If you've actually read to this point, you're likely to be a
programmer. And hopefully you agree with my point of view.

Share
Cite
Follow 
edited 3 hours ago
answered yesterday
 
btilly's user avatar
btillybtilly
1,37688 silver badges1313 bronze badges
$\endgroup$
6

  * 
    $\begingroup$ Although the content is basically correct, this is
    less an answer and more like a blog post :) $\endgroup$
    - Trebor
    Commented 18 hours ago
  * 
    $\begingroup$ @Trebor True. But it does contain an answer. $\
    endgroup$
    - btilly
    Commented 17 hours ago
  * 1
    $\begingroup$ Whoa! I certainly didn't expect this. It's amazing,
    hits a lot of the reasons I'm looking into this stuff, and I
    think will become a nice reference for the computational power of
    PA. If you look into my SO profile, you'll see that I appreciate
    Lisp :) $\endgroup$
    - Kotlopou
    Commented 8 hours ago
  * 1
    $\begingroup$ wow this is a perfect post $\endgroup$
    - Pacmanboss256
    Commented 6 hours ago
  * 1
    $\begingroup$ @Kotlopou You're welcome. I didn't know what level
    your knowledge was at, and so targeted it for a curious
    programmer without a background in logic. $\endgroup$
    - btilly
    Commented 5 hours ago

  |  Show 1 more comment

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

  * ordinals
  * proof-theory
  * peano-axioms

See similar questions with these tags.

  * Featured on Meta
  * 
    Thoughts on the future of Stack Exchange site customisation
  * 
    How Can We Bring More Fun to the Stack Ecosystem? Community Ideas
    Welcome!
  * 
    Two New Chat Rooms Experiment Geared Towards New Users

Linked

 
2
How much more theory strength does the local reflection principle
over $\mathrm{PA}$ provide?
 
2
Can Goodstein's theorem be proven in $\mathrm{PA + Con(PA)}$?

Related

 
6
How can know if a proof technique can actually prove something?
Specifically, induction
 
2
Precise definition of relative consistency in Kunen's "Set Theory"
 
5
a stronger fixed point theorem
 
1
Formulas that can be proved from the uniform reflection schema
 
0
Is PA provably definable?
 
0
Confusion about $\mathsf{PA}$'s self-provable consistency sentences
 
3
Unbounded minimal $T$-proof lengths and $T$-provably total functions
 
2
Can Goodstein's theorem be proven in $\mathrm{PA + Con(PA)}$?

Hot Network Questions

  * 
    I'm new to PCB design, please help me check if my board looks
    correct before I order it
  * 
    Why do customers who used their credit cards have "refund
    priority" over customers who didn't in case of a bankruptcy?
  * 
    How can I reduce the time to get a list of triples?
  * 
    How to import image planes directly to camera view?
  * 
    Will PhD grades be converted using ECTS during degree recognition
    in Europe
  * 
    Why is the Unix command "touch" called touch? How is it related
    to new/create or update?
  * 
    Can I travel on holiday to the US if I was born in Iran but never
    had Iranian citizenship, passport but I do have my British
    passport - Trumps new ban
  * 
    What would the advantages or disadvantages of being a
    swordfighter with digitigrade legs and a tail?
  * 
    Which industrial revolution-age technologies could be mimicked
    with magic pistons?
  * 
    Can an updraft cause a plane to lose lift
  * 
    Why not design a moon lander that can land in any orientation?
  * 
    Best method to reheat Tri Tip beef roast?
  * 
    How do I merge a volume group together with my system hard drive?
  * 
    Why would humans domesticate an omnivorous species that has
    human-level intelligence, varying sizes, and a longer lifespan
    than humans themselves?
  * 
    Why did Renata slap Ernie when he said that the agency had erased
    Quaid's memory in "Total Recall" (1990)?
  * 
    Do LLMs weaken Penrose's consciousness argument?
  * 
    ZX Spectrum Elite are vipers coming out of docking station?
  * 
    Capacitance from physics
  * 
    Vraska and layers
  * 
    What did Paul mean when he spoke in Colossians 1:24
  * 
    Which Retirement Accounts Should I Prioritize: Roth IRA, 401(k),
    or Traditional IRA?
  * 
    Will the embassy of France accept printed PDFs and screenshots as
    evidence of employment?
  * 
    Special relativity travel duration
  * 
    Determining direction of motion: particle on a rough slope
    (mechanics)

more hot questions
Question feed

Subscribe to RSS

Question feed

To subscribe to this RSS feed, copy and paste this URL into your RSS
reader.

[https://math.stackex]
 
*

Mathematics

  * Tour
  * Help
  * Chat
  * Contact
  * Feedback

Company

  * Stack Overflow
  * Teams
  * Advertising
  * Talent
  * About
  * Press
  * Legal
  * Privacy Policy
  * Terms of Service
  * Your Privacy Choices
  * Cookie Policy

Stack Exchange Network

  * Technology
  * Culture & recreation
  * Life & arts
  * Science
  * Professional
  * Business
  * API
  * Data

  * Blog
  * Facebook
  * Twitter
  * LinkedIn
  * Instagram

Site design / logo (c) 2025 Stack Exchange Inc; user contributions
licensed under CC BY-SA . rev 2025.6.13.29290