Recall that we defined the Halting Problem last class for the purposes of having an undecidable problem.
Problem 23.6. Given a program $P$ and input $I$, will $P$ terminate when run on input $I$?
Today, we will talk about the Halting problem and finally show that it is undecidable. But first, why do we care about the Halting Problem?
First, it seems counterintuitive that we shouldn't be able to determine whether a program halts or not. After all, if we study a program for long enough, the answer seems to be obvious. Of course, such things become less obvious at 3 am before a deadline, when our projects hang for no discernible reason when we run them. But more fundamentally, we should remember what it means to decide a problem. Just because there are easy instances of a problem doesn't say anything about whether the problem is decidable. As we have seen and will see, even some surprisingly simple-looking problems turn out to be undecidable.
And remember that back in the 1920s and 30s, what Turing was really asking about was whether we could tell whether the kind of purported mathematics-solving algorithm that Hilbert wanted would eventually lead to an answer or not. In mathematics, there are all sorts of problems where we would like to know if a particular object exists or not. We've actually run into one of these in this course: trying to find a proof of a statement in first-order logic. Recall that an easy "algorithm" to find a proof is to just check every string of symbols in lexicographic order to see if it's a valid proof of our formula. If there is a proof, then we'll eventually hit it. The problem is if there is no proof; in this case, our algorithm will just keep on running forever. But if we could solve the Halting problem, then we'd have a way to determine whether our formula has a proof or not! We can just run the Halting algorithm on our naive proof-finding algorithm: if it halts, then we know there's a proof, and if it doesn't, we know there isn't.
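The brute-force proof search can be sketched concretely. Here is a minimal version in Python: we enumerate every string in shortlex order (by length, then lexicographically) and hand each one to a proof checker. The `is_valid_proof` checker is a hypothetical placeholder, since a real one would depend on the proof system in question.

```python
from itertools import count, product

def all_strings(alphabet):
    """Yield every string over the alphabet in shortlex order:
    "", then all length-1 strings, then all length-2 strings, ..."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)

def find_proof(formula, is_valid_proof, alphabet="01"):
    """Halts iff some string is a valid proof of the formula.

    is_valid_proof(formula, candidate) is a hypothetical checker;
    checking proof validity is decidable, but this search is not
    guaranteed to terminate when no proof exists.
    """
    for candidate in all_strings(alphabet):
        if is_valid_proof(formula, candidate):
            return candidate
    # if no proof exists, the loop above runs forever
```

A Halting-problem oracle applied to this search would tell us, in advance, whether the formula is provable at all.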
We can solve other interesting problems in the same way. For instance, Goldbach's conjecture says that every even number greater than 2 can be written as a sum of two primes. So we can write a program that goes through every even number, checks whether it can be written as the sum of two primes, and halts when it encounters one that can't. If we threw this program at an algorithm for the Halting problem, we could immediately get an answer to Goldbach's conjecture. But we can't.
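The Goldbach search program described above is easy to write down. A minimal sketch in Python (the function and variable names are mine, not from any standard library):

```python
from itertools import count

def is_prime(n):
    """Trial-division primality check; fine for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_goldbach(n):
    """Can the even number n > 2 be written as a sum of two primes?"""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

def goldbach_search():
    """Halts iff Goldbach's conjecture is false."""
    for n in count(4, 2):  # even numbers 4, 6, 8, ...
        if not is_goldbach(n):
            return n  # a counterexample: an even number not a sum of two primes
```

Deciding whether `goldbach_search()` halts is exactly deciding Goldbach's conjecture, which is why a halting decider would be so powerful.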
To think about what Turing tries to do in his proof, we should take a step back and think again about sets. Recall that the correspondence between problems and set membership means that we can think of decision problems as sets. Bertrand Russell noticed that we can get funny sets like the one from what we now call Russell's paradox. A variant of Russell's paradox is stated in terms of barbers:
There is a barber who is said to shave all men, and only those men, who do not shave themselves. Does the barber shave himself?
Why is this important? Remember back when we were talking about syntax and semantics of first-order logic and I mentioned that one of the first to think about this stuff was Gottlob Frege. Well, Russell's paradox showed that inconsistent objects could be defined using Frege's rules, which would not be good for something purporting to be the laws of mathematics.
The actual statement of Russell's paradox is in defining the set $T = \{S \mid S \not \in S\}$. The big question here is: is $T \in T$? Suppose $T \in T$. Then by definition, $T \not \in T$. Well, that's not good. So it must be the case that $T \not \in T$, right? But then by definition, $T \in T$. And so we have that $T \in T$ if and only if $T \not \in T$, a contradiction.
The proof of the Halting problem has its roots in the kind of thing that Russell was trying to do with sets.
Theorem 23.7. The Halting Problem is undecidable.
Proof. Suppose for a contradiction that the Halting Problem is actually decidable and we have an algorithm $H$ that solves it. You can imagine that we run the algorithm by calling $H(P,I)$.
Now, we construct an algorithm $D$ which takes some program $P$ as input, described as follows: $D(P)$ first runs $H(P,P)$. If $H$ reports that $P$ halts when run on itself, then $D$ enters an infinite loop; otherwise, $D$ halts immediately. In other words, $D$ does the opposite of whatever $H$ predicts $P$ will do on itself.
So $D$ is a perfectly fine program, which means that we can now consider $H(D,D)$. That is, we consider what $H$ tells us about $D$ when run on itself as input.
Suppose $H(D,D)$ reports that $D$ halts when run on itself. Then by the definition of $D$, running $D$ on itself enters an infinite loop, so $D$ does not halt on itself. On the other hand, suppose $H(D,D)$ reports that $D$ does not halt on itself. Then by the definition of $D$, running $D$ on itself halts immediately. In every case, the definition of $D$ leads us to a contradiction. Therefore, $H$ must not exist and the Halting problem is undecidable. $$\tag*{$\Box$}$$
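The construction of $D$ can be sketched in Python. Any candidate decider must be wrong somewhere; here we use a (necessarily wrong) candidate that claims nothing ever halts, and watch the $D$ built from it halt on itself anyway. Both `make_D` and `H_never` are illustrative names of mine, not a real halting decider, which the proof shows cannot exist.

```python
def make_D(H):
    """Given a claimed halting decider H(program, input) -> bool,
    build the diagonal program D that does the opposite of what
    H predicts a program will do on itself."""
    def D(P):
        if H(P, P):       # H claims P halts on itself...
            while True:   # ...so D loops forever
                pass
        else:             # H claims P loops on itself...
            return        # ...so D halts immediately
    return D

# A candidate decider that claims nothing ever halts.
def H_never(program, inp):
    return False

D = make_D(H_never)
D(D)  # returns immediately: D halts on itself, so H_never(D, D) was wrong
```

Whatever candidate `H` we plug in, `make_D(H)` produces a program on which `H` gives the wrong answer, which is exactly the contradiction in the proof.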
This proof makes a lot of people uncomfortable because it uses a technique called diagonalization. Georg Cantor first used diagonalization to show that there were strictly more reals than natural numbers. Now, there are people out there who absolutely hate Cantor's argument and still believe that there are just as many reals as natural numbers. Much like how some people find flat-earthers or moon landing-deniers kind of entertaining, Cantor-deniers are the most entertaining crackpots to me.
At least for our purposes, much of the discomfort with our proof stems from the idea that you can run a program on itself. Why would anyone ever do that? The obvious example of an instance where this would be completely normal is a compiler. For a classic and entertaining example of what can happen when a compiler compiles its own code, see Ken Thompson's Reflections on Trusting Trust.
Now, if you are familiar with the diagonalization argument, then you can see that running a program on itself leads us right to the form of the diagonal. We can view the outcomes of the algorithm $D$ presented as a table. To see this, instead of enumerating each natural number, we enumerate every binary string. Since programs are just strings, we will eventually enumerate them all in the process of doing this. Then we construct a table with the inputs running across and the programs running down, but we are enumerating every string in either case; if a string isn't a program, then it just fails and halts. So the table tells us whether string $P_i$ treated as a program halts on input string $P_j$.
|          | $P_1$ | $P_2$ | $P_3$ | $P_4$ | $P_5$ | $\cdots$ |
|----------|-------|-------|-------|-------|-------|----------|
| $P_1$    | ✔️    | ❌    | ✔️    | ✔️    | ❌    | $\cdots$ |
| $P_2$    | ✔️    | ❌    | ✔️    | ❌    | ❌    | $\cdots$ |
| $P_3$    | ❌    | ✔️    | ✔️    | ✔️    | ❌    | $\cdots$ |
| $P_4$    | ✔️    | ❌    | ❌    | ❌    | ❌    | $\cdots$ |
| $P_5$    | ✔️    | ❌    | ✔️    | ✔️    | ❌    | $\cdots$ |
| $\vdots$ |       |       |       |       |       | $\ddots$ |
Then the outcome of $D$ is going to be found along the diagonal and the question is what we will find when we reach $D$.
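The diagonal trick can be demonstrated on any finite prefix of such a table: flipping the diagonal produces a row that differs from every listed row, which is exactly how $D$'s behaviour differs from every program in the enumeration. A small Python illustration (the table entries are the arbitrary ones from the table above, with True for ✔️):

```python
# Row i records whether program P_i halts on inputs P_1, ..., P_5 (True = halts).
table = [
    [True,  False, True,  True,  False],
    [True,  False, True,  False, False],
    [False, True,  True,  True,  False],
    [True,  False, False, False, False],
    [True,  False, True,  True,  False],
]

# Flip the diagonal: entry j of the new row is the opposite of table[j][j].
diagonal_flip = [not table[j][j] for j in range(len(table))]

# The flipped row disagrees with row i in position i, so it matches no row.
for i, row in enumerate(table):
    assert diagonal_flip[i] != row[i]
```

This is the same argument Cantor used for the reals: the flipped diagonal cannot appear anywhere in the enumeration.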
Example 24.1. The undecidability of the Halting problem actually gives us the Incompleteness theorem for free. Suppose there exists a proof system that is both consistent and complete. Then we can use our proof-searching algorithm from before to search for either a proof that $P$ halts on $I$ or a proof that it doesn't. Since our system is consistent and complete, exactly one of the two statements has a proof, so our search is guaranteed to find it. However, this would give us an algorithm for solving the Halting problem. Therefore, our complete and consistent proof system can't exist.
Example 24.2. We have already seen a few examples of variations of Halting problems which all turn out to be undecidable. We can think of halting as a semantic or behavioural property of the program. As it turns out, deciding any non-trivial semantic property of a program is undecidable. This result is called Rice's Theorem and was shown by Henry Gordon Rice in 1951.
Example 24.3. What the above means is that doing something like testing partial correctness of a Hoare triple is undecidable, since the Hoare triple is about specifying conditions that the program's behaviour must satisfy. To see that this is undecidable, consider the Hoare triple
$$\llparenthesis \textrm{true} \rrparenthesis \texttt{P(I)} \llparenthesis \textrm{false} \rrparenthesis.$$
This Hoare triple is satisfied under partial correctness if and only if P(I) does not halt. Then this means that if we could decide partial correctness, we could decide the Halting problem!
Example 24.4. As we saw with Hilbert's tenth problem, not every undecidable problem is about programs, and undecidable problems may look very innocuous. One such fundamental problem is Post's Correspondence Problem, which is stated as follows:
Given a finite list of pairs of strings $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, does there exist a finite nonempty list of indices $i_1, i_2, \dots, i_r$ such that $$x_{i_1} x_{i_2} \cdots x_{i_r} = y_{i_1} y_{i_2} \cdots y_{i_r}?$$
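For small instances, we can search for a solution by brute force; the catch is that an unbounded search halts only if a solution exists, which is exactly why this yields no decision procedure. A sketch (the length cap `max_len` is my addition so the search always terminates, and the example instance is a classic solvable one):

```python
from itertools import product

def solve_pcp(pairs, max_len=8):
    """Try every nonempty index sequence of length <= max_len; return one
    whose x- and y-concatenations agree (1-based indices), or None.
    Without the cap, the search would run forever on unsolvable instances."""
    for r in range(1, max_len + 1):
        for seq in product(range(len(pairs)), repeat=r):
            xs = "".join(pairs[i][0] for i in seq)
            ys = "".join(pairs[i][1] for i in seq)
            if xs == ys:
                return [i + 1 for i in seq]
    return None

# The sequence 3, 2, 3, 1 spells "bbaabbbaa" on both sides.
pairs = [("a", "baa"), ("ab", "aa"), ("bba", "bb")]
print(solve_pcp(pairs))  # → [3, 2, 3, 1]
```

The undecidability proof shows that no cap works in general: bounding the search length can rule solutions in but never out.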
This problem was proposed and shown to be undecidable by Emil Post in 1946 (Post, you may remember, showed that propositional logic was sound and complete in 1920). Sometimes the proof of this is shown in CS 360 and it involves showing that you can define your set of pairs $(x_i, y_i)$ in such a way that you end up simulating the computation of a Turing machine (and therefore solve the Halting problem for it).
Example 24.5. A problem with a similar flavour is the matrix mortality problem, which is stated as follows:
Given a finite set of $n \times n$ integer matrices $\{M_1, \dots, M_k\}$, is there a finite sequence of indices $i_1, i_2, \dots, i_\ell$ (repetitions allowed) such that $M_{i_1} M_{i_2} \cdots M_{i_\ell} = 0$?
This problem was shown to be undecidable by Michael Paterson in 1970 for $n = 3$. One can kind of see how you might be able to use a reduction from PCP to show that this is undecidable.
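As with PCP, small instances can be attacked by brute-force search over products; it is only the general decision problem that is impossible. A sketch with plain Python lists (the bound `max_len` and the example matrices are mine; the matrices are chosen so that a short product vanishes):

```python
from itertools import product

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

def mortal(matrices, max_len=5):
    """Search all products of up to max_len factors for the zero matrix;
    return the 1-based indices of a vanishing product, or None."""
    n = len(matrices[0])
    for length in range(1, max_len + 1):
        for seq in product(range(len(matrices)), repeat=length):
            P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
            for i in seq:
                P = mat_mul(P, matrices[i])
            if is_zero(P):
                return [i + 1 for i in seq]
    return None

M1 = [[0, 1], [0, 0]]
M2 = [[0, 0], [1, 0]]
print(mortal([M1, M2]))  # → [1, 1], since M1 is nilpotent: M1 · M1 = 0
```

Paterson's result says that for $3 \times 3$ integer matrices, no amount of cleverness can replace this search with an algorithm that always terminates with the correct answer.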