CISC 462 — Lecture 6

Undecidability

So we saw some examples of decidable languages. Now, we'll tackle the notion of undecidability. The notion of algorithmically unsolvable problem poses a problem in practical terms (and philosophically as well). To demonstrate this, we show that the following language is undecidable. $$A_{TM} = \{ \langle M,w \rangle \mid \text{$M$ is a Turing machine and $M$ accepts $w$} \}$$

Theorem. $A_{TM}$ is undecidable.

It's important to note here that while $A_{TM}$ is not decidable, it is recognizable. The first consequence of this fact is that the recognizable languages are a larger and more powerful set of languages than the decidable languages. Secondly, this means we can construct a Turing machine that simulates a given Turing machine.

We can define a universal Turing machine $U$. We've played around with this idea before. The idea is that we can encode a Turing machine as a string via some appropriate encoding. It is worth pausing a bit to consider the philosophical implications of what it is we're doing. In particular, we're making a number of assumptions that we sort of take for granted as computer scientists.

The first is that we can encode Turing machines as strings. Of course, we've encountered the notion of encoding objects as strings before. However, in this case, we can think of this as a sort of formal way of doing what we've been doing all along: writing programs. And just as we can treat programs we write as text files, we treat the Turing machines as strings.

And what do we do with programs that we treat as text? We usually run them through a compiler that transforms our human-readable code into machine code, which is then run on a computer. There are tons of details in the process which we sort of abstract nowadays, but this is not unlike the notion of some Turing machine taking as input the encoding of another Turing machine and transforming it into something that can be simulated.

The final thing to note here is the idea of the universal Turing machine. We've studied a number of problems where we're essentially simulating other models of computation, but the idea of a universal Turing machine acts very much like our real electronic computers do. And each of our computers has various differences, but via encodings and the power of being able to decode these encodings allows us to simply write programs and have them work on every computer.

But now, we'll move on to describing a universal Turing machine. On input $\langle M,w \rangle$, where $M$ is a Turing machine and $w$ is a string, $U$ operates as follows:

  1. Simulate $M$ on input $w$.
  2. If $M$ enters the accept state, accept; if $M$ ever enters the reject state, reject.

Note that because $U$ simulates another Turing machine, $U$ will loop forever if $M$ never enters the acecpt or reject state. Of course, this is a rather unsatisfying description of a very important concept, so let's take a closer look at the implementation.

Our universal Turing machine uses three tapes:

  1. A tape containing the description of the Turing machine $M$ to be simulated.
  2. A simulation of the work tape of $M$.
  3. The current state of $M$.

From this, it should not be difficult to see how the machine works. The simulation of the work tape is straightforward, so it remains to see how the simulation of the finite state portion of $M$ is simulated. This is taken care of by scanning the description of $M$ in the first tape. Since the current state is kept track of on the third tape, the machine only needs to scan the transitions in the description of $M$ on the first tape to determine which transition to take.

Diagonalization

Now, we move on to the technique that we'll use to show that $A_{TM}$ is undecidable. The crux of the idea has to do with the number of languages that exist compared to the number of Turing machines there are.

This will take us back to some mathematics in the late 19th century involving the sizes of sets. Specifically, we'll be dealing with the size of infnite sets. Georg Cantor was interested in the question of how to compare the sizes of two infinite sets. For instance, if we had the set of natural numbers and the set of all even numbers, what can we say about their sizes? Cantor's solution to this was to show that two sets have the same size if you could pair every element of one set with the other.

Let $A$ and $B$ be two sets and consider a function $f:A \to B$. We say $f$ is one-to-one or injective if $f(a) \neq f(b)$ whenever $a \neq b$. We say $f$ is onto or surjective if for every $b \in B$, there exists $a \in A$ such that $f(a) = b$. A function that is both one-to-one and onto is a correspondence or bijection.

We say a set $A$ is countable if either it is finite or has the same size as $\mathbb N$.

Going back to our question of the even numbers, it turns out the size of $\mathbb N$ and $2 \mathbb N$ are the same size, since $f(n) = 2n$ is a bijection from $\mathbb N$ to $2 \mathbb N$. Furthermore, this shows us that the set of even numbers is countable.

Of course, there are more interesting sets that we can consider. For instance, the set of rational numbers $$ \mathbb Q = \left\{\frac m n \mid m,n \in \mathbb N\right\}$$ is also countable. In other words, we can show that there is a correspondence between $\mathbb N$ and $\mathbb Q$.

Theorem. $\mathbb Q$ is countable.

Proof. What we want to do to show that $\mathbb Q$ is countable is to give a way to enumerate the elements of $\mathbb Q$. First, observe that we can list all of the elements of $\mathbb Q$ by constructing a table, where $\frac i j$ is in the $i$th row and the $j$th column:

$\frac 1 1$ $\frac 1 2$ $\frac 1 3$ $\frac 1 4$ $\frac 1 5$
$\frac 2 1$ $\frac 2 2$ $\frac 2 3$ $\frac 2 4$ $\frac 2 5$
$\frac 3 1$ $\frac 3 2$ $\frac 3 3$ $\frac 3 4$ $\frac 3 5$ $\cdots$
$\frac 4 1$ $\frac 4 2$ $\frac 4 3$ $\frac 4 4$ $\frac 4 5$
$\frac 5 1$ $\frac 5 2$ $\frac 5 3$ $\frac 5 4$ $\frac 5 5$
$\vdots$

However, we have to be a bit careful in how we enumerate the elements. We can't just go row by row, since each row is infinitely long and we won't ever reach the end of a row and go on to the next. What we do instead is enumerate along each diagonal from the top left-hand corner. We must also be careful to skip any elements that have already been accounted for, such as $\frac 2 2, \frac 3 3, \dots$. Then we will encounter every element of $\mathbb Q$ by following this process. $$\tag*{$\Box$}$$

Now after seeing this, we might expect that we can do this with any infinite set if we try hard enough to come up with a correspondence. As it turns out, this is not the case and there are infinite sets that are larger than $\mathbb N$. Such a set is uncountable.

Theorem. $\mathbb R$ is uncountable.

Proof. We will show this by contradiction. We assume that we have some appropriate bijection $f:\mathbb N \to \mathbb R$. Then, we begin listing all of the numbers.

$n$$f(n)$
10.00010000$\dots$
21.89327412$\dots$
312.12971290$\dots$
49.39102717$\dots$
53.19023721$\dots$
60.18283911$\dots$
$\vdots$

We claim that we can find a real number that cannot be in this correspondence. To do this, we define our number by setting its $i$th decimal place to be a number different from the $i$th decimal place of $f(i)$. So according to our example, our number can begin as 1.234168$\dots$. But according to this definition, our number can't be on the list that we've enumerated. Suppose that our number corresponds to some $t \in \mathbb N$. According to our definition, the $t$th digit of our number must differ from the $t$th digit of $f(t)$, which is a contradiction. Thus, our number is not enumerated and doesn't exist in our mapping and thus, the mapping is not a bijection. $$\tag*{$\Box$}$$

So how many real numbers are there? By convention, the cardinality of the natural numbers is defined to be $\aleph_0$. It turns out that Cantor also showed that the size of the set of real numbers is $2^{\aleph_0}$, or the number of subsets of natural numbers. This fact is quite interesting on its own, but it leads to another question: are there any sets $S$ such that $\aleph_0 < |S| < 2^{\aleph_0}$? The continuum hypothesis says that there isn't. What is interesting about continuum hypothesis is that it's an example of a result that is independent from Zermelo-Fraenkel set theory. ZF is one of the main axiomatic set theory systems that forms the basis for mathematics and is widely believed to be consistent. Recall that Gödel showed that a sufficiently complex system is either incomplete or inconsistent. The continuum hypothesis is an example of one of those statements that can neither be proved nor disproved in ZF.