CMSC 28000 — Lecture 21

Universality

It seems like Turing machines are a very powerful model. Having arbitrary access to unlimited amounts of memory seems to allow us to compute relatively complicated tasks, even if it takes a bit of work to define. And the simplest model seems to be able to simulate surprisingly complex modifications to the model.

So just how powerful are Turing machines? We will be seeing some very surprising answers to that question. However, we should begin with some observations.

First, recall at the very beginning of the quarter, we made a very critical assumption. This is the fact that every “thing” we would like to “do computation on” must have a string encoding. We will introduce some notation for this idea.

Let $A$ be an object. We denote by $\llbracket A \rrbracket$ the encoding of the object $A$ as a string.

Next, we observe that anything that we can describe as a string should have a reasonable encoding. This includes Turing machines—recall that a Turing machine $M$ is a 7-tuple $(Q, \Sigma, \Gamma, \delta, q_0, q_{\mathsf{acc}}, q_{\mathsf{rej}})$. This means that to encode a Turing machine, we just need

Since states and tape symbols can be labelled by integer and a transition is just a tuple of these things, it’s not hard to see how we can come up with a reasonable encoding. Kozen Chapter 31 walks through this in a bit more detail but it is not particularly earth-shattering stuff.

But the existence of an encoding for Turing machines means that we can necessarily compute and process encodings of Turing machines. This suggests that we could define a language concerning Turing machines.

The membership problem for Turing machines is “Given a Turing machine $M$ and input string $w$, does $M$ accept $w$?” and is defined as the language \[\mathsf{A_{TM}} = \{ \llbracket M,w \rrbracket \mid \text{$M$ is a Turing machine and $M$ accepts $w$} \}\]

Now that we’ve defined the language, the natural follow-up question is whether there exists a Turing machine that recognizes it. We will see that this is the case.

$\mathsf{A_{TM}}$ is Turing-recognizable.

This is not completely obvious. In fact, it’s quite a profound idea. What this says is that there is a Turing machine out there that can take an encoding of a Turing machine and some input, and simulate the encoded Turing machine on the desired input word. Such a Turing machine is called a universal Turing machine.

To prove that $\mathsf{A_{TM}}$ is recognizable, we’ll describe how a universal Turing machine could work.

We will describe a universal Turing machine $U$. On input $\llbracket M,w \rrbracket$, where $M$ is a Turing machine and $w$ is a string, $U$ operates as follows:

  1. Simulate $M$ on input $w$.
  2. If $M$ enters the accept state, accept; if $M$ ever enters the reject state, reject.

This is essentially an algorithm and we can read it as such.

        \begin{algorithmic}
        \PROCEDURE{tm-accept}{$\llbracket M, w \rrbracket$}
            \STATE result $ \gets $ \CALL{M}{$w$}
            \IF{result $ = $ \textsf{Accept}}
                \RETURN{\textsf{Accept}}
            \ELSE
                \RETURN{\textsf{Reject}}
            \ENDIF
        \ENDPROCEDURE
        \end{algorithmic}
    

Note that because $U$ simulates another Turing machine, $U$ will loop forever if $M$ never enters the acecpt or reject state. Of course, most of the work is done in the “Simulate $M$” step, which on its own is a rather unsatisfying description of a very important concept. Let’s take a closer look at the implementation.

Our universal Turing machine sets up its tape in the following way, for a Turing machine $M$ and input string $w = a_1 a_2 \cdots a_n$

\[ \llbracket M, w \rrbracket \# \llbracket q_0 \rrbracket \# \llbracket \dot a_1 a_2 \cdots a_n \rrbracket \] Our tape is partitioned into the following sections:

In our example, we denote the location of the tape head with the symbol $\dot a$. This means for every symbol in the tape alphabet of $M$, $a \in \Gamma$, we add a symbol $\dot a$ to use as the tape head indicator.

The operation of the machine applying one transition is as follows:

  1. Scan the current state
  2. Scan the tape symbol “under” the virtual tape head
  3. Scan the encoding of the Turing machine looking for the corresponding transition
  4. Update the state
  5. Write to the tape
  6. Update the location of the virtual tape head

One can take this tape parititioning setup and generalize it to show how to simulate any Turing machine with $k$ tapes with independent tape heads. Just partition any single tape into $k$ segments and use indicators to identify the location of the tape head for each “virtual tape”.

What this says is that it’s possible to simulate the computation of a Turing machine. We can then confirm whether a string is accepted by that Turing machine—after all, we just run it. But what if the string isn’t accepted by the machine? We have two possibilities:

In essence, what this says is that in order to decide this language, we need to do it in a more clever way than just running the machine. Is this possible? We’ll see that it is not.

But first, there are a few interesting observations we can make.

The Church–Turing thesis

We’ve studied a number of problems where we’re essentially simulating other models of computation, but the idea of a universal Turing machine acts very much like our real general-purpose computers do. And each of our computers has various differences, but via encodings and the power of being able to decode these encodings allows us to simply write programs and have them work on every computer—in other words, software.

So there’s a very clear analogy here to writing programs. Just as we can treat the programs we write as text files, we can treat the Turing machines as strings. But there is another analogy that many of you will be familiar with if you’ve taken a functional programming class: functions as data. In functional programming, we are used to the idea of treating functions as essentially another kind of value that can be passed around. The idea behind encoding Turing machines as strings is no different.

In fact, it’s this idea that ties together all of the formalisms of the 1930s that try to nail down a definition of effective computability:

These formalisms look very different. One of the issues back in the 1930s was that it wasn’t entirely clear that these were all really different ways of doing the same thing. We’ve seen hints of this already: there’s no real reason we should’ve expected, say, certain automata and grammars to be equivalent, but it turned out they are.

But there are some common themes:

What ended up tying all of these things together was the Turing machine. Recall that what Turing attempted to do was formalize how human computers might compute a problem. We can view the finite state system as a set of instructions we might give to a person, that tells them what to do when they read a particular symbol. We allow this person to read and write on some paper and we let them have as much paper as they want.

This gives us a fairly concrete notion of computation compared to things having to do with recursive functions. The clarity of the model and how it works makes it easy to prove that these other systems were equivalent to Turing machines. This led to the observation that’s now called the Church–Turing thesis.

Nowadays, the Church–Turing thesis is often paraphrased in the context of real computers and the notion of Turing-completeness of programming languages and other systems. However, the Church–Turing thesis is really a statement about the equivalence of those formalisms from the 30s. Since Turing machines are now accepted as the standard of computability, Turing-completeness is just the property of a particular system to be able to simulate a Turing machine, and implies that this system is capable of “real” computation.

Most people are introduced to the notion of Turing-completeness when thinking about programming languages and languages similar to programming languages. For instance, it’s not surprising that your typical programming language is Turing-complete. Where things get more interesting is when dealing with those languages that we don’t typically think of as “programming” languages, like $\TeX$ or C++ template metaprogramming. Of course, this has important theoretical uses, like showing that theoretical molecular or quantum models of computation can at least achieve the standard notion of computability. This also has less important theoretical uses, like showing that Magic: The Gathering or PowerPoint transitions also have the full power of computability.

Enumeration

Now, we’ll begin to work towards showing that $\mathsf{A_{TM}}$ is undecidable. The crux of the idea has to do with the number of languages that exist compared to the number of Turing machines there are.

This will take us back to some basic set theory. Specifically, we’ll be dealing with the size of infnite sets. In the late 1800s, Georg Cantor considered the question of sizes of infinite sets. For instance, if we had the set of natural numbers and the set of all even numbers, what can we say about their sizes?

Recall the following definitions about functions.

Let $A$ and $B$ be two sets and consider a function $f:A \to B$.

One of the uses of these functions is to give us a formal basis for the idea of counting and “sizes” of sets.

A set $A$ is said to be countable if either it is finite or has the same cardinality as $\mathbb N$.

At first this seems unremarkable, but we find that there are some sets that seem like they should be “bigger” than the natural numbers but turn out to have the same cardinality. For example, the even natural numbers, the integers, the cartesian product of integers, and the rational numbers are all examples of sets that are countable.

In our case, we are particularly interested about strings.

For every alphabet $\Sigma$, the set of strings $\Sigma^*$ is countable.

The idea for this is not difficult: just take $f(n) = w$, where $w$ is the $n$th string in lexicographic order, for some order on $\Sigma$.

Notice that by setting up this correspondence with the natural numbers, we have defined an enumeration of the set. If $L$ is countable, then there is a bijection $f: \mathbb N \to L$. This means that we can list all the elements of $L$ by considering them in “order”: $f(0), f(1), f(2), \dots$. It turns out the property of being “enumerable” is important for a language—in particular, the recursively enumerable languages.

An enumeration machine $E$ for a language $L$ is a Turing machine equipped with an additional write-only output tape such that $E$ never moves left on the output tape and when run on a blank input/work tape, will write a string of the form $\#x_1\#x_2\#x_3\# \cdots $ on the output tape, where each $x_i \in L$ and every element of $L$ will appear on the output tape exactly once.

Notice that this gives a natural bijection between $L$ and $\mathbb N$. We can use enumerators to show the following.

A language $L$ is recursively enumerable if and only if there exists an enumeration machine $E$ that enumerates $L$.

Suppose we have an enumeration machine $E$ that enumerates $L$. We can use $E$ to construct a Turing machine $M$ that recognizes $L$ by doing the following:

            \begin{algorithmic}
            \PROCEDURE{M}{$w$}
                \FOR{each $x_i$ written by $E$}
                    \IF{$w = x_i$}
                        \RETURN{\textsf{Accept}}
                    \ENDIF
                \ENDFOR
            \ENDPROCEDURE
            \end{algorithmic}
        

Essentially, what we do is use $E$ to enumerate a list of every string in $L$. If $w$ really is in $L$, then it will eventually show up in the list produced by $E$. If it isn’t, then our machine runs forever. Since $M$ only needs to recognize $L$, guaranteeing acceptance is all that’s necessary.

Now, suppose that we have a Turing machine $M$ that recognizes $L$. We want to build an enumerator $E$ that enumerates $L$. One idea is to make use of the countability of $\Sigma^*$. We can go through each string in $\Sigma^*$, say in lexicographic order, and test whether $M$ accepts it.

Unfortunately, there is a problem with this. Since $M$ only recognizes $L$, it may be the case that $M$ runs forever on some string. In this case, our algorithm would get “stuck”. So we need a way to avoid this.

An alternative idea is to try to “parallelize” this strategy by running $M$ for one step on each string, then two steps, and so on. Again, we have an issue: if we try to run $M$ one step on every string, we are going to be waiting a long time to get through every string.

We can break out of this logjam by trying to do both at once: for each $n$, test all strings of length up to $n$ for $n$ steps. This way, both the set of strings being tested and the number of steps being run is finite and guaranteed to stop.

This gives us our machine:

            \begin{algorithmic}
            \PROCEDURE{E}{}
                \STATE write $\#$ to output
                \FOR{$n = 0, 1, 2, \dots$}
                    \FOR{each $w \in \Sigma^{\leq n}$}
                        \STATE run $M$ on $w$ for $n$ steps
                        \IF{$M(w) = $ \textsf{Accept} and $w$ has not already been written}
                            \STATE write $w\#$ to output
                        \ENDIF
                    \ENDFOR
                \ENDFOR
            \ENDPROCEDURE
            \end{algorithmic}
        

This way, a string of length $\ell$ and is accepted in $k$ steps will be encountered in the loop when $n = \max\{k,\ell\}$.

The two sides of this proof give us an idea of what characterizes recognizable languages: existential search.