Here’s another kind of question we can ask of Turing machines. Since they are so powerful, we may want to ask whether or not the language of a TM can be recognized by a less powerful model of computation, like a DFA.
The regularity problem for Turing machines is “Given a Turing machine $M$, is the language accepted by $M$ regular?” and is defined \[ \mathsf{REG}_{\mathsf{TM}} = \{ \llbracket M \rrbracket \mid \text{$M$ is a TM and $L(M)$ is a regular language} \}.\]
If we are able to tell whether a Turing machine accepts a language that is actually regular, then maybe we could come up with a simpler or more efficient way to solve this problem. As it turns out, this is not decidable either.
$\mathsf{REG}_{\mathsf{TM}}$ is undecidable.
Let $R$ be a TM that decides $\mathsf{REG}_{\mathsf{TM}}$. Construct a TM $S$ to decide $\mathsf{A_{TM}}$ by doing the following:
What exactly $M'$ does on input $x$ can be a bit easier to trace as pseudocode.
\begin{algorithmic}
\PROCEDURE{tm-accept}{$\llbracket M,w \rrbracket$}
\PROCEDURE{M-w-hardcoded}{$x$}
\IF{$x = x^R$}
\RETURN{\textsf{Accept}}
\ELSE
\STATE result $ \gets $ \CALL{M}{$w$}
\IF{result $ = $ \textsf{Accept}}
\RETURN{\textsf{Accept}}
\ELSE
\RETURN{\textsf{Reject}}
\ENDIF
\ENDIF
\ENDPROCEDURE
\STATE result2 $ \gets $ \CALL{tm-regular}{$\llbracket$ \CALL{M-w-hardcoded}{}$\rrbracket$}
\IF{result $ = $ \textsf{Accept}}
\RETURN{\textsf{Accept}}
\ELSE
\RETURN{\textsf{Reject}}
\ENDIF
\ENDPROCEDURE
\end{algorithmic}
How does this work? The TM $M'$ always accepts all strings in $\{x \in \Sigma^* \mid x = x^R\}$, the set of palindromes, which we know is context-free. Then for all other input strings, acceptance depends on whether $M$ accepts $w$. As a result, if $M$ doesn't accept $w$, then $L(M') = \{x \in \Sigma^* \mid x = x^R\}$ and if $M$ accepts $w$, then $M'$ accepts every other string and in this case $L(M') = \Sigma^*$, which is regular. In other words, the language of $M'$ is \[L(M') = \begin{cases} \Sigma^* & \text{if $M$ accepts $w$}, \\ \{x \in \Sigma^* \mid x = x^R \} & \text{if $M$ does not accept $w$}. \end{cases}\] In this way, if there’s a way to figure out whether $M'$ recognizes a regular language, then we can come up with a way to decide $\mathsf{A_{TM}}$.
Something that can be confusing is the choice to accept strings that satisfy $x = x^R$. What do these strings have to do with $M$ or $w$? The answer is that they have nothing to do with $M$ or $w$! The choice of palindromes is totally arbitrary. We could have chosen to accept, say strings of the form $0^n1^n0^n$ instead.
The point of this construction is to create a situation where the language accepted by the machine $M'$ has one of two possiblities: either it accepts everything (when $M$ accepts $w$) or it only accepts some things (when $M$ doesn’t accept $w$). Here, we have chosen the “some” things to be a non-regular set of strings. This means that if we have a way of checking whether the language of $M'$ is regular or not, then we have a way to solve the acceptance problem.
At this point, we begin to see a possible pattern emerge: if we want to ask about whether a Turing machine accepts this or that set of strings, we can come up with an easy way to test for membership: simply construct a Turing machine whose language collapses to one language or another, depending on whether the given machine accepts the given string.
Perhaps we can generalize this. After all, it seems like every question we ask about Turing machines is undecidable. As it turns out, this is not just an educated hunch, but a theorem. First, we’ll try to formalize the notion of a “property”.
A property $P$ of a Turing machine is a semantic property if it depends on the language recognized by $M$ and not the syntactic structure of $M$. In other words, if $L(M_1) = L(M_2)$, then $M_1$ and $M_2$ have the same semantic properties.
A property $P$ can be expressed as a language consisting of exactly the encodings $\llbracket M \rrbracket$, where $M$ has property $P$. A semantic property $P$ is said to be non-trivial if there exists a TM $M_1$ such that $\llbracket M_1 \rrbracket \in P$ and a TM $M_2$ such that $\llbracket M_2 \rrbracket \not\in P$.
Then we have the following theorem.
All non-trivial semantic properties of Turing machines are undecidable.
In other words, for any given Turing machine $M$, we can’t decide any properties about $L(M)$ except for properties that are true for either exactly all or exactly none of the languages recognized by Turing machines. The proof is surprisingly simple, since it follows the basic template that we’ve been using so far.
To show this, we let $P$ be a property (that is, a language of encodings of Turing machines that have the property) and assume that it is decidable. Let $R_P$ be a Turing machine that decides $P$. We will reduce TM membership to property testing.
Let $T_\emptyset$ be a Turing machine that rejects every string. So $L(T_\emptyset) = \emptyset$. Without loss of generality, we assume that $\llbracket T_\emptyset \rrbracket \not\in P$. Otherwise, we can just use $\overline P$ instead. Since $P$ is non-trivial, there exists a Turing machine $T$ with $\llbracket T \rrbracket \in P$.
We construct a Turing machine that decides $\mathsf{A_{TM}}$ by being able to distinguish between $T_\emptyset$ and $T$.
When viewed as pseudocode, we have
\begin{algorithmic}
\PROCEDURE{tm-accept}{$\llbracket M,w \rrbracket$}
\PROCEDURE{M-w-hardcoded}{$x$}
\STATE $T \gets $ hardcoded Turing machine with property $P$
\STATE result $ \gets $ \CALL{M}{$w$}
\IF{result $ = $ \textsf{Reject}}
\RETURN{\textsf{Reject}}
\ELSE
\STATE result $ \gets $ \CALL{T}{$x$}
\IF{result $ = $ \textsf{Accept}}
\RETURN{\textsf{Accept}}
\ELSE
\RETURN{\textsf{Reject}}
\ENDIF
\ENDIF
\ENDPROCEDURE
\STATE result2 $ \gets $ \CALL{test-property-P}{$\llbracket$ \CALL{M-w-hardcoded}{}$\rrbracket$}
\IF{result $ = $ \textsf{Accept}}
\RETURN{\textsf{Accept}}
\ELSE
\RETURN{\textsf{Reject}}
\ENDIF
\ENDPROCEDURE
\end{algorithmic}
Notice that our machine tests $M$ on $w$. If $M$ rejects $w$, $M_w$ will reject every string and if $M$ doesn’t halt, neither will $M_w$—in either case, the language of $M_w$ is empty. If $M$ does accept $w$, $M_w$ will proceed to simulate $T$. This gives us \[L(M_w) = \begin{cases} L(T) & \text{if $M$ accepts $w$}, \\ \emptyset & \text{otherwise}. \end{cases}\] When given $\llbracket M_w \rrbracket$, $R_P$ will test it for property $P$ and in doing so, reveal what the language of $M_w$ is. Since this depends on whether $M$ accepts $w$, $\llbracket M_w \rrbracket \in P$ iff $M$ accepts $w$.
Rice’s theorem was proved by Rice in 1951 (the result appeared in a journal later in 1953). It is interesting to note that a similar result was discussed by Turing in his original 1936 paper about undecidability.
In essence, what Rice’s theorem says is that if we want to know something about the result of a program, there’s no silver bullet to figure this out just by reading the source code.
We’ve seen that we can ask questions about Turing machines, like membership, emptiness, and equality. We can also ask these same questions about less powerful models, like DFAs and CFGs and get corresponding decision problems for them. Let’s consider such problems.
While we’ll see that a lot of problems are decidable, there are some very surprising examples of problems which are undecidable. Many decidability properties that we’ll be discussing were shown by Rabin and Scott in 1959 and Bar-Hillel, Perles, and Shamir in 1961.
Here are the problems we’re interested in:
Here, “language device” just means some representation of the language. Remember that a language is a mathematical object that exists abstractly—it’s a set of strings. In order to compute something about a language, we need a representation. Such a representaiton needs to be finite, so unless our language is finite, we can’t just list all the strings in it.
Representations of languages are the objects that we’ve been working with all quarter: automata, regular expressions, grammars, and so on. So for every device (deterministic finite automata, nondeterministic finite automata, regular expressions, context-free grammars, pushdown automata, Turing machines, and so on), we can ask each of the above questions.
And as we’ve seen in other computer science classes, how we get an answer for each question will depend on the representation, even if the two representations are “equivalent”. This is why the questions are framed in terms of representation rather than language class—the thing we’re doing computation on is the representation.
The simplest problem we can ask about a language device and a word is whether the word belongs in the language. Again, this is the language membership problem: Given a device $A$ and a word $w$, is $w \in L(A)$?
We define the following language for the DFA acceptance problem: $$ \mathsf{A_{DFA}} = \{\llbracket B, w \rrbracket \mid \text{$B$ is a DFA that accepts $w$}\}.$$
$\mathsf{A_{DFA}}$ is a decidable language.
We can do the same thing we did for $\mathsf{A_{TM}}$: simulate the DFA. Recall that such a machine sets up three tapes:
In fact, this is even easier since we only need to keep track of how far we’ve read on the second tape.
We can ask the same question for NFAs: $$ \mathsf{A_{NFA}} = \{\llbracket B, w \rrbracket \mid \text{$B$ is an NFA that accepts $w$}\}.$$
$\mathsf{A_{NFA}}$ is a decidable language.
There are a few ways to approach this. One can consider simulating the NFA directly and it’s not hard to come up with a scheme to do this. But we can also transform our NFA into a DFA and just run the Turing machine for $\mathsf{A_{DFA}}$ on it.
Now, we’ll turn our attention to context-free languages. Again, from our Turing machine simulation, we can very easily get a Turing machine for \[\mathsf{A_{PDA}} = \{\llbracket P, w \rrbracket \mid \text{$P$ is a PDA, $w \in L(P)$}\}.\]
$\mathsf{A_{PDA}}$ is a decidable language.
What about grammars? Consider the following language. $$ \mathsf{A_{CFG}} = \{\llbracket G, w \rrbracket \mid \text{$G$ is a CFG that generates $w$}\}.$$
$\mathsf{A_{CFG}}$ is a decidable language.
This is also decidable; we even gave an efficient algorithm for it (CYK)! So we can pretend that we have a CYK Turing machine.