CS 360 — Lecture 17

Undecidable problems

We'll be looking at some examples of undecidable languages. To show that these languages are undecidable, we will be using an idea that is fairly central to theoretical computer science: reducibility. Essentially, this is the idea that you can be clever about using a solution to some other problem you've already solved. For instance, it is fairly labourious to show that a problem is undecidable by setting up a diagonalization argument every time. Instead, we want to be able to make use of results that we've already proved, which in this case would involve reducing a problem to some other problem that we know is already undecidable.

To start, we'll consider a problem that you've already heard about: Turing's halting problem. The halting problem is defined $$ HALT_{TM} = \{ \langle M,w \rangle \mid \text{$M$ is a TM and $M$ halts on input $w$} \}.$$

Theorem. $HALT_{TM}$ is undecidable.

Proof. Assume that there exists a Turing machine $H$ that decides $HALT_{TM}$. Then we can construct the following Turing machine to decide $A_{TM}$:

On input $\langle M,w \rangle$, where $M$ is a TM and $w$ is a string, run TM $H$ on input $\langle M,w \rangle$.
If $H$ rejects, reject.
If $H$ accepts, simulate $M$ on $w$ until it halts.
If $M$ accepts, then accept; if $M$ rejects, then reject.

So if $H$ decides $HALT_{TM}$, then we can decide $A_{TM}$ with the preceding TM. However, we know that $A_{TM}$ is undecidable, so this is a contradiction and $HALT_{TM}$ must be undecidable. $$\tag*{$\Box$}$$

As it turns out, Turing's original proof of (what we now call) the halting problem is shown directly via a diagonal argument like the proof of the undecidability of $A_{TM}$. It's not too hard to see how we might be able to modify the proof of $A_{TM}$ to work for the halting problem.

Now, we can go and ask the same questions of TMs as we've been asking about DFAs and CFGs. As you'd probably expect, they are all undecidable. Let's consider the following language $$ E_{TM} = \{ \langle M \rangle \mid \text{$M$ is a TM and $L(M) = \emptyset$} \}. $$

Theorem. $E_{TM}$ is undecidable.

Proof. First, we construct a machine $M_w$ that operates in the following way:

On input $x$, if $x \neq w$, reject.
If $x = w$, run $M$ on input $w$ and accept if $M$ accepts.

The machine $M_w$ rejects every word that is not $w$. Then the only word that can be accepted is $w$. If $M_w$ doesn't accept $w$, then $L(M_w) = \emptyset$. Thus, $M_w$ recognizes the empty language if and only if it doesn't accept $w$.

Now, we assume that there exists a TM $R$ that decides $E_{TM}$ and construct a TM $S$ that decides $A_{TM}$ as follows:

On input $\langle M,w \rangle$, where $M$ is an encoding of a TM and $w$ is an input string, construct $M_w$ as described.
Run $R$ on $\langle M_w \rangle$
If $R$ accepts, reject; if $R$ rejects, accept.

Thus, if $R$ decides $E_{TM}$, then $S$ decides $A_{TM}$. However, we know that $A_{TM}$ is undecidable, so $R$ cannot exist and $E_{TM}$ is undecidable. $$\tag*{$\Box$}$$

Here's another kind of question we can ask of Turing machines, since they are so powerful. We can ask whether or not the language of a TM can be recognized by a less powerful model of computation, like a DFA. Such a language is defined $$ REG_{TM} = \{ \langle M \rangle \mid \text{$M$ is a TM and $L(M)$ is a regular language} \}. $$ Or as it turns out, we can't ask that question because it is undecidable.

Theorem. $REG_{TM}$ is undecidable.

Proof. Let $R$ be a TM that decides $REG_{TM}$. Construct a TM $S$ that does the following:

On input $\langle M,w \rangle$, where $M$ is a TM and $w$ is a string, construct the TM $M'$:
1. On input $x$, if $x$ has the form $0^n 1^n$, accept.
2. If $x$ doesn't have this form, run $M$ on $w$ and accept if $M$ accepts $w$.
Run $R$ on $\langle M' \rangle$.
If $R$ accepts, accept; if $R$ rejects, reject.

How does this work? The TM $M'$ accepts all strings in $\{0^n 1^n \mid n \geq 0\}$, which we know is context-free. If $M$ doesn't accept $w$, then $L(M') = \{0^n 1^n \mid n \geq 0\}$. If $M$ accepts $w$, then $M'$ also accepts every other string and in this case $L(M') = \Sigma^*$, which is regular. In this way, if there's a way to figure out whether $M'$ recognizes a regular language, then we can come up with a way to decide $A_{TM}$. $$\tag*{$\Box$}$$

Finally, let's consider the problem of equality between two TMs, specified as $$ EQ_{TM} = \{ \langle M_1, M_2 \rangle \mid \text{$M_1,M_2$ are TMs and $L(M_1) = L(M_2)$} \}. $$

Theorem. $EQ_{TM}$ is undecidable.

Proof. This time, we reduce from $E_{TM}$. Suppose we have a TM $R$ that decides $EQ_{TM}$. Then we can construct the following TM to decide $E_{TM}$.

On input $\langle M \rangle$, where $M$ is a TM, run $R$ on $\langle M, N \rangle$, where $N$ is a TM that rejects all inputs.
If $R$ accepts, accept; if $R$ rejects, reject.

Here, $N$ rejects all input so $L(N) = \emptyset$. If we test whether $M$ and $N$ accept the same language, then we're basically testing whether $L(M) = \emptyset$ and now we have a way to decide $E_{TM}$. $$\tag*{$\Box$}$$

Equality for CFGs is undecidable

Now, we return to the problem of showing that $EQ_{CFG}$ is undecidable.

Let $M$ be a Turing machine and $w$ and input string. An accepting computation path for $M$ on $w$ is a sequence of configurations $C_1, \dots, C_\ell$, where $C_1$ is the start configuration of $M$ on $w$ and $C_\ell$ is an accepting configuration of $M$ on $w$ and each $C_i$ follows directly from $C_{i-1}$ (that is, $C_{i-1} \vdash C_i$) as defined by the transition function of $M$. We define a rejecting computation path similarly.

First, we show that a related problem, universality for CFGs, is undecidable: $$ALL_{CFG} = \{\langle G \rangle \mid \text{$G$ is a CFG and $L(G) = \Sigma^*$} \}$$

Theorem. $ALL_{CFG}$ is undecidable.

Proof. We will assume that $ALL_{CFG}$ is decidable and show that doing so will allow us to decide $A_{TM}$. We will construct a CFG $G$ that accepts all strings if and only if $M$ does not accept $w$. In particular, if $M$ accepts $w$, $G$ will not generate the accepting computation path for $M$ on $w$.

What does an invalid accepting computation path look like? A computation path $C = C_1, \dots, C_\ell$ is given as a string $\langle C \rangle = \# C_1 \# \cdots \# C_\ell \#$. Then we check for one of the following:

$C_1$ isn't a starting configuration,
$C_\ell$ isn't an accepting configuration,
Some configuration $C_i$ doesn't yield $C_{i+1}$ according to $M$.

The idea is that if $M$ doesn't accept $w$, then there is no accepting computation path for $w$ and every string is an invalid accepting computation path.

We can construct a PDA $D$ to check this for us that operates as follows: on an input word $x$, we nondeterministically check each of the three conditions. If any of the branches is satified, we can accept. Also note that implicit in this is that we also accept all strings that do not conform to the "configuration path" format, since those are clearly not valid computation paths at all.

Checking $C_1$ and $C_\ell$ is straightforward, so we will go through checking $C_i$ in more detail. In this branch, we nondeterministically check each $C_i$, which begins at a $\#$. We then read $C_i$ push each symbol onto the stack until we reach the next configuration denoted by $\#$. We then pop $C_i$ from the stack to compare with $C_{i+1}$.

This works conceptually, but there is a slight technical problem: pushing $C_i$ onto the stack means that it is popped in reverse order. This is easy to fix: simply write every other configuration in reverse order.

Thus, $D$ recognizes all of the strings that are not valid accepting computation paths of $M$ on $w$. We can convert $D$ into a grammar $G'$ and use a TM that decides $ALL_{CFG}$ to decide if $G'$ is universal or not. As detailed at the beginning of the proof, $G'$ is universal if and only $M$ does not accept $w$. Thus, $ALL_{CFG}$ is undecidable. $$\tag*{$\Box$}$$

Finally, we can show that $EQ_{CFG}$ is undecidable.

Theorem. $EQ_{CFG}$ is undecidable.

Proof. If we have a TM $R$ that decides $EQ_{CFG}$, then we can decide $ALL_{CFG}$ with the following machine:

On input $\langle G \rangle$, where $G$ is a CFG, run $R$ on input $\langle G,G' \rangle$, where $L(G') = \Sigma^*$.
If $R$ accepts, accept; if $R$ rejects, reject.

But $ALL_{CFG}$ is undecdiable, so $EQ_{CFG}$ must also be undecidable. $$\tag*{$\Box$}$$

This technique of defining CFGs to generate Turing machine computations (or not Turing machine computations, as it turns out) was shown by Hartmanis in 1967. He shows that several other decision problems regarding CFGs are undecidable using the same approach.