CISC 462 — Lecture 4

Regular languages

We'll be looking at examples of decidable languages and how to describe Turing machines that decide them. In particular, we'll be drawing from problems in formal language theory. There are a few reasons for this. The first is that since everyone should have taken CISC 223, everyone should be relatively familiar with the concepts of DFAs and PDAs and such. Another reason is that while we'll see that a lot of problems are decidable, formal language theory has some very stark examples of problems which are unsolvable.

First, we begin with problems dealing with regular languages. Remember that, informally, a regular language is a language that can be recognized by a deterministic finite automaton and can be described by a regular expression. If you would like a more detailed refresher on regular languages, you can refer to Chapter 1 of the textbook.

Membership

The simplest problem we can ask about a DFA and a word is whether the DFA will accept the word. This problem is obviously decidable. We define the following language for the DFA acceptance problem: $$ A_{DFA} = \{\langle B, w \rangle \mid \text{$B$ is a DFA that accepts $w$}\}.$$

Theorem. $A_{DFA}$ is a decidable language.

The idea of the proof is rather simple: we simply simulate $B$ on the input string $w$. How does would a Turing machine do this? The machine would need to keep track of two things:

  1. The current state of $B$, and
  2. The current location in the word $w$.

Then the machine accepts the input if and only if $B$ is in an accepting state when the last symbol of $w$ is processed. Otherwise, it rejects.

We can ask the same question for NFAs: $$ A_{NFA} = \{\langle B, w \rangle \mid \text{$B$ is an NFA that accepts $w$}\}.$$

Theorem. $A_{NFA}$ is a decidable language.

Here, we can take advantage of the theorem we just proved above. We can construct a Turing machine that does the following on the input $\langle B,w \rangle$, where $B$ is an NFA and $w$ is an input word:

  1. Convert $B$ into a DFA $B'$ via subset construction.
  2. Run the Turing machine from the above theorem on the input $\langle B', w \rangle$ to check if $w \in L(B')$.
  3. If the TM accepts, accept. Otherwise, reject.

This machine demonstrates two ideas. The first is that we can use other Turing machines as subroutines like a typical programmer would think of doing. The second is that we are able to convert NFAs to DFAs, so we don't need to consider DFAs and NFAs separately anymore.

Emptiness

Now, we want to determine whether a given DFA will accept any strings. We will consider the following language $$ E_{DFA} = \{ \langle A \rangle \mid \text{$A$ is a DFA and $L(A) = \emptyset$}\}. $$

Theorem. $E_{DFA}$ is a decidable language.

To solve this problem, we simply treat it like a graph problem. A DFA accepts a word if there is an accepting state that is reachable from the initial state. Then we can use a similar idea to the graph connectedness algorithm from last week for this problem. The TM looks like this:

  1. Mark the initial state of $A$.
  2. Repeat until no new states get marked:
    Mark any state that has a transition coming into it from any state that is already marked.
  3. If no accepting state is marked, accept. Otherwise, reject.

Containment

Another major property of regular languages that is of use is to figure out whether one language is contained in another. This is expressed via the following language, $$ CO_{DFA} = \{ \langle A,B \rangle \mid \text{$A$ and $B$ are DFAs and $L(A) \subseteq L(B)$} \}.$$ Of course, this is another decidable property, so we have the following theorem.

Theorem. $CO_{DFA}$ is a decidable language.

Again, we make use of the machine that we constructed in the previous theorem. First, let's consider what it means for $L(A)$ to be contained in $L(B)$. We want to check whether every word in $L(A)$ is also in $L(B)$. However, an equivalent way of checking this property is to check whether there are any words in $L(A)$ that are not in $L(B)$. So we construct a DFA $C$ such that $$ L(C) = L(A) \cap \overline{L(B)}$$ and check whether or not $L(C) = \emptyset$. Luckily, we've just shown how to check the emptiness of a DFA.

  1. On input $\langle A,B \rangle$, construct the DFA $C$ as described above.
  2. Run the TM from the previous theorem on the input $\langle C \rangle$ to check if $L(C) = \emptyset$.
  3. If the machine accepts, accept. Otherwise, reject.

Equality

The last major formal languages question we'll cover in class deals with equality. Given two DFAs $A$ and $B$, do they recognize the same language? We have the following language, $$ EQ_{DFA} = \{ \langle A,B \rangle \mid \text{$A$ and $B$ are DFAs and $L(A) = L(B)$} \}.$$

Theorem. $EQ_{DFA}$ is a decidable language.

Of course, we have the opportunity to make use of the machine that we constructed in the previous theorem. Recall that for two sets $S$ and $T$, $S = T$ if and only if $S \subseteq T$ and $T \subseteq S$. This fact combined with the containment machine we constructed gives us a rather simple algorithm for checking equality.

  1. On input $\langle A,B \rangle$, run the TM from the previous theorem on the input $\langle A,B \rangle$ to check if $L(A) \subseteq L(B)$ and run it again on $\langle B,A \rangle$ to check if $L(B) \subseteq L(A)$.
  2. If both machines accept, accept. Otherwise, reject.

Universality

There is one more important property: universality. This is the problem of determining whether or not a given DFA accepts every string. In other words, does $L(A) = \Sigma^*$? We won't be going through the proof here since it's a problem on the first assignment.