CS 360 — Lecture 3

We won't go through a formal correctness proof for the Hamming neighbourhood DFA, but it's not difficult to see the approach for such a proof. What we would need to show is that an input word $w = b_1 b_2 \cdots b_\ell$ reaches a state $\langle i,j \rangle$ if and only if $|w| = i$ and $H(w,a_1 \cdots a_i) = j$.

We can try to extend this approach to finite languages or even regular languages. For instance, consider the following DFA:

It was shown by Grigoriy Povarov in 2007 that for this DFA, to recognize all words within a Hamming distance of 1, we actually need a different construction that uses $O(2^n)$ states. It appears that we run into problems with the restriction that the transition function can only go to one state for every state and symbol. Perhaps, relaxing this condition may help.

Nondeterminism

As DFAs are currently defined, and so named, the transition function $\delta$ is deterministic. Specifically, it's determined by the current state and input symbol; there is exactly one state that the transition function points to. This does keep things simple for us and it accurately reflects how real computers work. However, it does seem like a bit of a stifling restriction and sometimes there are a few points where it'd be so much easier just to be able to go to multiple states.

Here, we'll relax the condition that the transition function is deterministic by introducing nondeterminism. We call these machines nondeterministic finite automata for obvious reasons. Informally, this means that in an NFA, for each state and transition symbol, we'll allow transitions to multiple states.

Conceptually, what does this mean? In essence, the NFA can "guess" which transition to take. While with a DFA we could represent a computation on an input word with a single path, with nondeterminism, there are multiple paths which we can represent as a tree, with branches forming at the points where there are choices to be made.

But if there are multiple possible paths, how do we define acceptance under this model? If there is at least one path which, after reading the input word, reaches an accepting state, then the NFA accepts the word. This fits with the idea that the NFA makes guesses whenever it has a choice; the existence of a path that reaches a final state means that there is a set of guesses that the NFA could make in order for the word to be accepted. Conversely, the NFA rejects $w$ if there are no paths which reach a final state upon reading $w$.

The following is an example of an NFA, which accepts the language $\{a,b\}^* b \{a,b\}^2$. Note that on $q_0$, there are two transitions on $b$: one to $q_0$ and one to $q_1$.

Nondeterministic finite automata

Formally, a nondeterministic finite automaton (NFA) is a 5-tuple $A = (Q,\Sigma,\delta,I,F)$, where

$Q$ is a finite set of states,
$\Sigma$ is an alphabet,
$\delta$ is a transition function $\delta : Q \times \Sigma \to 2^Q$,
$q_0$ is the initial state,
$F$ is a set of final states.

This looks basically the same as the DFA, except for one important difference: the transition function $\delta$ is now a function from a state and a symbol to a set of states. This explains why we don't require that $\delta$ is defined explicitly for all states and symbols. Since the image of $\delta$ is sets of $Q$, we implicitly define "missing" transitions as going to the empty set $\emptyset$.

This also means that we will need to formally redefine acceptance of a word with nondeterminism. Again, we will extend our transition function to a function $\hat \delta: Q \times \Sigma^* \to 2^Q$. We define $\hat \delta(q,\varepsilon) = \{q\}$. Then for a word $w$ with $|w| > 0$, we let $w = xa$ for $x \in \Sigma^*$ and $a \in \Sigma$, we have $$ \hat \delta(q,xa) = \bigcup_{p \in \hat \delta(q,x)} \delta(p,a).$$ Let's take a moment to see how this definition fits with our intution about how nondeterminism behaves. It is easy to see that by taking the union of these sets, we really do get all possible transitions. However, note that this definition also handles transitions that are missing: if $\delta(p,a) = \emptyset$, then for any set $S$, we have $S \cup \emptyset = S$, and so $\hat \delta$ remains well-defined.

Now, we can define acceptance under the nondeterministic model. We say that an NFA $A$ accepts a word $w$ if $\hat \delta(q_0,w) \cap F \neq \emptyset$. Then the language of an NFA $A = (Q,\Sigma,\delta,q_0,F)$ is defined by $$ L(A) = \{w \in \Sigma^* \mid \hat \delta(q_0,w) \cap F \neq \emptyset \}.$$

NFAs vs DFAs

So now, the obvious question to ask is: does nondeterminism really give us more power? That is, can NFAs recognize languages that DFAs can't? It turns out the answer is no.

But how? We'll show that given any NFA $N$, we can construct a DFA $M$ that accepts the same language. The key observation is in examining what exactly is happening with our nondeterministic transition function. Remember that for each state and symbol, the transition function gives a set of states rather than a single state as a deterministic finite automaton.

So how many sets are there? If there are $n$ states, then there are $2^n$ possible subsets of $Q$. This is a large number which we've been trained as computer scientists to be alarmed at, but the important thing is that it is finite. This gives us a way to "simulate" the nondeterminism of an NFA using a DFA: just make each subset of $Q$ its own state!

Let's try this with the example above. Here is the transition table with all the subsets.

	$a$	$b$
$\emptyset$	$\emptyset$	$\emptyset$
$\{q_0\}$	$\{q_0\}$	$\{q_0,q_1\}$
$\{q_1\}$	$\{q_2\}$	$\{q_2\}$
$\{q_2\}$	$\{q_3\}$	$\{q_3\}$
$\{q_3\}$	$\emptyset$	$\emptyset$
$\{q_0,q_1\}$	$\{q_0,q_2\}$	$\{q_0,q_1,q_2\}$
$\{q_0,q_2\}$	$\{q_0,q_3\}$	$\{q_0,q_1,q_3\}$
$\{q_0,q_3\}$	$\{q_0\}$	$\{q_0,q_1\}$
$\{q_1,q_2\}$	$\{q_2,q_3\}$	$\{q_2,q_3\}$
$\{q_1,q_3\}$	$\{q_2\}$	$\{q_2\}$
$\{q_2,q_3\}$	$\{q_3\}$	$\{q_3\}$
$\{q_0,q_1,q_2\}$	$\{q_0,q_2,q_3\}$	$\{q_0,q_1,q_2,q_3\}$
$\{q_0,q_1,q_3\}$	$\{q_0,q_2\}$	$\{q_0,q_1,q_2\}$
$\{q_0,q_2,q_3\}$	$\{q_0,q_3\}$	$\{q_0,q_1,q_3\}$
$\{q_1,q_2,q_3\}$	$\{q_2,q_3\}$	$\{q_2,q_3\}$
$\{q_0,q_1,q_2,q_3\}$	$\{q_0,q_2,q_3\}$	$\{q_0,q_1,q_2,q_3\}$

Note that all the subsets in red cannot be reached from the initial state $\{q_0\}$. Now, here is the state diagram for the DFA.

I've omitted all the subsets that can't be reached, but this is still a large DFA relative to the size of the NFA. In fact, one can show that the language $\{a,b\}^* b \{a,b\}^{n-1}$ can be recognized by an NFA with $n$ states but requires a DFA with $2^{n-1}$ states.

This construction is called the subset construction and was due to Rabin and Scott in 1959. An interesting question that one can consider is when the worst case of $2^n$ states comes up. Similar questions about the worst-case growth in the number of states of a DFA or NFA can be asked for other transformations or language operations in a line of research in automata theory called state complexity.

Now, let's define the subset construction formally. Let $N = (Q,\Sigma,\delta,q_0,F)$ be an NFA. We will construct a DFA $\mathcal D(N) = (Q',\Sigma,\delta',q_0',F')$ such that $L(\mathcal D(N)) = L(N)$. We will define each component of $\mathcal D(N)$:

$Q' = 2^Q$. That is, $Q'$ is the set of subsets of $Q$. This means that a state of $\mathcal D(N)$ is a subset of $Q$.
$\Sigma$ is the same alphabet as in $N$.
$q_0' = \{q_0\}$. That is, the initial state of $\mathcal D(N)$ is the set containing only $q_0$.
$F' = \{S \subseteq Q \mid S \cap F \neq \emptyset\}$. Remember that the acceptance condition of $N$ is that there is at least one path in $N$ that ends in an accepting state. Thus, the set of final states of $\mathcal D(N)$ is any subset of $Q$ that contains a final state of $N$.
Finally, we have the transition function $\delta'$, which we define in terms of a set $S \subseteq Q$ and an input symbol $a \in \Sigma$: $$\delta'(S,a) = \bigcup_{p \in S} \delta(p,a).$$ In other words, $\delta'(S,a)$ is all states reachable in $N$ from $S$ on the symbol $a$.

Now, we need to show that $\mathcal D(N)$ and $N$ recognize exactly the same language.

Theorem 2.11. Let $N$ be an NFA. Then $L(N) = L(\mathcal D(N))$.

Proof. We want to show that $w \in L(N)$ if and only if $w \in L(\mathcal D(N))$. This means that by definition, $\hat \delta(q_0,w) \cap F \neq \emptyset$ if and only if $\hat \delta'(\{q_0\},w) \cap F \neq \emptyset$. So, we will show that $\hat \delta(q,w) = \hat \delta'(\{q\},w)$. We will show this by induction on $|w|$.

First, we have $|w| = 0$ and so $w = \varepsilon$. Then $\hat \delta(q,\varepsilon) = \{q\} = \hat \delta'(\{q\},\varepsilon)$.

Now, suppose $|w| > 0$ and that for every string $u$ with $|u| < |w|$, we have $\hat \delta(q,u) = \hat \delta'(\{q\},u)$. Let $w = ua$ with $a \in \Sigma$. Then we have \begin{align*} \hat \delta'(\{q\},ua) &= \delta'(\hat\delta'(\{q\},u),a) &\text{(by definition of $\hat \delta'$)}\\ &= \delta'(\hat \delta(q,u),a) &\text{(by inductive hypothesis)}\\ &= \bigcup_{p \in \hat\delta(q,u)} \delta(p,a) &\text{(by definition of $\hat \delta'$)}\\ &= \hat \delta(q,ua). &\text{(by definition of $\hat \delta$)} \end{align*} Thus, we have shown that $\hat \delta(q,w) = \hat \delta'(\{q\},w)$ for all $q \in Q$ and $w \in \Sigma^*$, and we can conclude that $L(N) = L(\mathcal D(N))$. $$\tag*{$\Box$}$$

Of course, this only gives us half of what we need, although what remains is fairly simple.

Theorem 2.12. A language $L$ is accepted by a DFA if and only if $L$ is accepted by an NFA.

We have already shown the only if direction of this theorem in Theorem 2.11. What's left to show is the if direction, which involves taking a DFA and constructing an equivalent NFA. This should be fairly simple to define and prove correctness for, so we won't go through it.