Let's consider the language $$ L_3 = \{w \in \{0,1\}^* \mid \text{$w$ is the binary representation of a number that is divisible by 3} \}.$$
Let's build a DFA that recognizes this language. Let $A = (Q,\Sigma,\delta,q_0,F)$, where
| $0$ | $1$ | |
|---|---|---|
| $0$ | $0$ | $1$ |
| $1$ | $2$ | $0$ |
| $2$ | $1$ | $2$ |
The state diagram for $A$ is shown below.
\begin{tikzpicture}[shorten >=1pt,on grid,node distance=2.5cm,>=stealth,thick,auto]
\node[state,initial,accepting] (0) {$0$};
\node[state] (1) [right of=0]{$1$};
\node[state] (2) [right of=1]{$2$};
\path[->]
(0) edge [loop above] node {$0$} (1)
(0) edge [bend left] node {$1$} (1)
(1) edge [bend left] node {$1$} (0)
(1) edge [bend left] node {$0$} (2)
(2) edge [bend left] node {$0$} (1)
(2) edge [loop above] node {$1$} (2)
;
\end{tikzpicture}
from automata.fa.dfa import DFA
A = DFA(
states=set('012'),
input_symbols=set('01'),
transitions={
'0': {'0': '0', '1': '1'},
'1': {'0': '2', '1': '0'},
'2': {'0': '1', '1': '2'},
},
initial_state='0',
final_states={'0'}
)
Let's think about how this machine should work. The DFA works by reading a string from left to right, which means that it'll be reading the most significant bit first. The natural way to think about this problem, then, is to consider what our input modulo 3 should be. Obviously, we want to accept a string if and only if it corresponds to an integer that's equivalent to $0 \bmod 3$. Keeping this in mind, having each state for the different possibilities modulo 3 seems like a reasonable idea.
There are a few caveats we should consider. First, how should we interpret $\varepsilon$? Since we begin in state 0, we're interpreting $\varepsilon$ as 0. For our purposes, this is fine. But if we didn't want to accept this interpretation, then we would need to add an extra state to handle this case. Similarly, we can ask the question of whether we'd like to allow inputs with leading zeroes and we would have to adjust our DFA accordingly.
Now, suppose we've read a string $w$ and now we want to read either a $0$ or a $1$. We just have to remember a bit about binary math. Let $\llbracket w \rrbracket_2$ denote the integer that $w$ represents in binary. Then, $$\llbracket w0\rrbracket_2 = 2 \cdot \llbracket w\rrbracket_2 \quad \text{and} \quad \llbracket w1\rrbracket_2 = 2 \cdot \llbracket w\rrbracket_2 + 1.$$ With this in mind, we just need to think about how reading a new bit will change our result modulo 3. We can summarize the changes to $\llbracket w\rrbracket_2 \bmod 3$ in the following table:
| $\llbracket w\rrbracket_2 \bmod 3$ | $\llbracket w0\rrbracket_2 \bmod 3$ | $\llbracket w1\rrbracket_2 \bmod 3$ |
|---|---|---|
| $0$ | $0$ | $1$ |
| $1$ | $2$ | $0$ |
| $2$ | $1$ | $2$ |
Now, this looks suspiciously like a transition table. If $q = \llbracket w \rrbracket_2 \bmod 3$ is the state that we're in, then we have $\delta(q,a) = (2 \cdot q + a) \bmod 3$ and this corresponds exactly with our definition of $\delta$ from above. We will use this to show that our DFA is correct.
$L_3 = L(A)$.
Let $\llbracket w \rrbracket_2$ denote the integer represented by the binary string $w$. First, we claim that a string $w$ reaches state $i$ if and only if $\llbracket w \rrbracket_2 = i \bmod 3$. In other words, $\delta^*(q_0, w) = i$ if and only if $\llbracket w \rrbracket_2 = i \bmod 3$. We will show this by induction on $w$.
Our base case is $w = \varepsilon$. Then $\delta^*(0,\varepsilon) = 0$ by definition and we have $\llbracket \varepsilon \rrbracket_2 = 0 = 0 \bmod 3$.
Now let $w = xa$, with $a \in \Sigma$ and $x \in \Sigma^*$. We assume $\delta^*(0,x) = i$ if and only if $\llbracket x \rrbracket_2 = i \bmod 3$. We have \begin{align*} \delta^*(0,xa) &= \delta(\delta^*(0,x),a) &\text{Definition of $\delta^*$} \\ &= \delta(\llbracket x \rrbracket_2 \bmod 3, a) &\text{Inductive hypothesis} \\ &= (2 \cdot (\llbracket x \rrbracket_2 \bmod 3) +a) \bmod 3 &\text{Definition of $\delta$}\\ &= (2 \cdot \llbracket x \rrbracket_2 + a) \bmod 3 &\text{Reduction in $\mathbb Z_3$}\\ &= \llbracket xa \rrbracket_2 \bmod 3 &\text{Definition of $\llbracket \cdot \rrbracket_2$} \end{align*}
Thus, we have shown that $\delta^*(q_0,w) = i$ if and only if $\llbracket w \rrbracket_2 = i \bmod 3$.
To argue that $L_3 = L(A)$, we observe that \begin{align*} w \in L_3 &\iff 3 \mid \llbracket w \rrbracket_2 & \text{Definition of $L$}\\ &\iff \llbracket w \rrbracket_2 = 0 \bmod 3 & \text{Definition of divsibility}\\ &\iff \delta^*(q_0, w) = 0 &\text{Proved above}\\ &\iff w \in L(A) &\text{Since $F = \{0\}$} \end{align*} Therefore, we can conclude that $A$ accepts $w$ if and only if $\llbracket w \rrbracket_2$ is divisible by 3.
It's important to note the structure of the argument that was made here. In order to prove that our DFA accepted the correct language, we need to show that it accepts only the strings in the target language. That is, the DFA must accept the strings in $L$ and it must reject the string not in $L$.
In order to argue this, we must argue that every string reaches the "correct' state in the DFA. This is the most common mistake that is made in these proofs—you cannot prove that $L = L(A)$ by induction on strings in $L$.
Once you have successfully shown that every string reaches the correct state, we finish the proof by arguing that a string is in $L$ if and only if it reaches a final state. This should be relatively straightforward at this point.
From this, one might ask whether it's possible to construct a DFA that recognizes all of the binary strings that represent integers that are divisible by $n$ for any arbitrary integer $n$. The answer is yes, and it's not hard to see why. There are only finitely many equivalence classes modulo $n$ (or rather, exactly $n$) and all we need to do is keep track of how this number changes with each additional symbol read. You can even take this approach and make it work for strings representing integers in any base $k$.
This captures the kinds of problems that finite automata can solve. The key property of finite automata is that they can only keep track of a finite amount of information. This information doesn't scale with the size of the input—even with a finite amount of information, we're able to describe an infinite set like the integers modulo 3. In the language of complexity theory, these are problems that have space complexity $O(1)$. What kinds of problems can we solve with only a finite amount of information?
Recall that languages are just sets of words and that we can perform set operations on languages. However, once we start thinking about computation, particularly with our simple model of computation, an important question is whether the resulting language that we get from applying these operations is still computable.
What does applying these operations allow us to do? Consider the boolean set operations. If we took one of the problems we've seen, like divisibility by 3 or searching for a substring, we can use the boolean set operations to express natural followup questions:
We say that a class of languages is closed under an operation if whenever the arguments of the operation are in the class, the result is also.
Let's consider the DFA languages. If we want to show that the DFA languages are closed under a particular operation, we have to show that the result of the operation can be recognized by a DFA. The most straightforward way of doing this is to take the input DFAs and use them to construct a DFA for the result—an algorithm.
First, let's do a simple, but very useful, example.
The DFA languages are closed under complement.
Proof. Let $L \subseteq \Sigma^*$ be a DFA language. Since $L$ is DFA, it is recognized by a DFA $A = (Q,\Sigma^*,\delta,q_0,F)$. We will construct a DFA $A'$ that recognizes $\overline L$ as follows: let $A' = (Q,\Sigma,\delta,q_0,Q \setminus F)$.
In other words, we take the DFA $A$ and swap the set of non-final and final states. Then we have for all $w \in \Sigma^*$, \begin{align*} w \in L(A') &\iff \delta(q_0,w) \in Q \setminus F \\ &\iff w \not \in L(A) \\ &\iff w \not \in L \\ &\iff w \in \overline L \end{align*} $\tag*{$\Box$}$
The idea that this is an algorithm becomes even more stark if we render it as a program.
def complement(A: DFA) -> DFA:
return DFA(
states=A.states,
input_symbols=A.input_symbols,
transitions=A.transitions,
initial_state=A.initial_state,
final_states=(A.states - A.final_states)
)
If we think back to languages as decision problems, then this gives us a handy way to solve the complement of some problems. For instance, we now have a very easy way to determine whether a number isn't divisible by 3. That is, the complement of a language is really the same as asking for the negation of the respective decision problem.