CISC 462 — Lecture 19

The Cook-Levin Theorem

Now we'll finally get around to proving that $\text{SAT}$ is $\mathbf{NP}$-complete. It would be silly if after all of the reductions we've done that SAT wasn't NP-complete, since we've sort of hinged everything on that. Before we get started, let's review what our objective is. We need to show that SAT is NP-complete. In other words, we need to show that we can reduce every single problem in NP to SAT. Essentially, what we will show is that we can encode the computation of an arbitrary polynomial time nondeterministic Turing machine as a boolean formula of polynomial size.

Theorem. $\text{SAT}$ is $\mathbf{NP}$-complete.

Proof. It is clear that $\text{SAT}$ is in $\mathbf{NP}$, so we just show that $\text{SAT}$ is $\mathbf{NP}$-hard.

Let $L$ be a language in $\mathbf{NP}$ and let $M$ be the polynomial time nondeterministic Turing machine that decides it. We show that $L$ is polynomial time reducible to $\mathrm{SAT}$ by describing a way to tranform in polynomial time every input $x \in \Sigma^*$ into a boolean formula $\varphi_x$ such that $x \in L$ if and only if $\varphi_x$ is satisfiable.

We suppose that $M$ runs in time $n^k$. Then we can represent the computation of $M$ on an input word $x$ as an $n^k \times n^k$ grid called a tableau. Each row is a configuration, with the first and last columns bordered by $\#$. The first row is the start configuration of $M$ on $x$. A tableau is accepting if some row is an accepting configuration. An accepting tableau then corresponds to an accepting branch of a computation of $M$ on $w$.

Now we describe our polynomial time transformation. We let $C = Q \cup \Gamma \cup \{\#\}$ be our alphabet and define $cell[i,j]$ to be the contents of the $i,j$th cell of the tableau. Then we define a variable $x_{i,j,s}$ for every $1 \leq i,j \leq k$ and $s \in C$ by $$x_{i,j,s} = 1 \iff cell[i,j] = s.$$ Based on these variables, we construct our formula, which has four parts, $$\varphi_x = \varphi_{cell} \wedge \varphi_{start} \wedge \varphi_{move} \wedge \varphi_{accept}.$$

The first part, $\varphi_{cell}$, checks that each cell of the tableau contains exactly one symbol, $$\varphi_{cell} = \bigwedge_{1 \leq i,j \leq n^k} \left[ \left( \bigvee_{s \in C} x_{i,j,s} \right) \wedge \left( \bigwedge_{s,t \in C \\ s \neq t} \left(\overline{x_{i,j,s}} \vee \overline{x_{i,j,t}} \right) \right)\right].$$

The formula $\varphi_{start}$ encodes the start configuration. If our starting configuration is $$\# q_0 a_1 a_2 \cdots a_n \Box \cdots \Box \#$$ where $q_0$ is the initial state and the input word is $x = a_1 a_2 \cdots a_n$, then we have $$\varphi_{accept} = x_{1,1,\#} \wedge x_{1,2,q_0} \wedge x_{1,3,a_1} \wedge x_{1,4,a_2} \wedge \cdots \wedge x_{1,n+2,a_n} \wedge x_{1,n+3,\Box} \wedge \cdots \wedge x_{1,n^k-1,\Box} \wedge x_{1,n^k,\#}.$$

The formula $\varphi_{accept}$ encodes the condition for an accepting configuration. In practice, this just checks that some configuration contains the accepting state $q_A$ at some point in the tableau, $$ \varphi_{accept} = \bigvee_{1 \leq i,j \leq n^k} x_{i,j,q_A}.$$

Finally, $\varphi_{move}$ encodes the legal transitions from one configuration to the next. This sounds like a lot of work, but in reality, we only have to check a small portion of each configuration. Specifically, we have to check every $2 \times 3$ block of cells to ensure that all of the transitions are valid. Why? On any transition, we have the same sequence of actions:

Read the symbol under the tape head.
Move to the next state.
Write a symbol to under the tape head.
Move the tape head one cell left or right.

Rather than formally define what constitutes a legal window (since it's tedious and not super important to get the gist of the proof), we'll look at some examples.

Here, we have a transition moving left from $q_1$ to $q_2$.

$a$	$q_1$	$b$
$q_1$	$a$	$c$

Here, we have a transition moving right from $q_1$ to $q_2$.

$a$	$q_1$	$b$
$a$	$a$	$q_2$

Here, we have a transition moving to the right, out of the window.

$a$	$a$	$q_1$
$a$	$a$	$b$

Here, the transition is not taking place on this part of the tape.

$\#$	$a$	$b$
$\#$	$a$	$b$

Here, we have a transition moving on to the window from the right.

$a$	$b$	$a$
$a$	$b$	$q_2$

Here, we have a transition where the tape head indicator is just outside the window to the left.

$a$	$a$	$a$
$c$	$a$	$a$

From this, we can see that if the top row of the tableau is the start configuration and every window in the tableau is legal, then each row of the tableau is a configuration that follows from the preceding one.

Then, we can express the formula for $\varphi_{move}$, $$\varphi_{move} = \bigwedge_{1 \leq i,j \leq n^k} \text{(the $(i,j)$ window is legal)}.$$ We define the $(i,j)$ window as the window anchored at $(i,j)$ in the following way with the symbol $a_1$ in cell $(i,j)$:

$a_1$	$a_2$	$a_3$
$a_4$	$a_5$	$a_6$

Then we can express the expression $\text{(the $(i,j)$ window is legal)}$ by the following formula $$\bigvee_{\text{$a_1,\dots,a_6$ is a legal window}} (x_{i,j,a_1} \wedge x_{i,j+1,a_2} \wedge x_{i,j+2,a_3} \wedge x_{i+1,j,a_4} \wedge x_{i+1,j+1,a_5} \wedge x_{i+1,j+2,a_6}).$$

To see that this reduction is polynomial, we consider the size of $\varphi$. We note that the size of $\varphi$ varies based only on the size of the tableau and not the elements of the machine, such as the number of states or alphabet symbols, since those are all fixed and do not depend on the size of the input. The tablueau has $n^k \times n^k$ cells. This gives us $O(n^{2k})$ variables. Each cell has a fixed subformula in $\varphi_{cell}$, giving us $O(n^{2k})$ clauses. The formula $\varphi_{start}$ contains a subformula for each cell of the first row and thus is of size $O(n^k)$. The formulas $\varphi_{accept}$ and $\varphi_{move}$ contain fixed subformulas for each cell of the tableau, which means they of size $O(n^{2k})$. In total, $\varphi$ has size $O(n^{2k})$ and therefore is polynomial in $n$. $$\tag*{$\Box$}$$

The complement of a language in NP

When we introduced the class $\mathbf{NP}$, it was mentioned that while a language in $\mathbf{NP}$ could be verified in polynomial time using a polynomial length certificate, the same could not be said if we wanted to show the inverse. For instance, it's very easy to check that a formula $\varphi$ is satisfiable via some assignment that I provide, but it is not clear how to do the same to show that $\varphi$ doesn't contain a satisfying assignment.

Just as we did with the class of efficiently verifiable decision problems, we can classify the class of problems which are the complements of efficiently verifiable decision problems.

We define the class $\mathbf{coNP}$ to be $$ \mathbf{coNP} = \{ L \mid \overline L \in \mathbf{NP} \}.$$ It's important to note that $\mathbf{coNP}$ is not $\overline{\mathbf{NP}}$. To make this more clear, we can define $\mathbf{coNP}$ in a different way.

We say that $L \in \mathbf{coNP}$ if there exists a polynomial time verifier $V$ that runs in time $n^k$ such that for every input string $w$, we have $w \in L$ if and only if $M$ accepts $\langle w,c \rangle$ for every string $c$ of length $n^k$.

It turns out that $\mathbf{NP}$ and $\mathbf{coNP}$ have a non-empty intersection: we have $\mathbf P \subseteq \mathbf{NP} \cap \mathbf{coNP}$. This isn't too difficult to see, since solving a problem in $\mathbf P$ gives us a way to verify the inverse problem.

However, like the $\mathbf P$ and $\mathbf{NP}$ question, we have an intuitive understanding of the difference between $\mathbf{NP}$ and $\mathbf{coNP}$ that breaks down when we try to determine the exact relation that these two classes have. It turns out we don't know whether $\mathbf{NP}$ and $\mathbf{coNP}$ are the same or different classes.