CS 360 — Lecture 23

The complement of a language in NP

When we introduced the class $\mathbf{NP}$, we mentioned that while membership in a language in $\mathbf{NP}$ can be verified in polynomial time using a polynomial-length certificate, the same cannot be said for the complement. For instance, it's very easy to check that a formula $\varphi$ is satisfiable via some assignment that I provide, but it is not clear how to do the same to show that $\varphi$ has no satisfying assignment.

Just as we did with the class of efficiently verifiable decision problems, we can define the class of problems that are the complements of efficiently verifiable decision problems.

We define the class $\mathbf{coNP}$ to be $$ \mathbf{coNP} = \{ L \mid \overline L \in \mathbf{NP} \}.$$ It's important to note that $\mathbf{coNP}$ is not $\overline{\mathbf{NP}}$, the class of languages that are not in $\mathbf{NP}$. To make this clearer, we can define $\mathbf{coNP}$ in a different way.

We say that $L \in \mathbf{coNP}$ if there exists a polynomial time verifier $V$ that runs in time $n^k$ such that for every input string $w$ of length $n$, we have $w \in L$ if and only if $V$ accepts $\langle w,c \rangle$ for every string $c$ of length at most $n^k$.

We can define $\mathbf{coNP}$-completeness in the same way as $\mathbf{NP}$-completeness: a language is $\mathbf{coNP}$-complete if it is in $\mathbf{coNP}$ and every other language in $\mathbf{coNP}$ is polynomial time reducible to it.

Some obvious problems in $\mathbf{coNP}$ can be found just by taking the complement of a problem in $\mathbf{NP}$ (for example, $\overline{\mathrm{SAT}}$, the problem of deciding whether a given formula $\varphi$ is unsatisfiable). However, $\mathbf{coNP}$ also contains more natural problems. We consider the problem $\text{TAUTOLOGY}$: Given a Boolean formula $\varphi$, is $\varphi$ satisfied by every assignment?

Theorem. $\text{TAUTOLOGY}$ is $\mathbf{coNP}$-complete.

Proof. By definition, it is fairly clear that $\text{TAUTOLOGY} \in \mathbf{coNP}$: a verifier can check, for any given assignment $c$, that $\varphi$ evaluates to true. To show that every $L \in \mathbf{coNP}$ reduces to $\text{TAUTOLOGY}$, we modify the Cook-Levin reduction from $\overline L$ to $\mathrm{SAT}$, which applies since $\overline L \in \mathbf{NP}$. For every input word $w$, the reduction constructs a formula $\varphi_w$ that is satisfiable if and only if $w \in \overline L$. Then $\neg \varphi_w \in \text{TAUTOLOGY}$ if and only if $\varphi_w$ is unsatisfiable, if and only if $w \in L$. $$\tag*{$\Box$}$$
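The duality used in the proof, that $\varphi$ is a tautology exactly when $\neg\varphi$ is unsatisfiable, can be sketched with a brute-force check (a minimal sketch, representing formulas as hypothetical Python functions of Boolean arguments; it takes exponential time, as expected for a $\mathbf{coNP}$-complete problem):

```python
from itertools import product

def is_satisfiable(formula, num_vars):
    """The existential (NP-style) condition: some assignment satisfies the formula."""
    return any(formula(*a) for a in product([False, True], repeat=num_vars))

def is_tautology(formula, num_vars):
    """The universal (coNP-style) condition: every assignment satisfies the formula."""
    return all(formula(*a) for a in product([False, True], repeat=num_vars))

# The reduction's key fact: phi is a tautology iff (not phi) is unsatisfiable.
phi = lambda x, y: (x or not x) and (y or not y)   # a simple tautology
not_phi = lambda x, y: not phi(x, y)
assert is_tautology(phi, 2) and not is_satisfiable(not_phi, 2)
```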

It turns out that $\mathbf{NP}$ and $\mathbf{coNP}$ have a non-empty intersection: we have $\mathbf P \subseteq \mathbf{NP} \cap \mathbf{coNP}$. This isn't too difficult to see: if $L \in \mathbf P$, then $\overline L \in \mathbf P \subseteq \mathbf{NP}$ as well, so $L \in \mathbf{coNP}$. In other words, solving a problem in polynomial time gives us a way to verify both it and its complement.

However, like the $\mathbf P$ versus $\mathbf{NP}$ question, we have an intuitive understanding of the difference between $\mathbf{NP}$ and $\mathbf{coNP}$ that breaks down when we try to determine the exact relation between these two classes. It turns out we don't know whether $\mathbf{NP}$ and $\mathbf{coNP}$ are the same or different classes.

The polynomial hierarchy

Can we go further with this idea? For example, we can see that $\mathbf{NP}$ is characterized by acceptance with the existence of a certificate, while $\mathbf{coNP}$ is characterized by acceptance with all certificates. What if we wanted to build more complicated problems by using more than one of these quantifiers?

Yes, we can. These are the classes $\Sigma_i^P$ and $\Pi_i^P$, where $i$ counts the quantifiers and $\Sigma$ or $\Pi$ indicates which quantifier the statement begins with. In other words, if you have a polynomial-time verifier that accepts when there exists a certificate $c_1$ such that for all certificates $c_2$ there exists a certificate $c_3$, and so on through $i$ alternating quantifiers, that's a problem in $\Sigma_i^P$. Just from the definition, we clearly have $\Sigma_i^P \subseteq \Sigma_{i+1}^P$. We also have $\Pi_i^P = \mathbf{co}\Sigma_i^P$, since those are the statements that begin with a universal quantifier instead.
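Written out as a verifier condition (a sketch, with certificate lengths bounded by some polynomial $p$), a language $L$ is in $\Sigma_i^P$ if there is a polynomial time verifier $V$ such that $$ w \in L \iff \exists c_1 \, \forall c_2 \, \exists c_3 \cdots Q_i c_i \text{ such that } V \text{ accepts } \langle w, c_1, \dots, c_i \rangle, $$ where each $|c_j| \leq p(|w|)$ and $Q_i$ is $\exists$ if $i$ is odd and $\forall$ if $i$ is even. Swapping every quantifier, so that the statement begins with $\forall$, gives the corresponding definition of $\Pi_i^P$.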

We define the union of all of these classes to be the polynomial hierarchy $$\mathbf{PH} = \bigcup_{i \geq 0} \Sigma^P_i = \bigcup_{i \geq 0} \Pi_i^P.$$ Note that the lowest levels are familiar classes: $\Sigma_0^P = \Pi_0^P = \mathbf P$, $\Sigma_1^P = \mathbf{NP}$, and $\Pi_1^P = \mathbf{coNP}$.

Of course, this means that just like with $\mathbf P$, $\mathbf{NP}$, and $\mathbf{coNP}$, we actually don't know whether any of these inclusions are strict. It could be that the hierarchy is finite: if $\Sigma_k^P = \Pi_k^P$ for some $k$ (maybe even $k = 1$?), the hierarchy collapses to that level. But we don't think that's the case. And of course, as you may have come to expect, we don't even know whether $\mathbf P = \mathbf{PH}$.

Space complexity

We can define the space complexity of a Turing machine in a similar way as the time complexity.

Let $M$ be a deterministic Turing machine that halts on all inputs. The space complexity of $M$ is the function $f : \mathbb N \to \mathbb N$, where $f(n)$ is the maximum number of tape cells that $M$ scans on any input of length $n$. If the space complexity of $M$ is $f(n)$, we say that it runs in space $f(n)$.

If $M$ is nondeterministic, we define its space complexity $f(n)$ to be the maximum number of tape cells that $M$ scans on any branch of its computation for any input of length $n$.

Let $f : \mathbb N \to \mathbb R$. We define the space complexity classes $$\mathbf{SPACE}(f(n)) = \{ L \mid \text{$L$ is decided by an $O(f(n))$ space DTM} \},$$ $$\mathbf{NSPACE}(f(n)) = \{ L \mid \text{$L$ is decided by an $O(f(n))$ space NTM} \}.$$

As it turns out, all the same results from the old days of complexity theory that we went through for time also hold for space: the gap theorem, constructibility, the space hierarchy, infinite speedup.

SAT

We've shown that $\mathrm{SAT}$ can be considered a "hard" problem because we don't know of an efficient (polynomial-time) algorithm that solves it. Here, we show that we can solve $\mathrm{SAT}$ in linear space.

  1. For a Boolean formula $\varphi$ with variables $x_1,\dots,x_m$, evaluate $\varphi$ on each of the $2^m$ assignments in turn.
  2. If $\varphi$ ever evaluates to True, accept; otherwise, reject.

Here, we see that we can just cycle through every assignment using the same portion of the tape. Since we have $m \leq n$ variables, this algorithm runs in $O(n)$ space.
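The two steps above can be sketched in Python (the formula is represented as a hypothetical Python predicate on a list of Booleans; only the $m$-bit counter and the current assignment are reused across iterations, mirroring the $O(n)$ space bound):

```python
def brute_force_sat(formula, m):
    """Decide SAT in O(m) space: reuse one m-bit counter to cycle
    through all 2^m assignments; nothing else is stored."""
    for counter in range(2 ** m):
        # Decode the counter's bits as a truth assignment to x_1, ..., x_m.
        assignment = [(counter >> i) & 1 == 1 for i in range(m)]
        if formula(assignment):
            return True   # some assignment evaluates to True: accept
    return False          # every assignment evaluated to False: reject

# Hypothetical example formula: (x1 or x2) and (not x1 or x3)
phi = lambda x: (x[0] or x[1]) and (not x[0] or x[2])
assert brute_force_sat(phi, 3)
```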

This also shows us that space is possibly more powerful than time by demonstrating a property of space that seems fairly obvious: space can be reused, but time can't.

$\mathrm{ALL_{NFA}}$

Now, we consider a nondeterministic algorithm for $\mathrm{ALL_{NFA}}$. Recall that $\mathrm{ALL_{NFA}}$ is the problem: Given an NFA $A$, does $A$ accept every string? We give an algorithm for the complement $\overline{\mathrm{ALL_{NFA}}}$. This language is not known to be in $\mathbf{NP}$ or in $\mathbf{coNP}$.

The idea is to nondeterministically guess a string symbol by symbol, run it on the NFA, and keep track of all the states the NFA could be in.

  1. For an NFA $A$, mark the initial state.
  2. Repeat $2^n$ times, where $n$ is the number of states of $A$:
    Nondeterministically select an input symbol and change the positions of the markers on the states of $A$ to simulate reading that symbol.
  3. Accept if, at some point, none of the markers is on an accepting state of $A$; this means the guessed string is rejected by $A$. Otherwise, reject.

We can restrict the search to strings of length at most $2^n$: there are only $2^n$ possible sets of marked states, so if some longer string is rejected, a set of markers must repeat along the way, and the section of the string between the repetitions can be removed to get a shorter rejected string.

Since we only need to keep track of the markers ($n$ bits) and the counter for the loop (another $n$ bits), the algorithm runs in nondeterministic space $O(n)$.
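As a sketch of the marker simulation, the following replaces the nondeterministic guess with a deterministic search over reachable marker sets (the NFA is assumed to have no $\varepsilon$-transitions and is given as a hypothetical transition dictionary; note that storing the visited sets uses exponential space, unlike the $O(n)$ nondeterministic algorithm above):

```python
from collections import deque

def rejects_some_string(alphabet, delta, start, accepting):
    """Search every reachable set of marked states (at most 2^n of them);
    accept if some reachable set contains no accepting state, i.e. the
    NFA rejects the string that leads there."""
    initial = frozenset([start])      # step 1: mark the initial state
    seen = {initial}
    queue = deque([initial])
    while queue:
        marked = queue.popleft()
        if not (marked & accepting):  # step 3: no marker on an accepting state
            return True
        for symbol in alphabet:       # step 2: simulate reading each symbol
            nxt = frozenset(q for p in marked for q in delta.get((p, symbol), ()))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical NFA over {a} accepting exactly the even-length strings:
delta = {("even", "a"): {"odd"}, ("odd", "a"): {"even"}}
# The string "a" is rejected, so the complement algorithm accepts:
assert rejects_some_string(["a"], delta, "even", {"even"})
```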

PSPACE

We define the class $\mathbf{PSPACE}$ to be the class of languages that are decidable in polynomial space on a deterministic Turing machine. That is, $$ \mathbf{PSPACE} = \bigcup_{k \geq 1} \mathbf{SPACE}(n^k).$$

We define the class $\mathbf{NPSPACE}$ to be the class of languages that are decidable in polynomial space on a nondeterministic Turing machine. That is, $$ \mathbf{NPSPACE} = \bigcup_{k \geq 1} \mathbf{NSPACE}(n^k).$$

Clearly, $\mathbf{PSPACE} \subseteq \mathbf{NPSPACE}$. Interestingly, unlike $\mathbf P$ and $\mathbf{NP}$, we have $\mathbf{PSPACE} = \mathbf{NPSPACE}$, which is a consequence of Savitch's theorem:

Theorem. (Savitch) For any function $f : \mathbb N \to \mathbb R$, where $f(n) \geq n$, $$\mathbf{NSPACE}(f(n)) \subseteq \mathbf{SPACE}(f^2(n)).$$

What do we know about $\mathbf{PSPACE}$? Since a machine can't scan more tape cells than it has time steps, we have $\mathbf P \subseteq \mathbf{PSPACE}$ and $\mathbf{NP} \subseteq \mathbf{NPSPACE}$. Furthermore, our $\mathrm{SAT}$ example earlier shows us a way to solve problems in $\mathbf{NP}$ in polynomial space: cycle through every certificate, reusing the same space. This gives us $\mathbf{NP} \subseteq \mathbf{PSPACE}$ and $\mathbf{coNP} \subseteq \mathbf{PSPACE}$. A similar argument can be made for each class in $\mathbf{PH}$, so we also have $\mathbf{PH} \subseteq \mathbf{PSPACE}$.

We can also get a time bound on a language in $\mathbf{PSPACE}$ by bounding the number of configurations a polynomial space Turing machine can have. If a TM runs in $f(n)$ space, then it has at most $f(n) \cdot 2^{O(f(n))}$ configurations. A halting computation can never repeat a configuration, so such a machine must run in time $f(n) \cdot 2^{O(f(n))}$, which gives us $\mathbf{PSPACE} \subseteq \mathbf{EXPTIME} = \bigcup_{k \geq 0} \mathbf{TIME}(2^{n^k})$.
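The configuration bound comes from counting the choices for each component of a configuration. As a sketch, for a machine with state set $Q$ and tape alphabet $\Gamma$ running in space $f(n)$, a configuration consists of the current state, the head position, and the contents of the $f(n)$ tape cells, so $$ \#\text{configurations} \leq |Q| \cdot f(n) \cdot |\Gamma|^{f(n)} = f(n) \cdot 2^{O(f(n))}. $$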

This gives us the following picture: $$ \mathbf P \subseteq \mathbf{NP} \subseteq \mathbf{PH} \subseteq \mathbf{PSPACE} = \mathbf{NPSPACE} \subseteq \mathbf{EXPTIME}.$$ It is important to note that we do not know if any of these are equal, except for $\mathbf P \neq \mathbf{EXPTIME}$, as we've seen earlier.