CISC 462 — Lecture 1

What is computability?

Simply put, computability is about whether a particular problem can be solved or not. Depending on what you've assumed about computers so far, this statement may or may not be obvious to you, but believing it and proving it are two different things. You might remember something like this coming up back in CISC 223, where it was shown that there's no computer program that will tell you whether some given program will halt or not. In this course, we're going to be looking at this from a more formal perspective.

What is complexity?

As anyone who's done any amount of programming probably knows, it's not enough to be able to solve a problem. We also want to solve it in the most efficient way. This is where the idea of complexity comes in.

Complexity is all about how difficult it is to solve a particular problem. What do we mean by this? If we want to write a program to solve a particular problem, we want to write something that's fast and doesn't use up a lot of resources. In other words, we want to use the least time and space in solving the problem.

You might be familiar with these notions already when designing algorithms. In algorithm design, we try to come up with the best possible algorithm to solve a problem. However, even if we come up with a good algorithm, we have no idea if it's really the best one. This is where complexity theory comes in. Complexity theory is about showing that we cannot do any better for a particular problem.

Some history

Computability theory is really the beginning of the entire field of computer science, and to see why, we need to go back to the 1920s and 30s. An ambition of the mathematical community at the time was to axiomatize all of mathematics. The idea was that such a task would enable the construction of a machine that, when given a statement in first-order logic, could answer whether it was true or not.

David Hilbert, in his famous 1900 address, proposed 23 problems that he believed would be important in the following century. The second of these dealt with the axiomatization of mathematics. Later, in 1928, he posed the Entscheidungsproblem, or decision problem: the problem of finding an algorithm that would describe the operation of the hypothetical machine described above.

Formally, he was asking for a proof system that was:

  1. complete: every true statement can be proved in the system,
  2. consistent: no contradiction can be derived in the system, and
  3. decidable: there is an algorithm that determines whether a given statement is provable.

As it turns out, in 1931, Kurt Gödel showed that no such system exists. Specifically, he showed that every sufficiently powerful system is either incomplete or inconsistent. How powerful is sufficiently powerful? A proof system that can express addition and multiplication of natural numbers is powerful enough to qualify.

Then, in 1936, Turing showed that we can't even have a system that is decidable. To do this, he formalized the notion of computation into what we now know as a Turing machine.

Turing machines

First, we'll try to understand what a Turing machine is informally. Recall from CISC 223 the idea of the finite automaton. This is considered the simplest model of computation and is just some finite set of states, or memory, and some transitions. We can do a lot of useful things with FAs, but there are many languages that FAs aren't powerful enough to recognize, such as $\{a^n b^n \mid n \geq 0\}$. However, we can increase the power of the machine by adding a stack, giving us the pushdown automaton. This gives us the ability to recognize a larger class of languages.
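As a reminder of just how little machinery an FA has, here is a minimal sketch in Python. The particular language (strings over $\{a, b\}$ with an even number of a's) and the state names are illustrative choices, not from the lecture: the point is that the only memory is the current state.

```python
# A finite automaton sketch: two states, a transition table, and no
# auxiliary memory of any kind. (The language and state names here are
# illustrative choices.)
def dfa_even_as(w):
    state = "even"  # start state; also the only accepting state
    delta = {
        ("even", "a"): "odd", ("odd", "a"): "even",
        ("even", "b"): "even", ("odd", "b"): "odd",
    }
    for ch in w:
        state = delta[(state, ch)]
    return state == "even"
```

Because the machine has only finitely many states, it cannot count an unbounded number of a's, which is exactly why no FA recognizes $\{a^n b^n \mid n \geq 0\}$.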

We can think of a Turing machine in the same way. A Turing machine is just a finite automaton augmented with some additional capacity that we call a tape. We'll define what this is formally.

A Turing machine is a 7-tuple $M = (Q,\Sigma,\Gamma,\delta,q_i,q_A,q_R)$, where

  1. $Q$ is a finite set of states,
  2. $\Sigma$ is the input alphabet, which does not contain the blank symbol $\Box$,
  3. $\Gamma$ is the tape alphabet, where $\Box \in \Gamma$ and $\Sigma \subseteq \Gamma$,
  4. $\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$ is the transition function,
  5. $q_i \in Q$ is the initial state,
  6. $q_A \in Q$ is the accepting state, and
  7. $q_R \in Q$ is the rejecting state, where $q_R \neq q_A$.

In our model, we have a one-sided infinitely long tape, with the machine starting its computation with the input on the tape beginning at the left end of the tape. The input string is followed by infinitely many blank symbols $\Box$ at the right.
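To make the definition concrete, here is a minimal sketch of a Turing machine simulator in Python, together with a machine for the language $\{a^n b^n \mid n \geq 0\}$ mentioned earlier. The state names, the dictionary representation of the tape, the step budget, and the convention that an undefined transition sends the machine to the rejecting state are all illustrative choices, not part of the formal definition.

```python
BLANK = "_"

def run_tm(delta, input_str, start="q0", accept="qA", reject="qR",
           max_steps=10_000):
    """Simulate a TM with delta: (state, symbol) -> (state, symbol, 'L'|'R').
    Returns 'accept', 'reject', or 'no halt' if max_steps is exceeded."""
    tape = dict(enumerate(input_str))  # one-sided tape, blanks implicit
    head, state = 0, start
    for _ in range(max_steps):
        if state == accept:
            return "accept"
        if state == reject:
            return "reject"
        sym = tape.get(head, BLANK)
        if (state, sym) not in delta:  # convention: undefined moves reject
            return "reject"
        state, tape[head], move = delta[(state, sym)]
        head = max(0, head + (1 if move == "R" else -1))
    return "no halt"

# A machine for { a^n b^n : n >= 0 }: repeatedly cross off one 'a' (as X)
# and one matching 'b' (as Y), then check that nothing unmatched remains.
anbn = {
    ("q0", "a"): ("q1", "X", "R"),      # mark an 'a', go find a 'b'
    ("q0", "Y"): ("q3", "Y", "R"),      # all a's marked: verify the rest
    ("q0", BLANK): ("qA", BLANK, "R"),  # empty input is in the language
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "b"): ("q2", "Y", "L"),      # mark the matching 'b', head back
    ("q2", "a"): ("q2", "a", "L"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),      # back at the frontier, repeat
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("qA", BLANK, "R"),  # everything matched: accept
}
```

Running `run_tm(anbn, "aabb")` accepts, while `run_tm(anbn, "aab")` rejects. Note that a simulator can never truly report "does not halt"; it can only report that the machine has not halted within some step budget, a point we will return to shortly.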

We should note that this view of the Turing machine is kind of ahistorical. The Turing machine was introduced by Turing in 1936, while the finite automaton model was introduced by Rabin and Scott in the 1950s. Of course, the two are related and it's very helpful to relate the two models in this way, but Turing definitely didn't think of the machines he defined like this.

Instead, what Turing attempted to do was formalize how computers worked, by which we mean the 1930s conception of what a computer is: a human being carrying out a calculation. In other words, he tried to formalize how human computers might solve a problem. We can view the finite state system as a set of instructions we might give to a person, telling them what to do when they read a particular symbol. We allow this person to read and write on some paper, and we let them have as much paper as they want.

In the course of this computation, one of three outcomes will occur.

  1. The machine enters the accepting state, accepts, and halts.
  2. The machine enters the rejecting state, rejects, and halts.
  3. The machine never enters the accepting or rejecting state, and therefore never halts.
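The three outcomes can be illustrated with a toy machine of my own devising (not from the lecture): it accepts inputs starting with 'b', rejects inputs starting with 'c', and on inputs starting with 'a' it marches right forever, never entering the accepting or rejecting state. Since a simulation can only run finitely many steps, the third outcome can only be observed as "has not halted yet".

```python
def outcome(input_str, max_steps=1000):
    """Run a hypothetical three-outcome machine on input_str."""
    tape, head, state = dict(enumerate(input_str)), 0, "q0"
    delta = {
        ("q0", "b"): ("qA", "b", "R"),  # enter the accepting state and halt
        ("q0", "c"): ("qR", "c", "R"),  # enter the rejecting state and halt
        ("q0", "a"): ("q0", "a", "R"),  # move right forever over blanks...
        ("q0", "_"): ("q0", "_", "R"),  # ...never reaching qA or qR
    }
    for _ in range(max_steps):
        if state in ("qA", "qR"):
            return "accepts" if state == "qA" else "rejects"
        sym = tape.get(head, "_")
        state, tape[head], move = delta[(state, sym)]
        head += 1 if move == "R" else -1
    return "does not halt (within the step budget)"
```

The hedge in the last return value is the crux of the matter: from the outside, "still running" and "will run forever" look exactly the same, which is why the distinction between recognizing and deciding a language (defined below) matters.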

For a Turing machine $M$ and an input word $w$, if the computation of $M$ on $w$ enters the accepting state, then we say that $M$ accepts $w$. Likewise, if the computation of $M$ on $w$ enters the rejecting state, then we say $M$ rejects $w$.

We call the set of strings that are accepted by $M$ the language of $M$, denoted $L(M)$.

A Turing machine $M$ recognizes a language $L$ if every word in $L$ is accepted by $M$ and for every input not in $L$, $M$ either rejects the word or does not halt. A language $L$ is recognizable if some Turing machine recognizes it.

A Turing machine $M$ decides a language $L$ if every word in $L$ is accepted by $M$ and every input not in $L$ is rejected by $M$. A language $L$ is decidable if some Turing machine decides it.