Theorem (Rice). All non-trivial semantic properties of Turing machines are undecidable.
In other words, given a Turing machine $M$, we cannot decide any property of $L(M)$ except for trivial properties, i.e. those that hold for either all of the languages recognized by Turing machines or none of them.
Proof. Suppose, for the sake of contradiction, that $P$ is a non-trivial property that is decidable, and let $R_P$ be a TM that decides $P$. We will reduce TM membership to deciding $P$.
Let $T_\emptyset$ be a TM that always rejects, so $L(T_\emptyset) = \emptyset$. Without loss of generality, we assume that $\langle T_\emptyset \rangle \not\in P$, since otherwise, we can just use $\overline P$ instead. Since $P$ is non-trivial, there exists a TM $T$ with $\langle T \rangle \in P$.
We construct a Turing machine $S$ that decides $A_{TM}$ by distinguishing between $T_\emptyset$ and $T$. On input $\langle M, w \rangle$, $S$ constructs the following machine $M_w$: on input $x$, $M_w$ first simulates $M$ on $w$; if $M$ accepts $w$, then $M_w$ goes on to simulate $T$ on $x$ and accepts if and only if $T$ does. Then $L(M_w) = L(T)$ if $M$ accepts $w$ and $L(M_w) = \emptyset$ otherwise. Thus, $\langle M_w \rangle \in P$ iff $M$ accepts $w$, so $S$ can decide $A_{TM}$ by running $R_P$ on $\langle M_w \rangle$, contradicting the undecidability of $A_{TM}$. $$\tag*{$\Box$}$$
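The shape of the reduction can be sketched in Python if we model Turing machines as plain predicates and simulation as a function call. This is a toy model: a machine that loops on $w$ simply never returns, which is exactly the behavior the proof exploits.

```python
def make_M_w(M, w, T):
    """Build the machine M_w from the proof (as a Python closure).

    On input x, M_w first simulates M on w; only if M accepts does it go
    on to simulate T on x.  Hence L(M_w) = L(T) if M accepts w, and
    L(M_w) is empty otherwise.
    """
    def M_w(x):
        if not M(w):      # may never return if M loops on w
            return False  # reject everything: L(M_w) = empty set
        return T(x)       # accept iff T accepts x: L(M_w) = L(T)
    return M_w

# Toy example: T recognizes strings containing "ab"; M accepts only "yes".
T_toy = lambda x: "ab" in x
M_toy = lambda w: w == "yes"
```

Running `make_M_w(M_toy, "yes", T_toy)` yields a machine that behaves like `T_toy`, while `make_M_w(M_toy, "no", T_toy)` yields a machine that rejects everything.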
Rice's theorem was proved by Rice in 1951 (the result appeared in a journal later in 1953). It is interesting to note that a similar result was discussed by Turing in his original 1936 paper about undecidability. Furthermore, there is a similar result characterizing undecidable properties of languages for subclasses of decidable languages (i.e. context-free languages, context-sensitive languages, etc.) called Greibach's theorem. As you might expect, this was shown by Greibach in 1963.
Suppose we have a collection of dominoes. A domino is a tile that has a string on the top and a string on the bottom: $\binom{a}{ab}$. A collection of dominoes looks like $$ \left\{ \binom{b}{ca}, \binom{a}{ab}, \binom{ca}{a}, \binom{abc}{c} \right\}.$$ We want to find a match: a sequence of dominoes drawn from our collection, with repetition allowed, such that reading across the tops gives the same string as reading across the bottoms: $$ \binom{a}{ab} \binom{b}{ca} \binom{ca}{a} \binom{a}{ab} \binom{abc}{c}.$$ Here, both the top and the bottom read $abcaaabc$.
Formally, an instance of the Post Correspondence Problem is a sequence of pairs of strings $$(u_1,v_1),(u_2,v_2),\dots,(u_k,v_k)$$ where $u_i, v_i \in \Sigma^*$ for $i = 1,\dots,k$. A solution of such an instance is a sequence $i_1,i_2,\dots,i_n \in \{1,\dots,k\}$ such that $$u_{i_1} u_{i_2} \cdots u_{i_n} = v_{i_1} v_{i_2} \cdots v_{i_n}.$$ The Post Correspondence Problem asks whether or not a given instance has a solution. We define the language of PCP instances that have a solution to be $$L_{PCP} = \{ \langle P \rangle \mid \text{$P$ is a PCP instance that has a solution} \}.$$ Then we have the following result.
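Since PCP is undecidable, no program can decide it in general, but a bounded brute-force search makes the definitions concrete. This is a sketch; the cutoff `max_len` is arbitrary, and without it the search would only be a semi-decider.

```python
from itertools import product

def pcp_search(pairs, max_len=6):
    """Brute-force search for a PCP solution up to a bounded length.

    `pairs` is a list of (u_i, v_i) string pairs.  Returns a solution as
    a 1-indexed sequence of tile indices, or None if no solution of
    length <= max_len exists.
    """
    k = len(pairs)
    for n in range(1, max_len + 1):
        for seq in product(range(k), repeat=n):
            top = "".join(pairs[i][0] for i in seq)
            bottom = "".join(pairs[i][1] for i in seq)
            if top == bottom:            # the tops and bottoms spell the same string
                return [i + 1 for i in seq]
    return None

# The domino collection from above; a match of length 5 exists.
tiles = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
```

On the collection above, `pcp_search(tiles, 5)` returns a valid match, while an instance like $\{(ab, ba)\}$ has no solution at any length, since the tops always start with $a$ and the bottoms with $b$.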
Theorem. PCP is undecidable.
First, let's talk about how we do this. The idea is to reduce from $A_{TM}$: for any Turing machine $M$ and input string $w$, we can construct an instance of PCP whose matches correspond to accepting computation histories of $M$ on $w$. Thus, deciding whether a match exists amounts to deciding whether $M$ accepts $w$.
Proof. Suppose there is a TM $R$ that decides PCP. We construct a TM $S$ that decides $A_{TM}$. Let $M = (Q,\Sigma,\Gamma,\delta,q_0,q_A,q_R)$. We will have $S$ construct a PCP instance $P$ that has a match if and only if $M$ accepts $w$. To start, we will define the pairs of an auxiliary instance $P'$.
First, we show how to do this with a restricted case of PCP where we require a particular starting pair. Later, we will show how to remove this restriction. The starting pair is defined $$\binom{\#}{\#q_0 w_1 w_2 \cdots w_n \#}.$$ Note that this is the starting configuration for $M$ on $w$.
Next, for every $a,b \in \Gamma$ and every $q,q' \in Q$ where $q \neq q_R$, if $\delta(q,a) = (q',b,R)$, add the pair $\binom{qa}{bq'}$. These pairs handle transitions for the tape head moving to the right.
For every $a,b,c \in \Gamma$ and every $q,q' \in Q$ where $q \neq q_R$, if $\delta(q,a) = (q',b,L)$, add $\binom{cqa}{q'cb}$. These pairs handle transitions for the tape head moving to the left.
For every $a \in \Gamma$, add $\binom a a$. These pairs correspond to tape contents that aren't near the tape head.
Add $\binom \# \#$ and $\binom{\#}{\Box\#}$. These pairs handle the delimiters between the configurations.
For every $a \in \Gamma$, add $\binom{a q_A}{q_A}$ and $\binom{q_A a}{q_A}$. This handles when the computation reaches the accept state. Note that the top lags behind the bottom, so we need to add some extra steps in order for us to get the match that we want.
Finally, we add the pair $\binom{q_A\#\#}{\#}$ to finish.
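Assembled in code, the construction above reads as follows. This is a sketch with made-up conventions: states and tape symbols are strings, `_` stands for the blank symbol, and $\delta$ is a Python dict.

```python
def restricted_pcp_tiles(w, gamma, delta, q0, q_acc, q_rej):
    """Build the tile set P' from the construction above.

    `delta` maps (state, symbol) -> (state, symbol, direction), where the
    direction is "L" or "R"; `gamma` is the tape alphabet as a string of
    single-character symbols, with "_" for the blank.  A match forced to
    start from the first tile spells out an accepting computation
    history of the machine on w.
    """
    tiles = [("#", "#" + q0 + w + "#")]            # starting tile
    for (q, a), (q2, b, d) in delta.items():
        if q == q_rej:
            continue
        if d == "R":
            tiles.append((q + a, b + q2))          # head moves right
        else:
            for c in gamma:                        # head moves left
                tiles.append((c + q + a, q2 + c + b))
    for a in gamma:
        tiles.append((a, a))                       # copy tape cells away from the head
    tiles.append(("#", "#"))                       # configuration delimiter
    tiles.append(("#", "_#"))                      # delimiter that extends the tape with a blank
    for a in gamma:
        tiles.append((a + q_acc, q_acc))           # consume a symbol left of the head
        tiles.append((q_acc + a, q_acc))           # consume a symbol right of the head
    tiles.append((q_acc + "##", "#"))              # finishing tile
    return tiles
```

For instance, a machine with the single transition $\delta(q, 0) = (A, 0, R)$ yields the starting tile $\binom{\#}{\#q0\#}$ and the transition tile $\binom{q0}{0A}$.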
Now, we've constructed an instance $P'$ of PCP with a restriction on the starting pair, so it doesn't quite work the way we want it to yet. If we just use $P'$ without the restriction, there is a match regardless of whether $M$ accepts $w$: for instance, any single pair of the form $\binom{a}{a}$ is already a match on its own. So we have a bit more work to do.
First, let $u = u_1 u_2 \cdots u_n$ be some string of length $n$. Then we define the following strings: $$\star u = \star u_1 \star u_2 \star \cdots \star u_n, \qquad u \star = u_1 \star u_2 \star \cdots u_n \star, \qquad \star u \star = \star u_1 \star u_2 \star \cdots \star u_n \star.$$
Now, we convert our restricted PCP instance $P'$ that we constructed into a proper PCP instance $P$. If we have $$ P' = \left\{ \binom{u_1}{v_1}, \binom{u_2}{v_2}, \dots , \binom{u_k}{v_k} \right\},$$ where $\binom{u_1}{v_1} = \binom{\#}{\#q_0 w_1 w_2 \cdots w_n \#}$, then we define $P$ to be the set $$ \left\{ \binom{\star u_1}{\star v_1 \star}, \binom{\star u_1}{v_1 \star}, \binom{\star u_2}{v_2 \star}, \dots, \binom{\star u_k}{v_k \star}, \binom{\star \triangle}{\triangle} \right\}.$$
Then in the PCP instance $P$, the tile $\binom{\star u_1}{\star v_1 \star}$ is the only one that can come first, since it is the only tile whose top and bottom begin with the same symbol, $\star$. The added $\star$s don't affect anything else: within a match, they are simply interleaved with the symbols of the original strings. Finally, the bottom of any partial match ends with a $\star$ that the top has not yet matched, so the final tile is always $\binom{\star \triangle}{\triangle}$, which takes care of this extra $\star$. $$\tag*{$\Box$}$$
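The $\star$-conversion is mechanical enough to write down directly. In this sketch, the characters `*` and `$` stand in for $\star$ and $\triangle$.

```python
def pre_star(u):
    """The string *u: a star BEFORE every symbol of u."""
    return "".join("*" + ch for ch in u)

def post_star(u):
    """The string u*: a star AFTER every symbol of u."""
    return "".join(ch + "*" for ch in u)

def to_proper_pcp(restricted):
    """Convert a restricted PCP instance (whose match must start with the
    first pair) into an ordinary one, as in the proof."""
    u1, v1 = restricted[0]
    tiles = [(pre_star(u1), "*" + post_star(v1))]        # the only possible first tile
    tiles += [(pre_star(u), post_star(v)) for u, v in restricted]
    tiles.append(("*$", "$"))                            # the only possible last tile
    return tiles
```

On the one-tile restricted instance $\{(ab, ab)\}$, whose match is just its first tile, the conversion produces first tile $\binom{*a*b}{*a*b*}$ and last tile $\binom{*\$}{\$}$, and those two tiles together form a match in the converted instance.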
Post's Correspondence Problem dates back to 1946. While we've shown that it is undecidable via Turing machine simulation, Post's original proof of undecidability uses a string rewriting formalism that Post introduced in 1943, which we now call Post canonical systems. The languages generated by such systems are exactly the recognizable languages, and so these systems are Turing-complete.
Of course, this isn't just a neat trick to show that there are undecidable problems that look deceptively easy. As with any undecidable problem, it can become another tool in our toolbox to show that a problem is undecidable. For instance, let's consider the problem of intersection emptiness for CFGs, $$ISE_{CFG} = \{ \langle G_1, G_2 \rangle \mid \text{$G_1, G_2$ are CFGs, $L(G_1) \cap L(G_2) = \emptyset$} \}.$$
Theorem. $ISE_{CFG}$ is undecidable.
Proof. We assume that we can decide intersection emptiness for context-free languages. Let $\Sigma = \{a,b\}$ and let $I$ be an arbitrary PCP instance over $\Sigma$, $$(\alpha_1, \beta_1), \dots, (\alpha_k,\beta_k).$$ We define the following languages over the alphabet $\Sigma \cup \{c\} = \{a,b,c\}$: $$L_0 = \{ \alpha_{i_1} \cdots \alpha_{i_n} c \, (\beta_{i_1} \cdots \beta_{i_n})^R \mid n \geq 1, \; i_1, \dots, i_n \in \{1,\dots,k\} \}, \qquad L_{mi} = \{ w c w^R \mid w \in \Sigma^* \}.$$
Both of these languages are context-free. Then we observe that $L_0 \cap L_{mi} \neq \emptyset$ if and only if $I$ has a solution. Since $L_0$ and $L_{mi}$ are context-free, if we could decide emptiness of the intersection of context-free languages, then we could also decide whether or not an arbitrary instance of PCP has a solution. $$\tag*{$\Box$}$$
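As a small sanity check of the correspondence, under the reading $L_0 = \{\alpha_{i_1} \cdots \alpha_{i_n} c \, (\beta_{i_1} \cdots \beta_{i_n})^R\}$ and $L_{mi} = \{wcw^R\}$, we can build a witness string from a PCP solution and verify that it lands in $L_{mi}$. This is a sketch assuming those definitions.

```python
def witness(instance, solution):
    """Given a PCP solution i_1,...,i_n for `instance`, build the string
    alpha_{i_1}...alpha_{i_n} c reverse(beta_{i_1}...beta_{i_n}),
    which is in L_0 by construction."""
    alphas = "".join(instance[i - 1][0] for i in solution)
    betas = "".join(instance[i - 1][1] for i in solution)
    return alphas + "c" + betas[::-1]

def in_L_mi(s):
    """Membership test for the mirror language L_mi = { w c w^R : w in {a,b}* }."""
    w, sep, rest = s.partition("c")
    return sep == "c" and "c" not in rest and rest == w[::-1]

# The sequence (1, 2) solves the instance below: the alphas and betas both
# spell "abb", so the witness lies in L_0 and L_mi simultaneously.
inst = [("a", "ab"), ("bb", "b")]
```

Here `witness(inst, [1, 2])` produces the string $abbcbba$, which `in_L_mi` accepts, while a string like $abcab$ is rejected because the part after $c$ is not the reverse of the part before it.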
This brings us to the end of computability theory, at least as far as most computer scientists are concerned. Up until now, we've been concerned with whether or not a problem is computationally solvable, and we have lots of examples of useful problems that turned out to be computationally unsolvable. However, just because a problem is solvable in principle doesn't mean that we can solve it in practice. The real challenge is whether a problem can be solved using a reasonable amount of resources. This additional constraint introduces the kinds of questions that complexity theory seeks to answer.
What kinds of resources are we talking about? The most common ones that we deal with are time and space. However, there are all sorts of other resources we can consider. We can consider the cost of querying a black box algorithm, the cost of acquiring random bits, or the cost of communication between multiple machines.
The goal of complexity theory is to be able to answer questions about how complex or difficult a problem is in terms of the resources that are required to compute it. We study this by examining the structure of various problems. Ultimately, we want to be able to talk about complexity and problems in a general sense, but like we did for computability theory, we need to fix a model of computation to start out and generalize later on.
We will be using the Turing machine as our standard model of computation. The Turing machine gives us a convenient way to talk about time and space. Namely, we can consider time in terms of the number of steps a Turing machine takes. This is useful because a TM step is a discrete unit and it always consists of the same actions: read, write, change state, and move the tape head. Similarly, the Turing machine has a natural concept of space, which we consider to be the number of tape cells that need to be used by the machine. Again, a tape cell is a discrete unit that is simple to count.
Let $M$ be a deterministic Turing machine that halts on all inputs. The running time or time complexity of $M$ is the function $T:\mathbb N \to \mathbb N$, where $T(n)$ is the maximum number of steps that $M$ uses on any input of length $n$. If $T(n)$ is the running time of $M$, we say that $M$ runs in time $T(n)$ and that $M$ is a $T(n)$ time Turing machine.
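To make the definition concrete, here is a small step-counting simulator. The dict encoding of $\delta$ and the default state names are our own conventions, not part of the definition.

```python
from itertools import product

def run_tm(delta, w, q0="q", q_acc="A", q_rej="R", blank="_", limit=10_000):
    """Simulate a single-tape deterministic TM and count its steps.

    `delta` maps (state, symbol) -> (state, symbol, direction).  Returns
    (accepted, steps).  The step limit is only a guard: running time is
    defined only for machines that halt on all inputs."""
    tape = dict(enumerate(w))
    q, head, steps = q0, 0, 0
    while q not in (q_acc, q_rej):
        if steps >= limit:
            raise RuntimeError("step limit exceeded")
        a = tape.get(head, blank)
        q, b, d = delta[(q, a)]        # one step: read, write, change state,
        tape[head] = b                 # and move the tape head
        head += 1 if d == "R" else -1
        steps += 1
    return q == q_acc, steps

def T(delta, n, alphabet="01"):
    """T(n): the MAXIMUM number of steps over all inputs of length n."""
    return max(run_tm(delta, "".join(w))[1] for w in product(alphabet, repeat=n))

# A machine that scans right to the first blank and accepts: it takes
# n + 1 steps on every input of length n, so its running time is T(n) = n + 1.
scan = {("q", "0"): ("q", "0", "R"),
        ("q", "1"): ("q", "1", "R"),
        ("q", "_"): ("A", "_", "R")}
```

For the `scan` machine, `T(scan, 3)` evaluates to 4, matching $T(n) = n + 1$.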