Let $A$ be an $n \times n$ matrix. A scalar $\lambda$ is an eigenvalue of $A$ if $A \mathbf x = \lambda \mathbf x$ for some nonzero vector $\mathbf x$. Such a vector $\mathbf x$ is an eigenvector corresponding to $\lambda$.
This definition tells us that eigenvectors are those vectors that a transformation only scales: the vector stays on the same line through the origin, so its direction doesn't change (apart from being reversed when $\lambda$ is negative). In some sense, we can think of these vectors as remaining stable under the transformation. This suggests that they capture some information about the structure of the transformation.
Let $A = \begin{bmatrix} 7 & -9 \\ 2 & -2 \end{bmatrix}$. The vector $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$ is an eigenvector of $A$. Its associated eigenvalue is 4. To see this, we have: \[\begin{bmatrix} 7 & -9 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 12 \\ 4 \end{bmatrix} = 4 \begin{bmatrix} 3 \\ 1 \end{bmatrix}.\]
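The check above is easy to reproduce by machine. Here is a minimal sketch in plain Python; the helper name `mat_vec` is made up for illustration and matrices are just lists of rows.

```python
def mat_vec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[7, -9], [2, -2]]
x = [3, 1]   # claimed eigenvector
lam = 4      # claimed eigenvalue

Ax = mat_vec(A, x)                 # left-hand side, A x
scaled = [lam * xi for xi in x]    # right-hand side, lambda x

print(Ax, scaled)
```

Both lists come out equal, which is exactly the statement $A \mathbf x = \lambda \mathbf x$ for this pair.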
Traditionally, the motivation for eigenvectors is in the evolution of linear systems: if we view a matrix $A$ as a process applied repeatedly to some input vector $\mathbf x$, then we can view $A^t \mathbf x$ as the state of the system at time step $t$. Obviously, computing powers of a matrix is expensive.
Eigenvectors give us a way around this. For an eigenvector $\mathbf x$ with eigenvalue $\lambda$, we have \[A^k \mathbf x = A^{k-1} \lambda \mathbf x = \lambda A^{k-1} \mathbf x = \lambda^2 A^{k-2} \mathbf x = \cdots = \lambda^k \mathbf x.\] This leads to the following ideas:
The set of eigenvalues of $A$ is called the eigenspectrum or spectrum of $A$. From this, we get spectral analysis: studying matrices (and other mathematical objects that can be represented as matrices, such as graphs) based on their eigenvalues and eigenvectors and the structures that they admit.
But we'll see that although these are useful tools in their own right, what we would really like is an analogous tool for any matrix, not just square ones. Eigenvalues and eigenvectors are only the first step towards this goal.
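As a quick numerical check of the identity $A^k \mathbf x = \lambda^k \mathbf x$ from earlier, the following sketch compares $k$ successive matrix-vector products against a single scaling by $\lambda^k$. It uses the matrix and eigenvector from the example above; the helper names are hypothetical.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def apply_repeatedly(A, x, k):
    """Compute A^k x by k successive matrix-vector products."""
    for _ in range(k):
        x = mat_vec(A, x)
    return x

A = [[7, -9], [2, -2]]
x = [3, 1]      # eigenvector of A with eigenvalue 4
lam, k = 4, 5

slow = apply_repeatedly(A, x, k)     # k matrix-vector products
fast = [lam**k * xi for xi in x]     # one scalar power, one scaling

print(slow, fast)
```

The shortcut only applies when $\mathbf x$ is an eigenvector; for a general input vector the two computations would not agree.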
How do we find eigenvectors? We take our eigenvector equation and rewrite it. The equation $A \mathbf x = \lambda \mathbf x$ becomes \[(A - \lambda I) \mathbf x = \mathbf 0.\] Then this becomes almost the usual question that we've been dealing with all along: solving for $\mathbf x$. Of course, there's an extra step here: what is $\lambda$? To answer that, we have to ask what $A - \lambda I$ is. Obviously, we do not want $A - \lambda I = 0$.
However, if we take a look at this equation, we're back to solving something of the form $B \mathbf x = \mathbf 0$. One vector that satisfies this is $\mathbf x= \mathbf 0$, which we also do not want.
In order for $(A - \lambda I) \mathbf x = \mathbf 0$ to have a nonzero solution $\mathbf x$, it must be the case that $A - \lambda I$ is not invertible, in which case the null space of $A - \lambda I$ contains more than just the zero vector.
What we need now is a systematic way of computing $\lambda$ based on this information. To do that, we need to discuss the determinant of a matrix.
Determinants are useful quantities that say something important about square matrices (rectangular matrices do not have determinants). Specifically, they quantify how much a transformation scales "volume". Unfortunately, like matrix inverses, they are a real pain to compute and most of their properties are not particularly relevant for us. However, we do need them for one thing: being able to compute eigenvalues. This makes sense because both determinants and eigenvalues say something about the transformation. So we must discuss how to compute a determinant.
Classically, determinants are defined recursively. We begin with our base case, the $2 \times 2$ matrix.
Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then the determinant of $A$ is $\det A = ad - bc$.
You will sometimes see the determinant denoted by surrounding the matrix with bars instead of square brackets: \[\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad-bc.\]
What does this say about linear transformations? Consider the effect of this matrix on the standard basis vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
Notice that this essentially maps the unit square onto a parallelogram whose sides are the columns of the matrix, at least when viewed in $\mathbb R^2$. An interesting question we can ask is what the area of this parallelogram is, viewing the unit box as having an area of 1. This value is exactly the determinant (up to sign), and this is what the determinant signifies: it is the factor by which the transformation scales area.
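To make the area interpretation concrete, here is a small sketch that compares $ad - bc$ with the area of the image parallelogram, computed independently with the shoelace formula. The matrix is an arbitrary example, not one from these notes, and the function names are made up.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

def parallelogram_area(u, v):
    """Signed area of the parallelogram with vertices 0, u, u+v, v (shoelace formula)."""
    pts = [(0, 0), u, (u[0] + v[0], u[1] + v[1]), v]
    total = 0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

A = [[2, 1], [1, 3]]        # arbitrary example matrix
u = (A[0][0], A[1][0])      # image of (1, 0): the first column of A
v = (A[0][1], A[1][1])      # image of (0, 1): the second column of A

print(det2(A), parallelogram_area(u, v))
```

The shoelace area is signed, so swapping the two columns flips the sign of both quantities together.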
We can generalize this idea to 3 dimensions (thinking about volume instead of area) or more. The textbook contains details about how the volume arises from this computation in the 3-dimensional case. But we are more concerned with the formula: determinants for a $3 \times 3$ matrix are defined as follows.
Let $A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$. Then $\det A = a(ei-fh) - b (di-fg) + c(dh-eg)$.
Be careful: notice that the sign for the terms alternates!
The idea here is that we go through each entry in the first row and consider the determinant of the $2 \times 2$ matrix obtained by removing the row and column that the entry is in. That is, \[\det A = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}.\]
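As a small worked instance (the matrix is chosen arbitrarily for illustration), expanding along the first row gives \[\begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 10 \end{vmatrix} = 1 \begin{vmatrix} 5 & 6 \\ 8 & 10 \end{vmatrix} - 2 \begin{vmatrix} 4 & 6 \\ 7 & 10 \end{vmatrix} + 3 \begin{vmatrix} 4 & 5 \\ 7 & 8 \end{vmatrix} = 1(50 - 48) - 2(40 - 42) + 3(32 - 35) = 2 + 4 - 9 = -3.\]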
For general $n$, we can extend this idea and find that the determinant formula depends on computing the determinant for $n$ different $(n-1) \times (n-1)$ matrices. This definition for the determinant is called the Laplace expansion, due to the 18th c. French mathematician Pierre-Simon Laplace. Notice that because we end up having to compute roughly $n!$ determinants, it is not actually computationally feasible to compute determinants for sufficiently large matrices in this way. For hand computation and small $n$, this will be fine.
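The recursive definition translates almost directly into code. The following is a minimal sketch of the expansion along the first row, assuming the matrix is a square list of rows; the name `det_laplace` is made up, and this is meant for small $n$ only, for the reason given above.

```python
def det_laplace(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:
        # Base case: ad - bc
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        sign = -1 if j % 2 else 1   # signs alternate +, -, +, ...
        total += sign * A[0][j] * det_laplace(minor)
    return total

print(det_laplace([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
```

Each call on an $n \times n$ matrix makes $n$ recursive calls on $(n-1) \times (n-1)$ matrices, which is where the factorial growth comes from.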
Here are some useful properties of determinants.
These properties give us some interesting ideas and in particular lead to faster methods for computing determinants.
Recall that if $A$ is invertible, then it can be decomposed into $A = LU$ (possibly after some row exchanges), where both $L$ and $U$ are triangular matrices. We can make use of these properties: both $\det L$ and $\det U$ can be computed by multiplying their diagonals. Then we have $\det A = \det L \det U$, with one sign flip for each row exchange. But what if $A$ isn't invertible? Then $\det A = 0$, which we'll discover when LU factorization fails!
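Here is a rough sketch of this idea: run elimination, track row exchanges, and multiply the pivots. It assumes the usual convention that $L$ has ones on its diagonal, so $\det L = 1$ and only the diagonal of $U$ matters. The function name is hypothetical, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via elimination: product of pivots, times -1 per row exchange."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]   # working copy, exact arithmetic
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column
        pivot_row = next((r for r in range(col, n) if U[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)      # no pivot: the matrix is singular
        if pivot_row != col:
            U[col], U[pivot_row] = U[pivot_row], U[col]
            sign = -sign            # each row exchange flips the sign
        for r in range(col + 1, n):
            factor = U[r][col] / U[col][col]
            U[r] = [ur - factor * uc for ur, uc in zip(U[r], U[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= U[i][i]           # U is triangular: multiply the diagonal
    return result

print(det_by_elimination([[7, -9], [2, -2]]))
```

This takes on the order of $n^3$ arithmetic operations, compared with the factorial growth of the recursive expansion.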
What is the determinant of an orthogonal matrix $Q$? We know that $\det Q = \det Q^T$ and $Q^T Q = I$, so $(\det Q)^2 = \det Q^T \det Q = \det(Q^T Q) = \det I = 1$. Therefore we must have that $\det Q = \pm 1$.
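Both signs actually occur. As an illustration (these two matrices are standard examples, not taken from the text), a rotation by an angle $\theta$ and a reflection across the horizontal axis are both orthogonal, and \[\begin{vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{vmatrix} = \cos^2\theta + \sin^2\theta = 1, \qquad \begin{vmatrix} 1 & 0 \\ 0 & -1 \end{vmatrix} = -1.\]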
The last property is actually where the definition of singular matrix comes from; we saw this earlier simply as square matrices that are not invertible. If we consider the area/volume view of the determinant, this makes sense: a singular matrix has dependent columns, which means one of the dimensions of our parallelepiped collapses and the resultant space has 0 area/volume.
If you read the text, you'll find that determinants allow you to compute the inverse of a matrix without performing elimination. Personally, I think this is a scam which students are too quick to accept because they're tired of doing elimination. But I find computing the determinant and remembering the process to compute the inverse (called Cramer's rule) even more exhausting than just buckling down and doing the elimination, which we know how to do by now. However, if you're more into memorizing formulas, you may find this a more convenient way to compute the inverse.
Recall that $A - \lambda I$ is not invertible. Then $\det(A - \lambda I) = 0$. This is the key we need to solve for the eigenvalues $\lambda$.
Let $A = \begin{bmatrix} 7 & -9 \\ 2 & -2 \end{bmatrix}$. We have \[A - \lambda I = \begin{bmatrix} 7 - \lambda & -9 \\ 2 & -2-\lambda \end{bmatrix}.\] Then the determinant of this matrix is \[(7-\lambda)(-2-\lambda) - (-9)(2) = \lambda^2 - 5\lambda - 14 + 18 = \lambda^2 - 5\lambda + 4.\] Recall that we need this determinant to be 0, so the eigenvalues $\lambda$ are the roots of this polynomial. Indeed, this polynomial factors as $(\lambda - 4)(\lambda - 1)$, so we have $\lambda = 4, 1$.
The polynomial $\det(A - \lambda I)$ is the characteristic polynomial of $A$. The eigenvalues of $A$ are the roots of the characteristic polynomial of $A$.
One of the implications from this definition is that an $n \times n$ matrix will have a characteristic polynomial of degree $n$. This comes from having $n$ $\lambda$'s along the diagonal of the matrix $A - \lambda I$.
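For the $2 \times 2$ case, expanding $\det(A - \lambda I)$ for $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ gives $\lambda^2 - (a + d)\lambda + (ad - bc)$, so the eigenvalues can be read off with the quadratic formula. A minimal sketch, with a made-up function name and handling only the case of real roots:

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = A
    trace = a + d               # coefficient of -lambda
    det = a * d - b * c         # constant term
    disc = trace**2 - 4 * det   # discriminant of lambda^2 - trace*lambda + det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)

print(eigenvalues_2x2([[7, -9], [2, -2]]))
```

For larger matrices there is no such closed form in general, and numerical libraries do not find eigenvalues by solving the characteristic polynomial directly.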
Once we have our eigenvalues, to find the eigenvectors, we substitute each eigenvalue into $A - \lambda I$ and solve the equation $(A - \lambda I)\mathbf x = \mathbf 0$.
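As a worked instance of this step, take $A = \begin{bmatrix} 7 & -9 \\ 2 & -2 \end{bmatrix}$, whose eigenvalues are $4$ and $1$. For $\lambda = 4$, we solve \[(A - 4I)\mathbf x = \begin{bmatrix} 3 & -9 \\ 2 & -6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \mathbf 0.\] Both rows say $x_1 = 3x_2$, so $\mathbf x = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$ is an eigenvector. For $\lambda = 1$, we solve \[(A - I)\mathbf x = \begin{bmatrix} 6 & -9 \\ 2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \mathbf 0.\] Both rows say $2x_1 = 3x_2$, so $\mathbf x = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$ is an eigenvector. In each case, any nonzero scalar multiple of the vector we found is also an eigenvector for the same eigenvalue.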
The entire process to find eigenvalues and eigenvectors is summarized:
There are a few things to watch out for at this point.