CAPP 30271 — Lecture 12

Basis and Dimension

We've been focused on solving these linear systems but if we step back a bit, we notice that there are some coincidences that keep popping up. For instance, we can see the rank of a matrix has a deep connection with the number of vectors we care about. But what exactly is that connection?

First, let's recall the following definition.

Let $S = \{v_1, \dots, v_k\}$ be a set of vectors in the vector space $\mathbf V$. We say that $v_1, \dots, v_k$ span $\mathbf V$ if every vector in $\mathbf V$ can be written as a linear combination of vectors in $S$.

Something to note here is that span is used in two slightly different senses: as a verb, we say that a set of vectors spans a space; as a noun, the span of a set of vectors is the set of all linear combinations of those vectors.

Here's one question to start with that we keep coming back to. Recall the column space of a matrix is spanned by its columns. The question we had was whether we needed all of those columns. Our factorization of $A$ into $CR$ suggests that the answer is sometimes no. We've also asked the same question of the null space: how do we know that the special solutions are really the vectors we need and want to describe the null space?

In both of the above cases, we have sets of vectors that span our respective spaces, but we want to say something stronger about these kinds of sets: we want to know that our spanning sets are also linearly independent. This is an important concept, so we give it a name.

A basis for a vector space $V$ is a linearly independent set that spans $V$.

The standard basis for $\mathbb R^3$ is the unit vectors $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. It's clear this set is linearly independent. Do they span $\mathbb R^3$? Well, take a vector of your choice $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$. Then you can write it as \[\begin{bmatrix} x \\ y \\ z \end{bmatrix} = x \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + y \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.\]
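We can check this decomposition numerically. In the following sketch, the vector $(x, y, z) = (2, -1, 4)$ is just an arbitrary choice:

```python
import numpy as np

# The standard basis for R^3
e1 = np.array([1, 0, 0])
e2 = np.array([0, 1, 0])
e3 = np.array([0, 0, 1])

# Any vector's entries are exactly its coordinates in the standard basis
v = np.array([2, -1, 4])
x, y, z = v
reconstructed = x * e1 + y * e2 + z * e3
print(reconstructed)  # [ 2 -1  4]
```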

It's important to remember that a basis is both a linearly independent set and a spanning set. It's possible to have a linearly independent set that doesn't span a space—this says that we have too few vectors. It's also possible to have a spanning set that isn't linearly independent—this says we have too many vectors.

Consider the column space of the matrix $\begin{bmatrix} 3 & 1 \\ 9 & 3 \end{bmatrix}$. We've defined its column space to be the set of vectors that are linear combinations of its columns. However, its columns aren't a basis for the column space. Why? The columns aren't linearly independent: the first column is three times the second. So a basis for the column space would be either one of the columns alone, but not both together.
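A quick numerical check of this: the matrix has rank 1, so its two columns cannot be independent. A sketch using NumPy's `matrix_rank`:

```python
import numpy as np

A = np.array([[3, 1],
              [9, 3]])

# Rank 1: the columns span only a line, so a single column is a basis
print(np.linalg.matrix_rank(A))  # 1

# The second column is the first column scaled by 1/3
print(np.allclose(A[:, 1], A[:, 0] / 3))  # True
```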

Now is a good time to revisit the definition of linear independence. Recall that a set of vectors is linearly independent if no vector in the set can be written as a linear combination of the others. There is an equivalent definition that is often more useful.

We say that the set of vectors $\mathbf v_1, \dots, \mathbf v_n$ is linearly independent if and only if the only solution to \[x_1 \mathbf v_1 + \cdots + x_n \mathbf v_n = \mathbf 0\] is $x_1 = \cdots = x_n = 0$.

We have been making plenty of use of this definition already. For instance, this is the crux of our arguments about the number of solutions for the linear equation $A \mathbf x = \mathbf 0$.

One reason why this definition is preferred is that it makes it clear that linear independence is a property of the set of vectors collectively. When we talk about linear dependence, we often try to find a specific vector that is "dependent" on the others. But that's not how it works, because we can always rewrite the dependence so that another vector plays the role of the "dependent" one.

Here is another interesting consequence of this definition.

Let $\mathbf v_1, \dots, \mathbf v_n$ be a basis. There is exactly one way to write a vector $\mathbf u$ as a linear combination of the basis vectors.
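For a basis of $\mathbb R^n$, we can find these unique coefficients by solving a linear system whose coefficient matrix has the basis vectors as columns; since that matrix is invertible, `np.linalg.solve` returns exactly one answer. A sketch using the basis $(1,1,1), (1,1,0), (1,0,0)$:

```python
import numpy as np

# Columns of B form a basis for R^3
B = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
u = np.array([3, 1, 5])

# B is invertible, so the coefficients are uniquely determined
c = np.linalg.solve(B, u)
print(c)  # [ 5. -4.  2.]
```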

Intuitively, this makes sense—if each vector in the basis is independent of the others, then there should be no way of "scaling" it and adjusting the other vectors to compensate.

But it's important not to confuse this with the idea of a "unique" basis—there is actually no such thing. In fact, a vector space may have infinitely many bases.

Here's another example of a basis for $\mathbb R^3$: $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$. To see this, observe that we have for any $x,y,z$, \[z \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + (y - z) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (x - y) \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} z + (y-z) + (x-y) \\ z + (y - z) \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]


For instance, taking $x = 3$, $y = 1$, $z = 5$:

    >>> import numpy as np
    >>> 5 * np.array([1, 1, 1]) + (1 - 5) * np.array([1, 1, 0]) + (3 - 1) * np.array([1, 0, 0])
    array([3, 1, 5])
    

So if there are so many bases out there for vector spaces, how do we find one? One way is to take advantage of tools we've already been working with.

Consider again our friend the matrix $R = \begin{bmatrix} 1 & 3 & 0 & 1 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. We want to find bases for its column and row spaces. Here, the pivot columns (columns 1 and 3) are an obvious choice of basis for the column space: they are the first two standard basis vectors of $\mathbb R^3$. This tells us that the column space of $R$ is a two-dimensional subspace inside of $\mathbb R^3$. But considering what we said about bases not being unique, this is not our only choice. In fact, any two columns form a basis, except for the first two taken together, since column 2 is three times column 1.
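We can check which pairs of columns of $R$ are independent by computing the rank of each pair; as a sketch, only the pair of columns 1 and 2 fails:

```python
import numpy as np
from itertools import combinations

R = np.array([[1, 3, 0, 1],
              [0, 0, 1, 5],
              [0, 0, 0, 0]])

# Every pair of columns has rank 2 except columns 0 and 1 together,
# since column 1 is 3 times column 0 (0-based indices)
for i, j in combinations(range(4), 2):
    print(i, j, np.linalg.matrix_rank(R[:, [i, j]]))
```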

Now consider the row space of $R$. The two non-zero rows are the obvious choice. This tells us that the row space of $R$ is also a two-dimensional space, but this one is a subspace of $\mathbb R^4$.

Here, we've used the word dimension, but what does "dimension" really mean? There's an intuitive geometric definition—the number of "directions" that we have. But this is not so easy to see after we leave $\mathbb R^3$ and go to higher-dimensional space.

Instead, we will think of the dimension of a space in terms of its basis. We think of the basis as having the "right" size—we have exactly as many vectors as we need and no more. So our notion of dimension will be based on this.

The dimension of a vector space $V$ is the number of vectors in a basis for $V$.

It may seem a bit strange for this number to be based on just the number of vectors, since we can have any number of bases. But the special property here really is the number. This definition comes from the following result.

If $\mathbf u_1, \dots, \mathbf u_m$ and $\mathbf v_1, \dots, \mathbf v_n$ are bases for the same vector space, then $m = n$.

This is really quite special: though we have infinitely many bases for a vector space, all the bases for a given vector space have the same size. So the definition says that the dimension of a vector space is not a "physical" quantity like we're used to imagining, but is really a descriptive quantity—how many vectors do I need to describe the vectors in a space?

Bases for subspaces of a matrix

Why does dimension matter? It's one of the things that ties together the subspaces of a matrix. We've seen a number of subspaces associated with any given $m \times n$ matrix $A$ of rank $r$. There are the obvious ones: the column space of $A$, a subspace of $\mathbb R^m$, and the null space of $A$, a subspace of $\mathbb R^n$.

We saw a not-as-obvious one: the row space of $A$, the span of the rows of $A$, which is also a subspace of $\mathbb R^n$.

As we've done before, the key to identifying the subspaces is by taking $A$ and transforming it into its row-reduced echelon form $R$. We'll use an example to illustrate this.

Let $R = \begin{bmatrix} 1 & 3 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$. We can see that $R$ is a $3 \times 5$ matrix with rank $r = 2$.


    >>> R = np.array([[1, 3, 1, 0, 9], [0, 0, 0, 1, 3], [0, 0, 0, 0, 0]])
    >>> np.linalg.matrix_rank(R)
    2
    

The row space of $R$ has dimension $r$, the rank of $R$. The nonzero rows of $R$ form a basis for the row space of $R$.

In this example, $r = 2$ and there are two nonzero rows. The nonzero rows form a basis for the row space: they are independent since each nonzero row has a 1 in its pivot column while every other row has a 0 there, so no combination of the other rows can produce that entry.
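As a quick numerical check, stacking the two nonzero rows gives a rank-2 matrix, confirming that they are independent:

```python
import numpy as np

R = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0]])

# The two nonzero rows are independent: together they have rank 2
rows = R[:2, :]
print(np.linalg.matrix_rank(rows))  # 2
```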

The column space of $R$ has dimension $r$, the rank of $R$. The pivot columns of $R$ form a basis for the column space of $R$.

In this example $r = 2$ and there are two pivot columns. These pivot columns form a basis for the column space of $R$ (it's pretty clear, since together they form the first two columns of the identity matrix).

The null space of $R$ has dimension $n-r$. The special solutions for $R \mathbf x = \mathbf 0$ form a basis for the null space of $R$.

In this example, $R$ has $5-2 = 3$ free variables, so there are three vectors in the special solutions to $R \mathbf x = \mathbf 0$, one for each free variable. They are \[\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -9 \\ 0 \\ 0 \\ -3 \\ 1 \end{bmatrix}.\] We observe that the special solutions are independent: each vector has a 1 in the spot corresponding to its free variable, while the other vectors have a 0 in that same spot. So the special solutions form a basis for the null space.

Indeed, this can be verified:


    >>> s1 = np.array([-3, 1, 0, 0, 0])
    >>> s2 = np.array([-1, 0, 1, 0, 0])
    >>> s3 = np.array([-9, 0, 0, -3, 1])
    >>> N = np.hstack([s.reshape(5,1) for s in (s1, s2, s3)])
    >>> np.linalg.matrix_rank(N)
    3
    

With this, we have covered the subspaces of $R$. But the row-reduced echelon form matrix $R$ is shared by many matrices, so we need to take these results and connect them to the corresponding spaces of $A$.

The row space of $A$ is the row space of $R$.

To see this, we observe that $R$ is computed from $A$ via elimination. But elimination involves only row operations: swapping two rows, scaling a row by a nonzero constant, and subtracting a multiple of one row from another.

So rows of $R$ are really linear combinations of rows of $A$. However, each of these row operations is invertible. This means that we can go backwards: rows of $A$ can be described as linear combinations of rows of $R$. This means that every linear combination of rows of $A$ is also a linear combination of rows of $R$ and vice versa. So the two row spaces are the same.
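One way to check this numerically, using the example matrix $A$ from later in these notes (its rref is the $R$ above): stacking the rows of $A$ on top of the rows of $R$ doesn't increase the rank, which is what we'd expect if each matrix's rows lie in the other's row space. A sketch:

```python
import numpy as np

A = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 2, 6]])
R = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0]])

# If the row spaces agree, stacking A on top of R adds no new directions
stacked = np.vstack([A, R])
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(stacked))  # 2 2
```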

As a corollary, since the row spaces of $A$ and $R$ are the same, the dimension of the row space of $A$ equals the dimension of the row space of $R$, and the basis we computed for the row space of $R$ also serves as a basis for the row space of $A$.

The dimension of the column space of $A$ is the dimension of the column space of $R$.

Unfortunately, it is not the case that the column space of $R$ is the same as the column space of $A$—only their dimensions are the same. Consider the following example.

Let $A = \begin{bmatrix} 1 & 3 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 2 & 6 \end{bmatrix}$. Its rref is $R$ (just subtract twice row 2 from row 3). However, $\begin{bmatrix} 9 \\ 3 \\ 6 \end{bmatrix}$ is in the column space of $A$, but not in the column space of $R$, since the last row of $R$ is all zeros.

The key difference is that $A$ has nonzero entries in the final row, but $R$ does not. This means the column spaces of the two matrices can't be the same.
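We can confirm this with least squares: the best approximation to $\begin{bmatrix} 9 \\ 3 \\ 6 \end{bmatrix}$ from the column space of $R$ leaves a nonzero residual, while for $A$ the residual is zero, since the vector is literally the last column of $A$. A sketch:

```python
import numpy as np

A = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 2, 6]])
R = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0]])
b = np.array([9, 3, 6])

# Residual of the least-squares projection of b onto each column space:
# ~0 for A (b is in C(A)), but 6 for R (closest vector in C(R) is (9, 3, 0))
for M in (A, R):
    x, *_ = np.linalg.lstsq(M, b, rcond=None)
    print(np.linalg.norm(M @ x - b))
```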

So what can we say about the connection between the columns of $A$ and the columns of $R$? Intuitively, when we take combinations of rows of $A$, we are not changing anything about the positions of the columns of $A$, and they interact with each other in the same ways. Recall that we can represent the row elimination operations as a matrix, say $E$. Then we have \begin{align*} A \mathbf x &= \mathbf 0 \\ EA \mathbf x &= E\mathbf 0 \\ R \mathbf x &= \mathbf 0 \end{align*} This tells us two things. First, the same combination of columns, in the same positions, specified by $\mathbf x$, gives us $\mathbf 0$ in both matrices. This means that whether a column is independent of the others doesn't change. This gives us our result: the dimension of the column space of $A$ is the same as that of $R$.

The second thing this tells us is that the null space of $A$ and the null space of $R$ are the same.

The null space of $A$ is the null space of $R$.

Of course, this makes sense: when we try to solve $A \mathbf x = \mathbf 0$, we just take the solutions from the equation $R \mathbf x = \mathbf 0$ so if they weren't the same, we'd be in trouble.

Technically, there is one more step to take here, since all we've shown is that every vector in the null space of $A$ is in the null space of $R$. To go in reverse, we recall that elimination matrices are invertible and we get that \begin{align*} R \mathbf x &= \mathbf 0 \\ E^{-1}R \mathbf x &= E^{-1}\mathbf 0 \\ E^{-1}EA \mathbf x &= \mathbf 0 \\ A \mathbf x &= \mathbf 0 \end{align*} Since the null spaces of the two matrices are the same, we get our result that the dimension of the null space of $A$ is $n-r$ for free.
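We can spot-check this equality with the special solutions computed earlier: each one is sent to $\mathbf 0$ by both $R$ and the example matrix $A$ whose rref is $R$. A sketch:

```python
import numpy as np

A = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 2, 6]])
R = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 0, 0]])

# The special solutions computed from R
specials = [np.array([-3, 1, 0, 0, 0]),
            np.array([-1, 0, 1, 0, 0]),
            np.array([-9, 0, 0, -3, 1])]

# Each special solution lies in the null space of both matrices
for s in specials:
    print(np.all(A @ s == 0), np.all(R @ s == 0))  # True True
```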

The dimension of the null space of $A$ is the dimension of the null space of $R$.

So in summary, we have the following. If $A$ is an $m \times n$ matrix of rank $r$, then the row space and the column space of $A$ each have dimension $r$, and the null space of $A$ has dimension $n - r$.
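For the running example, these dimensions can be read off numerically (a sketch, computing the null space dimension as $n - r$):

```python
import numpy as np

A = np.array([[1, 3, 1, 0, 9],
              [0, 0, 0, 1, 3],
              [0, 0, 0, 2, 6]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

print("dim row space    =", r)      # 2
print("dim column space =", r)      # 2
print("dim null space   =", n - r)  # 3
```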

However, if you stare at this a bit longer, it seems a bit incomplete. By pattern matching, it almost seems like there should be a fourth subspace of $A$, with dimension $m-r$...