MPCS 50103 — Lecture 3

Strong Induction

I mentioned that the naive way of computing the gcd of $m$ and $n$ was inefficient. This is because if the size of $n$ is $d = \log n$, then searching for divisors from 1 through $\sqrt n$ would require approximately $\sqrt n = 10^{\log \sqrt n} = 10^{d/2}$ steps, assuming your number was in decimal representation (swap in the appropriate base if desired). In other words, the number of steps we need to compute this grows exponentially as we increase the size of our number.

But does Euclid's algorithm do any better than this? How would we know? We'll take a brief excursion into algorithm analysis and learn about strong induction and some properties of the Fibonacci numbers along the way.
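To make "inefficient" concrete, here is one way the naive method might look in Python (a sketch; the name `naive_gcd` and the exact search strategy are illustrative, not a fixed algorithm from class):

```python
def naive_gcd(m, n):
    # Try every candidate divisor up to min(m, n) and keep the
    # largest one that divides both numbers.
    best = 1
    for d in range(1, min(m, n) + 1):
        if m % d == 0 and n % d == 0:
            best = d
    return best
```

The loop runs about $\min(m,n)$ times, and $\min(m,n)$ is exponential in the number of digits of the input, which is exactly the problem described above.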

We define this famous sequence of numbers as follows.

The Fibonacci numbers are defined by

  1. $f_0 = 0$, $f_1 = 1$, and
  2. $f_n = f_{n-1} + f_{n-2}$ for $n \geq 2$.

This sequence is attributed to Fibonacci, an Italian mathematician of the 13th Century from what was then the Republic of Pisa, though its origins can be traced further back to a number of Indian mathematicians as early as the 3rd Century BC. The Fibonacci numbers are one of those objects from mathematics that spookily show up in all sorts of different places, not only within mathematics, but in all kinds of natural phenomena. You may have been asked to compute the Fibonacci numbers as a programming exercise before.

def fib(n):
    # A direct translation of the recursive definition. Note that it
    # recomputes the same subproblems many times, so it is slow.
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

Since this definition is recursive, the Fibonacci numbers seem to lend themselves to inductive proofs quite naturally. However, there is one snag with the definition of the Fibonacci numbers, and it's that they're defined in terms of the previous two terms. If we think back to our template, this means that we have to make a stronger assumption: if, for some statement $P$, we want to show $P(k+1)$, not only do we need to assume $P(k)$, but $P(k-1)$ as well.

This is where we need to introduce the notion of strong induction. Strong induction is a reformulation of our induction axiom that gives us a "stronger" hypothesis. Recall that we had \[P(0) \wedge \forall k (P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Now, we have "strengthened" our hypothesis by adding an additional term (note that we also need an additional base case, since the inductive step now reaches back two terms): \[P(0) \wedge P(1) \wedge \forall k (P(k-1) \wedge P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] But what if we needed more terms? That's no problem either. \[P(0) \wedge \forall k (P(0) \wedge P(1) \wedge \cdots \wedge P(k-1) \wedge P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Why does this work? One of the things you can try to convince yourself of is that this is actually no different from ordinary induction. It's not hard to see that every proof by induction is also a proof by strong induction, but it takes a bit more reasoning to convince ourselves of the converse.

Of course, if we think back to our computations on recursive definitions and functions, we remember that we saw a pattern: the computation for $f(n)$ always contained the computation for $f(n-1)$, which contained the computation for $f(n-2)$, which ...

So in fact, all strong induction does is make explicit a fact that was implicit: if $P(k-1)$ was true, it would necessarily have been the case that $P(k-2)$ was true, and so on. And since we proved that $P(0)$ was true right at the start, we can assume that everything from $P(0)$ to $P(k)$ is true if we assume that $P(k)$ is true.
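This is also visible computationally: a bottom-up computation of the Fibonacci numbers has every earlier value in hand when it computes the next one, just like a strong inductive hypothesis. A sketch (the name `fib_list` is illustrative):

```python
def fib_list(n):
    # Compute f_0, ..., f_n bottom-up. When f[k] is computed, all of
    # f[0], ..., f[k-1] are already available -- the computational
    # analogue of the strong inductive hypothesis.
    f = [0, 1]
    for k in range(2, n + 1):
        f.append(f[k - 1] + f[k - 2])
    return f[: n + 1]
```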

This means we need to make a slight modification to our template for proof by induction in order to get a template for a proof by strong induction, namely in the inductive hypothesis.

  1. Prove the necessary base cases.
  2. Prove the inductive case.
    1. Choose an arbitrary natural number $k$.
    2. State your inductive hypothesis. Here, there are two broad kinds of inductive hypotheses that you can use for a proof by strong induction:
      1. If your hypothesis is something like \[P(k-i+1) \wedge P(k-i+2) \wedge \cdots \wedge P(k-1) \wedge P(k),\] your inductive hypothesis would be to assume that \[P(k-i+1), P(k-i+2), \dots, P(k-1), P(k)\] hold. Here, $i$ is the number of terms you need. For example, in the case of the Fibonacci sequence, you will need $i = 2$, so you want to assume $P(k-1)$ and $P(k)$ hold. This assumes that $i$ is small enough that it makes sense to just list the terms. If it isn't, see the next variant.
      2. If your hypothesis is something like $P(0) \wedge P(1) \wedge \cdots \wedge P(k-1) \wedge P(k)$, then your inductive hypothesis would be to assume that $P(j)$ holds for all natural numbers $j$ with $0 \leq j \leq k$. You can modify this to be similar to the above, replacing $0$ with $i$ if it's warranted.
    3. Use this assumption together with known facts to prove $P(k+1)$. You should clearly state when you use the inductive hypothesis.

Returning to Fibonacci numbers, sometimes we just want a nice neat formula to compute a term of a sequence like $f_n$, rather than having to compute every single term up to the $n$th one. We'll be showing how we can do this much later in the course, when we talk about recurrences. For now, we'll just say that one can get an exact formula for $f_n$ in terms of the solutions to the polynomial equation $x^2 - x - 1 = 0$. These are $\varphi = \frac{1 + \sqrt 5}{2}$, the Golden Ratio, and its conjugate root, $\widetilde \varphi = 1 - \varphi = \frac{1-\sqrt 5}{2}$. What we get is $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$ This is now known as Binet's formula, although it seems like it was known by a bunch of mathematicians like Euler, Bernoulli, and de Moivre over a century earlier.

For all $n \in \mathbb N$, $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$

We will show this by strong induction on $n$.

Base case. For $n = 0$, we have $f_0 = 0 = \frac{\varphi^0 - \widetilde \varphi^0}{\sqrt 5}$. For $n = 1$, we have \begin{align*} \frac{\varphi - \widetilde \varphi}{\sqrt 5} &= \frac{\frac{1+\sqrt 5}{2} - \frac{1-\sqrt 5}{2}}{\sqrt 5} \\ &= \frac{\frac{2\sqrt 5}{2}}{\sqrt 5} \\ &= \frac{\sqrt 5}{\sqrt 5} \\ &= 1 \end{align*}

Inductive case. Suppose that $f_j = \frac{\varphi^j - \widetilde \varphi^j}{\sqrt 5}$ for all $j$ with $j \leq k$. Consider $f_{k+1}$. \begin{align*} f_{k+1} &= f_k + f_{k-1} \\ &= \frac{\varphi^k - \widetilde \varphi^k}{\sqrt 5} + \frac{\varphi^{k-1} - \widetilde \varphi^{k-1}}{\sqrt 5} &\text{by ind. hyp.} \\ &= \frac{\varphi^k + \varphi^{k-1} - (\widetilde \varphi^k + \widetilde \varphi^{k-1})}{\sqrt 5} \\ &= \frac{(1 + \varphi) \varphi^{k-1} - (1 + \widetilde \varphi) \widetilde \varphi^{k-1}}{\sqrt 5} \\ &= \frac{\varphi^2 \varphi^{k-1} - \widetilde \varphi^2 \widetilde \varphi^{k-1}}{\sqrt 5} &\text{since $\varphi^2 = \varphi + 1$ and $\widetilde \varphi^2 = \widetilde \varphi+1$}\\ &= \frac{\varphi^{k+1} - \widetilde \varphi^{k+1}}{\sqrt 5} \end{align*}
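Binet's formula is easy to spot-check against the recursive definition (a sketch; we round to the nearest integer to absorb floating-point error, which is safe for moderate $n$):

```python
from math import sqrt

def binet(n):
    # f_n = (phi^n - phitilde^n) / sqrt(5), rounded to the nearest
    # integer to absorb floating-point rounding error.
    phi = (1 + sqrt(5)) / 2
    phi_t = (1 - sqrt(5)) / 2
    return round((phi**n - phi_t**n) / sqrt(5))

# Compare against the recurrence f_n = f_{n-1} + f_{n-2}.
f = [0, 1]
for n in range(2, 30):
    f.append(f[n - 1] + f[n - 2])
assert all(binet(n) == f[n] for n in range(30))
```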

 

On Euclid's algorithm

Using this property of the Fibonacci numbers, we can prove something interesting about the Euclidean algorithm. The following is a result attributed to Gabriel Lamé in 1844, which makes it one of the earliest examples of an analysis of an algorithm (though there is slightly earlier work on the same subject) as well as a nice application of the Fibonacci numbers.

Suppose $a_n \gt a_{n-1} \gt \cdots \gt a_1 \gt a_0 = 0$ is the sequence of numbers obtained from Euclid's algorithm. Then $a_i \geq f_i$ for all $i \leq n$.

We will prove this by (strong) induction on $n$.

Base case. From the statement, we have $a_0 = 0 = f_0$, and since $a_1 \gt a_0 = 0$, we have $a_1 \geq 1 = f_1$.

Inductive case. Let $k \geq 1$ be a natural number. Assume that $a_j \geq f_j$ for all $j \leq k$. We want to show that $a_{k+1} \geq f_{k+1}$. By the Division Theorem, there exists an integer $q$ such that $a_{k+1} = q \cdot a_k + a_{k-1}$ with $0 \leq a_{k-1} \lt a_k$. Since $a_{k+1} \gt a_k$, we have $q \geq 1$ and \begin{align*} a_{k+1} &= q \cdot a_k + a_{k-1} \\ & \geq a_{k} + a_{k-1} \\ & \geq f_{k} + f_{k-1} & \text{by ind. hyp.} \\ & = f_{k+1} \end{align*}
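The sequence $a_0, a_1, \dots, a_n$ in the statement is just the sequence of values Euclid's algorithm passes through, read in reverse. A sketch of extracting it (the name `euclid_sequence` is illustrative):

```python
def euclid_sequence(a, b):
    # Run Euclid's algorithm on a > b > 0 and record every value that
    # appears; reversed, this is 0 = a_0 < a_1 < ... < a_n = a.
    seq = [a, b]
    while seq[-1] != 0:
        seq.append(seq[-2] % seq[-1])
    return seq[::-1]
```

Running it on a pair of consecutive Fibonacci numbers, say `euclid_sequence(21, 13)`, returns $0, 1, 2, 3, 5, 8, 13, 21$, showing that consecutive Fibonacci numbers are the extremal case for this bound.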

 

Recall that when analyzing algorithms, we are concerned with the number of elementary operations that are performed in proportion to the size of the input and we are concerned with the growth of this function. Here, we consider the number of divisions to be our elementary operation, since that's the most important and costly operation we execute.

Let $a \gt b \geq 0$ be natural numbers such that the representation of $a$ requires $d$ decimal digits and the calculation of $\gcd(a,b)$ via Euclid's algorithm requires $k$ division steps. Then $k \leq 5 \cdot d$.

Since the decimal representation of $a$ requires $d$ digits, we have $a \lt 10^d$. If the computation of $\gcd(a,b)$ by Euclid's algorithm requires $k$ steps, then by Lamé's Theorem, we have $a \geq f_k$, and so $f_k \lt 10^d$. Since $\left| \frac{\widetilde \varphi^k}{\sqrt 5} \right| \lt \frac 1 2$, Binet's formula gives $\frac{\varphi^k}{\sqrt 5} \lt f_k + \frac 1 2 \lt 10^d$, where the last inequality holds because $f_k$ and $10^d$ are integers with $f_k \lt 10^d$. Then we have \begin{align*} \frac{\varphi^k}{\sqrt 5} &\lt 10^d \\ k \log \varphi - \frac{\log 5}{2} &\lt d \log 10 &\text{apply $\log$ to both sides}\\ k &\lt \frac{d \log 10}{\log \varphi} + \frac{\log 5}{2 \log \varphi} &\text{isolate $k$}\\ k &\lt 4.785d + 1.673 \end{align*} That this implies $k \leq 5d$ is clear for large $d$. For small $d$, we can check the values by hand.

This is a neat result that goes back to our brief discussion about the efficiency of trying to find a gcd. What Lamé's theorem tells us is that Euclid's algorithm executes a number of operations roughly proportional to the number of digits of the larger number $a$. Intuitively, every handful of divisions we perform knocks out another digit. The number of digits of a number turns out to be about $\log a$, so Euclid's algorithm scales roughly logarithmically with $a$.
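We can also sketch a direct check of Lamé's bound by counting division steps (the helper names are illustrative; by Lamé's analysis, consecutive Fibonacci numbers are the worst case):

```python
def gcd_steps(a, b):
    # Count the number of division (mod) steps Euclid's algorithm takes.
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

# Check k <= 5d on the worst case: consecutive Fibonacci numbers.
f = [0, 1]
for n in range(2, 40):
    f.append(f[n - 1] + f[n - 2])
for n in range(3, 39):
    k = gcd_steps(f[n + 1], f[n])
    d = len(str(f[n + 1]))  # number of decimal digits of the larger input
    assert k <= 5 * d
```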

If we compare this to the method of computing all the divisors, we would be executing approximately as many operations as there are numbers from 1 through $\sqrt a$. This means that our naive algorithm scales roughly like $\sqrt a = 10^{d/2}$, which is exponential in the number of digits $d$. Comparing the two algorithms, we have an approximately exponential difference between the two.

Prime numbers

Here is an important definition.

An integer $p$ greater than 1 is called prime if the only positive divisors of $p$ are 1 and $p$. Otherwise, $p$ is called composite.

A large portion of number theory is devoted to the study of prime numbers and their properties. A big reason for this is that they are considered the building blocks for numbers. This idea is so important that the primary result concerning this is called the Fundamental Theorem of Arithmetic.

Every natural number $n \gt 1$ can be written uniquely, up to the order of the factors, as a product of primes.

Note here the condition that $n \gt 1$. Sometimes there is some controversy among non-mathematicians over the fact that 1 is not considered a prime number. There are a number of reasons one can give to justify this idea. The main justification is due to this theorem: if 1 were a prime number, then we could no longer say that every number has a unique prime factorization.

This idea extends to other so-called unique factorization domains, which are a kind of $\mathbb Z$-like structure in abstract algebra where every non-zero and non-unit element has a unique factorization. Here, a unit is any element $x$ where one can find an element $y$ such that $xy = 1$. Obviously, in $\mathbb Z$, the only such elements are $1$ and $-1$, but in other structures, there are more elements that have these multiplicative inverses. One example you may be familiar with is the rational numbers, $\mathbb Q$.

In the end, no matter how you slice it, 1 is not a prime number. Do not fall into the trap.

In order to prove the Fundamental Theorem of Arithmetic, we need a few helpful properties of primes. The first of these is the following very useful theorem, a result from Euclid's Elements.

If $p$ is prime, and $a,b$ are integers such that $p \mid a \cdot b$, then $p \mid a$ or $p \mid b$.

Suppose, without loss of generality, that $p \nmid a$; we will show that $p \mid b$. Then $\gcd(p,a) = 1$ and there exist integers $x$ and $y$ such that $xa + yp = 1$. Multiplying both sides by $b$, we have $$b = xab + ypb.$$ Since $p \mid ab$, let $n$ be an integer such that $ab = pn$. Then we have $$b = xab + ypb = xpn + ypb = p \cdot (xn+yb).$$ Therefore, by definition of divisibility, $p \mid b$.

Notice that this proof makes use of Bézout's lemma. Since Bézout's lemma appears about 2000 years after Euclid, it is safe to say that this is not the proof that appears in the Elements.

Now, we will state, but not prove the following lemma, which we make use of later on. It makes a nice exercise (try proving it with induction).

If $p, p_1, \dots, p_k$ are prime and $p \mid p_1 \cdots p_k$, then $p = p_i$ for some $i$.

Fundamental Theorem of Arithmetic

Now, we're almost ready to prove the Fundamental Theorem of Arithmetic. Since the theorem claims existence and uniqueness of prime factorization, we'll prove this in two parts: first we prove the fact that we can factor numbers into primes and then show that this factorization is unique.

Every natural number $n \gt 1$ can be written as a product of primes.

We will prove that for $n \gt 1$, $n$ can be written as a product of primes via strong induction on $n$.

Base case. $n = 2$. Clearly, 2 is a product of primes because 2 itself is a prime number.

Inductive case. Let $k \in \mathbb N$ with $k \geq 2$. Assume that for all natural numbers $m$ with $2 \leq m \leq k$, $m$ can be written as a product of primes. We want to show that $k+1$ can be written as a product of primes. There are two cases to consider. If $k+1$ is prime, then it is already a product of (one) prime. Otherwise, $k+1$ is composite, so we can write $k+1 = m_1 \cdot m_2$ for natural numbers $m_1$ and $m_2$ with $2 \leq m_1, m_2 \leq k$. By the inductive hypothesis, each of $m_1$ and $m_2$ can be written as a product of primes, and multiplying these together writes $k+1$ as a product of primes.
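The existence argument mirrors a recursive factoring procedure: a number is either prime, or it splits into two smaller factors that we factor recursively. A sketch by trial division (the name `prime_factors` is illustrative):

```python
def prime_factors(n):
    # Factor n > 1 into primes. Either some d <= sqrt(n) divides n,
    # and we recurse on the two smaller factors, or n is prime.
    for d in range(2, int(n**0.5) + 1):
        if n % d == 0:
            return prime_factors(d) + prime_factors(n // d)
    return [n]  # no divisor up to sqrt(n), so n itself is prime
```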

Now, we will finish the proof of Theorem 3.6 by showing that prime factorizations are unique.

By Theorem 3.9, every natural number $n \gt 1$ has a prime factorization. Suppose that prime factorizations are not unique. Let $n$ be the smallest number that doesn't have a unique prime factorization, so $$n = p_1 \cdots p_k = q_1 \cdots q_\ell,$$ where $p_i$ is prime for $1 \leq i \leq k$ and $q_j$ is prime for $1 \leq j \leq \ell$. Clearly, $p_1 \mid n$. Then by Lemma 3.8, $p_1 = q_j$ for some $j$. Now, we divide each factorization by $p_1$. This gives us $$\frac n {p_1} = p_2 p_3 \cdots p_k = q_1 q_2 \cdots q_{j-1} q_{j+1} \cdots q_\ell.$$ Since the original factorizations of $n$ were different, these factorizations of $\frac n {p_1}$ are still different, so $\frac n {p_1}$ is a smaller number without a unique prime factorization, contradicting the minimality of $n$.

The following result is another of Euclid's from the Elements. You can tell that this one is very important because out of all the results of Euclid, this one is called Euclid's Theorem.

There exist infinitely many primes.

Assume for a contradiction that there are only finitely many primes $p_1, p_2, \dots, p_k$. Consider $n = p_1 \cdot p_2 \cdots p_k + 1$. Since $n \gt p_i$ for all $i$, it does not appear on our list, so by our assumption it cannot be prime. Observe that for each prime $p_i$, we can write $n = q \cdot p_i + 1$ for some integer $q$.

By the Division Theorem, this means that we have a remainder of 1 when dividing $n$ by any of the primes $p_i$. Since $1 \neq 0$ and $1 \lt p_i$, it must be the case that $p_i$ does not divide $n$ for any $i$. In other words, $n$ cannot be factored into primes, since by assumption the $p_i$ are the only primes. But this contradicts the Fundamental Theorem of Arithmetic. Therefore, our assumption that there were only finitely many primes was incorrect, and there must exist infinitely many primes.
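One caution: outside the proof's assumption, $p_1 \cdots p_k + 1$ need not itself be prime; the argument only guarantees a prime factor missing from the list. The classic example uses the first six primes:

```python
from math import prod

primes = [2, 3, 5, 7, 11, 13]
n = prod(primes) + 1  # 2 * 3 * 5 * 7 * 11 * 13 + 1 = 30031

# 30031 is composite (59 * 509), but neither factor is on the list,
# so the list of primes could not have been complete.
assert n == 30031
assert n % 59 == 0 and n // 59 == 509
assert all(n % p == 1 for p in primes)
```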

Modular arithmetic

Now, armed with our accumulated knowledge on division, divisors, remainders, and primes, we'll define a system of arithmetic on integers based around remainders, which seems like a strange thing to do. But oftentimes, we want to do calculations based around multiples of certain numbers, like time. Modular arithmetic formalizes these notions. One of the things we'll see is that in certain cases, working in these structures gives us a notion of "division" that is well-defined. The system of modular arithmetic was first developed by Gauss.

Let $m$ be an integer. For integers $a$ and $b$, we say that $a$ is congruent to $b$ modulo $m$, written $a \equiv b \pmod m$ or $a \equiv_m b$, if $m \mid (a-b)$.

This definition looks a bit strange, but it really means that $a$ and $b$ have the same remainders when divided by $m$. You can convince yourself of this as an exercise (make use of the Division Theorem).
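Both characterizations are easy to test against each other (a sketch; in Python, `a % m` is always in $\{0, \dots, m-1\}$ when $m \gt 0$, even for negative `a`):

```python
def congruent(a, b, m):
    # a is congruent to b modulo m iff m divides a - b.
    return (a - b) % m == 0

# The same-remainder characterization agrees on a range of test values.
for a in range(-50, 50):
    for b in range(-50, 50):
        assert congruent(a, b, 7) == (a % 7 == b % 7)
```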

Ultimately, we want to be able to talk about integers that are equivalent to each other with respect to a particular modulus. An easy example of this is when we think about integers modulo 10, since our entire number system is built around 10s. We can formally define what it means to be "equivalent".

A relation $\sim$ is an equivalence relation if $\sim$ satisfies the following:

  1. Reflexivity: For all $a$, $a \sim a$.
  2. Symmetry: For all $a, b$, if $a \sim b$, then $b \sim a$.
  3. Transitivity: For all $a, b, c$, if $a \sim b$ and $b \sim c$, then $a \sim c$.

Equivalence relations are called as such because they capture relationships that are similar to equality. For instance, if I have two formulas $\varphi$ and $\neg \neg \varphi$, we can't say they're equal because they aren't: they contain different symbols and one is longer than the other. However, we can say that they're logically equivalent because they mean the same thing. If we really wanted to, we can define the notion of logical equivalence more formally and then show that it satisfies the conditions for equivalence relations.

For $m \gt 0$, $\equiv_m$ is an equivalence relation.

  1. To see that $\equiv_m$ is reflexive, observe that $m \mid (a-a)$ for all integers $a$.
  2. To see that $\equiv_m$ is symmetric, if $a \equiv_m b$, then $m \mid (a - b)$. This means there is an integer $n$ such that $a - b = mn$. Then we get $b - a = m\cdot (-n)$ and we have $m \mid (b-a)$.
  3. To see that $\equiv_m$ is transitive, consider integers $a,b,c$ such that $a \equiv_m b$ and $b \equiv_m c$. We have $m \mid (a-b)$ and $m \mid (b-c)$, which gives us $m \mid (a-b) + (b-c)$ and therefore, $m \mid (a-c)$ and $a \equiv_m c$.

 

Using the notion of an equivalence relation, we can divide $\mathbb Z$ into sets that contain equivalent members. For instance, if we choose $m = 2$, then all even numbers are equivalent to each other (i.e., $0 \pmod 2$) and all odd numbers are equivalent to each other (i.e., $1 \pmod 2$). These sets are called equivalence classes.

For all $m \gt 0$ and $a \in \mathbb Z$, we define the equivalence class modulo $m$ of $a$ to be the set of integers $$[a]_m = \{b \in \mathbb Z \mid b \equiv_m a\}.$$

Typically, we refer to equivalence classes by their "obvious" name, which is the member of the class that is between 0 and $m-1$. This is called the canonical representative of the class. Of course, we should keep in mind that $[0] = [m] = [2m] = [-m]$ and such. But in addition to this, sometimes the $[]_m$ gets dropped for convenience's sake and we have to determine from context whether "2" means $2 \in \mathbb Z$ or $[2]_m$. Usually, this becomes clear with the usage of $\pmod m$ and we will try to make that explicit, but outside of this course, that's not always guaranteed.

Now, we'll define how to do arithmetic on these things.

We define operations $+$ and $\cdot$ on the equivalence classes modulo $m$ by $$[a]_m + [b]_m = [a + b]_m \quad \text{and} \quad [a]_m \cdot [b]_m = [a \cdot b]_m.$$

All of this seems a bit obvious, but just like we did when we were defining addition on the natural numbers, we should think about what we're really doing here. We've defined operations $+$ and $\cdot$ that look like our usual operations on the integers. However, observe that we're not adding and multiplying integers; we've defined a notion of adding and multiplying sets of integers.

Based solely on this, there is no reason that what we've defined is guaranteed to work. For instance, how do we know that when adding two sets in this way we even get a set that makes sense at all? So of course, we have to prove this, and it will turn out that our definitions of equivalence classes and of addition and multiplication on those classes are such that everything works out intuitively almost without a second thought.

We will only show that for $a_1 \equiv a_2 \pmod m$ and $b_1 \equiv b_2 \pmod m$, we have $a_1 + b_1 \equiv a_2 + b_2 \pmod m$. Multiplication is left as an exercise.

By definition, we have that $m \mid (a_1 - a_2)$ and $m \mid (b_1 - b_2)$. Then we have $m \mid ((a_1 - a_2) + (b_1 - b_2))$. We can easily rearrange this to get $m \mid ((a_1 + b_1) - (a_2 + b_2))$ and therefore, $a_1 + b_1 \equiv a_2 + b_2 \pmod m$.
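Here is a quick empirical illustration that the sum and product of classes do not depend on which representatives we pick (a sketch with $m = 12$):

```python
m = 12
a1, a2 = 5, 5 + 3 * m   # two representatives of [5]
b1, b2 = 8, 8 - 7 * m   # two representatives of [8]

# Adding or multiplying either pair of representatives lands in the
# same equivalence class, as the result above guarantees.
assert (a1 + b1) % m == (a2 + b2) % m
assert (a1 * b1) % m == (a2 * b2) % m
```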

Now, we can define our structure.

Let $\mathbb Z_m = \{[0]_m, [1]_m, \dots, [m-1]_m\}$. The integers mod $m$ is the set $\mathbb Z_m$, together with the binary operations $+$ and $\cdot$. The integers mod $m$ are denoted by $\mathbb Z/\equiv_m$ or simply as $\mathbb Z_m$.

Up to now, we have been working implicitly in the structure $\mathbb Z$, the integers. As I've briefly alluded to before, we're not only talking about the domain $\mathbb Z$ but also how we interpret operations like $+$ and $\cdot$. The integers mod $m$, $\mathbb Z_m$, is another structure, whose basic elements are the equivalence classes with respect to $\equiv_m$.

These kinds of structures—a set together with binary operations $+$ and $\cdot$ and identities for both operations—are called rings.

The notation $\mathbb Z/\equiv_m$ gives us a hint at what's happening. We took $\mathbb Z$ and partitioned it into equivalence classes by the relation $\equiv_m$. This idea of taking an algebraic structure and constructing another structure based on an equivalence relation is something that comes up a lot in algebra and is called a quotient structure, where quotient in the algebraic context just means equivalence class.

Multiplicative inverses in $\bmod m$

I mentioned earlier that one of the things that this structure allows us to do is, under certain conditions, "divide" things, in the sense that there is an operation we can perform on elements of our structure that reverses multiplication. I say "divide" because the operation that we perform is not really division. It's more accurate to say that we'll be showing that multiplicative inverses exist.

First, we need the following notion.

Two integers $a$ and $b$ are relatively prime (or coprime) if $\gcd(a,b) = 1$.

A prime is obviously relatively prime to any positive number less than it, since its only divisors are 1 and itself. However, two numbers can be relatively prime even if neither is prime. The numbers 10 and 21 are not prime, since $10 = 2 \cdot 5$ and $21 = 3 \cdot 7$. However, they are relatively prime, since $\operatorname{Div}(10) = \{\pm 1, \pm 2, \pm 5, \pm 10\}$ and $\operatorname{Div}(21) = \{\pm 1, \pm 3, \pm 7, \pm 21\}$ share only $\pm 1$.

If integers $m \gt 0$ and $a$ are relatively prime, then $a$ has a multiplicative inverse mod $m$. That is, there exists an integer $b$ such that $a \cdot b = 1 \pmod m$.

By Bézout's lemma, there exist integers $n$ and $b$ such that $n \cdot m + b \cdot a = 1$. Then, \begin{align*} [1]_m &= [n \cdot m + b \cdot a]_m \\ &= [n]_m \cdot [m]_m + [b]_m \cdot [a]_m \\ &= [n]_m \cdot [0]_m + [b]_m \cdot [a]_m \\ &= [b]_m \cdot [a]_m \end{align*}
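The proof is constructive: the Bézout coefficients from the extended Euclidean algorithm hand us the inverse. A sketch (the names `extended_gcd` and `mod_inverse` are illustrative):

```python
def extended_gcd(a, b):
    # Returns (g, x, y) with g = gcd(a, b) and x*a + y*b = g.
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, m):
    # From x*a + y*m = 1 we get x*a = 1 (mod m), so x is the inverse.
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse: gcd(a, m) != 1")
    return x % m
```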

 

Consider $\mathbb Z_4$. There are four equivalence classes: $0,1,2,3$. Since 1 and 3 are coprime, they have inverses: $1^{-1} = 1$ (this is obvious) and $3^{-1} = 3$, which we get by observing that $3 \cdot 3 = 9 = 1 \pmod 4$. However, 2 has no inverse: \begin{align*} 2 \cdot 0 &= 0 &\pmod 4 \\ 2 \cdot 1 &= 2 &\pmod 4 \\ 2 \cdot 2 &= 4 = 0 &\pmod 4 \\ 2 \cdot 3 &= 6 = 2 &\pmod 4 \end{align*}

One might ask whether there is any integer $m$ for which $\mathbb Z_m$ has a multiplicative inverse for all non-zero elements. Our discussion about prime numbers gives us a hint.

If $a$ and $m$ are relatively prime and $a \gt 1$, then the multiplicative inverse of $a$ modulo $m$ is unique modulo $m$.

By Theorem 3.19, since $a$ and $m$ are relatively prime, $a$ has a multiplicative inverse modulo $m$. Suppose that $b$ and $c$ are multiplicative inverses of $a$. Then, \begin{align*} b &= b \cdot 1 &\pmod m \\ &= b \cdot (c \cdot a) &\pmod m \\ &= b \cdot (a \cdot c) &\pmod m \\ &= (b \cdot a) \cdot c &\pmod m \\ &= 1 \cdot c &\pmod m \\ &= c &\pmod m \\ \end{align*}

 

If $p$ is prime and $a \neq 0 \pmod p$, then $a$ has a multiplicative inverse mod $p$.

This is easy to see: since $p$ is prime, its only positive divisors are 1 and $p$, so any integer $a$ with $a \neq 0 \pmod p$ satisfies $\gcd(a,p) = 1$ and Theorem 3.19 applies.

When it exists, we denote the multiplicative inverse of $a$ by $a^{-1}$.

Up until now, we've been working in $\mathbb Z$, where "division" "doesn't work". However, we've proved sufficient conditions to create structures where "division" does work in a sense, in that multiplicative inverses are guaranteed to exist for every element. This means that we can solve linear equations like $$6x = 144 \pmod{17}$$ using our usual algebraic techniques. In fact, assuming we were working with the same modulus, it's not hard to solve a system of equations in the usual way.

So let's solve this congruence. Just like with ordinary equations, there are a few approaches we can take. One thing we might try is to compute $144 \pmod{17}$. Some quick math tells us that $144 = 8 \pmod{17}$. This means we have $6x = 8 \pmod{17}$. Now, we want to isolate the $x$, so we need to "divide" by 6. This amounts to finding an element $y$ such that $6y = 1 \pmod{17}$. Doing some quick math, we get $6 \times 3 = 18 = 1 \pmod{17}$. This gives us $x = 8 \times 3 = 24 = 7 \pmod{17}$.

Alternatively, we could have gotten rid of the 6 first and then solved the remaining congruence. This means we begin by multiplying both sides by 3 to get $x = 144 \times 3 = 432 = 7 \pmod{17}$.
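Both solution paths are quick to verify (a sketch; since Python 3.8, `pow(a, -1, m)` computes the multiplicative inverse directly when it exists):

```python
m = 17
inv6 = pow(6, -1, m)      # the multiplicative inverse of 6 mod 17
assert inv6 == 3          # since 6 * 3 = 18 = 1 (mod 17)

# First approach: reduce 144 mod 17 first, then multiply by the inverse.
x1 = (144 % m) * inv6 % m
# Second approach: multiply by the inverse first, reduce at the end.
x2 = 144 * inv6 % m

assert x1 == x2 == 7      # both give x = 7 (mod 17)
assert 6 * x1 % m == 144 % m
```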