CMSC 27100 — Lecture 6

Computing the gcd

Bézout's lemma gives us a nice alternate way to represent the gcd of $m$ and $n$. But how does one compute the gcd? The obvious way to do this is to compute all divisors of $m$ and $n$, find the common divisors, and find the largest of those. The big problem with this approach is that factoring integers is a notoriously hard problem to solve efficiently. There are some heuristics that we can apply and some tricks that we can pull, but in general we do not have a fast algorithm for factoring numbers. This fact happens to be the backbone of many public-key cryptographic systems we use today.

However, if all we care about is the greatest common divisor and not any of the other divisors, then we will see a theorem that gets us there efficiently.

First, a bit of notation.

Let $n,d,q,r$ be integers that satisfy $n = d \cdot q + r$ and $0 \leq r \lt d$, as in the Division Theorem. We define the function $\mathop{\mathbf{mod}}$ by $n \mathop{\mathbf{mod}} d = r$.

You may be more familiar with $\mathop{\mathbf{mod}}$ as the modulus operator % from Python. It's a handy way to refer to the remainder without having to say "$r$ as in the Division Theorem" every time. We can similarly define $n \mathop{\mathbf{div}} d = q$, which is analogous to the Python operator // (hence the name of the function, $\operatorname{divmod}$).
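For instance, Python's built-in divmod returns the quotient and remainder together, and we can check that they satisfy the Division Theorem:

```python
n, d = 17, 5
q, r = divmod(n, d)       # q = n // d, r = n % d

print(q, r)               # 3 2

# These satisfy the Division Theorem: n = d*q + r with 0 <= r < d.
assert n == d * q + r and 0 <= r < d
```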

If $d = \gcd(a,b)$, $b \neq 0$, and $r = a \mathop{\mathbf{mod}} b$, then $d = \gcd(b,r)$.

Let $a$ and $b \gt 0$ be arbitrary, and let $q$ and $r$ be as in the Division Theorem, so that $a = qb + r$ with $0 \leq r \lt b$. Since $d = \gcd(a,b)$, we have $d \mid a$ and $d \mid b$. This means that $d \mid qb$. Putting this together, we have $d \mid (a - qb)$ and therefore $d \mid r$. So $d$ is also a common divisor of $b$ and $r$.

Now consider an arbitrary common divisor $c$ of $b$ and $r$. Since $c \mid b$ and $c \mid r$, we have $c \mid (qb+r)$, that is, $c \mid a$, so $c$ is also a common divisor of $a$. Since $c$ is a common divisor of $a$ and $b$, we must have $c \leq d = \gcd(a,b)$. And since $c$ was arbitrary, every common divisor of $b$ and $r$ is at most $d$. Therefore, $\gcd(b,r) = d = \gcd(a,b)$.

This result gives us a way to compute the gcd: \[\gcd(a,b) = \begin{cases} a & \text{if $b = 0$,} \\ \gcd(b, a \mathop{\mathbf{mod}} b) & \text{otherwise,} \end{cases}\]

which becomes the following function fairly straightforwardly:

def euclid(a, b):
    # gcd(a, b) = gcd(b, a mod b), with gcd(a, 0) = a
    if b == 0:
        return a
    else:
        return euclid(b, a % b)

This algorithm is called Euclid's algorithm, after the Greek mathematician Euclid, who describes it in the Elements around 300 BC. The Elements collects much of the elementary geometry, algebra, and number theory that we learn in school, and Euclid's algorithm is considered to be one of the oldest known algorithms.

Suppose we want to compute the gcd of 232 and 72. We apply Euclid's algorithm as follows: \begin{align*} \gcd(232,72) &= \gcd(72,16) \\ &= \gcd(16,8) \\ &= \gcd(8,0) \\ &= 8 \end{align*}
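The trace above can be reproduced with a small iterative variant of the euclid function (the printing is only for illustration):

```python
def euclid(a, b):
    # Euclid's algorithm, printing each step of the computation.
    while b != 0:
        print(f"gcd({a},{b}) = gcd({b},{a % b})")
        a, b = b, a % b
    return a

print(euclid(232, 72))    # prints the three steps above, then 8
```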

How fast is Euclid's algorithm?

We briefly mentioned that the naive way of computing the gcd of $m$ and $n$ was inefficient. How do we measure the efficiency of algorithms? Informally, we can think of this as the number of steps that the algorithm needs to take, and we compare this to the size of the input of the algorithm. We would like the number of steps of our algorithm to grow slowly as we give it larger and larger inputs.

But what is the size of a number? Here, the size of the number is the size of its representation. This would be the number of digits it takes to write the number down. This means that the size of a number $n$ is $d = \log n$. Here, the exact base doesn't matter too much. But what this means is that searching for divisors from 1 through $\sqrt n$ would require approximately $\sqrt n = 10^{\log \sqrt n} = 10^{d/2}$ steps, assuming your number was in decimal representation (swap in the appropriate base if desired). In other words, the number of steps we need to compute this grows exponentially as we increase the size of our number. Unfortunately, exponentially growing functions do not grow slowly.
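As a rough sketch, the naive approach might look like the following (the name naive_gcd is ours; for simplicity it scans every candidate up to $\min(m,n)$ rather than stopping at $\sqrt n$, which only makes the exponential picture worse):

```python
def naive_gcd(m, n):
    # Try every candidate divisor and keep the largest common one.
    # For d-digit inputs this loop runs on the order of 10**d times:
    # exponential in the size of the representation.
    best = 1
    for c in range(1, min(m, n) + 1):
        if m % c == 0 and n % c == 0:
            best = c
    return best
```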

So does Euclid's algorithm do any better than this? How would we know? We'll take a brief excursion into algorithm analysis and learn about strong induction and some properties of the Fibonacci numbers along the way.

We will define this famous sequence of numbers in this way.

The Fibonacci numbers are defined by $f_0 = 0$, $f_1 = 1$, and $$f_n = f_{n-1} + f_{n-2} \quad \text{for all $n \geq 2$.}$$

This sequence is attributed to Fibonacci, an Italian mathematician of the 13th century in what was then the Republic of Pisa, though its origins can be traced further back to a number of Indian mathematicians as early as the 3rd century BC. The Fibonacci numbers are one of those things from math that spookily show up in all sorts of different places, not only in mathematics but in all sorts of natural phenomena. You may have been asked to compute the Fibonacci numbers as a programming exercise before.

def fib(n):
    # A direct translation of the recurrence; note that this
    # recomputes subproblems and takes exponential time.
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

Since this definition is recursive, the Fibonacci numbers seem to lend themselves to inductive proofs quite naturally. For instance, something we like to do with recursively defined sequences like this is to find a closed form for them. This would allow us to compute something like $f_{200}$ without having to compute $f_{2}$ through $f_{199}$ first.

It turns out that one can get an exact formula for $f_n$ in terms of the solutions to the polynomial equation $x^2 - x - 1 = 0$. There are a few ways to arrive here, none of which we'll be going through. But the roots of this equation are $\varphi = \frac{1 + \sqrt 5}{2}$, the Golden Ratio, and its conjugate root, $\widetilde \varphi = 1 - \varphi = \frac{1-\sqrt 5}{2}$. What we get is $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$ This is now known as Binet's formula, although it seems like it was known by a bunch of mathematicians like Euler, Bernoulli, and de Moivre over a century earlier.
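We can sanity-check Binet's formula numerically, rounding the floating-point result to the nearest integer to absorb rounding error (this trick is reliable only for moderately small $n$):

```python
from math import sqrt

phi = (1 + sqrt(5)) / 2    # the Golden Ratio
psi = (1 - sqrt(5)) / 2    # its conjugate root, phi-tilde above

def binet(n):
    # Binet's formula, rounded to the nearest integer.
    return round((phi**n - psi**n) / sqrt(5))

def fib(n):
    # Reference implementation via the recurrence.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print([binet(n) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```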

For all $n \in \mathbb N$, $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$

We will show this by induction on $n$.

Base case. For $n = 0$, we have $f_0 = 0 = \frac{\varphi^0 - \widetilde \varphi^0}{\sqrt 5}$. For $n = 1$, we have \[ \frac{\varphi - \widetilde \varphi}{\sqrt 5} = \frac{\frac{1+\sqrt 5}{2} - \frac{1-\sqrt 5}{2}}{\sqrt 5} = \frac{\frac{2\sqrt 5}{2}}{\sqrt 5} = \frac{\sqrt 5}{\sqrt 5} = 1. \]

Inductive case. Let $k$ be an arbitrary natural number. Suppose that $f_k = \frac{\varphi^k - \widetilde \varphi^k}{\sqrt 5}$. Consider $f_{k+1}$. \begin{align*} f_{k+1} &= f_k + f_{k-1} & \text{Fibonacci numbers} \\ &= \frac{\varphi^k - \widetilde \varphi^k}{\sqrt 5} + f_{k-1} &\text{inductive hypothesis} \\ &\cdots \end{align*}

Unfortunately, it seems like we are stuck here. We need to say something about $f_{k-1}$, but our inductive hypothesis only covers $f_k$. What can we do?

We will proceed by making our inductive hypothesis stronger. That is, we will assume more in our inductive hypothesis. Recall that the induction axiom was \[P(0) \wedge \forall k (P(k) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Instead, we will consider the following strong induction axiom: \[P(0) \wedge \forall k (\forall j(j \leq k \rightarrow P(j)) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] This is called strong induction since the assumption we make in the inductive case is stronger. Let's take a closer look at the hypothesis for the inductive case: \[\forall j(j \leq k \rightarrow P(j)) \rightarrow P(k+1).\] So what are we assuming? For every $j$, if $j \leq k$ then $P(j)$ is true. A more "natural" way of phrasing this might be "$P(j)$ for every $j \leq k$". So we'd write something like "Assume that $P(j)$ for all $j \leq k$."