CMSC 27100 — Lecture 6

Here's the proof of uniqueness of $q$ and $r$ in the Division Theorem.

Let $n$ and $d \gt 0$ be arbitrary integers. By Lemma 4.4, there exist integers $q$ and $r$ such that $n = d \cdot q + r$ and $0 \leq r \lt d$. We want to show that $q$ and $r$ are unique.

Suppose that there exist integers $q_1, r_1, q_2, r_2$ such that $n = d \cdot q_1 + r_1$ and $n = d \cdot q_2 + r_2$ with $0 \leq r_1, r_2 \lt d$. Without loss of generality, let $r_1 \geq r_2$ and consider the following system of equations \begin{align*} n &= q_1 \cdot d + r_1 \\ n &= q_2 \cdot d + r_2. \end{align*} Subtracting gives us $0 = (q_1 - q_2) \cdot d + (r_1 - r_2)$, and rearranging gives us $(q_2 - q_1) \cdot d = r_1 - r_2$. Since $q_2 - q_1$ is an integer, we have $d \mid (r_1 - r_2)$ by the definition of divisibility. But $0 \leq r_1 - r_2 \lt d$, and the only multiple of $d$ in this range is $0$, so $r_1 - r_2 = 0$ and therefore $r_1 = r_2$. Substituting back into $(q_2 - q_1) \cdot d = r_1 - r_2$ gives us $(q_2 - q_1) \cdot d = 0$. Since $d \gt 0$, we have $q_2 - q_1 = 0$ and therefore $q_1 = q_2$. Thus, we have shown that $q$ and $r$ are unique.
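As a quick sanity check (not part of the proof), we can brute-force all pairs $(q, r)$ satisfying the two conditions and see that exactly one survives. The helper name below is my own, not from the lecture:

```python
def all_quotient_remainder_pairs(n: int, d: int) -> list[tuple[int, int]]:
    """Brute-force search for every pair (q, r) with n = d*q + r and 0 <= r < d."""
    # Search a generous window of candidate quotients around n // d.
    return [(q, r)
            for q in range(n // d - 10, n // d + 10)
            for r in range(d)
            if n == d * q + r]

# The theorem says exactly one pair exists, e.g. for n = 17, d = 5:
print(all_quotient_remainder_pairs(17, 5))   # [(3, 2)]
print(all_quotient_remainder_pairs(-17, 5))  # [(-4, 3)]
```

Note that uniqueness holds even for negative $n$: the remainder is still forced into $[0, d)$.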

"Without loss of generality" and proof by cases

What does the statement "without loss of generality" mean? Consider the scenario in which we used it: when we considered $r_1$ and $r_2$, we said "without loss of generality" and set $r_1 \geq r_2$. We were really pre-emptively dealing with the question: We said $r_1 \geq r_2$, but what if $r_2 \geq r_1$ instead?

It's true: we had one of two situations on our hands: $r_1 \geq r_2$ or $r_2 \geq r_1$. That is, the particular fact we were working with was a statement of the form $p \vee q$.

How does one make use of a statement of the form $p \vee q$? Unlike a statement of the form $p \wedge q$, we can't say that $p$ is definitely true or $q$ is definitely true. We know that at least one of them is true but we don't know which. So we have to assume each of $p$ and $q$ and show that from either assumption, we can construct an appropriate proof. This is sometimes known as proof by cases. Here, $p$ and $q$ are our cases. There's an inference rule for this: \[\frac{\begin{matrix} \\ \\ p \vee q \end{matrix} \quad \boxed{\begin{matrix}p \\ \vdots \\ r \end{matrix}} \quad \boxed{\begin{matrix}q \\ \vdots \\ r \end{matrix}}}{r} \vee\!e\] What is this saying? Suppose we want to prove some proposition $r$ by using the proposition $p \vee q$. What we need to do is consider the $p$ case and show that we can arrive at $r$ and then consider the $q$ case and show that we can also arrive at $r$.

In other words, we do the following:

  1. Consider case $p$—assume that $p$ holds.
  2. Carry out proof of $r$ assuming $p$.
  3. Conclude that $r$ holds.
  4. Consider case $q$—assume that $q$ holds.
  5. Carry out proof of $r$ assuming $q$.
  6. Conclude that $r$ holds.
  7. Therefore, in either case $p$ or $q$, we can conclude $r$.
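The steps above are exactly the $\vee$-elimination rule, and proof assistants implement it directly. In Lean 4, for instance, it is the eliminator `Or.elim`; a minimal sketch:

```lean
-- Or.elim is the ∨-elimination rule: from p ∨ q, a proof of r assuming p,
-- and a proof of r assuming q, we may conclude r.
example (p q r : Prop) (hpq : p ∨ q) (hp : p → r) (hq : q → r) : r :=
  Or.elim hpq hp hq
```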

So what we needed to do to prove our statement was to consider two cases: $r_1 \geq r_2$ and $r_2 \geq r_1$.

But what does "without loss of generality" mean here? We make the observation that if we carried out a proof for the other case, the proof would look exactly the same! The only things that would change would be the positions of $r_1$ and $r_2$. It's not very interesting to see the exact same proof twice, so mathematicians invented some language, "without loss of generality", to say to the reader that, look, the other case will basically give you the same proof but with the names swapped, so we'll only do it once.

Divisors

You might recall that all of the numbers we discussed in the Division Theorem actually have names (which is why we use certain letters to name them): in $n = d \cdot q + r$, $n$ is the dividend, $d$ is the divisor, $q$ is the quotient, and $r$ is the remainder.

We are particularly interested in divisors. Why? Because the divisors of a number capture its multiplicative structure. One thing that is of particular interest is when certain numbers share divisors. This is one way that we can relate numbers with each other.

For instance, the numbers 8 and 16 feel more "similar" to each other than 8 and 20 do, but these feel more similar than even 8 and 21. If we examine the divisors of each of these numbers, we see that 8 and 16 share many (1, 2, 4, 8), while 8 and 20 share fewer (1, 2, 4), and 8 and 21 only share 1.
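We can check these shared divisors concretely with a small trial-division helper (the function name is my own):

```python
def divisors(n: int) -> set[int]:
    """The positive divisors of n, found by trial division."""
    return {d for d in range(1, n + 1) if n % d == 0}

# Set intersection gives the common divisors of each pair:
print(divisors(8) & divisors(16))  # {8, 1, 2, 4}
print(divisors(8) & divisors(20))  # {1, 2, 4}
print(divisors(8) & divisors(21))  # {1}
```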

Let $a,b,c$ be integers. If $c \mid a$ and $c \mid b$, then $c$ is a common divisor of $a$ and $b$. The integer $d$ is the greatest common divisor of $a$ and $b$ if $d$ is a common divisor of $a$ and $b$ and every common divisor of $a$ and $b$ divides $d$. Equivalently, $|c| \leq |d|$ for all common divisors $c$ of $a$ and $b$. The greatest common divisor of $a$ and $b$ is denoted $\gcd(a,b)$.

Computing the gcd

So we have this notion of the greatest common divisor, but how does one find or compute it? The obvious way to do this is to compute all divisors of $m$ and $n$, find the common divisors, and find the largest of those. The big problem with this approach is that factoring integers is a notoriously hard problem to solve efficiently. There are some heuristics that we can apply and some tricks that we can pull, but in general we do not have a fast algorithm for factoring numbers. This fact happens to be the backbone of many public-key cryptographic systems we use today.

However, if all we care about is the greatest common divisor and not any of the other divisors, then a theorem we will see shortly gets us there efficiently.

First, a bit of notation.

Let $n,d,q,r$ be integers that satisfy $n = d \cdot q + r$ and $0 \leq r \lt d$, as in the Division Theorem. We define the function $\mathop{\mathbf{mod}}$ by $n \mathop{\mathbf{mod}} d = r$.

You may be more familiar with $\mathop{\mathbf{mod}}$ as the modulus operator % from Python. It's a handy way to refer to the remainder without having to say "$r$ as in the Division Theorem" every time. We can similarly define $n \mathop{\mathbf{div}} d = q$, which is analogous to the Python operator // (hence the name of the function, $\operatorname{divmod}$).
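For example, Python's built-in divmod returns both the quotient and the remainder at once:

```python
# divmod returns the pair (n div d, n mod d) in one call.
n, d = 232, 72
q, r = divmod(n, d)
print(q, r)  # 3 16
assert n == d * q + r and 0 <= r < d

# Python floors the quotient toward negative infinity, so for d > 0 the
# remainder always lands in [0, d), matching the Division Theorem
# even when n is negative:
print(divmod(-17, 5))  # (-4, 3)
```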

If $d = \gcd(a,b)$, $b \neq 0$, and $r = a \mathop{\mathbf{mod}} b$, then $d = \gcd(b,r)$.

Let $a$ and $b \gt 0$ be arbitrary. Let $q$ be such that $a = qb + r$. Since $d = \gcd(a,b)$, we have $d \mid a$ and $d \mid b$. Then this means that $d \mid qb$. Putting this together, we have $d \mid (a - qb)$ and therefore $d \mid r$. So $d$ is also a common divisor of $b$ and $r$.

Now consider an arbitrary common divisor $c$ of $b$ and $r$. Since $c \mid b$ and $c \mid r$, we have $c \mid (qb + r)$, and $qb + r = a$, so $c$ is also a divisor of $a$. Since $c$ is then a common divisor of $a$ and $b$, we must have $c \leq d = \gcd(a,b)$. And since $c$ was arbitrary, every common divisor of $b$ and $r$ is at most $d$. Therefore, $\gcd(b,r) = d = \gcd(a,b)$.
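We can spot-check the lemma numerically with Python's built-in math.gcd; the asserted equality below is exactly the lemma's claim:

```python
from math import gcd

# Spot-check: gcd(a, b) == gcd(b, a mod b) for a few pairs with b > 0.
for a, b in [(232, 72), (100, 35), (81, 57)]:
    r = a % b
    assert gcd(a, b) == gcd(b, r)
    print(f"gcd({a}, {b}) = gcd({b}, {r}) = {gcd(b, r)}")
```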

This result gives us a way to compute the gcd: \[\gcd(a,b) = \begin{cases} a & \text{if $b = 0$,} \\ \gcd(b, a \mathop{\mathbf{mod}} b) & \text{otherwise,} \end{cases}\]

which becomes the following function fairly straightforwardly:

def euclid(a: int, b: int) -> int:
    """Compute gcd(a, b) by Euclid's algorithm."""
    if b == 0:
        return a
    else:
        return euclid(b, a % b)

This algorithm is called Euclid's algorithm, after the Greek mathematician Euclid (around 300 BC), who describes it in the Elements alongside much of the elementary geometry, algebra, and number theory that we learn in school. Euclid's algorithm is considered to be one of the oldest known algorithms.

Suppose we want to compute the gcd of 232 and 72. We apply Euclid's algorithm as follows: \begin{align*} \gcd(232,72) &= \gcd(72,16) \\ &= \gcd(16,8) \\ &= \gcd(8,0) \\ &= 8 \end{align*}
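The same chain of equalities can be printed by an iterative rephrasing of the function above (the loop version and its name are my own):

```python
def euclid_trace(a: int, b: int) -> int:
    """Iterative Euclid's algorithm that prints each step of the recursion."""
    while b != 0:
        print(f"gcd({a}, {b}) = gcd({b}, {a % b})")
        a, b = b, a % b
    return a

print(euclid_trace(232, 72))  # prints the three steps above, then 8
```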

Before talking about the complexity of Euclid's algorithm, we'll take some time to prove one more useful property of the gcd. From the Division Theorem, we get the following result.

For all integers $a$ and $b$, there exist integers $x$ and $y$ such that $\gcd(a,b) = a \cdot x + b \cdot y$.

This result is commonly referred to as Bézout's lemma or Bézout's identity, attributed to Étienne Bézout, a French mathematician of the mid-1700s. We'll continue to call it that for lack of a better snappy name, but like many other results you'll encounter, it was known well before Bézout.

Working backwards, we quickly find that the result appears in the work of Bachet de Méziriac, another French mathematician, around 1600. But in fact, we can go even further back: a constructive proof of this result dates to about 500 AD, via the Indian mathematician Āryabhaṭa, and the first mention of the result is due to Euclid in about 300 BC (whom we will be seeing a lot of).

Like the Division Theorem, there are constructive and non-constructive proofs of Bézout's lemma. The constructive proof is via an algorithm called the Extended Euclidean Algorithm, which can be obtained by extending the Euclidean algorithm from above. However, we'll take the opportunity to introduce the notion of non-constructive proof and proof by contradiction.
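As a preview, here is one standard way the Extended Euclidean Algorithm can be sketched in Python. This is my own rendering, not the course's official version; the key step is unwinding $\gcd(a,b) = \gcd(b, a \bmod b)$ to rewrite the coefficients on the way back up the recursion:

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and g == a*x + b*y."""
    if b == 0:
        return a, 1, 0  # gcd(a, 0) = a = a*1 + 0*0
    g, x, y = extended_euclid(b, a % b)
    # g = b*x + (a % b)*y = b*x + (a - (a//b)*b)*y = a*y + b*(x - (a//b)*y)
    return g, y, x - (a // b) * y

g, x, y = extended_euclid(232, 72)
print(g, x, y)  # 8 -4 13, and indeed 232*(-4) + 72*13 = 8
assert g == 232 * x + 72 * y
```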