We saw that the strong induction axiom can be written as \[P(0) \wedge \forall k (\forall j(j \leq k \rightarrow P(j)) \rightarrow P(k+1)) \rightarrow \forall n P(n).\] Sometimes you will see a version of strong induction that is a bit less strong than this, where the assumption is stated more modestly, like \[P(0) \wedge P(1) \wedge \forall k (P(k) \wedge P(k+1) \rightarrow P(k+2)) \rightarrow \forall n P(n),\] where the inductive step looks back only two values (which is why this form needs two base cases). This is also fine, but unnecessary, since strong induction covers it anyway.
But why does this work? One way to convince yourself is to see that strong induction is actually no different from ordinary induction. It's not hard to see that every proof by induction is also a proof by strong induction, but it takes a bit more reasoning to convince ourselves of the converse.
Of course, if we think back to our computations with recursive definitions and functions, we remember seeing a pattern: the computation for $f(n)$ always contained the computation for $f(n-1)$, which contained the computation for $f(n-2)$, and so on.
So in fact, all strong induction does is make explicit a fact that was already implicit: if $P(k)$ holds, it is because $P(k-1)$ held, which in turn held because $P(k-2)$ did, and so on down to $P(0)$, which we proved directly at the start. So in assuming $P(k)$, we may as well assume that everything from $P(0)$ to $P(k)$ is true.
This means we need to make a slight modification to our template for proof by induction in order to get a template for proof by strong induction, namely to the inductive hypothesis: instead of assuming only $P(k)$, we assume $P(j)$ for every $j \leq k$.
Let's go back to the proof of Binet's formula, which states that the $n$th Fibonacci number is \[f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5},\] where $\varphi = \frac{1+\sqrt 5}{2}$ and $\widetilde \varphi = \frac{1-\sqrt 5}{2}$.
We will show this by strong induction on $n$.
Base case. For $n = 0$, we have $f_0 = 0 = \frac{\varphi^0 - \widetilde \varphi^0}{\sqrt 5}$. For $n = 1$, we have \[ \frac{\varphi - \widetilde \varphi}{\sqrt 5} = \frac{\frac{1+\sqrt 5}{2} - \frac{1-\sqrt 5}{2}}{\sqrt 5} = \frac{\frac{2\sqrt 5}{2}}{\sqrt 5} = \frac{\sqrt 5}{\sqrt 5} = 1. \]
Inductive case. Let $k \geq 1$ and suppose that $f_j = \frac{\varphi^j - \widetilde \varphi^j}{\sqrt 5}$ for all $j$ with $j \leq k$ (we need $k \geq 1$ so that $f_{k-1}$ is defined, which is why we checked two base cases). Consider $f_{k+1}$. \begin{align*} f_{k+1} &= f_k + f_{k-1} & \text{Fibonacci numbers} \\ &= \frac{\varphi^k - \widetilde \varphi^k}{\sqrt 5} + \frac{\varphi^{k-1} - \widetilde \varphi^{k-1}}{\sqrt 5} &\text{inductive hypothesis} \\ &= \frac{\varphi^k + \varphi^{k-1} - (\widetilde \varphi^k + \widetilde \varphi^{k-1})}{\sqrt 5} & \text{grouping terms} \\ &= \frac{(1 + \varphi) \varphi^{k-1} - (1 + \widetilde \varphi) \widetilde \varphi^{k-1}}{\sqrt 5} & \text{factoring out $\varphi^{k-1}$ and $\widetilde \varphi^{k-1}$}\\ &= \frac{\varphi^2 \varphi^{k-1} - \widetilde \varphi^2 \widetilde \varphi^{k-1}}{\sqrt 5} &\text{since $\varphi^2 = \varphi + 1$ and $\widetilde \varphi^2 = \widetilde \varphi+1$}\\ &= \frac{\varphi^{k+1} - \widetilde \varphi^{k+1}}{\sqrt 5} \end{align*}
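Before moving on, we can sanity-check Binet's formula numerically. The sketch below (with our own helper names `fib` and `binet`) compares the closed form, evaluated in floating point, against the recurrence for small $n$:

```python
from math import sqrt

def fib(n):
    """The nth Fibonacci number via the recurrence f(0) = 0, f(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def binet(n):
    """Binet's formula, evaluated in floating point."""
    phi = (1 + sqrt(5)) / 2      # the golden ratio, written φ above
    psi = (1 - sqrt(5)) / 2      # its conjugate, written φ~ above
    return (phi ** n - psi ** n) / sqrt(5)

# The closed form agrees with the recurrence (up to rounding error).
for n in range(30):
    assert round(binet(n)) == fib(n)
```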
Using this property of the Fibonacci numbers, we can prove something interesting about the Euclidean algorithm. The following is a result attributed to Gabriel Lamé in 1844, which makes it one of the earliest examples of an analysis of an algorithm (though there is slightly earlier work on the same subject) as well as a nice application of the Fibonacci numbers.
Typically, we represent the running time of an algorithm on inputs of size $n$ as a function $T(n)$. For iterative algorithms, we might use $T(n)$ to count the number of operations performed. For recursive algorithms, this typically means that $T(n)$ becomes recursively defined. Since counting every operation exactly gets unwieldy, we often use $T(n)$ to count a particular operation of interest instead of all of them.
For example, when thinking about binary search, the operation we care most about is integer comparison. In other words, how many integer comparisons do we need to make in order to determine whether we locate our item in the list? In an ordinary search, we may have to compare our item against every single item, while in binary search, we perform far fewer comparisons.
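To make the comparison concrete, here is a rough sketch (the probe counters are our own bookkeeping, not from any library) of linear search versus binary search, each reporting how many probes of the list it made:

```python
def linear_search(xs, target):
    """Scan the list left to right; return (found, number of probes)."""
    probes = 0
    for x in xs:
        probes += 1
        if x == target:
            return True, probes
    return False, probes

def binary_search(xs, target):
    """Binary search on a sorted list; return (found, number of probes)."""
    lo, hi, probes = 0, len(xs) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1              # one three-way comparison per probe
        if xs[mid] == target:
            return True, probes
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes

xs = list(range(0, 2000, 2))     # 1000 sorted even numbers
print(linear_search(xs, 1998))   # (True, 1000)
print(binary_search(xs, 1998))   # (True, 10)
```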
For Euclid's algorithm, the operation of interest is division. On each recursive call, we must perform a division operation to acquire the remainder. The following theorem links the number of divisions performed by Euclid's algorithm together with the Fibonacci numbers.
Suppose $a_n \gt a_{n-1} \gt \cdots \gt a_1 \gt a_0 = 0$ is the sequence of numbers obtained from Euclid's algorithm. Then $a_i \geq f_i$ for all $i \leq n$.
What this means is that we have \begin{align*} \gcd(a_n, a_{n-1}) &= \gcd(a_{n-1}, a_{n-2}) \\ &= \gcd(a_{n-2}, a_{n-3}) \\ & \vdots \\ &= \gcd(a_{2}, a_{1}) \\ &= \gcd(a_{1}, a_{0}) \end{align*}
In other words, for two numbers $a$ and $b$ with $a \geq b$ (without loss of generality), so that $a_n = a$ and $a_{n-1} = b$, computing $\gcd(a,b)$ takes $n$ recursive calls.
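Since the $a_i$'s are exactly the remainders Euclid's algorithm produces, we can watch the sequence directly. In the sketch below (our own bookkeeping), running the algorithm on two consecutive Fibonacci numbers produces the Fibonacci numbers themselves, which is precisely the boundary case $a_i = f_i$ of the theorem:

```python
def remainder_sequence(a, b):
    """Run Euclid's algorithm on a > b, returning [a_n, a_{n-1}, ..., a_1, a_0]."""
    seq = [a, b]
    while b > 0:
        a, b = b, a % b          # one division step per iteration
        seq.append(b)
    return seq

print(remainder_sequence(55, 34))
# [55, 34, 21, 13, 8, 5, 3, 2, 1, 0], the Fibonacci numbers in reverse
```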
We will prove this by (strong) induction on $n$.
Base case. From the statement of the theorem, $a_0 = 0 = f_0$. Since $a_1 \gt a_0 = 0$, we have $a_1 \geq 1 = f_1$.
Inductive case. Let $k \in \mathbb N$ with $k \geq 1$. Assume that $a_j \geq f_j$ for all $j \leq k$; we show that $a_{k+1} \geq f_{k+1}$. Recall that the $a_i$'s are the sequence of integers we get from running Euclid's algorithm. This means that the algorithm will try to compute $\gcd(a_{k+1}, a_k)$, which results in a recursive call to compute $\gcd(a_k, a_{k-1})$. Recall that this amounts to dividing $a_{k+1}$ by $a_k$, leaving a remainder of $a_{k-1}$.
So by the Division Theorem, there exists an integer $q$ such that $a_{k+1} = q \cdot a_k + a_{k-1}$ with $a_{k-1} \lt a_k$. Since $a_{k+1} \gt a_k$, we must have $q \geq 1$, and so \begin{align*} a_{k+1} &= q \cdot a_k + a_{k-1} & \text{Division Theorem} \\ & \geq a_{k} + a_{k-1} & \text{since $q \cdot a_k \geq a_k$} \\ & \geq f_{k} + f_{k-1} & \text{inductive hypothesis} \\ & = f_{k+1} & \text{Fibonacci numbers} \end{align*}
Ultimately, the question we want to answer is how big the number of division steps $n$ is compared to $\log a$.
Let $a \gt b \geq 0$ be natural numbers such that the representation of $a$ requires $d$ decimal digits and the calculation of $\gcd(a,b)$ via Euclid's algorithm requires $k$ division steps. Then $k \leq 5 \cdot d$.
Since the decimal representation of $a$ requires $d$ digits, we have $a \lt 10^d$. If the computation of $\gcd(a,b)$ by Euclid's algorithm requires $k$ steps, then by Lamé's Theorem, we have $a \geq f_k$. Then we have \begin{align*} f_k &\lt 10^d &\text{since $f_k \leq a \lt 10^d$}\\ \frac{\varphi^k}{\sqrt 5} &\leq 10^d &\text{since $\left|\frac{\widetilde \varphi^k}{\sqrt 5}\right| \lt 1$ gives $\frac{\varphi^k}{\sqrt 5} \lt f_k + 1 \leq 10^d$}\\ k \log \varphi - \frac{\log 5}{2} &\leq d \log 10 &\text{apply $\log$ to both sides}\\ k &\leq \frac{d \log 10}{\log \varphi} + \frac{\log 5}{2 \log \varphi} &\text{isolate $k$}\\ k &\leq 4.785d + 1.673 \end{align*} For $d \geq 8$, this bound immediately gives $k \leq 5d$; the remaining small values of $d$ can be checked directly.
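As an empirical check on the bound (a sketch, with our own helper names), we can count division steps directly and compare against $5d$. Consecutive Fibonacci numbers are the worst case for Euclid's algorithm:

```python
def division_steps(a, b):
    """Count the division steps Euclid's algorithm performs on gcd(a, b)."""
    steps = 0
    while b > 0:
        a, b = b, a % b
        steps += 1
    return steps

# Build some Fibonacci numbers and check Lamé's bound on consecutive pairs.
fibs = [0, 1]
while len(fibs) < 40:
    fibs.append(fibs[-1] + fibs[-2])

for i in range(10, 40):
    a, b = fibs[i], fibs[i - 1]
    d = len(str(a))                       # decimal digits of a
    assert division_steps(a, b) <= 5 * d  # Lamé's bound holds
```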
This is a neat result that goes back to our brief discussion about the efficiency of trying to find a gcd. What Lamé's theorem tells us is that the number of operations Euclid's algorithm executes is roughly linear in the number of digits of the larger number $a$. Intuitively, every five divisions we perform knock out at least one decimal digit. The number of digits of $a$ is about $\log a$, so Euclid's algorithm scales roughly logarithmically with $a$.
If we compare this to the method of computing all the divisors, we would be executing approximately as many operations as there are numbers from 1 to $a$. This means that the naive algorithm scales roughly linearly with $a$. Comparing the two algorithms, there is an approximately exponential gap between them.
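A quick sketch makes the gap vivid (the operation counters here are our own bookkeeping): the naive method examines every candidate divisor up to the smaller number, while Euclid's algorithm needs only a few dozen divisions even in its worst case.

```python
def naive_gcd(a, b):
    """Try every candidate divisor; return (gcd, number of candidates examined)."""
    g, candidates = 1, 0
    for d in range(1, min(a, b) + 1):
        candidates += 1
        if a % d == 0 and b % d == 0:
            g = d
    return g, candidates

def euclid_gcd(a, b):
    """Euclid's algorithm; return (gcd, number of division steps)."""
    steps = 0
    while b > 0:
        a, b = b, a % b
        steps += 1
    return a, steps

# f_30 = 832040 and f_29 = 514229, Euclid's worst case.
print(naive_gcd(832040, 514229))   # (1, 514229): half a million checks
print(euclid_gcd(832040, 514229))  # (1, 28): twenty-eight divisions
```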
Here is an important definition.
An integer $p$ greater than 1 is called prime if the only positive divisors of $p$ are 1 and $p$. Otherwise, $p$ is called composite.
A large portion of number theory is devoted to the study of prime numbers and their properties. A big reason for this is that they are considered the building blocks for numbers. This idea is so important that the primary result concerning this is called the Fundamental Theorem of Arithmetic.
Every natural number $n \gt 1$ can be written uniquely, up to the order of the factors, as a product of primes.
Note here the condition that $n \gt 1$. Sometimes there is some controversy among non-mathematicians over the fact that 1 is not considered a prime number. There are a number of reasons one can give to justify this idea. The main justification is due to this theorem: if 1 were a prime number, then we could no longer say that every number has a unique prime factorization, since, for instance, $6 = 2 \cdot 3 = 1 \cdot 2 \cdot 3 = 1 \cdot 1 \cdot 2 \cdot 3$, and so on.
This idea extends to other so-called unique factorization domains, which are a kind of $\mathbb Z$-like structure in abstract algebra where every non-zero and non-unit element has a unique factorization. Here, a unit is any element $x$ for which one can find an element $y$ such that $xy = 1$. In $\mathbb N$, the only such element is 1, but in other structures, there are more elements with multiplicative inverses. One example you may be familiar with is the rational numbers, $\mathbb Q$, where every non-zero element is a unit.
In the end, no matter how you slice it, 1 is not a prime number. Do not fall into the trap.
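The Fundamental Theorem of Arithmetic is what makes a function like the following well-defined: whatever order trial division discovers factors in, every $n \gt 1$ has exactly one multiset of prime factors. This is only a sketch, with our own naming:

```python
def prime_factors(n):
    """Return the prime factorization of n > 1 as a nondecreasing list."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:        # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                    # whatever remains must itself be prime
        factors.append(n)
    return factors

print(prime_factors(360))   # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))    # [97], since 97 is prime
```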
In order to prove the Fundamental Theorem of Arithmetic, we need a few helpful properties of primes. The first of these is the following very useful theorem, a result from Euclid's Elements.
If $p$ is prime, and $a,b$ are integers such that $p \mid a \cdot b$, then $p \mid a$ or $p \mid b$.
If $p \mid a$, then we are done, so suppose $p \nmid a$. Since $p$ is prime, its only positive divisors are $1$ and $p$, and since $p \nmid a$, the only common divisor of $p$ and $a$ must be 1. So $\gcd(p,a) = 1$, and there exist integers $x$ and $y$ such that $xa + yp = 1$. Then multiplying both sides by $b$, we have $$b = xab + ypb.$$ Since $p \mid ab$, there exists an integer $n$ such that $ab = pn$. Then we have $$b = xab + ypb = xpn + ypb = p \cdot (xn+yb).$$ Since $xn + yb$ is an integer, by definition of divisibility, $p \mid b$.
Notice that this proof makes use of Bézout's lemma. Although Bézout's lemma is named for a mathematician who worked about 2000 years after Euclid, recall that the result itself does actually appear in the Elements.
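The Bézout coefficients used in the proof can be computed explicitly with the extended Euclidean algorithm. Here is a sketch (a standard construction, in our own style) together with a concrete run of the argument from the proof:

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # Unwind: gcd(a, b) = x*b + y*(a % b) = y*a + (x - (a // b)*y)*b
    return g, y, x - (a // b) * y

# Euclid's lemma in action: p = 7 divides a*b = 10 * 21, and p does not divide a.
p, a, b = 7, 10, 21
g, x, y = extended_gcd(a, p)
print(g, x, y)                      # 1 -2 3, since (-2)*10 + 3*7 = 1
# Multiplying xa + yp = 1 through by b shows that p divides b:
assert b == x * (a * b) + y * p * b
print(b % p)                        # 0, so indeed 7 | 21
```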
Euclid's lemma gives us the following extension, which one can prove simply by induction. We will make use of this later on.
If $p, p_1, \dots, p_n$ are prime and $p \mid p_1 \cdots p_n$, then $p = p_i$ for some $i$.