Suppose we insert our items from our running example into the tree in order from least to greatest: 15, 21, 28, 31, 48, 54, 56, 92, 93, 94. We get the following tree.
15
├──EMPTY
└──21
   ├──EMPTY
   └──28
      ├──EMPTY
      └──31
         ├──EMPTY
         └──48
            ├──EMPTY
            └──54
               ├──EMPTY
               └──56
                  ├──EMPTY
                  └──92
                     ├──EMPTY
                     └──93
                        ├──EMPTY
                        └──94
                           ├──EMPTY
                           └──EMPTY
Everything is on one side! Even worse, this is basically a list: there's exactly one path to travel down for the entire tree. So this doesn't really save us any time at all.
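To see just how degenerate this gets, here is a minimal sketch that inserts the items in sorted order and measures the resulting height. The Node class and helper functions are a hypothetical stand-in; your own tree representation may differ.

```python
# A minimal binary search tree node (illustrative, not the text's representation).
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def insert(tree, value):
    """Insert value into the BST rooted at tree, returning the new root."""
    if tree is None:
        return Node(value)
    elif value < tree.value:
        tree.left = insert(tree.left, value)
    else:
        tree.right = insert(tree.right, value)
    return tree

def height(tree):
    """Number of nodes on the longest root-to-leaf path."""
    if tree is None:
        return 0
    return 1 + max(height(tree.left), height(tree.right))

tree = None
for item in [15, 21, 28, 31, 48, 54, 56, 92, 93, 94]:
    tree = insert(tree, item)

print(height(tree))  # 10: one node per level, just like a list
```

The height equals the number of items, so traveling down this tree takes just as long as walking down a list.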
But if the worst-case scenario is when the tree is just a big long list, what's the ideal situation? In theory, we want the tree to be as short as possible. This suggests that an additional property we might want for our tree is that it should be balanced. By balanced, we mean that each side of the tree has roughly the same number of elements.
How tall is our tree when the tree is balanced? Let's think about this based on the number of nodes in our tree. If we want our tree to be balanced, we need both subtrees to have roughly half of the elements. Then each subtree needs roughly half of that side's elements. And so on and so forth.
In other words, we keep dividing the number of elements by 2 until we run out: each subtree handles half, then half of that, and so on. How many times can we halve a number before it reaches zero?
To build intuition, let's switch to a more familiar base. Suppose we repeatedly divide a number by 10, discarding remainders. How many times can we do this before we reach zero? Exactly as many times as the number has digits: 1000 takes four divisions, 10 takes two.
This is the meaning of the logarithm: it is a rough measure of the size of the representation of a number (i.e., its number of digits) in a given base. How fast does it grow? The logarithm is the inverse of the exponential function, and since the exponential grows extremely fast, the logarithm must grow extremely slowly.
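We can make this counting concrete with a short sketch (times_divisible is our own illustrative helper, not a standard function):

```python
import math

def times_divisible(n, base):
    """Count how many times n can be integer-divided by base before reaching 0."""
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count

# Dividing by 10 counts decimal digits; dividing by 2 tracks the base-2 logarithm.
print(times_divisible(1000, 10))  # 4, the number of digits in 1000
print(times_divisible(1000, 2))   # 10, and log2(1000) is about 9.97
```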
How can we keep our tree balanced? One strategy is to randomize the items before insertion. However, this only helps when we first populate the tree. Unless we can guarantee that later insertions remain evenly distributed, there's no way to guarantee that our tree stays balanced. And with our current definition, there is no way to guarantee this.
To guarantee balance, we would need additional restrictions on how values are stored in the tree. There is a class of data structures called self-balancing binary search trees, such as AVL trees and red-black trees. These trees maintain additional invariants that place provable bounds on the number of elements on each side (or on how different the two sides can be).
So far, we've been focusing on structural recursion. Functions use structural recursion when they follow the structural definition of the data they are applied to. This trick is what gives us those nice templates. Because the functions follow the definitions so closely, half the work is done for you by the time you have to put everything together.
However, there are some disadvantages to this. What if our problem doesn't follow the structure of the data? Or what if following the structure so closely makes our algorithm less efficient than it could be?
We can ask this question for some problems we've seen already. If we're looking for an element in a list, do we really need to search the entire list, front to back? Or if we're sorting, do we really need to compare every element to every other element in the list?
As we just saw, structural recursion has a tight link with for loops: for loops are structural too! They very closely follow the structure of the data that we're iterating over. But we'll also encounter problems that require us to break out of the structure that for loops provide.
For looping, we know that we can use while loops. But what about recursion?
Recursion that doesn't follow the recursive definition of its input is sometimes called generative recursion because it's recursion that "generates" subproblems. Like while loops, using generative recursion is trickier and less straightforward than using structural recursion. Since we no longer have the recursive structure of the input to guide us, we have to rely solely on the recursive structure of the problem. There's a very general template for this.
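As a preview of that shape, here is a hypothetical instance of generative recursion: binary search over a sorted list. Rather than following the list's head/tail structure, each call generates a smaller subproblem by cutting the search range in half. The function name and structure here are our own sketch, not from the text.

```python
def contains(lst, target):
    """Return True if target is in the sorted list lst, via binary search."""
    if lst == []:                    # trivial case: solve directly
        return False
    mid = len(lst) // 2
    if lst[mid] == target:
        return True
    elif target < lst[mid]:          # generate a smaller subproblem: left half
        return contains(lst[:mid], target)
    else:                            # or the right half
        return contains(lst[mid + 1:], target)

print(contains([15, 21, 28, 31, 48, 54, 92], 31))  # True
print(contains([15, 21, 28, 31, 48, 54, 92], 50))  # False
```

Notice the recursion follows the structure of the problem (a shrinking search range), not the structure of the list.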
Let's see some examples of this.
The analogy of the while loop is instructive here, because our motivation for non-structural recursion is much the same. But there's something else about Euclid's algorithm that you may not have noticed: it's recursive. Consider again the definition we saw:
\[ \gcd(a,b) = \begin{cases} a & \text{if $b = 0$,} \\
\gcd(b, a \bmod b) & \text{otherwise.}
\end{cases}\]
Wait a second—the definition of $\gcd$ is referring to $\gcd$! So this definition is recursive. However, if we inspect this further, we notice something else: the base and recursive cases don't match the base and recursive cases of the definition of the natural numbers! So this is an example of recursion that is not structural.
So if we think of $\gcd$ as a function, then we can write something like the following:
def gcd(a, b):
    """
    Produce the greatest common divisor of a and b.

    Input:
        a (int): nonnegative integer
        b (int): nonnegative integer, a >= b
    Output (int): gcd of a, b
    """
    if b == 0:
        return a
    else:
        return gcd(b, a % b)
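To see the recursion in action, we can trace a call and compare against Python's built-in math.gcd. The definition is repeated here in condensed form so the sketch is self-contained.

```python
import math

def gcd(a, b):
    # Euclid's algorithm, as defined above.
    return a if b == 0 else gcd(b, a % b)

# Trace: gcd(48, 18) -> gcd(18, 12) -> gcd(12, 6) -> gcd(6, 0) -> 6
print(gcd(48, 18))                      # 6
print(gcd(48, 18) == math.gcd(48, 18))  # True
```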
Our first sorting algorithm, insertion sort, relied on structural recursion. As a result, we were bound to the structure of the list: we had to look at every element of the list, which could mean comparing each element to every other element in the list. That is as many as $\binom{n}{2}$ comparisons, where $n$ is the size of the list. This is a well-known quantity: $\binom{n}{2} = \frac{n(n-1)}{2}$, which is quadratic, or $O(n^2)$.
What this tells us is that if we want to do something faster, we have to break out of the recursive structure of the list. How might we go about doing this?
One of the assumptions we had in the recursive case of insertion sort was that we had a sorted list after a recursive call and that we could easily find a spot for one element in this list.
But what if we generalize this: instead of having one sorted list, we have two? Then if there was an easy way to combine the two sorted lists, we could be in business.
The business of merging is pretty standard: we walk through both lists, repeatedly comparing their first elements and taking the smaller one. Here, I show the recursive version, but you can just as easily accomplish this with loops, since we're just going down the lists.
def merge(lst1, lst2):
    """
    Combines two sorted lists into one sorted list.

    Input:
        lst1 (list[int]): first sorted list
        lst2 (list[int]): second sorted list
    Output (list[int]): sorted list of all elements from lst1 and lst2
    """
    if lst1 == []:
        return lst2
    elif lst2 == []:
        return lst1
    else:
        first1, *rest1 = lst1
        first2, *rest2 = lst2
        if first1 < first2:
            return [first1] + merge(rest1, lst2)
        else:
            return [first2] + merge(lst1, rest2)
This idea leads us to a sorting algorithm first proposed by John von Neumann in 1945. If we have two sorted lists, we can merge them easily. But how do we turn our list into two sorted lists? By breaking them in half and recursively sorting them.
def mergesort(lst):
    """
    Sorts a list of integers.

    Input:
        lst (list[int]): list of integers
    Output (list[int]): sorted list of elements from lst
    """
    if len(lst) <= 1:
        return lst
    else:
        mid = len(lst) // 2
        left = mergesort(lst[:mid])
        right = mergesort(lst[mid:])
        return merge(left, right)
Notice that unlike the standard list recursion, we have two base cases: the empty list and the list containing one element. It's worth thinking about why this is necessary (hint: it has to do with our recursive case).
How is this any better than insertion sort? With insertion sort, we saw that we could end up comparing every pair of items in the list. So we need a way to guarantee that we avoid some of those comparisons.
Notice that we break our list in half every time before doing a recursive call. Intuitively, this breaking up of the search space in half each time makes it so that many of the comparisons between elements are eliminated. There's a more formal analysis that you'll see if you take Theory of Algorithms, but this leads us to $O(n \log n)$ time.
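To get a feel for the size of that gap, we can compare $\frac{n(n-1)}{2}$ against $n \log_2 n$ directly:

```python
import math

# Compare the quadratic comparison count against the linearithmic one.
for n in [10, 1_000, 1_000_000]:
    quadratic = n * (n - 1) // 2
    linearithmic = round(n * math.log2(n))
    print(f"n={n}: n(n-1)/2 = {quadratic}, n log2 n = {linearithmic}")
```

At a million elements, the quadratic count is roughly 500 billion comparisons, while $n \log_2 n$ is only about 20 million.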
In other words, insertion sort requires us to decompose the problem along the structure of the list, while mergesort breaks us out of that and skips some comparisons.