CMSC 14100 — Lecture 19

Last time, we were left with the following function definition for a function that computes the sum of the strings in a given list.


def strings_lengths(strings):
    """
    Given a list of strings, produces the sum of lengths of strings in 
    the list.
    
    Input:
        strings (list[str]): List of strings

    Output (int): sum of lengths of strings from strings
    """
    if strings == []:
        return 0
    else:
        first, *rest = strings
        return len(first) + strings_lengths(rest)

What does this actually look like in action? Well, suppose we try running this on the list ["shoe", "banana", "penguin"].

First, we notice we have a non-empty list. So we split it into the first item, "shoe" and the rest of the list, ["banana", "penguin"]. The length of "shoe" is 4 so the result of our function is 4 + strings_lengths(["banana", "penguin"]).
But what is strings_lengths(["banana", "penguin"])? Well, we have a non-empty list again, so we split it into the first item, "banana" and the rest of the list, ["penguin"]. The length of "banana" is 6, so this function call evaluates to 6 + strings_lengths(["penguin"]).
But what is strings_lengths(["penguin"])? Again, we have a non-empty list so we split it into the first item, "penguin" and the rest of the list, []. The length of "penguin" is 7, so the result here is 7 + strings_lengths([]).
But what is strings_lengths([])? This time, we have an empty list, so the result of this call is 0.
Now, we start putting things back together. We'll have something that conceptually looks like this:
```
        4 + (6 + (7 + 0))
        
```
which evalutes to 17.

Another way of seeing this is the following: \begin{align*} &\ \mathtt{strings\_lengths(["shoe", "banana", "penguin"])} \\ =&\ 4 + \mathtt{strings\_lengths(["banana", "penguin"])} \\ =&\ 4 + (6 +\mathtt{strings\_lengths(["penguin"])}) \\ =&\ 4 + (6 + (7 + \mathtt{strings\_lengths([])})) \\ =&\ 4 + (6 + (7 + 0)) \\ =&\ 17 \end{align*}

In other words, we can get the result by doing exactly what we did before, by simply evaluating the function and performing the appropriate substitutions at each step.

Accumulation and tail recursion

Now, there are a few things about recursion on lists you might be wondering. First: what's the point—we have for loops. And secondly, from what we just saw, it doesn't seem great that we have to wait until we reach the base case before beginning to evaluate the expression.

In fact, the issue of delayed evaluation is a practical issue. The fact that we have to wait until the base case before evaluation is problematic because every recursive call of our function creates a new stack frame. On real computers, this is a real cost and it's possible to run out of memory in this way.

It turns out that our two observations are related: we can try to solve the problem of delayed evaluation and in doing so we will see how we can eventually land on the idea of loops.

The main issue with evaluation right now is the order in which we do evaluation. Our expression above requires knowing what our recursive calls evaluate to before being able to evaluate the sum. To avoid this, we would want to compute our sum before the recursive call. But this means we can only compute a partial sum. Furthermore, we would have to provide this partial sum to our recursive call. What might this look like?


def strings_lengths(strings, sum_so_far):
    """
    Given a list of strings, produces the sum of lengths of strings in 
    the list.
    
    Input:
        strings (list[str]): List of strings
        sum_so_far (int): sum of the lengths of the strings seen so far

    Output (int): sum of lengths of strings from strings
    """
    if strings == []:
        return sum_so_far
    else:
        first, *rest = strings
        return strings_lengths(rest, len(first) + sum_so_far)

So to account for the fact that we do some computation before our recursive call, we have to create an explicit parameter to pass that partial result. If we run this, we see that we get the same answer. However, if we write out the steps, we get the following: \begin{align*} &\ \mathtt{strings\_lengths(["shoe", "banana", "penguin"], 0)} \\ =&\ \mathtt{strings\_lengths(["banana", "penguin], 4 + 0)} \\ =&\ \mathtt{strings\_lengths(["penguin"], (6 + (4 + 0))} \\ =&\ \mathtt{strings\_lengths([], (7 + (6 + (4 + 0)))} \\ =&\ (7 + (6 + (4 + 0))) \\ \end{align*}

Here, I've left the expression unevaluated so we can see how the terms are gathered, but the actual function would evaluate on each step: \begin{align*} &\ \mathtt{strings\_lengths(["shoe", "banana", "penguin"], 0)} \\ =&\ \mathtt{strings\_lengths(["banana", "penguin], 4 + 0)} \\ =&\ \mathtt{strings\_lengths(["penguin"], 6 + 4)} \\ =&\ \mathtt{strings\_lengths([], 7 + 10)} \\ =&\ 17 \\ \end{align*}

Here, we see that sum_so_far plays the role as the accumulator variable sum in our loop in the iterative version. And this is really the secret: for loops are really this version of recursion:

We have an accumulator variable that collects partial results
We always take off the "next" element and use it with the partial result and store the partial result
We remember the accumulator for the next "iteration" (i.e. the next step of recursion)
We even have to initialize the accumulator. In other words, we need to start it off with the appropriate base case. In the above, we explicitly provided this, though we could set an appropriate default value instead, like
```
def strings_lengths(strings, sum_so_far=0):
```
Then, we can simply call strings_lengths(lst) on some list.

Sorting

While the previous problem was something we could easily solve with a loop, there are some problems where thinking recursively can be a good idea. Let's consider a classical problem on lists: sorting a list of numbers.

Again, we try to view this problem through the lens of the our recursive definition.

If our list is empty, it's sorted. Done!
If our list isn't empty, then we have a value and the rest of the list. Recall that we apply our recursive call to the rest of the list, so suppose we do and now we have to think about where the value we have has to go.

So we have something that looks like


    def our_sort(lst):
        """
        Given a list of integers, produce a new list with the numbers 
        sorted from least to greatest.

        Input:
            lst (list[int]): a list of integers

        Output (list[int]): the list of numbers sorted in increasing order 
        """
        if lst == []:
            return []
        else:
            first, *rest = lst
            sorted_rest = our_sort(rest)
            ...
            sorted_lst = ???
            return sorted_lst

At this point, we recognize that we have the following pieces:

first, which is an integer, and
rest, which is the rest of the list. Since we apply our_sort to it, what we really have is our_sort(rest), which is the rest of our list, but sorted.

This makes our lives much easier: all we need to do to finish the job is to slide first into the right spot in our_sort(rest), which is sorted.

How do we do this? Let's take the opportunity to solve this problem recursively as well. Suppose we have to put an item item into a sorted list. What are our options?

If that list is empty, then it's easy, the result is just a list [item] that contains our one item that we had.
Otherwise, we can take our sorted list and split it up into a value first and a list rest again. Now we have two items. Let's compare them.
- If item < first, then I know item goes before first. So a sorted list would be [item] + lst.
- Otherwise, item has to go after first. This means we have to find a place for it in rest. But rest is already sorted! So we can repeat this process again.

This gives us the following.


def insert(item, lst):
    """
    Inserts a given item into a sorted list.

    Input:
        item (int): the item to be inserted
        lst (list[int]): a sorted list
    
    Output (list[int]): sorted list containing item and items from lst 
    """
    if lst == []:
        return [item]
    else:
        first, *rest = lst
        if item < first:
            return [item] + lst
        else:
            return [first] + insert(item, rest)

Suppose we tried running insert(5, [1,2,3,8,13,21]). What does this look like?

First, we have 5 and [1,2,3,8,13,21]. This list isn't empty, so we split it into a first element 1 and the rest of the list [2,3,8,13,21]. Is 5 less than 1? Yes! So we want to return [1] + insert(5, [2,3,8,13,21]).
But what is insert(5, [2,3,8,13,21])? Well, we notice we have a list that isn't empty, so we split it into 2 and [3,8,13,21]. We ask: is 5 less than 2? It is! So the result of this is [2] + insert(5, [3,8,13,21]).
But what is insert(5, [3,8,13,21])? Well, we notice we have a list that isn't empty, so we split it into 3 and [8,13,21]. Again, we ask: is 5 less than 3? Yep! So the result of this is [3] + insert(5, [8,13,21]).
But what is insert(5, [8,13,21])? Again, we notice we have a list that isn't empty, so we split it into 8 and [13,21]. So we ask: is 5 less than 8? Absolutely not! So the result of this is [5] + [8,13,21].
Now, we start getting results back and we end up putting something together that looks like
```
        [1] + ([2] + ([3] + ([5] + [8,13,21])))
        
```
which evalutes to [1,2,3,5,8,13,21].

With less prose, we have \begin{align*} &\ \mathtt{insert(5, [1,2,3,8,13,21])} \\ =&\ \mathtt{[1] + insert(5, [2,3,8,13,21])} \\ =&\ \mathtt{[1] + ([2] + insert(5, [3,8,13,21]))} \\ =&\ \mathtt{[1] + ([2] + ([3] + insert(5, [8,13,21])))} \\ =&\ \mathtt{[1] + ([2] + ([3] + ([5] + [8,13,21])))} \\ =&\ \mathtt{[1,2,3,5,8,13,21])))} \\ \end{align*}

Now, we can complete our strategy for our_sort. If we have a non-empty list, we recursively sort the rest of our list. Then we insert the first element of the list into the rest of the list that's been sorted!


def insertion_sort(lst):
    """
    Given a list of integers, produces a list of the same integers in 
    sorted order.

    Input:
        lst (list[int]): list of integers

    Output: sorted list of integers (list[int])
    """
    if lst == []:
        return []
    else:
        first, *rest = lst
        return insert(first, insertion_sort(rest))

Because the way this sorting algorithm works is by repeatedly inserting an element into a sorted list, it is called insertion sort.

Let's see what this looks like with a short list, like [3, 1, 5].

We split our non-empty list into 3 and [1,5]. The result of this call is insert(3, insertion_sort([1,5]))
- To evaluate insert(3, insertion_sort([1,5])), we need to know what insertion_sort([1,5]) is.
  - To evaluate insertion_sort([1,5]), we split our list into 1 and [5]. The result of this is insert(1, insertion_sort([5]).
    - To evaluate insert(1, insertion_sort([5])), we need to know what insertion_sort([5]) is.
      - To evaluate insertion_sort([5]), we split our list into 5 and []. The result of this is insert(5, insertion_sort([]).
        To evaluate insert(5, insertion_sort([])), we need to know what insertion_sort([]) is.
        To evaluate insertion_sort([]), we see that our list is empty. Then the result is just [].
        Then insert(5, []) is just [5].
        Then insertion_sort([5]) is just [5].
      Then insert(1, [5]) is [1,5].
    Then insertion_sort([1,5]) is [1,5].
  Then insert(3,[1,5]) is [1,3,5].
Then insertion_sort([3,1,5]) is [1,3,5].

More succinctly, we have \begin{align*} &\ \mathtt{insertion\_sort([3,1,5])} \\ =&\ \mathtt{insert(3, insertion\_sort([1,5]))} \\ =&\ \mathtt{insert(3, insert(1, insertion\_sort([5])))} \\ =&\ \mathtt{insert(3, insert(1, insert(5, insertion\_sort([]))))} \\ =&\ \mathtt{insert(3, insert(1, insert(5, [])))} \\ =&\ \mathtt{insert(3, insert(1, [5]))} \\ =&\ \mathtt{insert(3, [1,5])} \\ =&\ \mathtt{[1,3,5]} \end{align*}

Most treatments of insertion sort involve swapping list items around in a loop, but this version shows that it very naturally exploits the recursive structure of lists.