CMSC 14100 — Lecture 18

Recursion

Let's compute a factorial, a very standard programming exercise. What is a factorial? Informally, the factorial of a natural number $n$, denoted $n!$, is the product of all numbers from 1 to $n$. For example, $4! = 1 \times 2 \times 3 \times 4$.

Okay, but how do we turn that into a program? First, we should try to define it more clearly. So we might say, ah, that's easy, we can just write it out like this: \[n! = 1 \times 2 \times \cdots \times n = \prod_{i=1}^n i.\] Great! Now we can write the function.


def factorial(n):
    """
    Computes the factorial of a given natural number n.

    Input:
        n (int): a natural number (i.e. n >= 0)

    Output (int): n!
    """
    prod = 1
    for i in range(1,n+1):
        prod = prod * i
    return prod
    

But there's one problem with this: what's $0!$?

Let's consider the following more formal definition.

Let $n$ be a natural number. The factorial $n!$ is defined as follows:

Here, we have a definition that doesn't refer to loops, repetition, or counting. Instead, factorials are defined in terms of themselves. This is a recursive definition.

Let's examine this in more detail. First, notice the definition is broken down into two cases: 0 and all the other natural numbers. But what's more strange is that the second case seems to use the factorial itself in the definition. This seems a bit like circular reasoning.

Let's try unpacking this. What's $3!$? According to the definition, $3! = 3 \times 2!$. Okay, so what's $2!$? According to the definition, $2! = 2 \times 1!$. But what's $1!$? According to the definition, $1! = 1 \times 0!$. But what's $0!$? According to the definition, $0! = 1$. Putting it all together, we have the following. \begin{align*} 3! &= 3 \times 2! \\ &= 3 \times (2 \times 1!) \\ &= 3 \times (2 \times (1 \times 0!)) \\ &= 3 \times (2 \times (1 \times 1)) \end{align*}

So this definition is sound. But notice that it's important that we have both a case for zero and a case for everything greater than zero. If we miss the zero case, there are two possibilities.

For any recursive function $f$, we always have the following in the definition of $f$:

But why are we allowed to do this? Why does this work?

Recursion is structural

Let's take a step back. We very quickly took for granted that we only want to compute factorials for natural numbers and that natural numbers are just nonnegative integers. But in fact, natural numbers have a particular structure.

This is important because recursion is about structure. In this case, we'll see that the reason factorials work is because they exploit the structure of the natural numbers.

This definition of the natural numbers is due to Giuseppe Peano in 1889. Notice that recursion is both not that new (more than a hundred years old!) but also pretty new if we're talking about mathematical ideas (just over a hundred years old!).

A natural number is either

What's $S(n)$? Based just on the definition, we just know it's also a natural number. And this is enough for a structural definition. If we want to give meaning to these numbers, we might start thinking about what "makes sense". The notion of a successor suggests the idea of a number that "comes after" a number. So what's the number that "comes after" $n$? We would think of that as $n+1$. Luckily it turns out these two notions, the structural definition and the meaning that we give to numbers, align if we do a bit more work and reason about the structural definition.

Notice that just like recursive functions, we have a base case (zero) and a recursive case (successors). This recursive definition tells us how this data is built up. Now, look back at the definition for the factorial. We see some interesting parallels:

Now, we can try to implement this definition as a function in Python.


def factorial(n):
    """
    Computes the factorial of a given natural number n.

    Input:
        n (int): a natural number (i.e. n >= 0)

    Output (int): n!
    """
    assert n >= 0
    if n == 0:      # this is the base case, n = 0
        return 1
    else:           # this is the recursive case, n > 0
        return n * factorial(n-1)
    

Notice how the function looks almost identical to the definition. In some sense, we can consider the definition itself an algorithm. In this algorithm:

You may notice a parallel between recursive definitions and mathematical induction. This is because mathematical induction is a proof technique that relies on and exploits the recursive structure of the natural numbers.

Recursion on natural numbers

The fact that the natural numbers are a recursive structure tells us something neat about the natural numbers. Namely, recursion tells us how to count.

For example, we're familiar with powers $m^n$ as $m$ multiplied by itself $n$ times. But like the factorial, we can formally define this in the same way.

For two natural numbers $m$ and $n$, we define the $n$th power of $m$, denoted $m^n$ by

Notice again that the definition doesn't rely on us knowing how to count. All we need to know how to do is substitute.


def power(m, n):
    """
    Computes the nth power of m.

    Input:
        m (int): base, natural number
        n (int): exponent, natural number
    
    Output (int): m multiplied by itself n times
    """
    assert n >= 0
    if n == 0:      # this is the case case, n = 0
        return 1
    else:           # this is the recursive case, n > 0
        return m * power(m, n-1)
    

This is pretty straightforward, but notice again how the recursive case of the function corresponds to the recursive case of the definition. We can almost make a template out of this.


def recursive_function(n):
    assert n >= 0
    if n == 0:
        # Base case (zero)
    else:
        # Recursive case (successor of a natural number)
        # Do something with recursive_function(n-1)
    

Lists are also recursive

What other things are recursive? It turns out many basic structures in computer science are recursive! A big one is lists.

A list is either

If we think about lists in Python, then we have the following corresponding lists:

Note that you can get away with thinking about every list in Python like this, even if you have a list like ["item"]. Why? It's clear what the "first" item in this list is, but what's the rest of the list? It should be [], but what happens if we try lst[1:]? (The answer is slicing rules tell us that slice will result in [])

Diversion: sequence unpacking

Because this idea of splitting up a list into the first item and the rest of the list is so common, there's a way to unpack a list into these two things in a similar way to tuple unpacking. This makes use of the sequence unpacking operator, denoted *.

Recall that with tuple unpacking, we needed to know the size of the tuple and make sure our assignments matched the number of values in a tuple. With lists, we have no idea how long the "rest" of the list might be.

How do we use *? Suppose we have a list lst. Then we perform the assignment


    first, *rest = lst
    

This means we want to take the first element of lst and assign it to first and take the rest of the list and assign it to rest. For example,


    >>> first, *rest = [1, 2, 3, 5, 8, 13]
    >>> first
    1
    >>> rest
    [2, 3, 5, 8, 13]
    

Note that this works even for a list with one item, because, according to our definition, [] is a list!


    >>> first, *rest = [72]
    >>> first
    72
    >>> rest
    []
    

This then gives us a nice strategy for outlining what a recursive function on lists might look like.


def recursive_function(lst):
    if lst == []:
        # Base case (empty list)
    else:
        # Recursive case (value and list)
        first, *rest = lst
        # Do something with first and recursive_function(rest)
    

Writing a simple recursive function for a list

Let's try writing a recursive function on a relatively simple problem that we've solved using loops.

Given a list of strings, compute the sum of the lengths of strings in the list.

This is not so bad.


def strings_lengths(strings):
    """
    Given a list of strings, produces the sum of lengths of strings in 
    the list.

    Input:
        strings (list[str]): List of strings

    Output (int): sum of lengths of strings from strings
    """
    total = 0
    for string in strings:
        total = total + len(string)
    return total
    

Now, let's see how we might think about this recursively. Recall that the definition of a list is either empty or not. So we break down our problem into these two cases.

Putting this together, we get the following.


def strings_lengths(strings):
    """
    Given a list of strings, produces the sum of lengths of strings in 
    the list.
    
    Input:
        strings (list[str]): List of strings

    Output (int): sum of lengths of strings from strings
    """
    if strings == []:
        return 0
    else:
        first, *rest = strings
        return len(first) + strings_lengths(rest)
    

Next time, we'll see what this actually looks like in action.