CMSC 14100 — Lecture 19

Recursion

Let's compute a factorial, a very standard programming exercise. What is a factorial? Informally, the factorial of a natural number $n$, denoted $n!$, is the product of all numbers from 1 to $n$. For example, $4! = 1 \times 2 \times 3 \times 4$.

Okay, but how do we turn that into a program? First, we should try to define it more clearly. So we might say, ah, that's easy, we can just write it out like this: \[n! = 1 \times 2 \times \cdots \times n = \prod_{i=1}^n i.\] Great! Now we can write the function.


def factorial(n):
    """
    Computes the factorial of a given natural number n.

    Input:
        n (int): a natural number (i.e. n >= 0)

    Output: n! (int)
    """
    prod = 1
    for i in range(1,n+1):
        prod = prod * i
    return prod
    

But there's one problem with this: what's $0!$?

Let's consider the following more formal definition.

Let $n$ be a natural number. The factorial $n!$ is defined as follows:

Here, we have a definition that doesn't refer to loops, repetition, or counting. Instead, factorials are defined in terms of themselves. This is a recursive definition.

Let's examine this in more detail. First, notice the definition is broken down into two cases: 0 and all the other natural numbers. But what's more strange is that the second case seems to use the factorial itself in the definition. This seems a bit like circular reasoning.

Let's try unpacking this. What's $3!$? According to the definition, $3! = 3 \times 2!$. Okay, so what's $2!$? According to the definition, $2! = 2 \times 1!$. But what's $1!$? According to the definition, $1! = 1 \times 0!$. But what's $0!$? According to the definition, $0! = 1$. Putting it all together, we have the following. \begin{align*} 3! &= 3 \times 2! \\ &= 3 \times (2 \times 1!) \\ &= 3 \times (2 \times (1 \times 0!)) \\ &= 3 \times (2 \times (1 \times 1)) \end{align*}

So this definition is sound. But notice that it's important that we have both a case for zero and a case for everything greater than zero. If we miss the zero case, there are two possibilities.

For any recursive function $f$, we always have the following in the definition of $f$:

But why are we allowed to do this? Why does this work?

Recursion is structural

Let's take a step back. We very quickly took for granted that we only want to compute factorials for natural numbers and that natural numbers are just nonnegative integers. But in fact, natural numbers have a particular structure.

This is important because recursion is about structure. In this case, we'll see that the reason factorials work is because they exploit the structure of the natural numbers.

This definition of the natural numbers is due to Giuseppe Peano in 1889. Notice that recursion is both not that new (more than a hundred years old!) but also pretty new if we're talking about mathematical ideas (just over a hundred years old!).

A natural number is either

Here, you may realize that $S(n) = n+1$.

Notice that just like recursive functions, we have a base case (zero) and a recursive case (successors). This recursive definition tells us how this data is built up. Now, look back at the definition for the factorial. We see some interesting parallels:

Now, we can try to implement this definition as a function in Python.


def factorial(n):
    """
    Computes the factorial of a given natural number n.

    Input:
        n (int): a natural number (i.e. n >= 0)

    Output: n! (int)
    """
    assert n >= 0
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)
    

Notice how the function looks almost identical to the definition. In some sense, we can consider the definition itself an algorithm. In this algorithm:

You may notice a parallel between recursive definitions and mathematical induction. This is because mathematical induction is a proof technique that relies on and exploits the recursive structure of the natural numbers.

Recursion on natural numbers

The fact that the natural numbers are a recursive structure tells us something neat about the natural numbers. Namely, recursion tells us how to count.

For example, we're familiar with powers $m^n$ as $m$ multiplied by itself $n$ times. But like the factorial, we can formally define this in the same way.

For two natural numbers $m$ and $n$, we define the $n$th power of $m$, denoted $m^n$ by

Notice again that the definition doesn't rely on us knowing how to count. All we need to know how to do is substitute.


def power(m, n):
    """
    Computes the nth power of m.

    Input:
        m (int): base, natural number
        n (int): exponent, natural number
    
    Output: m multiplied by itself n times (int)
    """
    assert n >= 0
    if n == 0:
        return 1
    else:
        return m * power(m, n-1)
    

This is pretty straightforward, but notice again how the recursive case of the function corresponds to the recursive case of the definition. We can almost make a template out of this.


def recursive_function(n):
    assert n >= 0
    if n == 0:
        # Base case (zero)
    else:
        # Recursive case (successor of a natural number)
        # Do something with recursive_function(n-1)
    

Lists are also recursive

What other things are recursive? It turns out many basic structures in computer science are recursive! A big one is lists.

A list is either

If we think about lists in Python, then we have the following corresponding lists:

Note that you can get away with thinking about every list in Python like this, even if you have a list like ["item"]. Why? It's clear what the "first" item in this list is, but what's the rest of the list? It should be [], but what happens if we try lst[1:]? (The answer is slicing rules tell us that slice will result in [].

Diversion: sequence unpacking

Because this idea of splitting up a list into the first item and the rest of the list is so common, there's a way to unpack a list into these two things in a similar way to tuple unpacking. This makes use of the sequence unpacking operator, denoted *.

Recall that with tuple unpacking, we needed to know the size of the tuple and make sure our assignments matched the number of values in a tuple. With lists, we have no idea how long the "rest" of the list might be.

How do we use *? Suppose we have a list lst. Then we perform the assignment


    head, *tail = lst
    

This means we want to take the first element of lst and assign it to head and take the rest of the list and assign it to tail. For example,


    head, *tail = [1,2,3,5,8,13]
    

will assign 1 to head and [2,3,5,8,13] to tail. Note that this works even for a list with one item, because [] is a list!

This then gives us a nice strategy for outlining what a recursive function on lists might look like.


def recursive_function(lst):
    if lst == []:
        # Base case (empty list)
    else:
        # Recursive case (value and list)
        first, *rest = lst
        # Do something with first and recursive_function(rest)