CMSC 15100 — Lecture 2

A simple problem

Let's say that we want to do something very simple. Let's compute a factorial. What is a factorial? Informally, the factorial of a number $n$, denoted $n!$, is the product of all numbers from 1 to $n$. Okay, but how do we turn that into a program? First, we should try to define it more clearly. So we might say, ah, that's easy, we can just write it out like this: \[n! = 1 \times 2 \times \cdots \times n.\] Great! But there are some issues with this. For instance,

Is it really okay to just go $\cdots$? How do we know what's really going on there?
How would we describe what goes on inside the $\cdots$?
What about $n = 0$? Or $n = \frac 3 7$? Or how about $n = -23$?

One thing we might wonder is whether the definition above is really sufficient for our purposes. After all, the ellipses there are doing a lot of work. We might want a less handwavy definition. Consider the following more formal definition.

Let $n$ be a natural number. The factorial $n!$ is defined as follows:

If $n = 0$, then $n! = 1$.
If $n \gt 0$, then $n! = n \times (n-1)!$.

There is a lot going on here, but let's try to break it down. First, we assert that $n$ is not just any number but a specific kind of number: a natural number. You may have encountered natural numbers before in mathematics. An informal definition of the natural numbers is just the set of nonnegative integers (in this class, we believe that $0$ is a natural number). So here, we have already taken the step of restricting the possible input.

More intriguing is the definition itself. For one thing, the definition is broken down into two cases: 0 and all the other natural numbers. What's even more strange is that the second case seems to use the factorial itself in the definition. This seems a bit like circular reasoning. However, if try and unpack the definition, we'll discover that we get exactly what we wanted.

Let's try this. What's $3!$? According to the definition, $3! = 3 \times 2!$. Okay, so what's $2!$? According to the definition, $2! = 2 \times 1!$. But what's $1!$? According to the definition, $1! = 1 \times 0!$. But what's $0!$? According to the definition, $0! = 1$. Putting it all together, we have the following. \begin{align*} 3! &= 3 \times 2! \\ &= 3 \times (2 \times 1!) \\ &= 3 \times (2 \times (1 \times 0!)) \\ &= 3 \times (2 \times (1 \times 1)) \end{align*}

Note that here, we've already made use of a very sophisticated mathematical idea: substitution. But why are we allowed to do this? How do we make use of this? What exactly are we doing here?

Natural numbers

Let's take a step back. We very quickly took for granted that we only want to compute factorials for natural numbers and that natural numbers are just nonnegative integers. But what is an integer? What is a number? We've been fine with these concepts because we intuitively understand them to represent amounts of stuff, but this is not really a formal definition.

But why do we need formal definitions? In mathematics, these are important for really nailing down the properties of things that we want to study. Of course, that's a very ivory tower kind of ideal. More recently, though, we started making use of some machines that really need these things nailed down: computers. And although we're trying, we're not at the point yet where we can rely on a computer's intuition for certain tasks.

So what does a formal definition of the natural numbers look like? Such a definition can't rely on our intuition from childhood, so it will look quite different. The most famous of these come from the Italian mathematician Giuseppe Peano in the late 19th Century.

The set of natural numbers, denoted $\mathbb N$, is defined as follows:

$z$ is a natural number, and
if $m$ is a natural number, then $\operatorname{succ}(m)$ is a natural number.

There are a few things we need to examine about this definition. First, we have an element that we've just called $z$. This happens to be the thing that we call zero or 0. Here I've made a deliberate choice not to use 0 to make it clear there's nothing particularly special about this element. $\operatorname{succ}$ is a function called the successor function.

It so happens that all of the numbers we already know and love can be described using this definition. One or 1 is $\operatorname{succ}(z)$. Two, or 2, is $\operatorname{succ}(\operatorname{succ}(z))$. But notice that we never actually defined these other natural numbers: the only natural number we explicitly defined was our zero. Every other number is expressed as the successor of another number.

Again, this seems like a strange definition at first, but notice how similar it is to our definition of the factorial from before. Very spooky, but this is not a coincidence! We can play the same game that we did with the factorial.

Is $3$ a natural number? Well, here, we're using $3$ as a shorthand for \[\operatorname{succ}(\operatorname{succ}(\operatorname{succ}(z))).\] So how do we know if $\operatorname{succ}(\operatorname{succ}(\operatorname{succ}(z)))$ is a natural number? Well, we have to figure out if it's the successor of a natural number, or if the argument to $\operatorname{succ}$ is also a natural number. So we look at $\operatorname{succ}(\operatorname{succ}(z))$ and see if that's a natural number. Again, this depends on the argument to $\operatorname{succ}$, so we check to see if $\operatorname{succ}(z)$ is a natural number. We go through the same argument and check if $z$ is a natural number. It is! So we can conclude that $3$ was a natural number after all.

This definition for the natural numbers (and the definition for factorial from above) is an example of a recursive or inductive definition. These are definitions that are defined in terms of itself. But how can this work out? The key lies in the two parts of the definition.

The first part of the definition is called the base case. The base case defines the most basic element of the definition. Often, one can think of this as kind of the "smallest" or "least" case. The second part of the definition is called the recursive or inductive case. This is the part of the definition that refers to itself.

We've implicitly made use of a number of ideas that we'll be fleshing out in the coming weeks.

Values. The "things" we're working with.
Types. The kinds of "things" we're working with.
Structures. Sometimes the "things" we're working with have a particular form.
Functions. How we perform computation. We apply functions, which evaluate to values, and substitute them into expressions that we're evaluating as appropriate.
Recursion. We can cleverly define a definition that refers to itself. This is related to the structure of our data.

Values and Types

Let's start with values. A value is just a thing. This is not a very formal definition. If we take the analogy of language, then values are the nouns. Values are the stuff that we want to do things with. There are all sorts of things we might be interested in. Intuitively, we understand that these things are not all the "same" kind of thing. For example, numbers are very different from, say, letters.

This naturally leads us to the idea of types. Types tell us what kind of thing a particular "thing" is. Every value has a type associated with it. Types allow us to discriminate what kinds of things we can do with certain things. For example, the notion of addition has an obvious meaning for things that are numbers.

But what if we try to add two words together? The way that we add two numbers together wouldn't make sense. However, one can envision a way for this to make sense. For example, one can think of adding two words by putting them one right after the other. But we would still need a way to say that if we have two numbers, we add them together like this, and if we have two words, we add them together like that. Or sometimes we can decide that addition doesn't really make any sense at all in the context of certain types!

Racket has a number of basic types that are predefined. We will see that we can define more types later. For now, we have the following types that we should become familiar with.

Natural. This is the type for natural numbers $\mathbb N$, which are non-negative integers. Note this description carefully: in this course (and in most other CS courses), 0 is a natural number. Even though we went through this whole thing about the structure of the natural numbers just now, in Racket, naturals are really just non-negative integers.
Integer. This is the type for integers, $\mathbb Z$.
Exact-Rational. This is the type for rational numbers $\mathbb Q$. This is an interesting type. You may expect that trying to represent something like $\frac 1 3$ will produce a number like 0.333333333, but Racket will actually give you the exact representation of the rational number $\frac 1 3$.
Real. This is the type for real numbers $\mathbb R$. One of the things that you will discover is that computers are incapable of representing real numbers. Because computers have only a finite amount of memory, at best, we can ask them to represent approximations of real numbers. Racket calls these inexact numbers (as opposed to exact numbers like Exact-Rational). So you should treat Reals with care, especially if you require a certain degree of accuracy.
Number. This is the type for all numbers, including complex numbers, $\mathbb C$. One thing to note is that some of these types are subtypes of other types. For instance, it's not hard to see how every Natural is an Integer.

Racket also has types that are not numbers.

String. This is the type for textual data. In the olden days, this kind of thing was usually restricted to the kind of symbols you'd find on your keyboard. However, most modern programming languages nowadays are able to use Unicode in strings, meaning that you can put almost any kind of letter or symbol from any human language in a string.
Boolean. This is the type for the Boolean logic values, true (true or #t) and false (false or #f).
Symbol. Symbols are a type that allows you to create a "symbol" for anything you'd like.
Image. Images are graphical data. Images are not really a built-in type in Racket, but are an extension for How to Design Programs.

Expressions

We've dealt with the nouns of our language, so now let's think about how to put them to use. We would like to do and say more complicated things, so we have to figure out what we're able to say. We can think of this as learning the grammar of our new language.

Just as we take sentences to be a complete thought, the basic thing that we can "say" is an expression. Everything that we "say" is going to be an expression of some kind, whether it is a single expression or a combination of expressions. And along with expressions, we will need some verbs to go along with our nouns. These actions that we want to perform with our things are functions.

This is not unlike math, where we have expressions like $3+4$ or $2^5$ or $x^2 + x + 1$. Let's dissect this. In our expressions, we have values (things) and we are doing something to those things, either addition or exponentation. We call these operations because they seem very basic and fundamental, but they're really functions.

These two things, values and functions, are the basic building blocks for expressions in Racket as well. Expressions will involve applying functions to one or more values. The result of such an application will be another value, just as the application of the functions in the expressions above will produce values.

However, one problem with mathematical expressions is that they were defined by and intended for humans to communicate with. This is usually not a huge problem, but once in a while, you run into things on Facebook that ask you what the correct answer to $6 + 8 \div 2$ is.

Many people will interpret this from left to right, adding $6$ and $8$ first to get $14$ and then dividing by $2$ to get $7$. These people will get accosted by others in comments who say "You fool! Division comes before addition in the order of operations!" and arrive at an answer of $10$. However, a pedantic mathematician will sit back and say "This expression is ambiguous." Then the remedy is clear: we just have to make the expression unambiguous and to do that, all one needs to do is to apply parentheses liberally.

This seems like a very laborious way to get to saying that every expression must be in parentheses, but this is actually still not enough for us to really define the structure of an expression. You see, in mathematics, we have all sorts of different constructions for expressions:

sometimes an operation is between its operands, $3 + 4$
sometimes an operation is after its operand, $x^2$
sometimes an operation is before its operand, $\log x$

In Racket, we dispense with all of this ambiguity. Every expression begins with a left parenthesis ( and ends with a right parenthesis ). The first thing after the left parenthesis will always be the name of the function to be applied, followed by expressions to which the function should be applied.

This means that the expression $3 + 4$ becomes (+ 3 4), $x^2$ becomes (expt x 2), and $\log x$ becomes (log x). Generally, we can say that all expressions will look like the following


    (function-name arg1 arg2 ... argn)

For functions that we already write in this order, like $f(x)$, this is not such a huge change, since it involves moving the name of the function inside the parentheses, like (f x). But it sure looks weird for those functions that we typically write infix, like $24 - 13$. Unfortunately, this is something that you'll just have to get used to.

The decision to unify all expressions by writing the function first seems arbitrary, but there are good reasons to do so. This idea is not new: it is called prefix notation, since the function prefixes the rest of the expression (this is in contrast with infix notation, where the function is somewhere within the expression, and postfix notation, where the function is at the end of the expression).

Historically, this is also called Polish notation, called such for the Polish logician Jan Łukasiewicz who defined it in the 1920s. One of the neat properties of prefix notation is that it doesn't actually require parentheses: every expression written in prefix notation is unambiguous (you can try this for yourself). This happens to make it very convenient for computers to read, since they are forced to read things from left to right.