Because lists can be of arbitrary length, we often need to iterate over the elements of a list to work with it. We saw how to do this with for loops.
for item in lst:
    print(item)
A common temptation is to iterate over the indices of the list instead. For example, suppose you wanted the index for something other than accessing a particular element. A classical way to do this, inherited from other programming languages, is to loop over the integers from 0 up to (but not including) the length of the list.
for i in range(len(lst)):
    print(i, lst[i])
This is generally discouraged in Python, especially if all you're doing is accessing each element. However, if you need to work specifically with the index, the preferred method is to use the function enumerate. This function enumerates the items of a sequence and provides both the item and the index.
for i, item in enumerate(lst):
    print(i, item)
It's worth discussing what exactly enumerate does. Simply put, this function "numbers off" each item in the list. Lists are not the only type that holds multiple values, as we will see, and enumerate can number off the items of those types too.
Since lists are ordered, enumerate is guaranteed to enumerate the list in order, and so the numbering happens to align with the index of the items in the list.
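As a small sketch of this point, the snippet below numbers off the characters of a string, which is another ordered type. It also uses the optional start argument of enumerate to begin counting at 1. The names word and labels are made up for illustration.

```python
word = "tuple"  # illustrative value, not from the text above

# A string is ordered too, so the numbering matches each character's index.
for i, ch in enumerate(word):
    assert word[i] == ch
    print(i, ch)

# With start=1 the numbers no longer line up with the indices,
# which is handy for human-friendly counting.
labels = []
for n, ch in enumerate(word, start=1):
    labels.append(str(n) + ":" + ch)
print(labels)
```

With the default start of 0 the numbering and the indices coincide, which is the situation described above for lists.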
However, looking at this, you might wonder why the for loop with enumerate has two loop variables instead of one. That's a bit strange. The answer is that enumerate produces tuples.
Tuples are another data structure for representing a collection or sequence of values. Tuples are denoted by parentheses instead of square brackets.
t = (2, 5, "bread", 24.34)
Like lists, tuples can contain any number of items of any type. The main difference between tuples and lists is that tuples are immutable. That is, tuples cannot be changed in the same way lists can and we can treat them essentially like values. One consequence of this is that tuples can be thought of as having their sizes fixed once they are created.
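To make immutability concrete, here is a minimal sketch that tries the same item assignment on a list and on a tuple. The flag name changed is only for illustration.

```python
t = (2, 5, "bread", 24.34)
lst = [2, 5, "bread", 24.34]

lst[2] = "rice"        # fine: lists can be changed in place

try:
    t[2] = "rice"      # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

print(changed)  # False
print(t)        # the tuple is exactly as it was created
```

Reading from a tuple with t[2] works just as it does for a list; only modification is ruled out.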
We can make tuples of any size/dimension we like, including 0 or 1.
empty = ()
one_item = (3,)
Notice the quirk in the syntax for defining a tuple of one item. We require the trailing comma because (3) would be interpreted as an expression (recall that parentheses are used to group expressions).
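A quick way to convince yourself of this is to ask Python for the types. The variable names below are illustrative.

```python
not_a_tuple = (3)     # just the integer 3 inside grouping parentheses
one_item = (3,)       # a tuple holding one item
also_one_item = 3,    # the comma is what makes the tuple

print(type(not_a_tuple).__name__)  # int
print(type(one_item).__name__)     # tuple
print(len(one_item))               # 1
```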
There is a mathematical interpretation for tuples: while lists are an unbounded structure and can be of arbitrary length, a tuple can be thought of as an element of a Cartesian product of sets. Such a product has a fixed number of components and cannot be unbounded.
For example, a tuple of two floats representing a point (x,y) is the same as an element $(x,y) \in \mathbb R \times \mathbb R$.
(Here's a fun question to ponder: if tuples are elements of a Cartesian product, what is the mathematical definition of a list? Does one even exist? Hint: yes.)
The fact that tuples are of fixed size has some advantages. One of the most used is the ability to pack and unpack tuples.
t = 72, "south" # packing
count, direction = t # unpacking
x, y = -3.1, 9.3 # both at once
In the first case above, one can "pack" multiple items into a single assignment. In practice, the result is the same whether or not you write the parentheses around the items.
The second case is more interesting: we take a tuple and perform the corresponding assignments to names on the left hand side. This requires knowing that our tuple is of the right size and holds the right values. But this is convenient because we don't have to do any fishing via indices.
The third case above is a special case of doing both packing and unpacking at once. You can think of it as matching. One example of unpacking we saw already was with enumerate, when it seemed like we had two loop variables. Here's another example.
triples = [("a", 1, 5.5), ("b", 2, 6.7), ("d", 4, 7.8)]
for key, pos, val in triples:
    print(key, pos, val)
But what if we only want to iterate over the first two components of the tuple? Or maybe the outer two? The limitation that we must have the same number of components is getting in the way. Luckily, we can use _ as a placeholder that will ignore the corresponding component.
for key, _, val in triples:
    print(key, val)
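The sketch below illustrates two related points. First, unpacking with the wrong number of names fails with a ValueError. Second, the unpacking pattern can be nested, which is useful when enumerate pairs an index with an item that is itself a tuple. The names outcome and seen are made up for this example.

```python
triples = [("a", 1, 5.5), ("b", 2, 6.7), ("d", 4, 7.8)]

# Unpacking needs exactly as many names as there are components.
try:
    key, pos = triples[0]      # three components, only two names
    outcome = "ok"
except ValueError:
    outcome = "wrong number of names"
print(outcome)

# enumerate yields (index, item) pairs. Each item here is a triple,
# so the pattern on the left is nested to match that shape.
seen = []
for i, (key, _, val) in enumerate(triples):
    seen.append((i, key, val))
print(seen)
```

Note that _ is an ordinary variable name used by convention for values we intend to ignore.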
A final note: A common temptation is to use lists for all compound data. However, it is worth thinking through whether this really makes sense. For example, if we want to represent vectors in $\mathbb R^3$, we know that all of our data will always have three components. We don't need many of the features of Python lists.
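As one possible illustration, vectors in $\mathbb R^3$ can be written as 3-tuples of floats. The function dot below is a hypothetical helper, not something defined earlier; it relies on unpacking, which works precisely because the size is known to be three.

```python
def dot(u, v):
    """Dot product of two vectors in R^3, each given as a 3-tuple."""
    ux, uy, uz = u    # unpacking fails loudly if u does not have 3 components
    vx, vy, vz = v
    return ux * vx + uy * vy + uz * vz

e1 = (1.0, 0.0, 0.0)
w = (2.0, -3.0, 0.5)
print(dot(e1, w))  # 2.0
```

Passing a vector with the wrong number of components raises an error at the unpacking step rather than silently producing a wrong answer.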
while loops
Speaking of iteration, let's return to control structures for a bit and discuss the other looping construct: while loops. These are looping statements that repeat based on a condition, expressed as a boolean expression. Here's a template:
while <boolean expression>:
    <statements>
One can read such a loop as "while the condition is true, do the following...".
Suppose we wanted to do something simple, like count up from 0 to N - 1.
i = 0
N = 10
while i < N:
    print(i)
    i = i + 1
While for loops manage advancing each iteration for you, the condition for a while loop must be managed more carefully. As a result, while loops are more complicated to reason about, and it is easy for an incorrect while loop to run indefinitely, entering an infinite loop. The loop below, for example, never updates i, so its condition never becomes false.
i = 0
N = 10
while i < N:
    print(i)
Notice that the correct loop above is equivalent to the following.
N = 10
for i in range(N):
    print(i)
It's not too hard to see that while loops are more general: it is always possible to rewrite a for loop as a while loop, but the converse is not true.
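As a sketch of the first half of that claim, here is a for loop over a list next to a while loop that does the same work by managing the position itself. The list contents and the names from_for and from_while are made up for illustration.

```python
lst = ["a", "b", "c"]

# for version: the loop advances through the list for us
from_for = []
for item in lst:
    from_for.append(item)

# while version: we track the position and advance it ourselves
from_while = []
i = 0
while i < len(lst):
    from_while.append(lst[i])
    i = i + 1

print(from_for == from_while)  # True
```

The while version needs three pieces (initialize, test, advance) that the for loop handles implicitly, and forgetting any one of them is a common source of bugs.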
There is a clear advantage to using for when you want to perform actions over a sequence of values. On the other hand, if repetition is required in a situation that doesn't involve iterating over a sequence, then we will need to use a while loop.
For example, suppose we want to simulate a series of coin flips until we get $n$ heads in a row (the code below calls this number target). This is not a natural fit for a for loop, because we don't know ahead of time how many flips we will need. Instead, this is a loop that depends on a condition.
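import random
flips = 0      # total number of flips so far
target = 5     # how many heads in a row we are waiting for
heads = 0      # length of the current run of heads
while heads < target:
    flip = random.randint(0, 1)   # 0 counts as heads, 1 as tails
    if flip == 0:
        heads = heads + 1
    else:
        heads = 0                 # a tail resets the run
    flips = flips + 1
print("flips:", flips, "heads:", heads)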
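One way to experiment with this loop is to wrap it in a function and pass in a seeded random generator so that runs are repeatable. This is only a sketch: the function name flips_until_streak, the seed 141, and the target of 3 are all assumptions chosen for illustration.

```python
import random

def flips_until_streak(target, rng):
    """Flip until `target` heads in a row occur; return the number of flips."""
    flips = 0
    heads = 0
    while heads < target:
        if rng.randint(0, 1) == 0:    # 0 counts as heads, as in the loop above
            heads = heads + 1
        else:
            heads = 0                 # a tail resets the run
        flips = flips + 1
    return flips

rng = random.Random(141)              # arbitrary seed for repeatability
results = []
for _ in range(5):
    results.append(flips_until_streak(3, rng))
print(results)
```

Each result is at least the target, since a run of that many heads needs at least that many flips, but there is no upper bound known in advance.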
Typically, you will see and write more for loops simply because a lot of work is structural: you want to proceed in order along a structure, and for loops are very convenient for doing that. However, when you need to break out of this structure (for example, if you want items from a list but not necessarily in the prescribed order), while loops become necessary.