We often think of lists as a sequence of data, but the ability to put lists inside of lists gives us a way to represent arbitrarily complex structures. For instance, we can represent a matrix like, \[\begin{bmatrix} 0 & 4 & 3 & 4 \\ 3 & 1 & 5 & 0 \\ 1 & 0 & 4 & 6 \end{bmatrix},\] by using a list of lists:
>>> m = [[0, 4, 3, 4], [3, 1, 5, 0], [1, 0, 4, 6]]
>>> for i, x in enumerate(m):
... print(f"m[{i}]:", x)
m[0]: [0, 4, 3, 4]
m[1]: [3, 1, 5, 0]
m[2]: [1, 0, 4, 6]
Notice that the result of m[i] is a list. This means that in order to access a particular element inside one of the lists, we have to refer to an index for that list.
>>> m[1][2]
5
>>> m[2][0]
1
Think of it in the following way. The element m[1] is a list. Evaluating this produces a reference to this list. Then we can refer to an index in this list.
>>> type(m[1])
list
>>> l = m[1]
>>> l[2]
5
>>> (m[1])[2]
5
How should we navigate such a structure? If we use a loop on the list, then we will obtain one of the inner lists on each iteration. To process one of the "inside" lists, we need another loop.
def find_zeros(mat):
"""
Produce a list of locations of entries in the given matrix that
are 0.
Input:
mat (list[list]): matrix as list of list of numbers
Output (list[tuple[int,int]]): list of locations that have 0
"""
zeros = []
for i, row in enumerate(mat):
for j, val in enumerate(row):
if val == 0:
zeros.append((i,j))
return zeros
The way to think about this process is:
>>> zs = find_zeros(m)
>>> print(zs)
[(0, 0), (1, 3), (2, 1)]
>>> for i, j in zs:
... print(f"m at {i}, {j}:", m[i][j])
m at 0, 0: 0
m at 1, 3: 0
m at 2, 1: 0
while loopsSpeaking of iteration and complex structures, let's return to control structures for a bit and discuss the other looping construct: while loops. A while loop is a looping statement that repeats execution of a block of statements based on a condition, expressed as a boolean expression. Here's a template:
while <boolean expression>:
<statements>
One can read such a loop as "while the condition is true, do the following...".
Suppose we wanted to do something simple, like count backwards.
>>> N = 10
>>> i = N
>>> while i > 0:
... print(i)
... i = i - 1
10
9
8
7
6
5
4
3
2
1
While for loops manage advancing each iteration for you, the condition for a while loop must be managed more carefully. As a result, while loops are much more complicated to reason about. For these reasons, it is easy for an incorrect while loop to run indefinitely, entering an infinite loop.
>>> N = 10
>>> i = N
>>> while i > 0:
... print(i)
... # Did you forget something?
10
10
10
10
10
10
10
^C
Here, we see a very common error: forgetting to update variables that the loop condition checks for. Technically "forgetting" is not always the cause of this. Another cause is formulating a solution that may not update the variable in the first place.
>>> N = 10
>>> i = N
>>> while i > 0:
... print(i)
... i = i + 1
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
^C
Here's another common error: updating the loop condition in not quite the right way. In this example, we may reflexively increase i when we should have decreased it.
>>> N = 10
>>> i = N
>>> while i <= N:
... print(i)
... i = i - 1
10
9
8
7
6
5
4
3
2
1
0
-1
-2
-3
-4
-5
-6
-7
-8
^C
One last one: expressing the loop condition incorrectly. In this case, since we think of i as counting down from N, we may frame the loop condition in these terms. But we did not really specify in our problem statement that we wanted to stop at 0.
Notice that the correct loop above is equivalent to the following.
N = 10
for i in range(N):
print(N - i)
As we can see, whenever we want to iterate through a structure linearly, we are better served by for loops. It is also much clearer to express such iteration using a for loop.
Because of this, it's not too hard to see that while loops are more general—it is always possible to rewrite a for loop as a while, but the converse is not true.
There is a clear advantage to using for when you want to perform actions over a sequence of values. On the other hand, if repetition is required in a situation that doesn't involve iterating over a sequence, then we will need to use a while loop.
Recall the function that we wrote to determine whether a given number was a prime number or not. Another special kind of number having to do with divisibility is the greatest common divisor of two numbers.
The greatest common divisor of two numbers $a$ and $b$ is the greatest positive integer that divides both $a$ and $b$. Notice since every number is divisible by 1, the gcd of any two numbers is always at least 1.
One easy way to find the gcd is to use a similar strategy for determining whether a number is prime: check every number from 1 to $\min(a,b)$ and remember the biggest one that evenly divides both $a$ and $b$.
def gcd(a, b):
"""
Produce the greatest common divisor of a and b.
Input:
a (int): nonnegative integer
b (int): nonnegative integer
Output (int): gcd of a, b
"""
m = min(a, b)
g = 1
for d in range(2, m + 1):
if a % d == 0 and b % d == 0:
g = d
return g
An unfortunate feature of this algorithm is that we are forced to check all numbers up to $\min(a,b)$. Unlike the primality test, there is no opportunity to short circuit.
But this is not the best we can do, if we break out of the linear structure of the integers. One can show (for example, in CMSC 27100) that using division and remainders, we can actually skip checking a lot of the numbers. Assuming that $a \geq b$, we have that \[\gcd(a,b) = \begin{cases} a & \text{if $b = 0$,} \\ \gcd(b, a \bmod b) & \text{otherwise.} \end{cases}\]
Here, $a \mod b$ means the remainder when $a$ is divided by $b$, or a % b.
This property is known as Euclid's algorithm, due to Euclid, the Greek mathematician, c.300 BC (efficient algorithms have existed for a very long time). Even though it doesn't look like an algorithm in programming sense, it gives us a concrete series of steps we can take to compute the gcd.
Here, we have an operation we want to repeat: division and getting the remainder. We also have a condition we want to meet: until $b = 0$. So we want to use a loop since we need to repeat execution of these steps, but we can't use a for loop, because the sequence of remainders that we get is not a predictable structure. So we need to use a while loop.
def euclid(a, b):
"""
Produce the greatest common divisor of a and b.
Input:
a (int): nonnegative integer
b (int): nonnegative integer, a >= b
Output (int): gcd of a, b
"""
m = a
n = b
while n != 0:
r = m % n
m = n
n = r
return m
Here, we make a few choices to make the work that our loop does a bit clearer. Notice that we create variables m and n explicitly for the purposes of working in the loop rather than just reusing a and b. This is helpful for isolating the work in the loop to reason about it.
Once we do this, it's helpful to trace how our loop variables change during execution.
m | n
|
|---|---|
| 72 | 16 |
| 16 | 8 |
| 8 | 0 |
Typically, you will see and write more for loops simply because a lot of work is structural: you want to proceed orderly along a structure and for loops are very convenient for doing that. However, when you need to break out of this structure (for example, if you want items from a list but not necessarily in the prescribed order), while loops become necessary.