Iterations in Python

The while and for loops can handle most repetitive tasks, but Python provides additional tools that make iteration even simpler.

Every iteration tool that scans objects from left to right (for loops, list comprehensions, in membership tests, the map() built-in function, and so on) works on any iterable object in Python.
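As a quick illustration of the point above, here is a minimal sketch that runs the four tools just listed over the same iterable. The string and the variable names are made up for the example.

```python
# Four iteration contexts applied to the same iterable (a string).
word = "spam"

# 1. for loop
upper_chars = []
for ch in word:
    upper_chars.append(ch.upper())

# 2. list comprehension
doubled = [ch * 2 for ch in word]

# 3. "in" membership test (scans items until a match is found)
has_p = "p" in word

# 4. map() applies a function to each item; list() collects the results
codes = list(map(ord, word))

print(upper_chars)  # ['S', 'P', 'A', 'M']
print(doubled)      # ['ss', 'pp', 'aa', 'mm']
print(has_p)        # True
print(codes)        # [115, 112, 97, 109]
```

Any other iterable (a list, a file, a dictionary) could be substituted for the string without changing the four constructs.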

File Iterators

To understand the iteration protocol, it helps to watch it work on a built-in type: the file object. The examples below use the following input file:

>>> print(open('test.py').read())
S1 = 'developer'

for (offset, i) in enumerate(S1):
	print(i, 'at offset no --> ', offset)

>>> open('test.py').read()
"S1 = 'developer'\n\nfor (offset, i) in enumerate(S1):\n\tprint(i, 'at offset no --> ', offset)\n"

The readline() method reads one line of text from a file at a time. At the end of the file, it returns an empty string.

>>> f = open('test.py')
>>> f.readline()
"S1 = 'developer'\n"
>>> f.readline()
'\n'
>>> f.readline()
'for (offset, i) in enumerate(S1):\n'
>>> f.readline()
"\tprint(i, 'at offset no --> ', offset)\n"
>>> f.readline()
''
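The empty-string signal is what a manual readline() loop has to test for. A minimal sketch, assuming an in-memory io.StringIO object as a stand-in for the real file so that it runs anywhere:

```python
import io

# io.StringIO stands in for open('test.py'); the content is invented.
f = io.StringIO("first\n\nthird\n")

lines = []
while True:
    line = f.readline()
    if line == "":      # empty string means end of file
        break
    lines.append(line)  # a blank line in the file is '\n', not ''

print(lines)  # ['first\n', '\n', 'third\n']
```

Note that the blank second line comes back as '\n', so it does not end the loop early.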

The __next__() method also reads one line of text at a time, but at the end of the file it raises the built-in StopIteration exception instead of returning an empty string.

>>> f = open('test.py')
>>> f.__next__()
"S1 = 'developer'\n"
>>> f.__next__()
'\n'
>>> f.__next__()
'for (offset, i) in enumerate(S1):\n'
>>> f.__next__()
"\tprint(i, 'at offset no --> ', offset)\n"
>>> f.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

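This interface is what Python calls the iteration protocol: an object with a __next__() method that advances to the next result and raises StopIteration at the end of the series is an iterator.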

The best way to read a text file line by line, however, is a for loop, which calls __next__() automatically to advance to the next line on each iteration.

>>> for l in open('test.py'):
...     print(l, end = '')
... 
S1 = 'developer'

for (offset, i) in enumerate(S1):
	print(i, 'at offset no --> ', offset)
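In a script, as opposed to the interactive prompt, the same loop is usually wrapped in a with statement so the file is closed as soon as the loop is done. A minimal sketch, assuming a throwaway temporary file with invented content rather than test.py:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

stripped = []
with open(path) as f:      # the file is closed when this block exits
    for line in f:         # the for loop calls __next__() for us
        stripped.append(line.rstrip("\n"))

os.remove(path)
print(stripped)  # ['alpha', 'beta', 'gamma']
```

The for loop itself is unchanged; the with statement only adds the cleanup.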

Manual iteration with iter() and next()

Python 3.X provides a built-in function, next(), that automatically calls an object’s __next__ method. So, the call next(X) is the same as X.__next__(), but is simpler and more version-neutral. With files, for instance, either form may be used.

>>> f = open('test.py')
>>> f.__next__()
"S1 = 'developer'\n"
>>> f.__next__()
'\n'
>>> f.__next__()
'for (offset, i) in enumerate(S1):\n'
>>> f.__next__()
"\tprint(i, 'at offset no --> ', offset)\n"
>>> f.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> 
>>> 
>>> f = open('test.py')
>>> next(f)
"S1 = 'developer'\n"
>>> next(f)
'\n'
>>> next(f)
'for (offset, i) in enumerate(S1):\n'
>>> next(f)
"\tprint(i, 'at offset no --> ', offset)\n"
>>> next(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
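One practical difference between the two forms: the built-in next() accepts an optional second argument, which is returned in place of raising StopIteration once the iterator is exhausted. A small sketch with a made-up two-item list:

```python
it = iter(["a", "b"])

first = next(it, None)   # 'a'
second = next(it, None)  # 'b'
third = next(it, None)   # None: the iterator is exhausted, no exception

print(first, second, third)  # a b None
```

This only works when the default (None here) cannot be mistaken for a real item.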

The full iteration protocol, used by every iteration tool in Python, is based on two objects, used in two distinct steps by iteration tools:

  • The iterable object you request iteration for, whose __iter__ is run by iter
  • The iterator object returned by the iterable that actually produces values during the iteration, whose __next__ is run by next and raises StopIteration when finished producing results
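To make the two roles concrete, here is a minimal sketch of a user-defined iterable and its iterator. The class names Countdown and CountdownIterator are hypothetical, invented for this illustration; built-in types implement the same two methods internally.

```python
class Countdown:
    """Iterable: knows how to create a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Run by iter(); returns a new iterator each time.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: holds the current position and produces values."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # Iterators return themselves, so they also work in for loops.
        return self

    def __next__(self):
        # Run by next(); raises StopIteration when finished.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


result = list(Countdown(3))
print(result)  # [3, 2, 1]
```

Because Countdown hands out a new iterator on every iter() call, the same object can be looped over more than once.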

It helps to understand the roles of these two objects, even though iteration tools invoke them automatically.

The first step of the protocol, the call to iter(), becomes visible when we look at how for loops internally process built-in sequence types such as lists.

>>> L = [1, 2, 3]
>>> l = iter(L)
>>> l.__next__()
1
>>> l.__next__()
2
>>> l.__next__()
3
>>> l.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

A file object is its own iterator, so the initial iter() step is not required: files have their own __next__() method.

>>> f = open('test.py')
>>> iter(f) is f
True
>>> iter(f) is f.__iter__()
True
>>> f.__next__()
"S1 = 'developer'\n"
>>> f.__next__()
'\n'
>>> f.__next__()
'for (offset, i) in enumerate(S1):\n'
>>> f.__next__()
"\tprint(i, 'at offset no --> ', offset)\n"
>>> f.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
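One consequence of a file being its own iterator is that every "iterator" obtained from it shares a single read position. A minimal sketch, again assuming io.StringIO as a stand-in for a real file:

```python
import io

# io.StringIO behaves like an open text file for iteration purposes.
f = io.StringIO("one\ntwo\nthree\n")

a = iter(f)
b = iter(f)
same_object = a is b is f   # both names refer to the file itself

first = next(a)    # 'one\n'
second = next(b)   # 'two\n': b continues where a left off

print(same_object, repr(first), repr(second))
```

So two loops over the same open file do not each start from the top; the second one picks up wherever the first stopped.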

Many built-in objects, such as lists, are not their own iterators because they support multiple open iterations (for example, nested loops over the same list, each at a different position). For such objects we must call iter() to start iterating.

>>> L = [1, 2, 3]
>>> iter(L) is L
False
>>> L.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute '__next__'
>>> l = iter(L)
>>> l.__next__()
1
>>> l.__next__()
2
>>> l.__next__()
3
>>> l.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
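The "multiple open iterations" point can be checked directly: each iter() call on a list returns a separate iterator with its own position. A short sketch with illustrative variable names:

```python
L = [1, 2, 3]

i1 = iter(L)
i2 = iter(L)
distinct = i1 is not i2   # two separate iterator objects

a = next(i1)   # 1
b = next(i1)   # 2
c = next(i2)   # 1: i2 has its own position, unaffected by i1

# Nested loops over the same list rely on exactly this behaviour.
pairs = [(x, y) for x in L for y in L]

print(distinct, a, b, c, len(pairs))  # True 1 2 1 9
```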

A for loop applies the iteration protocol automatically. The while loop that follows it does the same work manually, with explicit calls to iter() and next().

>>> L = [1, 2, 3]
>>> for a in L:
...     print(a + 3, end = ' ')
... 
4 5 6 >>> 
>>> 
>>> 
>>> 
>>> I = iter(L)
>>> while True:
...     try:
...             a = next(I)
...     except StopIteration:
...             break
...     print(a + 3, end = ' ')
... 
4 5 6 >>> 
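The manual loop can be packaged as a function to show that it works for any iterable, not just lists. This is only a sketch of what a for loop does; manual_for is a hypothetical helper name, not something Python provides.

```python
def manual_for(iterable, action):
    """Call action(item) for each item, using the protocol by hand."""
    iterator = iter(iterable)       # step 1: get an iterator
    while True:
        try:
            item = next(iterator)   # step 2: fetch the next value
        except StopIteration:
            break                   # the iterator is exhausted
        action(item)


collected = []
manual_for([1, 2, 3], lambda a: collected.append(a + 3))
print(collected)  # [4, 5, 6]
```

In real code the for statement is shorter and generally the better choice; the function is here only to expose the steps.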

Note:
The try statement runs an action and catches exceptions that occur while the action runs.

Other built-in type iterables

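Besides files and physical sequences like lists, other types have useful iterators as well.
Dictionaries, for example, are iterables whose iterator automatically returns one key at a time in an iteration context. The key order in the output here comes from an older Python version, in which dictionary order was arbitrary; since Python 3.7, dictionaries preserve insertion order.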

# the classic way
>>> D1 = {'moto1': 'honda', 'moto2': 'yamaha', 'moto3': 'suzuki'}
>>> for k in D1.keys():
...     print(k, D1[k])
... 
moto3 suzuki
moto2 yamaha
moto1 honda
>>>
>>>
# using an iterator
>>> I = iter(D1)
>>> next(I)
'moto3'
>>> next(I)
'moto2'
>>> next(I)
'moto1'
>>> next(I)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

The net effect is that we don’t need to call the keys() method to step through dictionary keys; the for loop will use the iteration protocol to grab one key each time through.

>>> D1 = {'moto1': 'honda', 'moto2': 'yamaha', 'moto3': 'suzuki'}
>>> for k in D1:
...     print(k, D1[k])
... 
moto2 yamaha
moto1 honda
moto3 suzuki
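The same protocol also drives the dictionary's values() and items() views. A minimal sketch reusing the dictionary from above; the order shown in the comments assumes Python 3.7 or later, where insertion order is preserved.

```python
D1 = {'moto1': 'honda', 'moto2': 'yamaha', 'moto3': 'suzuki'}

# Iterating the dictionary itself yields its keys.
keys = [k for k in D1]

# values() and items() return iterable views as well.
values = list(D1.values())
pairs = [(k, v) for k, v in D1.items()]

print(keys)    # ['moto1', 'moto2', 'moto3'] on Python 3.7+
print(values)  # ['honda', 'yamaha', 'suzuki'] on Python 3.7+
print(pairs)
```

Looping over items() avoids the extra D1[k] lookup when both the key and the value are needed.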
