Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Generators

A generator is a function or expression that returns a special type of iterator called generator iterator. generator-iterators are lazy: they do not store their values in memory, but generate their values when needed.

A generator function looks like any other function, but contains one or more yield expressions. Each yield will suspend code execution, saving the current execution rate (including all local variables and try statements). When the generator resumes, it picks up state from the suspension - unlike regular functions which reset with every call.

Constructing a generator

Generators are constructed much like other looping or recursive functions, but require a yield expression, which we will explore in depth a bit later.

An example is a function that returns the squares from a given list of numbers. As currently written, all input must be processed before any values can be returned:

def squares(list_of_numbers):
    squares = []
    for number in list_of_numbers:
        squares.append(number ** 2)
    return squares

You can convert that function into a generator like this:

def squares_generator(list_of_numbers):
    for number in list_of_numbers:
        yield number ** 2

The rationale behind this is that you use a generator when you do not need all the values at once.

This saves memory and processing power, since only the value you are currently working on is calculated.

Using a generator

Generators may be used in place of most iterables in Python. This includes functions or objects that require an iterable/iterator as an argument.

To use the squares_generator() generator:

squared_numbers = squares_generator([1, 2, 3, 4])

for square in squared_numbers:
    print(square)

1
4
9
16

Values within a generator can also be produced/accessed via the next() function. next() calls the __next__() method of a generator object, "advancing" or evaluating the generator code up to its yield expression, which then "yields" or returns the value.


>>> squared_numbers = squares_generator([1, 2])

>>> next(squared_numbers)
1
>>> next(squared_numbers)
4

When a generator is fully consumed and has no more values to return, it throws a StopIteration error.

>>> next(squared_numbers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Difference between iterables and generators

Generators are a special sub-set of iterators. Iterators are the mechanism/protocol that enables looping over iterables. Generators and the iterators returned by common Python iterables act very similarly, but there are some important differences to note:

  • Generators are one-way; there is no "backing up" to a previous value.

  • Iterating over generators consume the returned values; no resetting.

  • Generators (being lazily evaluated) are not sortable and can not be reversed.

  • Generators do not have indexes, so you can't reference a previous or future value using addition or subtraction.

  • Generators cannot be used with the len() function.

  • Generators can be finite or infinite, be careful when collecting all values from an infinite generator.

The yield expression

The yield expression is very similar to the return expression.

Unlike the return expression, yield gives up values to the caller at a specific point, suspending evaluation/return of any additional values until they are requested.

When yield is evaluated, it pauses the execution of the enclosing function and returns any values of the function at that point in time.

The function then stays in scope, and when __next__() is called, execution resumes until yield is encountered again.

Note

Using yield expressions is prohibited outside of functions.

>>> def infinite_sequence():
...     current_number = 0
...     while True:
...         yield current_number
...         current_number += 1

>>> lets_try = infinite_sequence()
>>> lets_try.__next__()
0
>>> lets_try.__next__()
1

Why generators?

Generators are useful in a lot of applications.

When working with a large collection, you might not want to put all of its values into memory. A generator can be used to work on larger data piece-by-piece, saving memory and improving performance.

Generators are also very helpful when a process or calculation is complex, expensive, or infinite:


>>> def infinite_sequence():
...     current_number = 0
...     while True:
...         yield current_number
...         current_number += 1

Now whenever __next__() is called on the infinite_sequence object, it will return the previous number + 1.