Python 3: Using "yield from" in Generators - Part 1

A new feature in Python 3.3 is the ability to use yield from when defining a generator. My experience with Python 3 so far has been with Python 3.1, so I've upgraded my installation in Ubuntu to Python 3.3 to explore "yield from". In this two-part tutorial I'm going to outline some examples and potential use cases.

If you're already familiar with generators then you can skip the first section of this article and continue with the specifics of "yield from" below it.

Brief introduction to generators

In Python a generator lets a function produce a sequence of values one at a time, without having to store them all in memory at once. This also allows you to use each value immediately, without having to wait until all values have been computed.

Let's look at the following Python 2 function (in Python 3, xrange is spelled range):

def not_a_generator():
    result = []
    for i in xrange(2000):
        result.append(perform_expensive_computation(i))
    return result

When we call not_a_generator() we have to wait until perform_expensive_computation has been called on all 2000 integers before we get anything back.

This is inconvenient because we may not actually end up using all the computed results. For example, we may wish to use the above function as follows:

for element in not_a_generator():
    if not certain_condition(element):
        break
    # ...

Depending on the behaviour of certain_condition, it could be that we only use the first 700 values returned from not_a_generator(), in which case we would have wasted time computing the remaining 1300 values.

We can turn not_a_generator() into a generator by using the yield keyword:

def my_generator():
    for i in xrange(2000):
        yield perform_expensive_computation(i)

Difference between a function and a generator

Iterating over this generator looks no different from iterating over the result of our previous function:

for element in my_generator():
    if not certain_condition(element):
        break
    # ...

The difference is that whenever the generator "yields" a value the execution of the generator is paused and the code continues where the generator was called. That means the value of the variable element is known and the execution of the code can continue.
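
The pausing can be made visible by stepping through a generator by hand with next(). The following is a minimal sketch in Python 3; the log list and the squaring are stand-ins invented for illustration, not part of the example above.

```python
def my_generator(log):
    # `log` is a hypothetical list used only to record when work happens.
    for i in range(3):
        log.append("computing %d" % i)
        yield i * i  # stand-in for perform_expensive_computation(i)


log = []
gen = my_generator(log)  # calling only creates the generator; no body code runs yet
started_early = list(log)  # still empty at this point

first = next(gen)   # runs the body up to the first yield, then pauses
after_first = list(log)

second = next(gen)  # resumes right after the yield and pauses at the next one
after_second = list(log)
```

Each call to next() does just enough work to reach the next yield, which is what a for loop does behind the scenes.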

When not certain_condition(element) is true we break out of the loop and no longer need the generator to compute the remaining values. On the other hand, if the loop body finishes for the current value of element, the generator is resumed and runs until it yields another value.

The above will continue until all values of the generator have been yielded or until we no longer need the generator.

Note that we can easily get the behaviour of not_a_generator() back by storing the results of my_generator in a list: list(my_generator()). This forces the generator to be executed fully and to yield all its values.
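
To see how much work is saved, we can count the calls. This is a sketch in Python 3 under stated assumptions: perform_expensive_computation and certain_condition are not defined in this article, so the versions here are hypothetical stand-ins chosen so that exactly 700 values pass the condition.

```python
calls = []  # records every argument the stand-in computation receives


def perform_expensive_computation(i):
    # Hypothetical stand-in: the real function is not shown in the article.
    calls.append(i)
    return i * 2


def certain_condition(element):
    # Hypothetical stand-in: true for the first 700 results only.
    return element < 1400


def my_generator():
    for i in range(2000):  # range is the Python 3 spelling of xrange
        yield perform_expensive_computation(i)


used = 0
for element in my_generator():
    if not certain_condition(element):
        break
    used += 1

# 700 accepted values plus the one value that failed the condition.
lazy_calls = len(calls)

calls.clear()
everything = list(my_generator())  # forces the generator to run to completion
eager_calls = len(calls)
```

With these stand-ins the loop triggers 701 computations, while list(my_generator()) triggers all 2000, just like not_a_generator().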

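Also note the role of return in a generator: using one or more yield statements turns a function into a generator, and yield is the only way to pass values to the code that iterates over it. In Python 2 and in Python 3 before 3.3, a return statement with a value is a syntax error inside a generator. From Python 3.3 onwards it is allowed, but the value is not yielded; it is attached to the StopIteration exception that ends the generator.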
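
As a minimal sketch of how return with a value behaves in a generator on Python 3.3 or later (earlier versions reject it with a SyntaxError): the value is attached to the StopIteration exception, and yield from hands it back as the value of the expression. The names counted and outer are made up for illustration.

```python
def counted():
    yield 1
    yield 2
    return "done"  # accepted in a generator from Python 3.3 onwards


def outer(results):
    # The value of a yield from expression is the inner generator's return value.
    result = yield from counted()
    results.append(result)


gen = counted()
values = [next(gen), next(gen)]
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value  # the returned value travels on the exception

results = []
forwarded = list(outer(results))  # only the yielded values reach the caller
```

Note that a for loop or list() never sees the returned value; only the yielded values 1 and 2 come out.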

Why yield from?

When a new feature is introduced in a programming language we should ask ourselves if and why this was necessary. The short explanation is that it enables you to easily refactor a generator by splitting it up into multiple generators.

For basic purposes we can use plain generators to compute values and to pass those values around. The benefits of yield from should become clear when we know what it does and in which situations it can be used.

Consider a generator that looks like this:

def generator():
    for i in range(10):
        yield i
    for j in range(10, 20):
        yield j

As expected this generator yields the numbers 0 to 19. Let's say we wish to split this generator into two generators so that we can reuse them elsewhere.

We could rewrite the above into:

def generator2():
    for i in range(10):
        yield i

def generator3():
    for j in range(10, 20):
        yield j

def generator():
    for i in generator2():
        yield i
    for j in generator3():
        yield j

This version of generator() also yields the numbers 0 to 19. However, it feels unnecessary to spell out that we wish to iterate over both generator2 and generator3 and yield their values. This is where yield from comes in. Using this new syntax (yield and from were both already keywords) we can rewrite generator into:

def generator():
    yield from generator2()
    yield from generator3()

This gives the same result and it is much cleaner to write and maintain. It is also quite similar to the way functions are refactored and split up into multiple functions. For example, a large function can be split into several smaller functions, f1(), f2() and f3(), and the original function simply calls f1(), f2() and f3() in sequence.
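
The claim that both versions give the same result is easy to check. In the sketch below the two variants have been given distinct, made-up names so they can live side by side; a third variant relies on the fact that yield from accepts any iterable, not only generators.

```python
def generator2():
    for i in range(10):
        yield i


def generator3():
    for j in range(10, 20):
        yield j


def generator_with_loops():
    # Explicit re-yielding, as in the first refactoring.
    for i in generator2():
        yield i
    for j in generator3():
        yield j


def generator_with_yield_from():
    # Delegation with yield from (Python 3.3+).
    yield from generator2()
    yield from generator3()


def generator_from_iterables():
    # yield from works with any iterable, such as a range object.
    yield from range(10)
    yield from range(10, 20)


loops = list(generator_with_loops())
delegated = list(generator_with_yield_from())
from_ranges = list(generator_from_iterables())
```

All three lists contain the numbers 0 to 19 in order.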

Useful situations for 'yield from'

Those of you familiar with the itertools module may note that the above example is rather simple and does not truly justify introducing new syntax in the language.

Using the chain function from the itertools module we also could have written:

from itertools import chain

def generator():
    for v in chain(generator2(), generator3()):
        yield v

It can be argued that the yield from syntax and semantics are slightly cleaner than importing an additional function from a module but, leaving that aside, we have not yet seen an example where yield from has enabled us to do something new. As we will see later in this tutorial, the main benefit of yield from is to allow easy refactoring of generators.
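
For simple chaining the two approaches are indeed interchangeable, as this small sketch suggests. The function names are invented here to keep the variants apart, and generator2 and generator3 are restated so the snippet runs on its own.

```python
from itertools import chain


def generator2():
    yield from range(10)


def generator3():
    yield from range(10, 20)


def generator_with_chain():
    # The itertools version: iterate over the chained iterables and re-yield.
    for v in chain(generator2(), generator3()):
        yield v


def generator_with_yield_from():
    # The yield from version: delegate to each generator in turn.
    yield from generator2()
    yield from generator3()


chained = list(generator_with_chain())
delegated = list(generator_with_yield_from())
```

Both produce 0 to 19, so for this use case the choice is mostly a matter of readability.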

It should be noted that new programming language syntax does not have to introduce new semantics (i.e., express something that was not possible before).

Many languages introduce syntax, often called syntactic sugar, to make it easier to write something that would otherwise be cumbersome to write. For example, Haskell allows you to easily write a string as "example" which is shorthand syntax for ['e', 'x', 'a', 'm', 'p', 'l', 'e'] (a list of characters). Cleaner and more maintainable code can suffice to introduce new syntax into a programming language.

Binary tree example

The proposal that introduces the yield from syntax provides a few examples to demonstrate the new behaviour. One of them is a basic binary tree and we can traverse the nodes of the tree using normal for loops as well as the new yield from syntax.
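
As a rough idea of what such a traversal might look like, here is a minimal sketch. It is not the code from the proposal: the Node class, its attribute names and the in-order traversal are assumptions made for illustration.

```python
class Node:
    # Hypothetical minimal binary tree node, not taken from the proposal.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder_with_loops(node):
    # Traversal using normal for loops to re-yield values from the subtrees.
    if node is None:
        return
    for value in inorder_with_loops(node.left):
        yield value
    yield node.value
    for value in inorder_with_loops(node.right):
        yield value


def inorder_with_yield_from(node):
    # The same traversal, delegating to the subtrees with yield from.
    if node is None:
        return
    yield from inorder_with_yield_from(node.left)
    yield node.value
    yield from inorder_with_yield_from(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
with_loops = list(inorder_with_loops(tree))
with_yield_from = list(inorder_with_yield_from(tree))
```

Both traversals visit the values 1 to 7 in sorted order; the yield from version simply drops the two re-yielding loops.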

In the next part of this tutorial we will build upon this example to explain the benefits of yield from.

Contents © 2014 Simeon Visser