A new feature in Python 3.3 is the ability to use
yield from when defining
a generator. My experience with Python 3 so far has been with Python 3.1, so
I've upgraded my installation in Ubuntu to Python 3.3 to explore "yield from".
In this tutorial, in two parts, I'm going to outline some examples and
potential use cases.
If you're already familiar with generators then you can skip the first section of this article and continue with the specifics of "yield from" below it.
Brief introduction to generators
In Python a generator can be used to let a function produce a sequence of values without having to store them all in memory at once. This also allows you to use each value immediately, without having to wait until all values have been computed.
Let's look at the following Python 2 function:

    def not_a_generator():
        result = []
        for i in xrange(2000):
            result.append(perform_expensive_computation(i))
        return result

When we call not_a_generator() we have to wait until
perform_expensive_computation() has been performed on all 2000 integers.
This is inconvenient because we may not actually end up using all the computed results. For example, we may wish to use the above function as follows:

    for element in not_a_generator():
        if not certain_condition(element):
            break
        # ...

Depending on the behaviour of
certain_condition, it could be that we only
use the first 700 values returned from
not_a_generator() and we would have
wasted time on computing the remaining values.
We can turn not_a_generator() into a generator by using the yield keyword:

    def my_generator():
        for i in xrange(2000):
            yield perform_expensive_computation(i)

Difference between a function and a generator
Calling this generator is no different from our previous function:

    for element in my_generator():
        if not certain_condition(element):
            break
        # ...

The difference is that whenever the generator "yields" a value the execution
of the generator is paused and the code continues where the generator was
called. That means the value of the variable
element is known and the execution
of the code can continue.
When the condition not certain_condition(element) is true, the loop breaks and
we no longer need the generator to compute the remaining values. On the other
hand, if the loop body finishes for the current value of element, then the
generator is resumed until it yields another value.
The above will continue until all values of the generator have been yielded or until we no longer need the generator.
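To make the pause-and-resume behaviour visible, here is a minimal sketch that records the order of events. The names square_slowly and log are hypothetical stand-ins (square_slowly plays the role of perform_expensive_computation), and the code uses Python 3's range instead of xrange.

```python
log = []  # records the order in which things happen


def square_slowly(i):
    # Hypothetical stand-in for perform_expensive_computation().
    log.append("compute %d" % i)
    return i * i


def my_generator():
    for i in range(2000):
        yield square_slowly(i)


for element in my_generator():
    log.append("use %d" % element)
    if element >= 4:
        # Stop early; the remaining values are never computed.
        break

print(log)
```

The log alternates between "compute" and "use" entries and stops after the third value, which suggests that each value is only computed when the loop asks for it.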
Note that we can easily get the behaviour of not_a_generator() back by
storing the results of my_generator() in a list, for example with
list(my_generator()). Doing so
forces the generator to be executed fully and to yield all its values.
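As a small sketch of turning a generator back into a list, assuming a cheap hypothetical compute function in place of perform_expensive_computation and a shorter range to keep the output readable:

```python
def compute(i):
    # Hypothetical stand-in for perform_expensive_computation().
    return i * 2


def not_a_generator():
    result = []
    for i in range(5):
        result.append(compute(i))
    return result


def my_generator():
    for i in range(5):
        yield compute(i)


gen = my_generator()
values = list(gen)    # runs the generator to completion
leftover = list(gen)  # the same generator object is now exhausted

print(values)
print(leftover)
```

Note that a generator object can only be consumed once: to iterate again we have to call my_generator() again.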
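Also note that in Python 2, and in Python 3 before version 3.3, a generator can't contain
return value statements: using one or more
yield statements turns a function into a generator and you can only use
yield to return values to where the generator is called. Since Python 3.3 a return value statement is allowed inside a generator, but the value is attached to the StopIteration exception that ends the iteration rather than being yielded.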
Why yield from?
When a new feature is introduced in a programming language we should ask ourselves if and why this was necessary. The short explanation is that it enables you to easily refactor a generator by splitting it up into multiple generators.
For basic purposes we can use plain generators to compute values and to
pass those values around. The benefits of
yield from should become clear
when we know what it does and in which situations it can be used.
Consider a generator that looks like this:

    def generator():
        for i in range(10):
            yield i
        for j in range(10, 20):
            yield j

As expected this generator yields the numbers 0 to 19. Let's say we wish to split this generator into two generators so we can reuse them elsewhere.
We could rewrite the above into:

    def generator2():
        for i in range(10):
            yield i

    def generator3():
        for j in range(10, 20):
            yield j

    def generator():
        for i in generator2():
            yield i
        for j in generator3():
            yield j

This version of generator() also yields the numbers 0 to 19. However, it
feels unnecessary to specify that we wish to iterate over both
generator2 and generator3 and yield their values. This is where
yield from comes in.
Using this new syntax we can rewrite generator() as:

    def generator():
        yield from generator2()
        yield from generator3()

This gives the same result and it is much cleaner to write and maintain. It is
also quite similar to the way functions are refactored and split up into
multiple functions. For example, a large function can be split into several
smaller functions f1(), f2() and f3(), and the original function simply
calls f1(), f2() and f3() in sequence.
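The equivalence of the two versions can be checked with a short sketch. The names generator_with_loops, generator_with_yield_from and from_iterables are hypothetical and only serve to keep both variants side by side; the code requires Python 3.3 or later.

```python
def generator2():
    for i in range(10):
        yield i


def generator3():
    for j in range(10, 20):
        yield j


def generator_with_loops():
    # Explicitly iterate over each sub-generator and re-yield its values.
    for i in generator2():
        yield i
    for j in generator3():
        yield j


def generator_with_yield_from():
    # Delegate to each sub-generator in turn.
    yield from generator2()
    yield from generator3()


def from_iterables():
    # yield from accepts any iterable, not only generators.
    yield from range(3)
    yield from "ab"


print(list(generator_with_loops()) == list(generator_with_yield_from()))
print(list(from_iterables()))
```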
Useful situations for 'yield from'
Those of you familiar with the
itertools module may note that the above example
is rather simple and does not truly justify introducing new syntax in the language.
Using the chain function from the
itertools module we also could have written:

    from itertools import chain

    def generator():
        for v in chain(generator2(), generator3()):
            yield v

It can be argued that the
yield from syntax and semantics are slightly cleaner
than importing an additional function from a module but, leaving that aside,
we have not yet seen an example where
yield from has enabled us to do something
new. As we will see later in this tutorial, the main benefit of
yield from is
to allow easy refactoring of generators.
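For comparison, here is a sketch that puts the chain version and the yield from version next to each other, using shorter ranges than in the example above. The function names generator_chain and generator_yield_from are hypothetical.

```python
from itertools import chain


def generator2():
    for i in range(3):
        yield i


def generator3():
    for j in range(3, 6):
        yield j


def generator_chain():
    # chain() concatenates the iterables; we still need a loop to re-yield.
    for v in chain(generator2(), generator3()):
        yield v


def generator_yield_from():
    # No import and no explicit loop needed.
    yield from generator2()
    yield from generator3()


print(list(generator_chain()))
print(list(generator_yield_from()))
```

For simple concatenation like this, both versions produce the same values, so the choice is mostly a matter of readability.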
It should be noted that it is not necessary for new programming language syntax to also introduce new semantics (i.e., to express something that was not possible before).
Many languages introduce syntax, often called
syntactic sugar, to make it easier to write something that would otherwise be
cumbersome to write. For example, Haskell allows you to easily write a string as
"example" which is shorthand syntax for
['e', 'x', 'a', 'm', 'p', 'l', 'e']
(a list of characters). Cleaner and more maintainable code can suffice to
introduce new syntax into a programming language.
Binary tree example
The proposal that introduces
yield from syntax provides a few examples to
demonstrate the new behaviour. One of them is a basic binary tree and we
can traverse the nodes of the tree using normal for loops as well as the
yield from syntax.
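The proposal's own example is not reproduced here. The following is a minimal sketch of the same idea, assuming a hypothetical Node class with value, left and right attributes, that performs an in-order traversal once with plain for loops and once with yield from.

```python
class Node:
    # Hypothetical binary tree node, used for illustration only.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder_with_loops(node):
    if node is None:
        return
    for value in inorder_with_loops(node.left):
        yield value
    yield node.value
    for value in inorder_with_loops(node.right):
        yield value


def inorder_with_yield_from(node):
    if node is None:
        return
    yield from inorder_with_yield_from(node.left)
    yield node.value
    yield from inorder_with_yield_from(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(list(inorder_with_loops(tree)))
print(list(inorder_with_yield_from(tree)))
```

Both traversals visit the nodes in the same order; the yield from version simply drops the two re-yielding loops.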
In the next part of this tutorial we will build upon this example to explain
the benefits of