Python 3: Using "yield from" in Generators - Part 2
In the previous part of this tutorial,
I have discussed the basics of generators, some differences between functions
and generators. I have also hinted at some benefits of the new yield from
syntax.
I recommend reading that part of the tutorial first before continuing unless
you're well familiar with generators in Python.
All examples in this part of the tutorial will be in Python 3.3 unless stated otherwise.
Basic binary tree
In this second part I'll build upon a basic binary tree example that
demonstrates the uses of the yield from
syntax. For the sake of this example,
I'll let each node in the tree have a list of children rather than, as is often
done, let each node point to its parent node.
Here is the implementation of this data structure without the new syntax:
class Node: def __init__(self, value): self.left = [] self.value = value self.right = [] def iterate(self): for node in self.left: yield node.value yield self.value for node in self.right: yield node.value def main(): root = Node(0) root.left = [Node(i) for i in [1, 2, 3]] root.right = [Node(i) for i in [4, 5, 6]] for value in root.iterate(): print(value) if __name__ == "__main__": main()
As expected, this prints the values 1
, 2
, 3
(of the left children),
the value 0
(of the root node) and the values 4
, 5
, and 6
(of the
right children).
However, this example only iterates over the root node
and its children; it won't recursively iterate over the children of the
child nodes (if there happened to be any). Let's modify the iterate()
method to make that happen:
def iterate(self): for node in self.left: for value in node.iterate(): yield value yield self.value for node in self.right: for value in node.iterate(): yield value
This version also calls iterate()
on each of the child nodes so this
yields each of the nodes in the tree. This code is rather cumbersome: for
each of the left and right children we yield all the values by using
explicit iteration. This is where yield from
simplifies the code:
def iterate(self): for node in self.left: yield from node.iterate() yield self.value for node in self.right: yield from node.iterate()
Other aspects of generators
Now, the story would be over pretty quickly if that was all there was to it.
To appreciate yield from
I'll need to explain an alternative way of using
a generator.
A generator can be controlled using methods such as send()
and next()
. These
and related methods allow you to start, stop and continue a generator rather
than having Python handle most of the generator's execution.
For example, instead of the above basic loop:
for value in root.iterate(): print(value)
we can also write:
it = root.iterate() while True: try: print(it.send(None)) except StopIteration: break
The send()
method allows you to "send" a value into the generator, which means
the yield
expression receives that value. That value can be used by assigning
it to a variable (i.e., v = yield self.value
).
In this case we repeatedly send
the value None
into the generator and our generator doesn't utilize the
value that it sent into it. Effectively this leads to the same result as
the previous loop that prints the generator's yielded values.
Benefits of yield from
The primary benefits of yield from
come when you've written a generator that
uses these techniques and when it needs to be refactored. This means you'll have
to subdivide the generator into multiple subgenerators and send / receive values
to and from those subgenerators. Rather than rewriting
the generator to send values to the subgenerator, you can simply use
yield from
and the semantics will remain the same.
There are some caveats and some situations which aren't handled but that's beyond the scope of this tutorial (I'll refer you to the official proposal for details).
Another example generator
Let's create a small generator that demonstrates the above:
def node_iterate(self): for node in self.left: input_value = yield node.value # ... input_value = yield self.value # ... for node in self.right: input_value = yield node.value # ...
This generator only iterates over the node and its immediate children. Any
value sent into the generator is stored in the variable input_value
which is then available for computations (the # ...
sections).
For example, the code that uses this generator may perform various computations and it passes intermediate values back into the generator so that they can be used there. The following shows how the yielded values are summed and the subtotals are passed back into the generator:
total = 0 it = root.node_iterate() it.send(None) while True: try: value = it.send(total) total += value except StopIteration: break
Refactoring iteration over children
It seems repetitive to have the same code for both the left and right children so we can refactor that into:
def child_iterate(self, nodes): for node in nodes: input_value = yield node.value # ... def node_iterate(self): yield from self.child_iterate(self.left) input_value = yield self.value # ... yield from self.child_iterate(self.right)
Note that it is not recommended practice to call a class method
(self.child_iterate
) by passing an instance variable (i.e., the
self.left
and self.right
arguments) but
that is a topic for a different post. The important part is that we can only
refactor in this way as a result of the new yield from
semantics.
When we send values into node_iterate()
they are automatically
passed into the subgenerator child_iterate()
where input_value
receives a value.
Prior to Python 3.3, if we had written:
for value in self.child_iterate(self.left): yield value
then this would yield values from child_iterate()
but input_value
would
remain None
. To make it work as expected, we would need to explicitly send()
the values from node_iterate()
into child_iterate()
.
More refactoring
Lastly, we can perform one more refactoring step leaving us with only one
section where the input_value
is received and used:
def process(self): input_value = yield self.value # ... def child_iterate(self, nodes): for node in nodes: yield from node.process() def node_iterate(self): yield from self.child_iterate(self.left) self.process() yield from self.child_iterate(self.right)
As you become more familiar with using send()
and related functions, you'll
notice that yield from
makes life easier. I suggest reading the exact
semantics in the specification
to know when and how you'll be able to use it.