Generators
Generators are relatively new, but now fairly widespread. Languages that have them include: Python (2001), C# (2005), JavaScript (2006), Ruby (2007?).
In Python, people usually use generators like this:
def counter(m):
i = 0
while i < m:
yield i
i += 1
>>> for i in counter(5): print(i)
0
1
2
3
4
The generator can also be used by directly calling
next(g)
(or g.__next__()
):
>>> g = counter(3)
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> next(g)
StopIteration
You can also pass a value to the generator using send
,
and this value becomes the value of the yield
expression.
def counter():
i = 0
while True:
val = (yield i)
if val is not None:
i = val
else:
i += 1
>>> g = counter(100)
>>> next(g)
0
>>> next(g)
1
>>> g.send(10)
10
>>> next(g)
11
It used to be an error for a generator function to return a value,
but since Python 3.3, it can, and the return value is accessible to the
caller in the value
attribute of the
StopIteration
exception that gets raised.
def counter(m):
i = 0
while i < m:
yield i
i += 1
return "secret message"
>>> g = counter(3)
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> try:
... next(g)
... except StopIteration as e:
... print(e.value)
secret message
Syntax and semantics
We add generators to the lambda calculus following James and Sabry [1]. Their generators differ from Python generators in two major ways and some minor ways:
Python generators are mutable. There isn’t an easy way, for example, to clone or rewind a generator. But in the λ-calculus, we keep generators and mutability as two separate features, so generators are immutable. Thus they are used in a slightly different way, but the plus is that immutable generators can be copied trivially, and they can be “rewound” simply by keeping an old copy.
Python generators are always functions, but in the λ-calculus, we want to keep generators and functions as two separate things, so generators are just terms, not necessarily functions.
Generators yielding values
Let’s start with a simpler version, in which a generator is essentially a stream (p. 270). We’re omitting types for conciseness, but if you wanted to write down the type of a generator of Nats, it would be:
NatGen ≡ μG. Unit + Nat × Unit→G
So inl unit
is a generator with no more elements to
yield, and inr {n, g}
is a generator whose next value is
n
, and after it yields n
, it “becomes”
g
.
We add two new keywords to our language, gen
and
yield
. The gen
wraps a term possibly
containing yield
, and evaluates to a generator:
gen (yield 123; yield 456; yield 789)
⟶* inr {123, λz. gen (unit; yield 456; yield 789)}
gen (yield 456; yield 789)
⟶* inr {456, λz. gen (unit; yield 789)}
gen (yield 789)
⟶* inr {789, λz. gen unit}
gen unit
⟶* inl unit
As an example of using a generator, we can write a function
nth
such that nth n e g
gets the
n
th value of generator g
, or e
if
there is none:
nth ≡ fix λf. λn. λe. λg. case g of
inl _ ⇒ e
inr s ⇒ if iszero n then s.1 else f (pred n) e (s.2 unit)
nth 0 0 (gen (yield 123; yield 456; yield 789)) ⟶* 123
nth 1 0 (gen (yield 123; yield 456; yield 789)) ⟶* 456
nth 2 0 (gen (yield 123; yield 456; yield 789)) ⟶* 789
Our generators aren’t mutable, so:
let g = gen (yield 123; yield 456; yield 789) in
[nth 2 0 g, nth 1 0 g, nth 0 0 g] ⟶* [789, 456, 123]
Another example:
fib ≡ gen ((fix λf. λx. λy. (yield x; f y (+ x y))) 1 1)
nth 0 0 fib ⟶* 1
nth 1 0 fib ⟶* 1
nth 2 0 fib ⟶* 2
nth 3 0 fib ⟶* 3
nth 4 0 fib ⟶* 5
nth 5 0 fib ⟶* 8
The syntax and semantics are:
t ::= ...
gen t
yield t
F ::= ...
yield F
E ::= ...
gen E
yield E
gen v ⟶ inl unit (GenV)
gen F[yield v] ⟶ inr {v, λz. gen F[unit]} z ∉ FV(F) (GenConYield)
The generator is either of:
inr {o, k}
, whereo
is a yielded value andk
is a thunk. Thenk unit
resumes the generator from where it left off (withunit
in place of theyield
expression).inl unit
.
In all languages that have generators, the caller gets to choose
whether or not to call k
. But under the semantics above,
the caller can also choose to call it more than once, which is not
possible in many languages.
Sending values to generators
We can get an effect similar to Python’s send
by
changing the thunk (λz
above) into a function that actually
uses its argument. So the type of generators is
NatGen ≡ μG. Unit + Nat × Nat→G
(assuming that the sent values are also Nat
s), and the
evaluation rule GenConYield is
gen F[yield v] ⟶ inr {v, λx. gen F[x]} x ∉ FV(F) (GenConYield)
This function is called a continuation. When called, it
returns a generator for the rest of the sequence, and its argument
becomes the value of the yield
expression.
The return value of a generator
We can also change GenV by replacing the unit
with a
return value. Thus the type of generators is
NatGen ≡ μG. Nat + Nat × Nat→G
(assuming that the return value is also a Nat
), and the
evaluation rule GenV is
gen v ⟶ inl v (GenV)
In pure lambda calculus
At this point, you should be anticipating a translation of λ-calculus with generators into pure λ-calculus. And you’d be right, but the appearance of continuations in our operational semantics for generators means that the topic of translating generators into pure λ-calculus is a topic that is highly important in its own right. To be continued!