Generators

Generators are relatively new, but now fairly widespread. Languages that have them include: Python (2001), C# (2005), JavaScript (2006), Ruby (2007?).

In Python, people usually use generators like this:

def counter(m):
    i = 0
    while i < m:
        yield i
        i += 1

>>> for i in counter(5): print(i)
0
1
2
3
4

The generator can also be used by directly calling next(g) (or g.__next__()):

>>> g = counter(3)
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> next(g)
StopIteration

You can also pass a value to the generator using send, and this value becomes the value of the yield expression.

def counter():
    i = 0
    while True:
        val = (yield i)
        if val is not None:
            i = val
        else:
            i += 1

>>> g = counter(100)
>>> next(g)
0
>>> next(g)
1
>>> g.send(10)
10
>>> next(g)
11

It used to be an error for a generator function to return a value, but since Python 3.3, it can, and the return value is accessible to the caller in the value attribute of the StopIteration exception that gets raised.

def counter(m):
    i = 0
    while i < m:
        yield i
        i += 1
    return "secret message"

>>> g = counter(3)
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> try:
...     next(g)
... except StopIteration as e:
...     print(e.value)
secret message

Syntax and semantics

We add generators to the lambda calculus following James and Sabry [1]. Their generators differ from Python generators in two major ways and some minor ways:

  • Python generators are mutable. There isn’t an easy way, for example, to clone or rewind a generator. But in the λ-calculus, we keep generators and mutability as two separate features, so generators are immutable. Thus they are used in a slightly different way, but the plus is that immutable generators can be copied trivially, and they can be “rewound” simply by keeping an old copy.

  • Python generators are always functions, but in the λ-calculus, we want to keep generators and functions as two separate things, so generators are just terms, not necessarily functions.

Generators yielding values

Let’s start with a simpler version, in which a generator is essentially a stream (p. 270). We’re omitting types for conciseness, but if you wanted to write down the type of a generator of Nats, it would be:

NatGen ≡ μG. Unit + Nat × UnitG

So inl unit is a generator with no more elements to yield, and inr {n, g} is a generator whose next value is n, and after it yields n, it “becomes” g.

We add two new keywords to our language, gen and yield. The gen wraps a term possibly containing yield, and evaluates to a generator:

gen (yield 123; yield 456; yield 789)
* inr {123, λz. gen (unit; yield 456; yield 789)}

gen (yield 456; yield 789)
* inr {456, λz. gen (unit; yield 789)}

gen (yield 789)
* inr {789, λz. gen unit}

gen unit
* inl unit

As an example of using a generator, we can write a function nth such that nth n e g gets the nth value of generator g, or e if there is none:

nth ≡ fix λf. λn. λe. λg. case g of
        inl _  e
        inr s  if iszero n then s.1 else f (pred n) e (s.2 unit)

nth 0 0 (gen (yield 123; yield 456; yield 789)) * 123
nth 1 0 (gen (yield 123; yield 456; yield 789)) * 456
nth 2 0 (gen (yield 123; yield 456; yield 789)) * 789

Our generators aren’t mutable, so:

let g = gen (yield 123; yield 456; yield 789) in
  [nth 2 0 g, nth 1 0 g, nth 0 0 g] * [789, 456, 123]

Another example:

fib ≡ gen ((fix λf. λx. λy. (yield x; f y (+ x y))) 1 1)
nth 0 0 fib * 1
nth 1 0 fib * 1
nth 2 0 fib * 2
nth 3 0 fib * 3
nth 4 0 fib * 5
nth 5 0 fib * 8

The syntax and semantics are:

t ::= ...
      gen t
      yield t

F ::= ...
      yield F

E ::= ...
      gen E
      yield E

         gen v  inl unit                               (GenV)
gen F[yield v]  inr {v, λz. gen F[unit]}   z ∉ FV(F)   (GenConYield)

The generator is either of:

  • inr {o, k}, where o is a yielded value and k is a thunk. Then k unit resumes the generator from where it left off (with unit in place of the yield expression).

  • inl unit.

In all languages that have generators, the caller gets to choose whether or not to call k. But under the semantics above, the caller can also choose to call it more than once, which is not possible in many languages.

Sending values to generators

We can get an effect similar to Python’s send by changing the thunk (λz above) into a function that actually uses its argument. So the type of generators is

NatGen ≡ μG. Unit + Nat × NatG

(assuming that the sent values are also Nats), and the evaluation rule GenConYield is

gen F[yield v]  inr {v, λx. gen F[x]}      x ∉ FV(F)   (GenConYield)

This function is called a continuation. When called, it returns a generator for the rest of the sequence, and its argument becomes the value of the yield expression.

The return value of a generator

We can also change GenV by replacing the unit with a return value. Thus the type of generators is

NatGen ≡ μG. Nat + Nat × NatG

(assuming that the return value is also a Nat), and the evaluation rule GenV is

gen v  inl v                                           (GenV)

In pure lambda calculus

At this point, you should be anticipating a translation of λ-calculus with generators into pure λ-calculus. And you’d be right, but the appearance of continuations in our operational semantics for generators means that the topic of translating generators into pure λ-calculus is a topic that is highly important in its own right. To be continued!

References

[1]
Roshan P. James and Amr Sabry. 2011. Yield: Mainstream delimited continuations. In Proc. Theory and Practice of Delimited Continuations. Retrieved from https://www.cs.indiana.edu/~sabry/papers/yield.pdf