Continuations

Contexts and continuations

An evaluation context describes the “rest of the computation” in the sense that it expresses all the computations that will be performed after the expression in the hole is evaluated. For example, consider the expression + (* 3 3) (* 4 4). We indicate evaluation contexts using square brackets:

    [+ (* 3 3) (* 4 4)]
  = + [* 3 3] (* 4 4)
⟶ + [9] (* 4 4)
  = [+ 9 (* 4 4)]
  = + 9 [* 4 4]
⟶ + 9 [16]
  = [+ 9 16]
⟶ [25]

The subexpression * 4 4 has the evaluation context + 9 [⋅], indicating that what remains is to add 9. Note that the evaluation context is not + (* 3 3) [⋅], because the first multiplication has already taken place.

For those who prefer to think in terms of implementations, the evaluation context corresponds to the call stack. In the above example, when the evaluation context narrows from [⋅] to + 9 [⋅], it corresponds to a function call (push); when it widens back to [⋅], it corresponds to a return (pop).

Here, we look at how to transform programs to make evaluation contexts accessible within the language itself, as functions called continuations. This is interesting for (at least) three reasons:

We can manipulate the evaluation context to simulate effects like exceptions, nondeterminism, and generators.
When a program is transformed to keep track of the evaluation context, the operational semantics no longer needs to and can therefore be simplified (and the corresponding implementation doesn’t need a call stack).
The type of a continuation fits into the Curry-Howard isomorphism in a surprising way.

Continuation-passing style

As a warm-up, consider our example expression from above, again with the evaluation context marked with square brackets:

+ [* 3 3] (* 4 4)

We could rewrite this as

(λx. + x (* 4 4)) (* 3 3)

where (λx. + x (* 4 4) is the continuation of * 3 3. Furthermore, we can flip this so that instead of the continuation taking t as its argument, the subterm takes the continuation as its argument:

(λk. k (* 3 3)) (λx. + x (* 4 4))

In the first subterm, λk. k (* 3 3), the continuation now has a name, k. The “normal” thing to do is to apply k to something, but if we wanted to, we could do something else with it (e.g., discard it, apply it twice, or save it for later).

In CPS, we do this everywhere! The above example term is going to become:

λk. *' 3 3 (λv₂. *' 4 4 (λv₃. +' v₂ v₃ k))

where *' takes three arguments, two numbers and a continuation; it multiplies the two numbers and then passes the result to the continuation. Similarly for +'.

NB

Let’s define the transformation just for NB first.

In the following, v stands for either a value or a variable.

To simplify things, we define a helper transformation 𝒞_v that operates only on values or variables. For now, it’s trivial:

                   𝒞_v⟦0⟧ = 0
              𝒞_v⟦succ v⟧ = succ 𝒞_v⟦v⟧
                𝒞_v⟦true⟧ = true
               𝒞_v⟦false⟧ = false

The transformation 𝒞 does the real work of turning terms into functions that take continuations:

                    𝒞⟦v⟧ = λk. k 𝒞_v⟦v⟧
             𝒞⟦iszero v⟧ = λk. iszero' 𝒞_v⟦v⟧ k = iszero' 𝒞_v⟦v⟧
               𝒞⟦pred v⟧ = λk. pred' 𝒞_v⟦v⟧ k = pred' 𝒞_v⟦v⟧
𝒞⟦if v₁ then t₂ else t₃⟧ = λk. (if 𝒞_v⟦v₁⟧ then 𝒞⟦t₂⟧ else 𝒞⟦t₃⟧) k
                         = if 𝒞_v⟦v₁⟧ then 𝒞⟦t₂⟧ else 𝒞⟦t₃⟧
                 𝒞⟦E[t]⟧ = λk. 𝒞⟦t⟧ (λx. 𝒞⟦E[x]⟧ k)

The rules for the function-like operators iszero and pred assume that there is a CPS version of each operator that takes a continuation as a second argument. These could be defined as:

              pred' 0 k ⟶ k 0
       pred' (succ v) k ⟶ k v
            iszero' 0 k ⟶ k true
     iszero' (succ v) k ⟶ k false

Examples:

               𝒞⟦succ 0⟧ = λk. k 𝒞_v⟦succ 0⟧
                         = λk. k (succ 𝒞_v⟦0⟧)
                         = λk. k (succ 0)

             𝒞⟦iszero 0⟧ = λk. iszero' 𝒞_v⟦0⟧ k
                         = λk. iszero' 0 k

A more complex example illustrating the rule for E[t]. Assume that we have CPS versions of + and *, called +' and *'. We do some reductions to simplify away some λks (sometimes called “administrative lambdas”) that clutter things up.

    𝒞⟦+ (* 3 3) (* 4 4)⟧ = λk. 𝒞⟦* 3 3⟧ (λv₂. 𝒞⟦+ v₂ (* 4 4)⟧ k)
                         = λk. *' 3 3 (λv₂. 𝒞⟦+ v₂ (* 4 4)⟧ k)
                         = λk. *' 3 3 (λv₂. (λk. 𝒞⟦* 4 4⟧ (λv₃. 𝒞⟦+ v₂ v₃⟧ k)) k)
                         = λk. *' 3 3 (λv₂. 𝒞⟦* 4 4⟧ (λv₃. 𝒞⟦+ v₂ v₃⟧ k))
                         = λk. *' 3 3 (λv₂. *' 4 4 (λv₃. 𝒞⟦+ v₂ v₃⟧ k))
                         = λk. *' 3 3 (λv₂. *' 4 4 (λv₃. +' v₂ v₃ k))

λ

Now we extend the transform to λ:

                   𝒞_v⟦x⟧ = x
               𝒞_v⟦λx. t⟧ = λx. 𝒞⟦t⟧
                𝒞⟦v₁ v₂⟧ = λk. 𝒞_v⟦v₁⟧ 𝒞_v⟦v₂⟧ k = 𝒞_v⟦v₁⟧ 𝒞_v⟦v₂⟧

The rule for abstractions says that the body of the abstraction should also be turned into a function that takes a continuation. In other words, a function λx. ... is turned into a function of two arguments, λx. λk. ....

The rule for applications says that, since 𝒞_v⟦v₁⟧ is now a function of two arguments, the continuation of the application should be passed as the second argument.

Examples:

                𝒞⟦λx. 0⟧ = λk. k 𝒞_v⟦λx. 0⟧
                         = λk. k (λx. 𝒞⟦0⟧)
                         = λk. k (λx. λk₁. k₁ 𝒞_v⟦0⟧)
                         = λk. k (λx. λk₁. k₁ 0)

         𝒞⟦(λx. 0) true⟧ = 𝒞_v⟦λx. 0⟧ 𝒞_v⟦true⟧
                         = (λx. λk₁. k₁ 0) 𝒞_v⟦true⟧
                         = (λx. λk₁. k₁ 0) true

Some theoretical and practical issues

The final continuation

The CPS transform turns every term into a function on continuations – even the whole term. So if we want to actually evaluate a term in CPS, we need to apply it to a continuation. What should it be?

One simple answer is λx. x, but to emphasize the “finality” of this continuation, let’s invent a pseudofunction exit that receives the final answer and does not return.

    (λk. *' 3 3 (λv₂. *' 4 4 (λv₃. +' v₂ v₃ k))) exit
⟶ *' 3 3 (λv₂. *' 4 4 (λv₃. +' v₂ v₃ exit))
⟶ (λv₂. *' 4 4 (λv₃. +' v₂ v₃ exit)) 9
⟶ *' 4 4 (λv₃. +' 9 v₃ exit)
⟶ (λv₃. +' 9 v₃ exit) 16
⟶ +' 9 16 exit
⟶ exit 25

Every call is a tail call

A tail call is when one function calls another function as the very last thing it does, so that they both have the same return value. Tail calls are important because they can be translated into gotos; in particular, writing recursive functions so that recursive function calls are tail calls is efficient because the recursion can be translated into a loop.

In the λ-calculus, a term t₁ is in the tail position of term t₂ if one of the following is true:

t₁ is in the tail position of t and t is in the tail position of t₂
t₁ ≡ t₂
t₂ ≡ λx. t₁
t₂ ≡ if t₀ then t₁ else t₃
t₂ ≡ if t₀ then t₃ else t₁

The last three cases are exactly the places our semantics says we don’t evaluate inside, i.e., we don’t have rules E ::= λx. E, E ::= if v then E else t, or E ::= if v then t else E.

In the right-hand sides of the equations defining CPS, every application t₁ t₂ falls into one of these categories:

t₁ is a continuation (in the rule for 𝒞⟦v⟧)
t₁ takes two arguments and t₂ is the first one (iszero', etc., and 𝒞_v⟦v₁⟧ in the rule for 𝒞⟦v₁ v₂⟧
t₁ takes a continuation and t₂ is a continuation

In CPS, cases (a) and (c) are always tail calls, that is, they always occur in the tail position of the entire term. Case (b) is not, but if we regard case (c) as merely partial application, then we can say: in CPS every full application is in the tail position of the entire term.

This restriction has several benefits.

The operational semantics could be simplified. We don’t need evaluation contexts anymore, because the evaluation context is always just [·]. We do need an additional evaluation rule for β-reduction because of case (b) above:
```
(λx. t₁) v₂ v₃ ⟶ [x↦v₂]t₁ v₃
```
But the congruence rules and/or evaluation contexts can go away.
In the corresponding implementation, we don’t need a stack. In your implementation of the big-step semantics, you probably used function calls to achieve the same effect as a stack; under the assumption that the term to be evaluated is in CPS, you would be able to write an interpreter or compiler that doesn’t use target-language function calls.

The type of continuations

What does the CPS transform do to types? Consider a term t with type T; for simplicity, assume that T is a base type. (Function types are more complicated because 𝒞 is recursively applied to the return type.) The CPS transform turns this into a function looking for the final continuation, which we’ve decided is exit.

Since exit does not return, its return type should (arguably) be the null type, or the type that contains no values, usually written ⊥ and pronounced “bottom”. So

exit : T→⊥
𝒞⟦t⟧ : (T→⊥)→⊥

Since every call is a tail call, every continuation must have the same return type as the final continuation, so for every term of base type T, the CPS transform turns it into a term of type (T→⊥)→⊥, and its continuation has type T→⊥.

Under the Curry-Howard isomorphism, ⊥ corresponds to false. And recall that X→Y in classical logic is defined to be ¬X ∨ Y. So the type of a continuation corresponds to T→⊥ = ¬T ∨ false = ¬T. In other words, continuations correspond to negation.