# Continuations

## Contexts and continuations

An evaluation context describes the “rest of the computation” in the sense that it expresses all the computations that will be performed after the expression in the hole is evaluated. For example, consider the expression `+ (* 3 3) (* 4 4)`

. We indicate evaluation contexts using square brackets:

```
[+ (* 3 3) (* 4 4)]
= + [* 3 3] (* 4 4)
⟶ + [9] (* 4 4)
= [+ 9 (* 4 4)]
= + 9 [* 4 4]
⟶ + 9 [16]
= [+ 9 16]
⟶ [25]
```

The subexpression `* 4 4`

has the evaluation context `+ 9 [⋅]`

, indicating that what remains is to add 9. Note that the evaluation context is not `+ (* 3 3) [⋅]`

, because the first multiplication has already taken place.

For those who prefer to think in terms of implementations, the evaluation context corresponds to the call stack. In the above example, when the evaluation context narrows from `[⋅]`

to `+ 9 [⋅]`

, it corresponds to a function call (push); when it widens back to `[⋅]`

, it corresponds to a return (pop).

Here, we look at how to transform programs to make evaluation contexts accessible within the language itself, as functions called *continuations*. This is interesting for (at least) three reasons:

We can manipulate the evaluation context to simulate effects like exceptions, nondeterminism, and generators.

When a program is transformed to keep track of the evaluation context, the operational semantics no longer needs to and can therefore be simplified (and the corresponding implementation doesn’t need a call stack).

The

*type*of a continuation fits into the Curry-Howard isomorphism in a surprising way.

## Continuation-passing style

As a warm-up, consider our example expression from above, again with the evaluation context marked with square brackets:

`+ [* 3 3] (* 4 4)`

We could rewrite this as

`(λx. + x (* 4 4)) (* 3 3)`

where `(λx. + x (* 4 4)`

is the continuation of `* 3 3`

. Furthermore, we can flip this so that instead of the continuation taking `t`

as its argument, the subterm takes the continuation as its argument:

`(λk. k (* 3 3)) (λx. + x (* 4 4))`

In the first subterm, `λk. k (* 3 3)`

, the continuation now has a name, `k`

. The “normal” thing to do is to apply `k`

to something, but if we wanted to, we could do something else with it (e.g., discard it, apply it twice, or save it for later).

In CPS, we do this everywhere! The above example term is going to become:

`λk. *' 3 3 (λv`_{2}. *' 4 4 (λv_{3}. +' v_{2} v_{3} k))

where `*'`

takes *three* arguments, two numbers and a continuation; it multiplies the two numbers and then passes the result to the continuation. Similarly for `+'`

.

### NB

Let’s define the transformation just for NB first.

In the following, `v`

stands for either a value or a variable.

To simplify things, we define a helper transformation `𝒞`

that operates only on values or variables. For now, it’s trivial:_{v}

` 𝒞`_{v}⟦0⟧ = 0
𝒞_{v}⟦succ v⟧ = succ 𝒞_{v}⟦v⟧
𝒞_{v}⟦true⟧ = true
𝒞_{v}⟦false⟧ = false

The transformation `𝒞`

does the real work of turning terms into functions that take continuations:

` 𝒞⟦v⟧ = λk. k 𝒞`_{v}⟦v⟧
𝒞⟦iszero v⟧ = λk. iszero' 𝒞_{v}⟦v⟧ k = iszero' 𝒞_{v}⟦v⟧
𝒞⟦pred v⟧ = λk. pred' 𝒞_{v}⟦v⟧ k = pred' 𝒞_{v}⟦v⟧
𝒞⟦if v_{1} then t_{2} else t_{3}⟧ = λk. (if 𝒞_{v}⟦v_{1}⟧ then 𝒞⟦t_{2}⟧ else 𝒞⟦t_{3}⟧) k
= if 𝒞_{v}⟦v_{1}⟧ then 𝒞⟦t_{2}⟧ else 𝒞⟦t_{3}⟧
𝒞⟦E[t]⟧ = λk. 𝒞⟦t⟧ (λx. 𝒞⟦E[x]⟧ k)

The rules for the function-like operators `iszero`

and `pred`

assume that there is a CPS version of each operator that takes a continuation as a second argument. These could be defined as:

```
pred' 0 k ⟶ k 0
pred' (succ v) k ⟶ k v
iszero' 0 k ⟶ k true
iszero' (succ v) k ⟶ k false
```

Examples:

` 𝒞⟦succ 0⟧ = λk. k 𝒞`_{v}⟦succ 0⟧
= λk. k (succ 𝒞_{v}⟦0⟧)
= λk. k (succ 0)
𝒞⟦iszero 0⟧ = λk. iszero' 𝒞_{v}⟦0⟧ k
= λk. iszero' 0 k

A more complex example illustrating the rule for `E[t]`

. Assume that we have CPS versions of `+`

and `*`

, called `+'`

and `*'`

. We do some reductions to simplify away some `λk`

s (sometimes called “administrative lambdas”) that clutter things up.

` 𝒞⟦+ (* 3 3) (* 4 4)⟧ = λk. 𝒞⟦* 3 3⟧ (λv`_{2}. 𝒞⟦+ v_{2} (* 4 4)⟧ k)
= λk. *' 3 3 (λv_{2}. 𝒞⟦+ v_{2} (* 4 4)⟧ k)
= λk. *' 3 3 (λv_{2}. (λk. 𝒞⟦* 4 4⟧ (λv_{3}. 𝒞⟦+ v_{2} v_{3}⟧ k)) k)
= λk. *' 3 3 (λv_{2}. 𝒞⟦* 4 4⟧ (λv_{3}. 𝒞⟦+ v_{2} v_{3}⟧ k))
= λk. *' 3 3 (λv_{2}. *' 4 4 (λv_{3}. 𝒞⟦+ v_{2} v_{3}⟧ k))
= λk. *' 3 3 (λv_{2}. *' 4 4 (λv_{3}. +' v_{2} v_{3} k))

### λ

Now we extend the transform to λ:

` 𝒞`_{v}⟦x⟧ = x
𝒞_{v}⟦λx. t⟧ = λx. 𝒞⟦t⟧
𝒞⟦v_{1} v_{2}⟧ = λk. 𝒞_{v}⟦v_{1}⟧ 𝒞_{v}⟦v_{2}⟧ k = 𝒞_{v}⟦v_{1}⟧ 𝒞_{v}⟦v_{2}⟧

The rule for abstractions says that the body of the abstraction should also be turned into a function that takes a continuation. In other words, a function `λx. ...`

is turned into a function of two arguments, `λx. λk. ...`

.

The rule for applications says that, since `𝒞`

is now a function of two arguments, the continuation of the application should be passed as the second argument._{v}⟦v_{1}⟧

Examples:

` 𝒞⟦λx. 0⟧ = λk. k 𝒞`_{v}⟦λx. 0⟧
= λk. k (λx. 𝒞⟦0⟧)
= λk. k (λx. λk_{1}. k_{1} 𝒞_{v}⟦0⟧)
= λk. k (λx. λk_{1}. k_{1} 0)
𝒞⟦(λx. 0) true⟧ = 𝒞_{v}⟦λx. 0⟧ 𝒞_{v}⟦true⟧
= (λx. λk_{1}. k_{1} 0) 𝒞_{v}⟦true⟧
= (λx. λk_{1}. k_{1} 0) true

## Some theoretical and practical issues

### The final continuation

The CPS transform turns every term into a function on continuations – even the whole term. So if we want to actually evaluate a term in CPS, we need to apply it to a continuation. What should it be?

One simple answer is `λx. x`

, but to emphasize the “finality” of this continuation, let’s invent a pseudofunction `exit`

that receives the final answer and does not return.

` (λk. *' 3 3 (λv`_{2}. *' 4 4 (λv_{3}. +' v_{2} v_{3} k))) exit
⟶ *' 3 3 (λv_{2}. *' 4 4 (λv_{3}. +' v_{2} v_{3} exit))
⟶ (λv_{2}. *' 4 4 (λv_{3}. +' v_{2} v_{3} exit)) 9
⟶ *' 4 4 (λv_{3}. +' 9 v_{3} exit)
⟶ (λv_{3}. +' 9 v_{3} exit) 16
⟶ +' 9 16 exit
⟶ exit 25

### Every call is a tail call

A *tail call* is when one function calls another function as the very last thing it does, so that they both have the same return value. Tail calls are important because they can be translated into gotos; in particular, writing recursive functions so that recursive function calls are tail calls is efficient because the recursion can be translated into a loop.

In the λ-calculus, a term `t`

is in the _{1}*tail position* of term `t`

if one of the following is true:_{2}

`t`

is in the tail position of_{1}`t`

and`t`

is in the tail position of`t`

_{2}`t`

_{1}≡ t_{2}`t`

_{2}≡ λx. t_{1}`t`

_{2}≡ if t_{0}then t_{1}else t_{3}`t`

_{2}≡ if t_{0}then t_{3}else t_{1}

The last three cases are exactly the places our semantics says we *don’t* evaluate inside, i.e., we *don’t* have rules `E ::= λx. E`

, `E ::= if v then E else t`

, or `E ::= if v then t else E`

.

In the right-hand sides of the equations defining CPS, every application `t`

falls into one of these categories:_{1} t_{2}

`t`

is a continuation (in the rule for_{1}`𝒞⟦v⟧`

)`t`

takes two arguments and_{1}`t`

is the first one (_{2}`iszero'`

, etc., and`𝒞`

in the rule for_{v}⟦v_{1}⟧`𝒞⟦v`

_{1}v_{2}⟧`t`

takes a continuation and_{1}`t`

is a continuation_{2}

In CPS, cases (a) and (c) are always tail calls, that is, they always occur in the tail position of the entire term. Case (b) is not, but if we regard case (c) as merely partial application, then we can say: in CPS every *full* application is in the tail position of the entire term.

This restriction has several benefits.

The operational semantics could be simplified. We don’t need evaluation contexts anymore, because the evaluation context is always just

`[·]`

. We do need an additional evaluation rule for β-reduction because of case (b) above:`(λx. t`

_{1}) v_{2}v_{3}⟶ [x↦v_{2}]t_{1}v_{3}But the congruence rules and/or evaluation contexts can go away.

In the corresponding implementation, we don’t need a stack. In your implementation of the big-step semantics, you probably used function calls to achieve the same effect as a stack; under the assumption that the term to be evaluated is in CPS, you would be able to write an interpreter or compiler that doesn’t use target-language function calls.

### The type of continuations

What does the CPS transform do to types? Consider a term `t`

with type `T`

; for simplicity, assume that `T`

is a base type. (Function types are more complicated because `𝒞`

is recursively applied to the return type.) The CPS transform turns this into a function looking for the final continuation, which we’ve decided is `exit`

.

Since `exit`

does not return, its return type should (arguably) be the null type, or the type that contains no values, usually written `⊥`

and pronounced “bottom”. So

```
exit : T→⊥
𝒞⟦t⟧ : (T→⊥)→⊥
```

Since every call is a tail call, every continuation must have the same return type as the final continuation, so for *every* term of base type `T`

, the CPS transform turns it into a term of type `(T→⊥)→⊥`

, and its continuation has type `T→⊥`

.

Under the Curry-Howard isomorphism, `⊥`

corresponds to `false`

. And recall that `X→Y`

in classical logic is defined to be `¬X ∨ Y`

. So the type of a continuation corresponds to `T→⊥ = ¬T ∨ false = ¬T`

. In other words, continuations correspond to negation.