Although the operational semantics presented in class for the λ-calculus can be used as a basis for implementing an interpreter, it’s not necessarily the best fit. First, it may be easier to implement the big-step semantics (Exercise 5.3.8) instead of the small-step semantics. Second, substitution might not be very practical because it modifies code. Especially if functions are compiled, it may be difficult or impossible to perform substitutions on them.
Environments as pending substitutions
So, instead of performing substitutions directly, we store them up in a data structure called an environment
ρ. An environment is just a list of bindings of variables to values:
ρ ::= ε ρ ::= ρ, x ↦ v
ρ(x) for the value that
The order of the bindings matters: if there are two bindings for
x, let’s say that the one on the right wins. For example, if
ρ = a ↦ 0, b ↦ succ 0, a ↦ succ (succ 0)
ρ(a) = succ (succ 0) ρ(b) = succ 0
The judgements in the big-step semantics are now of the form
ρ ⊢ t ⇓ v, which means, “Evaluating term
t while performing the substitutions in
ρ yields value
We now need a rule for variables which looks the variable up in the environment:
———————————— B-Var ρ ⊢ x ⇓ ρ(x)
The rule for abstractions
λx. t doesn’t perform the substitutions in
ρ; that’s what we were trying to avoid. Instead, the abstraction is bundled with
ρ in a new kind of object called a closure (let’s write it as
[ρ](λx. t)). It’s called that because the environment “closes over” all the free variables in
λx. t to turn it into a closed term.
—————————————————————— B-Abs ρ ⊢ λx. t ⇓ [ρ](λx. t)
Finally, the rule for applications is:
ρ ⊢ t1 ⇓ [ρ'](λx. t12) ρ ⊢ t2 ⇓ v2 ρ', x↦v2 ⊢ t12 ⇓ v ———————————————————————————————————————————————————————— B-App ρ ⊢ t1 t2 ⇓ v
This does the following:
t1in the current environment
ρto get a closure,
[ρ'](λx. t12). Note that
ρ', the environment where the function is defined, might be different from
ρ, the environment where the function is applied.
t2in the current environment
Next we need to evaluate
t12. The substitutions in
ρ1have been saved up for
v2, is the environment we use. (If we used
ρinstead, we would get dynamic scope.)
The rest of the rules are the same as before, with the addition of an environment.
Here’s an example proof tree using the above big-step semantics to show that
(λx. λy. x) 0 true evaluates to
(1) ⊢ (λx. λy. x) ⇓ (λx. λy. x) B-Abs (2) ⊢ 0 ⇓ 0 B-Zero (3) x↦0 ⊢ λy. x ⇓ [x↦0](λy. k) B-Abs (4) ⊢ (λx. λy. x) 0 ⇓ [x↦0](λy. x) B-App (1) (2) (3) (5) ⊢ true ⇓ true B-True (6) x↦0, y↦true ⊢ x ⇓ 0 B-Var (7) ⊢ (λx. λy. x) 0 true ⇓ 0 B-App (4) (5) (6)
Data structures for environments
An environment is a list of bindings. Each time a closure is applied, its environment is extended with a new binding for its argument. This happens in the third premise of B-App, where
ρ' is extended to
ρ', x↦v2. But it would be a waste of memory to copy
ρ' every time. Moreover, if variables are mutable, we don’t want a variable to have multiple copies, because assigning to the variable would require updating all the copies.
To avoid copying, we can store environments as linked lists. Each list node (known as an activation record or AR) stores a binding and a pointer (called an access link) to the rest of the environment. Then, extending
ρ' just means creating a new AR for
x↦v2 and pointing its access link to
For example, recall that
pair ≡ (λx. λy. f x y) triple ≡ (λx. λy. λz. λf. f x y z)
triple 0 true false looks like:
The advantage of using linked lists is that in
let t = triple 0 true false in pair t t, several environments share the bindings
x↦0, y↦true, z↦false instead of copying them:
Because of this sharing, some kind of garbage collection (at least reference counting) is generally needed.
Some languages place restrictions to simplify implementation of closures and environments. For example, in C, nested functions are not allowed. Since every function is defined in the same environment (of global variables), closures don’t need an explicit pointer to it – that is, closures are just function pointers. For the same reason, the only environment that gets extended (
ρ' in the third premise of B-App) is the global environment, so ARs do not need explicit access links. Moreover, since neither closures nor ARs have pointers to other ARs, no garbage collection is needed.