Environments
Although the operational semantics presented in class for the λ-calculus can be used as a basis for implementing an interpreter, it’s not necessarily the best fit. First, it may be easier to implement the big-step semantics (Exercise 5.3.8) instead of the small-step semantics. Second, substitution might not be very practical because it modifies code. Especially if functions are compiled, it may be difficult or impossible to perform substitutions on them.
Environments as pending substitutions
So, instead of performing substitutions directly, we store them up in a data structure called an environment ρ
. An environment is just a list of bindings of variables to values:
ρ ::= ε
ρ ::= ρ, x ↦ v
We write ρ(x)
for the value that ρ
binds x
to.
The order of the bindings matters: if there are two bindings for x
, let’s say that the one on the right wins. For example, if
ρ = a ↦ 0, b ↦ succ 0, a ↦ succ (succ 0)
then
ρ(a) = succ (succ 0)
ρ(b) = succ 0
The judgements in the big-step semantics are now of the form ρ ⊢ t ⇓ v
, which means, “Evaluating term t
while performing the substitutions in ρ
yields value v
.”
We now need a rule for variables which looks the variable up in the environment:
———————————— B-Var
ρ ⊢ x ⇓ ρ(x)
The rule for abstractions λx. t
doesn’t perform the substitutions in ρ
; that’s what we were trying to avoid. Instead, the abstraction is bundled with ρ
in a new kind of object called a closure (let’s write it as [ρ](λx. t)
). It’s called that because the environment “closes over” all the free variables in λx. t
to turn it into a closed term.
—————————————————————— B-Abs
ρ ⊢ λx. t ⇓ [ρ](λx. t)
Finally, the rule for applications is:
ρ ⊢ t1 ⇓ [ρ'](λx. t12) ρ ⊢ t2 ⇓ v2 ρ', x↦v2 ⊢ t12 ⇓ v
———————————————————————————————————————————————————————— B-App
ρ ⊢ t1 t2 ⇓ v
This does the following:
Evaluate
t1
in the current environmentρ
to get a closure,[ρ'](λx. t12)
. Note thatρ'
, the environment where the function is defined, might be different fromρ
, the environment where the function is applied.Evaluate
t2
in the current environmentρ
to getv2
.Next we need to evaluate
t12
. The substitutions inρ1
have been saved up fort12
, soρ'
, withx
bound tov2
, is the environment we use. (If we usedρ
instead, we would get dynamic scope.)
The rest of the rules are the same as before, with the addition of an environment.
Here’s an example proof tree using the above big-step semantics to show that (λx. λy. x) 0 true
evaluates to 0
:
(1) ⊢ (λx. λy. x) ⇓ [](λx. λy. x) B-Abs
(2) ⊢ 0 ⇓ 0 B-Zero
(3) x↦0 ⊢ λy. x ⇓ [x↦0](λy. k) B-Abs
(4) ⊢ [](λx. λy. x) 0 ⇓ [x↦0](λy. x) B-App (1) (2) (3)
(5) ⊢ true ⇓ true B-True
(6) x↦0, y↦true ⊢ x ⇓ 0 B-Var
(7) ⊢ [](λx. λy. x) 0 true ⇓ 0 B-App (4) (5) (6)
Data structures for environments
An environment is a list of bindings. Each time a closure is applied, its environment is extended with a new binding for its argument. This happens in the third premise of B-App, where ρ'
is extended to ρ', x↦v2
. But it would be a waste of memory to copy ρ'
every time. Moreover, if variables are mutable, we don’t want a variable to have multiple copies, because assigning to the variable would require updating all the copies.
To avoid copying, we can store environments as linked lists. Each list node (known as an activation record or AR) stores a binding and a pointer (called an access link) to the rest of the environment. Then, extending ρ'
just means creating a new AR for x↦v2
and pointing its access link to ρ'
.
For example, recall that
pair ≡ (λx. λy. f x y)
triple ≡ (λx. λy. λz. λf. f x y z)
Here’s what triple 0 true false
looks like:
The advantage of using linked lists is that in let t = triple 0 true false in pair t t
, several environments share the bindings x↦0, y↦true, z↦false
instead of copying them:
Because of this sharing, some kind of garbage collection (at least reference counting) is generally needed.
Some languages place restrictions to simplify implementation of closures and environments. For example, in C, nested functions are not allowed. Since every function is defined in the same environment (of global variables), closures don’t need an explicit pointer to it – that is, closures are just function pointers. For the same reason, the only environment that gets extended (ρ'
in the third premise of B-App) is the global environment, so ARs do not need explicit access links. Moreover, since neither closures nor ARs have pointers to other ARs, no garbage collection is needed.
But the general trend appears to be that modern languages are either designed with closures (JavaScript, Go, Swift) or got them later (C++11, Python 2.2, Java 8).