Environments

Although the operational semantics presented in class for the λ-calculus can be used as a basis for implementing an interpreter, it’s not necessarily the best fit. First, it may be easier to implement the big-step semantics (Exercise 5.3.8) instead of the small-step semantics. Second, substitution might not be very practical because it modifies code. Especially if functions are compiled, it may be difficult or impossible to perform substitutions on them.

Environments as pending substitutions

So, instead of performing substitutions directly, we store them up in a data structure called an environment ρ. An environment is just a list of bindings of variables to values:

ρ ::= ε
ρ ::= ρ, x ↦ v

We write ρ(x) for the value that ρ binds x to.

The order of the bindings matters: if there are two bindings for x, let’s say that the one on the right wins. For example, if

ρ = a ↦ 0, b ↦ succ 0, a ↦ succ (succ 0)

then

ρ(a) = succ (succ 0)
ρ(b) = succ 0

The judgements in the big-step semantics are now of the form ρ ⊢ t ⇓ v, which means, “Evaluating term t while performing the substitutions in ρ yields value v.”

We now need a rule for variables which looks the variable up in the environment:

————————————  B-Var
ρ ⊢ x ⇓ ρ(x)

The rule for abstractions λx. t doesn’t perform the substitutions in ρ; that’s what we were trying to avoid. Instead, the abstraction is bundled with ρ in a new kind of object called a closure (let’s write it as [ρ](λx. t)). It’s called that because the environment “closes over” all the free variables in λx. t to turn it into a closed term.

——————————————————————  B-Abs
ρ ⊢ λx. t ⇓ [ρ](λx. t)

Finally, the rule for applications is:

ρ ⊢ t₁ ⇓ [ρ'](λx. t₁₂)  ρ ⊢ t₂ ⇓ v₂  ρ', x↦v₂ ⊢ t₁₂ ⇓ v
————————————————————————————————————————————————————————   B-App
                      ρ ⊢ t₁ t₂ ⇓ v

This does the following:

Evaluate t₁ in the current environment ρ to get a closure, [ρ'](λx. t₁₂). Note that ρ', the environment where the function is defined, might be different from ρ, the environment where the function is applied.
Evaluate t₂ in the current environment ρ to get v₂.
Next we need to evaluate t₁₂. The substitutions in ρ₁ have been saved up for t₁₂, so ρ', with x bound to v₂, is the environment we use. (If we used ρ instead, we would get dynamic scope.)

The rest of the rules are the same as before, with the addition of an environment.

Here’s an example proof tree using the above big-step semantics to show that (λx. λy. x) 0 true evaluates to 0:

(1) ⊢ (λx. λy. x) ⇓ [](λx. λy. x)     B-Abs
(2) ⊢ 0 ⇓ 0                           B-Zero
(3) x↦0 ⊢ λy. x ⇓ [x↦0](λy. k)      B-Abs
(4) ⊢ [](λx. λy. x) 0 ⇓ [x↦0](λy. x) B-App (1) (2) (3)
(5) ⊢ true ⇓ true                     B-True
(6) x↦0, y↦true ⊢ x ⇓ 0             B-Var
(7) ⊢ [](λx. λy. x) 0 true ⇓ 0        B-App (4) (5) (6)

Data structures for environments

An environment is a list of bindings. Each time a closure is applied, its environment is extended with a new binding for its argument. This happens in the third premise of B-App, where ρ' is extended to ρ', x↦v₂. But it would be a waste of memory to copy ρ' every time. Moreover, if variables are mutable, we don’t want a variable to have multiple copies, because assigning to the variable would require updating all the copies.

To avoid copying, we can store environments as linked lists. Each list node (known as an activation record or AR) stores a binding and a pointer (called an access link) to the rest of the environment. Then, extending ρ' just means creating a new AR for x↦v₂ and pointing its access link to ρ'.

For example, recall that

  pair ≡ (λx. λy. f x y)
triple ≡ (λx. λy. λz. λf. f x y z)

Here’s what triple 0 true false looks like:

Box diagram for triple 0 true false — Box diagram for `triple 0 true false`

The advantage of using linked lists is that in let t = triple 0 true false in pair t t, several environments share the bindings x↦0, y↦true, z↦false instead of copying them:

Box diagram for let t = triple 0 true false in pair t t — Box diagram for `let t = triple 0 true false in pair t t`

Because of this sharing, some kind of garbage collection (at least reference counting) is generally needed.

Some languages place restrictions to simplify implementation of closures and environments. For example, in C, nested functions are not allowed. Since every function is defined in the same environment (of global variables), closures don’t need an explicit pointer to it – that is, closures are just function pointers. For the same reason, the only environment that gets extended (ρ' in the third premise of B-App) is the global environment, so ARs do not need explicit access links. Moreover, since neither closures nor ARs have pointers to other ARs, no garbage collection is needed.

But the general trend appears to be that modern languages are either designed with closures (JavaScript, Go, Swift) or got them later (C++11, Python 2.2, Java 8).