The untyped lambda calculus

Reading: TAPL Chapter 5.1.

Syntax

The syntax of the λ-calculus is simple:

t ::= x       variable
      λx. t   abstraction
      t t     application

Two rules to remember about the concrete syntax:

The body of an abstraction extends as far right as possible (λx. t₁ t₂ ≡ λx. (t₁ t₂)).
Application is left-associative (t₁ t₂ t₃ ≡ (t₁ t₂) t₃).

If a variable occurrence x sits outside any abstractions λx. t, it is free; otherwise, it is bound. A bound variable occurrence x is bound by the smallest enclosing λx, which is called its binder. (In computer programming, we would say that the inner λx shadows the outer λx.) For example, in (λx. x (λx. x)) (u r) (p. 56), the two occurrences of x have different binders; we can draw lines to make it clear which variable occurrences are bound by which binders (known as a Stoy diagram):

(λx. x (λx. x)) (u r)
  └──┘   └──┘

Beta reduction

When an abstraction λx. t₁₂ is applied to a term t₂, we substitute t₂ into every occurrence of x that is bound by the λx. In the example above, the operand (u r) should only be substituted into the occurrences of x that are bound by the first λx:

(λx. x (λx. x)) (u r)
  └──┘   └──┘
⟶ u r (λx. x)
          └──┘

Another example that has a pitfall:

      ┌──────┐
    (λx. λy. x y) (y z)
          └────┘ 
    ⟶ λy. y z y         bad!
         └──────┘

This last term is bad because the line and the variable names do not agree: there’s an occurrence of y that is not connected by a line to the nearest enclosing λy.

This problem is called variable capture. Avoiding it is treated in Section 5.3, but I think it’s worth saying a bit about it now. The short story is that you should first rename the λy and all the occurrences of y that it binds (this is called α-conversion). This doesn’t change the term at all as far as we are concerned (Convention 5.3.4).

Thus, the right way to do the substitution is:

(λx. λy. x y) (y z)
= [x↦y z](λy. x y)   problem: y occurs free in (y z)
= [x↦y z](λw. x w)   first α-convert
= λw. y z w           then apply substitution to body

Evaluation strategies

Because a term can have more than one redex, we need to choose an evaluation strategy (p. 56) that says what order to perform β-reductions in. This is somewhat like the nondeterministic machines we studied in Theory of Computing, in that more than one choice is possible, but different in that we have to follow one choice, not all of them in parallel. See More about evaluation for why this isn’t as arbitrary as it sounds.

The book mentions four evaluation strategies, but only uses call-by-value. It’s safe to ignore the other three evaluation strategies, although we may talk about call-by-name later.

Call-by-value is just what you’re used to from most (if not all) programming languages you’ve used. It has three rules.

No reductions inside abstractions.
A redex is reduced only when its right-hand side has already been reduced to a value, that is, a variable or abstraction.
Subject to the above rules, the leftmost redex is reduced first. (The book doesn’t make this rule entirely clear.)

It may be easier to think about this way:

To evaluate an abstraction λx. t, do nothing.
To evaluate an application t₁ t₂, first evaluate t₁, then evaluate t₂, then β-reduce the application.

As a perhaps more familiar example, imagine the following C program:

int triple(int x) { return x+x+x; }
int answer(void) { return 42; }
void bad(void) { while (1) { } }
void main(void) { triple(answer()); }

The “no reductions inside abstractions” rule in C means that code inside a function doesn’t execute until the function is called. Thus, the program won’t spontaneously call bad.

The “redex is reduced only when its right-hand side is a value” rule means that the argument of a function call is executed before calling the function. Thus, answer is called once before passing its return value to triple. Under call-by-name evaluation, the program would call answer three times.

Equivalence

We consider two terms to be identical if they differ only up to consistent renaming of variables. This is called α-equivalence and we write it as ≡. For example, λx. x and λy. y differ only in their variable names, so λx. x ≡ λy. y.

But sometimes we will want to know whether two terms, even if they aren’t identical, are “behaviorally equivalent” (p. 64). For example, λx. (λy. y) x and λx. (λy. x) z aren’t identical, but behave the same in the sense that they evaluate to identical values when applied to an argument. For a proper definition of behavioral equivalence, see More about equality, but for our purposes, the following should suffice.

Define βᵥ-reduction (⟶_v) to be β-reduction in which the argument must be a value (as in call-by-value evaluation), but reductions inside abstractions are allowed (as in full β-reduction). Define β_v-equivalence to be the reflexive, symmetric, transitive closure of βᵥ-reduction. For example, both λx. (λy. y) x and λx. (λy. x) z βᵥ-reduce to the same term, so they are βᵥ-equivalent:

λx. (λy. y) x ⟶_v λx. x
λx. (λy. x) z ⟶_v λx. x

Then, if two terms are βᵥ-equivalent, then they are behaviorally equivalent [1].

References

[1] G. D. Plotkin. 1975. Call-by-name, call-by-value and the λ-calculus. Theoretical Computer Science 1, (1975), 125–159. DOI:https://doi.org/10.1016/0304-3975(75)90017-1