# The untyped lambda calculus

## Syntax

The syntax of the λ-calculus is simple:

``````t ::= x       variable
λx. t   abstraction
t t     application``````

Two rules to remember about the concrete syntax:

• The body of an abstraction extends as far right as possible (`λx. t1 t2 ≡ λx. (t1 t2)`).

• Application is left-associative (`t1 t2 t3 ≡ (t1 t2) t3`).

If a variable occurrence `x` sits outside any abstractions `λx. t`, it is free; otherwise, it is bound. A bound variable occurrence `x` is bound by the smallest enclosing `λx`, which is called its binder. (In computer programming, we would say that the inner `λx` shadows the outer `λx`.) For example, in `(λx. x (λx. x)) (u r)` (p. 56), the two occurrences of `x` have different binders; we can draw lines to make it clear which variable occurrences are bound by which binders (known as a Stoy diagram):

``````(λx. x (λx. x)) (u r)
└──┘   └──┘``````

## Beta reduction

When an abstraction `λx. t12` is applied to a term `t2`, we substitute `t2` into every occurrence of `x` that is bound by the `λx`. In the example above, the operand `(u r)` should only be substituted into the occurrences of `x` that are bound by the first `λx`:

``````(λx. x (λx. x)) (u r)
└──┘   └──┘
⟶ u r (λx. x)
└──┘``````

Another example that has a pitfall:

``````      ┌──────┐
(λx. λy. x y) (y z)
└────┘
⟶ λy. y z y         bad!
└──────┘``````

This last term is bad because the line and the variable names do not agree: there’s an occurrence of `y` that is not connected by a line to the nearest enclosing `λy`.

This problem is called variable capture. Avoiding it is treated in Section 5.3, but I think it’s worth saying a bit about it now. The short story is that you should first rename the `λy` and all the occurrences of `y` that it binds (this is called α-conversion). This doesn’t change the term at all as far as we are concerned (Convention 5.3.4).

Thus, the right way to do the substitution is:

``````(λx. λy. x y) (y z)
= [x↦y z](λy. x y)   problem: y occurs free in (y z)
= [x↦y z](λw. x w)   first α-convert
= λw. y z w           then apply substitution to body``````

## Evaluation strategies

Because a term can have more than one redex, we need to choose an evaluation strategy (p. 56) that says what order to perform β-reductions in. This is somewhat like the nondeterministic machines we studied in Theory of Computing, in that more than one choice is possible, but different in that we have to follow one choice, not all of them in parallel. See More about evaluation for why this isn’t as arbitrary as it sounds.

The book mentions four evaluation strategies, but only uses call-by-value. It’s safe to ignore the other three evaluation strategies, although we may talk about call-by-name later.

Call-by-value is just what you’re used to from most (if not all) programming languages you’ve used. It has three rules.

• No reductions inside abstractions.

• A redex is reduced only when its right-hand side has already been reduced to a value, that is, a variable or abstraction.

• Subject to the above rules, the leftmost redex is reduced first. (The book doesn’t make this rule entirely clear.)

• To evaluate an abstraction `λx. t`, do nothing.

• To evaluate an application `t1 t2`, first evaluate `t1`, then evaluate `t2`, then β-reduce the application.

As a perhaps more familiar example, imagine the following C program:

``````int triple(int x) { return x+x+x; }
int answer(void) { return 42; }
void bad(void) { while (1) { } }

The “no reductions inside abstractions” rule in C means that code inside a function doesn’t execute until the function is called. Thus, the program won’t spontaneously call `bad`.

The “redex is reduced only when its right-hand side is a value” rule means that the argument of a function call is executed before calling the function. Thus, `answer` is called once before passing its return value to `triple`. Under call-by-name evaluation, the program would call `answer` three times.

## Equivalence

We consider two terms to be identical if they differ only up to consistent renaming of variables. This is called α-equivalence and we write it as `≡`. For example, `λx. x` and `λy. y` differ only in their variable names, so `λx. x ≡ λy. y`.

But sometimes we will want to know whether two terms, even if they aren’t identical, are “behaviorally equivalent” (p. 64). For example, `λx. (λy. y) x` and `λx. (λy. x) z` aren’t identical, but behave the same in the sense that they evaluate to identical values when applied to an argument. For a proper definition of behavioral equivalence, see More about equality, but for our purposes, the following should suffice.

Define βᵥ-reduction (`⟶v`) to be β-reduction in which the argument must be a value (as in call-by-value evaluation), but reductions inside abstractions are allowed (as in full β-reduction). Define `βv-equivalence` to be the reflexive, symmetric, transitive closure of βᵥ-reduction. For example, both `λx. (λy. y) x` and `λx. (λy. x) z` βᵥ-reduce to the same term, so they are βᵥ-equivalent:

``````λx. (λy. y) x ⟶v λx. x
λx. (λy. x) z ⟶v λx. x``````

Then, if two terms are βᵥ-equivalent, then they are behaviorally equivalent [1].

## References

[1] G. D. Plotkin. 1975. Call-by-name, call-by-value and the λ-calculus. Theoretical Computer Science 1, (1975), 125–159. DOI:https://doi.org/10.1016/0304-3975(75)90017-1