More about equality (graduate only)
We’ve defined the λ-calculus in terms of reduction
(⟶
), which is natural for thinking about λ-calculus as
a mechanism for computation. But we can also think about it in terms of
equivalence between terms. There is more than one way to do
this.
We’ve already talked about α-equivalence. This is the finest-grained notion of equivalence we have, and not very interesting.
β-equivalence
Another notion of equivalence is the reflexive, symmetric, transitive closure of β-reduction, called β-equivalence. That is, two terms are β-equivalent iff there is a sequence of zero or more reductions and expansions between them. Because of confluence, this can be restated as: two terms are β-equivalent iff they reduce in zero or more steps to the same term.
Under full β-reduction, that would be a reasonable notion of
equivalence, but under call-by-value, it’s too fine-grained. For
example, since we don’t reduce inside lambdas, this definition doesn’t
equate (say) λs. λz. s z
and
λs. λz. s ((λs. λz. z) s z)
. But intuitively these two
terms do the same thing.
Observational equivalence
The coarsest-grained notion of equivalence is called
observational (or behavioral or contextual or
operational) equivalence. Define a context to be a
closed term with a single hole [·]
in it; this is like an
evaluation context, except the hole can occur anywhere. Then two terms
t
and t'
are called observationally
equivalent iff for all contexts C
such that
C[t]
and C[t']
are closed, C[t]
and C[t']
have the same “observable” result.
What “observable” means seems to depend on the situation. For the
pure untyped λ-calculus, having the same “observable” result means that
they either both converge (not necessarily to the same thing) or both
diverge. This definition may be surprising, but the idea is that if
t
and t'
can be distinguished, then it should
be possible to make a context that makes one of them converge and the
other diverge.
Equational theories
Proving that two terms are observationally equivalent can be
difficult. But we can write down an equational theory with
which we can sometimes (not necessarily always) prove that two terms are
observationally equivalent. Above, we proposed to use the reflexive,
symmetric, transitive closure of ⟶
, which we can write
out as:
t = t (Eq-Reflexive)
t = t'
――――――― (Eq-Symmetric)
t' = t
t = t' t' = t"
―――――――――――――――― (Eq-Transitive)
t = t"
t = t'
―――――――――――― (Eq-Context)
E[t] = E[t']
(λx. t12) v2 = [x↦v2]t12 (Eq-AppAbs)
But it would seem that, just as in algebra, if two terms are equal,
we ought to be able to substitute them into any context C
(not just an evaluation context), and the resulting terms should be
equal:
t = t'
―――――――――――― (Eq-Context')
C[t] = C[t']
(This rule replaces Eq-Context.) In other words, the new definition
is the reflexive, symmetric, transitive closure of full βᵥ-reduction
(⟶v
), which is what we’ve been using.
The important fact is that if t = t'
according to the
above rules, then they are observationally equivalent [1]. (The proof is not difficult using
the results mentioned in More about
evaluation.)
I think that should be enough for our purposes, but there are other axioms that could be added to the theory – if you’re curious, they are [2,3]:
λx. v x = v x ∉ FV(v) (ηv)
(λx. E[x]) t2 = E[t2] x ∉ FV(E) (βΩ)
With the addition of these axioms, the equational theory is complete in the sense that if two terms are observationally equivalent (for call-by-value evaluation), then they can be proven equivalent using these axioms.