Recursion
Reading: TAPL Section 5.2, pages 65-67.
TAPL doesn’t attempt to explain the fixed-point combinator and says (p. 65, fn. 8) that such explanations are “fairly intricate.” Here’s an attempt at one.
Conditionals
The book defines a function called test which always evaluates both of its arms (under call-by-value evaluation). This is often not what we want, especially when recursion is involved. Exercise 5.2.9 shows how to rewrite the factorial function using test; here’s a definition of if that works the same way. Note that it’s not a function, but a “macro”.
if l then m else n ≡ (test l (λz. m) (λz. n)) c0
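Here’s how this plays out in Python, which (like our call-by-value evaluation) evaluates arguments before making a call. This is only a sketch: the names tru, fls, and test follow the book, while loud is a helper of mine for observing evaluation, and 0 stands in for the dummy argument c0.

tru = lambda m: lambda n: m        # Church booleans: tru picks its first argument,
fls = lambda m: lambda n: n        # fls picks its second
test = lambda l: lambda m: lambda n: l(m)(n)

def loud(x):
    print("evaluating", x)
    return x

test(tru)(loud("then"))(loud("else"))                           # prints both messages
test(tru)(lambda z: loud("then"))(lambda z: loud("else"))(0)    # prints only "evaluating then"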
“Manual” recursion
We’d like to define a factorial function recursively:
fac ≡ λn. if iszro n then c1 else times n (fac (prd n))
But this definition is circular. So, let’s make fac available to itself as a local variable:
fac' ≡ λf. λn. if iszro n then c1 else times n (f (prd n))
Then we promise that whenever fac' is invoked, it will receive itself as its first argument. It may seem like we could just define fac ≡ fac' fac', but we need to keep this promise for the recursive call as well. So we modify fac' into:
fac" ≡ λh. λn. if iszro n then c1 else times n (h h (prd n))
Now we can define
fac ≡ fac" fac"
To see how this works, note that
fac ≡ fac" fac"
≡ (λh. λn. if iszro n then c1 else times n (h h (prd n))) fac"
⟶ λn. if iszro n then c1 else times n (fac" fac" (prd n))
So whenever you self-apply fac", it unrolls one step of recursion – only one, because the new occurrence of fac" fac" is buried deep in the term and self-applies just before the actual recursive call is made.
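Here is the same trick transcribed into Python, as a sketch: native integers and Python’s conditional expression stand in for Church numerals and the if macro, and the names fac2 and fac are mine. Like our setting, Python is call-by-value.

fac2 = lambda h: lambda n: 1 if n == 0 else n * h(h)(n - 1)   # fac"
fac = fac2(fac2)                                              # fac ≡ fac" fac"
print(fac(3))   # 6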
Check that fac c3 = c6:
fac c3 ≡ fac" fac" c3
≡ (λh. λn. if iszro n then c1 else times n (h h (prd n))) fac" c3
⟶ (λn. if iszro n then c1 else times n (fac" fac" (prd n))) c3
⟶ if iszro c3 then c1 else times c3 (fac" fac" (prd c3))
⟶* if fls then c1 else times c3 (fac" fac" (prd c3))
⟶* times c3 (fac" fac" (prd c3))
⟶v* times c3 (fac" fac" c2)
(The ⟶v is because the next redex is times c3, but we don’t want to expand the definition of times.) The recursion happens similarly two more times:
⟶v* times c3 (times c2 (fac" fac" c1))
⟶v* times c3 (times c2 (times c1 (fac" fac" c0)))
And now the base case:
≡ times c3 (times c2 (times c1 ((λh. λn. if iszro n then c1 else times n (h h (prd n))) fac" c0)))
⟶v times c3 (times c2 (times c1 ((λn. if iszro n then c1 else times n (fac" fac" (prd n))) c0)))
⟶v times c3 (times c2 (times c1 (if iszro c0 then c1 else times c0 (fac" fac" (prd c0)))))
⟶v* times c3 (times c2 (times c1 (if tru then c1 else times c0 (fac" fac" (prd c0)))))
⟶v* times c3 (times c2 (times c1 c1))
⟶v* times c3 (times c2 c1)
⟶v* times c3 c2
⟶v* c6
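To replay this trace mechanically, here is a sketch in Python that uses the actual Church encodings: the numerals, booleans, times, iszro, and the pair-based prd are the standard TAPL definitions, and the if macro from the Conditionals section (apply the chosen thunk to a dummy c0) keeps the recursive arm from being forced eagerly. The names fac2, fac, and to_int are mine.

tru = lambda m: lambda n: m
fls = lambda m: lambda n: n

c0  = lambda s: lambda z: z
scc = lambda n: lambda s: lambda z: s(n(s)(z))
c1, c2, c3 = scc(c0), scc(scc(c0)), scc(scc(scc(c0)))

times = lambda m: lambda n: lambda s: m(n(s))
iszro = lambda m: m(lambda x: fls)(tru)

pair = lambda f: lambda s: lambda b: b(f)(s)
fst  = lambda p: p(tru)
snd  = lambda p: p(fls)
prd  = lambda m: fst(m(lambda p: pair(snd(p))(scc(snd(p))))(pair(c0)(c0)))

# fac": both arms are thunked, and the chosen one is applied to the dummy c0
fac2 = lambda h: lambda n: iszro(n)(lambda z: c1)(lambda z: times(n)(h(h)(prd(n))))(c0)
fac = fac2(fac2)

to_int = lambda n: n(lambda x: x + 1)(0)
print(to_int(fac(c3)))   # 6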
The fixed-point combinator
This works, but it’s still not very convenient for the programmer to define recursive functions this way. We previously set fac' aside because it doesn’t work on its own, but it’s the simpler definition, so it would be convenient to have a function fix such that we can define fac ≡ fix fac'.
Let’s look at fac' and fac" again and think about how we might express fac" (or something equivalent) in terms of fac':
fac' ≡ λf. λn. if iszro n then c1 else times n (f (prd n))
fac" ≡ λh. λn. if iszro n then c1 else times n (h h (prd n))
How about:
fac" ≡ λh. fac' (h h)
Unfortunately, that leads to infinite recursion before we even apply fac" fac" to a number:
fac" fac" ≡ (λh. fac' (h h)) fac"
⟶ fac' (fac" fac")
⟶ fac' (fac' (fac" fac"))
⟶ fac' (fac' (fac' (fac" fac")))
⋮
To delay the recursion, wrap the self-application h h inside a λv. … v (known as an η-expansion):
fac" ≡ λh. fac' (λv. h h v)
fac" fac" ≡ (λh. fac' (λv. h h v)) fac"
⟶ fac' (λv. fac" fac" v)
≡ (λf. λn. if iszro n then c1 else times n (f (prd n))) (λv. fac" fac" v)
⟶ λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))
Good – now the self-application fac" fac" happens just before the recursive call, as before.
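The difference is easy to observe in Python, which is call-by-value like our evaluation strategy. This is a sketch with native integers standing in for Church numerals; the names fac_prime, fac2_eager, and fac2 are mine.

fac_prime = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)   # fac'

fac2_eager = lambda h: fac_prime(h(h))           # eager self-application
# fac2_eager(fac2_eager)                         # runs away: RecursionError

fac2 = lambda h: fac_prime(lambda v: h(h)(v))    # η-expanded: h h is delayed
print(fac2(fac2)(3))                             # 6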
Now we want a function fix such that fix fac' = fac" fac":
fix ≡ λg. (λh. g (λv. h h v)) (λh. g (λv. h h v))
This function is known as the call-by-value fixed-point combinator, and is also written as Z. If we define fac ≡ fix fac', it reduces to fac" fac" as desired:
fac ≡ fix fac'
≡ (λg. (λh. g (λv. h h v)) (λh. g (λv. h h v))) fac'
⟶ (λh. fac' (λv. h h v)) (λh. fac' (λv. h h v))
≡ fac" fac"
We can check that fac c3 = c6:
fac c3 ⟶ fac" fac" c3
⟶² (λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))) c3
⟶ if iszro c3 then c1 else times c3 ((λv. fac" fac" v) (prd c3))
⟶* if fls then c1 else times c3 ((λv. fac" fac" v) (prd c3))
⟶* times c3 ((λv. fac" fac" v) (prd c3))
⟶v* times c3 ((λv. fac" fac" v) c2)
⟶v times c3 (fac" fac" c2)
The recursion happens two more times:
⟶v* times c3 (times c2 (fac" fac" c1))
⟶v* times c3 (times c2 (times c1 (fac" fac" c0)))
And now the base case:
⟶v² times c3 (times c2 (times c1 ((λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))) c0)))
⟶v times c3 (times c2 (times c1 (if iszro c0 then c1 else times c0 ((λv. fac" fac" v) (prd c0)))))
⟶v* times c3 (times c2 (times c1 (if tru then c1 else times c0 ((λv. fac" fac" v) (prd c0)))))
⟶v* times c3 (times c2 (times c1 c1))
⟶v* times c3 (times c2 c1)
⟶v* times c3 c2
⟶v* c6
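The definition of fix transliterates directly into Python, again as a sketch with native integers standing in for Church numerals (the names Z, fac_prime, and fib are mine).

Z = lambda g: (lambda h: g(lambda v: h(h)(v)))(lambda h: g(lambda v: h(h)(v)))

fac_prime = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)   # fac'
fac = Z(fac_prime)                                              # fac ≡ fix fac'
print(fac(3))    # 6

# The same recipe works for any recursive definition, for example:
fib = Z(lambda f: lambda n: n if n < 2 else f(n - 1) + f(n - 2))
print(fib(10))   # 55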
Question. What would happen if we had used the better-known Y combinator (namesake of the startup incubator), shown below, instead of Z? What is the Y combinator good for?
Y ≡ λg. (λh. g (h h)) (λh. g (h h))
Turing-completeness
Here’s a sketch of how a Turing machine might be simulated in the lambda calculus.
We’ll represent the tape as two lists (Exercise 5.2.8). We can represent both tape symbols and states using Church numerals, with a designated tape symbol blank and designated start and accept states q0 and qf. We need a definition of equal that works on both tape symbols and states (Exercise 5.2.7).
Then, given a Turing machine, construct a function f, where the first argument is the recursive self-reference that fix will supply, q is the current state, l and r are the tape to the left and right of the head, and a is the symbol under the head:
f ≡ λf. λq. λl. λa. λr.
  if equal q qf then tru
  # for each transition (Q, A, Q', A', N):
  else if and (equal q Q) (equal a A) then
    f Q' l A' r
  # for each transition (Q, A, Q', A', L):
  else if and (equal q Q) (equal a A) then
    if isnil l then
      f Q' l A' r
    else
      f Q' (tail l) (head l) (cons A' r)
  # for each transition (Q, A, Q', A', R):
  else if and (equal q Q) (equal a A) then
    if isnil r then
      f Q' (cons A' l) blank nil
    else
      f Q' (cons A' l) (head r) (tail r)
  ...
  else fls
Then λw. fix f q0 nil (head w) (tail w) simulates the Turing machine.
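Here is a sketch of the same construction in Python, with Python lists and native recursion standing in for the Church-encoded lists and fix. The machine delta, the names BLANK, q0, qf, run, and simulate, and the example input are hypothetical, chosen only to illustrate the shape of the transition dispatch above.

BLANK = 0
q0, qf = 'q0', 'qf'

# Hypothetical machine: move right over 1s, write a 1 on the first blank, accept.
delta = {('q0', 1):     ('q0', 1, 'R'),
         ('q0', BLANK): ('qf', 1, 'N')}

def run(q, l, a, r):
    # Native recursion plays the role of "fix f"; (l, a, r) is the split tape,
    # with the nearest cells at the heads of l and r.
    if q == qf:
        return True                                    # tru: accept
    if (q, a) not in delta:
        return False                                   # fls: no transition applies
    q2, a2, move = delta[(q, a)]
    if move == 'N':                                    # write a2, stay put
        return run(q2, l, a2, r)
    if move == 'L':                                    # write a2, move left (stay at the left end)
        return run(q2, l, a2, r) if not l else run(q2, l[1:], l[0], [a2] + r)
    return (run(q2, [a2] + l, BLANK, []) if not r      # write a2, move right,
            else run(q2, [a2] + l, r[0], r[1:]))       # padding with a blank if needed

def simulate(w):
    # corresponds to  λw. fix f q0 nil (head w) (tail w)  (guarding against empty input)
    return run(q0, [], w[0] if w else BLANK, w[1:])

print(simulate([1, 1, 1]))   # True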