Recursion

Reading: TAPL Section 5.2, pages 65-67.

TAPL doesn’t attempt to explain the fixed-point combinator and says (p. 65, fn. 8) that such explanations are “fairly intricate.” Here’s an attempt at one.

Conditionals

The book defines a function called test which always evaluates both of its arms (under call-by-value evaluation). This is often not what we want, especially when recursion is involved.

Exercise 5.2.9 shows how to rewrite the factorial function using test; here’s a definition of if that works the same way. Note that it’s not a function, but a “macro”.

if l then m else n ≡ (test l (λz. m) (λz. n)) c0

“Manual” recursion

We’d like to define a factorial function recursively:

fac ≡ λn. if iszro n then c1 else times n (fac (prd n))

But this definition is circular. So, let’s make fac available to itself as a local variable:

fac' ≡ λf. λn. if iszro n then c1 else times n (f (prd n))

Then we promise that whenever fac' is invoked, it will receive itself as its first argument. It may seem like we could just define fac ≡ fac' fac', but we need to keep this promise even for the recursive call as well. So we modify fac' into:

fac" ≡ λh. λn. if iszro n then c1 else times n (h h (prd n))

Now we can define

fac ≡ fac" fac"

To see how this works, note that

fac ≡ fac" fac"
    ≡ (λh. λn. if iszro n then c1 else times n (h h (prd n))) fac"
   λn. if iszro n then c1 else times n (fac" fac" (prd n))

So whenever you self-apply fac", it unrolls one step of recursion – only one, because the new occurrence of fac" fac" is buried deep in the term and self-applies just before the actual recursive call is made.

Check that fac c3 = c6:

fac c3 ≡ fac" fac" c3
       ≡ (λh. λn. if iszro n then c1 else times n (h h (prd n))) fac" c3
      (λn. if iszro n then c1 else times n (fac" fac" (prd n))) c3
      if iszro c3 then c1 else times c3 (fac" fac" (prd c3))
    * if fls then c1 else times c3 (fac" fac" (prd c3))
    * times c3 (fac" fac" (prd c3))
   v* times c3 (fac" fac" c2)

(The v is because the next redex is times c3, but we don’t want to expand the definition of times.) The recursion happens similarly two more times:

   v* times c3 (times c2 (fac" fac" c1))
   v* times c3 (times c2 (times c1 (fac" fac" c0)))

And now the base case:

       ≡ times c3 (times c2 (times c1 ((λh. λn. if iszro n then c1 else times n (h h (prd n))) fac" c0)))
    v times c3 (times c2 (times c1 ((λn. if iszro n then c1 else times n (fac" fac" (prd n))) c0)))
    v times c3 (times c2 (times c1 (if iszro c0 then c1 else times c0 (fac" fac" (prd c0)))))
   v* times c3 (times c2 (times c1 (if tru then c1 else times c0 (fac" fac" (prd c0)))))
   v* times c3 (times c2 (times c1 c1))
   v* times c3 (times c2 c1)
   v* times c3 c2
   v* c6

The fixed-point combinator

This works. But, it’s still not very convenient for the programmer to define recursive functions this way. We previously discarded fac' as incorrect, but it’s simple, so it would be convenient to have a function fix such that we can define fac ≡ fix fac'.

Let’s look at fac' and fac" again and think about how we might express fac" (or something equivalent) in terms of fac':

fac' ≡ λf. λn. if iszro n then c1 else times n (f (prd n))
fac" ≡ λh. λn. if iszro n then c1 else times n (h h (prd n))

How about:

     fac" ≡ λh. fac' (h h)

Unfortunately, that leads to infinite recursion before we even apply fac" fac" to a number:

fac" fac" ≡ (λh. fac' (h h)) fac"
         fac' (fac" fac")
         fac' (fac' (fac" fac"))
         fac' (fac' (fac' (fac" fac")))
          ⋮

To delay the recursion, wrap the self-application h h inside a λv. … v (known as an η-expansion):

     fac" ≡ λh. fac' (λv. h h v)
fac" fac" ≡ (λh. fac' (λv. h h v)) fac"
         fac' (λv. fac" fac" v)
          ≡ (λf. λn. if iszro n then c1 else times n (f (prd n))) (λv. fac" fac" v)
         λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))

Good – now the self-application fac" fac" happens just before the recursive call, as before.

Now we want a function fix such that fix fac' = fac" fac":

  fix ≡ λg. (λh. g (λv. h h v)) (λh. g (λv. h h v))

This function is known as the call-by-value fixed-point combinator, and is also written as Z. If we define fac ≡ fix fac', it reduces to fac" fac" as desired:

fac ≡ fix fac'
    ≡ (λg. (λh. g (λv. h h v)) (λh. g (λv. h h v))) fac'
   (λh. fac' (λv. h h v)) (λh. fac' (λv. h h v))
    ≡ fac" fac"

We can check that fac c3 = c6:

fac c3  fac" fac" c3
      ² (λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))) c3
        if iszro c3 then c1 else times c3 ((λv. fac" fac" v) (prd c3))
      * if fls then c1 else times c3 ((λv. fac" fac" v) (prd c3))
      * times c3 ((λv. fac" fac" v) (prd c3))
     v* times c3 ((λv. fac" fac" v) c2)
      v times c3 (fac" fac" c2)

The recursion happens two more times:

     v* times c3 (times c2 (fac" fac" c1))
     v* times c3 (times c2 (times c1 (fac" fac" c0)))

And now the base case:

     v² times c3 (times c2 (times c1 (λn. if iszro n then c1 else times n ((λv. fac" fac" v) (prd n))) c0))
      v times c3 (times c2 (times c1 (if iszro c0 then c1 else times c0 ((λv. fac" fac" v) (prd c0)))))
     v* times c3 (times c2 (times c1 (if tru then c1 else times c0 ((λv. fac" fac" v) (prd c0)))))
     v* times c3 (times c2 (times c1 c1))
     v* times c3 (times c2 c1)
     v* times c3 c2
     v* c6

Question. What would happen if we had used the better-known Y combinator (namesake of the startup incubator), shown below, instead of Z? What is the Y combinator good for?

  Y ≡ λg. (λh. g (h h)) (λh. g (h h))

Turing-completeness

Here’s a sketch of how a Turing machine might be simulated in the lambda calculus.

We’ll represent the tape as two lists (Exercise 5.2.8). We can represent both tape symbols and states using Church numerals, with a designated tape symbol blank and designated start and accept states q0 and qf. We need a definition of equal that works on both tape symbols and states (Exercise 5.2.7).

Then, given a Turing machine, construct a function f:

f ≡ λf. λq. λl. λa. λr.
      if equal q qf then tru
      
      # for each transition (Q, A, Q', A', N):
      else if and (equal q Q) (equal a A) then
        f Q' l A' r
      
      # for each transition (Q, A, Q', A', L):
      else if and (equal q Q) (equal a A) then
        if (isnil l) then
          f Q' (tail l) (head l) (cons A' r)
        else
          f Q' l A' r

      # for each transition (Q, A, Q', A', R):
      else if and (equal q Q) (equal a A) then
        if isnil r then
          f Q' (cons A' l) blank nil
        else
          f Q' (cons A' l) (head r) (tail r)
        
      ...
      
      else fls

Then λw. fix f q0 nil (head w) (tail w) simulates the Turing machine.