Type reconstruction

Read TAPL 22.1-6. We’re jumping way ahead in the book, but there’s nothing in these sections that depends on any intervening material (as far as I know).

The goal of this chapter is to develop an algorithm for type reconstruction (or type inference): Given an untyped lambda-term, find the most general (or principal) type.

We could start simply by removing type annotations from the typing rules for the simply-typed lambda calculus:

x:T ∈ Γ
———————                                         (T-Var)
Γ ⊢ x:T

 Γ, x:T₁ ⊢ t₂ : T₂
——————————————————                              (T-Abs')
Γ ⊢ λx.t₂ : T₁→T₂

Γ ⊢ t₁ : T₁₁→T₁₂    Γ ⊢ t₂ : T₁₁
—————————————————————————————————               (T-App)
        Γ ⊢ t₁ t₂ : T₁₂

Γ ⊢ t : T→T
—————————————                                   (T-Fix)
Γ ⊢ fix t : T

The problem with these rules is that they are not syntax-directed. With type annotations, we could traverse the syntax tree of a term once and either end up with a type or find that there is no type. Now, the lack of type annotations means that in T-Abs’, we have to guess the type T₁. That’s bad news because the set of possible types is infinite.

Constraint-based typing

The solution pursued in the book is to develop a syntax-directed algorithm that assigns every term a type possibly involving type variables, while generating a set of constraints (a system of equations involving the type variables). Then, in a second phase, it tries to solve the system of equations.

The notation in Figure 22-1 is pretty hard to read. I second the suggestion (p. 321) to ignore the subscripts. I’m also going to go ahead and incorporate the implicit type annotation rule from section 22.6. That makes the typing rules look like this:

——————————————  x:T ∈ Γ                          (CT-Var)
Γ ⊢ x : T | {}

 Γ, x:X ⊢ t₂ : T₂ | C
—————————————————————  X fresh                   (CT-AbsInf)
Γ ⊢ λx.t₂ : X→T₂ | C

 Γ ⊢ t₁ : T₁ | C₁    Γ ⊢ t₂ : T₂ | C₂
——————————————————————————————————————  X fresh  (CT-App)
Γ ⊢ t₁ t₂ : X | C₁ ∪ C₂ ∪ {T₁ = T₂→X}

       Γ ⊢ t₁ : T₁ | C
———————————————————————————————  X fresh         (CT-Fix)
Γ ⊢ fix t₁ : X | C ∪ {T₁ = X→X}

The side condition X fresh means that X is different from the type variables introduced anywhere else in the derivation. It’s a bit of informality that is actually perfectly safe when you implement the algorithm (Exercise 22.3.9, and below).

Here’s an example typing derivation for λf. f 0:

―――――――――――――――――― (CT-Var)  ――――――――――――――――――― (CT-Zero)
f:X₁ ⊢ f : X₁ | {}           f:X₁ ⊢ 0 : Nat | {}
―――――――――――――――――――――――――――――――――――――――――――――――― (CT-App)
       f:X₁ ⊢ f 0 : X₂ | {X₁ = Nat→X₂}
     ――――――――――――――――――――――――――――――――――― (CT-AbsInf)
     ⊢ λf. f 0 : X₁→X₂ | {X₁ = Nat→X₂}

Implementing the algorithm

Here’s a pseudocode implementation of the typing algorithm.

function typecheck_helper(t, Γ)
    if t = x
        return Γ(x), {}
    else if t = λx.t₂
        X = fresh()
        T₂, C = typecheck_helper(t₂, Γ ∪ {x:X})
        return X→T₂, C
    else if t = t₁ t₂
        T₁, C₁ = typecheck_helper(t₁, Γ)
        T₂, C₂ = typecheck_helper(t₂, Γ)
        X = fresh()
        return X, C₁ ∪ C₂ ∪ {T₁ = T₂→X}
    else if t = fix t₁
        T₁, C₁ = typecheck_helper(t₁, Γ)
        X = fresh()
        return X, C₁ ∪ {T₁ = X→X}

The function fresh() returns a new type variable that has never been used before.
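
For readers who want to run the pseudocode above, here is a rough Python sketch of it. The encoding is my own assumption, not the book's: a type variable is a string like "X1", Nat is the tuple ("Nat",), an arrow type S→T is the tuple ("->", S, T), and terms are tagged tuples. The "zero" case is added only so that the example derivation can be reproduced.

```python
from itertools import count

# Assumed encoding (not from the book):
#   types: "X1" (type variable), ("Nat",), ("->", S, T)
#   terms: ("var", x), ("zero",), ("abs", x, body), ("app", t1, t2), ("fix", t1)
NAT = ("Nat",)
_counter = count(1)


def fresh():
    """Return a type variable name that has never been handed out before."""
    return f"X{next(_counter)}"


def typecheck_helper(t, env):
    """Return (type, constraints) for term t; env maps term variables to types."""
    tag = t[0]
    if tag == "var":                       # CT-Var
        return env[t[1]], []
    if tag == "zero":                      # CT-Zero
        return NAT, []
    if tag == "abs":                       # CT-AbsInf
        _, x, body = t
        X = fresh()
        T2, C = typecheck_helper(body, {**env, x: X})
        return ("->", X, T2), C
    if tag == "app":                       # CT-App
        _, t1, t2 = t
        T1, C1 = typecheck_helper(t1, env)
        T2, C2 = typecheck_helper(t2, env)
        X = fresh()
        return X, C1 + C2 + [(T1, ("->", T2, X))]
    if tag == "fix":                       # CT-Fix
        T1, C1 = typecheck_helper(t[1], env)
        X = fresh()
        return X, C1 + [(T1, ("->", X, X))]
    raise ValueError(f"unknown term: {t!r}")
```

Calling typecheck_helper(("abs", "f", ("app", ("var", "f"), ("zero",))), {}) before any other call to fresh() gives the type X1→X2 and the single constraint X1 = Nat→X2, which agrees with the derivation for λf. f 0. Constraints are kept in a list rather than a set purely for simplicity.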

The unification algorithm in Figure 22-2 might also be easier to read as imperative pseudocode:

function unify(T₀, C)
    while C is not empty
        pop a constraint S=T from C
        if S = T
            do nothing
        else if S = X and X ∉ FV(T)
            T₀ = [X↦T]T₀
            C = [X↦T]C
        else if T = X and X ∉ FV(S)
            T₀ = [X↦S]T₀
            C = [X↦S]C
        else if S = S₁→S₂ and T = T₁→T₂
            C = C ∪ {S₁=T₁, S₂=T₂}
        else
            fail
    return T₀

Then the top-level reconstruction function is:

function typecheck(t)
    T, C = typecheck_helper(t, {})
    return unify(T, C)
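
The unify pseudocode can be sketched in Python along the same lines. This is only an illustration under the same assumed encoding as before (type variables are strings, Nat is ("Nat",), and S→T is ("->", S, T)); the helper names is_var, free_vars, substitute and UnificationError are hypothetical.

```python
NAT = ("Nat",)


class UnificationError(Exception):
    """Raised when the constraint set has no solution."""


def is_var(T):
    return isinstance(T, str)


def free_vars(T):
    """Return the set of type variables occurring in T."""
    if is_var(T):
        return {T}
    return set().union(*(free_vars(arg) for arg in T[1:]))


def substitute(X, R, T):
    """Return [X ↦ R]T."""
    if is_var(T):
        return R if T == X else T
    return (T[0],) + tuple(substitute(X, R, arg) for arg in T[1:])


def unify(T0, C):
    """Solve the constraints C, applying each solved variable to T0 as we go."""
    C = list(C)
    while C:
        S, T = C.pop()
        if S == T:
            continue
        if is_var(S) and S not in free_vars(T):
            X, R = S, T
        elif is_var(T) and T not in free_vars(S):
            X, R = T, S
        elif not is_var(S) and not is_var(T) and S[0] == T[0] == "->":
            C.extend([(S[1], T[1]), (S[2], T[2])])
            continue
        else:
            raise UnificationError(f"cannot unify {S} with {T}")
        # Eagerly apply [X ↦ R] to the result type and the remaining constraints.
        T0 = substitute(X, R, T0)
        C = [(substitute(X, R, A), substitute(X, R, B)) for A, B in C]
    return T0
```

For the running example, unify(("->", "X1", "X2"), [("X1", ("->", NAT, "X2"))]) returns the encoding of (Nat→X2)→X2, the principal type of λf. f 0. A constraint such as X1 = X1→Nat fails the occurs check and raises UnificationError.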

The operations T₀ = [X↦T]T₀ and the like are expensive and can be made more efficient if we don’t actually perform these substitutions, but maintain a table that says what each type variable stands for, e.g.,

Type variable    Stands for
X₁               Nat→X₂
X₂               X₃
X₃               X₄

This can be thought of as a union-find data structure, where each set has one or more type variables and at most one type that isn’t a type variable. With path compression and union by rank, each query or update takes almost constant amortized time, so a whole sequence of operations runs in almost linear time.
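
As a rough illustration of the table idea, here is a Python sketch that records what each type variable stands for instead of substituting eagerly. It assumes the encoding in which type variables are strings, Nat is ("Nat",), and S→T is ("->", S, T). The class name Bindings and its methods are hypothetical, and the sketch uses path compression only, without union by rank.

```python
NAT = ("Nat",)


class UnificationError(Exception):
    """Raised when two types cannot be made equal."""


class Bindings:
    def __init__(self):
        self.table = {}  # type variable -> what it stands for

    def find(self, T):
        """Follow the table from T to an unbound variable or a non-variable type."""
        path = []
        while isinstance(T, str) and T in self.table:
            path.append(T)
            T = self.table[T]
        for X in path:  # path compression: point every visited variable at the end
            self.table[X] = T
        return T

    def resolve(self, T):
        """Expand T fully, replacing every bound variable by what it stands for."""
        T = self.find(T)
        if isinstance(T, str):
            return T
        return (T[0],) + tuple(self.resolve(arg) for arg in T[1:])

    def occurs(self, X, T):
        """Occurs check: does the unbound variable X appear in T?"""
        T = self.find(T)
        if isinstance(T, str):
            return T == X
        return any(self.occurs(X, arg) for arg in T[1:])

    def unify(self, S, T):
        """Record whatever bindings are needed to make S and T equal."""
        S, T = self.find(S), self.find(T)
        if S == T:
            return
        if isinstance(T, str) and not isinstance(S, str):
            S, T = T, S  # make sure a variable, if any, is on the left
        if isinstance(S, str):
            if self.occurs(S, T):
                raise UnificationError(f"{S} occurs in {T}")
            self.table[S] = T
        elif S[0] == T[0] == "->":
            self.unify(S[1], T[1])
            self.unify(S[2], T[2])
        else:
            raise UnificationError(f"cannot unify {S} with {T}")
```

Loading the table shown above (X1 ↦ Nat→X2, X2 ↦ X3, X3 ↦ X4) into Bindings().table and calling find("X2") returns "X4" and leaves X2 pointing directly at X4. Calling resolve("X1") then returns the encoding of Nat→X4.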