Fragments of System F
Read TAPL 22.7. Pierce covers this topic before System F, because it turns out to be a fairly mild extension of constraint-based typing. But it can also be thought of as a restriction of System F.
Prenex polymorphism
We can restrict System F to only use types where a ∀X
can only occur at the outermost level (known as prenex
polymorphism). Thus, ∀α. ∀β. α → β → α
is
allowed but (∀α. α → α) → (∀α. α → α)
is not.
So the syntax for types can be written as:
T ::= Nat
Bool
X
T → T
S ::= T
∀X. S
where S
stands for type scheme. We call types
S
polymorphic and we call types T
monomorphic.
The syntax for terms is:
t ::= x
λx: T. t
t t
λX. t
t [T]
The typing rules would be the same as for System F.
The restriction to prenex types has some further consequences on syntax:
In abstractions, the variable can’t have polymorphic type (it’s a
T
, not anS
) because then the abstraction would have a non-prenex type.In type applications
t [T]
as well, we restrict the type to be monomorphic. This restriction is called predicativity. Given that we’ve already restricted to prenex types, predicativity does not shrink the set of typable terms.
Question. If we did allow t [S]
where
S
is polymorphic, what would t
have to look
like in order for the result to have a prenex type?)
But these restrictions are too tight; we can’t even type something like
let f = λX. λx: X. x in {f [Nat] 0, f [Bool] true}
≡ (λf: ∀X. X → X. {f [Nat] 0, f [Bool] true}) (λX. λx: X. x)
because the λf
term has type
(∀X. X → X) → {Nat, Bool}
, which is not in prenex
form. In fact, we can’t type any more terms than the simply-typed lambda
calculus can, in the sense that for any well-typed term t
of System F under the restriction to prenex types, there’s a well-typed
simply-typed term t'
such that t
and
t'
are the same after erasing all type information [2, Theorem 6.4].
Question. What are the places where a type abstraction can occur?
Exercise. Show, by induction on typing derivations:
If t
is a well-typed term of System F under the restriction
to prenex types, then for any substitution [X1↦T1,...]
of the free type variables of t
, there is a well-typed term
t'
of λ→ that has the same type-erasure as
[X1↦T1,...] t
.
Let polymorphism
The extension made by the Hindley-Milner type system, used by ML, is
called let-polymorphism. Previously, we defined
(let x = t1 in t2) ≡ (λx. t2) t1
, but now we make
let
a primitive with its own typing rule.
t ::= ...
let x = t in t
Γ ⊢ t1 : S1 Γ, x:S1 ⊢ t2 : S2
——————————————————————————————— (T-Let)
Γ ⊢ let x = t1 in t2 : S2
The non-prenex type that would have appeared fleetingly as the type
of (λx. t2)
no longer appears, and so we do allow
x
to have polymorphic type.
Exercise. Check that the above example is typable.
(There’s one last problem. In the expression
let x = t1 in t2
, if t1
has side-effects, bad
things can happen (pp 335-6). So, to be on the safe side,
t1
has to be a value in order for x
to get a
polymorphic type. This is called the value restriction.)
Type reconstruction
In the simply-typed lambda calculus, we made the typing rules
syntax-directed, even without type annotations, by switching to a
constraint-based formulation. If we did that now, the typing rules would
still not be syntax-directed, because of T-Gen and T-Inst – in the
absence of explicit types, we don’t know where to use them, and in
T-Inst, we don’t know what T2
should be.
Let’s forget about constraint-based typing for a moment and concentrate on the problem of T-Gen and T-Inst. The fix [1] is to realize that T-Gen can only be used just above T-Let. For example:
Γ, X ⊢ t1 : T1
—————————————— (T-Gen)
Γ ⊢ t1 : ∀X.T1 Γ, x:∀X.T1 ⊢ t2 : S2
————————————————————————————————————————————— (T-Let)
Γ ⊢ let x = t1 in t2 : S2
And T-Inst can only be used just below T-Var. For example:
—————————————— x:∀X.T12 ∈ Γ (T-Var)
Γ ⊢ x : ∀X.T12
—————————————————— (T-Inst)
Γ ⊢ x : [X↦T2]T12
So T-Gen can be baked into T-Let, and T-Inst can be baked into T-Var. It’s possible for T-Let to introduce multiple type variables, and for T-Var to instantiate multiple type variables, so the new rules should look like:
Γ, X1, ... ⊢ t1 : T1 Γ, x:∀X1....T1 ⊢ t2 : S2
———————————————————————————————————————————————— (T-GenLet)
Γ ⊢ let x = t1 in t2 : S2
—————————————————————— x:∀X1....T ∈ Γ (T-VarInst)
Γ ⊢ x : [X1→T1, ...]T
Example: let f = λx. x in {f 0, f false}
.
First we derive f:∀X.X→X ⊢ f 0 : Nat
:
(T-VarInst) ―――――――――――――――――――――――― ――――――――――――――――――― (T-Zero)
f:∀X.X→X ⊢ f : Nat→Nat f:∀X.X→X ⊢ 0 : Nat
―――――――――――――――――――――――――――――――――――――――――――――― (T-App)
f:∀X.X→X ⊢ f 0 : Nat
And similarly for f:∀X.X→X ⊢ f false : Bool
.
Then the rest of the derivation is:
⋮ ⋮
―――――――――――― ――――――――――――――――――――― ――――――――――――――――――――――――――
X, x:X ⊢ x:X f:∀X.X→X ⊢ f 0 : Nat f:∀X.X→X ⊢ f false : Bool
―――――――――――――― ―――――――――――――――――――――――――――――――――――――――――――――――――― (T-Pair)
X ⊢ λx.x : X→X f:∀X.X→X ⊢ {f 0, f false} : {Nat, Bool}
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (T-GenLet)
⊢ let f = λx.x in {f 0, f false} : {Nat, Bool}
Note that above, when we used T-VarInst, we had to guess what type to
substitute for X
: in one place it was Nat
and
in another place it was Bool
. To remove the remaining
guesswork, we need constraint-based typing.
Recall our constraint-based typing rules for simple types:
—————————————— x:T ∈ Γ (CT-Var)
Γ ⊢ x : T | {}
Γ, x:X ⊢ t2 : T2 | C
————————————————————— X fresh (CT-AbsInf)
Γ ⊢ λx.t2 : X→T2 | C
Γ ⊢ t1 : T1 | C1 Γ ⊢ t2 : T2 | C2
—————————————————————————————————————— X fresh (CT-App)
Γ ⊢ t1 t2 : X | C1 ∪ C2 ∪ {T1 = T2→X}
A non-polymorphic rule for let
(but we’re about to fuse
it with T-Gen, so this is just for illustration):
Γ ⊢ t1 : T1 | C1 Γ, x:T1 ⊢ t2 : T2 | C2
――――――――――――――――――――――――――――――――――――――――― (CT-Let)
Γ ⊢ let x = t1 in t2 : T2 | C1 ∪ C2
To add polymorphism, we replace CT-Var with CT-VarInst. For each universally-quantified type variable (if any), we have to instantiate it to some type, but instead of guessing what to instantiate it to, we simply guess a fresh, free type variable:
———————————————————— x:T ∈ Γ (CT-VarInst)
Γ ⊢ x : Inst(T) | {}
where Inst
removes all quantifiers and replaces each
quantified type variable with a fresh type variable.
T-GenLet has to guess the type variables X1, ...
that
will be used for typing t1
. We have no way of knowing what
those are, but in practice it does not matter. Instead (p. 333-4), we
just go ahead and type t1
, and look in its type
T1
for any new type variables – that is, type variables
that are free in T1
but not in Γ
.
However, in order to know what T1
is, we need to solve the
constraints immediately instead of saving it all for the end. So
CT-GenLet looks like this:
Γ ⊢ t1 : T1 | C1 Γ, x:Gen(T1', Γ) ⊢ t2 : T2 | C2
——————————————————————————————————————————————————— T1' = unify(T1, C1) (CT-GenLet)
Γ ⊢ let x = t1 in t2 : T2 | C2
where Gen(T1', Γ)
looks for all type variables that are
free in T1'
but not Γ
and universally
quantifies them.
At this point, I think it’s much easier to switch to pseudocode instead:
function typecheck(t, Γ)
if t = x
T = Γ(x)
while T is of the form ∀X.T2
T = [X↦fresh()]T2
return T, {}
else if t = λx.t2
T2, C = typecheck(t2, Γ ∪ {x:fresh()})
return T1→T2, C
else if t = t1 t2
T1, C1 = typecheck(t1, Γ)
T2, C2 = typecheck(t2, Γ)
X = fresh()
return X, C1 ∪ C2 ∪ {T1 = T2→X}
else if t = fix t1
T1, C1 = typecheck(t1, Γ)
X = fresh()
return X, C1 ∪ {T1 = X→X}
else if t = (let x = t1 in t2)
T1, C1 = typecheck(t1, Γ)
T1 = unify(T1, C1)
for type variables X free in T1 but not in Γ
T1 = ∀X. T1
return typecheck(t2, Γ ∪ {x:T1})
function typecheck(t)
T, C = typecheck(t, {})
return unify(T, C)