# Fragments of System F

Read TAPL 22.7. Pierce covers this topic before System F, because it turns out to be a fairly mild extension of constraint-based typing. But it can also be thought of as a restriction of System F.

## Prenex polymorphism

We can restrict System F to only use types where a `∀X`

can only occur at the outermost level (known as *prenex polymorphism*). Thus, `∀α. ∀β. α → β → α`

is allowed but `(∀α. α → α) → (∀α. α → α)`

is not. So the syntax for types can be written as:

```
T ::= Nat
Bool
X
T → T
S ::= T
∀X. S
```

where `S`

stands for type *scheme*. We call types `S`

*polymorphic* and we call types `T`

*monomorphic*.

The syntax for terms is:

```
t ::= x
λx: T. t
t t
λX. t
t [T]
```

The typing rules would be the same as for System F.

The restriction to prenex types has some further consequences on syntax:

In abstractions, the variable can’t have polymorphic type (it’s a

`T`

, not an`S`

) because then the abstraction would have a non-prenex type.In type applications

`t [T]`

as well, we restrict the type to be monomorphic. This restriction is called*predicativity*. Given that we’ve already restricted to prenex types, predicativity does not shrink the set of typable terms.

**Question**. If we did allow `t [S]`

where `S`

is polymorphic, what would `t`

have to look like in order for the result to have a prenex type?)

But these restrictions are too tight; we can’t even type something like

```
let f = λX. λx: X. x in {f [Nat] 0, f [Bool] true}
≡ (λf: ∀X. X → X. {f [Nat] 0, f [Bool] true}) (λX. λx: X. x)
```

because the `λf`

term has type `(∀X. X → X) → {Nat, Bool}`

, which is not in prenex form. In fact, we can’t type any more terms than the simply-typed lambda calculus can, in the sense that for any well-typed term `t`

of System F under the restriction to prenex types, there’s a well-typed simply-typed term `t'`

such that `t`

and `t'`

are the same after erasing all type information [2].

**Question**. What are the places where a type abstraction can occur?

**Exercise**. Show, by induction on typing derivations: If `t`

is a well-typed term of System F under the restriction to prenex types, then for any substitution `[X`

of the free type variables of _{1}↦T_{1},...]`t`

, there is a well-typed term `t'`

of λ→ that has the same type-erasure as `[X`

._{1}↦T_{1},...] t

## Let polymorphism

The extension made by the Hindley-Milner type system, used by ML, is called *let-polymorphism*. Previously, we defined `(let x = t`

, but now we make _{1} in t_{2}) ≡ (λx. t_{2}) t_{1}`let`

a primitive with its own typing rule.

```
t ::= ...
let x = t in t
Γ ⊢ t
```_{1} : S_{1} Γ, x:S_{1} ⊢ t_{2} : S_{2}
——————————————————————————————— (T-Let)
Γ ⊢ let x = t_{1} in t_{2} : S_{2}

The non-prenex type that would have appeared fleetingly as the type of `(λx. t`

no longer appears, and so we _{2})*do* allow `x`

to have polymorphic type.

**Exercise**. Check that the above example is typable.

(There’s one last problem. In the expression `let x = t`

, if _{1} in t_{2}`t`

has side-effects, bad things can happen (pp 335-6). So, to be on the safe side, _{1}`t`

has to be a value in order for _{1}`x`

to get a polymorphic type. This is called the *value restriction*.)

## Type reconstruction

In the simply-typed lambda calculus, we made the typing rules syntax-directed, even without type annotations, by switching to a constraint-based formulation. If we did that now, the typing rules would still not be syntax-directed, because of T-Gen and T-Inst – in the absence of explicit types, we don’t know where to use them, and in T-Inst, we don’t know what `T`

should be._{2}

Let’s forget about constraint-based typing for a moment and concentrate on the problem of T-Gen and T-Inst. The fix [1] is to realize that T-Gen can only be used just above T-Let. For example:

`Γ, X ⊢ t`_{1} : T_{1}
—————————————— (T-Gen)
Γ ⊢ t_{1} : ∀X.T_{1} Γ, x:∀X.T_{1} ⊢ t_{2} : T_{2}
————————————————————————————————————————————— (T-Let)
Γ ⊢ let x = t_{1} in t_{2} : T_{2}

And T-Inst can only be used just below T-Var. For example:

` —————————————— x:∀X.T`_{1}_{2} ∈ Γ (T-Var)
Γ ⊢ x : ∀X.T_{1}_{2}
—————————————————— (T-Inst)
Γ ⊢ x : [X↦T_{2}]T_{1}_{2}

So T-Gen can be baked into T-Let, and T-Inst can be baked into T-Var. It’s possible for T-Let to introduce multiple type variables, and for T-Var to instantiate multiple type variables, so the new rules should look like:

`Γ, X`_{1}, ... ⊢ t_{1} : T_{1} Γ, x:∀X_{1}....T_{1} ⊢ t_{2} : T_{2}
———————————————————————————————————————————————— (T-GenLet)
Γ ⊢ let x = t_{1} in t_{2} : T_{2}
—————————————————————— x:∀X_{1}....T ∈ Γ (T-VarInst)
Γ ⊢ x : [X_{1}→T_{1}, ...]T

Example: `let f = λx. x in {f 0, f false}`

.

First we derive `f:∀X.X→X ⊢ f 0 : Nat`

:

```
(T-VarInst) ―――――――――――――――――――――――― ――――――――――――――――――― (T-Zero)
f:∀X.X→X ⊢ f : Nat→Nat f:∀X.X→X ⊢ 0 : Nat
―――――――――――――――――――――――――――――――――――――――――――――― (T-App)
f:∀X.X→X ⊢ f 0 : Nat
```

And similarly for `f:∀X.X→X ⊢ f false : Bool`

.

Then the rest of the derivation is:

```
⋮ ⋮
―――――――――――― ――――――――――――――――――――― ――――――――――――――――――――――――――
X, x:X ⊢ x:X f:∀X.X→X ⊢ f 0 : Nat f:∀X.X→X ⊢ f false : Bool
―――――――――――――― ―――――――――――――――――――――――――――――――――――――――――――――――――― (T-Pair)
X ⊢ λx.x : X→X f:∀X.X→X ⊢ {f 0, f false} : {Nat, Bool}
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (T-GenLet)
⊢ let f = λx.x in {f 0, f false} : {Nat, Bool}
```

Note that above, when we used T-VarInst, we had to guess what type to substitute for `X`

: in one place it was `Nat`

and in another place it was `Bool`

. To remove the remaining guesswork, we need constraint-based typing.

Recall our constraint-based typing rules for simple types:

```
—————————————— x:T ∈ Γ (CT-Var)
Γ ⊢ x : T | {}
Γ, x:X ⊢ t
```_{2} : T_{2} | C
————————————————————— X fresh (CT-AbsInf)
Γ ⊢ λx.t_{2} : X→T_{2} | C
Γ ⊢ t_{1} : T_{1} | C_{1} Γ ⊢ t_{2} : T_{2} | C_{2}
—————————————————————————————————————— X fresh (CT-App)
Γ ⊢ t_{1} t_{2} : X | C_{1} ∪ C_{2} ∪ {T_{1} = T_{2}→X}

A non-polymorphic rule for `let`

(but we’re about to fuse it with T-Gen, so this is just for illustration):

`Γ ⊢ t`_{1} : T_{1} | C_{1} Γ, x:T_{1} ⊢ t_{2} : T_{2} | C_{2}
――――――――――――――――――――――――――――――――――――――――― (CT-Let)
Γ ⊢ let x = t_{1} in t_{2} : T_{2} | C_{1} ∪ C_{2}

To add polymorphism, we replace CT-Var with CT-VarInst. For each universally-quantified type variable (if any), we have to instantiate it to some type, but instead of guessing what to instantiate it to, we simply guess a fresh, free type variable:

```
———————————————————— x:T ∈ Γ (CT-VarInst)
Γ ⊢ x : Inst(T) | {}
```

where `Inst`

removes all quantifiers and replaces each quantified type variable with a fresh type variable.

T-GenLet has to guess the type variables `X`

that will be used for typing _{1}, ...`t`

. We have no way of knowing what those are, but in practice it does not matter. Instead (p. 333-4), we just go ahead and type _{1}`t`

, and look in its type _{1}`T`

for any new type variables – that is, type variables that are free in _{1}`T`

but _{1}*not* in `Γ`

. However, in order to know what `T`

is, we need to solve the constraints immediately instead of saving it all for the end. So CT-GenLet looks like this:_{1}

`Γ ⊢ t`_{1} : T_{1} | C_{1} Γ, x:Gen(T_{1}', Γ) ⊢ t_{2} : T_{2} | C_{2}
——————————————————————————————————————————————————— T_{1}' = unify(T_{1}, C_{1}) (CT-GenLet)
Γ ⊢ let x = t_{1} in t_{2} : T_{2} | C_{2}

where `Gen(T`

looks for all type variables that are free in _{1}', Γ)`T`

but not _{1}'`Γ`

and universally quantifies them.

At this point, I think it’s much easier to switch to pseudocode instead:

```
function typecheck(t, Γ)
if t = x
T = Γ(x)
while T is of the form ∀X.T
```_{2}
T = [X↦fresh()]T_{2}
return T, {}
else if t = λx.t_{2}
T_{2}, C = typecheck(t_{2}, Γ ∪ {x:fresh()})
return T_{1}→T_{2}, C
else if t = t_{1} t_{2}
T_{1}, C_{1} = typecheck(t_{1}, Γ)
T_{2}, C_{2} = typecheck(t_{2}, Γ)
X = fresh()
return X, C_{1} ∪ C_{2} ∪ {T_{1} = T_{2}→X}
else if t = fix t_{1}
T_{1}, C_{1} = typecheck(t_{1}, Γ)
X = fresh()
return X, C_{1} ∪ {T_{1} = X→X}
else if t = (let x = t_{1} in t_{2})
T_{1}, C_{1} = typecheck(t_{1}, Γ)
T_{1} = unify(T_{1}, C_{1})
for type variables X free in T_{1} but not in Γ
T_{1} = ∀X. T_{1}
return typecheck(t_{2}, Γ ∪ {x:T_{1}})
function typecheck(t)
T, C = typecheck(t, {})
return unify(T, C)

(This algorithm delays calling `unify`

until it’s absolutely necessary. It’s also common to instead call `unify`

as soon as each new constraint is added, that is, in the handling of the `t = t`

case.)_{1} t_{2}

## References

[1] Dominique Clement, Joelle Despeyroux, Thierry Despeyroux, and Gilles Kahn. 1986. *A simple applicative language: Mini-ML*. Retrieved from https://hal.inria.fr/inria-00076025/document

[2] Daniel Leivant. 1983. Polymorphic type inference. In *Proc. POPL*. DOI:https://doi.org/10.1145/567067.567077