Sum and product types

Product types

Read TAPL 11.6-8.

Pierce shows how to add pairs, tuples (which are an easy generalization of pairs), and records (which are basically tuples with named members).

Scheme cons cells, Python tuple, and C struct are real-world examples of product types.

The i ∈ 1..n notation that the book uses is a bit hard to read; here’s a less rigorous notation that also uses evaluation contexts.

t ::= ...
      {t, t, ...}
      t.i             (i≥1)

v ::= ...
      {v, v, ...}

T ::= ...
      {T, T, ...}

E ::= ...
      E.i
      {v, v, ..., E, t, t, ...}

{v₁, v₂, ...}.j ⟶ v_j             E-ProjTuple

  Γ ⊢ t₁ : T₁  Γ ⊢ t₂ : T₂  ...
―――――――――――――――――――――――――――――――――  T-Tuple
Γ ⊢ {t₁, t₂, ...} : {T₁, T₂, ...}

Γ ⊢ t : {T₁, T₂, ...}
―――――――――――――――――――――              T-Proj
     Γ ⊢ t.j : T_j

Sum types

Read TAPL 11.9-10.

Pierce presents two kinds of sum types: sums and variants, which are an easy generalization of sums.

C union is not quite the same as a sum type, because a C union variable doesn’t “know” which of the two unioned types it belongs to. Sum types are often called tagged unions to distinguish from C-style untagged unions.

Question: Why don’t untyped languages (Scheme, Python) typically have sum types?

In order to preserve uniqueness of types, TAPL adds as clauses in many places. For variants, these are actually not necessary if we require that every label uniquely determines a variant type. This is, in fact, the case in many languages, like ML and Haskell (p. 141). We will adopt this convention.

Again, the notation gets a bit hard to read; here’s a less rigorous version.

t ::= ...
      <l=t>
      case t of <l=x> ⇒ t, <l=x> ⇒ t, ...

T ::= ...
      <l:T, l:T, ...>

E ::= ...
      case E of <l=x> ⇒ t, <l=x> ⇒ t, ...
      <l=E>

case <l_j=v_j> of <l₁=x₁> ⇒ t₁, <l₂=x₂> ⇒ t₂, ... ⟶ [x_j↦v_j]t_j      E-CaseVariant

           Γ ⊢ t_j:T_j
―――――――――――――――――――――――――――――――                                       T-Variant
Γ ⊢ <l_j=t_j>:<l₁:T₁, l₂:T₂, ...>

Γ ⊢ t:<l₁:T₁, l₂:T₂, ...>   Γ, x₁:T₁ ⊢ t₁:T   Γ, x₂:T₂ ⊢ t₂:T   ...
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――   T-Case
        Γ ⊢ case t of <l₁=x₁> ⇒ t₁, <l₂=x₂> ⇒ t₂ : T

Assuming that we have a Unit type, we can define variants like

Color ≡ <red:Unit, green:Unit, blue:Unit>

which are analogous to C/C++/Java enum. In particular, we no longer need Bool to be built in; we can define it ourselves:

Bool ≡ <false:Unit, true:Unit>

Exercise. Show how to simulate if if Bool is defined as above.

Propositions as types (graduate only)

Now would be a good time to bring up the Curry-Howard isomorphism (pages 108-109). Forget about type systems for a moment and think about propositional logic.

We define a syntax of formulas:

φ ::= x
      true
      false
      φ → φ
      φ ∧ φ
      φ ∨ φ

And inference rules for Γ ⊢ φ (“under the assumptions in Γ, φ is provable”):

――――――――
Γ ⊢ true

――――――――
Γ, φ ⊢ φ

 Γ, φ₁ ⊢ φ₂
――――――――――――                              (→ introduction)
Γ ⊢ φ₁ → φ₂                              
                                          
Γ ⊢ φ₁ → φ₂  Γ ⊢ φ₁                      
――――――――――――――――――――                      (→ elimination, or modus ponens)
       Γ ⊢ φ₂                             
                                          
Γ ⊢ φ₁  Γ ⊢ φ₂                            
――――――――――――――                            (∧ introduction)
 Γ ⊢ φ₁ ∧ φ₂                              
                                          
Γ ⊢ φ₁ ∧ φ₂                               
―――――――――――   i ∈ {1,2}                   (∧ elimination)
  Γ ⊢ φ_i                                  
                                          
   Γ ⊢ φ_i                                 
―――――――――――   i ∈ {1,2}                   (∨ introduction)
Γ ⊢ φ₁ ∨ φ₂

Γ ⊢ φ₁ ∨ φ₂   Γ ⊢ φ₁ → φ   Γ ⊢ φ₁ → φ
―――――――――――――――――――――――――――――――――――――――   (∨ elimination)
                Γ ⊢ φ

This is a restriction of propositional logic called intuitionistic propositional logic that doesn’t have negation.

The Curry-Howard isomorphism is a correspondence between proof systems in logic and type systems in programming language theory. Notice the similarity between these rules and the typing rules we’ve built up so far.

proof systems	type systems
formulas `φ`	types `T`
assumptions `Γ`	contexts `Γ`
`Γ ⊢ φ`	`Γ ⊢ t:T` for some `t`
`→`	`→`
`∧`	`×`
`∨`	`+`
`true`	`unit`

This naturally leads to a number of questions, like:

What does false correspond to? The empty type, usually written ⊥.
What do ∀ and ∃ correspond to?
What does negation correspond to?