Typechecking Assignment

Objectives

  • To implement binding of names to symbols in a nested language.
  • To reason carefully about type-safety in a strictly typed language.
  • To implement brilliantly helpful error messages in a typed language.
  • To gain experience in programming via recursive decomposition.
  • To gain further experience in incremental software engineering.

    Overview

    The third step in building a compiler is to implement name resolution and typechecking. Both stages involve traversing the abstract syntax tree recursively. Name resolution matches each reference to a name (x) with the variable definition to which it refers (x:integer=3;). Once that is accomplished, typechecking computes the type of each expression and ensures that it is compatible with its destination.

    Requirements

    Your program must be written in plain C (not C++) using GCC (not G++) and use bison to generate the parser and flex to generate the scanner. You must have a Makefile such that when you type make, all the pieces are compiled and result in a binary program called cminor. make clean should also delete all temporary files, so that the program can be made again from scratch. Your code must work on the ND Linux student machines. You must use the CMinor Starter Code as the basis for your work.

    If your program is invoked as follows:

    cminor -print sourcefile.cminor
    
    ...then it should construct and print out the AST, as in the previous assignment.

    If your program is invoked as follows:

    cminor -resolve sourcefile.cminor
    

    ...then you will construct the AST and resolve variable names to symbols using decl_resolve, stmt_resolve, etc., as discussed in class. As each variable declaration is encountered, enter it into the symbol table, bound to the given name. As each name reference is encountered, match it to the corresponding symbol.

    For each name resolved, you should print a message like:

    x resolves to local 3
    y resolves to global y
    z resolves to param 5
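
    One way to produce these messages is a small formatting helper, sketched below. The struct symbol layout, the SYMBOL_* constants, and the function name resolve_message are assumptions made for illustration; the starter code's symbol.h may differ, so adapt the field names to match it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed symbol layout for this sketch; check symbol.h in the starter code. */
typedef enum { SYMBOL_LOCAL, SYMBOL_PARAM, SYMBOL_GLOBAL } symbol_t;

struct symbol {
    symbol_t kind;
    const char *name;
    int which;      /* ordinal position of a local or param; unused for globals */
};

/* Hypothetical helper: writes the resolution message for s into buf.
   Returns the value of snprintf, or -1 for an unknown symbol kind. */
int resolve_message(const struct symbol *s, char *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    switch (s->kind) {
    case SYMBOL_GLOBAL:
        /* Globals are identified by name rather than by position. */
        return snprintf(buf, len, "%s resolves to global %s", s->name, s->name);
    case SYMBOL_PARAM:
        return snprintf(buf, len, "%s resolves to param %d", s->name, s->which);
    case SYMBOL_LOCAL:
        return snprintf(buf, len, "%s resolves to local %d", s->name, s->which);
    }
    return -1;
}
```

    A caller would typically invoke this right after a successful scope_lookup and print the buffer with puts or printf.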
    

    For each name that is used without a corresponding declaration, emit a message like this:

    resolve error: foo is not defined
    resolve error: x is not defined
    

    If your program is invoked like this:

    ./cminor -typecheck sourcefile.cminor
    
    ...then you will construct the AST, resolve variable names, and perform typechecking, using decl_typecheck, stmt_typecheck, etc., as discussed in class. In every place where type equivalence is required, you must check that the types involved are compatible. If an operation is not type-safe (for example, adding a string to an integer), then you should emit a very detailed error message that indicates the relevant expression(s), type(s), and what was expected in that context. For example:
    type error: cannot add a string ("abc") to an integer (3+5)
    type error: cannot return a boolean (x<5) in a function (fibonacci) that returns integer
    type error: declaration of array (a) must have a fixed size.
    
    After encountering the first type or resolution error, keep going so that you can display all possible type errors at once, then exit with status 1 to indicate failure. If no errors were encountered, exit with status zero.
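
    The keep-going behavior can be handled with a shared error counter, as in the rough sketch below. The names report_error, error_count, and exit_status are hypothetical and not part of the starter code, and printing to standard output is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global counter shared by the resolve and typecheck phases. */
static int error_count = 0;

/* Print "<phase> error: <message>" and remember that an error occurred.
   The caller keeps traversing the AST after this returns. */
void report_error(const char *phase, const char *fmt, ...)
{
    va_list args;
    assert(phase != NULL && fmt != NULL);
    printf("%s error: ", phase);
    va_start(args, fmt);
    vprintf(fmt, args);
    va_end(args);
    printf("\n");
    error_count++;
}

/* Status for main to return: 1 if anything was reported, 0 otherwise. */
int exit_status(void)
{
    return error_count > 0 ? 1 : 0;
}
```

    With this arrangement, main would run the resolve and typecheck passes to completion and then return exit_status().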

    There are many different places where typechecking must be performed. Section 7.3 in the textbook summarizes the expected behavior of operators, but you will have to examine the grammar carefully to find all places where checks must be made. Think carefully about function calls and definitions, array definitions, etc. C-Minor does not perform any kind of automatic type conversion, so any kind of incompatibility is an error.

    You must test your program extensively by designing and testing a large number of test cases. Try these example tests as a starting point. However, the example tests are not comprehensive, so you must also design and submit your own test files. Ten should be named good[1-10].cminor and should be valid C-Minor programs. Ten should be named bad[1-10].cminor and should each contain at least one resolution error or typechecking error.

    Hints

    For name resolution, you will need to build a scope module which keeps track of the binding between names and symbols in each level of nesting, with an interface like this:
    void            scope_enter();
    void            scope_exit();
    void            scope_bind( const char *name, struct symbol *s );
    struct symbol * scope_lookup( const char *name );
    
    This module will keep track of a linked list of hash tables, used as a stack, with each table representing one level of scope nesting in the program. scope_enter will push a new (empty) hash table onto the stack, while scope_exit will pop one from the stack. scope_bind will insert into the current scope an entry binding a name to a symbol object. scope_lookup will search the stack of hash tables from innermost to outermost, looking for the closest instance of a matching definition.
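
    The stack discipline described above can be sketched as follows. To stay short and self-contained, this sketch swaps each hash table for a plain linked list of bindings, and its struct symbol is a placeholder; in your submission you would use a real hash table and the starter code's symbol definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder symbol for this sketch; the real struct has more fields. */
struct symbol { const char *name; int which; };

/* One name-to-symbol binding. A linked list stands in for a hash table. */
struct binding {
    const char *name;
    struct symbol *sym;
    struct binding *next;
};

/* One nesting level; next points at the enclosing scope. */
struct scope {
    struct binding *bindings;
    struct scope *next;
};

static struct scope *scope_top = NULL;

/* Push a new, empty scope. */
void scope_enter(void)
{
    struct scope *s = calloc(1, sizeof(*s));
    if (!s) abort();
    s->next = scope_top;
    scope_top = s;
}

/* Pop the innermost scope and free its bindings (not the symbols). */
void scope_exit(void)
{
    assert(scope_top != NULL);
    struct scope *s = scope_top;
    scope_top = s->next;
    while (s->bindings) {
        struct binding *b = s->bindings;
        s->bindings = b->next;
        free(b);
    }
    free(s);
}

/* Bind name to sym in the innermost scope. */
void scope_bind(const char *name, struct symbol *sym)
{
    assert(scope_top != NULL);
    struct binding *b = malloc(sizeof(*b));
    if (!b) abort();
    b->name = name;
    b->sym = sym;
    b->next = scope_top->bindings;
    scope_top->bindings = b;
}

/* Search from the innermost scope outward; NULL means "not defined". */
struct symbol *scope_lookup(const char *name)
{
    for (struct scope *s = scope_top; s; s = s->next)
        for (struct binding *b = s->bindings; b; b = b->next)
            if (strcmp(b->name, name) == 0)
                return b->sym;
    return NULL;
}
```

    Because lookup walks outward from the top of the stack, an inner declaration of x shadows an outer one until scope_exit is called.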

    For typechecking, begin by building some helper functions related to types:

    struct type * type_copy( struct type *t );
    int type_compare( struct type *a, struct type *b );
    void type_delete( struct type *t );
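
    As a starting point, these helpers might look like the sketch below. The struct type layout here is simplified and assumed (it omits function parameter lists, which a full type_compare must also walk), type_create is a hypothetical constructor, and the convention that type_compare returns 1 for matching types and 0 otherwise is an assumption, since the interface above does not specify it.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, assumed layout; check type.h in the starter code. */
typedef enum {
    TYPE_VOID, TYPE_BOOLEAN, TYPE_CHARACTER, TYPE_INTEGER,
    TYPE_STRING, TYPE_ARRAY, TYPE_FUNCTION
} type_t;

struct type {
    type_t kind;
    struct type *subtype;   /* element or return type; NULL for atomic types */
};

/* Hypothetical constructor used by the helpers below. */
struct type *type_create(type_t kind, struct type *subtype)
{
    struct type *t = malloc(sizeof(*t));
    if (!t) abort();
    t->kind = kind;
    t->subtype = subtype;
    return t;
}

/* Deep copy, so the caller may delete the result independently. */
struct type *type_copy(struct type *t)
{
    if (!t) return NULL;
    return type_create(t->kind, type_copy(t->subtype));
}

/* Assumed convention: 1 if structurally identical, 0 otherwise. */
int type_compare(struct type *a, struct type *b)
{
    if (!a || !b) return a == b;        /* equal only if both are NULL */
    if (a->kind != b->kind) return 0;
    return type_compare(a->subtype, b->subtype);
}

/* Recursively free a type and its subtypes. */
void type_delete(struct type *t)
{
    if (!t) return;
    type_delete(t->subtype);
    free(t);
}
```

    Comparing recursively means that an array of integer and an array of string are reported as different, even though both have kind TYPE_ARRAY.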
    
    Then, implement typechecking operations on each of the key structures:
    struct type * expr_typecheck( struct expr *e );
    void stmt_typecheck( struct stmt *s );
    void decl_typecheck( struct decl *d );
    
    expr_typecheck should compute the type of an expression recursively, and return a new type object to represent it. (It should also check for errors within the expression.) Then, write stmt_typecheck and decl_typecheck to use the result of expr_typecheck and compare it against expectations.
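
    The recursive shape of expr_typecheck can be illustrated with a heavily reduced sketch. The struct expr and struct type definitions below are minimal stand-ins covering only two literals and two operators, and typecheck_errors is a hypothetical counter; your real implementation must handle every expression kind in the grammar and print the offending expressions as well as their types.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal stand-ins for this sketch; the starter code's structs are richer. */
typedef enum { TYPE_BOOLEAN, TYPE_INTEGER, TYPE_STRING } type_t;
struct type { type_t kind; };

typedef enum { EXPR_INTEGER_LITERAL, EXPR_STRING_LITERAL, EXPR_ADD, EXPR_LT } expr_t;
struct expr { expr_t kind; struct expr *left; struct expr *right; };

int typecheck_errors = 0;   /* hypothetical error counter */

static struct type *type_create(type_t kind)
{
    struct type *t = malloc(sizeof(*t));
    if (!t) abort();
    t->kind = kind;
    return t;
}

static const char *type_name(const struct type *t)
{
    switch (t->kind) {
    case TYPE_BOOLEAN: return "boolean";
    case TYPE_INTEGER: return "integer";
    case TYPE_STRING:  return "string";
    }
    return "unknown";
}

/* Returns a newly allocated type that the caller must free. */
struct type *expr_typecheck(struct expr *e)
{
    if (!e) return NULL;

    /* Check the children first, then decide this node's type. */
    struct type *lt = expr_typecheck(e->left);
    struct type *rt = expr_typecheck(e->right);
    struct type *result = NULL;

    switch (e->kind) {
    case EXPR_INTEGER_LITERAL:
        result = type_create(TYPE_INTEGER);
        break;
    case EXPR_STRING_LITERAL:
        result = type_create(TYPE_STRING);
        break;
    case EXPR_ADD:
    case EXPR_LT:
        assert(lt && rt);   /* binary nodes are assumed to have both children */
        if (lt->kind != TYPE_INTEGER || rt->kind != TYPE_INTEGER) {
            printf("type error: cannot %s %s and %s\n",
                   e->kind == EXPR_ADD ? "add" : "compare",
                   type_name(lt), type_name(rt));
            typecheck_errors++;
        }
        /* Return the operator's normal result type even after an error,
           so that checking can continue without cascading messages. */
        result = type_create(e->kind == EXPR_ADD ? TYPE_INTEGER : TYPE_BOOLEAN);
        break;
    }

    free(lt);
    free(rt);
    return result;
}
```

    stmt_typecheck and decl_typecheck would call this function, compare the returned type against what the context requires, and then free it.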

    Grading

    For this assignment, your grade will be based upon the following:
  • Continued correctness of the -print option. (10 percent)
  • General correctness of the -resolve option. (20 percent)
  • General correctness of the -typecheck option. (20 percent)
  • Correctness of your test cases. (20 percent)
  • Correctness on our test cases. (20 percent)
  • Good programming style. (10 percent)

    To turn in the assignment, copy your source files, Makefile, and testing files into your dropbox directory, which is:

    /afs/nd.edu/coursefa.17/cse/cse40243.01/dropbox/YOURNAME/typecheck
    
    This assignment is due Monday, November 13th at 5PM. Late assignments are not accepted.

    Frequently Asked Questions

  • Q: Can you clarify how each operator can be used?
    A: See section 7.3 in the textbook.

  • Q: Does C-Minor allow arrays of functions, functions that return functions, variables of type function, and things of that sort?
    A: No, those should be flagged as type errors, since we won't be implementing them in the code generation.

  • Q: What sort of expression can be used to initialize the length of an array?
    A: When an array is declared as a global or local variable, the length must be given as a constant integer. Any more complex expression should result in a type error. When an array is declared as a function parameter, it should have no length given.

  • Q: What type should be assumed for a variable or function that cannot be resolved?
    A: There is no good assumption that you can make. To avoid this problem, you may stop after the name resolution phase, if any name resolution errors are discovered.