Saturday 1 November 2014

Compiler 2 Part 7: AST

The scanner changes were pretty simple. The AST changes aren’t all that momentous either, per se; rather, it's more a matter of volume.

The additions in the language specification state that we need to create objects for: function calls, function declarations, identifiers and variable declarations. What might not be clear are that we also need objects to handle: assignment, if expressions, expression lists and scoping.

Assignment



If you recall from the first series, I used a nested object Expression in BinaryExpr. I explained that it would be of more use later.

It now gets used rather prolifically throughout the new objects and AssignExpr is no exception.

We track the position of the equals sign and the variable to which we’re assigning a value to. The last field contains the sub-tree of the value we’re assigning to the variable.

The value can be anything with a side effect. Function declarations (do not confuse with function CALLS) are an exception to this rule and can’t be assigned to a variable.

Function Calls


Calling a function is fairly simple. There’s an identifier which signifies the name of the function being called and a list of arguments for the function. That about sums it up!

Function Declarations


There are quite a few fields in DeclExpr. We need to keep track of the position of the ‘decl’ keyword. Then there’s the function’s ‘Name’ and it’s return type. The parameters are a list of variables and their corresponding types. The ‘Body’ of the function is an expression of some kind.

Finally, there’s a field to hold the Scope of the function. If you recall, each function has it’s own scope and this is where it’s stored!

List of Expressions


It’s just a list of expressions. Nothing to see here, move along please.

Identifiers


The identifier can be many things. It may be a type name, like int. It may be a variable name or a function name too.

It is also associated with an Object. The object is stored in the active scope. The object holds the current value and type of a variable or function. When an identifier is something like a type name, then Object is nil.

More details on the Object shortly.

Conditionals


Like the others, we first record where we found the ‘if’ keyword.

The Cond field holds an expression which evaluates to zero, for false, or any other number for true. Ideally, I would prefer the expression always evaluate to a boolean value but since we don’t have a boolean type, an integer will have to do!

The if expression is typed because it's functional.

The Then and Else clauses can by any kind of expression at this point. The type of, or returned by, the expression is not evaluated at this point. This will be verified further down the line when code generation is being handled.

There’s a Scope field for if expressions because they have scope, too.

Objects


The object consists of a Name, to which it is associated, and it’s position. We also want to know what kind of object it is. In this version of Calc it can only be either a variable or function declaration.

We also have a field used to track a variable’s offset when compiling but this will be addressed in a later post.

The Type of an object is important but may not be known during parsing. If the type is going to be inferred, then the type will initially be nil.

The Value is an expression representing the actual value of the variable. It could be anything, including nil.

Package


A package consists of a package level scope and a list of File objects.

Scope


The scope object points to a parent scope, or nil, and contains a table of identifier names mapped to an Object.

Unary Expression


Should be self explanatory.

Variables


We keep track of the position of the ‘var’ keyword. We keep track of the identifier used for the name and an Object representing the variable to save a call to Lookup for it later.

More Scope


Four functions have been introduced to work with scoping but only three are worthy of note. NewScope, creates a new scope. Insert, inserts a new object into the scope and returns either nil or the previous object assigned to the same value. Lookup returns nil or the object associated with ident.

Walk and Print


Walk was adjusted to deal with the new types.

Print is in sorry shape and probably should be removed. It could have been improved to display an AST in it’s entirety. I didn't care to update it for sake of time.