Thursday, 5 November 2015

Compiler 3 Part 2 - C Runtime and Code Generation

Before diving into the real meat and potatoes of the changes that led to the runtime being (temporarily) removed, I thought I should discuss the motivations.
The runtime irked me greatly because I felt that it leveraged neither the power of C nor that of C compilers. It needed many, many improvements, and as I began working on them I came to a realization.

The runtime, at least in its current form, is not very useful.

Unlike in a language like Go, where the runtime serves a very real purpose (managing goroutines, the garbage collector, etc.), the Calc C runtime merely tried to mimic what it’s like to program in assembly.

Of course, there are reasons to re-add a runtime further down the road. Runtime checks, like bounds checking, are one such reason. Stack tracing is another wonderful motivation to implement a runtime. Built-in functions are also common and useful.

I still maintain this was a worthy exercise; it just wasn't viable in the long term. The more I looked at the runtime, the more I knew I could do better.

The largest hurdle was removing the stack. How was I to handle the call stack? As it turns out, it’s not too hard to use C-style function calls, but doing so required some changes that the series 2 code base just wasn't equipped to handle elegantly.

I also needed to push and pop data on and off the stack. So how could I rid myself of it?
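To make the contrast concrete, here is a minimal sketch of the two styles. It is not the actual Calc runtime: the names `push`, `pop`, `rt_add` and the fixed-size `stack` array are hypothetical, chosen only to illustrate what an assembly-like runtime might look like next to a plain C function call.

```c
#include <assert.h>

/* Hypothetical assembly-style runtime: every operand travels through
 * an explicit, hand-managed stack. None of these names come from Calc. */
enum { STACK_MAX = 64 };
static int stack[STACK_MAX];
static int sp = 0; /* index of the next free slot */

static void push(int v) { assert(sp < STACK_MAX); stack[sp++] = v; }
static int pop(void) { assert(sp > 0); return stack[--sp]; }

/* The "callee" takes no C parameters: it pops its operands and
 * pushes its result back onto the runtime stack. */
static void rt_add(void) {
    int b = pop();
    int a = pop();
    push(a + b);
}

/* What generated code has to do in the stack-based style. */
int add_with_stack(int a, int b) {
    push(a);
    push(b);
    rt_add();
    return pop();
}

/* C-style alternative: the C compiler manages the call stack,
 * argument passing and the return value. */
static int add(int a, int b) { return a + b; }

int add_with_call(int a, int b) {
    return add(a, b);
}
```

Both `add_with_stack(2, 3)` and `add_with_call(2, 3)` produce 5, but only the second leaves the C compiler free to place arguments in registers, inline the call and so on.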

I have been researching how to use C as an intermediate representation for many months. Generating C function calls and declarations proved difficult with the Calc2 compiler, and that was one of the motivations for using an assembly-like approach.

Ultimately, it became more and more clear that I was missing a step that would make my life a lot easier.

That secret, if you will, was an intermediate representation.

After much research I ran into some potential solutions: three-address code and static single assignment. I have been aware of these representations and others like them (GIMPLE and RTL) for a while.

Using them seemed complicated and I didn't think they were worth the effort of learning. I wasn't writing an optimizing compiler, so what advantages did they provide that I would need?
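For anyone who hasn't met three-address code before, the idea can be sketched in plain C. This is only an illustration, not what the Calc compiler emits: the function names and the temporaries `t1`, `t2` and `t3` are made up for the example. Each statement applies at most one operator, and because every temporary is assigned exactly once the result also resembles static single assignment form.

```c
/* Source-level expression with nested sub-expressions. */
int eval_nested(int a, int b, int c, int d) {
    return (a + b) * (c - d);
}

/* The same computation flattened into three-address style:
 * one operator per statement, results held in named temporaries.
 * Each temporary is assigned exactly once (hence const). */
int eval_three_address(int a, int b, int c, int d) {
    const int t1 = a + b;
    const int t2 = c - d;
    const int t3 = t1 * t2;
    return t3;
}
```

The flattened form is longer, but every step is trivial to generate from a syntax tree, which is a large part of the appeal even without an optimizer.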

While I don't use any of these representations directly, I do take advantage of what they provide. More on that in the next post.