// Lessons from writing a compiler / codeisgo.com

The standard academic literature is most useful for the extreme frontend (parsing) and the extreme backend (SSA, instruction selection and code generation), but the middle-end is ignored. This is fine if you want to learn how to build, e.g., the next LLVM: a fat backend with a very thin frontend. By Fernando Borretti.

Imagine a compiler as a 2D grid. The rows are the stages of compilation: from parsing, through the middleend, to the backend. The columns are the different facets of the programming language being implemented. For a smaller language like C, the different facets are: global variable declarations, enum declarations, type declarations, function declarations, and nothing else. More featureful languages add extra facets: interface and class definitions in C++, trait and impl definition in Rust, etc.

This article contains some of the lessons I learned writing the compiler for Austral, a new systems programming language with linear types that I’ve been working on for a while. The first few sections are high-level, the rest most specific to using OCaml to write a compiler.

The article then dives into:

Implementation strategies
Just use OCaml
Just use Menhir
What tests are useful
Compilation model
Code organization
Bootstrapping issues
The environment
Identifying declarations
Errors
Remarks

There are, broadly, two ways to implement a compiler. In the waterfall strategy, each stage is implemented one after the other: the complete parser, then the complete semantic analysis pass, then the complete type checking pass, and on to the backend stages. Excellent read!

[Read More]

Tags miscellaneous programming oop app-development performance data-science