Some other good projects: \* Lambda calculus evaluator \* Hindley Milner typeche...

WalterBright · on April 12, 2020

> in my experience the best way to get better at compilers (I can't claim to be a wizard) is to just build a goddamn compiler.

Yup. In fact, the best way to start is to build a desk calculator, where you type in strings like:

    4 + 6

and it gives you back 10. There was one in the old K+R book. A compiler just scales that up.

sitkack · on April 13, 2020

For context, parent wrote a C++ compiler and invented the D programming language.

azhenley · on April 12, 2020

Exactly. Just start from the beginning and work through each problem as you get to it. Keep the scope small and iterate. You'll probably find yourself wanting to start from scratch a lot once you learn how to do it better, that's ok.

I'm writing up a tutorial for my students right now. It is just a stripped down dialect of BASIC. I think seeing the entire compiler process from start to finish on a tiny language is what helped me the most.

WalterBright · on April 12, 2020

I got my start from the Tiny Pascal compiler listing from BYTE Magazine in the 1970's. It was a complete compiler and interpreter.

kick · on April 12, 2020

https://archive.org/details/byte-magazine-1978-09

https://archive.org/details/byte-magazine-1978-10-rescan

https://archive.org/details/byte-magazine-1978-11-rescan

For anyone curious.

zoomablemind · on April 12, 2020

Thanks! First time reading Byte mag. Kinda almost half century late :)

The article is written with amazing (these days) clarity. Kinda coincided with my current interest in LLVM. Nice to see this article also be coming from UofI Urbana-Champaign grads.

Reportedly the article series misses implementation of one component - run-time, which is needed in order to fully replicate it now. Well, anyway, reading on!

Btw: HN thread on Tiny Pascal from 2018 https://news.ycombinator.com/item?id=17220507

WalterBright · on April 12, 2020

Thank you. I was too lazy to look it up.

kick · on April 12, 2020

Always happy to lend context to these things; BYTE was so fantastic.

j88439h84 · on April 12, 2020

How do I learn to write a type checker? Got any suggested readings?

PeCaN · on April 12, 2020

Type checkers are actually surprisingly easy to write. I recommend learning about unification (in the logic sense), maybe just by playing around with Prolog. After you understand that I think type checking is rather easy to figure out (it's literally just unification).

chubot · on April 12, 2020

I think it depends on whether you have type inference or not. If you have type inference, then it's like unification. If you have explicit type annotations, it's even simpler than that.

https://eli.thegreenplace.net/2018/unification/

https://eli.thegreenplace.net/2018/type-inference/

seanmcdirmid · on April 13, 2020

And if you want nominal sub typing, F-bounded generics, and some inference, things get really fun.

rurban · on April 13, 2020

Nominal subtyping is trivial and unsafe. The fun starts with structural subtyping.

seanmcdirmid · on April 13, 2020

Nominal sub typing is hard from trivial and calling it unsafe is non-sensical. Structural subtyping yields poor error messages and does not support encapsulation very well, especially if you also want separate compilation.

barrkel · on April 12, 2020

Imperative languages without type inference (for variables and functions, that is) work from the bottom up. Symbols (variables, functions) have declared types, constants have known types. Operators have overloads, and depending on the arguments a particular overload is chosen which best matches, and its return type is the type of the operator expression. Type conversions may need to occur along the way to make the arguments to operators or functions fit.

The hairiest thing is usually overload resolution (both operator and function, if implemented). Determining the set of applicable overloads may depend in the types of the arguments, scoping rules, or generic instantiation if it's in the language, and determining the best overload depends on weighing up coercions, subtyping and polymorphism. Simple rules may lead to rejection of overload selection as ambiguous in situations where programmers don't want to define more overloads.

krat0sprakhar · on April 12, 2020

I have a simple implementation of the Hindley Milner Type Inference algorithm in OCaml that should be easy to grok: https://github.com/prakhar1989/type-inference

ckok · on April 13, 2020

I fully agree. I wrote my elements compilers this way. Slowly from a tokenizer, parser up to .net backend, Java backend. Bit by bit. The only way to learn is by doing.

I'd even go as far as saying the literature on the subject is overrated or out of date.

And not everything is set in stone. Not all compilers have a single symbol table. Sometimes it makes sense to have a type/global table and local table, and check the current class scope on the fly. Sometimes not. It doesn't really.

Abstraction doesn't make sense until you actually have multiple implementations. And you might have rewritten it four times by then.

My advice like above. Just start writing

gergoerdi · on April 13, 2020

As an alternative to Hindley-Milner, also consider a compositional type system (https://unsafePerform.IO/projects/talks/2016-06-compty/CompT... it should be a very good match for the kind of simple HM type systems that you're going to encounter while learning about compilers.

Here's code if you prefer that to slides: https://github.com/gergoerdi/hm-compo

zanderwohl · on April 12, 2020

I second your recommendation of Crafting Interpreters. Not only is it free, but it's an excellent way to learn tokenization, parsing, and tree walking, it's a lot of fun to read.