Haskell excels at DSLs and the sort of data manipulation needed in compilers. OCaml, Lisp, and really any language with support for ADTs and such things do the trick as well. You can even try hard with modern C++ and variant types and such, but it won't be as pretty.
Of course, if you actually want to run games on the emulator, C or C++ is where the game is. I suppose Rust would work too, but I can't speak much for its low-level memory manipulation.
Haskell and OCaml are excellent for compilers, because - as you suggest - you end up building, walking, and transforming tree data structures where sum types are really useful. Lisp is an odd suggestion there, as it doesn’t really have any built-in support for this sort of thing.
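To make the tree-walking point concrete, here is a minimal sketch of a sum type plus one transformation pass. The `Expr` language and `simplify` are made up for illustration and aren't taken from any real compiler.

```haskell
-- Hypothetical toy expression language, for illustration only.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- One bottom-up pass: constant folding plus a few algebraic identities.
simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Lit x, Lit y) -> Lit (x + y)
    (Lit 0, e)     -> e
    (e, Lit 0)     -> e
    (a', b')       -> Add a' b'
simplify (Mul a b) =
  case (simplify a, simplify b) of
    (Lit x, Lit y) -> Lit (x * y)
    (Lit 0, _)     -> Lit 0
    (_, Lit 0)     -> Lit 0
    (Lit 1, e)     -> e
    (e, Lit 1)     -> e
    (a', b')       -> Mul a' b'
simplify e = e
```

In GHCi, `simplify (Add (Mul (Lit 2) (Lit 3)) (Mul (Var "x") (Lit 1)))` should reduce to `Add (Lit 6) (Var "x")`. Each constructor gets its own clause, and with `-Wall` the compiler warns about any case left unhandled.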
At any rate, that’s not really the case when building an emulator or bytecode interpreter. And Haskell ends up being mostly a liability here, because most work is just going to be imperatively modifying your virtual machine’s state.
> And Haskell ends up being mostly a liability here, because most work is just going to be imperatively modifying your virtual machine’s state.
That sounds odd to me. Haskell is great for managing state, since it makes it possible to do so in a much more controlled manner than non-pure languages.
Yeah, I don't understand what the "liability" here is. I never claimed it was going to be optimal, and I already pointed out C/C++ as the only reasonable choice if you actually want to run games on the thing and get as much performance as possible. But manipulating the machine state in Haskell is otherwise perfect. Code will look like equations, everything becomes trivially testable and REPLable, and you'd even get a free time machine from the immutability of the data, which makes debugging easy.
If you're effectively always in a stateful monad, Haskell's purity offers nothing. Code doesn't look like equations, things aren't trivially testable and REPLable, you don't get a free time machine, and there's syntactic overhead from things like lifting or writes to deeply nested structures and arrays, since the language doesn't have built-in syntactic support for them.
On the other hand, it does have support for things like side-effectful traversals, folds, side effects conditional on value existing, etc. In most other languages you have to write lower-level code to accomplish the same thing.
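A rough sketch of what those combinators look like in practice, using only `base`. The "machine" here is a made-up single accumulator held in an `IORef`, and all function names are hypothetical.

```haskell
import Control.Monad (foldM, when)
import Data.Foldable (for_, traverse_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Side-effectful traversal: add every value to the accumulator.
addAll :: IORef Int -> [Int] -> IO ()
addAll acc = traverse_ (\x -> modifyIORef' acc (+ x))

-- Effect conditional on a value existing: for_ over a Maybe runs
-- the action zero times (Nothing) or once (Just).
addPending :: IORef Int -> Maybe Int -> IO ()
addPending acc pending = for_ pending (\x -> modifyIORef' acc (+ x))

-- Side-effectful fold: sum the values while counting the negative ones.
sumCountingNegatives :: IORef Int -> [Int] -> IO Int
sumCountingNegatives negs = foldM step 0
  where
    step total x = do
      when (x < 0) (modifyIORef' negs (+ 1))
      pure (total + x)

-- Small driver returning (accumulator, negative count).
demo :: IO (Int, Int)
demo = do
  acc  <- newIORef 0
  negs <- newIORef 0
  addAll acc [1, 2, 3]
  addPending acc Nothing    -- does nothing
  addPending acc (Just 10)
  _ <- sumCountingNegatives negs [1, -2, 3, -4]
  (,) <$> readIORef acc <*> readIORef negs
```

The same `traverse_`, `for_`, and `foldM` work unchanged in `State`, `ST`, or any other monad, which is the part that takes hand-written loops and null checks in most other languages.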
Even if you use a stateful monad (not necessarily the State monad), you can take snapshots of the state of the machine and literally produce a log. You haven't lost immutability or the time machine, and you can 'deriving Show' the hell out of everything and get human-readable output for free. Fuck, you could even lift functions in such a way that they produce a trace of assertions that each function of (state -> state) must satisfy. A state-debugger-log monad.
Not that you'd need a monad for something like this anyway.
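A minimal sketch of the snapshot idea without any monad at all, assuming a made-up four-instruction machine (`Instr`, `Machine`, `step`, and `snapshots` are all hypothetical names). Since `step` is a pure function from machine to machine, the full history is just the list of values it produces.

```haskell
-- Hypothetical toy machine, not a real instruction set.
data Instr = Inc | Dec | Jmp Int | Halt
  deriving (Eq, Show)

data Machine = Machine
  { pc     :: Int
  , acc    :: Int
  , halted :: Bool
  } deriving (Eq, Show)

-- One pure step: fetch the instruction at pc and return a new machine.
-- Running off the end of the program halts the machine.
step :: [Instr] -> Machine -> Machine
step prog m
  | halted m = m
  | otherwise =
      case drop (pc m) prog of
        []          -> m { halted = True }
        (Inc : _)   -> m { pc = pc m + 1, acc = acc m + 1 }
        (Dec : _)   -> m { pc = pc m + 1, acc = acc m - 1 }
        (Jmp t : _) -> m { pc = t }
        (Halt : _)  -> m { halted = True }

-- Every intermediate state, up to and including the first halted one.
-- The list is lazy, so a non-terminating program just yields an infinite log.
snapshots :: [Instr] -> Machine -> [Machine]
snapshots prog m0 = running ++ take 1 rest
  where
    (running, rest) = break halted (iterate (step prog) m0)
```

`mapM_ print (snapshots [Inc, Inc, Halt] (Machine 0 0 False))` prints one readable line per state thanks to `deriving Show`, and stepping "back in time" is indexing into the list.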
Also when people say Lisp in 2025, usually we can assume Common Lisp, which is far beyond the Lisp 1.5 reference manual in capabilities.
In fact, back when I was at university, when Caml Light was still recent and Miranda was still part of programming language lectures, the languages forbidden on compiler development assignments were Lisp and Prolog, as they would have made the assignment super easy.
I’d also point out that even in the compiler space, there are basically no production compilers written in Haskell or OCaml.
I believe those two languages themselves self-host. So not saying it’s impossible. And I have no clue about the technical merits.
But if you look around programming forums, there’s this idea that “OCaml is one of the leading languages for compiler writers”, which seems to be a completely made-up statistic.
I don't know that many production compilers are in them, but how much of that is compilers tending towards self-hosting once they get far enough along these days? My understanding is early Rust compilers were written in OCaml, but they transitioned to Rust to self-host.
What do you define as a production compiler?
Two related languages have compilers built in Haskell: PureScript and Elm.
Also, Haskell has parsers for all major languages. You can find them on Hackage with the `language-` prefix: language-python, language-rust, language-javascript, etc.
Obviously C is the ultimate compiler of compilers.
But I would call Rust, Haxe and Hack production compilers. (As mentioned by sibling, Rust has bootstrapped itself since its early days. But that doesn't diminish that OCaml was the choice before bootstrapping.)
Most C and C++ developers take umbrage at combining them. Since C++11, and especially C++17, the languages have diverged significantly. C is still largely compatible (outside of things like uncast malloc), since most of its rules are still valid in C++, but each has gained fairly substantial incompatibilities with the other. A pure C++ application written today will look nothing like a modern C app.
RAII, iterators, templates, object encapsulation, smart pointers, data ownership, etc. are entrenched in C++, while C is still raw pointers, no generics (no, _Generic doesn’t count), procedural, void* casting, manual malloc/free, etc.
I code in both, and enjoy each (generally for different use cases), but certainly they are significantly differing experiences.
Sure, and we also still have people coding in K&R-style C. Some people are hard to change in their ways, but that doesn't mean the community/ecosystem hasn't moved on.
> Another one is C++ "libraries" that are plain C with extern "C" blocks.
Sure, and you also see "C Libraries" that are the exact same. I don't usually judge the communities on their exceptions or extremists.
What are you on? Rust was written in OCaml, and Haxe is still going strong after 25 years with an OCaml-based compiler, and is very much production grade.
I wrote a GBC emulator in Haskell (https://github.com/CLowcay/hgbc/tree/master). It's nice for modelling the instruction set, decoding and dispatching instructions. Optimization is tough though. To achieve playable performance, everything has to go into the IO monad. Haskell is famous for lazy evaluation. I found that occasionally useful, but mostly a source of performance problems.
Ultimately, the hard thing in emulation was not decoding instructions. It was synchronization, timing, and faithfully reproducing all the hardware glitches (because many games will not work without certain hardware bugs). Haskell doesn't help much for those things. If I were doing another emulation project I'd choose Rust.
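For a flavour of the "modelling and decoding" part, here is a minimal sketch. It is not code from hgbc: the `Instr` type and function names are hypothetical, and only four opcodes of the Game Boy CPU are covered (`NOP`, `HALT`, `LD A, d8`, `JP a16`).

```haskell
import Data.Bits (shiftL, (.|.))
import Data.List (unfoldr)
import Data.Word (Word16, Word8)

-- A tiny slice of the instruction set as a sum type.
data Instr
  = Nop
  | Halt
  | LdAImm Word8   -- LD A, d8
  | Jp Word16      -- JP a16 (operand stored little-endian)
  deriving (Eq, Show)

-- Decode one instruction from the front of a byte stream and return the rest.
-- Unknown opcodes and truncated operands give Nothing.
decode :: [Word8] -> Maybe (Instr, [Word8])
decode (0x00 : rest)           = Just (Nop, rest)
decode (0x76 : rest)           = Just (Halt, rest)
decode (0x3E : n : rest)       = Just (LdAImm n, rest)
decode (0xC3 : lo : hi : rest) =
  Just (Jp (fromIntegral hi `shiftL` 8 .|. fromIntegral lo), rest)
decode _                       = Nothing

-- Decode until the stream ends or an undecodable byte is hit.
disassemble :: [Word8] -> [Instr]
disassemble = unfoldr decode
```

A list of bytes is convenient for a sketch but far too slow for a real emulator, which is where the "everything has to go into the IO monad" part comes in: unboxed mutable arrays and strict fields instead of lazy lists.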