> Anyone waiting for 10x improvements is likely to be disappointed. I don’t see that happening, except for the occasional unusual program (such as the chess engine case mentioned above).
> The compiler is a large, mature project. It’s well over 10 years old, and hundreds of thousands of lines. Projects like that tend to not get quantum leaps in performance. Many small improvements is more typical.
And
> And it’s hard to radically change the data structures used in a mature compiler.
As the author describes, the compiler is likely fairly well optimized for its existing way of doing things. 10x improvements would come from architectural changes, and it sounds like the compiler is so big now that this is difficult. The other piece that makes it difficult is that LLVM is a separate, also very large, project.
An example of an architectural change I think hasn’t been investigated enough is in the space of PGO. While compilers use PGO to choose better heuristics, they don’t use it to control the optimization levels applied to different parts of the code. Most programs have performance-critical compute paths, but the rest of the program is nowhere near as hot. I feel like dropping expensive optimization passes for “cold” code would provide significant savings, and Rust already lets you annotate cold/hot functions. It would be nice if this were combined with call-flow analysis to build something like adaptive bitrate (ABR) streaming, but for optimization.
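Rust's existing annotations only gesture at this. A minimal sketch of what stable Rust offers today: `#[cold]` influences inlining and code placement, but it does not reduce the optimization effort spent on a function, which is the per-function knob wished for above.

```rust
// `#[cold]` marks a function as unlikely to run; the backend deprioritizes
// it for inlining and moves it out of the hot code path. It does NOT
// lower the optimization level applied to its body.
#[cold]
#[inline(never)]
fn cold_fallback(msg: &str) -> u64 {
    eprintln!("slow path: {}", msg);
    0
}

// `#[inline]` hints that this function is worth inlining into callers.
#[inline]
fn hot_path(x: u64) -> u64 {
    // stand-in for performance-critical arithmetic
    x.wrapping_mul(3).wrapping_add(1)
}

fn main() {
    let v = hot_path(42);
    if v == 0 {
        println!("{}", cold_fallback("unexpected zero"));
    } else {
        println!("{}", v); // prints 127
    }
}
```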
Anyway, such things are difficult research projects to attempt on existing large codebases. I believe the author that further meaningful performance wins are unlikely without deeper architectural changes. This is very similar to comments that Lemire makes in [1] and is well understood to be the same (often misunderstood) critique as Knuth made many decades ago about premature optimization. Architecture trumps code tuning.
Where can I read up on this? AFAIK PGO is used in lieu of various heuristics within a pass, but is not used to remove optimization passes on “dead code”.
Good question, I unfortunately don't have a link to give you, or fine details, but GCC does hot/cold partitioning and applies different optimization level to hot and cold code blocks. IIRC, it uses the level given on the command line for hot code, and -Os for cold code.
There are many proven ways to speed up a compiler (for example: Pascal), but the nasty truth is that most of them require changing the language.
That is why it would be impossible to fix Python (you can't assert that anything will keep its type), or in this case Rust (several features are "made" in a way that makes the compiler pipeline sad!).
You need a certain discipline in developing the language/syntax/semantics so the compiler can exploit them for speed.
---
However, I suspect it must be possible to speed up Rust and other languages massively, because an RDBMS can chew through gigabytes in seconds. I now wonder if a "compiler database engine" could help, so you could store the lexemes, sources, inferences, etc. and query them.
Also, a LOT of a compiler's work is just repeating the same steps over and over, and there is some research on saving pre-computed snippets to avoid that.
This doesn't feel like the same thing. The performance benefits of an RDBMS come from the way its storage is organized and (at least in typical RDBMS scenarios) the way queries are dynamically optimized to traverse that storage, based on the data it contains.
The thing rustc calls "queries" is more about incremental compilation: reusing work within and across compiler invocations, but without the sort of impact on memory layout and access patterns that you get from an RDBMS.
The closest thing I've seen in a compiler is rather Zig's AST memory layout work: https://github.com/ziglang/zig/pull/7920. But this is a very "hand-crafted" version of what an RDBMS does.
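To make the contrast concrete: rustc's query system is, at heart, memoization. A grossly simplified sketch of the pattern, with illustrative names (not rustc's real API), and without the dependency tracking and input hashing that make real incremental compilation sound:

```rust
use std::collections::HashMap;

// Hypothetical query context: each "query" computes a result once and
// caches it; later calls reuse the cached value.
struct Queries {
    type_of_cache: HashMap<u32, String>,
}

impl Queries {
    fn type_of(&mut self, def_id: u32) -> String {
        if let Some(t) = self.type_of_cache.get(&def_id) {
            return t.clone(); // cache hit: no recomputation
        }
        let computed = format!("ty#{}", def_id); // stand-in for real analysis
        self.type_of_cache.insert(def_id, computed.clone());
        computed
    }
}

fn main() {
    let mut q = Queries { type_of_cache: HashMap::new() };
    println!("{}", q.type_of(7)); // computed
    println!("{}", q.type_of(7)); // served from the cache
}
```

Note that none of this changes how the cached data is laid out in memory, which is where an RDBMS gets its speed.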
It's hard work. Small AST changes often require hundreds of changes to the code. The required changes usually make the AST less ergonomic to work with. And the perf benefits I obtained were very small, even after shrinking `ast::Expr` (by far the most common AST node kind) from over 100 bytes to 64 bytes on 64-bit platforms.
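The classic way to shrink a node like that is to box the rare, large variants. A sketch with hypothetical node types (not rustc's actual AST), showing the effect on 64-bit targets:

```rust
use std::mem::size_of;

// The large inline payload inflates EVERY value of this enum.
#[allow(dead_code)]
enum BigExpr {
    Lit(i64),
    Closure([u64; 8]), // 64-byte payload carried inline
}

// Boxing the big variant trades a pointer chase for a smaller node.
#[allow(dead_code)]
enum SmallExpr {
    Lit(i64),
    Closure(Box<[u64; 8]>), // 8-byte pointer instead of 64 bytes inline
}

fn main() {
    // Typically prints "72 16" on 64-bit targets.
    println!("{} {}", size_of::<BigExpr>(), size_of::<SmallExpr>());
}
```

This also illustrates the ergonomics cost mentioned above: every construction and match of the boxed variant now goes through the `Box`.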
The linked Zig PR has very impressive reductions in walltime/cycles/etc. but if you read closely it's restricted just to parsing, which is usually a very small slice of compile time, at least for rustc. My experience with these kinds of changes was disappointing. I concluded "I’d love to be proven wrong, but it doesn’t feel like this work is taking place in a good part of the effort/benefit curve."
Of course Zig is a very different language and its compiler handles a rather different workload. It's totally possible that their approach makes more sense in younger codebases, or with a different source language design, or whatever else. But I also don't think node size tells the whole story: there's a synergy between memory usage, memory layout, and memory access patterns. For example Cranelift gets a lot of mileage from tweaking their algorithms in combination with their data structures, e.g. the "half-move" design mentioned in https://cfallin.org/blog/2022/06/09/cranelift-regalloc2/#per...
Perf was a bit of a wash on this one but it means we can serialize most of the compiler's state with a single pwritev syscall. For a 300,000 line codebase, this data takes up 30 MiB and uses the same format on disk as in memory. On my laptop, 30 MiB can be written from memory to disk in 25ms. This is one puzzle piece for incremental compilation. More puzzle pieces are listed in the PR description here: https://github.com/ziglang/zig/pull/16917
I guess I'm in the thought camp (a vanishingly small camp, if HN is any measure) that compile times don't matter. If you want short compile times, use interpreted languages. I appreciate the checks, validation, and optimization a compiler gives, and Rust cuts down on total development time because of the extensive checks it performs.
I remember a time when C++ projects took hours to build that today take only a few minutes. Perhaps that perspective is an assist when 'dealing' with compilation times.
I also don't think tests need to be blazing fast. I'd rather have correct tests than fast tests, just like I'd rather have a thorough compilation than a fast one.
Anecdotally, even with a project that has an upwards of 50 dependencies, many of them *-sys crates, I've never thought that compilation times were that bad.
Speaking of interpreted languages, JavaScript projects have historically taken far longer to compile (looking at you, Meteor) and are rarely incremental.
That's not to say "don't optimize things", of course, but there are seemingly many people very upset about or critical of Rust's compile times, which I've never fully understood.
I use rust because of the value it brings for the reasons you say and for others, but it would bring me even more value if it built faster. Compile times absolutely matter; how they trade off against other things will depend on the particular project.
We could do day-to-day Rust/C++/any compiled language related development with an interpreter/JIT and then compile the code once a PR is ready and the code hits CI/CD or gets deployed. The "instant" runtime doesn't even need to have the exact same behavior -- it can be 99% accurate and still be very useful.
All mainstream build systems already support incremental builds, and in some cases it's even possible to leverage compiler cache to further shave down the time it takes to rebuild a project.
Breaking down a monolithic project into submodules is all it takes to shave off a big chunk of time from a build.
Imagine being able to set a breakpoint in a function, change something and ask the debugger to “travel” back in time and pretend the code was always what it’s been changed to. It’s something some languages give you but I’m not sure an incremental compiler is sufficient?
These kinds of few seconds/minutes shaved add up and can impact one’s productivity.
for that kind of thing I would imagine more late binding in the language is helpful
at the level Rust works, late binding is hard to do but I don't think it's impossible. Name mangling and calling conventions are a couple big obstacles.
I don't understand this take. Correct and fast are absolutely not mutually exclusive, whether it applies to tests or compilation. The Rust compiler is not slow because of all the checks it performs; it's slow because of the way those checks are implemented.
> If you want short compile times, use interpreted languages
Interpreted languages are not compiled, so this sentence doesn't make sense.
> Interpreted languages are not compiled, so this sentence doesn't make sense.
Usually it's a moot point, but in the context of the post it isn't. Python, Lua and JavaScript are compiled to bytecode / intermediate representations. That's compilation, but usually not mentioned, as it's so fast. However, in this context, that speed is the exact point. Use interpreted languages if compilation times (or lack thereof) are a concern!
The lines are blurry though. C# and Java are also interpreted languages (compiled to intermediate representation, then executed on the language runtime aka VM; possibly JITed to native machine code), but no one calls them that.
There is a very clear difference between compilation and interpretation: in scripting languages, a statement does not depend on the following ones to be executed, meaning that the code can just be read through. Compilation is different because the binary produced needs to include the whole program, therefore the whole code has to be read and analyzed before the bytecode can be produced. Compile literally means "group together".
So compilation time obviously only applies to compiled languages.
Saying that compilation time is not a concern to you and shouldn't be to anyone is also very short-sighted. I'm pretty confident that you would consider it to be a problem if a "Hello world" took a week to compile. Maybe you meant "Compilation is fast enough for me", which is fine, but there is no reason why your speed expectation should be more legitimate than anyone else's.
There is not a clear differentiation. For one thing just because the primary implementation is a compiler doesn't mean you can't also have an interpreter.
There is no requirement that a compiler has to produce a binary. Compilers can also produce byte code or JavaScript or C code.
Also very few languages are purely interpreted.
Python, Ruby, and Perl aren't (purely) interpreted. They're compiled
to bytecode, then that bytecode is interpreted. That's at least the primary implementations; there are also JVM/.NET versions of Python and Ruby, and PyPy (a JIT). Compilation happens at import/run time. Emscripten uses LLVM to compile C/C++/Rust into JavaScript.
Java and C# are compiled into bytecode which is then interpreted or JIT compiled or both.
> in scripting languages, a statement does not depend on the following ones to be executed, meaning that the code can just be read through
I'm pretty sure this is nonsense and is not a consistent way to distinguish languages that are considered scripting languages and languages considered compiled languages.
> Compile literally means "group together".
Maybe so, but as a technical term in computer science, it means translate from one language to another.
That said, I definitely agree that compile times matter.
> There is a very clear difference between compilation and interpretation: in scripting languages, a statement does not depend on the following ones to be executed, meaning that the code can just be read through.
But most scripting languages are usually compiled (even if not always AOT and not always to native code, but instead to bytecode of some kind), so how can a difference between compilation and interpretation, even if it were a valid one, be relevant here?
Bytecode encoding an interpreted language is only superficially similar to compilation.
After all, you can't perform meaningful optimizations when the environment your code is running in completely defines what the code does.
Sure, you don't know the details of how syscalls work, but you do know the exact process necessary to call a function in compiled languages. Interpreted languages tend to just emit the original names in the bytecode, because you don't know what the function `sin` is when that is environment-dependent.
> Bytecode encoding an interpreted language is only superficially similar to compilation.
No, its literally compilation, not something superficially similar to it.
> After all you can’t perform meaningful optimizations when the environment your code is running completely defines what the code does.
The environment your code is running on completely defines what the code does in all cases, and yet meaningful optimization is possible, and bytecode for abstract/virtual machines is common both for things that are commonly called scripting languages and for things that generally are not. (Sometimes, the implementation of an abstract/virtual machine may involve another, JIT, compiler to some other target, maybe native, maybe not.)
It’s true a lot of the languages called “scripting” languages have semantics that are structurally very difficult to optimize, but that has nothing to do with “interpreted vs. compiled”.
I beg you, please don't speak with such certainty about definitions which you're not certain of, particularly when you've already been corrected. Take the opportunity to learn something new. Dragonwriter is using exactly the definition that people who work in this area use.
My copy of the red dragon book defines a compiler as a program for translating from one language to another. A translator from source to bytecode clearly meets this definition. If you have a problem with that definition then you're on pretty shaky ground since you're essentially making the argument that you know more about this sort of thing than Aho, Lam, Sethi and Ullman.
The distinction between a compiler and an interpreter is most clearly illustrated by old school (pre-JIT) java, where javac is clearly a compiler which translates Java source to Java bytecode (language -> language) and the java executable itself which interprets bytecode.
That distinction still exists in most interpreted language implementations, though not all. Python being a good example. It has a compiler which translates source to bytecode and an interpreter which executes the bytecode. They just both exist in the same program but crucially it certainly includes a compiler.
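The Python arrangement described above, a compiler and an interpreter living in the same program, is easy to sketch. Here is a toy version in Rust (a made-up three-op bytecode, purely illustrative):

```rust
// Toy "compiler + interpreter in one program": compile an expression
// AST to stack-machine bytecode, then interpret that bytecode.
#[derive(Clone, Copy)]
enum Op { Push(i64), Add, Mul }

enum Expr { Num(i64), Add(Box<Expr>, Box<Expr>), Mul(Box<Expr>, Box<Expr>) }

// The compiler half: translate one language (the AST) into another
// (the bytecode).
fn compile(e: &Expr, out: &mut Vec<Op>) {
    match e {
        Expr::Num(n) => out.push(Op::Push(*n)),
        Expr::Add(a, b) => { compile(a, out); compile(b, out); out.push(Op::Add); }
        Expr::Mul(a, b) => { compile(a, out); compile(b, out); out.push(Op::Mul); }
    }
}

// The interpreter half: execute the bytecode on a stack machine.
fn interpret(code: &[Op]) -> i64 {
    let mut stack = Vec::new();
    for op in code {
        match op {
            Op::Push(n) => stack.push(*n),
            Op::Add => { let b = stack.pop().unwrap(); let a = stack.pop().unwrap(); stack.push(a + b); }
            Op::Mul => { let b = stack.pop().unwrap(); let a = stack.pop().unwrap(); stack.push(a * b); }
        }
    }
    stack.pop().unwrap()
}

fn main() {
    // (2 + 3) * 4
    let e = Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
        Box::new(Expr::Num(4)),
    );
    let mut code = Vec::new();
    compile(&e, &mut code);
    println!("{}", interpret(&code)); // prints 20
}
```

Old-school Java simply split these two halves into separate executables: javac is the `compile` half, the pre-JIT java VM the `interpret` half.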
My point was that there needs to be a distinct target language.
Encoding the source language in a more compact form, without changing it more than superficially, isn't compilation.
Like a web server that compresses JavaScript isn't a compiler.
My point was that anything that outputs something that has to be interpreted is on sketchy ground: if you are interpreting it, how is that a distinct language from the original?
CPython can interpret py files or compile them into pyc.
CPython does not interpret the resulting bytecode. It executes that bytecode on its virtual machine.
Similarly Java and .NET languages are compiled into a bytecode which is executed on a virtual machine.
You don't interpret a compiled output.
The confusion comes from languages which support both like Python. We call Python interpreted as a shorthand for "can be interpreted" but modern Python often is compiled for the enormous performance benefits.
To reiterate you don't interpret bytecode you translate or execute it on a virtual machine.
No, I haven’t. Compilation is well-defined (unlike, really, "scripting language", though there are clearly languages that are frequently called that despite the absence of any coherent definition bounding the class), and it’s unquestionable that many implementations of languages usually called "scripting languages" use it.
> and are wondering why there is uncertainty…
No, I’m not “wondering” anything, and “uncertainty” isn’t even part of the discussion.
> in scripting languages, a statement does not depend on the following ones to be executed
I agree with most of your comment, but how do you explain languages with end-of-block markers? E.g. in Ruby, if you stop execution before the end of a block you will surely get an error: a `def` without its closing `end` is a syntax error, so the code can't just be read through statement by statement.
I wouldn't say compile time doesn't matter, just that it's highly non-linear. The breaking points are:
- You can restart the program as you're typing it. I've been looking at Elixir lately, and if I find a bug I can fix it and immediately look at a page that's got my change.
- You can compile it and it will be done before you switch to something else, so you aren't waiting long. Typically you also have a good idea as to whether it will compile.
- You hit compile, and you know you have to find something else to do. Sometimes it will break when you're doing this other thing and the time lost is even greater. I think this used to be normal in the punch-card era.
> Rust cuts down on total development time because of the extensive checks it performs
Plus, there's rust-analyzer/LSP. It works in real-time, and catches basically everything one might care about, no compiler needed. It's kind of like a static interpreter.
My use cases are modest, but so far, once r-a was happy, the compiler was as well. Mileage might vary when using crate features, possibly? Inactive features are (until Rust 1.72 at least) invisible to the compiler so issues might arise only at compile-time there.
> I'd rather have correct tests than fast tests, just like I'd rather have a thorough compilation than a fast one.
The first one is a false dichotomy though. It's not like people write fast but incorrect tests whenever given a keyboard.
> I also don't think tests need to be blazing fast. I'd rather have correct tests than fast tests, just like I'd rather have a thorough compilation than a fast one.
That's a false dilemma, and a poorly thought out strawman.
The goal is not to pick between fast tests and correct tests. The goal is to pick between correct tests that are fast and correct tests that are slow. There are absolutely zero reasons to prefer slow tests.
A tie between the Linux kernel and LLVM (specifically Clang). Also making game engines gets pretty hefty quite quickly. That's just open source work though. I've worked on larger systems at some Big Tech that were unwieldy, too.
And at no point were you bothered by the time wasted by C/C++ compiling (and especially, unavoidable linking!) in your work process, CI process, and general ability to quickly debug and iterate on a hard problem?
It depends on the problem. For some problems where you’re spending time mostly thinking, and relatively little time actually typing code, the compile times don’t really matter that much.
For LLVM for example, with clangd configured to do code completion and syntax checking in the editor, you don’t really have to do as many build/test cycles.
That said, I think fast compile times are important for the other use cases.
I am a bit surprised about this as well, since from my perspective, in a comparison of Rust to C++, one of the more noticeable issues is the extra verbosity rather than compile time.
You're right about switching to interpreted languages for this reason as well. Python is so incredibly simple to write, and doesn't have any compile time at all.
I think a better trade off is a language that has a long compile time, is very fast and secure, but has simpler syntax.
Effectively a Rust-based language with simpler syntax.
I recognize some of these exist, but they need to be popular enough to use as well.
I don't think the compile times of Go can be even compared with Rust or any major compiled language.
It does compile extremely quickly; Rust on the other end lies completely opposite on that spectrum IMO.
Although both languages are "modern", both have their pros & cons.
This may sound like asking for a pink elephant, but I would really want something that has the guard-rails, expressiveness & performance of Rust with the simplicity & compile-time speed of Go.
You are looking at only the language, you should also evaluate the toolchain and runtime. Go has a toolchain that most other languages (including Rust) can only dream of.
> This may sound like asking for a pink elephant, but I would really want something that has the guard-rails, expressiveness & performance of Rust with the simplicity & compile-time speed of Go.
> Compilers involve giant tree structures, with lots of allocations, pointer chasing, and data-dependent multi-way branches.
Tree structures and data-dependent multi-way branches do seem inevitable to me, but pointer chasing and allocations don't seem inevitable at all. There are lots of ways to avoid allocations or make allocations practically free. Similarly there are lots of ways to make pointer chasing less prevalent, especially getting down to single-indirection.
rustc may not be able to fix its allocation or pointer-chasing problems, but this is not endemic to the problem of compiling.
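Arenas are the standard example of making allocation practically free: nodes are bump-allocated into big chunks and freed all at once. A minimal sketch, not rustc's actual arena (which uses interior mutability so shared references can be handed out while allocating):

```rust
// Toy arena: allocation is amortized to a bounds check plus a push into
// a pre-sized chunk; nothing is freed until the whole arena is dropped.
struct Arena<T> {
    chunks: Vec<Vec<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { chunks: vec![Vec::with_capacity(1024)] }
    }

    fn alloc(&mut self, value: T) -> &T {
        // Start a fresh chunk instead of reallocating, so elements
        // already handed out never move.
        if self.chunks.last().unwrap().len() == self.chunks.last().unwrap().capacity() {
            self.chunks.push(Vec::with_capacity(1024));
        }
        let chunk = self.chunks.last_mut().unwrap();
        chunk.push(value);
        chunk.last().unwrap()
    }
}

fn main() {
    let mut arena = Arena::new();
    let a = *arena.alloc(10u32);
    let b = *arena.alloc(32u32);
    println!("{}", a + b); // prints 42
}
```

Grouping nodes into contiguous chunks like this also helps the pointer-chasing side: neighboring nodes tend to land on the same cache lines.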
No, but it is endemic to how we have thought about compilers up to now, and especially to when the Rust compiler's architecture was largely developed, more than 10 years ago.
I recommend watching this for a recent take on trying to change how we think about these. It is an open research field.
I’m impressed that this is the original code base dating back to the first lines of self-compiling Rust. They’ve managed to rearchitect and refactor it multiple times instead of rewriting it.
Why would they use SipHash at all? That’s a significantly slower hash algorithm than, say, wyhash, and any additional DoS protection SipHash might have (assuming it does) seems irrelevant for this use case.
> Compilers involve giant tree structures, with lots of allocations, pointer chasing, and data-dependent multi-way branches. All the things that modern microarchitectures handle poorly. There is research into different designs that avoid these problems, but they involve data structures that can be highly non-ergonomic.
A lot of graph algorithms are being expressed in matrix form. Like the current ML stuff.
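A sketch of that idea: one frontier-expansion step of graph reachability written as a boolean matrix-vector product, so the work is dense bit operations rather than pointer chasing (illustrative only; real systems like GraphBLAS use sparse bit-packed matrices):

```rust
// One step of reachability: reach'[j] = reach[j] OR any(reach[i] AND adj[i][j]),
// i.e. a boolean matrix-vector "multiply" over a dense adjacency matrix.
fn step(adj: &[Vec<bool>], reach: &[bool]) -> Vec<bool> {
    let n = adj.len();
    (0..n)
        .map(|j| reach[j] || (0..n).any(|i| reach[i] && adj[i][j]))
        .collect()
}

fn main() {
    // 0 -> 1 -> 2
    let adj = vec![
        vec![false, true, false],
        vec![false, false, true],
        vec![false, false, false],
    ];
    let mut reach = vec![true, false, false]; // start at node 0
    for _ in 0..2 {
        reach = step(&adj, &reach);
    }
    println!("{:?}", reach); // prints [true, true, true]
}
```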
How much is slow compilation caused by the compiler, and how much is it caused by the crates ecosystem and the many, many, many dependencies a project gains?
I'm genuinely curious because a couple of projects I have written are not that large per se, but I don't like their compilation times too much. What I have observed is that I have ended with 100-200 (indirect) dependencies, and compilation slows down the closer it gets to the "leaf" of the process. In other words: by the time compilation reaches my own code, my code seems to take much longer to compile relative to the intermediate (larger) crates. And it's not link time (I've switched to mold).
Most of those are trivially parallelized, so it's not a problem. It gets slower towards the end because e.g. your crate needs all of its dependencies in place before it can be compiled.
This brings up a potentially huge optimization waiting to happen. When you have a Rust dependency, you compile the whole crate before you can even start compiling your code. But because of the depth of the ecosystem, there is a lot of code packaged into crates that may not need to be compiled: the crate might be used for only a single function. There are probably dozens of ways to do this, but a quick parse and analysis of the target code could eliminate a large amount of dependency compilation.
I do know there is vague / high-level interest in this kind of approach though I expect this would require a lot of work from the compiler, especially when it comes to figuring out what traits need to be compiled.
Already, rust splits crate compilation into two passes where dependents are only blocked on the first pass and linking on the second. Maybe an intermediate is to do the first pass everywhere and then use that to decide what can be dropped from the second pass.
traits or not, isn't it all compiled down to impls? and if some are not called, they can be dropped (dead code elimination basically)? of course this would require tracking the full call graph, right? (which the compiler might or might not do already; or might not be able to do due to memory constraints.)
> This is a fun one. @Dragon-Hatcher created a very unusual Rust program: a chess engine that computes things at compile time only using Rust’s Turing-complete trait system. (They also have a TypeScript version of the same program.) This causes vast numbers of trait obligations to be generated for the compiler’s trait solver. In this PR I made some small changes to a de-duplication step that runs on collections of these obligations, which reduced both the compile time and peak memory usage of this crate by almost 7x! It also gave icount wins of up to 4% across a number of secondary benchmarks.
Nothing in this blog post says this, or even implies it. Just because expecting a 10x improvement is unrealistic doesn't mean that it's impossible for the Rust compiler to get "much" faster for a more reasonable definition of "much" (say, 2x). He does say that there likely won't be some step change, but rather the result of lots of small improvements over a long period of time.
Yeah, just give the author a time machine so that they can post the same blog at the end of September.
Jokes aside, this is not a guide on how to tweak your compiler settings or stuff like that, but rather it's about changes in the compiler source code that just make it faster for everyone. You just need to wait 6-12 weeks for them to come to stable though.
I'm a member of Futurewei's Rust team, and I'm in the Compiler Contributors group.
I started these posts back in 2016 when I wasn't a member of either. It's been a long-running series, and there's never been much reason for them to go on the official Rust blog, because they get enough attention on my personal blog.
In the past I have posted links to Hacker News but usually they don't make the front page and get few if any comments, so I stopped bothering, but others sometimes do. I always post to /r/rust where the level of discussion tends to be higher-quality than HN, because there is a higher level of Rust knowledge.
[1] https://lemire.me/blog/2023/04/27/hotspot-performance-engine...