
In Assembly, even if you manage to beat the compiler, it might be a Pyrrhic victory, because the win might be lost when running the same benchmark on another CPU or after a microcode update.

During the 80s and early 90s it was a different matter, because CPUs were dumb, hardware was relatively static, especially on 8- and 16-bit consumer systems, and high-level optimizers were pretty dumb given the resource constraints of those platforms.



I’m not debating whether or not it’s a worthy endeavour, though. I’m only saying that you can’t expect good performance out of assembly code unless you practice writing high-performance assembly code. Most of us have a lot of experience with high-level languages, so it makes sense that we can write well-performing high-level code, but we shouldn’t expect that we can just “drop down to assembly” and get a performance boost. That doesn’t mean it’s never possible for the people who actually do this a lot (e.g. the x264 people writing hand-crafted SSE/AVX code).


It is a Herculean effort trying to master modern Assembly.

Back in the day, you could easily know all opcodes for a given CPU, and their clock cycle timings.

This is the SIMD guide for the Intel CPUs,

https://software.intel.com/sites/landingpage/IntrinsicsGuide...

Which is only a tiny subset of all the opcodes that a modern Intel CPU is able to understand, let alone what AMD also offers.
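To give a flavor of what a single entry from that guide looks like in practice, here is a minimal sketch of adding two float arrays with SSE intrinsics next to the plain loop. The function names `add_scalar` and `add_sse` are made up for illustration, and the `HAVE_SSE` guard is just my assumption about how to keep it compiling on non-x86 targets:

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <immintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Plain scalar loop: the compiler is free to auto-vectorize this. */
void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hand-written SSE version: four floats per iteration, scalar tail. */
void add_sse(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#if HAVE_SSE
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    for (; i < n; i++) {                  /* leftovers, or everything without SSE */
        out[i] = a[i] + b[i];
    }
}
```

Both functions produce the same result. Whether the intrinsic version is any faster depends on the compiler, the flags and the CPU, since with optimizations on, compilers will often vectorize the scalar loop on their own.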

You need profiling tools from each CPU vendor, like Intel's VTune, to actually understand the clock timings of each opcode once it is broken down into micro-ops.

While you can master a specific subset, like the AVX instructions, mastering Assembly from front to back like in the old days is only feasible when writing Assembly for stuff like small PIC microcontrollers.

Trying to master a language like C++ is easier, which says a lot about what modern CPUs look like.



