
Tragically LOOP is alive and well in x86-64.


EDIT: Blergh, confused LOOP with REP, but keeping the below comment so the rest of the thread still makes sense.

FWIW LOOP isn't the worst thing in the world once you have dedicated silicon for it anyway, generating micro-ops in the instruction decode pathway. It's just a pretty cute run-length encoding scheme for the instruction stream.
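
To make the "run-length encoding" remark concrete, and reading it as a description of the REP prefix (per the EDIT above), here is a rough C model of what `rep movsb` does architecturally with the direction flag clear: one encoded instruction stands in for rCX repetitions of a byte move. The `rep_regs` struct and `model_rep_movsb` name are made up for illustration; this is a sketch of the semantics, not of how any CPU implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the registers REP MOVSB reads and updates. */
typedef struct {
    uint8_t *rdi;        /* destination pointer */
    const uint8_t *rsi;  /* source pointer */
    size_t rcx;          /* repeat count */
} rep_regs;

/* Models `rep movsb` with DF = 0: copy one byte, advance both pointers,
 * decrement the count, and repeat until the count reaches zero.
 * A count of zero copies nothing. */
static void model_rep_movsb(rep_regs *r) {
    while (r->rcx != 0) {
        *r->rdi++ = *r->rsi++;
        r->rcx--;
    }
}
```

After the call, the count is zero and both pointers sit just past the last byte copied, which mirrors the register state the real instruction leaves behind.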


It's slow as sin, though. Just straight emulating it using more common instructions is like 4x better in most modern Intel CPUs. For some insane reason, it emits 8 uops on Skylake.
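
For anyone wondering what "emulating it using more common instructions" means: `loop label` decrements rCX and branches if the result is nonzero, and the usual replacement is `dec rcx` followed by `jnz label`. The one architectural difference is that LOOP leaves the flags alone, while DEC updates them (all but CF). Below is a small C model of both forms; the function names are invented for illustration, and it only models iteration counts, not timing.

```c
#include <stdint.h>

/* Models a loop body closed by `loop label`: the count register is
 * decremented and the branch is taken while it is nonzero. No flag is
 * involved. Starting with rcx == 0 wraps around, as on real hardware,
 * so only call this with a nonzero count. */
static uint64_t iterations_with_loop(uint64_t rcx) {
    uint64_t n = 0;
    do {
        n++;      /* stand-in for the loop body */
        rcx--;    /* LOOP: decrement the count register */
    } while (rcx != 0);
    return n;
}

/* Models the same loop closed by `dec rcx` / `jnz label`: DEC sets ZF
 * from its result and JNZ branches while ZF is clear. */
static uint64_t iterations_with_dec_jnz(uint64_t rcx) {
    uint64_t n = 0;
    int zf;
    do {
        n++;                 /* stand-in for the loop body */
        rcx--;               /* DEC: decrement ... */
        zf = (rcx == 0);     /* ... and update ZF from the result */
    } while (!zf);           /* JNZ: branch while ZF is clear */
    return n;
}
```

Both run the body the same number of times for any nonzero starting count, so the swap is safe as long as nothing after the loop depends on the flags being preserved.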


There is a reason why LOOP is slow: in the 90s it was explicitly made slow because it was used for timing loops. Making it faster would have broken existing software.

Source: https://stackoverflow.com/a/35743699

See also https://stackoverflow.com/questions/35742570/why-is-the-loop...


The things you link to don't seem to say that at all; they seem to say it got slow because it was hard to implement and no one cared about it.


"IIRC LOOP was used in some software for timing loops; there was (important) software that did not work on CPUs where LOOP was too fast (this was in the early 90s or so). So CPU makers learned to make LOOP slow."

"(My opinion: Intel is probably still making it slow on purpose, and hasn't bothered to rewrite their microcode for it for a long time. Modern CPUs are probably too fast for anything using loop in a naive way to work correctly.)"


It's also very fast on AMD (not any slower than the equivalent dec/jnz), so use it if you want your software to run faster on AMD and slower on Intel...


Sure, it doesn't matter anymore because anyone who cares is going through the vector unit to do bulk transfers. But there were issues with memory transfers that had an unaligned base or length for the longest time, well through x86_64's original design.


Are you confusing LOOP with the REP prefix?


I absolutely am, thanks!


My gripe with LOOP and other crappy instructions is that they use up valuable space in the instruction encoding.



