Especially when targeting realistic machine models, the classical sorting algorithms like quicksort or mergesort turn out to be suboptimal in a number of ways. For example, a quicksort with a perfect choice of pivot (the exact median) incurs a branch miss with probability ~50% for every element, because each comparison against the pivot is an unpredictable coin flip. That's not something classical complexity analysis measures, but on actual CPUs, branch misses have quite an impact (something like 10–15 cycles each). Shameless plug for something I once built for a lecture to demonstrate this: https://github.com/lorenzhs/quicksort-pivot-imbalance
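To make the point concrete, here's a sketch of where those misses come from: a textbook Lomuto partition (names are mine, not from any particular library). With a median pivot and random data, the highlighted comparison is true for exactly half the elements in unpredictable order, so the branch predictor is reduced to guessing.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Textbook Lomuto partition: partitions a[lo..hi] around the pivot stored
// at a[hi], returns the pivot's final index. With a perfect (median) pivot
// and random input, `a[i] < pivot` is taken ~50% of the time in an
// unpredictable pattern -- roughly one branch miss every two elements.
std::size_t lomuto_partition(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    int pivot = a[hi];            // assume the chosen pivot sits at a[hi]
    std::size_t store = lo;
    for (std::size_t i = lo; i < hi; ++i) {
        if (a[i] < pivot) {       // <-- the ~50/50, hard-to-predict branch
            std::swap(a[i], a[store]);
            ++store;
        }
    }
    std::swap(a[store], a[hi]);   // move pivot into its final position
    return store;
}
```

A skewed pivot (say, the 10th percentile) makes this branch heavily biased and thus cheap to predict, which is exactly the tradeoff the repo above measures.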
This is really cool — how did you develop the background knowledge to solve this? I'm trying to learn more about low-level stuff, and I would have no idea how to approach a problem like this.
For sorting algorithms that take the behaviour of modern CPUs into account, check out ips4o (https://arxiv.org/abs/1705.02257, code: https://github.com/SaschaWitt/ips4o), or, for a simpler algorithm that's still much faster than quicksort in most cases, BlockQuicksort (https://arxiv.org/abs/1604.06697, code: https://github.com/weissan/BlockQuicksort). Note that both papers were published within the last two years :)
Of course these algorithms are much more complex and error-prone to implement, and they use some additional memory, which may explain why they're not used in the standard library implementations of popular languages.
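For the curious, the core trick in BlockQuicksort can be sketched in a few lines (a simplified illustration of the paper's idea, not the library's actual API; the names are mine). Rather than branching on every comparison, it records the offsets of misplaced elements within a block using branch-free writes, then performs the swaps in a separate tight loop:

```cpp
#include <cassert>
#include <cstddef>

// Simplified sketch of BlockQuicksort's branch-free offset collection.
// Every iteration stores the current index unconditionally; the comparison
// result only bumps the write position. The data-dependent `if` becomes an
// add, so there is no branch for the predictor to miss.
std::size_t collect_offsets(const int* block, std::size_t n, int pivot,
                            unsigned char* offsets) {
    std::size_t num = 0;
    for (std::size_t i = 0; i < n; ++i) {
        offsets[num] = static_cast<unsigned char>(i);  // unconditional store
        num += (block[i] >= pivot);  // branch-free: comparison as arithmetic
    }
    return num;  // count of elements in this block that belong on the right
}
```

The full algorithm runs this over a block from each end of the array and then swaps the recorded misplaced pairs; only the block-boundary bookkeeping uses ordinary (and highly predictable) branches.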