I can believe it’s less of an issue for primitives smaller than long/double where...

marginalia_nu · 2025-11-02T20:03:51 1762113831

That's a good start that explains some of the memory overhead (along with the sizable Java object header), but we also need to take into account memory locality to explain why this is so much slower.

Main memory access is, in the worst case, on the order of 100x slower than a cached read. With boxed primitives you are very often looking at main memory access, whereas naked primitives can (when the planets align) amount to cached memory access.
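To make the locality point concrete, here is a minimal sketch (the class and method names are made up for illustration). Both loops compute the same sum, but the boxed version has to follow a reference to a separate heap object for every element, while the primitive version walks one contiguous array.

```java
// Hypothetical illustration; not a benchmark.
final class BoxedVsPrimitiveSum {

    // Elements sit next to each other in one contiguous array,
    // so a single cache line serves several consecutive reads.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // The array holds references; each element is a separate Integer
    // object that must be dereferenced (unboxed) before it can be added.
    static long sumBoxed(Integer[] values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // implicit v.intValue(): one extra pointer hop
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] primitive = new int[n];
        Integer[] boxed = new Integer[n];
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed[i] = i; // autoboxing, same as Integer.valueOf(i)
        }
        System.out.println(sumPrimitive(primitive) == sumBoxed(boxed));
    }
}
```

How big the gap is depends on where the boxes end up in the heap, so measuring it properly needs a harness such as JMH rather than a stopwatch around these loops.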

cyberax · 2025-11-02T20:51:05 1762116665

Tagged pointers don't buy as much performance as you'd expect in Java. That's because the JVM is highly multithreaded, and the Java memory model guarantees memory safety (unlike in Go, for example). So every pointer load from RAM will need a check for the tag bits. And you end up with your code full of branches.
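A rough sketch of where those branches come from, using a made-up tagging scheme (this is not how HotSpot represents anything): the low bit of a 64-bit word says whether the word is an inline integer or a reference, and the `heap` array is just a stand-in for real memory.

```java
// Hypothetical tagging scheme, for illustration only.
final class TaggedWordSketch {
    static final long TAG_INT = 1L; // low bit set: inline integer

    // Store a small integer inline, shifted up to make room for the tag.
    static long encodeInt(long value) {
        return (value << 1) | TAG_INT;
    }

    // Store a "reference" (here: an index into a fake heap), low bit clear.
    static long encodeRef(int heapIndex) {
        return ((long) heapIndex) << 1;
    }

    static boolean isInt(long word) {
        return (word & TAG_INT) != 0;
    }

    // Every load has to inspect the tag before it knows what to do.
    static long load(long word, long[] heap) {
        if (isInt(word)) {
            return word >> 1; // arithmetic shift restores the sign
        }
        return heap[(int) (word >>> 1)]; // follow the "pointer"
    }

    public static void main(String[] args) {
        long[] heap = {10L, 20L, 30L};
        System.out.println(load(encodeInt(-5), heap));
        System.out.println(load(encodeRef(2), heap));
    }
}
```

The `if (isInt(word))` in `load` is the check in question; it runs on every access, not just on the ones that happen to hit a tagged value.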

From practical experience, JVMs have had an option to use compressed pointers for inner fields for two decades ( https://wiki.openjdk.org/display/HotSpot/CompressedOops ). It saves a bit of RAM, but often results in slower code.
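For reference, the arithmetic described on that wiki page is roughly: a 32-bit offset, scaled by the 8-byte object alignment, added to a heap base. A minimal sketch of that encode/decode step, with hypothetical names and plain `long` values standing in for addresses:

```java
// Illustrative arithmetic only; the real work happens in JIT-generated code.
final class CompressedOopSketch {
    static final int SHIFT = 3; // objects are assumed 8-byte aligned

    // Expand a 32-bit compressed reference to a full 64-bit address.
    static long decode(long heapBase, int narrow) {
        long unsigned = narrow & 0xFFFF_FFFFL; // treat the int as unsigned
        return heapBase + (unsigned << SHIFT);
    }

    // Compress a full address that lies inside the addressable window.
    static int encode(long heapBase, long address) {
        return (int) ((address - heapBase) >>> SHIFT);
    }

    public static void main(String[] args) {
        long base = 0x1000_0000L; // made-up heap base
        long address = base + 64;
        int narrow = encode(base, address);
        System.out.println(narrow);
        System.out.println(decode(base, narrow) == address);
    }
}
```

With a 3-bit shift, 32 bits of offset cover 32 GiB of heap. The shift and add in `decode` is the extra work on each dereference, which is one plausible place for the slowdown to come from.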

More recently, the new ZGC collector started using colored pointers; there's a good presentation about it: https://inside.java/2025/10/06/jvmls-zgc-colored-pointers/ It's also been a mixed bag, performance-wise.