Th64: Tiny Hash Function in C

aappleby · on July 4, 2024

Hi, MurmurHash and SMHasher author here.

With the code-golfing obfuscation unwound, Th64 is structured very similarly to MurmurHash. The source data is pre-mixed via "rol(x * r + i, 31)", the hash is mixed via "hash = rol(hash * r ^ x, 31)", the tail bytes are mixed via "hash = hash * r ^ x", and the finalization mix is three "hash = (hash ^ (hash >> 31)) * r" rounds.

I see nothing wrong with it if it passes SMHasher3, but it's also not radically different than other hashes - it's just smooshed down into four lines.

fasteo · on July 4, 2024

>>> Hi, MurmurHash and SMHasher author here.

Only in hacker news

aappleby · on July 4, 2024

What did you expect?

bhickey · on July 5, 2024

On Monday I've got a fantastic hashing bug to show you.

aappleby · on July 7, 2024

Ooooo.....

tidwall · on July 4, 2024

Hi HN. I’m the author of this hash. It’s just something I whipped up last night. It’s really not much different than the other hashes out there. I wanted to see if I could make a simple quality hash that easy to cut & paste, under 80 chars lines, and self contained.

Otherwise just an experiment.

ronsor · on July 4, 2024

This is cool, but note:

* ~~It requires two passes over the data to hash (notice the two while loops).~~ Actually it doesn't.

* The first loop casts the data pointer to a 64-bit int, so you will get different results depending on endianness. I don't remember if this is undefined behavior or not.

bobmcnamara · on July 4, 2024

It's not the endianness that's undefined, it's the dereferencing. 75% chance of this causing an alignment trap on my current development target.

jsheard · on July 4, 2024

And even on platforms that allow unaligned access, isn't there usually a performance penalty when the access straddles two cache lines? Still best avoided if possible.

bobmcnamara · on July 4, 2024

I don't often deal with CPU-coupled caches, but I could see a system where that happens.

Cortex-M4, which doesn't have integrated caches, breaks each 32-bit load into 1-3 aligned loads of between 8 and 32 bits according to the address %4. Cortex-M7 performs each 32-bit load as either 1x or 2x 32-bit aligned loads depending on address % 4.

I agree - especially in cases like this algorithm, where there is only one memory stream, it's often worth unrolling up to 3 or 7 accesses before the big aligned loop.

benhoyt · on July 4, 2024

I could be wrong, but I think it just goes over the data once. The first loop increments i by 8, so is processing 64 bits at a time. It looks like the second loop only does the remaining few bytes, if any.

rurban · on July 4, 2024

Right. To make it work you need to copy the first unaligned bytes, and then loop over the aligned parts.

ronsor · on July 4, 2024

You're right. I missed that in my original readthrough.

jacknews · on July 4, 2024

Is the obfuscated-C style really necessary? Last I checked, compacting C source did not directly result in tighter object code.

o11c · on July 4, 2024

It's not just unnecessary, it's wrong since it casts a potentially-unaligned pointer, instead of just using `memcpy`.

It will happen to work in some common cases if the input is exactly the output of `malloc`, or the same after certain kinds of fixed headers, but will fail for many other buffers.

mnurzia · on July 4, 2024

This is super cool. Any insight into how this may have been designed? High-quality, compact hash functions are really useful in C.

Yoofie · on July 5, 2024

Slightly off topic: Does anyone know of a good/fast hash function (similar to this) for use in 32bit embedded systems (such as Cortex-M, TriCore, C2000, etc)?

IChooseY0u · on July 4, 2024

why not just use fnv64

EPWN3D · on July 4, 2024

I get that expressing the function in 6 lines is cool and all, but it'd be nice if it was readable.

tidwall · on July 5, 2024

Here you go. https://gist.github.com/tidwall/332fc3791a23b038c6b3e73aba02...

tpoacher · on July 5, 2024

Great! Now do it in brainfuck!

_V_ · on July 4, 2024

Neat, but I'm personally missing some key info here - most importantly "why?". What problem does this solve or why is this particular function good or better than other "standard" hashing functions?

And I have to admin that this kind of obfuscation/minification is especially dangerous in languages such as C as there are quite a lot of footguns already as it is. I don't think that minifying C produces faster programs :)