With the code-golfing obfuscation unwound, Th64 is structured very similarly to MurmurHash. The source data is pre-mixed via "rol(x * r + i, 31)", the hash is mixed via "hash = rol(hash * r ^ x, 31)", the tail bytes are mixed via "hash = hash * r ^ x", and the finalization mix is three "hash = (hash ^ (hash >> 31)) * r" rounds.
I see nothing wrong with it if it passes SMHasher3, but it's also not radically different than other hashes - it's just smooshed down into four lines.
Hi HN. I’m the author of this hash. It’s just something I whipped up last night. It’s really not much different than the other hashes out there. I wanted to see if I could make a simple quality hash that easy to cut & paste, under 80 chars lines, and self contained.
* ~~It requires two passes over the data to hash (notice the two while loops).~~ Actually it doesn't.
* The first loop casts the data pointer to a 64-bit int, so you will get different results depending on endianness. I don't remember if this is undefined behavior or not.
And even on platforms that allow unaligned access, isn't there usually a performance penalty when the access straddles two cache lines? Still best avoided if possible.
I don't often deal with CPU-coupled caches, but I could see a system where that happens.
Cortex-M4, which doesn't have integrated caches, breaks each 32-bit load into 1-3 aligned loads of between 8 and 32 bits according to the address %4. Cortex-M7 performs each 32-bit load as either 1x or 2x 32-bit aligned loads depending on address % 4.
I agree - especially in cases like this algorithm, where there is only one memory stream, it's often worth unrolling up to 3 or 7 accesses before the big aligned loop.
I could be wrong, but I think it just goes over the data once. The first loop increments i by 8, so is processing 64 bits at a time. It looks like the second loop only does the remaining few bytes, if any.
It's not just unnecessary, it's wrong since it casts a potentially-unaligned pointer, instead of just using `memcpy`.
It will happen to work in some common cases if the input is exactly the output of `malloc`, or the same after certain kinds of fixed headers, but will fail for many other buffers.
Slightly off topic: Does anyone know of a good/fast hash function (similar to this) for use in 32bit embedded systems (such as Cortex-M, TriCore, C2000, etc)?
Neat, but I'm personally missing some key info here - most importantly "why?". What problem does this solve or why is this particular function good or better than other "standard" hashing functions?
And I have to admin that this kind of obfuscation/minification is especially dangerous in languages such as C as there are quite a lot of footguns already as it is. I don't think that minifying C produces faster programs :)
With the code-golfing obfuscation unwound, Th64 is structured very similarly to MurmurHash. The source data is pre-mixed via "rol(x * r + i, 31)", the hash is mixed via "hash = rol(hash * r ^ x, 31)", the tail bytes are mixed via "hash = hash * r ^ x", and the finalization mix is three "hash = (hash ^ (hash >> 31)) * r" rounds.
I see nothing wrong with it if it passes SMHasher3, but it's also not radically different than other hashes - it's just smooshed down into four lines.