
There are only 2^32 IPv4 addresses; if you know the nonce, you can just try them all... no privacy provided.

If you don't know the nonce, you can't match against other users-- so it's not useful for combating abuse either.

But I'm skeptical re: abuse uses. For commenters, sure-- you may need to store IPs to combat abuse. But for readers? At most you would need sampled data or in-memory counters (e.g. to catch high-volume bots).

Unfortunately, there really isn't any penalty for failing to minimize private data collection.
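To make the first point concrete, here's a rough sketch of how small that 2^32 search really is once the nonce is known. It assumes the site hashes nonce||dotted-quad-IP with a fast hash like SHA-256; the nonce value and target IP below are made up for illustration.

    import hashlib
    import ipaddress

    # Assumed scheme: the site logs sha256(nonce || dotted-quad IP).
    nonce = b"leaked-per-site-nonce"                             # hypothetical value
    target = hashlib.sha256(nonce + b"203.0.113.7").hexdigest()  # one logged entry

    def recover(target_hex: str, nonce: bytes) -> str:
        # Linear scan over all 2^32 IPv4 addresses.
        for n in range(2**32):
            ip = str(ipaddress.IPv4Address(n)).encode()
            if hashlib.sha256(nonce + ip).hexdigest() == target_hex:
                return ip.decode()

Pure Python makes the loop slow, but the same scan in native code with hardware SHA extensions is hours of work on a single core, and minutes on a GPU: a fast hash plus a known nonce provides essentially no protection.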



But of course, the real reason is that those IPs are worth analytics $$$.


It's also useful forensic data if your site is ever hacked.


An example of using IPs to combat abuse is Wordfence. It's a WordPress plugin which blocks traffic from known abusive IPs. A quick glimpse at the "live traffic" for one of my websites reveals several IPs within the last hour that have attempted to access the site which were blocked.

A site I was repairing after a hack fortunately had server logs which included IP data. That IP allowed me to identify the specific exploit used.

So, there are definitely uses for IP data in security terms.


If you use a difficult hash function that takes ~1 second to calculate, then it would take over 120 years to iterate through the IPv4 address space. At the very least, this could cut down on dragnet surveillance.
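For illustration, a deliberately slow, memory-hard hash of the kind being suggested could be built from stdlib scrypt. The salt and cost parameters here are placeholders, not a recommendation; you'd tune n until one call costs about a second on your hardware.

    import hashlib

    SITE_SECRET = b"per-site secret salt"   # hypothetical; must be kept out of the logs

    def slow_ip_hash(ip: str) -> str:
        # scrypt is memory-hard; a large n makes each call deliberately expensive.
        # Parameters are illustrative -- raise n until one call takes ~1 second.
        digest = hashlib.scrypt(ip.encode(), salt=SITE_SECRET,
                                n=2**17, r=8, p=1, maxmem=2**28, dklen=16)
        return digest.hex()

    print(slow_ip_hash("203.0.113.7"))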


This requires that you add ~1 second of latency to every request that requires you to hash the IP. Even if we assume relatively aggressive caching, that is still completely unacceptable from a user-experience perspective.

Assuming you do that, you are looking at about 1,193,046 hours to hash the entire address space. More specifically, you are looking at 1,193,046 CPU-hours.

You can rent a 96-vCPU c5.24xlarge instance from AWS for $4.08/hour, or $0.0425/CPU-hour. Assuming this offers the same per-CPU hashrate as a general-purpose web server, you are looking at a cost of about $50,704 to construct a rainbow table. That is nowhere near a prohibitive sum of money.

You can probably reduce the cost by shopping around for compute or using bare metal. You could see significant cost reductions by using hashing optimized ASICs.

Combine this with the fact that no website is going to spend 1000ms computing a hash for every request (even if you allow for caching), and the fact that an attacker could considerably narrow down the address space they are interested in if they wanted to save money.

2^32 is just too small a space to create a viable asymmetry between legitimate use and an attack.
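For anyone checking the arithmetic, the figures above fall out of a few lines:

    # Back-of-the-envelope from the numbers quoted above.
    seconds = 2**32                  # one 1-second hash per IPv4 address
    cpu_hours = seconds / 3600       # ~1,193,046 CPU-hours (~136 CPU-years)
    cost_per_cpu_hour = 4.08 / 96    # c5.24xlarge: $4.08/hr over 96 vCPUs
    print(f"{cpu_hours:,.0f} CPU-hours, ~${cpu_hours * cost_per_cpu_hour:,.0f}")
    # -> 1,193,046 CPU-hours, ~$50,704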


From a user-experience perspective, you can perform the computation asynchronously. There are also hash algorithms designed to be ASIC-resistant.

But yeah, everything else you said makes sense.
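As a rough sketch of the asynchronous approach (the queue, worker, and handler names are made up, and the scrypt parameters are the same illustrative ones as in the sketch above): the request handler only records a timestamp and enqueues, while the slow hash runs in a background worker.

    import hashlib, queue, threading, time

    log_queue = queue.Queue()

    def log_writer():
        # Background worker: the slow hash happens here, off the request path.
        while True:
            ts, ip = log_queue.get()
            digest = hashlib.scrypt(ip.encode(), salt=b"per-site secret salt",
                                    n=2**17, r=8, p=1, maxmem=2**28, dklen=16)
            print(ts, digest.hex())      # stand-in for the real log sink

    threading.Thread(target=log_writer, daemon=True).start()

    def handle_request(client_ip: str):
        # The handler never waits for the hash; it captures the timestamp now,
        # so entries can be ordered later even if hashing lags behind.
        log_queue.put((time.time(), client_ip))

The latency only delays when an entry lands in the log, not the timestamp it carries.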


And now you have ~1000ms of latency between when an event happens and when you can log it. Even assuming all such events get logged, you will be left with a jumbled mess of out-of-order entries.


Why does your logging system rely on the order of entry insertion and not on the entry timestamp?


Yes, but then I’m burning a second of compute time every time I want to log something.

Also, by removing unlikely candidates (IPs owned by irrelevant entities, or addresses that are not US-based) you could get the search range much, much smaller, and with the FBI's budget you could probably compute it all in a few days even with a 1-second hash time.


But then a single user clicking on links quickly would bring your webserver to its knees. So much for using those addresses to combat abuse... :)

Plus the FBI could probably narrow their search to a few hundred thousand addresses (relevant ISPs, no unroutable/multicast/etc), then only use the list to confirm.

Finally, if it takes 120 years on one core, it'll take about 1.4 months on 1000 cores. I'm willing to bet the FBI has access to more computing power than I do. ~100 CPU-years isn't a particularly daunting amount of computing work, even for fairly low-stakes research.

That search would also decode all addresses in the logs, not just one targeted one...
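To put a number on the "narrow the search" point: just dropping blocks that can never appear as a public client address already removes roughly 600 million candidates (about 14% of the space), before you even restrict to specific ISPs' allocations. A quick count with the stdlib, using a partial list of excluded blocks rather than an authoritative bogon list:

    import ipaddress

    # Partial list of ranges that can't be a public client address.
    excluded = [
        "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
        "169.254.0.0/16", "172.16.0.0/12", "192.168.0.0/16",
        "224.0.0.0/4",    # multicast
        "240.0.0.0/4",    # reserved
    ]
    removed = sum(ipaddress.ip_network(b).num_addresses for b in excluded)
    print(f"{removed:,} of {2**32:,} addresses excluded ({removed / 2**32:.0%})")
    # -> 609,353,728 of 4,294,967,296 addresses excluded (14%)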


A hash that takes 1 second on a CPU can easily run 100x faster on a GPU, and that work can then be distributed over thousands of GPUs. For reference, argon2 was supposed to be an ASIC-resistant, GPU-resistant, memory-hard hashing algorithm, but a K20X from 2013 is 5x faster than a CPU [1], and GPUs have only gotten faster since then compared to CPUs.

[1]: https://github.com/WebDollar/argon2-gpu


The best model would be to publicly display commenters' IPs, never store readers' IPs, and keep error logs (e.g. of people brute-forcing a password).

You'd have a triple virtuous effect: people would stop being such insufferable asses once they understand that their name is basically on the comment, readers would be completely safe, and abusers would still be logged.

It's probably even what most websites do. It's news to me that sites keep the IP of every visitor; I'd have pruned them.


And then my modem reconnects and I get a new IP that used to belong to some insufferable asshole, and suddenly I’m blocked / blackholed / shadowbanned everywhere and some vigilante is flood pinging me.


Bingo. IRC tried the strategy of banning users by IP and half the time you'd end up k-lining entire countries because their ISPs were too cheap to buy more endpoints.


Maybe in the 56k days, but my DOCSIS ISP rarely re-assigns IPs.


Any examples? I like the transparency and self-filtering. What is/isn't this approach suitable for? Anonymous is a very common pen-name.



