
(disclaimer: I wrote that post)

It is not. We rely on more than User-Agent strings because they are too often faked, so it is not just marketing. There are other signals we see that confirm whether the request came from a "legitimate" AI scraper, or a different scraper with the same user agent.



> There are other signals we see that confirm whether the request came from a "legitimate" AI scraper, or a different scraper with the same user agent.

Great! What are these signals? That seems to be the meat of the post but it's conspicuously absent. How are we supposed to validate the post?


> How are we supposed to validate the post?

Imagine you were a vendor who was trying to trick the author into divulging his methods. Can a stranger on the Internet be trusted?


I imagine if that information is disclosed, you won’t be able to verify it, as it will be bypassed… because it was disclosed.


What a wonderful world we live in where serious people are expected to believe press releases based purely on brand prestige.


There are a lot of assumptions in that comment.


So: user agent, plus a whois lookup to see if the request is coming from a plausible netblock, plus Accept: strings and other HTTP header stuff?
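
For anyone curious what the netblock half of that looks like in practice, here's a minimal sketch of forward-confirmed reverse DNS, which is the check Google documents for verifying Googlebot. To be clear: this is my guess at one plausible signal, not the post author's actual method. The KNOWN_BOT_SUFFIXES map and the GPTBot suffix in particular are illustrative assumptions; some vendors publish IP ranges instead of rDNS suffixes, so a real implementation would pull from each vendor's docs.

    import ipaddress
    import socket

    # Illustrative mapping only. Google documents googlebot.com/google.com
    # hostnames for Googlebot; the OpenAI suffix here is a hypothetical
    # stand-in, since OpenAI publishes IP ranges for GPTBot rather than
    # a documented rDNS suffix.
    KNOWN_BOT_SUFFIXES = {
        "Googlebot": (".googlebot.com", ".google.com"),
        "GPTBot": (".openai.com",),  # hypothetical suffix
    }

    def forward_confirmed_rdns(ip: str, suffixes: tuple) -> bool:
        """Forward-confirmed reverse DNS:
        1. Reverse-resolve the client IP to a hostname.
        2. Require a suffix the bot operator actually controls.
        3. Forward-resolve that hostname; the original IP must come back.
        A scraper faking the user agent from a random VPS fails 2 or 3.
        """
        try:
            host, _aliases, _addrs = socket.gethostbyaddr(ip)
        except (socket.herror, socket.gaierror):
            return False
        if not host.endswith(suffixes):
            return False
        try:
            _name, _aliases, addrs = socket.gethostbyname_ex(host)
        except socket.gaierror:
            return False
        return ip in addrs

    def looks_legitimate(user_agent: str, client_ip: str) -> bool:
        ipaddress.ip_address(client_ip)  # raises ValueError on junk input
        for token, suffixes in KNOWN_BOT_SUFFIXES.items():
            if token in user_agent:
                return forward_confirmed_rdns(client_ip, suffixes)
        return False  # unrecognized bot token: treat as unverified

The forward-confirmation step is the part that matters: anyone can put a PTR record claiming "googlebot.com" on their own IP, but only the real operator can make the forward lookup of that hostname resolve back to the same address.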



