
(disclaimer: I wrote that post)

It is not. We rely on more than User-Agent strings because they are too often faked, so it is not just marketing. There are other signals we see that confirm whether the request came from a "legitimate" AI scraper, or a different scraper with the same user agent.



> There are other signals we see that confirm whether the request came from a "legitimate" AI scraper, or a different scraper with the same user agent.

Great! What are these signals? That seems to be the meat of the post but it's conspicuously absent. How are we supposed to validate the post?


> How are we supposed to validate the post?

Imagine you were a vendor who was trying to trick the author into divulging his methods. Can a stranger on the Internet be trusted?


I imagine if that information is disclosed, you won’t be able to verify it, as it will be bypassed… because it was disclosed.


What a wonderful world we live in where serious people are expected to believe press releases based purely on brand prestige.


There are a lot of assumptions in that comment.


So: user agent, plus a whois lookup to see if the request is coming from a plausible netblock, plus Accept: strings and other HTTP header stuff?
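
For anyone curious what the netblock half of that looks like in practice, here's a minimal sketch of forward-confirmed reverse DNS, which is the check Google documents for verifying Googlebot. To be clear: this is my guess at one plausible signal, not the post author's actual method. The KNOWN_BOT_SUFFIXES map and the GPTBot suffix in particular are illustrative assumptions; some vendors publish IP ranges instead of rDNS suffixes, so a real implementation would pull from each vendor's docs.

    import ipaddress
    import socket

    # Illustrative mapping only. Google documents googlebot.com/google.com
    # hostnames for Googlebot; the OpenAI suffix here is a hypothetical
    # stand-in, since OpenAI publishes IP ranges for GPTBot rather than
    # a documented rDNS suffix.
    KNOWN_BOT_SUFFIXES = {
        "Googlebot": (".googlebot.com", ".google.com"),
        "GPTBot": (".openai.com",),  # hypothetical suffix
    }

    def forward_confirmed_rdns(ip: str, suffixes: tuple) -> bool:
        """Forward-confirmed reverse DNS:
        1. Reverse-resolve the client IP to a hostname.
        2. Require a suffix the bot operator actually controls.
        3. Forward-resolve that hostname; the original IP must come back.
        A scraper faking the user agent from a random VPS fails 2 or 3.
        """
        try:
            host, _aliases, _addrs = socket.gethostbyaddr(ip)
        except (socket.herror, socket.gaierror):
            return False
        if not host.endswith(suffixes):
            return False
        try:
            _name, _aliases, addrs = socket.gethostbyname_ex(host)
        except socket.gaierror:
            return False
        return ip in addrs

    def looks_legitimate(user_agent: str, client_ip: str) -> bool:
        ipaddress.ip_address(client_ip)  # raises ValueError on junk input
        for token, suffixes in KNOWN_BOT_SUFFIXES.items():
            if token in user_agent:
                return forward_confirmed_rdns(client_ip, suffixes)
        return False  # unrecognized bot token: treat as unverified

The forward-confirmation step is the part that matters: anyone can put a PTR record claiming "googlebot.com" on their own IP, but only the real operator can make the forward lookup of that hostname resolve back to the same address.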



