Hacker News

The scrapers, by violating your wishes, are doing something they shouldn't. My comment isn't about that. What I said doesn't make the scraper any less wrong.

I'm basically saying two wrongs don't make a right here.

Trying to harm their system, which might transitively harm someone using their system, is unethical from my viewpoint.



So you're suggesting that, as a website operator, I should do nothing to resist and pay a large web hosting bill so that a company I've never heard of can benefit? That is more directly harmful than this hypothetical third-party harm. What about my right to defend myself and my property?


You should block them; that is the ethical option.


If that worked this wouldn't be a discussion.

Most of these misbehaving crawlers are either cloud hosted (with tens of thousands of IPs), using residential proxies (with tens of thousands of IPs), or straight up using a botnet (again with tens of thousands of IPs). None respect robots.txt, and precious few even provide an identifiable user-agent string.


As explained in the linked article, these bots have no identifiable properties by which to block them other than their scraping behavior. Some bots send each individual request from a separate origin.



