Google's reCAPTCHA makes it impossible to use large portions of the web once you take reasonable measures to protect your privacy. The challenge will continuously fail, despite the time you spend carefully solving it. This cruel behavior is described in a patent [1] by Kyle Adams of Juniper Networks.
I'm logged in on a Chrome browser with a residential IP and get a reCAPTCHA 1-3 times a day when programming. It's the kind where I don't need to solve puzzles but still need JS enabled to click the button. So after the first few (SEO-optimized) pages I either get fingerprinted or temp banned. Ugh, this is getting ridiculous.
Cloudflare must be mentioned when talking about reCAPTCHA and cancer. They are the ones locking people out from whole websites and forcing you to fill out these reCAPTCHAs. They are also the ones who have almost destroyed browsing the internet using Tor due to these reCAPTCHAs.
While I agree with you -- I'd also like to point out that >90% of malicious traffic to the websites I administer comes through the Tor network.
It shouldn't be the case, and I don't want to block people who have a legitimate reason to use Tor. Unfortunately there isn't a "block Tor traffic from assholes" option, so all I can really do to reduce the malicious traffic is block exit nodes.
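For what it's worth, "block exit nodes" in practice just means checking client IPs against the published exit list. A rough sketch (the Tor Project's bulk exit list URL is correct as far as I know, but verify it before relying on it):

    import urllib.request

    EXIT_LIST_URL = "https://check.torproject.org/torbulkexitlist"

    def load_exit_nodes():
        # The endpoint returns one exit-node IP per line.
        with urllib.request.urlopen(EXIT_LIST_URL) as resp:
            return {line.strip() for line in resp.read().decode().splitlines() if line.strip()}

    exit_nodes = load_exit_nodes()  # refresh periodically; the list changes all the time

    def should_block(client_ip):
        return client_ip in exit_nodes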
This has nothing to do with Tor. Cloudflare frequently blacklists entire countries' or counties' worth of people (and rarely reverts those blacklists). There is a good chance that you have missed a lot of Indian/Vietnamese/Russian/Chinese visitors because Cloudflare concluded that forwarding their traffic to your site isn't financially viable for them.
> Unfortunately there isn't a "block Tor traffic from assholes" option
What exactly is "Tor traffic from assholes"? Bulk DDoS attacks? E-mail spam? SSH login attempts? Please share your valuable experience with everyone here, so that all of us could stay safe by learning from your example.
And for companies that don't do business with those countries - this is not a loss.
Most "asshole" traffic I see falls into one of two categories - attempts to exploit vulnerabilities (../../../etc/passwd stuff) and account takeover attacks.
The first I can forgive; frankly, I don't care where that traffic comes from, and the responsibility is entirely mine as website admin to prevent these types of attacks through good coding practices, a WAF, etc.
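To make the "good coding practices" bit concrete, here's the sort of small guard I mean for the ../../../etc/passwd case when serving user-supplied file names (an illustrative, hypothetical helper, not any particular framework's API):

    from pathlib import Path

    BASE_DIR = Path("/var/www/uploads").resolve()

    def safe_open(requested):
        # resolve() collapses "..", so anything that escapes BASE_DIR is rejected outright.
        candidate = (BASE_DIR / requested).resolve()
        if BASE_DIR not in candidate.parents:
            raise PermissionError("path traversal attempt: %r" % requested)
        return candidate.open("rb")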
The second I have less control over, because customers / the general public suck at security. They re-use passwords they've had for 10 years and won't opt in to 2FA. And as a merchant, my company generally eats the cost of the fraud that these attacks result in.
If no or little legitimate traffic is coming from Tor, and a significant percentage of malicious traffic is coming from Tor - at great cost to me / my company - why the hell would I allow it to continue?
One simple solution I can think of is to restrict POST requests from Tor exit nodes while still allowing GET requests. Cloudflare will give you an impossible-to-solve captcha even if you just try to visit site.com/index.html, and I see no reason for this.
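Something like this minimal WSGI middleware sketch is what I have in mind; the exit-node set is hard-coded with placeholder TEST-NET addresses here, but in practice it would be loaded from the Tor bulk exit list:

    # Hypothetical middleware: reads stay open to everyone, only state-changing
    # methods coming from exit nodes get refused.
    tor_exit_ips = {"203.0.113.7", "198.51.100.23"}  # placeholder addresses

    class BlockTorWrites:
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            ip = environ.get("REMOTE_ADDR", "")
            method = environ.get("REQUEST_METHOD", "GET")
            if ip in tor_exit_ips and method not in ("GET", "HEAD"):
                start_response("403 Forbidden", [("Content-Type", "text/plain")])
                return [b"Write access is not available over Tor.\n"]
            return self.app(environ, start_response)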
Is the issue Tor traffic, or that you know what traffic is Tor?
There are many types of "abuse" (not just trolling), such as mass downloading/scanning. (E.g. several types of port scanning can't be done via Tor, since it doesn't support UDP.)
I see that your heart is in the right place, but I think as web developers we should take a blood oath that we will always optimize for standard compliance, instead. And for a standard that is not a moving target, while we're at it.
But when do we move on? When most browsers implement something the same way, or when all do? What about polyfills? What do you do when you need a new API to better support a user's device with a new form factor, interaction model, wide colour gamut, resolution, background threads, etc.? Tell them to not upgrade? Stop the world? It seems impractical to suggest "target a standard: job done, go home..."
If we target standards, then the standards are driving. The browser gets supported when it builds to the standards. Perhaps the issue will then be getting standards in place quickly around new capabilities?
Then maybe the standards process needs disruption. But if we don't build to standards then we are building roads that only certain cars can drive.
This is unfortunately not true - browsers are driving. Especially when the entity everyone uses (Google) also owns the most popular browser. They can, and did, implement non-standard features that only worked in Chrome. Super cool tech demos, you have to see it, just install this browser from an advertising company. What could go wrong?
Well, considering that Google already specifically blocks Chromium-based Edge from its current YouTube version, maybe reCAPTCHA will stop working in it soon too.
Google is not blocking Edge, or at least we have no proof of that. In this instance I think it's safe to assume an oversight based on naive user-agent whitelisting.
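To illustrate what I mean, a hypothetical sketch (not YouTube's actual code) of how a stale user-agent allowlist fails the new Edge even though it runs the same engine:

    def browser_family(user_agent):
        if "Edg/" in user_agent:
            return "edge-chromium"  # new token, easy to miss in an old allowlist
        if "Chrome/" in user_agent:
            return "chrome"
        if "Firefox/" in user_agent:
            return "firefox"
        return "unknown"

    SUPPORTED = {"chrome", "firefox"}  # allowlist written before Chromium Edge existed

    def serve_modern_ui(user_agent):
        return browser_family(user_agent) in SUPPORTED

    edge_ua = "Mozilla/5.0 ... Chrome/79.0.3945.88 Safari/537.36 Edg/79.0.309.56"
    print(serve_modern_ui(edge_ua))  # False, purely because of the UA string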
And before I get accused of shilling, I hate chrome and despise Google with a passion.
Do you know of any good alternatives? I would love to get rid of reCAPTCHA, but it is a very convenient and quick-to-set-up way to stop most spam bots.
Remember that reCAPTCHA v1 used to be noble: reading books and converting them to text.
Now you're just training Google's many machine learning algorithms by classifying data, which makes them more useful for the consumer, and thus more powerful.
I hate them as much as you do, but you're wrong. Those storefront and traffic sign captchas are not useful for training ML models. If they were to be useful, they would be much more varied, like the original ones (used for OCR).
>Those storefront and traffic sign captchas are not useful for training ML models.
Not to get all tin-foil-hat (though this is going to sound like it), but if you have a car with 9+ cameras on it that drives in areas full of these, then maybe there would be some use for it for Google.
Bear in mind that I'm not saying that they are doing this, but to dismiss it unequivocally as something that can't or wouldn't be done entirely ignores the premise that it could prove useful to other areas of their business, which might have a vested interest in such use (say, for example, if Google or its parent company were trying to break into the self-driving car area[0]).
I would love to see some evidence (a link or something) of this. I see captchas that look like pretty good edge-detection discriminators: street lights in tree limbs, bicycles against brick, and so on.
Since they introduced the square-selecting captchas I have always assumed that they use it for identifying the user. I bet that depending on how you solve the captchas they can identify who you are if their system already has a theory of who you might be.
They're implemented this particular way to provide training data for image segmentation systems. They move the image around inside the frame, which allows them to use a few people doing the challenge to create a boundary representation that can be used to train things like YOLO-style ML systems.
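An illustrative sketch of that aggregation idea (made-up image size, tile size, and offsets; not Google's actual pipeline): map each user's selected tiles back into base-image coordinates and keep a per-pixel vote count, and the pixels most users agree on approximate the object's extent.

    import numpy as np

    TILE = 100
    heat = np.zeros((600, 600))  # vote map in base-image coordinates

    def add_response(offset_xy, selected_tiles):
        # offset_xy: (dx, dy) shift of the crop shown to this user;
        # selected_tiles: set of (row, col) grid cells they clicked.
        dx, dy = offset_xy
        for r, c in selected_tiles:
            y0, x0 = r * TILE + dy, c * TILE + dx
            heat[y0:y0 + TILE, x0:x0 + TILE] += 1

    # A few users see crops at different offsets and click the tiles containing the object.
    add_response((0, 0), {(1, 1), (1, 2)})
    add_response((50, 0), {(1, 1)})
    add_response((0, 50), {(0, 1), (1, 1)})

    mask = heat >= 2  # pixels most users agreed on
    ys, xs = np.where(mask)
    print("approx bounding box:", xs.min(), ys.min(), xs.max(), ys.max())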
They are able to verify that the user's selection is correct. That is possible only if they already have the right answer. If they already have the right answer, what are they training for?
They have some known right answers and some they don't know. They check that you get the ones they know correct, and then they take the other info you provide and add some confidence that they are correct. This bootstraps the system.
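Roughly like this (an illustrative sketch with invented tile names, vote weights, and thresholds; not Google's implementation):

    known = {"tile_a": True, "tile_b": False}  # tiles whose answers the system trusts
    votes = {"tile_c": 0, "tile_d": 0}         # candidate tiles, accumulating evidence
    PROMOTE_AT = 3                             # net votes needed before an answer is trusted

    def submit(selected_tiles):
        # 1. Grade the user on the tiles with known answers.
        if any((t in selected_tiles) != answer for t, answer in known.items()):
            return False  # wrong on a known tile: the challenge fails
        # 2. The user passed, so their choices on the unknown tiles count as evidence.
        for t in list(votes):
            votes[t] += 1 if t in selected_tiles else -1
            if abs(votes[t]) >= PROMOTE_AT:
                known[t] = votes.pop(t) > 0  # enough agreement: promote to a known answer
        return True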
The audio CAPTCHA always works first try for me. The image CAPTCHA can go eff itself; it would always take me five tries while the images loaded super slowly.
Yes, the audio CAPTCHA is easier to solve, but the audio challenge is blocked [1] if you are not in a good network neighbourhood or they can't collect enough tracking data to classify your visit.
Can confirm, it's rare for me to be able to get at the audio captcha. Occasionally I'll find that tabbing onto the button allows it to load when clicking directly on it won't. I assume if Google is observing behavior that makes them think you're sighted, they'll block access.
I kind of wonder if it would be possible to force the issue legally as an accessibility problem, but other people than me would need to do it, and in any case it feels kind of dirty to me to use blind accessibility as a tool in the fight for privacy.
On the other hand, it also feels dirty to me that being blind would mean you're not allowed to do as much on the web to protect your privacy. Blind people should be able to use Tor.
> Can confirm, it's rare for me to be able to get at the audio captcha. Occasionally I'll find that tabbing onto the button allows it to load when clicking directly on it won't. I assume if Google is observing behavior that makes them think you're sighted, they'll block access.
That would be very cruel to those who have vision but nothing close to perfect vision, or vision correctable through glasses. It would also ignore those who have poorer vision as well as difficulties in recognizing patterns. There's a whole spectrum of accessibility issues, and trying to "fail people" who seem to have enough vision to click on an audio button would be the definition of being evil.
> I kind of wonder if it would be possible to force the issue legally as an accessibility problem, but other people than me would need to do it, and in any case it feels kind of dirty to me to use blind accessibility as a tool in the fight for privacy.
Even if this is not possible legally in all jurisdictions, enough publicity and outrage could help. There should certainly be some journalists from major publications/sites reading HN (or HN readers with journalist contacts) who can investigate and write about this.
I dislike Google reCAPTCHA; however, it brought contact form and comment spam down to almost zero. (At the price of an unknown number of false positives and some frustrated users.)
[1] https://patents.google.com/patent/US9407661