Pwned or Bot

commitpizza · on Jan 19, 2023

I pay for my email that gives me a lot of aliases and most of them have not been pwned yet. So with his tool I would be flagged as a bot. Honestly, doesn't sound like a great idea to be frank.

There must be large swaths of people that have either been careful or have specific emails that they use for certain purposes that haven't been pwned.

The question is: what should happen if I haven't been pwned? Should I not be able to purchase the thing or would I face some annoying captcha?

I like Troy Hunt, but this idea penalizes people with good habits and that is just something I can't support.

remram · on Jan 19, 2023

This also seems to fast-track stolen accounts, by design. What a weird idea.

rippercushions · on Jan 19, 2023

It's not his idea, he's saying that there are people out there who are already (mis)using the data for this.

jtvjan · on Jan 19, 2023

Sort of. He does encourage this use-case in the final paragraph.

> Applying "Pwned or Bot" to your own risk assessment is dead simple with the HIBP API and hopefully, this approach will help more people do precisely what HIBP is there for in the first place: to help "do good things after bad things happen".
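For reference, a minimal sketch of what that lookup might look like. The endpoint path and the `hibp-api-key` / `user-agent` headers are as I recall them from the HIBP v3 API docs, so check them against the current documentation; the user-agent value and function names are made up for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request

HIBP_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_request(email: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the breach lookup request for one address."""
    url = HIBP_URL + urllib.parse.quote(email.strip(), safe="")
    headers = {
        "hibp-api-key": api_key,              # key from a paid HIBP subscription
        "user-agent": "pwned-or-bot-sketch",  # hypothetical application name
    }
    return urllib.request.Request(url, headers=headers)


def seen_in_breach(status_code: int) -> bool:
    """200 means at least one breach was found, 404 means none."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    # Rate limiting, bad key, etc. are not an answer either way.
    raise RuntimeError(f"unexpected status {status_code}")


def lookup(email: str, api_key: str) -> bool:
    """Send the request; urlopen raises HTTPError for 404, so handle both paths."""
    try:
        with urllib.request.urlopen(build_request(email, api_key)) as resp:
            return seen_in_breach(resp.status)
    except urllib.error.HTTPError as err:
        return seen_in_breach(err.code)
```

The result is only one input to a risk assessment, as the quoted paragraph says; a `False` here is not evidence of fraud on its own.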

taco_emoji · on Jan 19, 2023

Yeah it seems clear to me that he's recommending it to be one portion of a risk assessment for a given email address.

mike_d · on Jan 19, 2023

This is a common investigative technique that predates HIBP; however, more people are starting to automate it now (using non-HIBP datasets). I think this, combined with the new request-based pricing on the HIBP API, implies he just wants to make some money off being the quick-to-implement 75% solution.

Semaphor · on Jan 19, 2023

Edit: I misunderstood Troy.

Original comment:

No, it doesn't penalize them (at least not his idea, implementations might), it simply fast tracks pwned emails and doesn't apply the normal bot checks that would otherwise apply to everyone.

cobertos · on Jan 19, 2023

That's not how he's suggesting it would work. All checks would normally be applied to build a "how human are you" or "humanness" score. He's suggesting a pwned email test and arguing it would be a good signal for "humanness". The implementation might not make it an explicit penalty (-1 to your "humanness" score), but not being pwned might not help your case (+1 if you are pwned, but +0 if you're not).
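A rough sketch of that "+1 if pwned, +0 if not" idea, purely as an illustration. The other signal names and all the weights are hypothetical, not anything from the article.

```python
# Hypothetical positive signals; each one can only add to the score.
WEIGHTS = {
    "pwned_email": 1,     # address appears in at least one known breach
    "residential_ip": 1,  # made-up extra signal
    "aged_cookies": 1,    # made-up extra signal
}


def humanness_score(signals: dict[str, bool]) -> int:
    """Sum the weights of the signals that are present; absent ones add 0."""
    return sum(w for name, w in WEIGHTS.items() if signals.get(name, False))
```

With this shape, `humanness_score({"pwned_email": False})` is 0 rather than -1: an unpwned address is never explicitly penalized, it just misses out on the bonus.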

TheCleric · on Jan 19, 2023

Yeah it would definitely be good to integrate it into a Bayesian approach where it is mixed with other factors to generate a % chance of being human vs. bot.
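Something like the following naive Bayes sketch, assuming the signals are independent given the class (they probably aren't). Every likelihood below is invented for illustration; real values would have to come from labelled traffic.

```python
import math

# signal -> (P(signal present | human), P(signal present | bot)); invented numbers
LIKELIHOODS = {
    "pwned_email": (0.80, 0.20),
    "tor_exit_ip": (0.02, 0.30),
}


def p_human(observed: dict[str, bool], prior: float = 0.5) -> float:
    """Combine observed signals into a posterior probability of 'human'."""
    log_odds = math.log(prior / (1 - prior))
    for name, present in observed.items():
        p_h, p_b = LIKELIHOODS[name]
        if not present:
            # Absence of a signal is evidence too, with complementary likelihoods.
            p_h, p_b = 1 - p_h, 1 - p_b
        log_odds += math.log(p_h / p_b)
    return 1 / (1 + math.exp(-log_odds))
```

Unlike a "+1 or +0" score, this formulation does move the estimate down when the address is not in any breach, so how strongly it does so depends entirely on the likelihoods you plug in.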

vachina · on Jan 19, 2023

I wonder how many pwned email and password pairs still match. Crooks can take control of these pwned accounts and pretend to be trustworthy.

micahdeath · on Jan 19, 2023

It depends on the risk. I have an account that was pwned (with the same password) but there is no risk to me as there isn't anything useful in that account (not even a DoB, address or even a name). Worst case, someone changes the password and locks me out. Then I'll create another account as it's not a big deal.

jcuenod · on Jan 19, 2023

The point would not be that it's a threat to you (though it may be), it's that compromised accounts (like one you don't care about) are a threat to an ecosystem that can't identify whether a "user" is a human or a bot.

That is, your compromised account could be used in an attack and it would look like a human.

Mikealcl · on Jan 19, 2023

I agree, but the cat's out of the bag.

CGamesPlay · on Jan 19, 2023

Facebook and Twitter are basically closed to new users. If you've gone this far without an account, your new one will be shut down for being a bot within hours of creating a new account, or flagged for "extra verification" which requires sending a government ID to these companies so they can verify that you didn't photoshop a fake government ID.

This new approach seeks to extend this feature to the entire internet. What could possibly go wrong?

71bw · on Jan 19, 2023

I create a new account every 6 months or so on Facebook when my old one gets banned for "violating the community guidelines" and I haven't been asked for an ID ever since 2019. Twitter, though, is way worse and I had to give up and I'm currently just buying aged accounts. Violates even more parts of the ToS than just ban evasion but at least the accounts last for years instead of weeks.

randomguy0 · on Jan 19, 2023

What in the world are you doing to keep getting banned and to also want an account enough that you’ll pay for them?

ridgered4 · on Jan 19, 2023

I gave up on facebook (but I wasn't trying that hard) but it seemed to be using the extortion practices a lot of services use now. At first it appears to let you create an account but upon logging in for the first time it demands a phone number for 'verification'. Microsoft was even worse when they migrated my mojang account, it let me use it for a little while before demanding the number.

Back when I had a facebook account I recall it suddenly up and demanding I scan my drivers license one day or I couldn't log in again...on the same and only machine I actually had used facebook on.

thorin · on Jan 19, 2023

What do you have to do to get banned? I've seen people getting banned temporarily but I haven't heard of anyone getting banned permanently.

pookha · on Jan 19, 2023

If you don't like somebody on Facebook you can report them for offensive content. I posted something that some asshole facebook friend didn't appreciate and they dug through my facebook feed and found one image that was borderline (somebody in politics in their drawers) and sent in a complaint to facebook. Got my account banned for a first strike. Same douche could have sent in more complaints and facebook would have happily given me three strikes. It's a Stasi system. Don't like your boss or your neighbor, report them...

dncornholio · on Jan 19, 2023

What if my boss or neighbour didn't post anything that's against the TOS?

Nextgrid · on Jan 19, 2023

It's not about what's against the ToS, it's about getting the monkeys who review the reports to judge that it's against the ToS. Given their working conditions, they have little incentive in making an accurate determination and may just be pressing buttons at random, so spamming enough reports will eventually yield a ToS violation even on perfectly clean content.

jerf · on Jan 19, 2023

Your question contains the implicit assumption that "TOS" is some bright shining line that everyone, from all posters, to all of the AIs and humans analyzing whether something conforms, completely agrees with. Therefore, "just don't break the TOS" is a reasonable solution.

This is manifestly and obviously false, in numerous ways. I don't even need to cite capriciousness, cultural differences, or potential political bias; even ignoring those things, it simply isn't and can not ever be a bright shining line.

This is even before we consider that TOSs have been known to retroactively change. YouTube just made such a change; it doesn't affect whether the videos are removed, but they retroactively changed the monetization standards, with large effect. "Just don't break the TOS" is a non-starter in such an environment.

nhtsamera · on Jan 20, 2023

That would be unusual, so you could probably report them for being a bot.

After all, Meta's TOS stipulates that you must provide accurate information about yourself, and that you cannot share anything that is misleading.

It also prohibits making groundless reports or appeals, so maybe you could take an eye for an eye if you get unfairly targeted.

subradios · on Jan 19, 2023

They absolutely have, just gotta dig.

Remember the wave of people being in hot water over tweets sent in 2008?

Buffout · on Jan 19, 2023

> so they can verify that you didn't photoshop a fake government ID.

Huh. How?

MonkeyClub · on Jan 19, 2023

By taking a publicly available template, and putting in your details and mugshot.

Buffout · on Jan 19, 2023

I mean how can Facebook check government ID if it is legit? ( not how to photoshop ID. )

Diggsey · on Jan 19, 2023

You generally can't actually check the government databases directly, but you can still determine this.

First, companies can catch most fraudulent documents simply by looking at the document (eg. are the fonts all correct, does the checksum on the MRZ add up, does the data in the MRZ match the data on the face of the document, does the data on the document match previously collected data about the individual, etc.) Some will go further by combining this with a "liveness" check (eg. they might ask you to take a picture of yourself in a certain pose, or to record a short video looking side to side)

Second, companies can use a soft credit check (if authorised by the user, which would need to be in the fine print when you sign up or when you are asked for such a document). Such a credit check won't affect your credit score, but can be used by companies to see if an individual with your details exists. Companies which offer such credit data in the UK/US/other western countries typically boast of 90-95% match rates across a population, but obviously younger people are less likely to be found since they are less likely to have a credit history. This is typically aggregated with data from non credit sources (electoral roll information, county court judgements, etc.) to reach those high match rates. They might also geo-locate the IP address from which you accessed their site and compare it against any address information they have on you (which could come from you providing it on sign up, it could be extracted from the document if it's a driving license or something, or it could come from any credit records they found relating to you)

For Facebook specifically, they might look at other online activity - other social media accounts they can link to you, etc. And throw all of that into the mix.

If at the end of all that they don't have a clear answer, they might fall back to a manual process, or allow the account to be created but have content posted by the account flagged for manual review.

jpmattia · on Jan 19, 2023

> eg. are the fonts all correct, does the checksum on the MRZ add up, ...

Is that hard?

A quick googling shows websites that will generate a California driver's license for virtually no money, so I'd assume someone with decent programming skills should be able to put together a generator.

See eg https://www.verif.tools/en/dl_ca/

Buffout · on Jan 19, 2023

That's the easy part. Checksum algorithms are public knowledge.
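For example, the MRZ check digit scheme from ICAO Doc 9303 is just a weighted sum mod 10. A minimal sketch (the function name is mine; the weights 7, 3, 1 and the character values are from the spec as I understand it):

```python
def mrz_check_digit(field: str) -> int:
    """Compute the check digit for one MRZ field (document number, date, ...)."""
    weights = (7, 3, 1)  # repeated cyclically across the field
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


print(mrz_check_digit("L898902C3"))  # 6
print(mrz_check_digit("740812"))     # 2
```

Which is the point: passing the checksum only proves the forger can do arithmetic, so it can only ever filter out sloppy fakes.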

htrp · on Jan 19, 2023

There are actual ID lookup verification services...

ezfe · on Jan 19, 2023

I created a Facebook account a few months ago to use Marketplace. The profile has only a name and unique-to-facebook email. I always use it in Firefox Containers.

Still active, and I've sold a handful of things with it.

dustedcodes · on Jan 19, 2023

> If you've gone this far without an account

So they admit that new generations are not interested in FB or Twitter and they will die with the boomer generation? If not then this logic makes little sense :)

jwally · on Jan 19, 2023

I'll be a contrarian: I like it.

Is it a black and white silver bullet one call destroys 'em all solution? Not even close. But, like he states in his article, from a "defence in depth" perspective it's another strong signal.

Are you a bad guy just because you have a weirdo email (which I do)? No.

Are you a bad guy just because you use tor? No.

Are you a bad guy just because you're trying to make a purchase during an extreme surge? No.

Are you PROBABLY a bad guy given a weirdo email, you're on tor, and you're trying to buy during a surge in purchases? I would say yes. I might not ban you outright, but you're going to jump through a lot more hoops than someone with an ancient email and a residential ip address.
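As a toy sketch of that policy: no single signal triggers anything, but stacking them escalates the friction. The thresholds and the tier names are my own assumptions for illustration.

```python
def friction_level(unseen_email: bool, tor_exit: bool, during_surge: bool) -> str:
    """Map the number of coinciding risk signals to a level of extra hoops."""
    risk_signals = int(unseen_email) + int(tor_exit) + int(during_surge)
    if risk_signals == 3:
        return "extra_verification"  # lots of hoops, but not an outright ban
    if risk_signals == 2:
        return "captcha"
    return "normal"  # zero or one signal: treated like anyone else
```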

thih9 · on Jan 19, 2023

> you're going to jump through a lot more hoops than someone with an ancient email and a residential ip address.

I understand this kind of reasoning.

At the same time I see a potential to snowball. This will encourage people to move away from weird addresses. Which will make it an even more effective filter and will justify stricter measures. So more people will move away. Etc.

jwally · on Jan 19, 2023

That's a really good point. I'm working through this space right now so I'm kind of myopic to stuff like this.

I use a self hosted VPN (digiOcean); but under duress, I'd be a jerk to me. tbh; most sites are, lol. I've given up youtube and google because I am reCaptcha'd to death...

To your actual point, I don't think it would be a deal killer per se in implementation. Weirdo@Weirdo.com isn't blocked because they show up in Troy's list of known emails.

Fakebook@Weirdo.com is suspicious in this model because it has not been seen before.

huggingmouth · on Jan 19, 2023

Wouldn't bad actors just push their fake email addresses to haveibeenpwned in fake leaks? Steps:

1. Periodically set up a legitimate-looking service, possibly proxying real services.
2. Wait a year or two for your fake service to permeate throughout the www and for search engines to index it.
3. Mix your bot email addresses with legitimate previously pwned addresses.
4. Proclaim "woe is me, for thyself hasth been pwned".

You can set up this process so that you can inject a couple 100k bot email addresses periodically every couple of months.

This is an incredibly shortsighted idea with the potential to hurt a lot of innocent people.

pygy_ · on Jan 19, 2023

It is going to happen, and some people will make money off it by farming such addresses, but it raises the time and the cost to obtain a plausible email address for fraud.

themoonisachees · on Jan 19, 2023

At that point you'd be better off making those emails and signing up to a bunch of services. Bot emails aren't fresh for 2 years, and if they are somebody isn't doing their job properly.

fudgefactorfive · on Jan 19, 2023

I think the point is bot emails shouldn't be fresh.

Same way some people just set up businesses with random names in tax-shelter territories and sell the company 10 years later to add a sense of legitimacy.

arjvik · on Jan 19, 2023

This is a cute "hack" for bot detection, but it's too unpredictable for the real world. Far too many users with good security hygiene are penalized by this system.

Plus, this might incentivize hackers to defeat the system by logging into and using email accounts pwned in these breaches.

nibbleshifter · on Jan 19, 2023

> Plus, this might incentivize hackers to defeat the system by logging into and using email accounts pwned in these breaches.

This already happens at a large scale anyway.

There's hundreds, if not thousands of "account shops" and sellers online selling hacked accounts for all sorts of services. Everything from Spotify to Twitter to news sites.

They ingest new breaches (or use automated tools to go hack sites and dump databases), and automatically test the leaked credentials against loads of shit using tools like OpenBullet or SentryMBA.

Those tools even integrate rotating proxies, captcha solvers, etc.

There's a few good talks on this, credential spraying and account shops.

nerdponx · on Jan 19, 2023

I actually thought this was going to be the topic based on the title: distinguishing between entirely fake accounts, and pwned real accounts.

rippercushions · on Jan 19, 2023

The only security hygiene that can stop your email from leaking is using a different address for literally every service you ever log into. This is of course possible with your own domain, but in practice totally infeasible for the vast majority of people.

actualwitch · on Jan 19, 2023

Apart from iCloud, this is also available to Fastmail users, so no, it's not "totally infeasible for the vast majority of people".

yzydserd · on Jan 19, 2023

I now do this with iCloud Hide My Email. It’s very easy to do.

I’ve started converting all my heritage details for already registered accounts.

idonotknowwhy · on Jan 19, 2023

I've been doing this for 6 years now. Every service, bank and even person gets a separate address.

Takes less than 2 minutes to create one with my paid mail provider.

On 2 occasions, I knew a system was compromised before an announcement because suddenly I was getting spam to the specific email address.

NazakiAid · on Jan 19, 2023

Totally agree with this. It's cool to have this data but people using shared VPNs and unique emails will be penalized.

leoxiong · on Jan 19, 2023

> If an email address hasn't been seen in a data breach before, it may be a newly created one especially for the purpose of gaming your system.

I’ve started using iCloud Hide My Email which generates a random email that forwards to my account email. This assumption is going to cause issues.

thih9 · on Jan 19, 2023

Next sentence addresses that:

> (…) or even using a masked email address service such as the one 1Password provides through Fastmail. Absence of an email address in HIBP is not evidence of possible fraud, that's merely one possible explanation.

yuliyp · on Jan 19, 2023

It can, but that's kind of the nature of anti-spam systems these days. Come in on a Tor IP with a randomly-generated burner e-mail with a Curl user-agent and you're gonna get blocked from almost anything that spammers have an interest in. Come in on the e-mail address, aged cookies, and a geolocation associated with your credit card for years and you're gonna be fine. Do things in the middle and expect some amount of false positives.

jcuenod · on Jan 19, 2023

This feels a lot like email providers assuming that if you're running your own mail server, you must be spamming people.

This depends on the lack of use of good tools like FF's relay to anonymize accounts. I mean, HIBP is great, but Troy is self-consciously not interested in handling subaddressing, which would improve his service and its (mis)use in detecting "humanness".

WorldMaker · on Jan 19, 2023

> but Troy is self-consciously not interested in handling subaddressing, which would improve his service

I don't think Troy is not interested in handling subaddressing in the general sense, I think he just dismisses it as "not worth the time" given current statistics.

If it is worth the time and you were writing one of these "Pwned or Bot" "email credit score" detectors, it is easy: you could easily strip +whatever before an @ and check if that exists as well. (Check both!)
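A minimal sketch of that "check both" step, assuming plus-style subaddressing only (catch-all domains can't be handled this way). Lowercasing the whole address is a simplification on my part, and the function name is hypothetical.

```python
def lookup_candidates(email: str) -> list[str]:
    """Return the address as given plus, if subaddressed, its base address."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    candidates = [f"{local}@{domain}"]
    base = local.split("+", 1)[0]  # drop "+whatever" from the local part
    if base and base != local:
        candidates.append(f"{base}@{domain}")
    return candidates


print(lookup_candidates("Jane+shop@Example.com"))
# ['jane+shop@example.com', 'jane@example.com']
```

Each candidate would then be looked up separately, and a hit on either one counts as the address having been seen before.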

> which would improve his service

It's not actually his service he's talking about in this particular article. He doesn't run an explicit "Pwned or Bot" "email credit score" service. He's pointing out it is an interesting use of the HIBP API and also to do it right it needs some sort of value add/scoring system, which he hints at ways to do that but does not provide one (and especially not as a service).

HIBP itself doesn't support subaddressing as a feature, but that's on purpose for a different reason: many of the people that use subaddressing, especially consistent users, use HIBP to narrow down specific account threats and it is useful to them today that HIBP tracks all of their subaddresses independently.

sgarman · on Jan 19, 2023

I think it's because they do: https://cfenollosa.com/blog/after-self-hosting-my-email-for-...

mqus · on Jan 19, 2023

Maybe I'm an outlier, but the e-mail address I've used for online payments and shops for over 10 years now has not been pwned. Maybe because I don't use this email for other sites where no money is involved or for social media. But I think HIBP is not a great bot indicator.

operator-name · on Jan 19, 2023

So the crux of the technique is to roughly date how long an email has existed for, using leaked databases as a timestamping measure. I'm not sure this metric is a good one though, as older and importantly "pwned" emails are far more likely to have been taken over.

Without an idea of the percentage of emails that are still in the original owners' hands, this risks a high false negative rate.
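To make the timestamping idea concrete, a small sketch: the earliest breach an address appears in gives a lower bound on how long it has existed. It assumes breach dates arrive as ISO `yyyy-mm-dd` strings (which is how I recall HIBP's `BreachDate` field looking); the function name is hypothetical.

```python
from datetime import date
from typing import Optional


def minimum_age_days(breach_dates: list[str], today: date) -> Optional[int]:
    """Lower bound on the address's age in days, or None if never breached."""
    if not breach_dates:
        return None  # no timestamp available; says nothing about the real age
    earliest = min(date.fromisoformat(d) for d in breach_dates)
    return (today - earliest).days
```

Note that this only dates the address, not its current owner, which is exactly the takeover problem described above.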

sysadm1n · on Jan 19, 2023

> This is called "sniping", where an individual jumps the queue and snaps up products in limited demand for their own personal gain and consequently, to the detriment of others.

This reminds me of Utility Monsters[0]. From Wikipedia:

> the utility monster, receives much more utility from each unit of a resource that it consumes than anyone else does. For instance, eating a cookie might bring only one unit of pleasure to an ordinary person but could bring 100 units of pleasure to a utility monster.

I'm a utility monster, and shops and convenience stores either love or hate us (since the monster consumer derives a skewed amount of utility from certain items). Some stores deliberately up their prices on certain items if they see utility monsters taking advantage, other times, they let the price remain stagnant, in full knowledge the utility monster brings them good business.

[0] https://en.m.wikipedia.org/wiki/Utility_monster

anonymousiam · on Jan 19, 2023

It's a sad state of affairs when a vendor will reject your email address as being invalid if it HASN'T been compromised elsewhere.

Hbruz0 · on Jan 19, 2023

> We're all so comprehensively pwned that if an email address isn't pwned, there's a good chance it doesn't belong to a real human.

Fatnino · on Jan 19, 2023

GeekedIn: In August 2016, the technology recruitment site GeekedIn left a MongoDB database exposed and over 8M records were extracted by an unknown third party. The breached data was originally scraped from GitHub in violation of their terms of use and contained information exposed in public profiles, including over 1 million members' email addresses. Full details on the incident (including how impacted members can see their leaked data) are covered in the blog post on 8 million GitHub profiles were leaked from GeekedIn's MongoDB - here's how to see yours.

Compromised data: Email addresses, Geographic locations, Names, Professional skills, Usernames, Years of professional experience

waynesoftware · on Jan 19, 2023

Profound: "We're all so comprehensively pwned that if an email address isn't pwned, there's a good chance it doesn't belong to a real human."

Should we be using if an email is pwned as input to antiabuse systems to give them higher confidence?

It reminds me a bit of when the % of emails that were #spam vs ham crossed 50% many years ago.

chrismorgan · on Jan 19, 2023

> it may be that they're uniquely subaddressing their email addresses (although this is extremely rare)

That “extremely rare” is about plus-addressing. My experience is that catch-all subaddressing (e.g. *@chrismorgan.info in my case) is considerably more popular, only rare rather than extremely rare.

a_c · on Jan 19, 2023

So a reverse bloom filter for identity. I imagine one day we will be KYC'd for buying pizza online.

jcims · on Jan 19, 2023

Apparently mine has been pwned 29 times.

Who’s got the high score in here?

nisegami · on Jan 19, 2023

45 breaches, 11 pastes here

dcow · on Jan 19, 2023

I have always wondered why pricing can’t fix the issue. On launch day, or for your first batch or whatever, start the pricing higher than you expect most anybody to pay. Target a constant rate of purchase by gradually lowering and raising the price to maintain some target sales per min/hour. Bots and scalpers get stuck holding the bag if they buy on launch day because the price will likely never be higher than what they had to pay to get the product. The company makes marginally more money on launches. People who really really want the product get it at a fair price (they were willing to pay).
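A toy sketch of that feedback loop, with made-up parameters (a fixed 5% step per interval and a price floor); a real system would need something smarter than this.

```python
def next_price(
    price: float,
    sales_last_interval: int,
    target_sales: int,
    step: float = 0.05,  # hypothetical adjustment per interval
    floor: float = 0.0,  # hypothetical minimum price
) -> float:
    """Nudge the price toward whatever keeps sales at the target rate."""
    if sales_last_interval > target_sales:
        price *= 1 + step  # selling too fast: raise the price
    elif sales_last_interval < target_sales:
        price *= 1 - step  # selling too slowly: lower the price
    return max(price, floor)
```

Start `price` well above what most people would pay and call this once per interval; anyone who buys early pays at or near the peak, which is what leaves scalpers holding the bag.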

Ajedi32 · on Jan 19, 2023

> People who really really want the product get it at a fair price (they were willing to pay).

If people were content to get the product at a fair price, scalpers wouldn't be a problem in the first place. The whole reason scalpers are considered a problem is that people want the product at a cheaper than fair price, and scalpers prevent that by buying up any inventory that is being sold for below the market rate.

Basically, if companies employed the strategy you suggest, then they'd effectively become the scalpers in the eyes of people who consider scalping a problem, with all the PR issues associated with that.

That's not to say it's necessarily a bad idea though. Once you accept the fact that scalpers exist, it makes sense for companies to capture those profits themselves rather than let scalpers just have them for free.

dcow · on Jan 20, 2023

Yes I understand and I do see your point. I wonder if the problem can be solved semantically. Instead of thinking of the company as the scalpers you want people thinking of the launch event as an auction. I don't think people would be so fickle if you framed the practice as “launch auctions”. Then it would be clear to everyone that there is no msrp until the supply and demand stabilize. And people would be at liberty to pay whatever they thought a fair price for the item all things considered. If someone else bids higher, well, tough luck.

tomalaci · on Jan 19, 2023

I think the bot vs. real-person problem needs to be solved by governments. Every government doing its own thing to tackle this wouldn't work; it would be great if they created an open-source project/standard that they all implement. An alternative would be using bank accounts, which is actually what Scandinavian countries do (e.g. in Sweden it is Bank ID) to verify that you are a real person.

All these methods of trying to recognize government ID pictures and etc. just seem very inefficient and not accurate enough for wide-spread use.

Unfortunately, not many governments are well-run to manage such solutions.

teddyh · on Jan 19, 2023

What about the unbanked (i.e. people who don’t have bank accounts)?

And even if bank accounts were free, getting a bank account means accepting the terms and conditions written by the bank. Not to mention the laws and regulations regarding banking, which include sending your bank details to the US government, even if you are a European using a European bank.

mik1998 · on Jan 19, 2023

I would rather not have to send every crappy website on the internet my ID or bank account information.

forgotmypw17 · on Jan 19, 2023

>Now, think of it from Nike's perspective: they've launched a new shoe and are seeing a whole heap of new registrations and purchase attempts. In amongst that lot are many genuine people... and this guy. How can they weed him out such that snipers aren't snapping up the products at the expense of genuine customers?

Is it true that Nike actually wants to cut the snipers out? It seems like they're selling the shoes either way, possibly faster this way, and the resellers are doing free promotion for their shoes in order to resell them.

nickip · on Jan 19, 2023

Ha I do this all the time for buying second hand concert tickets! Scammers usually use throw away email addresses. If the seller has a pwned account I trust them more ;)

RektBoy · on Jan 19, 2023

Everything from that Stripe data about IP addresses in Europe is just pure cringe.

Did you know that, at least in my country, nearly everybody is behind CGNAT, so hundreds if not thousands of households have exactly the same external IP address, and this rotates very often? So you constantly have the same IP address as someone hosting tons of torrents with porn or movies (nobody cares about torrents in my country), etc.

lma21 · on Jan 19, 2023

This is just wrong. What about masked email accounts that you can create with services like icloud or fastmail?

sgarman · on Jan 19, 2023

That exact point and even company "fastmail" is discussed in the article.

thih9 · on Jan 20, 2023

How does this work with data protection laws? Is there a way for me to object to a company doing this, i.e. an automated background check on my email address with stolen personal data?

I guess I cannot effectively object to my email being included in data leaks…

potatototoo99 · on Jan 19, 2023

Very bad idea. Most people are not terminally online like HN folks, and they barely register and barely appear in leaks. Unless every single Facebook and Instagram and WeChat etc. user is leaked, it will already have too many false positives.

charles_f · on Jan 20, 2023

> Only 76% of transactions from the IP address had previously been authorised

Sounds like a self fulfilling prophecy, if they use these rules to authorize transactions.

asdadsdad · on Jan 19, 2023

How does Have I Been Pwned download data breaches?

balderdash · on Jan 19, 2023

Would this only apply to webmail account domains (Gmail)? I feel like corporate emails turn over much more quickly…

micahdeath · on Jan 19, 2023

So, my children will be bots when they finally get accounts. =/

charcircuit · on Jan 19, 2023

Snipers are just as much legitimate customers as anyone else. They only snipe under priced products so if you don't want people reselling them do not sell them for so cheap.

scubbo · on Jan 19, 2023

Astonishing how people will argue against their own interests in defence of a free market. If goods are underpriced and snipers are prevented, consumers pay the lower prices. If goods are "right-priced", the consumers pay a higher price.

charcircuit · on Jan 20, 2023

>against their own interests.

Wanting an RNG / ping based system is not in everyone's interests. Plenty of people want to be able to just buy a product wherever they feel like and not have to spam refresh at a specific time for a chance to get a product. Resellers offer this convenience and clearly people are willing to pay for it.

It's either some consumers get a good deal while others get nothing, or all consumers pay a fair price and get it.

This isn't even getting to the higher resale price if resellers are blocked because there is less competition between resellers.

sammy2255 · on Jan 19, 2023

Stupid idea by Troy Hunt