It's a cool idea, but I think they're doing it wrong.
We experimented with doing something like this on Quizlet, but didn't actually launch anything. We first looked at a lot of the data, and matching based on string distance is the wrong approach.
For example, if you type hotmail.de into that checker, it suggests hotmail.fr. Another is ymail.com --> gmail.com. The more valid domains you add, the more (correct) permutations get marked as invalid. We have 20k users with ymail accounts.
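To make the failure concrete, here is a minimal sketch of a whitelist-plus-string-distance checker. This is not the checker's actual code: the `KNOWN` list, the `max_distance` threshold, and the function names are all assumptions for illustration. It uses plain Levenshtein distance and suggests the nearest whitelisted domain.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(domain, known_domains, max_distance=2):
    """Return the nearest whitelisted domain, or None if nothing is close."""
    if domain in known_domains:
        return None
    best = min(known_domains, key=lambda d: edit_distance(domain, d))
    return best if edit_distance(domain, best) <= max_distance else None


# Hypothetical whitelist: valid domains missing from it get "corrected".
KNOWN = ["gmail.com", "hotmail.com", "hotmail.fr", "yahoo.com"]

print(suggest("ymail.com", KNOWN))   # gmail.com
print(suggest("hotmail.de", KNOWN))  # hotmail.fr
```

Both inputs are real domains, but because they sit within two edits of a whitelisted entry they are flagged anyway.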
I think a blacklist approach is much more solid than a whitelist approach; I just haven't gotten around to building it.
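A rough sketch of what the blacklist version could look like, assuming it is simply a table of known-bad domains mapped to their corrections. The entries in `KNOWN_TYPOS` are made-up examples, not data from Quizlet; in practice the table would come from looking at real signups.

```python
# Hypothetical table of observed misspellings -> intended domain.
KNOWN_TYPOS = {
    "gmial.com": "gmail.com",
    "gmai.com": "gmail.com",
    "hotmial.com": "hotmail.com",
    "yaho.com": "yahoo.com",
}


def suggest_correction(email: str):
    """Suggest a fix only when the domain is a known misspelling."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local:
        return None  # not something we can parse as local@domain
    fixed = KNOWN_TYPOS.get(domain.lower())
    return f"{local}@{fixed}" if fixed else None


print(suggest_correction("bob@gmial.com"))   # bob@gmail.com
print(suggest_correction("bob@ymail.com"))   # None
print(suggest_correction("bob@hotmail.de"))  # None
```

Anything not explicitly listed passes through untouched, so adding more entries never causes a valid domain to be flagged.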
It sounds like you have the email addresses of your users stored in plaintext. In that case, you should be able to extract all of the domains of verified email addresses from your existing information, thereby covering all of the likely domain names of your userbase and future users.
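Something along these lines, as a sketch. The `min_count` cutoff and the sample addresses are assumptions; the cutoff is only there so a single verified address at an odd domain doesn't automatically count as "known".

```python
from collections import Counter


def known_domains(verified_emails, min_count=2):
    """Collect domains seen on at least min_count verified addresses."""
    counts = Counter(
        email.rpartition("@")[2].lower()
        for email in verified_emails
        if "@" in email
    )
    return {domain for domain, n in counts.items() if n >= min_count}


# Made-up sample data standing in for the verified-users table.
verified = [
    "a@ymail.com", "b@ymail.com",
    "c@gmail.com", "d@GMAIL.com",
    "e@example.org",
]
print(sorted(known_domains(verified)))  # ['gmail.com', 'ymail.com']
```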
> It sounds like you have the email addresses of your users stored in plaintext
Just curious, but are you suggesting that plain text is the wrong way to store an email address? Your comment makes me draw that conclusion, which of course seems rather silly.
It can be viewed as a security vulnerability, as many folks use the same password everywhere. As such, if somebody compromises your user database, they now potentially have a recoverable password and a plain text email address to go with it. This potentially compromises all users' email accounts, as well as other services that use email as username, such as PayPal accounts.
If email addresses are obfuscated in some way, the difficulty for an attacker is increased.
The tradeoff in convenience is that you force a user who has forgotten their password to remember what email address they signed up with in order to recover it via email.
Obfuscated in some way implies that it's reversible, which simply means that it's just going to take a little bit of time to unobfuscate the database--in other words, it's probably not worth it.
Hashing an email address would be pointless because the email address is no longer usable to do things like, you know, send email to that person. As such, the only real option is to store it in plain text--and that makes the most sense.
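For what it's worth, here is the hashed variant in a few lines, so the tradeoff is visible. SHA-256 and the in-memory `users` dict are assumptions picked for illustration; nobody in this thread specified a scheme. Lookup works when the user types the address back in, but nothing stored can be turned back into an address to mail.

```python
import hashlib


def email_key(email: str) -> str:
    """One-way key derived from a normalized email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical user table keyed by the hash instead of the address.
users = {email_key("alice@example.com"): {"name": "alice"}}


def find_user(supplied_email: str):
    """Works only if the user supplies the exact address they signed up with."""
    return users.get(email_key(supplied_email))


print(find_user("Alice@Example.com"))  # {'name': 'alice'}
print(find_user("alice@example.org"))  # None
```

There is no function here that goes from a stored key back to an address, which is exactly the problem with using it for newsletters or notifications.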
just correct TLDs separately, and i don't think it's a problem if you have false positives in there, after all a user will recognize an incorrect suggestion and move on, should it happen at all. using something like this is definitely better than using nothing at all.
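A sketch of what correcting TLDs separately might look like: split off the last label and only touch it if it is a known misspelling, leaving the rest of the domain alone. The `TLD_TYPOS` entries are hypothetical examples.

```python
# Hypothetical table of mistyped TLDs -> intended TLD.
TLD_TYPOS = {"con": "com", "cmo": "com", "ocm": "com", "nte": "net"}


def fix_tld(domain: str) -> str:
    """Correct only the TLD; valid TLDs like .de or .fr pass through."""
    name, dot, tld = domain.rpartition(".")
    if not dot:
        return domain  # no dot, nothing to correct
    return f"{name}.{TLD_TYPOS.get(tld.lower(), tld)}"


print(fix_tld("gmail.con"))   # gmail.com
print(fix_tld("hotmail.de"))  # hotmail.de
```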