There are many terrible data brokers involved in many terrible things (especially in the US; they generally get zero headlines because they're B2B and rather secretive), so it's strange for this headline to be about Facebook, when it's the only company being discussed that doesn't sell your data (even if it does many other things with it that we may not like!). I know it's fun to use Facebook as the worst possible example in any category of bad things, but if I were forced to choose, I'd much rather have FB handle my data than almost any US data broker. Either way, a GDPR equivalent would still be nicer.
Also, Facebook and other big tech companies are way less likely to get hacked thanks to more established data handling practices.
Breaches still occasionally happen, but you're more likely to see data breaches from these data broker companies because they didn't secure their Elasticsearch instances or something dumb like that.
> In response to the reporting, Facebook said in a blog post on Tuesday that "malicious actors" had scraped the data by exploiting a vulnerability in a now-defunct feature on the platform that allowed users to find each other by phone number.
I think every company has an incident that makes data protection a real concern. When it's two guys with a PHP app in a dorm room, you don't really expect anything serious, but eventually the risk to the company becomes too high and some useful system gets implemented.
I think Google's watershed moment was this: https://www.gawker.com/5637234/gcreep-google-engineer-stalke...

By the time I got there, it seemed like standard practice for engineers to not be able to read "their own" databases -- every piece of data would be encrypted with a per-user key that only arrives at your application when the user's session is present (or via heavily-audited special exceptions: breakglass, batch jobs, etc.). Much attention was paid to not making data available too widely.

For example, on Google Fiber our hardware knew the MAC addresses of devices that wanted to use WiFi. That's a requirement for 802.11 to work. We modified the Linux kernel, wpa_supplicant, etc. to not log these, explicitly so that someone couldn't collect the logs and run a mapreduce to see who takes their iPhone to their friend's house and intuit a social network. We did that because it was the right thing to do; we only wanted the information required to operate the service effectively, not to be a dragnet for anything potentially interesting. I'd personally be fascinated to see that information, but it's not the right thing to do.
I have to imagine that any other large tech company has similar access controls and privacy focus -- even Facebook. Where it gets scary is with governments and large non-tech companies that just email around spreadsheets with personal information in them. Yesterday there was an article about Missouri exposing teachers' Social Security numbers in HTML comments. If you tried to write that code with a data storage system like Google's, it simply wouldn't work: your code wouldn't have the decryption key for those rows, so you couldn't output them as HTML comments.
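The pattern described above (each row encrypted under a per-user key that only materializes while that user's session is live) can be sketched roughly like this. This is a toy illustration of the access pattern, not Google's actual system, and the XOR "cipher" is deliberately simplistic -- not real cryptography:

```python
import hashlib
import secrets

def keystream_xor(key, data):
    # Toy symmetric cipher for illustration only -- NOT real crypto.
    # XORing twice with the same keystream recovers the plaintext.
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

class UserRow:
    """A stored row whose contents are encrypted under a per-user key."""
    def __init__(self, user_key, plaintext):
        self.ciphertext = keystream_xor(user_key, plaintext)

    def read(self, session_key):
        # Without the key delivered by the user's live session, the
        # application code simply cannot produce the plaintext.
        if session_key is None:
            raise PermissionError("no user session: row is unreadable")
        return keystream_xor(session_key, self.ciphertext)

# The per-user key exists in the application only during the session.
key = secrets.token_bytes(32)
row = UserRow(key, b"555-0123")

print(row.read(key))   # session present: prints b'555-0123'
try:
    row.read(None)     # a batch job with no session is denied
except PermissionError as e:
    print(e)
```

Code that tried to dump these rows into an HTML comment would hit the same `PermissionError`: the decryption key never reaches it.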
Not so secretive. Just register a domain, put an official looking website on it with contact details and enjoy sketchy sales pitches. My latest favorite is a blunt offer to buy a database with millions of entries, with all PII imaginable except maybe SSNs. These data brokers aren't much different than drug dealers, it's just PII theft and sale hasn't been outlawed yet.
Selling raw data vs. selling targeted advertising based on that data. The difference is maybe not so important to some people, but it's exactly the distinction the comment you originally replied to highlights.
"Walled garden" is a fairly common colloquial phrase, often applied to the iOS ecosystem in the context of the app store (genuinely not sure what you meant by your comment otherwise).
> Facebook ... the only company ... that doesn't sell your data
Was that CNN style reporting where Dr. Zuckerberg sells users' personal data left and right in bulk quantities and seems privacy-friendly at same time? Or just NBC style Let's Go Brandon?
> GDPR
Also, I blame Dr. Zuckerberg for constantly getting caught red-handed selling data to private intelligence firms. His miserable failures to do shady business in a shady way resulted in this extremely idiotic legislation, which rendered a significant number of US websites inaccessible from, say, a Frankfurt VPDN endpoint and polluted the rest with ridiculous cookie notices that are even more annoying than popup and animated porn banners.
Data can be collected by third parties about your activity on platforms like Facebook without involvement of the platform owner. On TikTok or Instagram a bot can subscribe to you. On Facebook some info may be public to other users (depends on settings) and some interactions are public, like commenting on a public post.
Right, I think a lot of the HN audience understands this often happens with public (or 'public') data on the Internet. I still think the article title is a bit misleading when the article is about data brokers instead of Facebook (even if sometimes data brokers use FB) and would hope for more constructive and better-sourced articles from Mozilla.
Yeah, what about the streaming industry and the game industry? Of course, when we dig into them we'll come to the conclusion that human attention is harvested to fund a whole range of industries, and it's their entire business model we'd need to kill to stop this.
This has always been the case, since before the digital age (Hollywood, cable, video games, sports, even news). These industries feed on humans paying attention to things that don't matter in their lives.
> In those cases your credit score could be based on just about anything. What you post on social media, whether you recently visited a doctor, or whether or not you live in a wealthy neighbourhood. Suddenly, your most recent Google search history, or your latest post on TikTok could influence whether you can get that loan.
It's strange to see Facebook and Google being invoked in these hypothetical examples when neither company sells user data. Big advertising companies aren't interested in releasing their proprietary user data to other companies. They sell ads on their own platform, but they don't sell user data.
Meanwhile, financial institutions openly share your personal information with "business partners" unless you opt out. The FTC has done a decent job of forcing disclosure of these data-sharing arrangements and giving consumers a standard form to opt out. [1] These companies are actively selling user data as much as they can get away with, while Facebook and Google get demonized for something they're not even doing.
As for this blog post: It's a fine example of Betteridge's law of headlines (and yes, I know it's not actually a law).
They don’t need to sell anything. They aggregate and collect data, and sell value added services.
Facebook in particular always states that it does not sell data. But there are plenty of documented examples of various datasets being used to target people in different ways: Google was contracted to deliver anti-extremist messaging, and data from insurance claims, insurance background checks, and web behavior has been used to flag opioid diversion.
This isn't totally hypothetical. I ran into a guy at a tech get together a few years ago who was doing exactly this. He told me they could produce much more accurate credit score data from examining some number of a user's FB posts than is achievable by Equifax. When I asked how they got the data he said they got users to give them their account credentials. Obviously violation of FB ToS. I don't know what happened to his startup but clearly what the article describes has been seriously considered by some in the lending business.
>Credit scores aren't perfect, but at least they're based on information relevant to the question at hand: whether you could repay a loan.
Well if other (non-financial) things are correlated, how are they not relevant? Should we actively hide information from loan providers, making loans more expensive for others to "protect"( I say protect with quotes, because really it is just a naked handout to people) a few? If these models are wrong, that is just a business opportunity for others to fix and price loans better.
I support privacy laws, but that argument has always seemed to be a strange way to argue for it.
As of when I posted this comment, there's a formatting error in your statement: there's a whitespace character after your second opening parenthesis that was meant to precede it. That typo indicates that the component of your credit score derived from your HN participation should be scored lower than that of commenters whose posts contain no typos, because if you're more likely to overlook a single-character error, then you're more likely to overlook a loan payment.
By your argument, it is permissible and appropriate for a credit rating company to make use of any signal — such as that typo — in their calculations. It's a business opportunity to make use of HN's freely-available data to charge you a higher interest rate for loans, because you made a typo, as by the evaluation metric "typos on HN" you're more careless than someone with fewer. It may or may not end up actually being an effective tool for increasing the accuracy of a credit rating, but there's no law prohibiting a US credit agency from trying it out today.
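To make that concrete: folding an arbitrary behavioral signal into a scoring model is trivial. Everything below is invented for illustration (the feature names and weights correspond to no real credit model); the point is only that nothing in the math distinguishes a financial input from a "typos on HN" input:

```python
import math

# Hypothetical weights -- invented purely for illustration.
WEIGHTS = {
    "on_time_payment_rate": 4.0,   # financial signal
    "credit_utilization": -2.5,    # financial signal
    "typos_per_comment": -0.8,     # the contested non-financial signal
}
BIAS = -1.0

def repayment_score(features):
    """Logistic model: estimated probability of on-time repayment."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in features.items())
    return 1 / (1 + math.exp(-z))

# Two applicants with identical finances, differing only in forum typos:
careful = repayment_score({"on_time_payment_rate": 0.98,
                           "credit_utilization": 0.20,
                           "typos_per_comment": 0.0})
sloppy = repayment_score({"on_time_payment_rate": 0.98,
                          "credit_utilization": 0.20,
                          "typos_per_comment": 1.0})
print(careful > sloppy)  # True: the typo signal alone moved the score
```

Whether the typo feature actually improves accuracy is an empirical question; the model will happily consume it either way.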
I do not feel that this is acceptable. I think it's an invasion of privacy to incorporate that information into your financial score, even if it technically would improve the precision of the score when applied to everyone in a society. I think that credit scores should have their inputs restricted by law to only consider your financial behaviors.
ps. The above example is impersonal and generic, and the point I'm making is not intended in any way to actually reflect poorly upon you at all. I do not as an individual consider your typo to in any way detract from your statement, and I think it's inappropriate to judge you so harshly as I suggest a credit agency would above. You should not be shamed or penalized for a typo in this forum.
I wouldn't, but sadly, some enterprising person will no doubt use this idea to name and shame people, which will have a chilling effect on future founders.
For two, it's arguably a good public policy to judge people on the basis of what they've done, and not on the basis of statistical correlation to what other people have done.
Can you imagine, e.g., having to post to Instagram regularly to get a loan, because some algorithm noticed that people who pay back their loans happen to post frequently on Instagram (when in reality the correlation exists because people working salaried jobs have the time to post lots on Instagram, and people working minimum-wage jobs don't)?
> For two, it's arguably a good public policy to judge people on the basis of what they've done, and not on the basis of statistical correlation to what other people have done.
Existing credit scores are based on non-causal correlation models derived from the behavior of others. Social-media based credit scores would not be any different in this regard.
I agree that credit scores based on social media posting are nonsense, but mostly because (1) it's so invasive and (2) I don't think it actually usefully correlates with anything.
> Can you imagine, e.g., having to post to Instagram regularly to get a loan, because some algorithm noticed that people who pay back their loans happen to post frequently on Instagram (when in reality the correlation exists because people working salaried jobs have the time to post lots on Instagram, and people working minimum-wage jobs don't)?
Yeah. Some of the biggest factors in existing (US) credit scores are: on-time payment history; credit card use as a percent of limits; and average age of credit.
All of these correlate to affluence/wealth. Affluent/wealthy people can easily pay bills on time; affluent people are given higher credit limits, which makes the "use %" ratio lower; and age of credit is obviously related to age, and older people are wealthier than younger.
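The utilization point is easy to see with invented numbers: the same dollar spending produces a very different "use %" depending on the limit the bank chose to grant you.

```python
def utilization(balance, limit):
    """Credit utilization: share of available credit currently in use.
    Lower is scored better by the bureaus."""
    return balance / limit

# Hypothetical figures: $2,000 of monthly card spending in both cases.
affluent = utilization(2000, 30000)  # high limit -> about 0.067 (6.7%)
tight = utilization(2000, 3000)      # low limit  -> about 0.667 (66.7%)
print(affluent < tight)  # True: identical spending, very different ratio
```

So two people with identical behavior score differently purely because of the limit they were given, which is itself a function of affluence.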
I would like a much more specific example of a bank rejecting a loan applicant based on their Google search history, as the article implies.
Is this really far fetched? Did you not see what Biden is proposing? Anyone with more than $600 in their bank accounts will have their transactions monitored. They will be mining that data asap.
> Anyone with more than $600 in their bank accounts
I hadn't heard this, but there is indeed such a proposal[1] by the Department of the Treasury dated May 2021. I'm quoting the proposal here from the PDF:
"This proposal would create a comprehensive financial account information reporting regime. Financial institutions would report data on financial accounts in an information return. The annual return will report gross inflows and outflows with a breakdown for physical cash, transactions with a foreign account, and transfers to and from another account with the same owner. This requirement would apply to all business and personal accounts from financial institutions, including bank, loan, and investment accounts, with the exception of accounts below a low de minimis gross flow threshold of $600 or fair market value of $600.
Other accounts with characteristics similar to financial institution accounts will be covered under this information reporting regime. In particular, payment settlement entities would collect Taxpayer Identification Numbers (TINs) and file a revised Form 1099-K expanded to all payee accounts (subject to the same de minimis threshold), reporting not only gross receipts but also gross purchases, physical cash, as well as payments to and from foreign accounts, and transfer inflows and outflows.
Similar reporting requirements would apply to crypto asset exchanges and custodians. Separately, reporting requirements would apply in cases in which taxpayers buy crypto assets from one broker and then transfer the crypto assets to another broker, and businesses that receive crypto assets in transactions with a fair market value of more than $10,000 would have to report such transactions.
The Secretary would be given broad authority to issue regulations necessary to implement this proposal.
The proposal would be effective for tax years beginning after December 31, 2022."
That's not how it's supposed to work here. You are presumed innocent, and the government can't go trolling through your "papers" e.g. bank account records without a warrant.
Without taking a stance on the "how it's supposed to work" claim, the US banking system is already regularly surveilling you for tax purposes without a warrant or any suspicion of a crime -- 1099s, SAR reports.
The IRS thinks it loses $600 billion in revenue annually from owed, but unpaid, taxes ("tax gap"). The Biden admin thinks they can recoup about $46 billion a year in owed, but unpaid, taxes just by adding this reporting measure -- without increasing tax rates at all[1]. This represents something like an additional 2.5% in tax revenue from individuals (which is about ~$1.7T annually).
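As a back-of-envelope check using the comment's own figures above (both are estimates from the comment, not official data), the share works out to roughly 2.7%, in the same ballpark as the ~2.5% stated:

```python
recouped = 46e9               # projected annual recovery from the reporting rule
individual_revenue = 1.7e12   # rough annual individual income tax receipts
print(f"{recouped / individual_revenue:.1%}")  # prints 2.7%
```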
If you don't illegally evade taxes, (1) this reporting is no threat to you, and (2) you should be mad at your underpaying neighbors who would pay more because of this surveillance program. And if you do illegally evade taxes, you should stop doing so.
"In some countries, loan providers can only look at your official credit score, which is based on your financial history: how much money you have in your bank account, whether you have outstanding loans, and so on. Credit scores aren't perfect, but at least they're based on information relevant to the question at hand: whether you could repay a loan."
Some things wrong with this:
1. There is no single 'official credit score'. Credit scores are calculated using different data and different methodologies by (i) credit bureaus like Equifax, and (ii) lenders. To say that loan providers can only 'look at your official credit score' is equivalent to saying that lenders cannot create their own scoring models, which is obviously wrong.
2. Most credit scores do not depend on how much money you have (or had) in your bank account. Credit scores are based on money you owe or owed. So a bank account would only be relevant to the extent that it provided a credit facility, or went negative at some point.
3. As mentioned in another comment, information other than credit/financial history is useful for credit decisions. For example, a common thing that unsecured loan providers look at is whether you have a job, and how much you earn. This information is verified through payslips, reference calls and/or bank statements. Whether you have a job now doesn't affect your credit history (it's in the past) but it sure is useful.
Yet another disingenuous privacy article by the Mozilla Foundation. Right in the second to last paragraph, they sneak in that "Countries like the US have laws that dictate what credit scores can be based on." and then fail to provide any examples of actual countries that use social data in credit decisions.
There was a whole big fad [1] of social credit scoring around 2012 - 2015ish which just kind of went away because... it turns out it doesn't work very well. The more signals you use, the more you open yourself up to adverse selection and I think pretty much every social credit scoring startup found themselves unable to offer a more compelling rate without going bankrupt and they all either pivoted or shut down.
AFAIK, there isn't currently a widespread deployed social credit solution in any country (including China). But it continues to be the favorite doomed pitch in fintech.
Big tech companies spend a lot of engineering hours (time + money) ensuring that only they have access to your data (including keeping individual employees out), because that is how they make money. On the other hand, VISA or $large_retail_store has no problem selling your data.
Video mentions India. There's a really old article [1] by Mark Ames and Yasha Levine about some of the negative parts of social media-based lending for poor people in India. It is called "micro-lending", Pierre Omidyar is big into it. I guess there were a bunch of suicides from a company he was funding. It is basically having more desperate people opt in to giving up privacy in exchange for loans. You opt in and it sucks in your data, system crunches on it and decides what to do.
But, I do think this is kind of where the whole surveillance capitalism thing is headed. It is interesting to me everyone always points at China as this dystopian nightmare, but even major media outlets are now admitting all their articles on China's evil "social credit score" were kinda BS [2]. Seems to me the real nightmare is with our own oligarchs.
Data brokers should all be required to register with the gov't and be regulated like a credit bureau. E.g. one free report a year on everything they have on you. Methods for correcting or removing inaccurate data, etc.
If there are no laws regulating this where you are, you're likely to have more straightforward bias issues, such as having people excluding locations with ethnicities, religions, or whatever else they might not like.
This is going to be country-specific. In the US, wouldn’t the old-fashioned, middle-class way of building up a credit history by getting a credit card and paying it on time still work?
Perhaps someone who works in that area will know more.
This is how it worked 20 years ago, at least. I built up ridiculously good credit (by accident) when one of my first employers didn't have a functional purchasing department.
My boss and I would take turns maxing out our credit cards (mine were rather meager, having just graduated) to purchase equipment for the company and then being reimbursed.
However, credit score isn't really a measure of how likely you are to pay back a loan, but rather how profitable you will be doing so. My understanding is, the ideal customer always pays his bill in the end, but may take some time doing so to rack up extra interest charges.
> However, credit score isn't really a measure of how likely you are to pay back a loan, but rather how profitable you will be doing so.
That isn't true. It's purely a measure of how likely you are to repay a loan on time.
> My understanding is, the ideal customer always pays his bill in the end, but may take some time doing so to rack up extra interest charges.
This is true, for credit card customers in particular. Banks want people who make exactly the minimum payment on time. Every time you pay less than the full statement balance, you're essentially taking out a new high-APR loan for the balance. Banks love customers who do that. It doesn't increase your credit score.
What about a credit repair service which offers a web crawler to generate plausible pro-social disinformation, with searches like "nearest gym", and "houses for sale in <exclusive ZIP code>"?
When I quit Facebook a few years back I read up on the details of how these profiles function. They're tied to a name, birthday and phone number. Now Facebook takes the deleted profile of a user and can still sell it for a tidy sum ($600). But get this; the birthdate, name spelling and phone number can be changed at any time by the user. So you can make their profile on you worthless for resale to data brokers by changing those details before deleting your account. I did exactly that and am banned for life. They were pissed.