Most of our ML stack has been developed internally given the unique constraints we have for Radar. Among other things, we need to be able to:
- compute a huge number of features, many of which are quite complex (involving data collected from throughout the payment process), in real-time: e.g. how many distinct IP addresses have we seen this card from over its entire history on Stripe, how many distinct cards have we seen from the IP address over its history, and do payments from this card usually come from this IP address?
- train custom models for all Stripe users who have enough data to make this feasible, necessitating the ability to train large numbers of models in parallel,
- provide human-readable explanations as to why we think a payment has the score that it does (which involves building simpler “explanation models”—which are themselves machine learning models—on top of the core fraud models),
- surface model performance and history in the Radar dashboard,
- allow Radar for Fraud Teams users to customize the risk score thresholds at which we take action on payments,
- and so forth.
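To make the first bullet concrete, here's a toy, in-memory sketch of the kind of card/IP aggregate features described above. This is purely illustrative (the names and structure are hypothetical, not Stripe's actual system, which computes these at scale and in real time):

```python
from collections import defaultdict

class FeatureStore:
    """Toy store for card/IP aggregates, keyed on raw identifiers."""

    def __init__(self):
        self.ips_by_card = defaultdict(set)   # card -> distinct IPs seen
        self.cards_by_ip = defaultdict(set)   # IP -> distinct cards seen
        self.pair_counts = defaultdict(int)   # (card, ip) -> payment count
        self.card_counts = defaultdict(int)   # card -> total payment count

    def record_payment(self, card, ip):
        """Update aggregates after a payment is observed."""
        self.ips_by_card[card].add(ip)
        self.cards_by_ip[ip].add(card)
        self.pair_counts[(card, ip)] += 1
        self.card_counts[card] += 1

    def features(self, card, ip):
        """Features for an incoming payment, computed before recording it."""
        total = self.card_counts[card]
        from_this_ip = self.pair_counts[(card, ip)] / total if total else 0.0
        return {
            # "how many distinct IP addresses have we seen this card from?"
            "distinct_ips_for_card": len(self.ips_by_card[card]),
            # "how many distinct cards have we seen from this IP address?"
            "distinct_cards_for_ip": len(self.cards_by_ip[ip]),
            # "do payments from this card usually come from this IP?"
            "fraction_of_payments_from_this_ip": from_this_ip,
        }
```

The hard part in production isn't the logic, it's computing these over a card's entire history, at serving time, for every payment.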
We found that getting the data-ML-product interactions exactly right necessitated building most of the stack ourselves.
That said, we do use a number of open source tools: TensorFlow and PyTorch for our deep learning work, XGBoost for training boosted trees, and Scalding and Hadoop for our core data processing, among others.
Broadly speaking, what approach do you use to "build simpler 'explanation models'" from the more complicated "core fraud models"? Do you learn the models separately over the training data, or does the more complicated model somehow influence the training of the simpler model?
Why are you so stubborn about IP addresses? They're not a holy grail! I've been using a proxy for some years now, and many times when I want to buy something on a storefront “powered by Stripe”, my card is declined due to an “unknown error”. The moment I turn off my VPN, the transaction goes through. I expect this is a huge problem for Stripe, or for anyone basing fraud decisions heavily on IP. These days if I find a cool product and see “powered by Stripe”, I simply end up on Amazon purchasing the same product for a similar price. Worst part: your clients don't even know!
I’m sorry that you had this experience. We vehemently agree that any one signal (such as IP address or use of a proxy) is a pretty poor predictor of fraud in isolation. We are trying to move the industry towards holistic evaluation rather than inflexible blacklists; not everyone behind a TOR exit node is a fraudster, for example.
While we can’t fix the previous experience you had, we’ve rebuilt almost every component of our fraud detection stack over the past year. We’ve added hundreds of new signals to improve accuracy, each payment is now scored using thousands of signals, and we retrain models every day.
We hope these improvements will help. We want our customers to be able to provide you services; that’s what keeps the lights on here. We’d be happy to look into what happened if you have specific websites in mind—feel free to shoot me a note at mlm@stripe.com.
The rough idea is that you look at all the decisions made by the fraud model (sample 1 is fraud, sample 2 is not fraud) and the world of possible "predicates" ("feature 1 > x1", "feature 1 > x2", ..., "feature 10000 > z1," etc.) and try to find a collection of explanations (which are conjunctions of these predicates) that have high precision and recall over the fraud model's predictions. For example, if "feature X > X0 and feature Y < Y0" is true for 20% of all payments the fraud model thinks are fraudulent, and 95% of all payments matching those conditions are predicted by the fraud model to be fraud, that's a good "explanation" in terms of its recall and precision.
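A minimal sketch of that search, under simplifying assumptions (a small hand-built predicate set, exhaustive search over short conjunctions; the actual approach and thresholds are not specified here):

```python
from itertools import combinations

def precision_recall(conjunction, samples, flagged_ids):
    """Precision/recall of a conjunction of predicates, measured against
    the fraud model's predictions (flagged_ids = indices it called fraud)."""
    matched = [i for i, s in enumerate(samples)
               if all(pred(s) for _name, pred in conjunction)]
    if not matched or not flagged_ids:
        return 0.0, 0.0
    true_pos = sum(1 for i in matched if i in flagged_ids)
    return true_pos / len(matched), true_pos / len(flagged_ids)

def mine_explanations(samples, flagged_ids, predicates,
                      min_precision=0.95, min_recall=0.2, max_terms=2):
    """Find conjunctions of predicates that explain the model's fraud
    predictions with high precision and non-trivial recall."""
    explanations = []
    for k in range(1, max_terms + 1):
        for conjunction in combinations(predicates, k):
            p, r = precision_recall(conjunction, samples, flagged_ids)
            if p >= min_precision and r >= min_recall:
                names = [name for name, _pred in conjunction]
                explanations.append((names, p, r))
    return explanations
```

Note the target labels here are the fraud model's *predictions*, not ground truth: the explanation model approximates the core model's behavior, which is what makes its output a faithful explanation of the score.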
It's a little tough to talk about this in an HN comment but please feel free to shoot me an e-mail (mlm@stripe.com) if you'd like to talk more.