I've played around with this concept, trying to replicate some previous work [1].
It's a sensitive topic, because sometimes we're actually tampering with the data, trying to eliminate known human or selection biases.
The first defence against discrimination is, in my honest opinion, for everyone working with data to be aware of these problems: to know that, besides ROC curves, precision, and recall, we should measure the impact of our models on sensitive demographics (gender, race, nationality, sexuality).
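A minimal sketch of what "measuring impact on sensitive demographics" can look like: compute standard metrics per group rather than in aggregate. The data and group labels here are made up for illustration; the point is that two groups can share the same overall positive rate while the true positive rate differs between them.

```python
from collections import defaultdict

def group_metrics(y_true, y_pred, sensitive):
    """Compute positive rate and true positive rate per demographic group."""
    stats = defaultdict(lambda: {"n": 0, "pos": 0, "tp": 0, "actual_pos": 0})
    for t, p, g in zip(y_true, y_pred, sensitive):
        s = stats[g]
        s["n"] += 1
        s["pos"] += p            # how often the model predicts positive
        s["actual_pos"] += t     # how many true positives exist in the group
        s["tp"] += t and p       # predicted positive AND actually positive
    out = {}
    for g, s in stats.items():
        out[g] = {
            "positive_rate": s["pos"] / s["n"],
            "tpr": s["tp"] / s["actual_pos"] if s["actual_pos"] else float("nan"),
        }
    return out

# Toy data (hypothetical): identical positive rates can hide unequal error rates.
y_true    = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred    = [1, 0, 1, 0, 0, 0, 1, 1]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(group_metrics(y_true, y_pred, sensitive))
```

Here both groups get a positive prediction half the time, but group "a" has a true positive rate of 1.0 and group "b" only 0.5, a disparity that aggregate metrics would never surface.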
And one of the things I learned (in [1]) is that, even if you're careful with the features you use, you might still have a negative effect.
One needs to understand how these features interact with each other. For example, you may not directly use a protected-class feature (race) to make your prediction, but the model might still learn it indirectly through a secondary or tertiary variable (like location) that is statistically correlated with it.
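The proxy effect is easy to demonstrate with synthetic data. In this made-up example, the names `group` and `district` and the 90% correlation are assumptions chosen purely for illustration: the protected attribute is never handed to the "model", yet a rule that only looks at the correlated variable recovers it most of the time.

```python
import random

random.seed(0)

# Hypothetical synthetic data: 'district' is correlated with the protected
# attribute 'group', so anything using district can partly recover group.
n = 10_000
rows = []
for _ in range(n):
    group = random.random() < 0.5                 # protected attribute (hidden)
    # 90% of group members live in district 1; 90% of non-members in district 0
    district = (random.random() < 0.9) == group
    rows.append((group, int(district)))

# A trivial "model" that only sees district: predict group = (district == 1).
correct = sum(g == (d == 1) for g, d in rows)
print(f"protected attribute recovered from district alone: {correct / n:.0%}")
```

Even though race (here, `group`) was excluded from the features, location alone reconstructs it roughly nine times out of ten, which is exactly how a model can end up discriminating on a feature it was never given.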
[1] https://github.com/sergioisidoro/aequalis