It’s not really that it’s one-sided; it’s that hate speech against some disadvantaged groups is clearly more common anywhere that’s unmoderated. That’s likely a consequence of history. What’s more curious to me is when people think it should be balanced, as if we’d expect people to write hate speech about the majority as often as fringe members of the majority write hate speech about minorities.
And that’s what this metric is measuring: the model finding hate more easily in “fat people are terrible” than in “normal weight people are terrible”.