
I'm working on a similar problem: binary classification on a class-imbalanced dataset. fastText and bidirectional LSTMs both seem to work pretty terribly, even with oversampling. Is there a better alternative?


Class imbalance is a tricky problem, but it is unrelated to fastText.

There is no silver bullet. The best solution is to collect more data so the classes come into balance. The second-best approach is to try algorithms like SMOTE.


SMOTE is more for numerical data.


It is a numerical problem underneath. Word vectors are numerical vectors, so once the text is embedded, SMOTE can interpolate between them like any other feature vectors.
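
To make that concrete, here is a rough sketch of the SMOTE idea applied to embedded text. It is not the reference SMOTE implementation, just the core step (pick a minority point, pick one of its nearest minority neighbours, interpolate between them) in plain Python. It assumes each document has already been turned into a fixed-length vector, for example by averaging its word vectors, and that a separate classifier is trained on those vectors. The function name `smote_like`, its parameters, and the toy numbers are all made up for illustration.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic vectors by interpolating between a minority
    vector and one of its k nearest minority neighbours (SMOTE-style).
    Needs at least two minority vectors."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority vectors
        others = [v for v in minority if v is not base]
        neighbours = sorted(others, key=lambda v: sq_dist(base, v))[:k]
        neighbour = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy 3-dimensional "document vectors" for the minority class (made-up numbers).
minority_vecs = [
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
    [0.0, 1.0, 0.3],
    [0.3, 0.7, 0.2],
]
new_vecs = smote_like(minority_vecs, n_new=6)
```

The synthetic vectors get added to the minority class in the training set only, never to the validation or test set. Whether the interpolated points correspond to anything meaningful in embedding space is an open question for your data, so treat it as something to try rather than a guaranteed fix.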



