
I'm using RDRPOSTagger[1], though I've optimized the code a bit so that it's not just algorithmically efficient but also uses the language in a way that is fast. It isn't perfect, but it's good enough to be useful.

Language detection and sentence splitting are the other two slow bits of processing.

[1] https://github.com/datquocnguyen/RDRPOSTagger


