
+1

I find it dishonest to call it "full text search" when it's actually just "English/Indo-European full text search" that relies on language-specific features to achieve its goals.

Instead of pretending to have solved the string searching problem by using "language hacks", I'd really like to see an open source database that provides easy-to-use interfaces to suffix trees instead.

The even more infuriating thing is that apparently some databases actually do have suffix tree implementations, but because they assume the data is English/European, other languages work half-assedly on them.

Imagine the i18n implications for projects that are based on them. And the users would have no clue how f*cked up things are.
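
To make the suffix-structure idea concrete, here is a minimal sketch using a suffix array (a simpler cousin of the suffix tree, swapped in for brevity). It is not how any particular database does it; the function names are made up for illustration, and the construction is the naive one, fine for a demo but too slow for real data. The point is only that the search works on raw characters, so it needs no tokenizer, stemmer, or language setting.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    # Naive construction: sort every suffix start position by the suffix itself.
    # Roughly O(n^2 log n); real systems use linear-time algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def _bound(text: str, sa: List[int], pattern: str, upper: bool) -> int:
    # Binary search over the sorted suffixes, comparing only the first
    # len(pattern) characters of each suffix against the pattern.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_all(text: str, sa: List[int], pattern: str) -> List[int]:
    # All start offsets where pattern occurs in text, in ascending order.
    start = _bound(text, sa, pattern, upper=False)
    end = _bound(text, sa, pattern, upper=True)
    return sorted(sa[start:end])


if __name__ == "__main__":
    # Japanese has no spaces between words, so whitespace tokenizing fails here,
    # but substring search over the suffix array does not care.
    doc = "東京都は日本の首都です。京都は古都です。"
    sa = build_suffix_array(doc)
    print(find_all(doc, sa, "京都"))
```

The same two functions work unchanged on English text, e.g. find_all("banana", build_suffix_array("banana"), "ana"), which is the whole appeal: no per-language configuration anywhere.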



With tsvector you have to declare the language, and if you are ingesting a diverse range of web documents you just end up applying English as a guess.



