
> A much more interesting (and harder) problem is creating good vectors to begin with.

Indeed, this is the hardest problem. Vector search shines when used in-domain with deep representation learning, for example bi-encoders built on top of transformer models for the text domain.

However, these models do not generalize well when used out of domain, because the representations change. Hence, in many cases, simple BM25 beats most dense vector models when applied to a different domain. See https://arxiv.org/abs/2104.08663
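
For anyone who hasn't looked at how little machinery that baseline needs, here is a rough sketch of Okapi BM25 scoring in plain Python. The function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5`, `b=0.75` are my own assumptions for illustration, not taken from the paper or from any particular search engine.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (illustrative sketch)."""
    # Assumption: naive lowercase + whitespace tokenization, no stemming or stopwords.
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase cats",
    "vector search with transformers",
]
scores = bm25_scores("vector search", docs)
```

Nothing here is learned from training data, so there is no representation to drift when the domain changes, which is one plausible reading of why it holds up as a baseline out of domain.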


