
The repo mentions approximate NN search but the article implies this is mainly brute force. Is there any indexing at all then? If not, is the approximate part an app-space thing e.g. storing binary vectors alongside the real ones?

In addition, if things are brute forced, wouldn’t a columnar db perform better than a row-based one, e.g. DuckDB?
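For readers unfamiliar with the "binary vectors alongside the real ones" idea in the question above, here is a minimal sketch of how an application-level approximate search could look. It is not a claim about what the extension does; the function names (`binarize`, `hamming`, `approx_knn`) and the `oversample` parameter are made up for illustration. The idea is to shortlist candidates by Hamming distance on sign bits, then rerank the shortlist with exact distances on the full vectors.

```python
import heapq


def binarize(vec):
    """Pack sign bits into an int: bit i is set when vec[i] > 0."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def approx_knn(vectors, codes, query, k, oversample=4):
    """Approximate k-NN: cheap Hamming shortlist, then exact rerank.

    `oversample` (hypothetical knob) controls how many candidates
    survive the binary pass before the exact squared-L2 rerank.
    """
    qcode = binarize(query)
    n_candidates = min(len(vectors), k * oversample)
    # Pass 1: still a full scan, but over small binary codes only.
    candidates = heapq.nsmallest(
        n_candidates,
        range(len(vectors)),
        key=lambda i: hamming(codes[i], qcode),
    )

    def sqdist(i):
        return sum((a - b) ** 2 for a, b in zip(vectors[i], query))

    # Pass 2: exact distances, computed only for the shortlist.
    return heapq.nsmallest(k, candidates, key=sqdist)


vectors = [[1.0, 1.0], [-1.0, -1.0], [0.9, 1.2], [1.0, -1.0]]
codes = [binarize(v) for v in vectors]
nearest = approx_knn(vectors, codes, [1.0, 1.1], k=1, oversample=2)
```

Note that there is no index here: both passes are scans. The result is approximate only because a true neighbor can be dropped in the binary pass.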



A columnar database is completely irrelevant to vector search. Vectors aren't stored in columns. Traditional indexing too is altogether irrelevant because brute force means a full pass through the data. Specialized indexes can be relevant, but then the search is generally approximate, not exact.


How's a database being columnar irrelevant to vector search? This very vector search extension shows that brute force search can work surprisingly well up to a certain dataset size, and at that point columnar storage is great: it gives a perfect memory access pattern for the vector search, instead of iterating over all the rows of a table and only touching the vector in each row.
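To make the memory-access argument concrete, here is a rough sketch of exact brute-force search over vectors packed into one contiguous buffer, which is roughly what a column of fixed-size vectors gives you. This is an illustration only, not how any particular database stores data; `pack_vectors` and `brute_force_knn` are hypothetical names, and it uses only the Python standard library.

```python
from array import array
import heapq


def pack_vectors(vectors):
    """Flatten equal-length vectors into one contiguous float32 buffer."""
    dim = len(vectors[0])
    flat = array("f")
    for v in vectors:
        if len(v) != dim:
            raise ValueError("all vectors must share one dimension")
        flat.extend(v)
    return flat, dim


def brute_force_knn(flat, dim, query, k):
    """Exact k-NN by squared L2 distance over the packed buffer.

    Vector i lives at flat[i*dim : (i+1)*dim], so the scan reads the
    buffer front to back and never touches unrelated row data.
    """
    if len(query) != dim:
        raise ValueError("query dimension mismatch")
    n = len(flat) // dim

    def sqdist(i):
        base = i * dim
        return sum((flat[base + j] - query[j]) ** 2 for j in range(dim))

    # Returns the indices of the k closest vectors, nearest first.
    return heapq.nsmallest(k, range(n), key=sqdist)


flat, dim = pack_vectors([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.1]])
top2 = brute_force_knn(flat, dim, [0.0, 0.0], k=2)
```

In a row-oriented layout the same scan would have to step over every other field of each row to reach its vector, which is the access pattern the comment above is contrasting against.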


That makes sense. I withdraw my comment.



