etiennedi's comments | Hacker News

We (at Weaviate) support separate BM25, vector, and combined hybrid search, so I can speak to the performance of all three. The tl;dr: it's complicated. What you say may be true in some cases, but not in others. One does not always scale better than the other.

For example, ANN search with an index like HNSW scales pretty much logarithmically. A bit oversimplified, but roughly we see search times double with every 10x growth of the search space. In addition, calculations like Euclidean distance and dot product (and therefore cosine similarity) parallelize very well at the hardware level, e.g. with AVX2/AVX-512/NEON and similar SIMD techniques.
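As a rough illustration, here is the scalar form of those distance kernels in Go. Real engines replace loops like these with AVX2/AVX-512/NEON implementations, but the data-parallel structure is the same (function names here are made up for the example):

```go
package main

import (
	"fmt"
	"math"
)

// dot is the scalar form of the kernel that SIMD instructions parallelize:
// one multiply-add per dimension, with no dependency between iterations.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// Cosine similarity is just a normalized dot product.
func cosine(a, b []float32) float32 {
	return dot(a, b) / float32(math.Sqrt(float64(dot(a, a)))*math.Sqrt(float64(dot(b, b))))
}

func main() {
	a := []float32{1, 2, 3}
	b := []float32{4, 5, 6}
	fmt.Println(dot(a, b))    // 32
	fmt.Println(cosine(a, a)) // 1 (identical vectors)
}
```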

With BM25, it isn't quite as simple. We have algorithms such as WAND and Block-Max WAND, which help avoid _scoring_ documents that cannot reach a top-k spot. However, the distribution of terms plays a big role here. As a rule of thumb, the rarer a word is, the fewer documents require scoring. Say you have 1B documents and a 2-term query, but each term matches only 100k documents. If you AND-combine those terms, you have at most 100k matches; if you OR-combine them, at most 200k. The fact that 1B documents were indexed played no role. Now think of two terms that each match 500M objects. Even with the aforementioned algorithms – which rely on the relative impact of each term – there is a risk that we now have to score every single document in the database. This is, again, oversimplified, but my point is the following:
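Those candidate-count bounds can be sketched with plain sorted posting lists. The doc IDs and helper functions are made up for illustration; note that the total corpus size never appears in either bound:

```go
package main

import "fmt"

// intersect merges two sorted posting lists (doc IDs), as an AND query would.
// The result can never be larger than the smaller input list.
func intersect(a, b []int) []int {
	var out []int
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// union merges them as an OR query would: at most len(a)+len(b) candidates.
func union(a, b []int) []int {
	var out []int
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		switch {
		case j >= len(b) || (i < len(a) && a[i] < b[j]):
			out = append(out, a[i])
			i++
		case i >= len(a) || a[i] > b[j]:
			out = append(out, b[j])
			j++
		default: // same doc in both lists
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	term1 := []int{2, 5, 8, 11} // docs containing term 1
	term2 := []int{5, 9, 11}    // docs containing term 2
	fmt.Println(len(intersect(term1, term2))) // 2
	fmt.Println(len(union(term1, term2)))     // 5
}
```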

ANN latency is fairly predictable. BM25 latency depends a lot on the input query. Our monitoring shows that in production cases, when running a hybrid search, sometimes the vector query is the bottleneck, sometimes the BM25 query is.

> Hybrid search can be used to first filter a smaller number of candidates and then rerank them using vector search

My post is already getting quite long, but I want to quickly comment on this two-step approach. For Weaviate, that's not the case: the BM25 and vector searches happen independently, and their scores are then aggregated into a single result set.
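For illustration only – this is not Weaviate's actual fusion formula, and all names are made up – the aggregation step could look like a weighted blend of min-max-normalized score lists from the two independent searches:

```go
package main

import (
	"fmt"
	"math"
)

// hybridFuse blends per-document scores from two independent searches.
// Scores are min-max normalized to [0,1] so the two scales are comparable,
// then weighted with alpha (alpha=1 → pure vector, alpha=0 → pure BM25).
func hybridFuse(bm25, vec map[string]float64, alpha float64) map[string]float64 {
	norm := func(scores map[string]float64) map[string]float64 {
		lo, hi := math.Inf(1), math.Inf(-1)
		for _, s := range scores {
			if s < lo {
				lo = s
			}
			if s > hi {
				hi = s
			}
		}
		out := make(map[string]float64, len(scores))
		for id, s := range scores {
			if hi == lo {
				out[id] = 1
			} else {
				out[id] = (s - lo) / (hi - lo)
			}
		}
		return out
	}
	fused := make(map[string]float64)
	for id, s := range norm(bm25) {
		fused[id] += (1 - alpha) * s
	}
	for id, s := range norm(vec) {
		fused[id] += alpha * s
	}
	return fused
}

func main() {
	bm25 := map[string]float64{"doc1": 12.0, "doc2": 3.0}
	vec := map[string]float64{"doc2": 0.9, "doc3": 0.8}
	fused := hybridFuse(bm25, vec, 0.5)
	fmt.Println(fused["doc2"]) // 0.5 — contributions from both result sets
}
```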

Yes, you could also use BM25 to produce a filter set first and then re-rank the results with embeddings; however, you would lose the BM25 scores. If you only want to use keywords as filters, you can do that much, much cheaper than through BM25 scoring. In Weaviate, matching keywords to create a filter set only requires AND-ing or OR-ing a few roaring bitmaps, which is orders of magnitude more efficient than BM25 scoring.
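To see why bitmap filtering is so cheap, here is an uncompressed stand-in for roaring bitmaps (roaring adds run/array compression on top, but the core filter operation is the same; types and values are made up for the example):

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a plain packed bitmap over doc IDs: one bitwise instruction
// combines 64 documents at a time, with no per-document scoring at all.
type bitset []uint64

func (b bitset) and(o bitset) bitset {
	out := make(bitset, len(b)) // assumes equal length, for brevity
	for i := range b {
		out[i] = b[i] & o[i]
	}
	return out
}

func (b bitset) count() int {
	n := 0
	for _, w := range b {
		n += bits.OnesCount64(w)
	}
	return n
}

func main() {
	// Docs matching the keyword filter vs. docs matching a second filter.
	kw := bitset{0b1011, 0}
	price := bitset{0b0110, 0}
	fmt.Println(kw.and(price).count()) // 1 — only doc 1 passes both
}
```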

Disclaimer: associated with Weaviate.


Thanks, that's very interesting!


Thanks, appreciate your perspective. This is much more insightful than just "affiliated==botnet". I agree that this can be a huge positive and - at least for me - that was the motivation for wanting to be active in the comments; to see if there are any challenging questions, feedback, etc.

As a HN user, I often go straight to the comments and only look at the submission if there's an interesting discussion there. I can totally see how spammy-looking comments with an undisclosed affiliation can have the exact opposite effect.

(affiliated with Weaviate)


Nobody said affiliated equals botnet. You have accounts which go into discussions about Weaviate and write low quality comments that are essentially spam. They have no other real HN history.

Do you not think that the comments in here could’ve been generated by bots?

Stop it.


Not sure if that alone is what made Go successful. But yes, having great networking tools in the stdlib is very refreshing! One of the things Go got right.


Love the visual design, too. The generative/RAG/semantic search is the interesting part, but it's also just very pleasant to look at. Which can go a long way.

EDIT: Disclaimer, I am affiliated with Weaviate.


> Love the visual design, too. The generative/RAG/semantic search is the interesting part, but it's also just very pleasant to look at. Which can go a long way.

Just out of interest, do you still work for Weaviate? Probably worth mentioning.


Fair point, edited my post. The fact that I love the visual design is my personal opinion though :-D


Hi, the author and long-time Go user here. Let me know if you have any questions. For me, the introduction of GOMEMLIMIT finally makes Go a viable choice for high-heap applications: no longer do we have to worry about running OOM with 50% + one byte of long-lived memory allocated.


The models that create the vector embeddings are trained on either general or domain-specific knowledge. So, to oversimplify a bit: the model has learned – based on the training data it was presented with – that "Scandinavian" has a relationship to "Finnish". Since the vector space is high-dimensional, you can think of each language concept as having a distinct place in that space. In this case the concepts for "Scandinavian" and "Finnish" were close enough that you got a matching result. To simplify even more: the vectors do not represent the words but the meaning behind them. The two sentences "I like wine" and "The fermented juice of grapes is my favorite beverage" have zero overlapping keywords but are semantically identical, so a good model would give them very similar vectors, even though a traditional keyword search engine would find zero resemblance between them.

EDIT: Just realized I didn't answer the second part of your question. Yes, the models are language-specific, but there are also multilingual models that work across a large no. of different languages.


Thank you! Now I understand why others mentioned that generating vectors is the hard part.


I agree, but at the same time it has never been easier to create great vectors. Sentence-BERT [1] by Nils Reimers is a collection of pre-trained models specifically trained to create good vectors. You can use them out of the box with Weaviate: all you have to do is select your desired model [2], and your text (or images, etc.) will be translated into vectors at import time. As I mentioned in another comment, the goal with Weaviate is to make it as easy to use as any existing search engine or database while still providing the benefits of deep learning & vector search.

[1] Sentence-BERT: https://sbert.net

[2] Weaviate Customizer with Out-of-the-box models: https://www.semi.technology/developers/weaviate/current/gett...


Spot on! Both of those were motivating factors when building Weaviate (an open-source vector search engine). We really wanted it to feel like a full database or search engine: you should be able to do anything you would do with Elasticsearch, etc. There should be no waiting time between creating an object and searching it, incremental updates and deletes are supported, etc.

On your second point about efficient filtering, check out this article I wrote outlining how filtered vector search works in Weaviate: https://towardsdatascience.com/effects-of-filtered-hnsw-sear...

For even more details on filtering, check the documentation: https://www.semi.technology/developers/weaviate/current/arch...


Check out the open source vector search engine Weaviate: https://github.com/semi-technologies/weaviate

It’s not a relational db, but it supports Graph-like connections between objects, which makes it really easy to model your relations.


Yup – this is an example from the demo dataset in the docs: https://link.semi.technology/3DPcphe


Exactly. See my reply elsewhere in this comment thread for how Weaviate solves the filter issue: an inverted index produces an allow-list of IDs, which is then passed to the vector index, where non-matching IDs are simply skipped.
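A toy sketch of that skip, with a brute-force scan standing in for the HNSW traversal (the IDs, vectors, and the `search` helper are all made up for illustration):

```go
package main

import (
	"fmt"
	"sort"
)

// search scores only candidates on the allow-list produced by the inverted
// index; everything else is skipped before any distance is computed.
func search(vectors map[int][]float32, query []float32, allow map[int]bool, k int) []int {
	type hit struct {
		id   int
		dist float32
	}
	var hits []hit
	for id, v := range vectors {
		if !allow[id] { // non-matching IDs are simply skipped
			continue
		}
		var d float32 // squared Euclidean distance
		for i := range v {
			diff := v[i] - query[i]
			d += diff * diff
		}
		hits = append(hits, hit{id, d})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].dist < hits[j].dist })
	if len(hits) > k {
		hits = hits[:k]
	}
	ids := make([]int, len(hits))
	for i, h := range hits {
		ids[i] = h.id
	}
	return ids
}

func main() {
	vectors := map[int][]float32{
		1: {0.0, 0.05},
		2: {0.9, 0.9},
		3: {0.1, 0.0},
	}
	allow := map[int]bool{2: true, 3: true} // from the inverted index
	// Doc 1 is closest to the query but filtered out, so doc 3 wins.
	fmt.Println(search(vectors, []float32{0, 0}, allow, 1)) // [3]
}
```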


Main author and architect of Weaviate (https://github.com/semi-technologies/weaviate) here. This real-time requirement was one of the major design principles from the get-go in Weaviate.

In Weaviate, any imported vector is immediately searchable; you can update and delete your objects or the vectors attached to them, and all results are immediately reflected. In addition, every write goes into a write-ahead log first, so writes are persisted even if the app crashes.

We wanted to make sure that Weaviate really combines the advantages of AI-based search with the comfort and guarantees you would expect from an "old school" database or search engine.

