I'm a big fan of ann-benchmarks and will be the first to tell you that the research community needs way more benchmarks like this. But I do want to add a couple caveats about it for people looking into this area:
1) Most of these datasets have extremely correlated dimensions. If you plot the correlation matrices, you'll see dense blobs of entries close to 1 all over the place. This makes the ANN task much easier than it would be with, say, high-quality DNN features. As an example, I've compressed MNIST digits down to 1-byte representations with vector quantization and still gotten nearly perfect retrieval accuracy.
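To make both halves of that concrete, here's a toy numpy sketch (synthetic correlated data, not real MNIST, and the codebook "training" is just random sampling, far cruder than real vector quantization):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a correlated dataset: every dimension is a noisy
# copy of one latent variable, so the correlation matrix is full of
# entries near 1.
latent = rng.normal(size=(1000, 1))
X = latent + 0.1 * rng.normal(size=(1000, 16))
corr = np.corrcoef(X, rowvar=False)
print(np.abs(corr).mean())  # close to 1 -> highly correlated dims

# 1-byte vector quantization: a codebook of 256 centroids, and each
# vector is stored as the index of its nearest centroid.
codebook = X[rng.choice(len(X), 256, replace=False)]  # crude "training"
codes = np.argmin(((X[:, None, :] - codebook[None]) ** 2).sum(-1), axis=1)
codes = codes.astype(np.uint8)   # one byte per vector
X_hat = codebook[codes]          # lossy reconstruction
```

With dimensions this correlated, 256 centroids cover the data well, which is why such an aggressive compression can still retrieve the right neighbors.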
2) 1M vectors is not that many. You can easily get 1k queries per second in a single thread at a decent precision/recall just brute-force scanning through them with a SIMD approximate distance function like Bolt or Quicker ADC [1]. Also worth noting that the FAISS paper (along with a lot of other work since then) focuses mostly on 100M to billions of vectors.
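For a sense of why that's feasible: the core trick in these scans is replacing each distance computation with a handful of table lookups. A rough numpy sketch of product-quantization-style asymmetric distance computation (sizes and the random-sample "training" are my own toy choices, not from either paper; the real speed comes from doing the lookups with SIMD shuffles):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m, k = 10_000, 32, 8, 256   # m subspaces, k centroids each
sub = d // m

X = rng.normal(size=(n, d)).astype(np.float32)

# "Train" one codebook per subspace (random sample as centroids;
# real systems use k-means).
codebooks = np.stack([X[rng.choice(n, k, replace=False), i*sub:(i+1)*sub]
                      for i in range(m)])            # (m, k, sub)

# Encode: each vector becomes m one-byte codes.
codes = np.empty((n, m), dtype=np.uint8)
for i in range(m):
    diffs = X[:, None, i*sub:(i+1)*sub] - codebooks[i][None]
    codes[:, i] = np.argmin((diffs ** 2).sum(-1), axis=1)

# Query time: precompute an (m, k) lookup table of partial squared
# distances; the scan is then just m table lookups + adds per vector.
q = rng.normal(size=d).astype(np.float32)
lut = np.stack([((codebooks[i] - q[i*sub:(i+1)*sub]) ** 2).sum(-1)
                for i in range(m)])                  # (m, k)
approx = lut[np.arange(m), codes].sum(axis=1)        # one dist per vector
top10 = np.argsort(approx)[:10]
```

The inner loop touches one byte per subspace per vector, which is why a single thread can chew through millions of approximate distances per second.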
3) Related to (2), I think most of these methods aren't incorporating state-of-the-art approximate distance functions yet (though I haven't dug into all of their source code). AFAICT FAISS+Quicker ADC [1] is the actual leader on x86 CPUs. Can't comment on the production-readiness of their code though.
[1] The latter is a bit faster for ANN search, though the code is more complex IIRC.
I think ann-benchmarks should pay more attention to:
1. Index build time, which is very important in some production scenarios. At the moment the only constraint is a 5-hour limit for building the index on that 1 million vectors.
2. Memory footprint. 1M vectors is not that many; we will have to deal with billions of vectors for chemical molecules, images, and word vectors, and memory consumption directly determines how many servers you need.
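To put rough numbers on the memory point (my own illustrative figures, not from the benchmark): at a billion vectors the gap between raw floats and compressed codes is the difference between one server and a cluster.

```python
# Back-of-envelope memory math for raw vs quantized storage.
n, d = 1_000_000_000, 128
raw = n * d * 4                  # float32 vectors
pq = n * 16                      # e.g. 16-byte PQ codes per vector
print(raw / 2**30, pq / 2**30)   # roughly 477 GiB vs 15 GiB
```

That's before any index overhead (graph edges, inverted lists, etc.), which can itself dominate for some methods.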