
(I work at Voyage)

Many of the top-performing models on the MTEB retrieval leaderboards for English and Chinese tend to overfit to the benchmark nowadays. voyage-3 and voyage-3-lite are also pretty small compared to a lot of the 7B models that take the top spots, and we don't want to hurt performance on other real-world tasks just to do well on MTEB.



> we don't want to hurt performance on other real-world tasks just to do well on MTEB

Nice!

Fortunately MTEB lets you sort by model parameter size because using 7B parameter LLMs for embeddings is just... Yuck.


It would still be great to know how it compares, though.

Why should I pick voyage-3 if, for all I know, it sucks when it comes to retrieval accuracy (personally my most important metric)?


We provide retrieval metrics for a variety of datasets and languages: https://blog.voyageai.com/2024/09/18/voyage-3/. I also personally encourage folks to either test on their own data or to find an open source dataset that closely resembles the documents they are trying to search (we provide a ton of free tokens for evaluating our models).
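For anyone wanting to run that kind of evaluation on their own data, a minimal sketch of one common retrieval metric (recall@k over cosine similarity) looks something like the following. The vectors here are synthetic stand-ins; in practice they would come from whatever embedding model you're comparing, and `recall_at_k` is just an illustrative helper name, not part of any vendor API.

```python
import numpy as np

def recall_at_k(query_embs, doc_embs, relevant, k):
    """Fraction of queries whose gold document appears in the top-k
    documents ranked by cosine similarity."""
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = q @ d.T                          # (num_queries, num_docs)
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = [relevant[i] in topk[i] for i in range(len(relevant))]
    return sum(hits) / len(hits)

# Toy data: 4 documents, 3 queries. relevant[i] is the index of the
# gold document for query i. Each query is a near-copy of its gold
# doc, so top-1 retrieval should succeed.
rng = np.random.default_rng(0)
docs = rng.normal(size=(4, 8))
queries = docs[[0, 1, 2]] + 0.01 * rng.normal(size=(3, 8))
print(recall_at_k(queries, docs, relevant=[0, 1, 2], k=1))
```

Swapping in embeddings from two different models over the same query/document pairs gives a direct, apples-to-apples comparison on the data you actually care about.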



