It's great to see more and more talk of vector search and vector databases. We'v...

wswope · on Dec 14, 2021

That reference/learning page is a great resource!

As for Pinecone itself, what are the main selling points as you see them for a simple application (e.g. comparing trigram-vectorized sets of strings) when compared to a home-rolled solution using postgres with array types? Better performance, ease of indexing, etc.?
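For context, the kind of home-rolled trigram comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what Postgres or Pinecone does internally: the function names, the padding convention, and the choice of Jaccard similarity are all assumptions made for the example.

```python
def trigrams(text: str) -> set[str]:
    """Return the set of character trigrams of a lowercased, padded string.

    Padding with two leading spaces and one trailing space is an assumed
    convention so that short strings still produce trigrams.
    """
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity (shared trigrams / all trigrams) of two strings."""
    ta, tb = trigrams(a), trigrams(b)
    # The padding guarantees at least one trigram per string,
    # so the union is never empty.
    return len(ta & tb) / len(ta | tb)
```

A comparison like this only measures surface overlap between strings: `trigram_similarity("pinecone", "pinecones")` scores much higher than `trigram_similarity("pinecone", "postgres")`, but two differently worded phrases with the same meaning would score low.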

jamesbriggs · on Dec 15, 2021

It will depend on your use-case, but primarily:

(1) Pinecone uses dense vectors, which can encode much more meaningful information, e.g. the actual 'semantic meaning' behind a sentence as we (people) would understand it, or the context in an image. Because of this, we can enable much richer, more human-like interaction and search in your applications.

(2) Performance-wise, before joining Pinecone I spent a lot of time with other dense vector search tools like Faiss, and it isn't easy to get good or even reasonable accuracy and latency, particularly for large datasets. When I first used Pinecone, it took me maybe 10 minutes to figure everything out and start querying a reasonable dataset; search times were very fast and the accuracy incredible. Pinecone's tech is built by people who live and breathe vector search, and what they've built outperforms anything I could build, even if I spent months trying. I got better performance with Pinecone in 10 minutes.

(3) Everything is production-ready: no need to worry about deployment, security, maintenance, etc. Pinecone deals with it, and you can even use the service for free for up to 1M vectors.

gk1 · on Dec 14, 2021

I pinged someone more technical from our team to chime in.

In the meantime I can say moving to the dense vector + ANN search combo turns regular searches into semantic searches, which means more relevant results.

If that's the case for you, then you can use Pinecone to go further and make those results fast (<100ms), fresh (CRUD + live index updates), and filtered (apply single-stage metadata filtering). All on a fully managed system that you can scale up/down with one API call.
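To make "dense vector search with metadata filtering" concrete, here is a toy sketch in plain Python. It is an exact, brute-force cosine search over a tiny in-memory dictionary, not an ANN index and not Pinecone's API; the `search` function, the `where` parameter, the item ids, and the three-dimensional vectors are all hypothetical and exist only for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, top_k=3, where=None):
    """Return up to top_k (id, score) pairs, best first.

    Items whose metadata does not match every key/value in `where`
    are skipped during the same pass that scores the vectors.
    """
    where = where or {}
    scored = []
    for item_id, (vector, metadata) in index.items():
        if any(metadata.get(key) != value for key, value in where.items()):
            continue
        scored.append((cosine(query, vector), item_id))
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Hypothetical toy data: id -> (embedding, metadata).
index = {
    "card-a": ([0.9, 0.1, 0.0], {"type": "creature"}),
    "card-b": ([0.8, 0.2, 0.1], {"type": "instant"}),
    "card-c": ([0.1, 0.9, 0.2], {"type": "creature"}),
}

results = search(index, [1.0, 0.0, 0.0], top_k=2, where={"type": "creature"})
```

Scanning every vector like this is fine for a handful of items but grows linearly with the dataset, which is the cost that approximate nearest neighbor (ANN) indexes are designed to avoid.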

dvaun · on Dec 14, 2021

I've been toying with making a deckbuilder for Magic: The Gathering and could see this being potentially useful for finding fun card combinations. Thanks!

thirdtrigger · on Dec 14, 2021

We are actually discussing this on the Weaviate Slack :-) https://weaviate.slack.com/archives/D02JM9D3HND/p16347312830...

gk1 · on Dec 14, 2021

That would be a fun use case for us to promote. Let me know when it's ready! The free plan supports as many as 1 million items, more than enough for all the MTG cards in existence. Plus you can add and filter by metadata, like card type and properties.

dvaun · on Dec 14, 2021

> Plus you can add and filter by metadata, like card type and properties.

I read through your docs and figure that will be part of the approach.

An idea I had was to find similar, or "next best", cards for replacement in popular decks or to achieve similar effects in order to bring down the cost of EDH, Modern, etc. formats. I'm just getting back into the hobby again, so having a tool like this would make my wife and wallet happy :)

16mb · on Dec 14, 2021

I’ve resorted to playing Modern with high-quality fakes. Otherwise I wouldn’t have the budget. Check out bootlegmtg on reddit.

kruptos · on Dec 14, 2021

I love this idea. I would pay for that service!

nop_slide · on Dec 15, 2021

I just want to chime in and say that the resources on your website look amazing. I spent 5 minutes poking around and it looks really high quality.

I'm dabbling in Postgres's full text search (tsvector) for a small website. I know that is extremely simple compared to the offerings you provide, but your site has me quite interested in this space now.

Eager to learn more about this tech!

gk1 · on Dec 15, 2021

Glad you think so. Makes me want to expand it even more! What would you like to see covered?

nop_slide · on Dec 15, 2021

Maybe this is too simple, but a comparison to Postgres' tsvector and how to do something similar in your service?

indeed30 · on Dec 14, 2021

Does Pinecone have any position on the status of document embeddings and whether they would be considered PII? One of the challenges of using a fully managed service is the headache of adding yet another data subprocessor and all of the legal and compliance questions that raises.

gk1 · on Dec 14, 2021

That depends on the document. We do not see the original document, only the embedding. You can argue that is sufficiently obfuscated to not count as PII. The good news is we are SOC2 compliant and GDPR-friendly and do a bunch of other stuff to help you meet security compliance requirements: https://www.pinecone.io/security/

indeed30 · on Dec 15, 2021

No, I understand that. I guess my question is actually around your experience with "You can argue that is sufficiently obfuscated to not count as PII" and whether your customers are actually successful with this argument.

gk1 · on Dec 15, 2021

Those who need more assurance just look at our SOC2 compliance, or have us go through a security review, or opt for the dedicated-environment deployment option.