In many animals, hearing becomes generally less sensitive during sleep. I think people are often surprised by that, but it works that way to protect the quality of sleep.
Dogs are thought to be an exception, because part of their domestication involved selection for the most alert offspring (watch dogs).
The brain is thought to remain hypersensitive to a certain subset of sounds during sleep, such as babies crying.
White noise is thought to work by drowning out the sounds we are most sensitive to.
The day I discovered that marquee tags have a direction attribute, which lets you make the text scroll up, down, left, or right (and that you can nest multiple of these tags), is still etched in my memory.
We generally tend to engage in in-depth conversations with our users.
But in this case, when you opened the GitHub issue, we noticed that you're part of the Meilisearch team, so we didn't want to spend too much time explaining something in depth to someone who was just doing competitive research, when we could have instead spent that time helping other Typesense users. That's why the response to you might have seemed brief.
For what it’s worth, the approach used in Typesense is called Reciprocal Rank Fusion (RRF) and it’s a well researched topic that has a bunch of academic papers published on it. So it’s best to read those papers to understand the tradeoffs involved.
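To make the idea concrete, here is a minimal sketch of Reciprocal Rank Fusion. This is not Typesense's actual implementation, just the textbook formula: each document's fused score is the sum of 1/(k + rank) over every ranked list it appears in, with a smoothing constant k (commonly 60 in the literature).

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists into one.

    rankings: iterable of ranked lists of document IDs (best first).
    k: smoothing constant; 60 is the value commonly cited in RRF papers.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword search and a semantic search
keyword_results = ["a", "b", "c"]
semantic_results = ["b", "d", "a"]
fused = rrf([keyword_results, semantic_results])
# "b" wins: it ranks highly in both lists
```

Note that RRF only looks at ranks, never at the underlying scores, which is exactly the trade-off being debated in this thread.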
> But in this case, when you opened the GitHub issue, we noticed that you’re part of the Meilisearch team, so we didn’t want to spend too much time explaining something in-depth to someone who was just doing competitive research, when we could have instead spent that time helping other Typesense users. Which is why the response to you might have seemed brief.
Well, in this case I was just a normal user who wanted the best relevance possible and couldn't find a solution.
But the reason I couldn't find one was not that you didn't want to spend more time on my case; it was that Typesense provides no solution to this problem.
> it’s a well researched topic that has a bunch of academic papers published on it. So it’s best to read those papers to understand the tradeoffs involved.
Yeah, cool. In other words: "it's bad, we know it, and we can't help you, but it's the state of the art, so go educate yourself."
But guess what: Meilisearch may need some fine-tuning around your model, etc., but in the end it gives you the tools to build a proper hybrid search that knows the quality of the results before mixing them.
I think this is a good example of why people should disclose their background when commenting on competing products/projects. Even if the intentions were sound, which seems to be the case here, upfront disclosure would have given the conversation more weight and meaning.
We’ve interacted before on Twitter and GitHub, and I want to address your point about Raft in Typesense since you mention it explicitly:
I can confidently say that Raft in Typesense is NOT broken.
We run thousands of clusters on Typesense Cloud, reliably serving close to 2 billion searches per month.
We have airlines using us, a few national retailers with hundreds of physical stores in their POS systems, logistics companies for scheduling, food delivery apps, large entertainment sites, etc. Collectively these are use cases where even an hour of downtime could cause millions of dollars in losses. And we power these reliably on Typesense Cloud, using Raft.
For an n-node cluster, the Raft protocol only guarantees automatic recovery from failures of up to (n-1)/2 nodes. Beyond that, manual intervention is needed. This is by design, to prevent a split-brain situation. It's not a Typesense thing, but a Raft protocol thing.
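The arithmetic behind that guarantee is just integer quorum math: a cluster needs a strict majority of nodes up to elect a leader and commit writes. A quick illustration (generic Raft math, nothing Typesense-specific):

```python
def raft_fault_tolerance(n):
    """Max node failures an n-node Raft cluster recovers from automatically.

    A quorum of n // 2 + 1 nodes must remain up to elect a leader and
    commit writes, so at most (n - 1) // 2 nodes may fail.
    """
    return (n - 1) // 2

# 1-node cluster: 0 failures tolerated
# 3-node cluster: 1 failure tolerated
# 5-node cluster: 2 failures tolerated
for n in (1, 3, 5, 7):
    print(f"{n}-node cluster tolerates {raft_fault_tolerance(n)} failure(s)")
```

This is also why even-sized clusters buy you nothing: a 4-node cluster tolerates the same single failure as a 3-node one, since the quorum grows to 3.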
I'm biased, but I'd recommend exploring Typesense for search.
It's an open source alternative to Algolia + Pinecone, optimized for speed (since it's in-memory) and an out-of-the-box dev experience. E-commerce is also a very common use case I see among our users.
I work on Typesense [1] - historically considered an open source alternative to Algolia.
We then launched vector search in Jan 2023, and just last week we launched the ability to generate embeddings from within Typesense.
You'd just need to send JSON data, and Typesense can generate embeddings for your data using OpenAI, PaLM API, or built-in models like S-BERT, E-5, etc (running on a GPU if you prefer) [2]
You can then do a hybrid (keyword + semantic) search by just sending the search keywords to Typesense, and Typesense will automatically generate embeddings for you internally and return a ranked list of keyword results weaved with semantic results (using Rank Fusion).
You can also combine filtering, faceting, typo tolerance, etc - the things Typesense already had - with semantic search.
For context, we serve over 1.3B searches per month on Typesense Cloud [3]
We store a couple million documents in typesense and the vector store is performing great so far (average search time is a fraction of overall RAG time). Didn’t realise you’ve updated to support creating the embeddings automatically; great news!
This is very difficult for me to understand. Can you explain like I'm an undergrad? What exactly does this mean? What is an embedding? What is the difference between keyword and semantic search?
Let's say your dataset has the words "Oceans are blue" in it.
With keyword search, if someone searches for "Ocean", they'll see that record, since it's a close match. But if they search for "sea" then that record won't be returned.
This is where semantic search comes in. It can automatically deduce semantic / conceptual relationships between words and return a record with "Ocean" even if the search term is "sea", because the two words are conceptually related.
The way semantic search works under the hood is using these things called embeddings, which are just a big array of floating point numbers for each record. It's an alternate way to represent words, in an N-dimensional space created by a machine learning model. Here's more information about embeddings: https://typesense.org/docs/0.25.0/api/vector-search.html#wha...
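A toy example of the idea, with made-up 3-dimensional vectors (real models produce hundreds of dimensions, and the values here are invented purely for illustration): semantic closeness is typically measured as cosine similarity between embedding vectors.

```python
import math

# Fake 3-dimensional "embeddings"; real models emit hundreds of dimensions.
embeddings = {
    "ocean": [0.9, 0.1, 0.2],
    "sea":   [0.85, 0.15, 0.25],
    "car":   [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "sea" scores much closer to "ocean" than "car" does, even though the
# strings share no characters -- that's the basis of semantic search.
ocean_vs_sea = cosine_similarity(embeddings["ocean"], embeddings["sea"])
ocean_vs_car = cosine_similarity(embeddings["ocean"], embeddings["car"])
```

A keyword search would never match "sea" against "ocean"; the embedding comparison does, which is what lets the hybrid search return both kinds of results.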
With the latest release, you essentially don't have to worry about embeddings (except maybe picking one of the model names to use and experimenting), and Typesense will do the semantic search for you by generating embeddings automatically.
In the upcoming version, we've also added the ability to automatically generate embeddings from within Typesense, using either OpenAI, the PaLM API, or a built-in model like S-BERT or E5. So you only have to send JSON and pick a model; Typesense will then do a hybrid vector + keyword search for queries.
I love that the discussions we're having (in public channels) are now automatically indexed and made searchable publicly to any users who are looking for information on Google, etc, even if they're not a part of our Slack community.
I previously used to worry about all the time and effort we were putting into the walled garden of information that Slack was becoming, not to mention their untenable pricing for communities.
I now find myself spending more time writing more detailed answers in Slack, because I know it's going to be available publicly for future searchers.