Hacker News | jeadie's comments

This is exactly what we found. Ingest rates were tough. We partitioned and ran across multiple DuckDB instances too (and wrangled the complexity that came with it).

We ended up building a SQLite + Vortex file alternative for our use case: https://spice.ai/blog/introducing-spice-cayenne-data-acceler...
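A minimal sketch of the partitioning approach mentioned above, using stdlib SQLite as a stand-in (the partition count, schema, and routing-by-key-hash are all illustrative assumptions, not Cayenne's actual design):

```python
import sqlite3

def open_partitions(paths):
    # One connection per partition; each path can be a file or ":memory:".
    conns = [sqlite3.connect(p) for p in paths]
    for con in conns:
        con.execute(
            "CREATE TABLE IF NOT EXISTS events (key TEXT, ts INTEGER, payload TEXT)"
        )
    return conns

def ingest(conns, rows):
    # Route each row to a partition by hashing its key, so writes
    # spread across writers instead of bottlenecking on one instance.
    for key, ts, payload in rows:
        con = conns[hash(key) % len(conns)]
        con.execute("INSERT INTO events VALUES (?, ?, ?)", (key, ts, payload))
    for con in conns:
        con.commit()

def query_all(conns, sql):
    # Fan the query out and merge results -- this merge step is
    # where the wrangled complexity lives in a real system.
    out = []
    for con in conns:
        out.extend(con.execute(sql).fetchall())
    return out
```

The awkward part in practice is the `query_all` side: aggregations and ordering have to be re-done over the merged partial results, which is exactly the complexity a single engine hides.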



Thanks for this, really enjoyed reading it, and it helps validate some of my own thinking.

OT but: You joined in 2019, barely post anything, then suddenly in 2026 your comments are copy pasted LLM output. Why? Why don't you use your own voice and type with your own hands? Notice how all those copy pasta posts were nuked - for good reason - we don't like being insulted.

You joined in 2017, barely post anything, then suddenly in 2025/2026 2/3 of your posts are copy pasted links, 1 of which is dead and another is 10 years old. Why? Why don’t you use your own voice and type with your own hands? Why don’t you post something new and relevant that you made instead of attacking people who are posting entire code repos of interesting technology?

I call it suspicious activity.

touché

We’re building vector indexes into DataFusion for search (starting with S3 Vectors).

Open source at https://github.com/spiceai/spiceai


This is one of the ideas behind using DuckDB in github.com/spiceai/spiceai


That looks like an amazing "swiss army knife"...!


Looks very cool! I will take a look, tysm!



This is a common feature now. If anything, for being so early to vector databases, Pinecone was rather late to integrating embeddings.

Timescale added it most recently, but yes, a bunch of others have it too: Weaviate, Spice AI, Marqo, etc.


A difference between Pinecone and many of the others you listed is that we host both embedding and reranking models in a serverless fashion. You pay for what you use while we manage the entire stack.
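A toy sketch of the retrieve-then-rerank pattern these services manage end to end. The embedding vectors and the term-overlap reranker here are placeholders for illustration, not any vendor's actual models:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, k=10):
    # index: list of (doc_id, vector) pairs; brute-force nearest
    # neighbours stand in for a real ANN index.
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

def rerank(query_text, candidates, docs):
    # Placeholder reranker: re-score the candidate set by term overlap
    # with the query; a hosted service swaps in a cross-encoder here.
    terms = set(query_text.lower().split())
    def score(item):
        return len(terms & set(docs[item[0]].lower().split()))
    return sorted(candidates, key=score, reverse=True)
```

The point of the two-stage design is cost: the cheap vector retrieval narrows millions of documents to a handful, and only that handful pays for the expensive reranking model.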


Do any of the others also handle reranking?


Qdrant does with its ‘Query API’.

https://qdrant.tech/documentation/concepts/hybrid-queries/

And handles embedding creation with its fastembed package.

https://github.com/qdrant/fastembed



I don't know about them, but Manticore does.

https://manticoresearch.com/use-case/vector-search/


Why not just federate Postgres and the Parquet files? That way the query planner can push down as much of the query as possible and reduce how much data has to move around.
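A minimal sketch of the pushdown idea, with stdlib SQLite standing in for both the Postgres source and the Parquet scan (a real federated planner such as DataFusion or DuckDB performs this predicate pushdown automatically):

```python
import sqlite3

def make_source(rows):
    # Stand-in for one backing store (Postgres, a Parquet file, ...).
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, val TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return con

def federated_query(sources, predicate_sql, params=()):
    # The predicate is evaluated inside each source (the "pushdown"),
    # so only qualifying rows cross the wire to the coordinator,
    # which then unions the partial results.
    out = []
    for con in sources:
        out.extend(
            con.execute(f"SELECT id, val FROM t WHERE {predicate_sql}", params).fetchall()
        )
    return out
```

Without the pushdown, the coordinator would pull every row from every source and filter centrally, which is exactly the data movement the comment is trying to avoid.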


This looks functionally similar to using http://github.com/spiceai/spiceai with a PostgreSQL data accelerator.


Spice AI | Senior Software Engineer | GMT+10 (e.g. Australia) through GMT-7 (e.g. Seattle/SF/LA) | Remote | Full Time

Spice AI provides building blocks for data- and AI-driven applications, composing real-time and historical time-series data, high-performance SQL querying, and machine learning training and inference into a single, interconnected AI backend-as-a-service.

We just launched github.com/spiceai/spiceai, a unified SQL query interface and portable runtime to locally materialize, accelerate, and query data tables sourced from any database, data warehouse, or data lake.

We're hiring experienced software engineers, ideally with Rust and/or Golang production experience. We're focused on large data and distributed systems, so experience with these is important too. More details: https://spice.ai/careers#section-open-positions


It says remote, but the open positions are mostly hybrid.


And yes, Iceberg is very high up on our list

