I'd be curious if y'all end up supporting adding filter attributes to the inverted index that can then be pushed down into the posting list traversal.
for example, a restaurant search app may have (1) an embedding for each restaurant but also (2) a cuisine. then, if a restaurant has `cuisine = Italian`, we'd also store its ghost ID in a `cuisine:Italian` posting list.
at query time, the query planner could take a query like `SELECT * FROM t1 WHERE cuisine = 'Italian' ORDER BY DISTANCE(..)` and emit a plan that efficiently intersects the `cuisine:Italian` posting list with the union of the partitions' posting lists.
this feels to me like a potential strength of the inverted indexing approach compared to graph-based approaches, which struggle with general filtering (e.g. the Filtered-DiskANN paper).
I'd be curious if y'all end up supporting adding filter attributes to the inverted index that can then be pushed down into the posting list traversal.
for example, a restaurant search app may have (1) an embedding for each restaurant but also (2) a cuisine. then, if a restaurant has `cuisine = Italian`, we'd also store its ghost ID in a `cuisine:Italian` posting list.
at query time, the query planner could take a query like `SELECT * FROM t1 WHERE cuisine = 'Italian' ORDER BY DISTANCE(..)` and emit a plan that efficiently intersects the `cuisine:Italian` posting list with the union of the partitions' posting lists.
this feels to me like a potential strength of the inverted indexing approach compared to graph-based approaches, which struggle with general filtering (e.g. the Filtered-DiskANN paper).