Thanks for the benchmarks! :)

Indeed, 14GB seems really high for a 400MB Parquet file; that's a 35x multiple on the base file size.

Of course, the data is compressed on disk, but even the uncompressed data isn't that large, so I believe quite a lot of optimisations are still possible.
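
If you want to sanity-check the compressed vs. uncompressed sizes, DuckDB can read them straight out of the Parquet footer. A minimal sketch (the file name is made up, and I'm assuming the parquet_metadata() table function with its total_compressed_size / total_uncompressed_size columns, which recent DuckDB versions expose):

    import duckdb

    con = duckdb.connect()
    # Sum the per-column-chunk sizes recorded in the Parquet footer to compare
    # the on-disk (compressed) size against the uncompressed size.
    compressed_mb, uncompressed_mb = con.execute("""
        SELECT
            sum(total_compressed_size)   / 1024.0 / 1024.0 AS compressed_mb,
            sum(total_uncompressed_size) / 1024.0 / 1024.0 AS uncompressed_mb
        FROM parquet_metadata('data.parquet')
    """).fetchone()
    print(f"compressed: {compressed_mb:.1f} MB, uncompressed: {uncompressed_mb:.1f} MB")

If the uncompressed total is still nowhere near 14GB, the overhead is coming from the query execution rather than the data itself.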



It’s also the aggregation operation itself: if there are many unique groups, the aggregation state can take a lot of memory.

Newer DuckDB versions handle out-of-core operations better. But in general, just because the data fits in memory doesn’t mean the operation will, and, as I said, 8GB is very limited memory, so it will entail spilling to disk.

https://duckdb.org/2024/03/29/external-aggregation.html
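
For reference, the knobs that control this are the memory limit and the temp directory for spill files. A rough sketch (the file name and the group-by columns are hypothetical; memory_limit and temp_directory are the settings DuckDB documents for this):

    import duckdb

    con = duckdb.connect()
    # Cap DuckDB's memory and point it at a scratch directory so the
    # aggregation can spill partitions to disk instead of running out of RAM.
    con.execute("SET memory_limit = '8GB'")
    con.execute("SET temp_directory = '/tmp/duckdb_spill'")

    result = con.execute("""
        SELECT some_key, count(*) AS n, avg(some_value) AS mean_value
        FROM read_parquet('data.parquet')
        GROUP BY some_key
    """).fetchall()

With the limit set, a high-cardinality GROUP BY runs slower but stays within the cap rather than ballooning to 14GB.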



