Snowflake and BigQuery are quite expensive for big datasets.
Databricks Delta Lake has its use cases, but some aspects are rough around the edges: vacuuming is very slow, the design decision to store partitioned data on disk in folders has certain pros and cons, etc.
There are a lot of great products in the data lake space, but lots more innovation is needed going forward.
> Snowflake and BigQuery are quite expensive for big datasets.
This is just flatly untrue: they charge nearly the same cost per TB as object storage, and they store everything in a compressed, columnar format, so they're about as storage-efficient as you can get.
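A quick back-of-envelope calculation illustrates the claim. The per-TB figures below are illustrative assumptions in the ballpark of published list prices, not official quotes; check each vendor's pricing page before relying on them:

```python
# Illustrative storage prices in USD per TB per month (assumed values,
# not official quotes; actual prices vary by region and contract).
S3_STANDARD_PER_TB = 23.0         # ~ $0.023/GB/month
SNOWFLAKE_CAPACITY_PER_TB = 23.0  # pre-purchased capacity storage
BIGQUERY_ACTIVE_PER_TB = 20.0     # ~ $0.02/GB/month active storage

def monthly_storage_cost(tb: float, price_per_tb: float) -> float:
    """Storage cost in USD for `tb` terabytes at a flat per-TB monthly rate."""
    return tb * price_per_tb

# For 100 TB, the warehouses' storage bill lands in the same ballpark as
# raw S3, and columnar compression often shrinks the stored bytes further.
for name, price in [("S3", S3_STANDARD_PER_TB),
                    ("Snowflake", SNOWFLAKE_CAPACITY_PER_TB),
                    ("BigQuery", BIGQUERY_ACTIVE_PER_TB)]:
    print(f"{name}: ${monthly_storage_cost(100, price):,.0f}/month for 100 TB")
```

Under these assumptions the storage-cost gap between the warehouses and plain object storage is small; the real differences show up on the compute side.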
I have heard many people make the same claim, and I can't figure it out. Is there something wrong with my calculator?
For sure, but you're not going to fix that by building your own data lake out of, say, Parquet-on-S3. You'll still pay the cost of compute when you analyze that data, and a well-optimized commercial database system is extremely hard to beat. Even Presto, setting aside the people cost of managing it yourself, can't beat the commercial systems: https://fivetran.com/blog/warehouse-benchmark
That's because Snowflake happens to charge relatively little for backing storage at the moment; as I recall, it's about the same as object storage. Virtual warehouses, on the other hand, are quite expensive, especially if they use a lot of compute.
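The storage-vs-compute split can be made concrete with another rough sketch. The credit price and credits-per-hour figures below are assumptions modeled on Snowflake's published scheme (warehouses burn credits per hour, doubling with each size step); actual rates depend on edition and region:

```python
# Illustrative Snowflake-style pricing (assumed values; check your contract).
CREDIT_PRICE_USD = 3.0        # USD per credit, varies by edition/region
CREDITS_PER_HOUR = {          # credits burned per hour by warehouse size
    "X-Small": 1, "Small": 2, "Medium": 4, "Large": 8, "X-Large": 16,
}
STORAGE_PER_TB = 23.0         # USD per TB per month, assumed

def monthly_compute_cost(size: str, hours_per_day: float, days: int = 30) -> float:
    """Compute cost in USD for a warehouse of `size` running `hours_per_day`."""
    return CREDITS_PER_HOUR[size] * hours_per_day * days * CREDIT_PRICE_USD

# A Large warehouse running 8 hours a day: 8 credits * 8 h * 30 d * $3
compute = monthly_compute_cost("Large", 8)
storage = 100 * STORAGE_PER_TB  # storing 100 TB
print(f"compute ${compute:,.0f}/month vs storage ${storage:,.0f}/month")
```

Even at these modest assumed rates, a part-time mid-size warehouse costs more per month than storing 100 TB, which is why the compute side dominates the bill.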