
We were using 500+ GB of memory at peak and were expecting that to grow. If I remember correctly, we didn't go with Polars because we needed to run custom apply functions on DataFrames. Polars had them, but the function took a tuple (not a DF or dict), which, when you've got 20+ columns, makes for really error-prone code. Dask and Spark both supported a batch transform operation, so the function took a Pandas DataFrame as input and output.
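A minimal sketch of the difference in call styles being described, assuming a toy table; the column names and the `add_total` function are made up for illustration, and the Polars row-wise API shown here is `map_rows` (the newer name for what was `apply` at the time):

    import pandas as pd
    import dask.dataframe as dd
    import polars as pl

    # Hypothetical data just to show the shapes of the two APIs.
    pdf = pd.DataFrame({"price": [10.0, 20.0], "qty": [3, 5], "discount": [0.1, 0.0]})

    # Dask: map_partitions hands the function a whole pandas DataFrame,
    # so columns are referenced by name and the result is another DataFrame.
    def add_total(part: pd.DataFrame) -> pd.DataFrame:
        part = part.copy()
        part["total"] = part["price"] * part["qty"] * (1 - part["discount"])
        return part

    ddf = dd.from_pandas(pdf, npartitions=2)
    result = ddf.map_partitions(add_total).compute()

    # Polars: the row-wise apply hands the function each row as a plain tuple,
    # so fields are addressed by position -- fragile once you have 20+ columns.
    pl_df = pl.from_pandas(pdf)
    totals = pl_df.map_rows(lambda row: row[0] * row[1] * (1 - row[2]))

The Dask version (and Spark's equivalent via pandas UDFs / mapInPandas) keeps column access by name, which is the error-resistance the comment is getting at.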

