Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I generally haven't had a problem with window function performance - I was mostly using MSSQL.

That said, the single biggest performance drain I've seen is people not using the database, and instead pulling data out to process in code. I don't think I've ever seen a case where it was faster to process data outside of the database - though there has also been plenty of times when the pertinent SQL to had to be written "smarter." Think in sets, not iteratively! There's a terrifying amount of loops in a lot of legacy SQL statements.



If you ever want to see some interesting thoughts on this, lookup "Toon Koppelaars thickdb". Granted he works for Oracle, but I think his point is quite well articulated.

Even if you write optimized C code, you can't tend to beat well written DB code because the IO completely overwhelms the processing time. And somewhat counterintuitively, moving processing outside the database tends to increase overall database CPU usage, because you tend to force more IO through the system.

That makes sense, since IO costs more than most cpu processing, and if you calculate in the DB Engine, you don't tend to need as much raw data out, you only need the results.

Basically I want a modern, cost effective mainframe, that uses a reasonable language for the data layer. The earlier suggestions of Python + Postgres or C# + MSSql are nearer the mark, but never quite feel first class.


Another Toon fan here. HE also co-wrote a book called Applied Mathematics for Database Professionals that is superb. It's dry, like a textbook. But the clarity and depth that they have on the underpinnings of how all this works and why it does is excellent.


I think we could be friends. As an appdev turned database/BI dev, there are a lot of intricacies of databases that are just not intuitive and take a while to get down right.

We're getting away from it by either moving to new tech or the tech is improving (adaptive query processing, auto tuning indexes, writable columnstore).

Even when I was an app dev I said you live and die by your database, either it's data model or by not putting processing in the right place. It's an architecture thing and something that (generally) has to be addressed early on.


I was given a report written in T-SQL that had multiple nested loops and a select statement inside the inner loop.

Rewriting it as one SQL query made it 4 orders of magnitude faster.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: