
Justify the dev time to save micro-pennies worth of electricity to me instead.

A typical naive index won't help with my regular-expression-based queries, which an index can't easily accelerate. Or given an in-memory index, you've just increased memory use from O(1) to O(N), and I'll OOM on large files. Perhaps you'll throw a database at the problem, complicating I/O (especially when the data is generated/accessed by third-party tools that aren't database-based), tying me to yet another library's update cycle, and perhaps forcing me to tackle the additional problem of cache invalidation. Perhaps I need reverse lookups, in which case whatever forward indexing I might've pessimistically added ahead of time will be of no help at all.

If it's a 5 second "this is probably the right choice" kneejerk reaction, maybe it's fine. Or if you're Google indexing the internet, I suppose. But I am frequently plagued by shit, buggy, useless indexes that merely distract from the proper Alexandrian solution to the Gordian knot - brute force - wasting more time than they ever saved, for both people and CPUs.
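
A minimal sketch of the trade-off described above, assuming a line-oriented log and a hypothetical exact-match index keyed on the first field (both the sample data and the function names are made up for illustration). The streaming scan holds one line at a time, while the index grows with the input and still cannot answer the regex query:

```python
import re
from typing import Iterable, Iterator


def brute_force_grep(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield matching lines one at a time; memory stays O(1) in input size."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def build_exact_index(lines: Iterable[str]) -> dict[str, list[int]]:
    """Map each line's first field to the line numbers it occurs on; O(N) memory."""
    index: dict[str, list[int]] = {}
    for lineno, line in enumerate(lines):
        fields = line.split()
        if fields:
            index.setdefault(fields[0], []).append(lineno)
    return index


# Hypothetical sample data, standing in for a large file read lazily.
log = [
    "alice login ok",
    "bob login failed",
    "alice logout ok",
    "carol login failed",
]

# The regex query only needs a scan; the exact-match index is no help for it.
failed = list(brute_force_grep(log, r"log(in|out) failed$"))
index = build_exact_index(log)
```

Passing an open file object instead of a list keeps the scan lazy, which is the case where the O(1) versus O(N) difference actually matters.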



> Justify the dev time to save micro-pennies worth of electricity to me instead.

KEY (user_id)

I mean, it's a dozen characters. Do you need to know how fast I type before you run the calculation?


(author)

If you're already using a relational database you should almost certainly set up indexes on your table ids and foreign keys. But that's pretty different from the examples in the post!

I'm not anti-index, I'm anti-"if you ever have a full table scan in production you're doing it wrong".
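
For the relational case mentioned above, a rough sketch using Python's built-in sqlite3 module shows how a foreign-key index changes the query plan. The table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (id INTEGER PRIMARY KEY, user_id INTEGER, at TEXT)"
)


def plan(sql: str) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)


query = "SELECT * FROM logins WHERE user_id = 42"

before = plan(query)  # a full table scan: the detail text starts with SCAN
conn.execute("CREATE INDEX idx_logins_user_id ON logins (user_id)")
after = plan(query)   # a SEARCH that names idx_logins_user_id

print(before)
print(after)
```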


‘If you ever have a non-deliberate full table scan in production you are doing it wrong’ then?


If you ever have non-deliberate anything in production there's a nonzero chance you're doing it wrong.


> Do you need to know how fast I type before you run the calculation?

I'll assume 100WPM, call that two words, billed at $200/hour and call that $0.06, falling under "too cheap to be worth arguing against", which falls under the aforementioned:

>> If it's a 5 second "this is probably the right choice" kneejerk reaction, maybe it's fine.
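
The back-of-envelope figure above can be checked in a few lines, using only the numbers already assumed (two words, 100 WPM, $200/hour). It comes to roughly six or seven cents:

```python
WORDS = 2
WORDS_PER_MINUTE = 100
RATE_PER_HOUR = 200.0  # dollars

# Time to type the words, then the billed cost of that time.
seconds_typing = WORDS / WORDS_PER_MINUTE * 60
cost = RATE_PER_HOUR / 3600 * seconds_typing

print(f"{seconds_typing:.1f} s of typing costs about ${cost:.3f}")
```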

That said, there's a decent chance those 6 cents won't pay for themselves if this is a login on a single user system, with any theoretical benefits of O(...) scaling being drowned out by extra compile times, extra code to load - and I'd be plenty willing to NAK code reviews that merely attempt to replace /etc/passwd and /etc/shadow with this, as the extra time code reviewing the replacement still has negative expected ROI, and it'll be a lot more involved than a mere dozen characters.

Now, maybe we're attempting to centralize login management with Kerberos or something, perhaps with good reasons, perhaps which does something similar under the hood, and we could talk about that and possible positive ROI, despite some actual downtime and teething concerns?


Now document the two words, run the test suite to verify no breakage, commit to source control, and push to production.

Suddenly those two words cost a lot more than $0.06, and that's IF everything goes smoothly.


This falls under the “if we need to test that indexes still work on our MySQL/Postgres we have bigger problems” header.


Depending on your IO pattern, adding an index can make writes quite a bit slower.
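
One way to measure that write overhead for a given workload is a quick timing sketch like the one below, using Python's built-in sqlite3 module. The schema, row count, and key pattern are arbitrary assumptions, and the result depends heavily on the database engine and the I/O pattern, so treat it as a template for measuring rather than as evidence either way:

```python
import sqlite3
import time


def time_inserts(with_index: bool, n_rows: int = 20_000) -> float:
    """Time a bulk insert into a fresh in-memory table, optionally indexed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER)")
    if with_index:
        conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
    # Scatter the keys so index insertions are not purely sequential.
    rows = [((i * 7919) % n_rows,) for i in range(n_rows)]
    start = time.perf_counter()
    with conn:  # one transaction, committed on exit
        conn.executemany("INSERT INTO events (user_id) VALUES (?)", rows)
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


plain = time_inserts(with_index=False)
indexed = time_inserts(with_index=True)
print(f"no index: {plain:.4f}s, with index: {indexed:.4f}s")
```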


First tell me the minimum amount of time typing this would have to take for you to agree it's not worth it and I will try to keep adding things like the time it takes for someone to ask you to do this, for you to start VS Code, find the file, press ctrl-s, deploy the changes, and possibly some estimation of how long it takes a new developer to read and understand this system (please tell me how fast you speak and an agreeable value for how fast the average developer reads as well for this part) vs a simpler one until it goes over that limit.


> Justify the dev time to save micro-pennies worth of electricity to me instead.

The time spent justifying it is longer than the dev time itself. Any semi-experienced engineer will throw basic indexes into their data model without even thinking and cover the most common use cases.

If they never use them… who cares? It took no time to add.


An RDBMS is not the best data store for all data. Sometimes flat files or other formats are the simplest tool for the job, as shown in the article. Adding a database to any of those tools would definitely not be worth the trade-off.


My statement assumed that as a starting point. To be fair, the decision to use a database or other format, given a reasonable amount of experience, is also a fairly quick thought process. By the time I open my editor to write my first line of code I’ve already made my choice.


Sometimes, but I lean towards going with the RDBMS first, and then switching to flat text files if that proves to be a better choice.

Because 90% of the time the RDBMS is.


not sure if this is trolling?


> Justify the dev time

And this is exactly the sentiment that got us where we are.


The mistake in your statement, I think, is assuming there is a course of action that wouldn’t result in a problem somewhere.

It may look a little different, but there is nothing without tradeoffs.

Including indexing everything.



