Justify the dev time to save micro-pennies worth of electricity to me instead.
A typical naive index won't help with my regular-expression-based queries, which aren't easily accelerated by an index. Or given an in-memory index, you've just increased memory use from O(1) to O(N), and I'll OOM on large files. Perhaps you'll throw a database at the problem, complicating I/O (especially when the data is generated/accessed by third-party tools that aren't database-backed), tying me to yet another library's update cycle, and perhaps forcing me to tackle the additional problem of cache invalidation. Perhaps I need reverse lookups, in which case whatever forward indexing I might've pessimistically added ahead of time will be of no help at all.
If it's a 5 second "this is probably the right choice" kneejerk reaction, maybe it's fine. Or if you're Google indexing the internet, I suppose. But I am frequently plagued by shit, buggy, useless indexes that merely distract from the proper Alexandrian solution to the Gordian knot - brute force - wasting more time than they ever saved, for both people and CPUs.
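To make the memory point concrete, here is a minimal sketch, not anyone's actual tool. The names `grep_stream` and `build_prefix_index` and the sample log lines are made up for illustration. The brute-force scan holds one line at a time, while the naive index keyed on the first field grows with the file and still can't answer the regex query.

```python
import io
import re
from typing import Iterable, Iterator

def grep_stream(lines: Iterable[str], pattern: str) -> Iterator[tuple[int, str]]:
    """Brute-force scan: yield (line_number, line) for every regex match.

    Only the current line is held in memory, so extra memory is O(1)
    in the number of lines.
    """
    rx = re.compile(pattern)
    for lineno, line in enumerate(lines, start=1):
        if rx.search(line):
            yield lineno, line.rstrip("\n")

def build_prefix_index(lines: Iterable[str]) -> dict[str, list[int]]:
    """Hypothetical naive index: first whitespace-delimited field -> line numbers.

    Memory grows with the input (O(N)), and it only helps exact lookups
    on that one field.
    """
    index: dict[str, list[int]] = {}
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        key = fields[0] if fields else ""
        index.setdefault(key, []).append(lineno)
    return index

# Made-up sample data standing in for a large file.
sample = "alice 2021 login ok\nbob 2022 login failed\ncarol 2022 logout ok\n"

# The regex query is answered by a single streaming pass.
matches = list(grep_stream(io.StringIO(sample), r"\b2022\b.*ok$"))

# The index answers "which lines start with bob?" but not the query above.
index = build_prefix_index(io.StringIO(sample))
```

With a real file you would pass the open file object instead of `io.StringIO(sample)`; the scan still reads it line by line.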
If you're already using a relational database you should almost certainly set up indexes on your table ids and foreign keys. But that's pretty different from the examples in the post!
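A rough sketch of what that looks like, using Python's built-in `sqlite3` module with a made-up `users`/`orders` schema (the table, column, and index names are assumptions for illustration). In SQLite an `INTEGER PRIMARY KEY` is already the lookup key for the table, but the foreign key column gets no index unless you create one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,   -- the id lookup needs no extra index
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL REFERENCES users(id),
        total_cents INTEGER NOT NULL
    );
    -- The foreign key column is not indexed automatically.
    CREATE INDEX idx_orders_user_id ON orders(user_id);
    """
)

# Ask SQLite how it would run a lookup by foreign key.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = ?", (1,)
).fetchall()

# The fourth column of each plan row is the human-readable detail.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

The printed plan should mention `idx_orders_user_id` rather than a full scan of `orders`; the exact wording varies between SQLite versions.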
I'm not anti-index, I'm anti-"if you ever have a full table scan in production you're doing it wrong".
> Do you need to know how fast I type before you run the calculation?
I'll assume 100WPM, call that two words, billed at $200/hour and call that $0.06, falling under "too cheap to be worth arguing against", which falls under the aforementioned:
>> If it's a 5 second "this is probably the right choice" kneejerk reaction, maybe it's fine.
That said, there's a decent chance those 6 cents won't pay for themselves if this is a login on a single user system, with any theoretical benefits of O(...) scaling being drowned out by extra compile times and extra code to load. I'd also be plenty willing to NAK code reviews that merely attempt to replace /etc/passwd and /etc/shadow with this, as the extra time spent reviewing the replacement still has negative expected ROI, and it'll be a lot more involved than a mere dozen characters.
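For anyone checking the back-of-envelope figure, the arithmetic is roughly the following, using only the numbers stated above (100 WPM, two words, $200/hour). The variable names are mine.

```python
WORDS_PER_MINUTE = 100   # assumed typing speed from the comment above
WORDS_TYPED = 2          # "call that two words"
HOURLY_RATE_USD = 200    # assumed billing rate from the comment above

# Time to type the change, in seconds.
seconds_typing = WORDS_TYPED / WORDS_PER_MINUTE * 60

# Cost of that time at the hourly rate, in dollars.
cost_usd = seconds_typing / 3600 * HOURLY_RATE_USD

print(f"{seconds_typing:.1f} s of typing costs about ${cost_usd:.3f}")
```

That comes to about 1.2 seconds and six to seven cents, so the figure quoted above is in the right ballpark.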
Now, maybe we're attempting to centralize login management with Kerberos or something, perhaps with good reasons, perhaps which does something similar under the hood, and we could talk about that and possible positive ROI, despite some actual downtime and teething concerns?
First tell me the minimum amount of time typing this would have to take for you to agree it's not worth it. Then I will keep adding things until it goes over that limit: the time it takes for someone to ask you to do this, for you to start VS Code, find the file, press ctrl-s, and deploy the changes, and possibly some estimate of how long it takes a new developer to read and understand this system vs a simpler one (for this part, please also tell me how fast you speak and an agreeable value for how fast the average developer reads).
> Justify the dev time to save micro-pennies worth of electricity to me instead.
The time spent justifying it is longer than the dev time itself. Any semi-experienced engineer will throw basic indexes into their data model without even thinking and cover the most common use cases.
If they never use them… who cares? It took no time to add.
An RDBMS is not the best data store for all data. Sometimes flat files or other formats are the simplest tool for the job, as shown in the article. Adding a database to any of those tools would definitely not be worth the trade-off.
My statement assumed that as a starting point. To be fair, the decision to use a database or other format, given a reasonable amount of experience, is also a fairly quick thought process. By the time I open my editor to write my first line of code I’ve already made my choice.