"if you happen to care about app's performance, you will have to carry wstrings around"
If those strings are for the user to read, he reads them a million times slower than even the most ornate re-encoding runs. Sounds like a premature optimization.
Not only that, but the time required to convert from UTF-8 to UTF-16 is negligible in relation to the time required to lay out the glyphs and draw them on screen. Premature optimisation indeed.
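To put a number behind that: the entire boundary conversion is a dozen lines. A minimal sketch (the Utf8ToUtf16 helper name is mine; MultiByteToWideChar and CP_UTF8 are the stock Win32 API; error handling omitted):

    // Sketch: keep UTF-8 internally, convert to UTF-16 only at the
    // Windows API boundary. Error handling omitted for brevity.
    #include <windows.h>
    #include <string>

    std::wstring Utf8ToUtf16(const std::string& utf8) {
        if (utf8.empty()) return std::wstring();
        // First call: ask for the required length in wchar_t units.
        int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                      (int)utf8.size(), nullptr, 0);
        std::wstring utf16(len, L'\0');
        // Second call: do the actual conversion.
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                            (int)utf8.size(), &utf16[0], len);
        return utf16;
    }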
It's not a premature optimization. It's a manifestation of a different set of coding ethics which is just ... err ... less wasteful and generally more thoughtful.
Yup. I wish this ethic were more popular. I can understand that we "waste" countless cycles to support abstraction layers that help us code faster and with fewer bugs. But I think our programs could still be an order of magnitude faster (and/or burn less coal) if people thought a little bit more and coded a little bit slower. The disregard people have for writing fast code is terrifying.
Or maybe it's just me who is weird. I grew up on gamedev, so I feel bad writing something obviously slow that could be sped up with 15 more minutes of thinking/coding.
Yeah, I'll have to disagree with both of you. The "coding ethics" that wants to optimize for speed everywhere is the wasteful and thoughtless one.
Computers are fast, you don't have to coddle them. Never do any kind of optimization that reduces readability without concrete proof that it will actually make a difference.
Fifteen minutes spent optimizing code that takes up 0.1% of a program's run time is fifteen wasted minutes that probably made your program worse.
Additionally: "Even good programmers are very good at constructing performance arguments that end up being wrong, so the best programmers prefer profilers and test cases to speculation."(Martin Fowler)
> Computers are fast, you don't have to coddle them
This mentality is exactly why Windows feels sluggish compared to Linux on the same hardware. Being careless with the code and unceremoniously relying on spare (and frequently just assumed) hardware capacity is certainly a way to do things. I'm sure it makes a lot of business sense, but is it good engineering? It's not.
Neither is optimization for its own sake; it's just a different (and worse) form of carelessness and bad engineering.
Making code efficient is not a virtue in its own right. If you want performance, set measurable goals and optimize the parts of the code that actually help you achieve those goals. Compulsively optimizing everything will just waste a lot of time, lead to unmaintainable code and quite often not actually yield good performance, because bottlenecks can (and often do) hide in places where bytes-and-cycles OCD overlooks them.
I think we are talking about different optimizations here. I'm referring to the "think and use qsort over bubblesort" kind of thing, while you seem to be referring to hand-tuned inline-assembly optimizations.
My point is that the "hardware can handle it" mantra is a tell-tale sign of a developer who is more concerned with his own comfort than anything else. It's someone who's content with not pushing himself, and that's just wrong.
--
(edit) While I'm here, do you know how to get uptime on Linux?
    cat /proc/uptime
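Or from a program; a trivial sketch:

    // Linux uptime: one file, two numbers (seconds up, seconds idle).
    #include <fstream>
    #include <iostream>

    int main() {
        std::ifstream f("/proc/uptime");
        double up = 0, idle = 0;
        f >> up >> idle;
        std::cout << up << " seconds\n";
    }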
Do you know how to get uptime on Windows? WMI. It's absolutely f#cking insane that I need to initialize COM, instantiate an object, grant it the required privileges, and set up proxy impersonation just so I can send an RPC request to a system service (which may or may not be running, in which case it takes 3-5 seconds to start) that then talks to something else in Windows' guts on my behalf and replies with a COM variant containing the answer. That's several megs of memory, 3-4 non-trivial external dependencies and a second of run-time to get the uptime.
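For anyone who hasn't been through it, here's the dance, condensed from the canonical WMI query boilerplate (all the calls are the real API; error handling elided):

    // The WMI route, condensed. Every step below is mandatory boilerplate.
    #define _WIN32_DCOM
    #include <comdef.h>
    #include <Wbemidl.h>
    #include <stdio.h>
    #pragma comment(lib, "wbemuuid.lib")

    int main() {
        CoInitializeEx(0, COINIT_MULTITHREADED);                     // init COM
        CoInitializeSecurity(NULL, -1, NULL, NULL,                   // process security
            RPC_C_AUTHN_LEVEL_DEFAULT, RPC_C_IMP_LEVEL_IMPERSONATE,
            NULL, EOAC_NONE, NULL);
        IWbemLocator* loc = NULL;                                    // locator object
        CoCreateInstance(CLSID_WbemLocator, 0, CLSCTX_INPROC_SERVER,
            IID_IWbemLocator, (LPVOID*)&loc);
        IWbemServices* svc = NULL;                                   // connect to the service
        loc->ConnectServer(_bstr_t(L"ROOT\\CIMV2"), NULL, NULL, 0, NULL, 0, 0, &svc);
        CoSetProxyBlanket(svc, RPC_C_AUTHN_WINNT, RPC_C_AUTHZ_NONE,  // proxy impersonation
            NULL, RPC_C_AUTHN_LEVEL_CALL, RPC_C_IMP_LEVEL_IMPERSONATE,
            NULL, EOAC_NONE);
        IEnumWbemClassObject* en = NULL;                             // the actual query
        svc->ExecQuery(_bstr_t(L"WQL"),
            _bstr_t(L"SELECT LastBootUpTime FROM Win32_OperatingSystem"),
            WBEM_FLAG_FORWARD_ONLY | WBEM_FLAG_RETURN_IMMEDIATELY, NULL, &en);
        IWbemClassObject* obj = NULL;
        ULONG ret = 0;
        en->Next(WBEM_INFINITE, 1, &obj, &ret);                      // fetch one row
        VARIANT v;
        obj->Get(L"LastBootUpTime", 0, &v, 0, 0);                    // unpack the variant
        // ...and LastBootUpTime is a CIM datetime string that you still
        // have to parse and subtract from "now" yourself.
        wprintf(L"%s\n", v.bstrVal);
        VariantClear(&v);
        obj->Release(); en->Release(); svc->Release(); loc->Release();
        CoUninitialize();
    }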
Can you guess why I bring this up?
Because that's exactly the kind of mess that spawns from the "oh, it's not a big overhead" assumption. Little by little the crap accumulates, solidifies, and you end up with a massive pile of shitty, negligent code that is impossible to improve or refactor. All because of that one little assumption.
I agree that optimization for its own sake is not a good thing (though a tempting one for some, including me), but there's a difference between prematurely optimizing and just careless cowboy-coding. Sometimes two minutes of thinking and a few different characters are enough to speed code up by an order of magnitude (e.g. by choosing the proper type or data structure).
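A toy sketch of the "few different characters" kind (the contains functions are made up):

    // Same membership test, two containers. For large n the second is
    // often an order of magnitude faster, and the change amounts to a
    // few characters at the declaration site.
    #include <algorithm>
    #include <unordered_set>
    #include <vector>

    bool contains(const std::vector<int>& v, int x) {        // O(n) per lookup
        return std::find(v.begin(), v.end(), x) != v.end();
    }

    bool contains(const std::unordered_set<int>& s, int x) { // O(1) average
        return s.count(x) != 0;
    }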
Also, being aware of the different ways code can be slow (from things dependent on your programming language of choice down to low-level stuff like page faults and cache misses) can make you produce faster code by default, because the optimized code becomes the intuitive one for you.
Still, I think there's a gap between "fast enough and doesn't suck" and "customers angry enough to warrant optimization". It's especially visible in the smartphone market, where the cheaper phones sometimes can't even handle their own operating system, never mind the bloated apps. For me it's one of the problems with businesses: there's no good way to incentivize them to stop producing barely-good-enough crap and deliver something of decent quality.
For display purposes, UTF-8 vs. UTF-16 is going to be such a minuscule difference that it's not worth the potential portability bugs to try to optimize for speed. You're talking about at most 30,000 characters of text on screen at once. If all of that is stored in UTF-8, rendered in UTF-16, and the conversion takes an insane 100 cycles per character on average, that's 3,000,000 cycles, about a millisecond on a 3 GHz core; even re-converting the whole screen every second uses less than 0.1% of a single core of a modern desktop CPU.
If you got into the 1%+ range, I could see justifying some attention to speed, but otherwise...
Less wasteful of computer time, but more wasteful of developer time. And, given that the comment is advocating a more complex strategy for using strings with different encodings rather than the simple one given in the story, probably more error-prone too.
The advocated strategy is simpler: UTF-8 strings are a lot easier to handle than UCS-2. What is complex is that the Windows API is inconsistent and oriented more toward UCS-2.
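For example, counting code points in UTF-8 needs no surrogate-pair logic at all; a toy sketch (count_codepoints is a made-up helper):

    // Continuation bytes in UTF-8 are exactly those of the form 10xxxxxx,
    // so counting code points means counting everything else. No surrogate
    // pairs to pair up, unlike UTF-16.
    #include <cstddef>
    #include <string>

    std::size_t count_codepoints(const std::string& utf8) {
        std::size_t n = 0;
        for (unsigned char c : utf8)
            if ((c & 0xC0) != 0x80) ++n;
        return n;
    }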