Hacker News
Coding Horror: The Sad Tragedy of Micro-Optimization Theater (codinghorror.com)
20 points by twampss on Jan 30, 2009 | hide | past | favorite | 29 comments


Jeff is not quite on the ball with his reference to the "Shlemiel" algorithm.

Joel originally described the "Shlemiel" case when referring to strings in C not storing their length, so that strlen has to walk the string to find it. The Shlemiel problem comes up when you use strcat in a loop, since each strcat call needs to recalculate the length of the destination string, which takes longer as the string grows.
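A minimal C sketch of the difference (my code, not Joel's): the slow version lets strcat re-scan the destination every iteration, the fast one just remembers where the string ends.

```c
#include <string.h>

/* Shlemiel version: strcat walks dest from the start on every call,
   so building a string from n pieces costs O(n^2) character reads. */
void build_slow(char *dest, const char *pieces[], int n) {
    dest[0] = '\0';
    for (int i = 0; i < n; i++)
        strcat(dest, pieces[i]);   /* re-scans everything appended so far */
}

/* Fix: track the current end of the string, so each append is O(piece). */
void build_fast(char *dest, const char *pieces[], int n) {
    char *end = dest;
    for (int i = 0; i < n; i++) {
        size_t len = strlen(pieces[i]);
        memcpy(end, pieces[i], len);
        end += len;
    }
    *end = '\0';
}
```

Both produce the same string; only the amount of re-walking differs.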

The problem with concatenating strings is not a length issue, but a memory issue. To concatenate strings in C# or C++ using the built-in string classes, a location for the new resulting string has to be allocated and the two strings have to be copied into it. The allocation does not take a trivial amount of time, and reallocating memory over and over again each time you concatenate a string is bad practice. There is still a bit of "Shlemiel", since you have to keep copying your existing string, which takes longer as the string grows, but you also have to take into consideration the allocation, which wasn't present in the original problem the "Shlemiel" story described.


Memory allocation is typically O(1), or O(log n). The constants attached are fairly large, but doing lots of allocations/deallocations isn't a big-O issue.

The Schlemiel problem still exists when doing '+=' in a loop. Each copy requires some code to walk the length of both sides of the operator, same as strlen in Joel's example. The loop is an O(n^2) operation, even if it has a smaller constant on it than a memcpy(dest, src, strlen(src)).

It's still "Schlemiel", just with different constants attached. For big enough N, the O(1) or O(log n) of the allocations loses importance, making the entire thing O(n^2), just like "Schlemiel".
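This is exactly what a StringBuilder avoids: grow the buffer geometrically and append in place, so n appended bytes cost O(n) copies total (amortized) instead of O(n^2). A minimal C sketch of the idea (names and growth factor are mine):

```c
#include <stdlib.h>
#include <string.h>

/* StringBuilder-style growable buffer: capacity doubles when full,
   so appends are O(1) amortized rather than O(current length). */
typedef struct { char *buf; size_t len, cap; } sb;

void sb_init(sb *s) {
    s->cap = 16;
    s->len = 0;
    s->buf = malloc(s->cap);
    s->buf[0] = '\0';
}

void sb_append(sb *s, const char *str) {
    size_t n = strlen(str);
    while (s->len + n + 1 > s->cap) {
        s->cap *= 2;                      /* geometric growth bounds total copying */
        s->buf = realloc(s->buf, s->cap);
    }
    memcpy(s->buf + s->len, str, n + 1);  /* +1 copies the terminator */
    s->len += n;
}
```

The occasional realloc still copies the whole buffer, but because capacity doubles each time, those copies sum to O(n) overall.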


So we may be saying the same thing, but he didn't say allocation was a big-O problem --- and allocation/deallocation is really expensive; it's consistently at the top of profiles of programs I work on.


... allocation/deallocation is really expensive ...

The cost of allocating memory in .NET (and probably other GC environments) is basically equivalent to the cost of incrementing a pointer, which is to say, basically free. The reason is that the .NET garbage collector periodically "packs together" objects in memory, so the memory allocation algorithm essentially becomes:

  result = _freeMemoryPtr;
  _freeMemoryPtr += numBytesToAllocate;
  return result;
... in other words, since objects in memory are periodically "packed together" by the GC, there isn't a lot of memory fragmentation, so the allocator can simply allocate from the end of the heap. It doesn't need to try to find a "large enough free block" within fragmented memory, because memory isn't fragmented.
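As a runnable version of that pseudocode, here is a toy bump allocator in C, assuming a compacted heap like the comment describes (the fixed-size static heap and 8-byte alignment are my simplifications):

```c
#include <stddef.h>

/* Toy bump allocator: with no fragmentation, allocation is just a
   pointer increment plus a bounds check. */
static unsigned char heap[1 << 16];
static size_t free_ptr = 0;

void *bump_alloc(size_t num_bytes) {
    size_t aligned = (num_bytes + 7) & ~(size_t)7;  /* round up to 8 bytes */
    if (free_ptr + aligned > sizeof heap)
        return NULL;                                /* a real GC would collect here */
    void *result = heap + free_ptr;
    free_ptr += aligned;
    return result;
}
```

The missing half, of course, is the collector that compacts live objects and resets free_ptr; that's where the real cost of a GC heap lives.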

At least, that is my understanding. I could be wrong because I haven't investigated this thoroughly yet.

http://en.wikipedia.org/wiki/Garbage_collection_(computer_sc...


You're right; I should have said, "malloc/free is really expensive".


Just curious, what do you do? It's interesting that you optimize a lot of C/C++ programs... game development maybe?


Network security software, reverse engineering, and (a bit earlier in my career) streaming video/multicast.

I got a 7x performance improvement in a performance-critical component at Arbor Networks within a couple weeks of starting there just by profiling, finding malloc/free at the top of the profile, and switching to an arena allocator. I have other malloc stories, but that's the simplest. I've learned to fear malloc.
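For anyone unfamiliar, the arena trick is: allocate many objects with a pointer bump out of one big block, then free them all at once by resetting the arena, instead of paying malloc/free per object. A minimal sketch (my code, obviously not the actual Arbor Networks component):

```c
#include <stdlib.h>

/* Minimal arena: many allocations, one free. No per-allocation
   bookkeeping, and teardown is a single pointer reset. */
typedef struct { unsigned char *base; size_t used, cap; } arena;

arena arena_new(size_t cap) {
    arena a = { malloc(cap), 0, cap };
    return a;
}

void *arena_alloc(arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;        /* keep allocations 8-byte aligned */
    if (a->used + n > a->cap)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_reset(arena *a) { a->used = 0; }   /* "free" everything at once */
```

This fits workloads where a batch of objects shares one lifetime (e.g. everything allocated while processing one packet or one request), which is exactly where per-object malloc/free shows up in profiles.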


I'd like to hear the other malloc/free stories whenever you're inclined.


My guess is something cryptographic.


cperciva is the resident crypto person here. =)


The reference is entirely valid. Both Jeff and Joel are using the joke allegorically; Joel uses it to describe an algorithm that counts the length of the first part of the string over and over, while Jeff uses it to describe an algorithm that copies the first part of the string over and over. (The joke itself is an old one, not original with Joel.)


I'm unclear what point Jeff had in mind...

A micro-summary to illustrate:

1. I love strings.

2. You don't want to concatenate strings in (large) loops.

3. You already knew that, at least if you're a decent programmer.

4. Analysis of various methods of concatenating strings in loops.

5. Reveal that the methods don't matter.

6. It's bad to worry about micro-optimizations like the best way to concatenate strings in a loop.

7. Just avoid obvious mistakes like concatenating strings in loops.

8. Otherwise, don't worry about micro-optimization. Instead, focus on your real goal: writing better code.

Maybe it's me, but this article felt disjointed.


Well, the real fastest thing to do is to stream your page out: arrange it so you don't have to construct it out of order, so you never "concatenate" strings at all, and ship each chunk straight to the socket through a small I/O buffer. Not only will this take less time in all but very pathological cases by avoiding string concatenation penalties, it'll also get the page to the user sooner so they can progressively render it.
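The shape of that is just a buffered writer over the socket; something like this C sketch, where write() to a file descriptor stands in for the socket (names and buffer size are mine):

```c
#include <unistd.h>
#include <string.h>

/* Stream page chunks in order through a small I/O buffer,
   never building one big page string. */
typedef struct { int fd; char buf[4096]; size_t len; } out_stream;

void out_flush(out_stream *o) {
    if (o->len) {
        write(o->fd, o->buf, o->len);   /* ship the buffered chunk to the socket */
        o->len = 0;
    }
}

void out_emit(out_stream *o, const char *chunk) {
    size_t n = strlen(chunk);
    if (o->len + n > sizeof o->buf)
        out_flush(o);                    /* buffer full: send what we have */
    if (n > sizeof o->buf) {             /* oversized chunk: bypass the buffer */
        write(o->fd, chunk, n);
        return;
    }
    memcpy(o->buf + o->len, chunk, n);
    o->len += n;
}
```

Each template fragment goes out as soon as the buffer fills, so the browser can start rendering long before the last fragment is generated.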

But hardly any web framework I've ever used will even permit this, all insisting on being handed one big string to ship out to the user. Sigh.


You mean like, Rails helpers and "concat"?


Meaning "no true framework" (http://en.wikipedia.org/wiki/No_true_Scotsman) allows you to stream results out piecemeal.


Beats me, I've never used Rails. I did qualify it with "I've used" and "hardly any". I've used a number, but nobody's come even close to using them all.


Nope, Rails will still buffer up the whole template render before it sends it down the socket. I think he means progressive rendering directly to a socket rather than buffering it before shipping.


Am I just being slow, or has he completely missed the point of avoiding concatenation in loops? Concatenating small strings inside loops is fine, it's a tiny cost, but concatenating the results of those 100,000 iterations is completely different.


The Schlemiel problem is a big-O problem, and big-O analysis is really only useful if your N is large enough. If you always work with data sets that are small (and have confidence that they will remain small), there is no point in doing big-O analysis or worrying about bad things like O(N^2).


I wasn't talking about the length of the iterations but about what their result is. He is just concatenating tiny strings, then apparently throwing the result away; if he used the first example in an iteration of 100,000 then I don't think the results would be negligible.


lol, someone else picked up on it and wrote the proper benchmark for it:

8678ms for concatenation vs 19ms for string building


I think that is his point. Since he just concatenates a handful of strings, the difference between the methods is negligible, even if he runs it 100,000 times.


Can anyone explain what he means by "writing better code"?

It seems like he sets up a straw man string concatenation problem and then tears it down in a pointless argument.


I would guess that better code means more readable and maintainable code.


In the examples that he gives, the main burden is in creating objects, not in concatenating strings. As long as there are just a small number of += operations, the ratio between object creation and concatenation remains similar and there will be no big difference in the long run.


Is Jeff really making HTML strings in his methods? Aren't there template systems for .NET?


Yes, but they get very ugly when you have a lot of control structures. Jeff prefers to move these into code, see here: http://twitter.com/codinghorror/statuses/1154609179


Jeff is presumably choosing to build his reusable HTML widgets in functions, which is one valid way to do things.

Here's a good overview of all options available to ASP.NET MVC: http://www.lostechies.com/blogs/jimmy_bogard/archive/2009/01...


OT: This is why I keep wishing that languages would grow up when it comes to strings and use data structures with better big-O behavior, like ropes, saving everybody a lot of headache.
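For the curious: a rope stores a string as a tree of fragments, so concatenation allocates one node instead of copying either operand; the characters only get walked when you flatten. A toy C sketch (leaks memory, no balancing; real rope implementations rebalance the tree):

```c
#include <stdlib.h>
#include <string.h>

/* Rope sketch: concat is O(1) node creation, flattening is one O(n) pass. */
typedef struct rope {
    const char *leaf;          /* non-NULL for leaf nodes */
    struct rope *left, *right; /* children for concat nodes */
    size_t len;
} rope;

rope *rope_leaf(const char *s) {
    rope *r = calloc(1, sizeof *r);
    r->leaf = s;
    r->len = strlen(s);
    return r;
}

rope *rope_concat(rope *a, rope *b) {   /* O(1): no characters copied */
    rope *r = calloc(1, sizeof *r);
    r->left = a;
    r->right = b;
    r->len = a->len + b->len;
    return r;
}

static char *flatten_into(const rope *r, char *out) {
    if (r->leaf) {
        memcpy(out, r->leaf, r->len);
        return out + r->len;
    }
    return flatten_into(r->right, flatten_into(r->left, out));
}

char *rope_flatten(const rope *r) {     /* the only O(n) copy */
    char *s = malloc(r->len + 1);
    *flatten_into(r, s) = '\0';
    return s;
}
```

Building a string from n pieces this way is n cheap node allocations plus one final copy, instead of n increasingly expensive copies.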



