I remember seeing this in Game Developer magazine a while ago -- great article. The last one is my favorite. Here's a similar one I had to deal with once:
Once I was profiling some loading code that had performance issues. I had found a function that recursively iterated over every entity in the world to find a specific object. Fine in single-player, sure. But multiplayer was a different story. Anyway, I brought in the programmer who wrote that function and showed him the profile, which looked something like this:
99.9999% FindUnit()
[...]
[...]
[...]
0.01% malloc()
He looked at the profile and immediately spotted a malloc near the bottom of the list. "Ah-ha! Why are we doing a malloc?! That's the problem!" he exclaimed. "Look at the top of the profile. See that recursive function there?" I replied. But to no avail. You see, several of the programmers on that project were die-hard, to-the-metal coders who detested dynamic memory allocation, and now all the focus turned to why we had this malloc on player load.
Well, it was there for a pretty good reason: we needed to allocate something like 1.7 MB to store all of the player's possible data. "Put it on the stack," he said. Exasperated, I responded, "But that's got nothing to do with the performance issue here, and we only have a 1 MB stack anyway!"
Ultimately? We put it on the stack, bumped the stack size to 2 MB, and I rewrote that FindUnit() function.
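The story doesn't say what the rewritten FindUnit() looked like, so purely as an illustration, here is a minimal sketch of one common fix for this kind of lookup: build an ID-to-entity index once, then answer each query from the index instead of walking the whole world. Everything here is an assumption, not the original code: the `Entity` struct, the integer IDs, `FindUnitRecursive`, and the `UnitIndex` class are hypothetical names.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical entity type; the real game's data structures are not shown.
struct Entity {
    std::uint32_t id = 0;
    std::vector<std::unique_ptr<Entity>> children;
};

// The slow shape described in the story: a recursive walk over every
// entity for each lookup, so each call costs O(n) in the entity count.
Entity* FindUnitRecursive(Entity& node, std::uint32_t id) {
    if (node.id == id) return &node;
    for (auto& child : node.children) {
        if (Entity* hit = FindUnitRecursive(*child, id)) return hit;
    }
    return nullptr;
}

// One possible replacement: walk the world once to build an index,
// then serve each lookup in O(1) on average.
class UnitIndex {
public:
    explicit UnitIndex(Entity& root) { Add(root); }

    // Returns the entity with the given id, or nullptr if there is none.
    Entity* FindUnit(std::uint32_t id) const {
        auto it = byId_.find(id);
        return it == byId_.end() ? nullptr : it->second;
    }

private:
    // Registers a node and all of its descendants. If two entities share
    // an id, the first one visited wins, matching the recursive search.
    void Add(Entity& node) {
        byId_.emplace(node.id, &node);
        for (auto& child : node.children) Add(*child);
    }

    std::unordered_map<std::uint32_t, Entity*> byId_;
};
```

The index stores raw pointers, so it would need to be rebuilt or updated whenever entities are created or destroyed. Whether that trade-off pays off depends on how often lookups happen relative to world changes.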