Dirty Coding Tricks

I remember seeing this in Game Developer magazine a while ago -- great article. The last one is my favorite. Here's a similar one I had to deal with once:

Once I was profiling some loading code that had performance issues. I had found a function that recursively iterated over every entity in the world to find a specific object. Fine in single player, sure. But multi-player was a different issue. Anyway, I brought in the programmer who wrote the particular function and showed him the profile...which looked something like this.

99.9999% FindUnit()
[...]
[...]
[...]
0.01% malloc()

He looked at the profile and immediately spotted a malloc near the bottom of the list. "Ah-ha! Why are we doing a malloc?! That's the problem!" he exclaimed. "Look at the top of the profile, see that recursive function there?" I replied. But to no avail. You see, several of the programmers on that project were die-hard, to-the-metal coders who detested dynamic memory allocation, and now all the focus turned to why we had this malloc on player load.

Well, it was there for a pretty good reason: we needed to allocate something like 1.7 MB to store all the player's possible data. "Put it on the stack," he said. Exasperated, I responded, "But that's got nothing to do with the performance issue here, and we only have a 1 MB stack anyway!"

Ultimately? We put it on the stack, bumped the stack size to 2 MB, and I rewrote that FindUnit() function. :)
 
Dirty coding hack...
Code:
union
{
   long i;
   float f;
}flong
...apparently. :cry:
 
Naaah, put in a semicolon or two and that is standard compliant! It's the regular
Code:
*(float*)&i;
which is bad...

(I probably totally missed the joke...)
No, it was no joke. :( The union method is also technically unreliable. It works in gcc but there is no guarantee that compiler "Y" will run it correctly. In effect the C standard states that reading from a member of a union is undefined unless it was the last one written to. It's meant to allow more aggressive optimisations (i.e. different register types) to be used by the compiler.

Apparently, the only thing that is legal is to retype it to an array of char (and maybe use memcpy). If I get a chance, I'll see if I can find the relevant discussion.

I clearly didn't get the joke, not having any idea what "apparently" refers to, but I have a feeling Simon spends too much time programming dynamic languages.
Dynamic?! Err no. I'm pretty much stuck in "c land" except maybe for the odd bit of maple or awk. I haven't even done C++ for years.
 
To my understanding, C99 makes memcpy the only legal way to get this to work as intended.
 
Speaking of dirty coding tricks in general (not just last-minute hacks), I have fond memories of this one:
Code:
#define OFFSET_OF( _class, _var )   (((uintptr)((char*) &((_class*)1)->_var))-1)
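
For anyone wondering what that macro computes, here is a minimal sketch that compares it against the standard `offsetof` from `<stddef.h>`. The `struct sample` type and `check_offsets` helper are hypothetical, and the non-standard `uintptr` from the post is assumed to mean `uintptr_t`. The macro pretends an object lives at address 1, takes the address of the member, and subtracts the 1 again; the standard does not guarantee this works, it merely tends to on common compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The macro from the post; `uintptr` replaced by uintptr_t (assumption). */
#define OFFSET_OF(_class, _var) \
    (((uintptr_t)((char *)&((_class *)1)->_var)) - 1)

/* Hypothetical struct used only for illustration. */
struct sample {
    char   tag;
    double value;
    int    count;
};

/* Checks that the hand-rolled macro agrees with the standard offsetof. */
static void check_offsets(void)
{
    assert(OFFSET_OF(struct sample, tag) == offsetof(struct sample, tag));
    assert(OFFSET_OF(struct sample, value) == offsetof(struct sample, value));
    assert(OFFSET_OF(struct sample, count) == offsetof(struct sample, count));
}
```

In new code the standard `offsetof` is the safer choice, since it is the only spelling the language actually defines.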
 
Weird. I've only touched unions once, during a lab assignment in school. Our prof had some exercise to specifically show us unions. I didn't realize they were kind of sketchy, compiler wise. I'm glad I've never used them since.
 
If you use them according to the spec you shouldn't have trouble; it's just that I haven't really seen many people use unions for their intended purpose.
 
Yeah, memcpy is the only legal way to perform type punning.
In a perfect world, this would work (and output 0x3F800000):
Code:
	float xf=1.0f;
	unsigned int xu=reinterpret_cast<unsigned int>(xf);
	printf("0x%08X\n",xu);
Alas ... at least not in GCC4.
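
A minimal sketch of the memcpy approach mentioned above, written in C. The helper names `float_bits` and `bits_to_float` are hypothetical, and the code assumes `float` is a 32-bit IEEE 754 value, which is what the 0x3F800000 in the post implies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float as a 32-bit unsigned integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);  /* assumption: float is 32 bits wide */
    memcpy(&u, &f, sizeof u);      /* copy the bytes, no pointer cast involved */
    return u;
}

/* Inverse direction: rebuild a float from its 32-bit pattern. */
static float bits_to_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With these, `printf("0x%08X\n", (unsigned)float_bits(1.0f));` should print 0x3F800000 on an IEEE 754 platform. Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, so it usually costs nothing at runtime.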
 
I use that one :

Code:
	float fValue = 1.0f;
	uint32 uiFloat = reinterpret_cast<uint32&>( fValue );

But as mentioned earlier, the spec only guarantees copying to a char array :(
(That said, both *(float*)&i; and reinterpret_cast<uint32&>( fValue ); have worked on MSC since at least the first VS.net, and I suspect they also work on most compilers.)
 
Yes. This is one of the weird things with C/C++. Type-punning by casting pointer types is so rampant that no compiler dares break it, even though it is against the spec. :???:
 
Actually, I have seen GCC break such code. The specific code that broke for me was this:
Code:
static void *pool_alloc( pool *p )
    {
    void *pf = p->first_free_elem;
    if(!pf)
        {
        link_new_subpool(p);
        pf = p->first_free_elem;
        }
    void **pf2 = (void **)pf;
    p->first_free_elem = *pf2;
    return pf;
    }
This function is basically an allocator function for a memory pool, where each free block has stored at its beginning a pointer to the next free block. Now, this function may look unproblematic, but consider what happens when it is inlined:
  • 'pf2' has a different pointer type than the pointer returned from the function. It also has a different pointer type than anything I would want to cast the return value to.
  • For this reason, GCC concluded that 'pf2' did not alias with anything.
  • However, the caller of this function would, as its very first action, overwrite the beginning of the block.
  • Due to optimization and the apparent absence of aliasing, GCC then figured out that it would be perfectly fine to put that overwrite BEFORE the 'p->first_free_elem = *pf2;' line.
  • Result: p->first_free_elem is filled with something that is not a meaningful pointer at all. On the next allocation: KABOOM!

Took me about a day to find out what was happening. The workaround I ended up with was this abomination:
Code:
static void *pool_alloc( pool *p )
    {
    void *pf = p->first_free_elem;
    if(!pf)
        {
        link_new_subpool(p);
        pf = p->first_free_elem;
        }
    // pointer-to-volatile-pointer, in order to block compiler optimization
    void * volatile *pf2 = (void **)pf;
    p->first_free_elem = *pf2;

    // overwrite beginning of block, then read it again
    // in order to produce an artificial data dependency between the block's
    // content and the allocator's return value. This does NOT work if 
    // 'pf2' is not declared as a pointer-to-volatile-pointer.
    *pf2 = pf;
    return *pf2; // YA RLY!
    }

Apparently, this issue appears with all GCC versions from 3.4.1 and up, when using -O2 or higher optimization settings. It can be worked around with the "-fno-strict-aliasing" command-line switch, but I didn't want to do that. There probably exist better workarounds, but I was unable to find any at the time.

Mike Acton has a fairly good article explaining this issue at http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
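
One possible alternative to the volatile workaround, sketched under assumptions: read and write the free-list link with memcpy, so the link is never accessed through a `void **`. The `pool` struct is reduced to a single field here, and `pool_init` and `pool_free` are hypothetical helpers added so the sketch stands alone; the real pool and `link_new_subpool()` are not shown in the post. This has not been tried against the GCC versions mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal pool: only the free-list head from the post. */
typedef struct pool {
    void *first_free_elem;  /* each free block starts with a link to the next */
} pool;

/* Thread every block of a caller-supplied buffer onto the free list. */
static void pool_init(pool *p, void *buf, size_t block_size, size_t block_count)
{
    assert(block_size >= sizeof(void *));
    unsigned char *bytes = buf;
    void *next = NULL;
    for (size_t i = block_count; i-- > 0; ) {
        void *block = bytes + i * block_size;
        memcpy(block, &next, sizeof next);  /* store the link as raw bytes */
        next = block;
    }
    p->first_free_elem = next;
}

static void *pool_alloc(pool *p)
{
    void *pf = p->first_free_elem;
    if (!pf)
        return NULL;  /* the original calls link_new_subpool(p) here */
    void *next;
    memcpy(&next, pf, sizeof next);  /* load the link as raw bytes */
    p->first_free_elem = next;
    return pf;
}

static void pool_free(pool *p, void *block)
{
    memcpy(block, &p->first_free_elem, sizeof p->first_free_elem);
    p->first_free_elem = block;
}
```

Because memcpy accesses memory as bytes, the compiler has to assume the load of the link may alias whatever the caller later writes into the block, which is the dependency the volatile trick was trying to force by hand.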

This is not the only optimization GCC does where it exploits C/C++ "undefined behavior" in order to get more efficient code at the expense of occasional unexpected behavior. With at least gcc-4.3.3, the following code turns into an infinite loop:
Code:
int i;
// relying on signed underflow to get out of the loop
for(i=0;i<100;i--)
    {
    do_something();
    }
 
And with code like that, if you put a printf("%d\n",i) inside the loop and run it thru the optimizer, you'll prolly wonder why the hell am I not getting an infinite loop?;)
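
For contrast, a minimal sketch of two ways to stay on defined ground: make the exit condition explicit so the signed counter is never decremented past its lowest value, or use an unsigned counter, whose wraparound is defined by the language. `countdown_steps` and `wrap_below_zero` are hypothetical helpers, not anything from the posts above. GCC also documents a `-fwrapv` option that makes signed overflow wrap, for code that cannot be changed.

```c
#include <assert.h>
#include <limits.h>

/* Count how often the loop body runs when stepping from `start` down to
   `lowest` inclusive. The exit test comes before the decrement, so `i`
   is never decremented below `lowest`, even when lowest == INT_MIN. */
static unsigned long countdown_steps(int start, int lowest)
{
    assert(start >= lowest);
    unsigned long steps = 0;
    for (int i = start; ; i--) {
        steps++;          /* stand-in for do_something() */
        if (i == lowest)
            break;
    }
    return steps;
}

/* Unsigned arithmetic wraps modulo 2^N by definition, so this is
   well-defined and yields UINT_MAX. */
static unsigned int wrap_below_zero(void)
{
    unsigned int u = 0;
    u--;
    return u;
}
```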