POP QUIZ: Find the bug in the following code... (semi-OT)

aaaaa00

Regular
The code is C++.

Code:
class CFoo
{
public:
    LPCWSTR GetText()
    {
        return m_sbstrText ? m_sbstrText : L"";
    }
private:
    CComBSTR            m_sbstrText;
};

int main()
{
    LPCWSTR pwszFoo = NULL;
    CFoo Foo;

    pwszFoo = Foo.GetText();

    ASSERT( pwszFoo && wcscmp( pwszFoo, L"" ) == 0 );

    return 0;
}

Hint 1: This code will crash some of the time. Sometimes it will succeed.

Hint 2: This code compiles and runs on Visual Studio. You need to include windows.h and atlbase.h in your headers. (You also need an assert macro defined.)

Hint 3: This bug is quite subtle. I only saw it after staring at it for quite some time.

Good interview question. :)
 
Humus said:
m_sbstrText is obviously never initialized.

m_sbstrText is an instance of CComBSTR.

The constructor for CComBSTR initializes the contents to NULL.

This is not the bug.
 
I couldn't get this exact code to crash on my machine single time. The assertion is quite suspicious to me, although I don't know what it is _supposed_ to do - maybe you wanted it this way.
 
i believe xmas got it right on the nose

something ive been wondering...does the tertiary and biconditionals produce the same assembly code when compiled? or is one more efficient than the other? ie: is the ? statement more efficient than if else statements, vice versa, or the same
 
Xmas said:
http://www.devguy.com/fp/Tips/COM/bstr.htm
? operator doesn't work with CComBSTR
Ooops, I didn't think of CComBstr for a single moment. Interesting read, thanks for the link. Might be useful one day.
 
Xmas said:
http://www.devguy.com/fp/Tips/COM/bstr.htm
? operator doesn't work with CComBSTR

More or less correct.

(expression) ? a : b always returns objects of the type of a.

So the line

Code:
return m_sbstrText ? m_sbstrText : L"";

causes the compiler to create a temporary CComBSTR if (m_sbstrText) resolves to FALSE (which it will, since it's constructed to contain NULL).

The temporary CComBSTR uses the constructor overload for a string, allocates, and then copies L"" into its private member.

Since the function returns LPCWSTR, the compiler then casts the temporary CComBSTR back into a BSTR, which in turn casts back into an LPCWSTR.

But when the scope of the function ends, the destructor for the temporary CComBSTR executes, which frees the internal member.

This results in the function returning a pointer to freed memory. :oops:

Which, depending on how unlucky you are, results in you reading garbage, or causing an AV.
 
First of all, the obvious bug is that it's written in C++ instead of something like Java. ;)
 
Java is the scourge of mankind. :devilish:

Interesting quiz - although I frequently use obscure operators or ancient initialization rules to make my C++ programs more interesting to read I did not know this ternary op rule.
Shame that I was too late to participate.
 
Great java quote I saw the other day:

"Saying java is good because it works on all platforms is like saying anal sex is good because it works on all genders".
8)
 
Humus said:
Great java quote I saw the other day:

"Saying java is good because it works on all platforms is like saying anal sex is good because it works on all genders".
8)

hey, good point! i never thought of it that way... man Java ROCKS!
 
Ah, the wonder of C++ automatic type conversion rules. This is the best exhibit of why C++ is poorly designed I have ever seen. I'll save this one to my harddrive.

If there was atleast a warning, you'd know the compiler is creating an object behind your back, an object which is unsafe for automatic type conversion.
 
Humus said:
Great java quote I saw the other day:

"Saying java is good because it works on all platforms is like saying anal sex is good because it works on all genders".
8)

Hehe...I sooo agree...:D...maybe they will kick me out as a teacher at our high school now..:D..I am a Java and a Algorithms in Java teacher..:D.."poo on my shoe"



Øyvind Østlund
 
Last edited by a moderator:
I'm surprised there wasn't a compiler warning. I know very little if anything about CComBSTR, but "" is clearly a temporary to me. Besides simultaneous character array declaration and initialization, does a string literal ever not create a temporary?

I may be missing something, but I don't see why all that casting business has anything to do with the error. You're returning the address of a temporary, and I've definately had VC++ warn me about that before.
 
For your own projects, the best solution for this particular class of problems is to flag your constructors with the "explicit" keyword. This prevents the compiler from automagically creating instances of your classes.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/vcrefExplicit.asp

Mintmaster said:
I'm surprised there wasn't a compiler warning.

This code compiles fine with /W4 /WX (which is the highest possible warning level, the default VC++ warning level is only /W3).

Just step through it in a debugger instead of speculating. You'll see what the compiler did with the crazy temporary CComBSTR firsthand.

Mintmaster said:
I know very little if anything about CComBSTR, but "" is clearly a temporary to me. Besides simultaneous character array declaration and initialization, does a string literal ever not create a temporary?

I may be missing something, but I don't see why all that casting business has anything to do with the error. You're returning the address of a temporary, and I've definately had VC++ warn me about that before.

Constant string literals defined in the body of your program all get stuck into the global .rdata section of your .EXE by the compiler, they are not temporaries. Hence the compiler will not warn you about returning a temporary.

.rdata is a read-only section containing all the constants defined in your app. This is so if you have the same string literal multiple times in your app, the compiler only needs to keep a single copy around.

This is why code like this:

Code:
WCHAR*   pwszFoo = L"abcde"; // declare a pointer to a string literal, perfectly legal
wprintf( pwszFoo ); // works just fine
pwszFoo[0] = L'!'; // now try to reset the first character to a !

is wrong, and will cause an AV: your app is trying to write a read-only section, and the OS will say "nuh-uh, can't do that, buh-bye", and your app will go poof.

(On some OSes without MMU support or good memory protection, the write may succeed, meaning you'll have corrupted memory. Your program will appear to work for a while, but you may have trashed your constants, which may or may not cause a mystifying crash sometime later on.)

Do a "link /dump /all foo.exe" and you will be able to see the names and contents of all the sections in your .EXE.

See http://msdn.microsoft.com/msdnmag/issues/02/02/PE/default.aspx
and http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/default.aspx if you want more detail.

The problem is C++ is a very rich language, with very few safeguards built in, and an infinite number of ways to shoot yourself in the foot. With thermo-nuclear weapons.

The only solution is to either not use C++, or understand what's really going on underneath the covers in the compiler.

A lot of people try to use the object oriented features of C++ without understanding what's going on underneath, which inevitably leads to problems. A guy I know calls code written this way "object disoriented programming". :D
 
aaaaa00 said:
Code:
WCHAR*   pwszFoo = L"abcde"; // declare a pointer to a string literal, perfectly legal
wprintf( pwszFoo ); // works just fine
pwszFoo[0] = L'!'; // now try to reset the first character to a !

Am I the only one, who never ever used WCHAR *pwszFoo = L"abcde" but WCHAR pwszFoo[] = L"abcde" , WCHAR *pwszFoo = NULL and WCHAR *pwszFoo = pwszFoo2; ....
I'am doing C/C++ for 10 years now....

Thomas
 
while on the subject

good old gcc manual said:
It is dangerous to use pointers or references to portions of a temporary object. The compiler may very well delete the object before you expect it to, leaving a pointer to garbage. The most common place where this problem crops up is in classes like the libg++ String class, that define a conversion function to type char * or const char *. However, any class that returns a pointer to some internal structure is potentially subject to this problem.

For example, a program may use a function strfunc that returns String objects, and another function charfunc that operates on pointers to char:

String strfunc ();
void charfunc (const char *);

In this situation, it may seem natural to write `charfunc (strfunc ());' based on the knowledge that class String has an explicit conversion to char pointers. However, what really happens is akin to `charfunc (strfunc ().convert ());', where the convert method is a function to do the same data conversion normally performed by a cast. Since the last use of the temporary String object is the call to the conversion function, the compiler may delete that object before actually calling charfunc. The compiler has no way of knowing that deleting the String object will invalidate the pointer. The pointer then points to garbage, so that by the time charfunc is called, it gets an invalid argument. Code like this may run successfully under some other compilers, especially those that delete temporaries relatively late. However, the GNU C++ behavior is also standard-conforming, so if your program depends on late destruction of temporaries it is not portable.

If you think this is surprising, you should be aware that the ANSI C++ committee continues to debate the lifetime-of-temporaries problem.
For now, at least, the safe way to write such code is to give the temporary a name, which forces it to remain until the end of the scope of the name. For example:

String& tmp = strfunc ();
charfunc (tmp);

cheers
 
Back
Top