#1 |
|
Junior Member
|
Gamasutra is running an article on proper coding practices for the next gen. It covers cache misses, branch avoidance, inlining and a whole lotta other fun stuff. So head on over. I like this article because it pretty much sums up how this gen is going to be different from the past gen. A lot of it we have already said on these forums one way or another, but it is nice to see it all put in perspective by someone in the field who has done such a good job of organizing that information.
http://www.gamasutra.com/features/20...mason_01.shtml |
|
|
|
|
|
#2 | |
|
Regular
Join Date: Feb 2002
Location: California
Posts: 4,732
|
I don't see how
Quote:
|
|
|
|
|
|
|
#3 |
|
Regular
|
You can do it with simple boolean logic too.
|
|
|
|
|
|
#4 |
|
Member
|
You can do it without them.
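For example, the C expression "a = (!b ? c : d)" is equivalent to the x86 asm
Code:
    mov   eax,[b]
    or    eax,eax      ; set ZF if b == 0
    setz  bl           ; bl = 1 if b == 0, else 0
    dec   bl           ; bl = 0x00 if b == 0, else 0xFF
    movsx ebx,bl       ; sign-extend: ebx = 0 or 0xFFFFFFFF
    mov   eax,[d]
    and   eax,ebx      ; eax = d if b != 0, else 0
    not   ebx
    and   ebx,[c]      ; ebx = c if b == 0, else 0
    or    eax,ebx      ; merge the two masked values
    mov   [a],eax
EDIT2: Maybe you can still think of setz as a "conditional assignment", but I'm pretty sure it could be done without it, too.
Last edited by Mate Kovacs; 21-Dec-2005 at 01:33.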
|
|
|
|
|
#5 |
|
Monochrome wench
|
With x86 it's probably easier to use the cmov instructions introduced with the P6. Microsoft's x86_64 compiler uses them.
The instructions for 'a = (!b ? c : d)' will end up being this, which uses only 1 register
Code:
    mov    eax, [b]
    test   eax, eax    ; set ZF if b == 0
    cmovnz eax, [c]    ; b != 0: eax = c
    cmovz  eax, [d]    ; b == 0: eax = d
    mov    [a], eax
Last edited by Colourless; 21-Dec-2005 at 02:36. Reason: Added code
|
|
|
|
|
#6 |
|
Member
|
Yep, but cmov definitely is what DemoCoder referred to as "a conditional assignment or predicated assignment instruction", IMO.
EDIT: BTW, you mixed up [c] and [d], so it's equivalent to "a = (b ? c : d)" now.
Last edited by Mate Kovacs; 21-Dec-2005 at 03:21.
|
|
|
|
|
#7 |
|
Member
Join Date: Dec 2003
Posts: 288
|
A good compiler should recognize such cases and use conditional moves automatically. I don't see why I should change my coding practices...
At least the Pascal compiler, where you have no ?: operator, does it that way.
|
|
|
|
|
#8 |
|
Member
|
Yep, it all depends on the compiler. Even if you change your practices, if the compiler is stupid you'll still get inefficient code. For example, since the C standard (C99) treats the inline keyword as only a 'hint', it's still up to the compiler.
|
|
|
|
|
|
#9 |
|
Regular
Join Date: Feb 2002
Location: California
Posts: 4,732
|
Well, I suppose one could store the result of a condition in a boolean, say A, and then use the following boolean equation:
    R = (A ? X : Y)
becomes
    R = A * X + not(A) * Y    (* = AND, + = OR)
Of course, this is just poor man's predication, with A as the predicate.
|
|
|
|
|
#10 |
|
Member
|
Yep, you got the basic idea. (BTW, simply storing a condition is not enough, either in the C language or at the x86 asm level, because it's either 0 or 1. You have to convert it so that it's either 0 or all 1 bits.)
And yes, it's just "poor man's predication", but you don't need "a conditional assignment or predicated assignment instruction", which was our point. |
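The "0 or all 1 bits" conversion and the AND/OR select discussed above can be sketched in C. This is only an illustration under assumptions: the names mask_from_bool and select_u32 are made up here, the values are taken to be 32-bit unsigned integers, and a given compiler may still turn this (or a plain ?:) into a branch, a setcc sequence, or a cmov as it sees fit.

```c
#include <stdint.h>

/* Hypothetical helper: 0 stays 0, any nonzero condition becomes 0xFFFFFFFF. */
static uint32_t mask_from_bool(uint32_t cond)
{
    /* (cond != 0) is 0 or 1; subtracting it from 0 in unsigned
       arithmetic yields 0 or all one bits. */
    return (uint32_t)0 - (uint32_t)(cond != 0);
}

/* Hypothetical helper: computes (cond ? x : y) with AND/OR/NOT only. */
static uint32_t select_u32(uint32_t cond, uint32_t x, uint32_t y)
{
    uint32_t m = mask_from_bool(cond);
    return (x & m) | (y & ~m);
}
```

Calling select_u32(!b, c, d) corresponds to the "a = (!b ? c : d)" expression used earlier in the thread.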
|
|
|
|
|
#11 |
|
Member
Join Date: Oct 2004
Location: South Coast, England
Posts: 391
|
This whole micro-optimization stuff seems a bit silly at times.
I personally prefer the "Get it right then get it tight" approach: write some good clean code (free of such ugly micro-optimizations) and then profile it to work out where those micro-optimizations will really make a difference.
There's a lot to be said for writing good quality clean code: maintenance, (lack of) bugs, adaptability, portability...
Jack
|
|
|
|
|
#12 |
|
Member
Join Date: Feb 2002
Location: LA, California
Posts: 825
|
Mate, do you know if MSVC/GCC __forceinline/ __attribute__ ((always_inline)) actually force inlining, or if they simply force the compiler to consider inlining even when optimizations are turned off?
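For reference, the two spellings asked about can be wrapped in a single macro. This is a minimal sketch, not an answer from the thread: FORCE_INLINE and clamp_int are hypothetical names. As far as the vendors' documentation goes, GCC describes always_inline as inlining regardless of optimization level and reports a failure to inline as an error, while MSVC describes __forceinline as a strong request that it can still decline in some cases, so inspecting the generated assembly remains the reliable check.

```c
/* FORCE_INLINE is a hypothetical macro name, not part of either compiler. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline   /* plain hint on other compilers */
#endif

/* Small leaf function of the kind usually worth inlining. */
FORCE_INLINE int clamp_int(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}
```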
|
|
|
|
|
|
#13 | |
|
Member
|
Quote:
"Premature optimization is the root of all Evil." (Knuth)
@psurge: I don't know. Honestly. I'll try to poke around.
|
|
|
|
|
|
#14 | |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,158
|
Quote:
Virtual function overhead is probably the most obvious one (other than that it's hard to eliminate them after the fact). One virtual function call doesn't kill you (not even on PS2), but tens or even hundreds of thousands a frame can really hurt.
Anecdote: a friend of mine was just relaying his experience removing a lot of virtual function calls from the inner workings of a fairly major system on a cross-platform product. The net result was almost no performance difference on PC and a doubling of the performance on one particular console. There is no way to estimate the impact of those virtual function calls without actually removing them.
|
|
|
|
|
|
#15 | |
|
Off-season
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
|
Quote:
__________________
Binary prefixes for bits and bytes |
|
|
|
|
|
|
#16 |
|
Regular
Join Date: Feb 2002
Location: California
Posts: 4,732
|
Yes, many compilers will have an almost identical internal representation for the two, except for the fact that ?: is an expression and 'if' is a statement. Much like for/while/do-while.
|
|
|
|
|
|
#17 |
|
Hello :-)
Join Date: Sep 2005
Location: Cambridge, UK
Posts: 1,307
|
I can't help feeling like I'm stepping back 5 years reading that article, when in fact it's aimed as a prediction of the next 5 years of development.
IMO, the choice of algorithms and overall design structure will have a greater effect on performance than things such as choice of branch style.
He talks about going to extreme lengths to reduce memory overhead, then effectively says 'inline everything'. ?! I've done that before... and I got an 8 MB executable instead of 700 KB. Fantastic advice. Yes, selective inlining is very important, but this is usually done by a smart compiler, and it will be obvious where it's needed with proper profiling.
He also suggests templating as much as possible. Same deal: code bloat. Takes me back to the 'C is faster than C++' wars of days gone by.
"Calling malloc or the default new in a game loop is considered irresponsible". Urgh.
|
|
|
|
|
#18 | ||
|
Off-season
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
|
Quote:
Quote:
__________________
Binary prefixes for bits and bytes |
||
|
|
|
|
|
#19 | |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,158
|
Quote:
To give you some idea of how far games are from general application development, many companies have a zero runtime memory allocation policy (although it's less prevalent than it used to be). Not so long ago my games had no free; the only way to free memory was to revert the heap (actually just a stack) to a previously saved state.
Most of what's in the article can make a significant performance difference. Obviously these types of optimisation go hand in hand with good algorithm choices.
It's harder to do this type of optimisation as teams get bigger and development practices move more towards generally accepted large-scale development. But as I mentioned above, if you can enforce these types of optimisations they can be a significant performance win on today's console processors. IME on PC they make sod all difference.
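The "no free, just revert the heap to a saved state" scheme described above is often called a stack or arena allocator. The following is a rough sketch of the idea only, not the poster's actual code: every name (Arena, arena_alloc, arena_mark, arena_release) is hypothetical, the caller is assumed to supply one preallocated block whose base address is itself suitably aligned, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical, chosen for illustration. */
typedef struct {
    unsigned char *base;  /* start of the preallocated block */
    size_t         size;  /* total bytes available */
    size_t         top;   /* bytes currently in use */
} Arena;

static void arena_init(Arena *a, void *memory, size_t size)
{
    a->base = (unsigned char *)memory;
    a->size = size;
    a->top  = 0;
}

/* Bump allocation: round the offset up to `align`, then advance top.
   Returns NULL when the block is exhausted. */
static void *arena_alloc(Arena *a, size_t bytes, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->top + (align - 1)) & ~(align - 1);
    if (start > a->size || bytes > a->size - start)
        return NULL;
    a->top = start + bytes;
    return a->base + start;
}

/* Save the current state of the stack... */
static size_t arena_mark(const Arena *a)
{
    return a->top;
}

/* ...and revert to it, releasing everything allocated since in O(1). */
static void arena_release(Arena *a, size_t mark)
{
    assert(mark <= a->top);
    a->top = mark;
}
```

A typical pattern would be to take a mark at the start of a level or frame, allocate freely, and release back to the mark at the end, so there is no per-object free and no fragmentation inside the block.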
|
|
|
|
|
|
#20 |
|
Join Date: May 2002
Location: New York, NY
Posts: 12,678
|
I don't buy it, ERP. Readable code is, these days, vastly more important than slightly faster code. Better to enforce programming practices that lead to stable, readable code than much less readable but a tiny bit faster code. As JHoxley said, better to write readable code first, then go back and examine where your code is spending all of its time and optimize there.
And, more importantly, most of these optimizations are things that should be handled by the compiler in the first place.
__________________
April 20, 1979 - America must never forget. |
|
|
|
|
|
#21 | |
|
Member
|
Quote:
|
|
|
|
|
|
|
#22 |
|
Member
Join Date: Dec 2003
Posts: 288
|
use garbage collectors
|
|
|
|
|
|
#23 | |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,158
|
Quote:
It's about picking solutions that won't cripple you in the long run. malloc/new is an inherently expensive operation; even a trivial allocator will do a linear walk of a linked list. And free is worse. With some thought you can usually (and this isn't true of very dynamic content) eliminate the allocations altogether, but it is not easy to do this after the fact. Virtual functions are even harder to remove if you decide you "need to optimise".
I consider spending days to save 1/10th of a millisecond worthwhile. But I'd rather not have to refactor large portions of a codebase to do it.
While I agree in principle with the "premature optimisation is the root of all evil" stuff, there is a certain class of optimisations that have to be done during your initial implementation for them to be practical. And while one case of malloc or a virtual function call may not be measurable, the tens of thousands of them that tend to get made a frame can cost much more than the 1/10th of a millisecond I was willing to spend several days saving.
C++ gives you a lot of rope (tools) to hang yourself with in terms of performance, and if the "bad" patterns are prevalent in the codebase they are extremely difficult (read: impossible in any reasonable timeframe) to address after the fact. What is generally taught as "good OO" design is not necessarily good design for a game.
One of the things I lament is that a lot of programmers coming out of college now have no real concept of how the code they are writing will be compiled and what the impact of that on cache and application performance will be.
|
|
|
|
|
|
#24 | |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,158
|
Quote:
I do think using handle-based memory allocation for unpredictably sized allocations in very dynamic environments (say, streaming player-modified worlds) could be a win: at least you have some recourse when the allocator fails, and you can effectively defrag the heap.
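To make the handle idea above concrete, here is a toy sketch under stated assumptions: all names (HandleHeap, hh_alloc, hh_compact, and so on) are invented for illustration, capacities are fixed and tiny, the heap object is assumed to start zero-initialized, and alignment is ignored. Callers hold an integer handle rather than a pointer, which is what lets the heap move blocks during compaction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; a toy fixed-capacity handle heap. */
enum { HEAP_BYTES = 256, MAX_HANDLES = 16 };

typedef struct {
    size_t offset;  /* where the block currently lives in storage */
    size_t size;    /* block size in bytes */
    int    used;    /* nonzero while the handle is live */
} Block;

typedef struct {
    unsigned char storage[HEAP_BYTES];
    Block         blocks[MAX_HANDLES];
    size_t        top;  /* first unused byte at the end of storage */
} HandleHeap;

/* Returns a handle (table index), or -1 if out of handles or space. */
static int hh_alloc(HandleHeap *h, size_t size)
{
    if (size == 0 || size > HEAP_BYTES - h->top)
        return -1;
    for (int i = 0; i < MAX_HANDLES; ++i) {
        if (!h->blocks[i].used) {
            h->blocks[i].offset = h->top;
            h->blocks[i].size   = size;
            h->blocks[i].used   = 1;
            h->top += size;
            return i;
        }
    }
    return -1;
}

static void hh_free(HandleHeap *h, int handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES && h->blocks[handle].used);
    h->blocks[handle].used = 0;  /* leaves a hole until the next compaction */
}

/* Resolve a handle to an address; valid only until the next hh_compact. */
static void *hh_deref(HandleHeap *h, int handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES && h->blocks[handle].used);
    return h->storage + h->blocks[handle].offset;
}

/* Defragment: slide live blocks down in address order, closing holes. */
static void hh_compact(HandleHeap *h)
{
    size_t dst = 0;
    for (;;) {
        int next = -1;  /* live, not yet moved block with the lowest offset */
        for (int i = 0; i < MAX_HANDLES; ++i)
            if (h->blocks[i].used && h->blocks[i].offset >= dst &&
                (next < 0 || h->blocks[i].offset < h->blocks[next].offset))
                next = i;
        if (next < 0)
            break;
        memmove(h->storage + dst, h->storage + h->blocks[next].offset,
                h->blocks[next].size);
        h->blocks[next].offset = dst;
        dst += h->blocks[next].size;
    }
    h->top = dst;
}
```

The "recourse when the allocator fails" then amounts to calling hh_compact and retrying hh_alloc; the price is an extra indirection on every access and the rule that raw pointers must not be kept across a compaction.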
|
|
|
|
|
|
#25 |
|
Regular
|
With a 64-bit address space, if you allocate 4 GB a second (i.e. a whole 32-bit address space), it would take over 1 million hours for fragmentation to hit you, even if you never tried to reclaim freed memory for allocation.
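The arithmetic behind that estimate can be checked quickly. The sketch below assumes a full 2^64-byte address space and a steady 2^32 bytes allocated per second; the function name is made up for illustration.

```c
#include <stdint.h>

/* Hours needed to consume a 2^64-byte address space at 2^32 bytes/second. */
static uint64_t hours_to_exhaust_64bit_space(void)
{
    /* 2^64 / 2^32 = 2^32 seconds; written as a shift because 2^64
       itself does not fit in a uint64_t. */
    uint64_t seconds = (uint64_t)1 << (64 - 32);
    return seconds / 3600u;  /* whole hours, rounded down */
}
```

That works out to 1,193,046 hours, or roughly 136 years, which matches the "1 million hours" ballpark.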
|
|
|
|