Xenon, PS3, Revolution ...

Gubbi said:
Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.
 
DeanoC said:
Gubbi said:
Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.

I'm going to agree with the original poster.

I'm not saying that an understanding of the underlying architecture is unimportant, rather that it should only be important for a relatively small portion of the code. Once you make the quirks visible to the bulk of your application code, you're placing a huge burden on your developers.

I've actually had serious discussions at work about removing access to pointers and memory allocation outside of core code. The idea is that garbage-collected memory would remove a lot of the stupid errors that take days or weeks of senior engineer time to find and fix. If you could reapply that time to optimising the critical path, I argue that you would be able to trivially offset the overhead it incurs.

Now having said that, there is no good way to implement garbage collection in C++ without writing a compiler, so it's just speculation.
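A minimal sketch of the kind of thing that could replace raw new/delete outside core code - hypothetical Handle/Pool names, nobody's actual engine API - where gameplay code only ever holds opaque handles and a stale handle resolves to null instead of scribbling over freed memory:

Code:
#include <cstdint>
#include <vector>

// Hypothetical sketch: only core code touches real memory, gameplay code
// holds handles. Slots are never reused in this toy version.
struct Handle { uint32_t index; uint32_t generation; };

template <typename T>
class Pool {
    struct Slot { T object; uint32_t generation; bool alive; };
    std::vector<Slot> slots;
public:
    Handle create(const T& value) {
        slots.push_back(Slot{ value, 0, true });
        return Handle{ uint32_t(slots.size() - 1), 0 };
    }
    void destroy(Handle h) {
        if (h.index < slots.size() && slots[h.index].generation == h.generation) {
            slots[h.index].alive = false;
            slots[h.index].generation++;        // stale handles now fail to resolve
        }
    }
    T* resolve(Handle h) {                      // returns null instead of dangling
        if (h.index < slots.size() && slots[h.index].alive &&
            slots[h.index].generation == h.generation)
            return &slots[h.index].object;
        return nullptr;
    }
};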

I worked on an application recently that contained over 1500 source files; how many of those do you think ought to care about the underlying machine architecture? And if I'm doing cross-platform development, which architecture should they care about?
 
DeanoC said:
Gubbi said:
Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.

I gotta agree with Deano on this... If anything, you've gotta worry about the compiler F$CKING up your code more than about taking the time to plan your code for decent scheduling...
 
ERP said:
I worked on an application recently that contained over 1500 source files; how many of those do you think ought to care about the underlying machine architecture? And if I'm doing cross-platform development, which architecture should they care about?

While I'm lucky and have only one platform to worry about, it's something that makes me lose sleep on a regular basis ;-). We are currently 1000+ source files and will get a lot bigger.

I'm currently considering the drastic option of splitting the code-base into sections.
A) High level. Written in whatever pretty code; will only ever run on a single main processor. Can be inefficient just as long as it works.
B) Old school code. More classic C code, fewer pointers and dynamic stuff. How well it will run is thought about for at least a couple of seconds. Cache, DMA and synchronisation will have to be considered. Will run better but will require more experienced coders. Harder to maintain and write.
C) Performance code. Special code that understands its machine perfectly (possibly a different language completely!) that is for experts only.

The problem we have is that on at least one next gen console, 80% of its performance can only be obtained by B and C (and B might be 10x slower than C...).

Do you want your game to have access to 1/5th the power of your competition that embraced the darkness?
 
archie4oz said:
DeanoC said:
Gubbi said:
Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.

I gotta agree with Deano on this... If anything, you've gotta worry about the compiler F$CKING up your code more than about taking the time to plan your code for decent scheduling...

From a purely practical standpoint, maintaining code quality when dev teams get big gets harder and harder. Start adding external contractors into that mix and I'm going to claim it goes from difficult to impossible.

I doubt half the programmers on my team have ever written a significant amount of assembler; most of them don't understand what a compiler's optimiser can and can't do, never mind understanding a machine architecture in enough detail to make the "right" decision.

The whole in-order/OOO thing I actually don't care too much about; outside of manual prefetches you're at the mercy of the compiler to solve the problem for you anyway.

All I'm arguing is that your average gameplay/application programmer should be able to write code that will run effectively without having to jump through hoops to do it. For 90% of the code in a game I'll take readability and maintainability over performance every day of the week.
 
DeanoC said:
If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.
So code ugly, code fast? Well that would explain a lot about your love for SoA :p

Kidding aside though, who is to say that different "prettier" abstractions couldn't eventually evolve that still ran efficiently on less "friendly" architectures. Of course that assumes these will be around long enough for the need to arise...

Just like SoA is basically forcing the definition of a new atomic unit - a 4x4 matrix where we used to have a 4x1 vector... The obvious question arises whether it's worth coming up with replacements for algorithms that can't be described in that manner, or whether we just put up with inefficiencies in those situations.
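As a rough illustration of that AoS-versus-SoA point (hypothetical types, not any particular engine's), the "new atomic unit" ends up looking something like four vectors transposed into a 4x4 block:

Code:
// Array-of-structures: one xyzw vector per element, awkward for SIMD.
struct Vec4 { float x, y, z, w; };

// Structure-of-arrays: four vectors packed per component, so one SIMD
// register can hold x0..x3, another y0..y3, and so on.
struct Vec4x4 {
    float x[4];
    float y[4];
    float z[4];
    float w[4];
};

// Scales four vectors at once; with this layout a vectorising compiler (or
// hand-written SIMD) can do each component array in one 4-wide multiply.
void scale(Vec4x4& v, float s)
{
    for (int i = 0; i < 4; ++i) v.x[i] *= s;
    for (int i = 0; i < 4; ++i) v.y[i] *= s;
    for (int i = 0; i < 4; ++i) v.z[i] *= s;
    for (int i = 0; i < 4; ++i) v.w[i] *= s;
}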

ERP said:
I've actually had serious discussions at work about removing access to pointers and memory allocation outside of core code. The idea is that garbage-collected memory would remove a lot of the stupid errors that take days or weeks of senior engineer time to find and fix. If you could reapply that time to optimising the critical path, I argue that you would be able to trivially offset the overhead it incurs.
Isn't that what scripting languages are for? :p
 
DeanoC said:
ERP said:
I worked on an application recently that contained over 1500 source files; how many of those do you think ought to care about the underlying machine architecture? And if I'm doing cross-platform development, which architecture should they care about?

While I'm lucky and have only one platform to worry about, it's something that makes me lose sleep on a regular basis ;-). We are currently 1000+ source files and will get a lot bigger.

I'm currently considering the drastic option of splitting the code-base into sections.
A) High level. Written in whatever pretty code; will only ever run on a single main processor. Can be inefficient just as long as it works.
B) Old school code. More classic C code, fewer pointers and dynamic stuff. How well it will run is thought about for at least a couple of seconds. Cache, DMA and synchronisation will have to be considered. Will run better but will require more experienced coders. Harder to maintain and write.
C) Performance code. Special code that understands its machine perfectly (possibly a different language completely!) that is for experts only.

The problem we have is that on at least one next gen console, 80% of its performance can only be obtained by B and C (and B might be 10x slower than C...).

Do you want your game to have access to 1/5th the power of your competition that embraced the darkness?

Deano, if you suggest the way that has been followed on PlayStation 2 and other platforms, of writing the game in C/C++ and then having some poor SOBs convert that code into DMA-friendly/cache-friendly/PPE-friendly ASM code... well, I do not think this is a path that can go on much longer, especially when the amount of code to be "converted" grows in size.

This generation it was a pain in the ass; next generation it will be like swimming in a lava pit... can this still be done the generation after that?

I think this is a path to self-destruction, especially when the difference, graphically, between optimized games and non-optimized games keeps shrinking.

Do you really think that your game, heavily optimized, will look that much better than a game built exclusively with Renderware or Unreal Engine 3 (Xbox 2)? I think your game still might look better of course, but not as much as it would have last generation, and much less than it would have the generation prior to that.

I think it is a fact that the difference between technically excellent titles and not technically excellent titles (provided good art resources for both titles) is shrinking and will keep on shrinking.
 
Gubbi said:
The only advantage there is to an in-order CPU is a higher operating frequency (note that higher frequency does not necessarily equate to higher performance) for a given power budget compared to an OOOE CPU.

Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

Of course there'll be a few studios that can sink the extra resources into development, but seen as a whole I think making it easier for developers is the way to get better games on average for a platform.
I disagree with this whole approach. You're saying studios will think with existing data and control structures, and then re-jig them to fit a different architecture. I say that's a fault of a closed mindset so used to thinking in C terms, not a fault of in-order design.

I'm no dev and my experiences at uni were limited, but it was very apparent how hard many students found the switch to SML. Being used to (and enjoying) C-style constructs, pointers and lots of conditional work, they found the idea of writing programs from a stream-processing point of view very hard to pick up. Thankfully I didn't find it so tough, so I could help a few of my friends out. I've also found this in the raytracing shader language for RealSoft3D. Many people have trouble getting their head round it and use IF statements to select processing, instead of using a more suitable data-processing, non-conditional approach. Rather than approaching it as a C program, it needs to be approached as a mathematical formula.

Um...as an example consider this. A joystick returns +/- 1 in horizontal and vertical directions as moved. A conditional process to move a player would be...

Code:
if (joystick.x == 1)
    player.x = player.x + speed;
else if (joystick.x == -1)
    player.x = player.x - speed;

if (joystick.y == 1)
    player.y = player.y + speed;
else if (joystick.y == -1)
    player.y = player.y - speed;

Whereas a formulaic, non-conditional process algorithm would be...

Code:
player.x = player.x + (joystick.x * speed);
player.y = player.y + (joystick.y * speed);

No branches at all. It's this different approach, extrapolated perhaps to a level never approached before, that in-order wants. If you're spending time on "manually massaging data and code to run decently on a microarchitecture with a lot of quirks... time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is", then you're not coding in-order from the ground up. If you *do* think and work in an in-order fashion, coding will be no more difficult than it is now, but it will execute a lot better.

It's like learning another language. If you want to understand a Spaniard, you can listen to every word and translate with an English-Spanish dictionary - very slow but something you can do straight away. Or you invest the time to learn the language properly and then you can understand him in real-time - longer investment but infinitely superior results.

As to whether an entire industry can be re-educated into a different mindset, and whether anyone would even appreciate that it's the best move, I don't know. It'll need to happen, but people are resistant to change and to accommodating new ideas.
 
DeanoC said:
Gubbi said:
Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

If only the world were that simple. Usually, given a 'friendly' architecture, developers will just use worse techniques that are 'prettier' but slower.
Having to think about the microarchitecture usually produces better code, just because you forced somebody to think about it even for a second or two.

I'm not proclaiming MPU smarts can replace developer smarts. A developer that doesn't have a fundamental understanding of the architecture in use (in particular the memory system) is useless.

I'm saying that there are situations where performance can be gained that cannot be statically predicted at compile/development time and can only be resolved at runtime. You can manually (you'd have to) schedule instructions around fixed latencies on an in-order MPU. But what happens when latencies start to vary as a result of e.g. memory contention? Data dependency stalls happen. An OOOE CPU would be able to issue and execute instructions further ahead, which might be a big win, in particular if you can execute some loads earlier (thereby reducing apparent latency).

Cheers
Gubbi
 
Gubbi said:
I'm saying that there are situations where performance can be gained that cannot be statically predicted at compile/development time and can only be resolved at runtime. You can manually (you'd have to) schedule instructions around fixed latencies on an in-order MPU. But what happens when latencies start to vary as a result of e.g. memory contention?

You prefetch.

Data dependency stalls happen. An OOOE CPU would be able to issue and execute instructions further ahead, which might be a big win, in particular if you can execute some loads earlier (thereby reducing apparent latency).

Aha, as you see yourself, you'd want to prefetch regardless of OOOE (as it does not automagically solve your problem). And if you prefetch properly, OOOE does not even kick in.
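For what it's worth, a minimal sketch of "you prefetch" - using GCC's __builtin_prefetch here as a stand-in for whatever cache-touch intrinsic the target actually exposes (dcbt on PowerPC, for instance); the Particle struct and look-ahead distance are made-up example values:

Code:
struct Particle { float pos[4]; float vel[4]; };

// Walk a large array, asking the cache for data a few iterations ahead so
// the loads have (hopefully) arrived by the time we actually use them.
void integrate(Particle* particles, int count, float dt)
{
    const int ahead = 8;                    // tune to memory latency / line size
    for (int i = 0; i < count; ++i) {
        if (i + ahead < count)
            __builtin_prefetch(&particles[i + ahead]);   // a hint, not a load
        for (int j = 0; j < 4; ++j)
            particles[i].pos[j] += particles[i].vel[j] * dt;
    }
}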
 
DeanoC said:
ERP said:
I worked on an application recently that contained over 1500 source files; how many of those do you think ought to care about the underlying machine architecture? And if I'm doing cross-platform development, which architecture should they care about?

While I'm lucky and have only one platform to worry about, it's something that makes me lose sleep on a regular basis ;-). We are currently 1000+ source files and will get a lot bigger.

I'm currently considering the drastic option of splitting the code-base into sections.
A) High level. Written in whatever pretty code; will only ever run on a single main processor. Can be inefficient just as long as it works.
B) Old school code. More classic C code, fewer pointers and dynamic stuff. How well it will run is thought about for at least a couple of seconds. Cache, DMA and synchronisation will have to be considered. Will run better but will require more experienced coders. Harder to maintain and write.
C) Performance code. Special code that understands its machine perfectly (possibly a different language completely!) that is for experts only.

The problem we have is that on at least one next gen console, 80% of its performance can only be obtained by B and C (and B might be 10x slower than C...).

Do you want your game to have access to 1/5th the power of your competition that embraced the darkness?

Oops, my bad - I missed a 0 off the source file count; it should have been 15000 files (about 270 vcproj files). The joke is I've written games with fewer lines of code than that game has source files ;)

I'm going to disagree with your guess of 5x...

We already separate core code from game code, and in general people touching the core understand the architectures.

I believe that I can put systems in place that encourage the construction of code that will run efficiently on the architectures I care about.

I think there is increased danger on parallel systems of premature optimisation. Unlike single-processor systems, where you can optimise code in isolation, on a parallel system you can often increase the speed of the application at the expense of the speed of a system it contains.


Isn't that what scripting languages are for?

Sure, know of one with minimal execution speed overhead and a really tiny interpreter? :p

All I really want is to remove new/delete for the mundane stuff, to lose the accidental writes to freed memory. I just hate debugging that stuff, because it invariably just works until you have a major deliverable.
 
ERP said:
Oops, my bad - I missed a 0 off the source file count; it should have been 15000 files (about 270 vcproj files). The joke is I've written games with fewer lines of code than that game has source files ;)
You need to start a system where every time somebody removes a file they get a bonus :)

Not going to ask what you're up to, but that's just crazy... For that size of system, OK, I agree with everything you said; I thought we were talking games, not Windows XP ;-)

So I'll add a caveat: for projects of a reasonable size my points still stand. However, if you're working on ERP-sized projects then you have bigger problems to worry about, and should do exactly what ERP says :)
 
Panajev2001a said:
Do you really think that your game, heavily optimized, will look that much better than a game built exclusively with Renderware or Unreal Engine 3 (Xbox 2)? I think your game still might look better of course, but not as much as it would have last generation, and much less than it would have the generation prior to that.

I think it is a fact that the difference between technically excellent titles and not technically excellent titles (provided good art resources for both titles) is shrinking and will keep on shrinking.

I think you're wrong; you seem to believe that the CPU horsepower is going on graphics. I don't...

This is the big question ultimately: will we choose to write poorly performing code and just do less per frame, but because it was easier to write it has had more iteration and may end up more fun; or will we write efficient code that has had fewer code iterations and tweaks but has 5x the grunt power (ERP disagrees with my estimate) to do its thing with...
 
DeanoC said:
ERP said:
Oops, my bad - I missed a 0 off the source file count; it should have been 15000 files (about 270 vcproj files). The joke is I've written games with fewer lines of code than that game has source files ;)
You need to start a system where every time somebody removes a file they get a bonus :)

Not going to ask what you're up to, but that's just crazy... For that size of system, OK, I agree with everything you said; I thought we were talking games, not Windows XP ;-)

So I'll add a caveat: for projects of a reasonable size my points still stand. However, if you're working on ERP-sized projects then you have bigger problems to worry about, and should do exactly what ERP says :)

Oh it was a game, shipped last year on PC, sold a lot of units ;)

I will agree, I think that particular project was out of hand; a lot of it was a function of distributing the code in such a way that the 60+ engineers could work on it without stepping on each other's toes.

FWIW almost all of the code volume was in gameplay code.
 
Gubbi said:
Jaws said:
Gubbi said:
...
OOOe helps you when you can't properly determine instruction schedule or access patterns at compile (or assembly-programming) time. It really has nothing to do with the skill of the developer (Jaws!).
...

That's out-of-order! :devilish:

But seriously in the context of my post, it was meant to highlight that devs aren't stupid.
Oh, I absolutely agree with you there.
Jaws said:
Experienced devs in console development will optimise code around the advantages of in-order CPUs and that easy development doesn't necessarily equate to better games.
The only advantage there is to an in-order CPU is a higher operating frequency (note that higher frequency does not necessarily equate to higher performance) for a given power budget compared to an OOOE CPU.
...

If there is a reduction in transistors for in-order CPUs, then you could also possibly squeeze in another core, a la the Xenon CPU, if you think more cores might be more beneficial than a higher clock! :p

Gubbi said:
...
As for easy development not equating better games: I just disagree.

Time spent on manually massaging data and code to run decently on a microarchitecture with a lot of quirks is time taken away from developing better (or more efficient) AI, LOD, whatever algorithms - given the same budget and schedule, that is.

Of course there'll be a few studios that can sink the extra resources into development, but seen as a whole I think making it easier for developers is the way to get better games on average for a platform.

Cheers
Gubbi

Oh, I both agree and disagree with you on this, which is why I prefixed it with a 'doesn't necessarily' above...

It's definitely not a given, which was my point, also judging by the responses here. Studios will have their own philosophies/styles, and individual devs too.
 
darkblu said:
Gubbi said:
I'm saying that there are situations where performance can be gained that cannot be statically predicted at compile/development time and can only be resolved at runtime. You can manually (you'd have to) schedule instructions around fixed latencies on an in-order MPU. But what happens when latencies start to vary as a result of e.g. memory contention?

You prefetch.

Data dependency stalls happen. An OOOE CPU would be able to issue and execute instructions further ahead, which might be a big win, in particular if you can execute some loads earlier (thereby reducing apparent latency).

Aha, as you see yourself, you'd want to prefetch regardless of OOOE (as it does not automagically solve your problem). And if you prefetch properly, OOOE does not even kick in.


If these new consoles all have 2-thread in-order general-purpose units, maybe compilers will eventually support "scouting threads" (not sure what this is officially called), by which I mean a thread more or less dedicated to prefetching memory for a compute thread...
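Very loosely, a hand-rolled sketch of that idea (using std::thread and a busy-wait for brevity; a real scout thread would presumably be spun up by the compiler or runtime rather than written like this, and the data and distances here are made up):

Code:
#include <atomic>
#include <thread>
#include <vector>

// One hardware thread runs ahead of the compute thread, touching data so it
// is (with luck) already in cache when the compute thread gets there.
void process(std::vector<float>& data)
{
    std::atomic<size_t> compute_pos{0};

    std::thread scout([&] {
        volatile float sink = 0.0f;          // keep the reads from being optimised away
        const size_t ahead = 256;            // how far to run in front
        for (size_t i = 0; i < data.size(); ++i) {
            while (i > compute_pos.load(std::memory_order_relaxed) + ahead)
                ;                             // don't run too far ahead (busy-wait for brevity)
            sink += data[i];                  // the 'prefetch': just read it
        }
    });

    for (size_t i = 0; i < data.size(); ++i) {
        data[i] = data[i] * 2.0f + 1.0f;      // the real work
        compute_pos.store(i, std::memory_order_relaxed);
    }
    scout.join();
}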
 
I was wondering. How would such platforms do with such operations as:

a) Databases
Yes, games have tons of these.
b) Hashing operations
Oh indeed, looking up is what computers do.
c) Linear searching

And has there been any talk of using programming techniques such as continuations? At least I saw the term "Asynchronous programming", so I can hope...
 
ERP said:
Sure, know of one with minimal execution speed overhead and a really tiny interpreter? :p
I'm using one with a tiny interpreter - and if I had 60 more software engineers available for the project, it would have had minimal execution speed overhead years ago too :p

I just hate debugging that stuff, because it invariably just works until you have a major deliverable.
Indeed, and inevitably it's ME that gets to debug that even when it's other people that left the bugs in :?

Anyway, if avoiding bugs is the thing, it's all the more valuable to keep some people limited to just script.
 
DeanoC said:
Panajev2001a said:
Do you really think that your game, heavily optimized, will look that much better than a game built exclusively with Renderware or Unreal Engine 3 (Xbox 2)? I think your game still might look better of course, but not as much as it would have last generation, and much less than it would have the generation prior to that.

I think it is a fact that the difference between technically excellent titles and not technically excellent titles (provided good art resources for both titles) is shrinking and will keep on shrinking.

I think you're wrong; you seem to believe that the CPU horsepower is going on graphics. I don't...

Even on things like physics: if you take 5x the time to set up your ultra-optimized system, someone else might reach similar results by using their more thought-out system (at the high level: less time allocated to programming of the physics system and more time spent in the planning phase) plus a good deal of tweaks and cheats to make the end result behave almost the same as your more efficient, fully physics-based system.

This is the big question ultimately: will we choose to write poorly performing code and just do less per frame, but because it was easier to write it has had more iteration and may end up more fun; or will we write efficient code that has had fewer code iterations and tweaks but has 5x the grunt power (ERP disagrees with my estimate) to do its thing with...

I think game projects will keep growing and will make it impossible to follow the "grunt power" approach to the extent you are used to with PlayStation 2 or PSOne. It is also possible that the efficiency gap between compiled and hand-written code will shrink by at least a bit over the next few years.

I just think the PS2 way of taking general purpose C/C++ code (not graphics code) and handing it to some people to be re-written in optimized ASM code is NOT the way to go: it is prone to mistakes (the person writing the ASM code might not have an understanding of the system he was handed, that is, he does not fully know what the code tries to do and the philosophy behind that code), it wastes time going back and forth with the person who wrote that code and has now moved on to other things, and it will become less and less efficient as we move forward.
 
Panajev said:
I just think the PS2 way of taking general purpose C/C++ code (not graphics code) and handing it to some people to be re-written in optimized ASM code is NOT the way to go: it is prone to mistakes (the person writing the ASM code might not have an understanding of the system he was handed, that is, he does not fully know what the code tries to do and the philosophy behind that code), it wastes time going back and forth with the person who wrote that code and has now moved on to other things, and it will become less and less efficient as we move forward.
Pana, Deano wasn't talking about re-writing stuff in ASM, let alone doing it to high-level code.
Actually - re-writing stuff in ASM is the LEAST beneficial optimization you can do for your R5900 code. And believe me, I've written my fair share of it.
 