Gamespot Interview with AGEIA

Jawed said:
Seems to me that Ageia got caught up in counting GFLOPs just like B3D did a few months back.

Asides that floating point performance is relevant here..

..you really think that? Give them some credit. The one actual technical explanation we have for one of the points of discussion at GDCE had little to do with gflops.

Sure you're not stretching for any way to dismiss what was said? You'll have to try again because that's just not very credible.
 
Cell has been available to "developers" for a few months now - prolly not many devs though. Epic, who else?... Ageia as a middleware provider may have had some priority.

XB360 dev kits with final spec hardware are apparently only just going out now.

So, how is Ageia supposed to have tuned code on XB360? Does anyone really believe that Ageia has tuned the Cell SDK?

Jawed
 
Titanio said:
Asides that floating point performance is relevant here..

..you really think that? Give them some credit. The one actual technical explanation we have for one of the points of discussion at GDCE had little to do with gflops.

Sure you're not stretching for any way to dismiss what was said? You'll have to try again because that's just not very credible.
It simply appears to me that some twit at Ageia who doesn't normally present this material sees 2x the GFLOPs rating of PS3 over XB360 and jumps to a conclusion that is actually wrong.

GPUs have 10x, 20x the GFLOPs rating of CPUs. When you deploy GPUs in GPGPU applications to run scientific applications that are normally executed on CPUs, you get anything from 0.5x to 20x the CPU performance.

GFLOPs ratings, in and of themselves, are bunk.
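
As a rough illustration of why a peak rating says little by itself, the sketch below divides a measured speedup by the peak-GFLOPs ratio to get the fraction of the on-paper advantage that actually shows up. The function name and the sample figures (picked from the 10x to 20x and 0.5x to 20x ranges above) are illustrative assumptions, not measurements.

```python
def realized_fraction(peak_ratio, measured_speedup):
    """Fraction of the on-paper GFLOPs advantage that shows up in practice.

    peak_ratio: accelerator peak GFLOPs divided by CPU peak GFLOPs.
    measured_speedup: accelerator speedup over the CPU on a real workload.
    """
    if peak_ratio <= 0:
        raise ValueError("peak_ratio must be positive")
    return measured_speedup / peak_ratio


# Hypothetical cases drawn from the ranges quoted in the post.
worst = realized_fraction(peak_ratio=20.0, measured_speedup=0.5)
best = realized_fraction(peak_ratio=20.0, measured_speedup=20.0)

print(f"worst case: {worst:.1%} of the paper advantage")
print(f"best case:  {best:.1%} of the paper advantage")
```

With the same 20x rating, the realized advantage in this toy example spans a factor of 40, which is the sense in which the rating alone predicts very little.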

Jawed
 
Jawed said:
It simply appears to me that some twit at Ageia who doesn't normally present this material sees 2x the GFLOPs rating of PS3 over XB360 and jumps to a conclusion that is actually wrong.

So you think Tom Lassanske is a twit? He made the CEDEC presentation where arguably the more explicit notes on inter-platform comparison were made.

And the only tech explanation we got out of GDC, from SenatorMonkey re. Fluid/Cloth doesn't mention Gflops.

Your explanation doesn't cut it. The mere notion that AGEIA would present such points basing them on "schoolboy" flop comparisons, to fellow developers is laughable, and we know it not to be the case from what technical explanations we have heard from the conferences.
 
Titanio said:
So you think Tom Lassanske is a twit? He made the CEDEC presentation where arguably the more explicit notes on inter-platform comparison were made.

And the only tech explanation we got out of GDC, from SenatorMonkey re. Fluid/Cloth doesn't mention Gflops.

Your explanation doesn't cut it. The mere notion that AGEIA would present such points basing them on "schoolboy" flop comparisons, to fellow developers is laughable, and we know it not to be the case from what technical explanations we have heard from the conferences.

Seriously, explain to me how Ageia has had months of access to a final XBox360 dev kit to put together a tuned SDK.

Then come and tell me that they weren't guessing about XB360.

Jawed
 
Frankly, I don't even believe that Ageia has had access to a PS3 dev kit or a Cell of any kind.

The Sony Ageia deal was only quite recent.

I'm amazed you guys actually believe that Ageia has run-up a fully tuned SDK on both these systems :rolleyes:

Jawed
 
Jawed said:
Seriously, explain to me how Ageia has had months of access to a final XBox360 dev kit to put together a tuned SDK.

Then come and tell me that they weren't guessing about XB360.

Jawed

First of all, you're now switching points. This is a different point to "They're simply looking at flops like B3D did at E3, har har".

But to answer your new arguments, I'm sure AGEIA has had early access to hardware from both - I'm sure novodex is available for X360 now, and the first rev of the PS3 version is due in the next month I believe (?) - so I'm pretty sure they have PS3 kits by now ;) No one is suggesting they currently have "tuned" engines for both systems. But they certainly have had time with unfinished hardware, and I'd imagine they're among the first to get new hardware from both (not to mention non-public architectural detail).
 
They don't need final hardware in absolute terms. As long as they have the CPUs (and both XeCPU and Cell have been available for months), they can develop the engine. Only if they wanted to access the hardware in other ways, such as offloading or integrating some calculations into the GPU's activities, would they need the final hardware, and I'm sure it'll be a while before anyone's thinking about eating up GPU cycles and taking away rendering power to provide for physics power.
 
My argument about access to hardware isn't new.

To be honest, I expect Cell to be better at physics, regardless of SDK. Trouble with Cell is, it ain't easy to write code for...

I just don't believe any SDK is tuned, bearing in mind that final hardware for at least one platform hasn't been available for more than about a week.

Are there final Cell-based PS3 devkits in devs' hands (ignoring the fact RSX isn't available - that shouldn't affect Cell)?

Jawed
 
Jawed said:
Are there final Cell-based PS3 devkits in devs' hands (ignoring the fact RSX isn't available - that shouldn't affect Cell)?

Final as in final clock? No. But 2.4GHz Cell machines have been available for a while. Once you have the final chip though, even if underclocked, I think it becomes much easier to predict what performance will be like on the final machine.

Let's rewind to the AGEIA statement though before we lose the run of ourselves - based on architectural analysis they think PS3 has more capacity for Novodex. I'm guessing no one takes issue with that? Officially speaking, at least, that's based solely on analysis of information in the public domain. I guess the question then is: if they had found "reality" with hardware (unfinished or not) thus far to contradict those statements, would they have made them to a bunch of potential clients? That's where we can speculate. Personally I think they wouldn't make such suggestions or statements if they were actually finding differently thus far with what hardware they do have.
 
Seriously, the point that gives me pause for thought in all this is that actual Xenon CPUs have been available for about a month.

Even if Ageia has PC-specific dual-core supporting code, you can't just re-compile that code for Xenon. The Xenon programming model is just too different. So is Cell.

All I see is that out of the platforms, Ageia will have had the most time with Cell.

I expect Cell to be better in the long run. But the programming models on these consoles' CPUs aren't trivial.

A month or two spent with a completely new programming model (Cell or Xenon) is just not enough time to make meaningful comparisons.

Jawed
 
Hands-on experience may well be limited, but the architectures are well documented and that's what they're talking about. If fluid dynamics works by processing lots of smallish (128 KB) streamed datasets that fit ideally into an SPU's LS (and they were designed to facilitate this...), then the bandwidth to feed the processor's ALUs is there, where it may well not be there for XeCPU, which is working primarily through RAM with a 32 KB data cache. That's an architectural difference that no amount of brute FP power can get over. Without knowledge of the PPU design it's hard to see where the bottleneck might be, but this is the only obvious architectural feature that I think they can be talking about. As I said earlier, they may be able to manage the cache to aid physics, but that will impact other threads if they need to use a lot of it.
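
A minimal sketch of the working-set argument, assuming the publicly documented sizes of 256 KB of local store per SPE and 32 KB of L1 data cache per Xenon core. The helper name is hypothetical and the 128 KB dataset size is the figure from the paragraph above; real code also has to share the local store with program text and transfer buffers.

```python
import math

KB = 1024
SPE_LOCAL_STORE = 256 * KB   # per-SPE local store on Cell
XENON_L1_DATA = 32 * KB      # per-core L1 data cache on Xenon


def passes_needed(dataset_bytes, on_chip_bytes):
    """Number of on-chip-sized pieces a dataset must be split into."""
    return math.ceil(dataset_bytes / on_chip_bytes)


dataset = 128 * KB  # the "smallish" streamed dataset from the post

print(passes_needed(dataset, SPE_LOCAL_STORE))  # 1: fits in one piece
print(passes_needed(dataset, XENON_L1_DATA))    # 4: spills past L1
```

The Xenon figure ignores the shared 1 MB L2, which is where the managed-cache idea would come in, at the cost of cache space for the other threads.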
 
Jawed said:
Seriously, the point that gives me pause for thought in all this is that actual Xenon CPUs have been available for about a month.

Even if Ageia has PC-specific dual-core supporting code, you can't just re-compile that code for Xenon.

True, of course. But it's interesting to see how they characterise the X360 approach. In the CEDEC slides they say "Just like dual-core PC, but uses 3 PPC cores with HT" and the latest AGEIA statement also says "This is the same SDK that operates on the single core PC and dual core PC." I actually asked them about that, and whether there was a difference in implementation between these SDKs and that of the PhysX chip, and he simply said that Novodex is a set of common algorithms, and these get mapped to each piece of hardware. So I'm sure they will be tuning it very specifically for X360, as they will for Cell and PhysX and PC chips, but it might just be that the best overall approach for X360 is not too dissimilar to the dual-core PC approach, but with another core (and that seems implied in some of these comments).
 
Jawed said:
Even if Ageia has PC-specific dual-core supporting code, you can't just re-compile that code for Xenon. The Xenon programming model is just too different. So is Cell.
As AGEIA guys don't have to mess around with CPU-GPU interaction, I think they don't have much to optimize for Xbox 360 CPU except for in-order friendly coding.
 
Titanio, given that we know that Cell and Xenon PPEs are real dogs running "PC code re-compiled" because they aren't OoO cores, such an argument just seems like wishful thinking.
The SDK's interface should be the same across platforms - otherwise you haven't really got a cross-platform middleware product.

The actual guts of the code inside the library will need to be hand-coded for Cell and Xenon. That's the big deal with the next-gen cores, extremely intensive applications need to be mothered to make sure they perform well.

The physics algorithms are surely going to be the same (i.e. the object model, the physical effects simulated (friction, gravity, degrees of freedom, collision, etc.)) at the macro level - but the code required to implement these effects needs per-platform tuning.

A simple reason why the PC code version of the SDK can't simply be re-compiled for Xenon is the VMX architecture. PCs don't have anything like that computing power available per core, so PC code just isn't going to gel on Xenon. In my opinion.

Jawed
 
one said:
As AGEIA guys don't have to mess around with CPU-GPU interaction, I think they don't have much to optimize for Xbox 360 CPU except for in-order friendly coding.
I honestly can't understand what you're saying :???:

Jawed
 
Jawed said:
Titanio, given that we know that Cell and Xenon PPEs are real dogs running "PC code re-compiled" because they aren't OoO cores, such an argument just seems like wishful thinking.
The SDK's interface should be the same across platforms - otherwise you haven't really got a cross-platform middleware product.

The actual guts of the code inside the library will need to be hand-coded for Cell and Xenon. That's the big deal with the next-gen cores, extremely intensive applications need to be mothered to make sure they perform well.

The physics algorithms are surely going to be the same (i.e. the object model, the physical effects simulated (friction, gravity, degrees of freedom, collision, etc.)) at the macro level - but the code required to implement these effects needs per-platform tuning.

A simple reason why the PC code version of the SDK can't simply be re-compiled for Xenon is the VMX architecture. PCs don't have anything like that computing power available per core, so PC code just isn't going to gel on Xenon. In my opinion.

Jawed

I was talking about high level approach, not nuts-and-bolts implementation - the very first words in my post agreed that you can't just recompile PC code. Of course the code has to be different. I'm just wondering if the high level design, the splitting of work between cores etc. - if that approach will be more similar to how it's done on dual-core PC setups, but with the benefit of a third core.
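
To make the "same high-level design, one more core" idea concrete, here is a toy sketch in which the only thing that changes between the dual-core and three-core cases is the worker count. All names are hypothetical and the integrator is a stand-in; this is not how Novodex divides its work, and Python threads only illustrate the decomposition, not real parallel speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, workers):
    """Split items into `workers` contiguous slices of near-equal size."""
    base, extra = divmod(len(items), workers)
    slices, start = [], 0
    for w in range(workers):
        end = start + base + (1 if w < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def integrate(bodies, dt=1.0 / 60.0):
    """Toy per-body update: advance position by velocity * dt."""
    return [(pos + vel * dt, vel) for pos, vel in bodies]


def step(bodies, workers):
    """Run one simulation step with the bodies divided among workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(integrate, split_evenly(bodies, workers))
    return [body for part in parts for body in part]


bodies = [(float(i), 1.0) for i in range(10)]
dual_core = step(bodies, workers=2)    # dual-core PC style split
three_core = step(bodies, workers=3)   # same design, one more core
print(dual_core == three_core)
```

The per-platform tuning under discussion would live inside something like `integrate`, which is why the high-level split can look alike while the inner loops differ.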
 
Jawed said:
I honestly can't understand what you're saying :???:
Well, I'd like to ask you then, what kind of optimization can they add to NovodeX for Xbox 360 over the multithreading optimization they already added for dual-core SMP CPU for PC, except for the optimization for in-order cores? If it's a 3D engine, you have to pay close attention to the L2 cache in the CPU to stream data back and forth between the CPU and Xenos, but for a physics library that works on the CPU, what you have is a vague concept of the restriction that you may not be able to fully use the shared L2 cache in the CPU as it may be reserved by Xenos. Just my 2 cents, of course.
 
Titanio said:
I was talking about high level approach, not nuts-and-bolts implementation - the very first words in my post agreed that you can't just recompile PC code. Of course the code has to be different. I'm just wondering if the high level design, the splitting of work between cores etc. - if that approach will be more similar to how it's done on dual-core PC setups, but with the benefit of a third core.
But the high-level design isn't where you tweak for performance.

If you have an algorithm for simulating a spring between two objects - that algorithm doesn't change at a high level, no matter what platform you run the simulation on. The devil is in the details - which in this case is a matter of SPE stream-based coding versus VMX stream-based coding.

If that's the basis of a cloth simulation, but the cloth has 10,000 masses and four-way springs (for the sake of argument) you need to tune the micro-parallelism of your physics engine at the inner-loop per SPE/VMX level.
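
For reference, the high-level spring algorithm really is tiny. The sketch below assumes a plain Hooke's-law spring and a rectangular cloth grid in which each mass links to its up/down/left/right neighbours. The function names and the stiffness value are illustrative, and a tuned SPE or VMX version would restructure this inner loop rather than change the maths.

```python
import math


def spring_force(p1, p2, rest_length, stiffness):
    """Hooke's-law force on the mass at p1 from a spring to p2 (3D tuples)."""
    delta = tuple(b - a for a, b in zip(p1, p2))
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # coincident masses: direction undefined
    scale = stiffness * (length - rest_length) / length
    return tuple(scale * d for d in delta)


def grid_spring_count(rows, cols):
    """Springs in a rows x cols cloth with four-way neighbour links."""
    return rows * (cols - 1) + cols * (rows - 1)


# A spring stretched to twice its rest length pulls p1 toward p2.
print(spring_force((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.0, 10.0))
# 100 x 100 masses = 10,000 masses, as in the post.
print(grid_spring_count(100, 100))  # 19800
```

At 10,000 masses that is 19,800 spring evaluations per step, each a handful of vector operations, which is where the micro-parallelism in the inner loop starts to matter.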

Have you ever written machine code?... Have you ever tuned algorithms based on the expected dataset size or the degrees of parallelism in the data? All this is craft - it's the art of programming.

The Novodex physics library is inherently multi-threaded. That's how they've written it, with cognisance of the fact that they're targeting it at the PPU and next-gen consoles (and eventually multi-core PCs, hahahaha, when PCs get CPUs with more than 4 cores in them, prolly).

I still think Cell has a natural advantage simply because of brute GFLOPs and the larger number of cores. But Cell isn't easy to program for.

Reading the SDK documentation that was linked in the last day or two, it's interesting to note there's no mention of fluid simulation...

Jawed
 
The VMX architecture of Xenon is notably different from other PPE/PPC implementations. The register set is relatively large (it amounts to 2K of memory, if you like), which makes for a wholly different way to program with it. Cell's SPEs have a similarly extravagant set of registers, for similar reasons...

Also the instruction set and the in-built short-cuts for vector representation appear to mean that code written for other platforms won't be making use of the vector-math refinements in Xenon. That code will be taking the conventional approach, e.g. to performing dot-products, when not only has Xenon got a DP instruction, but the format of the vector data going into the DP doesn't need to be re-formatted as is normal in other vector-math architectures.

Well, that's my understanding, anyway. I'm just picking this stuff up here and there.
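
A sketch of the data-format point about dot products, using plain Python as a stand-in for vector code (these are not Xenon intrinsics, and all names are hypothetical). The batched version needs the data swizzled into separate x, y and z arrays first, which is the kind of re-formatting step that a dot-product instruction working directly on packed xyz vectors would make unnecessary.

```python
def dot_aos(a, b):
    """One dot product per pair of packed (x, y, z) tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dot_soa(ax, ay, az, bx, by, bz):
    """Batched dot products over separate component arrays, so each
    arithmetic step maps onto one lane-wise vector operation."""
    return [x1 * x2 + y1 * y2 + z1 * z2
            for x1, y1, z1, x2, y2, z2 in zip(ax, ay, az, bx, by, bz)]


a = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
b = [(7.0, 8.0, 9.0), (1.0, 0.0, 0.0)]

per_vector = [dot_aos(u, v) for u, v in zip(a, b)]

# Re-format (swizzle) packed vectors into per-component arrays.
ax, ay, az = (list(c) for c in zip(*a))
bx, by, bz = (list(c) for c in zip(*b))
batched = dot_soa(ax, ay, az, bx, by, bz)

print(per_vector, batched)
```

Both paths give the same numbers; the difference is only in how much data shuffling surrounds the arithmetic.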

Jawed
 