Console CPUs?

Fafalada said:
Speaking of which, with memory 400-500cycles away on these new consoles, wouldn't anything that goes out of L2 a lot hurt the in-order PPC cores practically just as much as SPEs?
DeanoC said:
But what about next-next gen, when we start seeing 30-40 processors and main memory latency in the 1,000's of cycles...
A little note: according to Mr. Hofstee's presentation about CELL, we're already in the range of 1000 cycles of latency.
 
nAo said:
Usually you make a query to such a collision subsystem, and once the system comes back with the answers you need, you can go on and make some decision (like destroying an object or making other queries).
This kind of stuff will run like a pig (I like to say that ;) ) on an SPE, but what if you change the way you look at a collision subsystem?
You could make a collision engine that collects all the queries for a given frame/pass. Each query is spatially pre-sorted and then processed, re-using a great deal of data (like collision meshes) across different queries. In a subsequent pass all the results are retired and processed, and this process can be iterated multiple times per frame. This would run much faster on an SPE.

ciao,
Marco
If I understand right, you're describing a sort of state monitoring system, recording world-wide states and batch-processing them? Wouldn't this sort of setup be very memory intensive though? Add in hi-res models, hi-res textures, complex AI, hi-fidelity music...I can't see any amount of RAM being enough!
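A rough sketch of the batching nAo describes, in Python for readability (a real SPE version would presumably be C/C++ with explicit DMA). The grid cell size, the `load_mesh` callback, and the query format are all hypothetical. The only point is that bucketing queries by cell lets each collision mesh be fetched once per batch instead of once per query.

```python
from collections import defaultdict

CELL_SIZE = 10.0  # hypothetical grid cell size in world units


def cell_of(pos):
    """Map a 2D position to an integer grid cell."""
    return (int(pos[0] // CELL_SIZE), int(pos[1] // CELL_SIZE))


class CollisionBatcher:
    """Collects queries during a pass, then answers them grouped by cell."""

    def __init__(self, load_mesh):
        self.load_mesh = load_mesh  # hypothetical callback: cell -> collision data
        self.pending = []           # (query_id, position) pairs awaiting a flush
        self.mesh_loads = 0         # how many times collision data was fetched

    def submit(self, query_id, pos):
        # Callers do not get an answer here; they read it after flush().
        self.pending.append((query_id, pos))

    def flush(self, test):
        # Spatial pre-sort: bucket queries by the cell they fall in.
        buckets = defaultdict(list)
        for qid, pos in self.pending:
            buckets[cell_of(pos)].append((qid, pos))
        self.pending = []

        results = {}
        for cell in sorted(buckets):
            mesh = self.load_mesh(cell)  # one fetch shared by every query in the cell
            self.mesh_loads += 1
            for qid, pos in buckets[cell]:
                results[qid] = test(mesh, pos)
        return results
```

Game code would read the results after `flush`, submit any follow-up queries, and flush again, a few times per frame. As for memory, the pending list in this sketch only holds an id and a position per query, so the batch itself stays small next to the meshes.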
 
Gubbi said:
Reading data from the L2 would be automatic, since the L2 would snoop all memory transactions (and hence what the DMA engine does) and could serve data that it has cached. How the SPEs (the DMA engine) would place data in the L2, I have no idea.
It's been too long since I read the patent; I don't remember whether they considered writing to L2 or not.
Anyway, my point was just that with fetches from L2, I could get some random access in several tens of cycles, which actually sounds manageable, as opposed to just plain running like a pig ;)

At the very least, it would mean more flexibility than running strictly streaming algorithms.
 
MfA said:
With enough bandwidth and parallelism you can overcome any latency ;)

Sooooo, if we just pointed enough satellite dishes at distant satellites we could warp space and time to shorten the amount of time it takes to get a signal out and back? Awesome. Speed overcomes latency, not bandwidth.
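The two posts above are arguably talking about different things, and a bit of arithmetic along the lines of Little's law shows both. Keeping more requests in flight raises throughput until the bus saturates, while the wait for any single request stays exactly the same. In the sketch below, the 500-cycle latency is the figure quoted earlier in the thread; the 128-byte request size and the 8 bytes/cycle peak are made-up numbers for illustration only.

```python
import math


def sustained_throughput(latency_cycles, bytes_per_request, in_flight,
                         peak_bytes_per_cycle):
    """Bytes per cycle delivered when `in_flight` requests are always outstanding."""
    latency_bound = in_flight * bytes_per_request / latency_cycles
    return min(latency_bound, peak_bytes_per_cycle)


def requests_to_saturate(latency_cycles, bytes_per_request, peak_bytes_per_cycle):
    """Outstanding requests needed before bandwidth, not latency, is the limit."""
    return math.ceil(latency_cycles * peak_bytes_per_cycle / bytes_per_request)


LATENCY = 500        # cycles, from the thread
REQUEST_BYTES = 128  # assumed transfer size
PEAK = 8.0           # assumed peak bytes per cycle

one_at_a_time = sustained_throughput(LATENCY, REQUEST_BYTES, 1, PEAK)
needed = requests_to_saturate(LATENCY, REQUEST_BYTES, PEAK)
saturated = sustained_throughput(LATENCY, REQUEST_BYTES, needed, PEAK)
```

With these assumed numbers, a single dependent request chain gets a small fraction of peak, and a few dozen independent requests in flight are enough to hit the bandwidth cap. Each individual request still takes the full 500 cycles, so code that cannot issue independent requests gains nothing.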
 
Shifty Geezer said:
If I understand right, you're describing a sort of state monitoring system, recording world-wide states and batch-processing them? Wouldn't this sort of setup be very memory intensive though?
Maybe, but it would be much more linear. We love burst accesses... and compression ;)
 
Fafalada said:
ERP said:
I know what I expect, and that's a lot of single-threaded games which use the SPEs for graphics and little else. Someone will use all that power at some point for something beyond making things pretty, but I don't believe that day is particularly close to its launch.
Well I figure as much too, but just a question to throw in the mix...
What happens in a hypothetical scenario where the PS3 GPU has its own vertex shading? Would people still stick to stuffing more graphics stuff onto the SPEs, or...

Don't think it will matter much...

I think it's largely a question of what's easy at this point. Almost all of the practical SPE applications I've heard bandied around are graphics related, from vertex transforms through tessellation to radiosity computations.

All these things will have impact, but to me it's a cop out.

I really think it's just the low hanging fruit people are looking for right now.
 
Fafalada said:
Gubbi said:
Speaking of which, with memory 400-500cycles away on these new consoles, wouldn't anything that goes out of L2 a lot hurt the in-order PPC cores practically just as much as SPEs?

Sure, but good L2 caches do an incredible job of minimising those hits frame to frame. Your average game has relatively random access patterns, but they are extremely coherent frame to frame. Try managing that manually with DMA.

I did some performance tuning on a game a year or so ago. It was particularly cache unfriendly: the PS2 version averaged 16000+ DCache misses a frame, while the GC version running the same basic code averaged just 32.
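The frame-to-frame coherence argument can be illustrated with a toy LRU cache simulation. This is only a sketch: the cache sizes, the address range, and the access count are invented, it models neither console, and it makes no attempt to reproduce the miss counts quoted above. It just replays the same scattered access pattern every frame against a cache that holds the whole working set and against one that does not.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Fully associative LRU cache that tracks which lines are resident."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()

    def access(self, line):
        """Return True on a miss, False on a hit."""
        if line in self.lines:
            self.lines.move_to_end(line)     # mark as most recently used
            return False
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[line] = True
        return True


def misses_per_frame(frame_accesses, cache, frames):
    """Replay the same access list each frame and count misses per frame."""
    return [sum(cache.access(a) for a in frame_accesses) for _ in range(frames)]


rng = random.Random(0)
# Scattered ("random") line addresses, but the same list is replayed every frame.
frame = [rng.randrange(100_000) for _ in range(2_000)]

big = misses_per_frame(frame, LRUCache(4_096), frames=3)  # working set fits
small = misses_per_frame(frame, LRUCache(64), frames=3)   # working set does not fit
```

When the working set fits, only the first frame pays for the misses and later frames hit everywhere, with no effort from the programmer. When it does not fit, the misses come back every frame, which is roughly the situation a hand-written DMA scheme would have to manage explicitly.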
 