Will the next generation consoles be CPU bound?

blakjedi

Veteran
It seems to me that the greatest strength of Cell is that it has legs enough to keep it from potentially ever becoming CPU limited when it comes to geometry, as opposed to most PC setups...

The XeCPU will have much more legs than any other system before it but unless integer instructions are the limiting factor when developers describe a system as being CPU bound, then it CAN'T keep up with Cell. At least as far as geometry is concerned.

If that is the case, then is being CPU limited just a matter of FP, or can integer instructions also be a cause of this limitation? Don't flame me for not fully understanding this principle, please.
 
My best guess leans towards bandwidth being the limiting factor. Both CPUs are fairly strong in FP power, and really, there's no question that their FP power far exceeds their integer power (regular old integer ops have higher latency, I believe). Geometry processing is fine and all, but until ways to work around the pangs of the memory subsystems are worked out, I don't see anybody using the CPU for a large percentage of geometry processing (probably even less so than some of us currently do on PCs).
 
The same can be said of XeCPU though, if not more so. Peak FLOPS is all very well, but keeping the processors fed will probably be the trickiest part of managing them. PS3 has a bandwidth advantage both with main RAM and with LS for the processing elements.
 
How much bandwidth would be enough though? Does the term CPU limited also include bandwidth issues? I'm learning here. :D
 
blakjedi said:
How much bandwidth would be enough though? Does the term CPU limited also include bandwidth issues? I'm learning here. :D

The issue seems to be "bandwidth limitation", as the above posters stated. It seems like the CPUs will be more than enough... but like Shifty said, it's all about keeping them fed, the bandwidth for both the XCPU and CELL will probably be limited by the bandwidth...
 
Enough to provide the data as needed without waiting. If you can process 100 GFlops, 100 billion operations on 100 billion different values need 100 billion 4-byte numbers = 400 billion bytes: of the order of 400 gigabytes/s.

If instead you're taking a value and performing several operations on it, say in procedurally creating a texture, let's say doing 10 Flops on each piece of data, your consumption drops to 40 GB/s.
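
To put the arithmetic above in one place, here is a minimal Python sketch. The function name `required_bandwidth_gb_s` and its parameters are made up for illustration, and it assumes every value is a 4-byte float that is read exactly once and never reused from cache or local store.

```python
def required_bandwidth_gb_s(gflops: float, flops_per_value: float,
                            bytes_per_value: int = 4) -> float:
    """Bandwidth (GB/s) needed to keep the ALUs fed.

    Assumption (not a measured figure): each value is fetched once
    and receives `flops_per_value` operations before being dropped.
    """
    values_per_second = gflops * 1e9 / flops_per_value
    return values_per_second * bytes_per_value / 1e9


# The two cases from the post: 1 flop per value and 10 flops per value.
for flops_per_value in (1, 10):
    bw = required_bandwidth_gb_s(100, flops_per_value)
    print(f"{flops_per_value:>2} flop(s) per value: {bw:.0f} GB/s")
```

The point is only that the required bandwidth scales inversely with how much work you do per value fetched; real code also writes results back and reuses data, which this ignores.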

So it all depends on throughput. Where Cell wins out IMO, and this comes from its 'design for bandwidth', is that its local store is fast enough and large enough to hold sufficient quantities of readily accessible data without constantly writing to and from main RAM; a job that would otherwise have to be managed by an L2 cache that is neither as fast as LS nor as predictable (though recent discussion here shows it's more effective than I'd have anticipated). Depending on the types of operations, SPEs should in a number of cases have a significant BW advantage over a conventional core.
 
The term "CPU limited" more or less refers to whether the processor's performance in the tasks assigned to it is limiting the performance of the entire application. For example, if the CPU is not processing things like AI, physics, or the associated collision detection logic quickly enough, then this degrades the performance of the game overall, as these are some of the most intensive elements that eat CPU cycles.

Being CPU limited has nothing to do with the degree of bandwidth you have between your CPU and your GPU, though having more bandwidth can't hurt. I could throw my video card from AGP 8x mode to PCI mode and I would not see much of a degradation of performance in most cases. I could cut my CPU's FSB down by 1/4 and I would not see much of a degradation in performance, and I could cut my system memory speed in half and not see much of a difference in performance in most cases.

Everything considered, if the Cell CPU is not as competent as the Pentium 4s and AMD64s in the most intensive areas of CPU usage that typically limit games, then it is likely that Cell will become MORE CPU limited than your standard Pentium 4 or AMD64 and would degrade performance instead of improving it. Most indications are that it is not as competent as Intel/AMD in these areas, except for physics. FLOPs have nothing to do with a game being CPU limited or not, and neither does bandwidth between the CPU and GPU... efficiency does play a role though.
 
That's not true. Things are harder to do on Cell/XeCPU, but they are way more capable than today's Intel/AMD offerings.

Yes, they did not focus on GP as much, but they have more than compensated for it, and there are examples on this very board of how many things done with OOOe processors can be done on both Cell and the XeCPU.

Will certain things be easier to set up on a common desktop processor? Yes.

Does this make them more capable? No.

In time, when the shift in programming philosophies that must occur is in full flight, perhaps even the answer to the first question may change.

I'd agree that bandwidth should be considered a limiting factor in itself.
 
Being CPU limited has nothing to do with the degree of bandwidth you have between your CPU and your GPU
I don't know of anybody here who was talking about CPU<->GPU bandwidth as much as memory bandwidth. I mean, if a CPU has to stall for 520 cycles waiting for 128 bytes of data to come from memory, that's basically just bad.
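
A rough Python sketch of what a stall like that costs. The 520 cycles and 128 bytes come from the post; the 3.2 GHz clock is my own assumption for illustration, as is the worst case that misses are fully serialized (no prefetching or overlapping requests). `stall_cost` is a made-up helper name.

```python
def stall_cost(stall_cycles: int, bytes_fetched: int, clock_hz: float):
    """Return (stall time in seconds, effective bytes/s) for one blocking fetch.

    Assumes the core does nothing else while waiting, so every fetch
    pays the full latency.
    """
    stall_seconds = stall_cycles / clock_hz
    effective_bytes_per_s = bytes_fetched / stall_seconds
    return stall_seconds, effective_bytes_per_s


CLOCK_HZ = 3.2e9  # assumed clock, not taken from the thread

seconds, bytes_per_s = stall_cost(520, 128, CLOCK_HZ)
print(f"stall: {seconds * 1e9:.1f} ns per 128-byte fetch")
print(f"effective bandwidth if every fetch stalls: {bytes_per_s / 1e9:.2f} GB/s")
```

Under those assumptions each fetch costs about 162.5 ns and the core sees well under 1 GB/s, which is why hiding latency matters at least as much as the headline bandwidth figure.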

If you can process 100 GFlops, 100 billion operations on 100 billion different values need 100 billion 4-byte numbers = 400 billion bytes: of the order of 400 gigabytes/s.

If instead you're taking a value and performing several operations on it, say in procedurally creating a texture, let's say doing 10 Flops on each piece of data, your consumption drops to 40 GB/s.
That sort of logic works out fine within the scope of a single function in a single thread. What happens when data is being prepared for reuse much further down the line? You can't guarantee that the data will still be in cache, so you end up polling memory again. Of course, that's one thing you can talk about with CPUs that isn't much of a concern for GPUs -- latency will kill you.

In any case, the thing that makes bandwidth-limited separate is the fact that it is a platform feature. The CPU simply happens to be stuck in that platform. Personally, I think if you're CPU-limited, it should be a flaw of the CPU's own design that is limiting you, not where the CPU is placed. Now if you were bandwidth limited against the CPU's own cache (god forbid), then that's a flaw of the CPU.
 
In a console there should be no single limiting component. You will be system limited, especially later in the life of the console.

On a closed platform you can tweak your game to use the maximum power of all components (if it is properly designed). So I doubt either will be CPU limited.
 
the bandwidth for both the XCPU and CELL will probably be limited by the bandwidth

You know?...I never could bend my brain around that one until you put it so plainly. You're right! ;) :p
 