Kryton said:
No it shouldn't, unless you want suckful performance. I'm saying the graphics hierarchy is the same as what we had (AGP), and no one had a problem with it until trying to push it too hard (the PS3 interface is entirely different but follows a similar design).
Coding on the PS3 like you do on a PC (well, at least in the aspects relevant to this discussion) won't be "suckful" at all. The only big exception is that you should use XDR texturing as much as possible (provided you have the RAM space), whereas AGP texturing is undesirable on the PC.
It's true that some new effects might be possible by doing post-processing on Cell, and GPU->CPU transfers are waaaaaay faster on the PS3 than on a PC. However, if you don't have any reason to do this, high-level PS3 game structure likely won't be any different than on a PC. Certainly, carrying PC coding habits over to the PS3 won't hurt performance.
People keep expecting the FlexIO link to be critical to PS3 performance, but the only way you get 35 GB/s of transfer is with constant reading and writing throughout a frame. Graphics workloads just don't happen that way. If you see an average of more than 100 MB of data moved between RSX and Cell per frame, I'd be very surprised.
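To put that in perspective, here's the back-of-the-envelope arithmetic as a quick C++ sketch. All the figures are assumptions for illustration: the 35 GB/s peak from above, a 60 fps target, and the 100 MB/frame estimate.

// Back-of-the-envelope: how much data FlexIO *could* move per frame at peak,
// versus how little a typical frame is likely to actually need.
#include <cstdio>

int main() {
    const double flexio_peak_gb_per_s = 35.0;   // quoted Cell<->RSX peak
    const double frame_rate           = 60.0;   // target frames per second
    const double typical_mb_per_frame = 100.0;  // generous estimate from the post

    double peak_mb_per_frame = flexio_peak_gb_per_s * 1000.0 / frame_rate;
    double utilization       = typical_mb_per_frame / peak_mb_per_frame;

    printf("Peak transfer per frame   : %.0f MB\n", peak_mb_per_frame);    // ~583 MB
    printf("Typical transfer per frame: %.0f MB\n", typical_mb_per_frame);
    printf("Link utilization          : %.1f%%\n", utilization * 100.0);   // ~17%
    return 0;
}

In other words, even a generous 100 MB of traffic per frame leaves the link mostly idle.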
And the PC has the opposite; which is better, I don't know.
I don't know which is better, though the reason for this on the PC is modularity. Different people have different needs from a graphics card (just look at the order-of-magnitude gap between low-end and high-end), so it doesn't make sense to have a unified memory pool. I can say with certainty that 50 GB/s to a single 512MB pool is better than 25 GB/s to each of two 256MB pools. This, of course, is not the situation we're seeing in XB360 vs. PS3, so the point is moot. In a way, XB360 sort of has a split pool too.
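A minimal sketch of why, using made-up demand numbers (10 GB/s from the CPU side, 35 GB/s from the GPU side): the unified pool lets the hungrier client soak up the slack, while the split pools strand bandwidth behind the wrong client.

// Unified 50 GB/s pool vs. two split 25 GB/s pools under lopsided demand.
// Demand figures are hypothetical, chosen only to illustrate the point.
#include <algorithm>
#include <cstdio>

int main() {
    const double cpu_demand = 10.0;  // GB/s the CPU side wants (assumed)
    const double gpu_demand = 35.0;  // GB/s the GPU side wants (assumed)

    // Unified: both clients share one 50 GB/s pool.
    double unified_served = std::min(cpu_demand + gpu_demand, 50.0);

    // Split: each client is capped at its own 25 GB/s pool.
    double split_served = std::min(cpu_demand, 25.0) + std::min(gpu_demand, 25.0);

    printf("Unified pool serves %.0f of %.0f GB/s\n", unified_served, cpu_demand + gpu_demand); // 45 of 45
    printf("Split pools serve   %.0f of %.0f GB/s\n", split_served,   cpu_demand + gpu_demand); // 35 of 45
    return 0;
}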
Making a good memory controller that can handle requests coming at it from two locations is tricky because you have to give priority to someone, but who? Full duplex just means you can read and write at the same time, so I'm not sure what you mean here.
Actually, you only need to consider priority if you're transferring at the peak rate (i.e. the bus can't accommodate both), and in that case deciding who gets priority is mostly irrelevant.
When you're bandwidth limited, you have some total amount of data transfer that's necessary to complete one frame of game code and rendering. There is usually a frame of latency between the CPU and GPU because draw calls are buffered up, and there isn't any interdependency between them. What I'm saying is usually the GPU has one frame of commands to execute, and the CPU is preparing the next one. So if you're BW limited, it doesn't matter what order you do this in, since you'll saturate BW either way, and there's no way to go faster than that.
The only time priority matters is if you have drastic changes in BW consumption for sustained periods within a frame. Then ordering could matter, because low-BW code on the CPU would run best alongside a high-BW load on the GPU, and vice versa. This is not something the memory controller can predict, though; it's up to the coder to manually assign priorities or reorder their code. If one load is, on average, much higher than the other, it makes sense to give the lower load priority, because the higher load can always fill in the gaps to keep the bus saturated.
But if both loads are high BW, then it doesn't matter who goes first.
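Here's a toy arbiter simulation to illustrate that last point. All numbers are assumed (a ~25 GB/s shared bus, 150 MB of CPU traffic and 450 MB of GPU traffic per frame): swapping which client gets priority changes who waits, but not when the combined load finishes.

// Toy arbiter: each 1 ms slice, 'first' gets the bus ahead of 'second',
// and whatever bandwidth 'first' doesn't use spills over to 'second'.
#include <cstdio>

int run_frame(int first_mb, int second_mb, int bus_mb_per_ms) {
    int ms = 0;
    while (first_mb > 0 || second_mb > 0) {
        int budget = bus_mb_per_ms;                      // MB the bus can move this slice
        int take1 = first_mb < budget ? first_mb : budget;
        first_mb -= take1;
        budget -= take1;                                 // leftover spills to the other client
        int take2 = second_mb < budget ? second_mb : budget;
        second_mb -= take2;
        ++ms;
    }
    return ms;
}

int main() {
    const int bus_mb_per_ms = 25;                        // ~25 GB/s shared bus (assumed)
    const int cpu_mb = 150, gpu_mb = 450;                // per-frame traffic (assumed)

    printf("CPU prioritized: frame traffic drained in %d ms\n", run_frame(cpu_mb, gpu_mb, bus_mb_per_ms));
    printf("GPU prioritized: frame traffic drained in %d ms\n", run_frame(gpu_mb, cpu_mb, bus_mb_per_ms));
    // Both print 24 ms: total data / bus bandwidth, no matter who goes first.
    return 0;
}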
Also, handling multiple clients in the memory controller isn't anything new or "tricky". GPUs already have to handle requests from the command processor, vertex engine, texture units, Z-test units, render back-end, etc. It's not a big deal for ATI or NVIDIA to add in CPU requests, as they've done with the Xbox 360, the original Xbox, and integrated chipsets.
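For what it's worth, the arbitration itself can be as simple as a rotating grant. A minimal round-robin sketch follows; the client names are illustrative, not any real chip's request ports.

// Minimal round-robin arbiter: rotate through the clients, granting the
// memory bus to the next one that is requesting. Adding the CPU is just
// one more entry in the table.
#include <cstdio>

int main() {
    const char* clients[]    = {"command processor", "vertex engine", "texture units",
                                "z/stencil", "render back-end", "CPU"};
    bool        requesting[] = {true, false, true, true, false, true};
    const int   n = sizeof(requesting) / sizeof(requesting[0]);

    int grant = 0;                                  // rotating priority pointer
    for (int cycle = 0; cycle < 2 * n; ++cycle) {
        // Starting from 'grant', pick the first client that wants the bus.
        for (int i = 0; i < n; ++i) {
            int c = (grant + i) % n;
            if (requesting[c]) {
                printf("cycle %d: grant -> %s\n", cycle, clients[c]);
                grant = (c + 1) % n;                // next time, start after the winner
                break;
            }
        }
    }
    return 0;
}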