Jaws said:
That RSX 57 GB/sec read|write is as real as Xenos 54 GB/sec read|write (22.4 system + 32 to EDRAM module). However, in X360's case, the XeCPU under those conditions cannot access system RAM but CELL can still access system RAM (XDR).
That is totally silly.
How does half of this apply to AA? You are counting the read+write over the FlexIO even though it exceeds the MEMORY bandwidth of the entire system, and THUS it has no relevance to the AA issue. If total memory bandwidth is ~48GB/s on the PS3, and AA is a bandwidth-limited task, I see no reason why we are discussing 57GB/s of RSX bandwidth. That is just playing with numbers, which of course goes both ways. So let's massage the numbers some:
22GB/s UMA + 22GB/s [11GB/s upstream / 11GB/s downstream] Xenon-to-Xenos L2 access + 32GB/s GPU Parent Die-to-GPU Daughter Die + 256GB/s GPU Daughter logic-to-eDRAM = 332GB/s
Wow, 332GB/s > 57GB/s
Oh, but wait, not all of this is relevant to the question at hand! Both scenarios are fallacious. RSX may have a total of 57GB/s read/write bandwidth, but it surely does not have 57GB/s of memory bandwidth, because the PS3 only has ~48GB/s of total system memory bandwidth. The PS3 has roughly 10GB/s less memory bandwidth than the RSX total bandwidth because not all of that bandwidth goes to memory. So that number is totally irrelevant for the discussion of AA.
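To put the arithmetic above in one place, here is a quick Python sketch. The Xbox 360 and RSX figures are the ones quoted in this thread; the 22.4 and 25.6 GB/s values for GDDR3 and XDR are the commonly cited unrounded PS3 figures and are an assumption on my part, as are all the variable names.

```python
# All figures in GB/s. Names are illustrative only.
xbox360_links = {
    "UMA (system RAM)": 22,
    "Xenon <-> Xenos L2 access (11 up + 11 down)": 22,
    "GPU parent die <-> daughter die": 32,
    "daughter die logic <-> eDRAM": 256,
}

# Assumed unrounded PS3 memory figures (the thread rounds these to 22 and 25).
ps3_memory = {"GDDR3": 22.4, "XDR": 25.6}

rsx_total_link_bw = 57  # read + write figure quoted for RSX

# "Massaged" X360 total: adds up every link, whether or not it is memory.
xbox360_sum = sum(xbox360_links.values())

# What the PS3 can actually move to and from memory in total.
ps3_memory_bw = round(sum(ps3_memory.values()), 1)

# Portion of the RSX figure that cannot be backed by memory at all.
not_memory_backed = round(rsx_total_link_bw - ps3_memory_bw, 1)

print(f"Massaged X360 total: {xbox360_sum} GB/s")
print(f"PS3 total memory bandwidth: {ps3_memory_bw} GB/s")
print(f"RSX link bandwidth not backed by memory: {not_memory_backed} GB/s")
```

Either way you round it, a chunk of the 57GB/s figure is not memory bandwidth, which is the only kind that matters for AA.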
Further, the math about expected bandwidth hits does not line up with reality. While it is nice to call them "maximum" usage for HDR and AA and the like, experience is telling us that this is not the case. Theoretically, mid-level cards like the 6600GT should be able to do nice amounts of AA without having bandwidth issues. Yet they do have them. Even the NV40 is bandwidth limited at times, as is the G70. How can a GPU with almost 40GB/s be bandwidth limited based on those numbers?
Maybe because theoretical maximum usage does not line up with the GPU's memory efficiency and how bandwidth is consumed in real-world scenarios? ATI has quoted much higher numbers for expected bandwidth usage. I cannot find the link, but I remember them giving a range of something like 26GB/s-134GB/s. My memory could be foggy, but the numbers were quite large.
Back to the original points:
1. These are not even Xenos vs. RSX slides. Neither is ever mentioned.
2. Comparing DX7 and DX8 style games, as many of the games in the slides are, as an estimate of next gen usage is completely irrelevant. One glance at the high geometry Sony Render Targets from E3 clearly demonstrates a huge gap in complexity.
3. The 720p slide contains a lot of CPU limited games, therefore it is impossible to compare the performance hit for AA at this resolution.
4. Modern games like Far Cry and Doom 3 take a ~40% hit at 1600x1200 (which is ~7% fewer pixels than 1080p). 40% is indeed substantial and dwarfs the 1-5% number ATI has quoted for 720p 4xMSAA. Further, it is obvious Sony is aiming much higher than either game for next gen. Simply put, the G70 is not giving free AA, or even a minimal performance hit, on modern games. This could be more pronounced in games with much higher levels of geometry.
5. The G70 most likely has more memory bandwidth than the RSX can realistically expect to have available in the PS3 in real-world scenarios. If the RSX used all the bandwidth available to it (22GB/s to the GDDR3, 15+20GB/s to the XDR, which totally maxes out the 25GB/s of XDR memory bandwidth) that would not only leave the CELL CPU memory starved, it would leave it completely IDLE. This would defeat the purpose of course and is unrealistic to say the least.
To quickly compare, if the RSX is allocated the same 38GB/s of bandwidth the G70 has, that would leave the PS3 CELL with 10GB/s of main memory bandwidth. CELL is a very memory-dependent design. No one here can say whether 10GB/s is enough or not, but it would be fair to say this could be a significant hurdle to getting 218GFLOPs out of the CELL.
Best case scenario, the RSX could end up having comparable available bandwidth to the G70. Yet the RSX is looking to be a ~28% faster core (550MHz vs. 430MHz).
6. G70 cannot do HDR and MSAA at the same time. The reference to SSAA (which has a much larger performance hit than MSAA) with HDR is misleading. The benchmarks are showing that you need two G70s in SLI to get HDR + SSAA in modern games. Again, these are PR slides and nothing more.
7. The PS3 is a closed box. Miracles occur in these wonderful closed boxes, i.e. we can expect technically savvy developers over the next 5 years to find ways to maximize the potential of the system, whereas even nasty DX7 games are going to be more inefficient than the wonders that appear on the PS3.
8. ATI are not idiots. There is a reason they gave up 105M transistors for eDRAM. Ditto Sony with the PS2 and Nintendo with the GCN.
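For anyone who wants to check the numbers in points 4 and 5, here is a rough Python sketch. It uses only the rounded figures quoted in this thread (~48GB/s total PS3 memory bandwidth, 38GB/s for the G70, 550MHz vs. 430MHz); the variable names are made up for illustration.

```python
# Figures as quoted in the thread (rounded); names are illustrative only.
ps3_memory_bw = 48   # GB/s, total GDDR3 + XDR
g70_memory_bw = 38   # GB/s, as quoted for the G70

# What CELL keeps if RSX is handed a G70-sized share of memory bandwidth.
cell_leftover = ps3_memory_bw - g70_memory_bw

# Core clock advantage of RSX over G70, as a fraction.
rsx_clock_mhz, g70_clock_mhz = 550, 430
clock_advantage = rsx_clock_mhz / g70_clock_mhz - 1

# How many fewer pixels 1600x1200 has compared to 1080p, as a fraction.
pixels_1600x1200 = 1600 * 1200
pixels_1080p = 1920 * 1080
fewer_pixels = 1 - pixels_1600x1200 / pixels_1080p

print(f"CELL leftover bandwidth: {cell_leftover} GB/s")
print(f"RSX clock advantage: {clock_advantage:.0%}")
print(f"1600x1200 vs 1080p: {fewer_pixels:.1%} fewer pixels")
```

The clock ratio works out to about 28%, and 1600x1200 pushes about 7% fewer pixels than 1080p, so the ~40% AA hit at that resolution is, if anything, a slightly generous stand-in for 1080p.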
People really need to stop playing down the eDRAM. Comparing the bandwidth to CELL, when it is above and beyond that of the total system memory bandwidth, is silly.
Theoretical numbers aside we already know modern games take a performance hit with AA enabled on G70. Whether that is a bandwidth issue, fillrate, etc... does not matter.
If RSX is an implementation of G70 technology, as is expected, then RSX is going to take a hit in modern games with 4xAA enabled at 1080p. A large hit at that.
EDIT:
ultimate_end said:
It's memory bandwidth, and its application, that is being discussed here, i.e. Antialiasing. How is the bandwidth of the EIB relevant here? What are you going to do, write the frame buffer to one of the SPE's local storage?
Heh, you beat me
It is a fallacious comparison. Basically, RSX's total bandwidth is irrelevant when discussing AA, because total system memory bandwidth is less than total RSX bandwidth (obviously the RSX can also be fed information from CELL, so the higher link bandwidth is understandable).
The eDRAM gets under the skin of some. Anyone who follows the forums can see that.
I am still chuckling over how this is somehow a Sony PR win, even though Xenos and RSX are not mentioned, yet Major Nelson's comparison was not, because we could separate the facts from the sweet talk. For some reason that same principle is not being applied here.