Is free AA really worth it?

onanie said:
gurgi said:
I like how the PS3 is painted as giving developers a choice, but, oh by the way, you don't get to choose which shaders do what, unlike the Xbox 360's unified design. You could go back and forth all day.

Both machines are *complete* designs, balanced from their own perspective. You can't just take 100M transistors from the daughter die and plug them into the GPU, because the C1 was designed around the bandwidth savings of that 100M. You might as well suggest that Sony take one or two SPEs and plug those transistors into RSX, since it's been suggested by certain sites *cough* that the SPEs are just wasted die space anyway.
You can't choose what each individual Xenos pipeline does (three arrays).

On Xenos, the transistors for FSAA (and other fixed functions) will have influenced the budget for the main die. If Microsoft had insisted on matching GPU costs with RSX, we might have seen a 300M-transistor Xenos without a daughter die.

It seems to be the case that Nvidia pursues pixel processing more (in their decision not to include eDRAM), while ATI stresses post-processing; which is not to say that AA will be difficult on RSX either.

You really missed my point. If MS is forcing developers to use AA, then Sony is forcing developers to use SPE logic that could have been used to make RSX more powerful. Yes, SPEs are more general-purpose, so maybe my comparison is stupid, but this whole idea is stupid to start with, so it all fits. =P
 
Jawed said:
Compression efficiency falls off as triangles get smaller or as the poly count increases, so as games progress through the next gen, bandwidth demand goes up (even without AA, due to a higher framebuffer workload).

The effect in Xenos is that framebuffer tasks, including AA processing, cannot slow Xenos down, because the peak bandwidths generated by those tasks at 4 GP/s are designed into the eDRAM unit.
ATI's bet is on games increasing in polygon complexity versus per-pixel processing. Hence the reliance on bandwidth.

The X360 GPU is still limited by the 32 GB/s write to the daughter die, not the 256 GB/s you might be envisioning.
 
onanie said:
The X360 GPU is still limited by the 32 GB/s write to the daughter die, not the 256 GB/s you might be envisioning.

32 GB/s is not a limit for a 4 GP/s fill-rate though. Work out the numbers...
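
One rough way to work them out (a sketch, not a confirmed breakdown: the 4 bytes of colour plus 4 bytes of Z per pixel crossing the parent-to-daughter link is my assumption, and multisample expansion happens on the daughter die, so the extra AA samples don't multiply that traffic):

```python
# Back-of-the-envelope: does a 32 GB/s parent-to-daughter link cover a
# 4 GP/s peak fill-rate? Assumes 4 bytes colour + 4 bytes Z per pixel
# crossing the link; MSAA sample expansion happens on the daughter die,
# so the extra samples don't add to this traffic.
peak_fill = 4e9              # pixels per second (4 GP/s peak)
bytes_per_pixel = 4 + 4      # colour + Z (assumed)
print(peak_fill * bytes_per_pixel / 1e9, "GB/s")  # -> 32.0 GB/s
```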

Jawed
 
May I ask what 4 billion pixels a second gets used for? If the screen's only capable of displaying 60 million a second, and 4x AA is four times that, why the need for so much pixel capability? I'm guessing it's a value that gets used for more than just straight pixel drawing. :?
 
Do you get 60 MPixels/s from 60fps@720p?

If so... well, there is overdraw to consider too.
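
For reference, the raw display rate works out to a little under that figure (ignoring overdraw and AA samples):

```python
# Pixels actually displayed at 1280x720, 60 fps (ignores overdraw and AA)
pixels_per_frame = 1280 * 720                      # 921,600 pixels
print(pixels_per_frame * 60 / 1e6, "MPixels/s")    # -> ~55.3 MPixels/s
```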
 
Shifty Geezer said:
May I ask what 4 billion pixels a second gets used for? If the screen's only capable of displaying 60 million a second, and 4x AA is four times that, why the need for so much pixel capability? I'm guessing it's a value that gets used for more than just straight pixel drawing. :?

4 GP/s is a peak rate. You won't get a game running at that rate for the entire duration of a frame.

The point I'm making is that at (momentary) peak fill-rate the eDRAM will keep up without introducing any latency.

Jawed
 
Re. "what you need all the bandwidth for", it might be worth mentioning that there seemed to be the suggestion in B3D's Xenos article, at least, that no colour or z-compression would be used either. If you weren't accounting for that already ;)
 
Alright, all this talk of eDRAM has got me itching to lay out one of my crackpot theories. Sorry if it's off topic, but I'm dying to... jvd or sonic or whoever, if it's a problem please feel free to move it.

OK, so a while back, in an interview Kutaragi gave to (I believe) Hisashi Goto, he spoke a little about BC, saying it would be a combination of software and hardware. This left me wondering how it would be split up. Later on, in a past post, there came to light an odd discrepancy in the transistor count of the nVidia GPU... about 50 million were missing. This set something off in my head... wasn't the eDRAM in the GS roughly 50 million transistors? Could it be that RSX will also have some eDRAM? I'm not sure what effect these probably fictitious transistors would have, but I doubt nVidia would waste them. Told you it was a crackpot theory.

Anyway, I just had to get that out there... please return to your normally scheduled debate already in progress.
 
Well, it's the G70 that might have the missing transistors - nobody knows anything for sure about RSX other than that it's derived in one form or another from G70. But that would put the final nail in this particular theory, because there's just no way NVidia is putting 50 million transistors' worth of eDRAM onto G70.
 
Shifty Geezer said:
Those quotes are talking about the shaders, not the logic on the eDRAM, as far as I can see.

JAWED: I don't know the maths for rendering too well, and you identify the bandwidth usage nicely - thanks. My perspective is that existing GPUs can do AA without that much BW, ergo it isn't all needed. It'll be good to see how much is used when we get performance metrics for the hardware in operation; as you've written it, it looks like AA would be totally impossible at high res on existing GPUs and RSX, which have only half or even a third of that 128 GB/s required BW.

Also, why 4 billion pixels? 720p ~ 1 million pixels x 60 fps x 4 for AA = 240 million pixels a second, roughly 1/16th of your 4 GPixel figure, which places BW needs at around 4 GB/s for Z only; which explains why current hardware can manage it with 30-ish GB/s of BW.

We have yet to see games that push current high-end GPUs to their limits; maybe once we have games that make a 7800 do 30 fps at 1280x720, then AA will require that kind of bandwidth. Run Doom 3 at 1280x720 on a 9600 XT and then see how possible AA is.

ATI's bet is on games increasing in polygon complexity versus per-pixel processing. Hence the reliance on bandwidth.

Really? I had thought it was the other way around: ATI was betting on shaders, while Sony (not necessarily Nvidia) was betting on pixel and polygon counts. Cell could probably greatly help how many polys PS3 can push, and while shader performance might be debatable, I believe RSX has notably superior pixel performance.
 
Kutaragi has also spoken about it (eDRAM) in terms that paint it as unnecessary. So I think, at the very most, if it is there, it'd be only as part of the hardware solution for PS2 backwards compatibility, and not available to PS3 games (which would be so large a waste as to make the prospect very unlikely, imo).
 
Couldn't developers somehow use that eDRAM for future games, even if it's only 4 MB? Why lock out the use of it?
 
Ah, right, right... sorry, XbD.

I forgot to mention that I was taking into account the supposition that everyone seems to be making... that RSX is little more than a G70 with a memory bus modified to work with Rambus tech.

To extend that... my thinking was that the phantom transistors in the G70 were simply left unused (disabled), since eDRAM is uncommon in the PC GPU space, and it would probably improve yields considerably with less need to split the production line.

Still, it's wild fantasy even if it does fit together nicely.
 
I've speculated that some of those missing RSX transistors are for PS2 BC, but I doubt they're for eDRAM. KK has said they couldn't fit enough eDRAM to be any use, so they didn't want it, but 4 MB could surely be used for something weird and wonderful. As the full BW to RSX is about the same as PS2's eDRAM, and with compression the need for BW is much reduced, there seems little need for the eDRAM. More likely, IMO, the memory controllers will be set up to simulate eDRAM access.
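
Roughly, taking the figures that were being quoted around E3 2005 for that "about the same BW" comparison (these are my assumptions for illustration, not confirmed specs):

```python
# Hypothetical comparison of total RSX bandwidth vs the PS2 GS eDRAM,
# using commonly quoted 2005 figures (assumptions, not confirmed specs).
rsx_gddr3_bw    = 22.4      # GB/s, 128-bit GDDR3 @ 1.4 GHz effective
rsx_flexio_bw   = 20 + 15   # GB/s, write + read between RSX and Cell/XDR
ps2_gs_edram_bw = 48        # GB/s, Graphics Synthesizer eDRAM
print(rsx_gddr3_bw + rsx_flexio_bw, "GB/s vs", ps2_gs_edram_bw, "GB/s")  # ~57 vs 48
```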
 
Isn't it a bit unrealistic to expect them to include eDRAM in a G70 derivative so quickly?

Didn't ATI spend in the neighbourhood of 3 years engineering Xenos to take advantage of eDRAM as well as unified shaders? It seems unrealistic to think Nvidia can just do a few tweaks here, a few tweaks there, and add some eDRAM.

Would it even be effective? I would think the whole GPU would really need to be designed with the eDRAM in mind from the beginning... no?
 
Shifty Geezer said:
I've speculated that some of those missing RSX transistors are for PS2 BC, but I doubt they're for eDRAM. KK has said they couldn't fit enough eDRAM to be any use, so they didn't want it, but 4 MB could surely be used for something weird and wonderful. As the full BW to RSX is about the same as PS2's eDRAM, and with compression the need for BW is much reduced, there seems little need for the eDRAM. More likely, IMO, the memory controllers will be set up to simulate eDRAM access.

Missing G70 transistors - G70, my friend. ;)

There's just no way NVidia would have a chip fabbed with any PS-specific features on it; what an utter waste. Just as, hopefully (and more than likely), RSX will toss out PureVideo and any other PC-realm garbage.

I think RSX is just too much of an unknown as far as the GS goes right now, though I think it'd be great if it more or less incorporated it. I've been all over the G70 'shadow transistors' speculation myself, but as far as it applies to RSX, I think we can only look at it from the possibility of more pipes in the base architecture. Anything beyond that and it almost becomes a question of why it's there at all in G70 in the first place. To apply to RSX, it would have to be something that both chips could utilize.
 
Mefisutoferesu said:
Alright, all this talk of eDRAM has got me itching to lay out one of my crackpot theories. Sorry if it's off topic, but I'm dying to... jvd or sonic or whoever, if it's a problem please feel free to move it.

OK, so a while back, in an interview Kutaragi gave to (I believe) Hisashi Goto, he spoke a little about BC, saying it would be a combination of software and hardware. This left me wondering how it would be split up. Later on, in a past post, there came to light an odd discrepancy in the transistor count of the nVidia GPU... about 50 million were missing. This set something off in my head... wasn't the eDRAM in the GS roughly 50 million transistors? Could it be that RSX will also have some eDRAM? I'm not sure what effect these probably fictitious transistors would have, but I doubt nVidia would waste them. Told you it was a crackpot theory.

Anyway, I just had to get that out there... please return to your normally scheduled debate already in progress.


PS2's Graphics Synthesizer has about 43 million transistors. About 32 million of those transistors are for the 4 MB eDRAM.
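
A quick sanity check on that figure (a rough sketch, assuming a standard one-transistor-per-bit DRAM cell and ignoring decoders, sense amps and redundancy):

```python
# ~1 transistor per bit for a 1T1C DRAM cell; peripheral logic ignored.
edram_bits = 4 * 1000 * 1000 * 8                     # 4 MB ~= 32 Mbit
print(edram_bits * 1 / 1e6, "million transistors")   # -> ~32 million
```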
 
Nicked said:
BenQ said:
On top of that, it is only the uninformed among us who believe that the entire 100M transistors making up the daughter die are used for "just" AA. :LOL:
And it is only the most wishful-thinking X360 fans who believe it is used for anything incredibly significant.

Seriously, every X360 fan is expecting ATI to have delivered the miracle GPU that has incredible shading power and 'free' AA and whomps RSX, when the reality is that RSX is a beast that I have no doubt will trump Xenos in many departments.


You forget, though, that any advantages RSX has over Xenos will be relatively small compared to the advantages that the Graphics Synthesizer, Flipper and NV2A had over the Dreamcast's PowerVR2. There are a lot of games on the current consoles today that don't look much better than the Dreamcast's best-looking games. The point is, 360 graphics are not going to be beaten by much, if at all, by PS3 graphics. Some 360 games will look better than some PS3 games, and some PS3 games will look better than some 360 games.

Even if RSX significantly outperforms Xenos in one or two areas, there will be one or two areas where Xenos significantly outperforms RSX. It will be totally up to the developers to provide visually stunning games on both consoles -- both consoles will be SO close in graphics rendering features and sheer performance. There is going to be no overwhelming advantage of one graphics subsystem over the other. Both are looking pretty solid. Both have weaknesses, both have strengths, in different places.
 
Shifty Geezer said:
jvd said:
But what other stuff?
At the moment, Xenos has 48 unified-shader ALUs, divided into three arrays of 16, right? That, along with other components, allows it to process so many vertex and pixel shader commands a second. Developers will have to make sure their shaders can be run in time. If those 100M eDRAM transistors were given over to other things, ATI could have added a fourth array (another 16 ALUs), giving a 33% increase in shader performance. They compromised between peak shader power and the bandwidth-saving AA enhancement.

I don't see where they wasted transistors on eDRAM
I never said they did!!! Just that they COULD have done things differently, to gain a benefit in one area at the cost of AA performance. I've already said I like the Xenos design (several times!!). I would say ATI sacrificed shader performance in terms of ALUs to make room for the eDRAM, to produce an effective, balanced rendering system that might well outperform the alternative without eDRAM in some/many areas - but they still had to decide what to include and exclude in the die space, which meant leaving some things out.

If there were no sacrifices, and no need to leave things out, ATI would produce a 256-shader system with 16 GB of eDRAM ;)

XenonCube? XenosCube? ;)
 