Question about double-Z output

AlNom

Why has ATi shied away from implementing this in their GPUs (except for RV530)? Is the transistor cost that significant?
 
RV530 and Xenos both do double-Z, but the other ATI parts don't, at least not in this context: they all double Z samples with AA, but not without it, which is what Alstrong is talking about here. When AA is enabled, per-clock Z output is doubled again for RV530, like the other parts (which puts it at 16 Z samples with 4 ROPs: 4*2*2AA), and quadrupled for Xenos (64 Z samples with 8 ROPs: 8*2*4AA).
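To make the arithmetic above easy to check, here is a minimal Python sketch. The function name and parameters are made up for illustration; it only multiplies the factors quoted in the post (ROP count, the 2x from double-Z, and the AA multiplier) and is not a model of the actual hardware.

```python
def z_samples_per_clock(rops: int, aa_factor: int = 1, double_z: bool = True) -> int:
    """Peak Z samples per clock = ROPs x (2 if double-Z) x AA multiplier.

    Hypothetical helper: the factors come straight from the post,
    nothing here is measured or taken from vendor documentation.
    """
    z_multiplier = 2 if double_z else 1
    return rops * z_multiplier * aa_factor


# Figures quoted in the post
rv530 = z_samples_per_clock(rops=4, aa_factor=2)  # 4 * 2 * 2
xenos = z_samples_per_clock(rops=8, aa_factor=4)  # 8 * 2 * 4
print(rv530, xenos)  # 16 64
```

With `aa_factor` left at 1 the same helper gives the non-AA double-Z rate, e.g. 8 for a 4-ROP part.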

Bandwidth might very well be an issue for this. Xenos obviously has exactly enough bandwidth for all those reads and writes, and the X1600 only has 4 ROPs sitting on however much bandwidth it has (excuse my ignorance =P)

Of course, a totally uneducated guess, so feel free to ignore it.

//I think
 
Something to bear in mind is that double Z, outside of MSAA operations, isn't a pure-play ROP operation: the necessary data has to get there in order for it to work.
 
But since the z-buffer is typically very compressible, I don't understand why bandwidth should be too much of a concern (increasing the resolution of a z-buffer won't hugely impact the bandwidth required to store or read the data).
 
Bandwidth is not a concern. If you can read and write Z and write color, you can also do double Z. But you also need to double the number of quads coming from the rasterizer.
 
Thanks for the replies guys. :)

So basically, it just comes down to a transistor trade-off that ATi just didn't want to make. :???: It just seems like they "could" get a pretty big performance boost in all those games that make use of double Z with and without MSAA. As TDZ mentioned, doubled output again with MSAA for Xenos and RV530. Wouldn't we see significant performance gains :?:

I guess they're just weighing up how many games even make use of double Z too... But I'd have thought they'd want something *crazy* for the high end to definitively put it at the best of the best. (Sorry, my own mindless speculation :oops: )

Are the patents available for both companies' implementations? (or a hint on where I should be searching in the first place would be nice. :oops: )
 

There's a significant difference in doing Z compares for AA and for non-AA. In the AA case, you have at least 2 z checks required per cycle, per pixel anyway. So all you need to do is double up the amount of "Z Rops" and then you can double your Z rate. However, for non-AA cases, you effectively need 2x more pixels around to do 2x the Z rate. That requires wider datapaths on top of more "Z Rops", as well as more scan generation, etc... Also, there are some good compression algorithms for AA Z which don't work as well for non-AA Z. Consequently, you effectively need more memory BW for non-AA Z if you increase that rate.

So, it's generally much more costly to accelerate non-AA Z than AA Z. Yes, you'd get performance gains with more non-AA Z. But it would cost a lot of HW.
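A toy Python sketch of the argument above, for anyone who wants to play with the numbers. Everything in it is an assumption made for illustration: the `z_rate` function, its parameter names, and the example figures are hypothetical, and it ignores compression and memory bandwidth entirely. It only captures the point that the Z rate is limited both by how many Z samples arrive per clock (pixels x samples per pixel) and by how many the "Z ROPs" can test.

```python
def z_rate(pixels_per_clock: int, samples_per_pixel: int, z_units: int) -> int:
    """Z samples actually tested per clock in this toy model.

    pixels_per_clock  -- pixels delivered by the rasterizer/datapath (assumed)
    samples_per_pixel -- 1 without AA, 2 or more with AA
    z_units           -- Z tests the "Z ROPs" can perform per clock (assumed)
    """
    samples_arriving = pixels_per_clock * samples_per_pixel
    return min(samples_arriving, z_units)


# Baseline: 4 pixels/clock and 4 Z tests/clock (made-up numbers)
base_no_aa = z_rate(4, 1, 4)
base_2x_aa = z_rate(4, 2, 4)

# Doubling only the Z units helps AA, but not the non-AA case
more_z_no_aa = z_rate(4, 1, 8)
more_z_2x_aa = z_rate(4, 2, 8)

# Non-AA only doubles if the pixel supply is doubled as well
wider_no_aa = z_rate(8, 1, 8)

print(base_no_aa, base_2x_aa, more_z_no_aa, more_z_2x_aa, wider_no_aa)
```

In this model the AA case speeds up as soon as Z units are added, while the non-AA case stays stuck until the rasterizer and datapaths feeding the ROPs are widened too, which is the extra hardware cost described above.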
 