r420 efficiency

FUDie said:
Mintmaster said:
FUDie, the Z-buffer only takes more bandwidth than colour on old architectures, like Rage128 or GeForce2 and earlier. All chips have Z-compression now, so colour takes up plenty more bandwidth.
This is false. Not every pixel will require a color read/write, but nearly every pixel will require a Z read and many a Z write as well.
This depends a lot on what you're trying to render. The number of color writes should be at least the number of Z writes, unless you're rendering shadow buffers or doing a Z-first pass, and in those cases you're hardly ever bandwidth limited. Multipass algorithms need a color read and write, but only a Z read. Additionally, Z might have a 4:1 compression advantage and benefit from early trivial accept.
 
I've been thinking about the setup of R420, and purely by taking transistor count into consideration I believe the following:

Not all of R420's pipes are actually doing texturing. Yes, I know they were talking about 4 quads of RV360 or something like that, but it might just have been 2 quads, plus another 8 pipes that do only pixel shading. Why do I think this?

Well, there used to be an estimated figure for how many transistors a pixel pipeline needs. Looking at R300, that was thought to be around 10 million transistors per pipe, all told. R300 had 110 million in total, because it had other stuff on it as well. R420 has 160 million transistors but supposedly 16 pipes. What doesn't add up here?
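To put numbers on that, here is a quick Python sketch of the budget check. It only uses the figures quoted above (10M per pipe is the old forum estimate, not an official ATI number), and the function name is made up for illustration.

```python
# Rough transistor budget check using the figures quoted in the post.
# All values are in millions of transistors. The 10M-per-pipe number is
# a forum estimate, not an official figure.

def non_pipe_budget(total_m, pipes, per_pipe_m=10.0):
    """Transistors left over for everything that is not a pixel pipe."""
    return total_m - pipes * per_pipe_m

r300_rest = non_pipe_budget(110, 8)    # 110 - 8 * 10
r420_rest = non_pipe_budget(160, 16)   # 160 - 16 * 10

print(f"R300 left for non-pipe logic: {r300_rest}M")
print(f"R420 left for non-pipe logic: {r420_rest}M")
```

If the per-pipe estimate held for R420, 16 full pipes would eat the entire 160M budget and leave nothing for the rest of the chip.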

I think they designed the chip to offer maximum shader performance, so they just took 8 (or maybe 12) texture pipes, and added 8 or 4 pure VS/PS pipes.

This might be totally wrong, I just think 160 million transistors is awfully few for a chip that's supposed to hold 16 fully fledged texturing + PS/VS units, considering that R420's shader pipes SHOULD be slightly bigger than R300's. This is all assuming that the core tech remained the same or similar, and that both chips offer equal non-3D-related technology.

Look at it this way:

 8x + y = 110M
16x + y = 160M

Assuming y (everything that isn't a pipe) is about the same on both chips, and x (transistors per pipe) is about the same on both as well, subtracting the first equation from the second gives 8x = 50M. So you arrive at a very small figure of x = 6.25M per pipe and a very large figure of y = 60M. This obviously doesn't make sense. However, ATI must be saving transistors somewhere on R420, and they won't be saving them in the VS/PS pipelines.
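For anyone who wants to check the arithmetic, the two equations can be solved in a few lines of Python. This is just a sketch of the simple model above (same x and y on both chips); the function name is made up for illustration.

```python
from fractions import Fraction

def solve_pipe_model(pipes_a, total_a, pipes_b, total_b):
    """Solve pipes * x + y = total for two chips that share x and y.

    x = transistors per pipe, y = everything else (both in millions).
    """
    # Subtracting one equation from the other eliminates y.
    x = Fraction(total_b - total_a, pipes_b - pipes_a)
    y = total_a - pipes_a * x
    return x, y

# R300: 8 pipes, 110M total. R420: 16 pipes, 160M total.
x, y = solve_pipe_model(8, 110, 16, 160)
print(f"x = {float(x)}M per pipe, y = {float(y)}M for the rest")
```

Plugging in other pipe counts or totals shows how sensitive the per-pipe figure is to the assumption that y stays constant between the two chips.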

If you remember the leaked spec sheet for R420 that appeared at some online store, it listed "8 parallel pixel pipelines" and, on the next line, "8 super-parallel pixel pipelines". Everyone wrote this off as marketing garbage, but there might still be a bit of truth in it, namely that not all pipes are the same.

I don't know what that means for the X800pro, where supposedly one quad has been disabled. I suppose that must have been one of the PS/VS-only quads then.
 