It's a pity setup rate wasn't doubled (I still think it would be easy given that the rasterizers deal with different tile sets)
Why would it be easy?
We had a sizeable discussion about it before, I'm not going to trawl this mega thread for it ... DIY
Setup does a coarse rasterisation, identifying all the tiles that a triangle at least partially covers, then giving the rasteriser(s) a list of tiles and triangle data in order to rasterise. I suspect the rasteriser has a tile-centric view of rasterisation, not a triangle-centric view. That's because threads of 16 quads of fragments need to be despatched, and those need to be strictly tile-aligned (because the render target is tiled). Though I also expect it to handle triangles in strict order.
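To make the coarse rasterisation step concrete, here is a rough sketch of how a tile list for one triangle could be built. This is not how the hardware is known to do it: the 8x8 tile size, the function names and the edge-function test are all assumptions picked for illustration.

```python
from math import floor

def edge(ax, ay, bx, by, px, py):
    """Signed edge function: positive when p lies to the left of a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def coarse_rasterise(tri, tile=8):
    """Return the (tx, ty) tiles that a triangle at least partially covers.

    Hypothetical helper: tile size and test are assumptions, not R800 facts.
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return []  # degenerate triangle covers nothing
    if area < 0:
        # Flip the winding so that "inside" is always the positive side.
        (x1, y1), (x2, y2) = (x2, y2), (x1, y1)
    verts = [(x0, y0), (x1, y1), (x2, y2)]

    # Only tiles inside the triangle's bounding box can be touched.
    tx_lo, tx_hi = floor(min(x0, x1, x2) / tile), floor(max(x0, x1, x2) / tile)
    ty_lo, ty_hi = floor(min(y0, y1, y2) / tile), floor(max(y0, y1, y2) / tile)

    tiles = []
    for ty in range(ty_lo, ty_hi + 1):
        for tx in range(tx_lo, tx_hi + 1):
            xmin, ymin = tx * tile, ty * tile
            xmax, ymax = xmin + tile, ymin + tile
            covered = True
            for i in range(3):
                ax, ay = verts[i]
                bx, by = verts[(i + 1) % 3]
                # Pick the tile corner that is furthest inside this edge.
                px = xmax if -(by - ay) > 0 else xmin
                py = ymax if (bx - ax) > 0 else ymin
                if edge(ax, ay, bx, by, px, py) < 0:
                    covered = False  # whole tile lies outside this edge
                    break
            if covered:
                tiles.append((tx, ty))
    return tiles
```

For example, `coarse_rasterise([(1, 1), (20, 1), (1, 20)])` skips the bounding-box tiles that lie entirely beyond the hypotenuse, which is the point of doing a coarse pass before handing work to a tile-centric rasteriser.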
One of the key questions that's still unanswered is whether a thread of fragments can refer to more than one triangle (e.g. 5 adjacent small triangles from a strip).
I think it might be a matter of practicality in instancing a block of hardware rather than re-jigging things for 32-rasterisation. I don't think the number 32 is problematic (since other ATI GPUs have 4-, 8- and 12-rasterisers) merely that scaling isn't free of latency/pipelining issues across the entire width of the unit.
Jawed
It's a pity setup rate wasn't doubled (I still think it would be easy given that the rasterizers deal with different tile sets) but meh ... being a bitch about getting implementation details just because they aren't relevant to performance I see as counter-productive; I'd still rather hear them than not.
It's not like it was a paper launch where misunderstandings could fester for months ...
I want details. You want details. I don't see the difference.
We know it's just an increase in scan conversion throughput now ... regardless of whether it's the truth, you are saying it's stupid the double rasterizers were in the diagram, AFAICS.
If it's just an increase in scan conversion throughput it's stupid.
It reduces fanout inside the rasterizer and allows you to get the rasterizers closer to the shader cores; it was also probably less work.
If (as people are speculating) the implementation is tangibly different in some meaningful way then I'm just as interested as you to know more.
At a guess I'd say the 5350, 5550 and 5650.
No I haven't seen any documentation on this for R800. I was thinking in terms of each rasteriser working solely on the tiles it's given - the alternative is that both rasterisers get all triangles and decide which tiles they own by first doing the coarse rasterisation themselves. Since you queried it, I now think the latter approach is more likely.
Is that based on some documentation you saw? Why would setup need to determine tile coverage? Shouldn't it be: vertices -> setup -> triangle -> raster -> tiles -> shaders?
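The two arrangements discussed in that reply can be sketched side by side. Everything here is assumed for illustration: the checkerboard `owner` function, the 8x8 tile size, and the use of a plain bounding box as a stand-in for the coarse rasterisation. None of it comes from R800 documentation.

```python
from math import floor

NUM_RASTERISERS = 2  # assumption: two rasterisers, as in the diagram
TILE = 8             # assumption: 8x8 pixel tiles

def owner(tx, ty):
    """Hypothetical ownership rule: checkerboard interleave of tiles."""
    return (tx + ty) % NUM_RASTERISERS

def bbox_tiles(tri):
    """Stand-in for coarse rasterisation: every tile in the bounding box."""
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    return [(tx, ty)
            for ty in range(floor(min(ys) / TILE), floor(max(ys) / TILE) + 1)
            for tx in range(floor(min(xs) / TILE), floor(max(xs) / TILE) + 1)]

def setup_routes(triangles):
    """Option 1: setup finds the tiles and sends each rasteriser only its own."""
    queues = [[] for _ in range(NUM_RASTERISERS)]
    for tri_id, tri in enumerate(triangles):
        for tile in bbox_tiles(tri):
            queues[owner(*tile)].append((tri_id, tile))
    return queues

def rasteriser_filters(triangles, me):
    """Option 2: rasteriser `me` sees every triangle and keeps the tiles it owns."""
    return [(tri_id, tile)
            for tri_id, tri in enumerate(triangles)
            for tile in bbox_tiles(tri)
            if owner(*tile) == me]
```

Both options produce the same per-rasteriser work lists in the same triangle order; what differs is where the coarse pass runs and how much triangle data has to be broadcast.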
Those aren't tile sizes, they're rasterisation rates. The rasterisation rate equals the colour fillrate on ATI. So my X1950Pro has 12 RBEs and therefore needs a 12-rasteriser. The tiles it works with might be 8x8 pixels or larger. Thread size is 48.
Oh where'd you get those tile sizes from?
launching six months after GF100 eh?
For all we know GF200 may launch 6 months after GF100 too.
I don't think so, but that's way too far into the future to debate right now.