Article on CineFX/NV30 Architecture (English)

Looking very closely, one can see another surprise: The rasterizer doesn't deliver pixels but so-called "quads" to the pipeline.

Is this really a surprise? Most 4 pipe (or more ;) ) chips work on the principle of quads.
 
DaveBaumann said:
Looking very closely, one can see another surprise: The rasterizer doesn't deliver pixels but so-called "quads" to the pipeline.

Is this really a surprise? Most 4 pipe (or more ;) ) chips work on the principle of quads.
And pretty much a requirement, AFAICS, given the derivative sampling instructions in DX.
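The quad/derivative connection can be sketched quickly: when pixels are shaded in 2x2 groups, derivative instructions (ddx/ddy in HLSL, DSX/DSY in shader assembly) fall out as simple differences between neighbouring pixels that are already in flight. A minimal Python illustration - the layout and function name here are mine, not any actual hardware's:

```python
# Illustrative only: why shading pixels in 2x2 "quads" makes screen-space
# derivative instructions almost free - each pixel's horizontal and
# vertical neighbour within the quad is being shaded at the same time.

def quad_derivatives(values):
    """values: a 2x2 quad [[top_left, top_right], [bottom_left, bottom_right]]
    of some shader quantity evaluated at the four pixels.
    Returns (ddx, ddy) as simple forward differences within the quad."""
    (tl, tr), (bl, br) = values
    ddx = tr - tl  # difference between horizontal neighbours
    ddy = bl - tl  # difference between vertical neighbours
    return ddx, ddy

# A quantity growing by 3 per pixel in x and 5 per pixel in y:
print(quad_derivatives([[10, 13], [15, 18]]))  # (3, 5)
```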
 
DaveBaumann said:
Looking very closely, one can see another surprise: The rasterizer doesn't deliver pixels but so-called "quads" to the pipeline.

Is this really a surprise? Most 4 pipe (or more ;) ) chips work on the principle of quads.

Well, yes, but you just said "4 pipe or more" - while the NV3x is 1 pipe ( Or... You wouldn't be saying the R300 is 2 pipes would you? ... ) - so what's surprising is that every unit in the processor works on quads.

Although you must be right by saying most 4 pipe (or more ) chips get "quads" from the rasterizer... But then again, you most likely won't give all the details you know :)


Uttar
 
No, NV30 is still 4 pipes, but they work in a quad - again, as every 4 pipe board does; if you want to state that NV30 is '1 pipe' by that definition of a quad then so was every other '4 pixel pipe' chip before it.

( Or... You wouldn't be saying the R300 is 2 pipes would you? ... )

Well, think about it - why was RV350 relatively easy to do as an R300 derivative? Why does the 9500 (or 9800 SE) work so easily with '4 pixel pipes' turned off, on the same ASIC as an '8 pixel pipe' chip? Is it just because a 'quad' is turned off...? ;)
 
I've been wondering, Dave, if each quad in the R300 is independent (able to issue its own instructions) of the other. Anyone know or have thoughts on this?
 
I guess all I can say is that you would lose a lot of efficiency if they weren't independent.

You should also look to the 9500/9800SE to see that certainly one quad can operate independently when the other isn't 'on'.
 
Could someone help clarify this for me? There has been much mention of the NV3x architecture being buggy. By this I had always thought that there was some errata in the GPU that was preventing it from meeting its design goals. After reading this (albeit with limited knowledge of the subject), would it be fair to say that the NV3x suffers from design limitations as opposed to being "buggy"? If so, what are the ramifications for the NV4x series of GPUs? Will nVidia need to start from a clean sheet? Will this put them further behind ATI?
 
Ostsol said:
nelg said:
Will nVidia need to start from a clean sheet?
Erg. . . Isn't that what they did with the NV30? :?

I think the problem for nv3x was that nVidia did not start with a clean sheet...Had they had some foresight they'd have quietly scrapped nV3x last August and started over at that time. But I guess they had to see how far driver "optimization," 12-layer pcbs, and huge heatsinks & fans could take them, more's the pity. I'm hoping nv4x will indeed come from a clean design...
 
DaveBaumann said:
No, NV30 is still 4 pipes, but they work in a quad - again, as every 4 pipe board does; if you want to state that NV30 is '1 pipe' by that definition of a quad then so was every other '4 pixel pipe' chip before it.
As I have said many times before, I would say this is 'one pipe, processing quads'.
 
WaltC said:
Had they had some foresight they'd have quietly scrapped nV3x last August and started over at that time.

Wait...
You are suggesting that if you consider their GF4 series as their last product, they shouldn't have launched any products for *18* months?!

The NV30 is a failure, but the NV3x isn't. NV34, anyone? Having created the NV30 made them capable of releasing the NV34 in a timely manner, and that means that while maybe they shouldn't have released the NV30 ( just giving review samples, say they lost, but that they'll come back soon - NV35 anyone? ) , having scrapped it at that point would have been insane.

Although even scrapping the NV30 around December would have been insane, because then there'd have been no "high performance" hype for the NV30. Even if that's bad for the enthusiasts now, I'm sure it helped them for the NV34. Xabre is a good example of why having a high-end product is crucial to have good low-end sales ( not like Xabre could have sold well even if they had an amazing high-end chip, lol )


Uttar
 
...would it be fair to say that the NV3x suffers from design limitations as opposed to being "buggy"?

That's more or less the way I see it, albeit I'd rather speculate that there was nothing wrong with the chalkboard design, but that the trouble started with the transition to silicon.

Uttar,

I don't expect (according to simple reasoning) ATI to sit idle for very long where the low-end segment is concerned. There the same advantages will apply as with their higher-end parts.
 
Ailuros said:
Uttar,

I don't expect (according to simple reasoning) ATI to sit idle for very long where the low-end segment is concerned. There the same advantages will apply as with their higher-end parts.

Agreed. Heck, I already heard of the existence of a $79 DX9 part by ATI. No idea about availability and such - not really my source for that, just an exchange of information...

On the nVidia front, I'm thinking the rumored "5200 SE" will actually put DX9 at an even lower price point, so if ATI is shooting for $79, nV would still have an even lower end of the market ( Uck! ).

Although knowing anything serious on that part is relatively hard, since we don't even know on which chip it's based... ( NV34B? One of the NV33s? )


Uttar
 
Is there a way to improve the performance of CineFX I/II by using a more nVidia-friendly compiler, just as mentioned in the article?
If I understand everything correctly, shaders like:
arith inst 1
tex inst 1
arith inst 2
tex inst 2
.
.
.

are more friendly to the R300/350 design.

and shaders such as:
arith inst 1
tex inst 1
tex inst 2
arith inst 2
tex inst 3
tex inst 4
.
.
.
are more friendly to the NV30/35 design, am I right?
 
The big question on my mind is how much performance is NV30 letting go un-utilised because its drivers just aren't sophisticated and clever enough?

Will the Detonators 50.xx raise performance by 2% or 20%?

Is the design so complex that only a Stephen Hawking could work out how to optimise NV30's drivers?

What do folk hope/expect/wish we might see realised with the Det 50.xx series drivers?
 
991060 said:
Is there a way to improve the performance of CineFX I/II by using a more nVidia-friendly compiler, just as mentioned in the article?
If I understand everything correctly, shaders like:
arith inst 1
tex inst 1
arith inst 2
tex inst 2
.
.
.

are more friendly to the R300/350 design.

and shaders such as:
arith inst 1
tex inst 1
tex inst 2
arith inst 2
tex inst 3
tex inst 4
.
.
.
are more friendly to the NV30/35 design, am I right?

R(V)3XX friendly:

tex 1
tex 2
tex 3
tex 4
tex 5
tex 6
arith 1
arith 2
arith 3
arith 4
arith 5
arith 6

Large blocks of tex and arithmetic operations, with the same number of operations in both blocks.

NV3X friendly:

tex 1
tex 2
arith 1
tex 3
tex 4
arith 2

Two tex operations together, with the arithmetic operations placed between two tex-op blocks. But it is more important to use a low number of temp registers.
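The two orderings described in this thread can be sketched as a toy reordering pass. This is only an illustration of the patterns above (the function names and the tuple encoding are mine), not a real shader compiler:

```python
# Toy sketch of the two instruction orderings discussed in this thread.
# Instructions are tagged tuples like ("tex", 1) or ("arith", 2).

def r3xx_friendly(instrs):
    """One large block of tex ops followed by one block of arith ops."""
    tex = [i for i in instrs if i[0] == "tex"]
    arith = [i for i in instrs if i[0] == "arith"]
    return tex + arith

def nv3x_friendly(instrs):
    """Pairs of tex ops, with arith ops placed between the pairs."""
    tex = [i for i in instrs if i[0] == "tex"]
    arith = [i for i in instrs if i[0] == "arith"]
    out, a = [], 0
    for t in range(0, len(tex), 2):
        out.extend(tex[t:t + 2])      # emit a pair of tex ops
        if a < len(arith):
            out.append(arith[a])      # one arith op between tex pairs
            a += 1
    out.extend(arith[a:])             # any leftover arith ops at the end
    return out

shader = [("tex", 1), ("arith", 1), ("tex", 2), ("arith", 2),
          ("tex", 3), ("tex", 4)]
print(nv3x_friendly(shader))
```

Note that this ignores data dependencies entirely and, as pointed out above, says nothing about register pressure - which matters more on NV3x than the ordering itself.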
 