What is the chance that the NV36 has improvements?

stevem · Oct 21, 2003

Yes, I know. But are you saying NV31/6 behaves more like a 2x2 pipeline instead of 4x1, hence needing 2 cycles/quad most of the time? I thought that was the preserve of NV34.

Pete · Oct 21, 2003

NV31/34 (and, I assume, 36) are 2x2 doing multi-texture, and 4x1 with single texturing.

I'm not sure how "shader parity" relates to the fillrate numbers one can glean from 2x2/4x1 pipeline organization. Normally, assuming clock speed equality, one can assume a 4x1/4x2 card can output twice as many shaded pixels per clock as a 2x1/2x2 card, simply because it has twice as many pixel pipelines (and, thus, pixel "shaders" embedded in each pipeline). But the FX and Radeon DX9 lines don't really have comparable shader hardware, if we listen to nV's "sea of shaders" line, so the comparison isn't that clear-cut. As it stands, Radeons appear to be more efficient per clock, but who knows what driver updates and special code paths (that may affect more than just shader brute force) will bring?

stevem · Oct 22, 2003

Texturing isn't at issue. I was curious about the semantic configuration of the NV37 shading pipeline vs NV35, as it was specifically mentioned. RV360/R360 is a more "traditional" 4x1 vs 8x1, so we can extrapolate, whereas what is essentially a 2x2 may result in <1 quad/cycle. I am aware that the NV3x pipeline includes flexible ALUs and is scalable for each model.

Pete · Oct 22, 2003

Sorry, I thought you were continuing the shader discussion.

(Though I lamentably still don't understand the concept of a quad and how it moves through the hardware pipeline, I'd imagine that if an NV35 can process one quad per clock, then an NV36 (don't know of an NV37) would take two clocks.)

pcchen · Oct 22, 2003

I ran some tests on an NV34 (non-Ultra) and got some strange results. I hope someone who knows more about NV34 can verify my conclusions (all my tests were done with the 45.23 WHQL driver):

1. When using pixel shader 1.1, with only one texture and one color instruction, I can get a result near 1,000 Mpix/s. This suggests NV34 is a 4x1 architecture when using pixel shader 1.1.

2. When using pixel shader 1.4 or pixel shader 2.0, the maximum fillrate I can get is near 500 Mpix/s. This suggests NV34 behaves as a two-pixel-pipeline architecture when using ps 1.4 and ps 2.0.

3. Under pixel shader 2.0 (I didn't test pixel shader 1.4), two non-dependent texture accesses cost the same as one. This suggests NV34 is a 2x2 architecture when using ps 2.0.

4. Using pixel shader 1.1, NV34 can only run one color instruction per cycle, instead of two. This is a bit strange since almost all previous GeForces have two register combiners per pixel pipeline.

5. When not limited by registers, there seems to be no difference in performance between FP32 and FP16 operations.

6. Under pixel shader 2.0, most instructions have one cycle throughput. ABS is not free (unlike R300). LRP takes 3 cycles (I'll have to check about that), NRM takes 5 cycles.
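The step from a measured fillrate to a pipeline count in points 1 and 2 is a simple division by the core clock. The sketch below spells it out. Note that the post does not state the card's clock: the 250 MHz figure is an assumption (the usual reference clock for a non-Ultra FX 5200), and the fillrates are the rounded "near" values from the list, so the results are only approximate.

```python
# Assumption: the post does not give the core clock. 250 MHz is the
# reference clock usually quoted for the non-Ultra FX 5200 (NV34).
ASSUMED_CORE_MHZ = 250.0


def pixels_per_clock(fillrate_mpix_s, core_mhz=ASSUMED_CORE_MHZ):
    """Convert a fillrate in Mpix/s into pixels written per core clock."""
    return fillrate_mpix_s / core_mhz


# Rounded fillrates taken from points 1 and 2 of the list above.
measured_mpix_s = {
    "ps 1.1, 1 texture + 1 color op": 1000.0,
    "ps 1.4 / ps 2.0, best case": 500.0,
}

for label, rate in measured_mpix_s.items():
    print(f"{label}: ~{pixels_per_clock(rate):.1f} pixels/clock")
```

Under that assumed clock, the two measurements land at roughly four and two pixels per clock, which is where the "4x1" and "two pipeline" readings come from.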

Any comments?

Xmas · Oct 22, 2003

Hm, maybe it's built that way: A unit working on 2 quads that can either do 2 tex lookups or 2 fx12 ops or one fp32 op per clock. That would make it quite similar to NV30, but without the combiners following the shader core.

Robbitop · Oct 22, 2003

To start with:
NV36 has CineFX2 instead of CineFX1.
So the PS performance must be improved.
Furthermore, the framebuffer compression has been improved (Intellisample HCT instead of Intellisample 1.0).

Last but not least, NV36 is produced by IBM.
They use FSG as the electrical insulator, so the power consumption of NV36
should be lower than that of NV31. And NV36 must be able to clock higher...

Arun · Oct 22, 2003

pcchen said:
4. Using pixel shader 1.1, NV34 can only run one color instruction per cycle, instead of two. This is a bit strange since almost all previous GeForces have two register combiners per pixel pipeline.

If that's true...
It could mean:

NV34 = 2x ( ( 1xFP32 OR 2xTEX ) OR ( 1xFX12 AND 1xTEX ) )
NV31 = 2x ( ( 1xFP32 OR 2xTEX ) AND ( 2xFX12 ) )
NV30 = 4x ( ( 1xFP32 OR 2xTEX ) AND ( 2xFX12 ) )

All seem like very likely scenarios for their respective transistor counts, IMO.
And could you test 2 FX12 instructions + 1 TEX instruction in PS1.1, please? I'd say that should take 1 cycle on the NV31, and 2 on the NV34 if I'm right; same thing for 2 FX12 instructions + 2 TEX instructions.

Uttar

BoardBonobo · Oct 22, 2003

Robbitop said:
To start with:
NV36 has CineFX2 instead of CineFX1.
So the PS performance must be improved.
Furthermore, the framebuffer compression has been improved (Intellisample HCT instead of Intellisample 1.0).

Last but not least, NV36 is produced by IBM.
They use FSG as the electrical insulator, so the power consumption of NV36
should be lower than that of NV31. And NV36 must be able to clock higher...

Well, since NVidia NDAs seem to be flakier than a leper colony, some reviews of the 5950 and 5700 have already appeared on the web. IXBT's was the first, I think, but that has disappeared now.

Nordichardware have their review up, which is great for those who can read Swedish, but the graphs do speak for themselves.

The 5950 is quite capable of keeping up with the 9800XT but still dies a death when a lot of shaders appear. Looks as though it will be $100 cheaper than the 9800XT at launch, which will make it worthwhile. Is the extra shader performance (considering how many titles are out there) worth an extra $100? Personally I'd say yes...

The 5700 looks like it will be something of a beast, they were looking at a $199 launch price which makes it exceptional value. It beat the 9600XT quite easily and they got the damn thing to overclock to 500/1100 without it breaking a sweat. Looks like they may have actually produced a fine card, at last!!

What I don't get is the PCB layout. It looks identical to the NV30, which begs the question - why was the NV30 so expensive? Either NV made a killing on the few sold or they are taking a big hit with the 5700...

Sadly the only thing that got cached on my 'puter were the pictures of the 5950 and 5700.

Robbitop · Oct 22, 2003

500/1000 is the default clock. Hexus reached 595 MHz core, 1100 MHz DDR.

BoardBonobo · Oct 22, 2003

Robbitop said:
500/1000 is the default clock. Hexus reached 595 MHz core, 1100 MHz DDR.

I'm fairly sure the review said 475/900 was the default.

Pete · Oct 22, 2003

Picking nits...

I thought default was 475/475?

Mariner · Oct 22, 2003

It seems to me that the 5700 will be a 'loss leader' product for NVidia.

Seems like a good idea for them to put out a high-performing yet cheap card to try and win back some of the mindshare of the enthusiast community. It doesn't matter if they barely make a profit on it - it is more likely to spur the sales of the lower GFFXs through name association.

Of course, I'm not going to get one because of the crappy AA. 8)

pcchen · Oct 22, 2003

Uttar said:
And could you test 2 FX12 instructions + 1 TEX instruction in PS1.1, please? I'd say that should take 1 cycle on the NV31, and 2 on the NV34 if I'm right; same thing for 2 FX12 instructions + 2 TEX instructions.
Uttar

Unfortunately I don't have an NV31. IIRC NV34 needs two cycles to run a 1 TEX + 2 FX12 shader. 2 TEX + 2 FX12 also needs two cycles. I'll check them again tomorrow.

btw, I think NV34 should be

NV34 = (2x (1x FP32 or 2x TEX)) OR (4x (1x FX12 and 1x TEX))

Of course, this is just my speculation.
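One way to read the speculated NV34 formula above is as a toy cycle-count model. The sketch below is only that: it assumes perfect co-issue inside a pipe, ignores latency, register pressure and driver behaviour, and all function names are made up for illustration. It is not a description of the real hardware.

```python
import math


def cycles_fx12_mode(n_tex, n_fx12):
    """Cycles per pixel if a pipe issues 1 TEX and 1 FX12 op each clock.

    Assumed reading of the "4x (1x FX12 and 1x TEX)" half of the formula.
    """
    return max(n_tex, n_fx12, 1)


def cycles_fp32_mode(n_tex, n_fp32):
    """Cycles per pixel if a pipe issues either 1 FP32 op or 2 TEX each clock.

    Assumed reading of the "2x (1x FP32 or 2x TEX)" half of the formula.
    """
    return max(n_fp32 + math.ceil(n_tex / 2), 1)


def pixels_per_clock(pipes, cycles_per_pixel):
    """Peak pixels per clock for a given pipe count and per-pixel cost."""
    return pipes / cycles_per_pixel


# Shaders mentioned in the thread, as (TEX count, FX12 count) in PS1.1.
for n_tex, n_fx12 in [(1, 1), (1, 2), (2, 2)]:
    cycles = cycles_fx12_mode(n_tex, n_fx12)
    print(f"{n_tex} TEX + {n_fx12} FX12: {cycles} cycle(s), "
          f"{pixels_per_clock(4, cycles):.1f} pixels/clock on 4 pipes")
```

In this toy model, 1 TEX + 2 FX12 and 2 TEX + 2 FX12 both cost two cycles, matching the recollection above, and one or two texture lookups in the FP32 mode cost the same single cycle, which is consistent with the earlier ps 2.0 observation.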

stevem · Oct 24, 2003

Thanks, pcchen. It seems to support previous ideas about NV34 pipeline organization. It will be interesting to see how much the mini FP units of the NV36 affect its configuration. According to Dave's tests, the new drivers expose these better. It makes sense that all of the NV3x series share the same general pipeline, only with scaled ALUs/units resulting in different shading/texturing characteristics.

Bjorn · Oct 24, 2003

BoardBonobo said:
What I don't get is the PCB layout. It looks identical to the NV30, which begs the question - why was the NV30 so expensive? Either NV made a killing on the few sold or they are taking a big hit with the 5700...

Nvidia apparently had huge yield problems at TSMC back then. And the NV30 also has 50+ million more transistors than the 5700.

Robbitop · Oct 24, 2003

It seems that NV36 has a few more transistors than NV31.
NV36 has 3 vertex units, NV31 just 1.
So there must be about 10-15 million more transistors than in NV31.
That puts NV36 at approximately 100 million transistors.

But the yields at IBM seem to be excellent, and they use FSG as the electrical insulator.

Pete · Oct 24, 2003

I think 5700U articles pin the NV36 at 82M transistors, about 5M more than the NV31.
