Chip Comparison Chart

Dave Baumann

Anyone want to add / fill in the blanks to this:

http://www.beyond3d.com/misc/comp.php

It's still a little raw at the moment, but it will form the basis for a card comparison chart as well. If you can think of any other columns to add, please suggest them; however, things like clock rates are a product-level item, not necessarily a chip-level item, so they won't go into this table.
 
For pixel processor precision:
NV28, NV25: FX9
R200, RV250, RV280: FX16

Don't know about NV18/NV17, it's either 8-bit or 9-bit... My guess would be 8-bit, though.

Also, obviously, the triangles per clock of the RV350 is 0.5, since it has 2 VS units.
Heck, according to the same logic, NV25/NV28 is 0.5 too. That is, if ATI and nVidia calculate it in the same way...

Also, for the memory bus width, could we get the way in which it is done?
That is:
NV30: 32x4
NV34: 64x2
NV31: 64x2
R300: 64x4
NV25: 32x4
R200: 64x2
NV28: 32x4
RV250: 64x2
RV280: 64x2
RV350: 64x2
NV18: 64x2
NV17: 64x2
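For what it's worth, the notation above can be turned back into a total bus width by multiplying the two numbers. A minimal Python sketch, assuming each entry reads as "bits per channel x number of channels" (the function name and the dictionary are only for illustration; the values are the ones listed above):

```python
# Bus configurations as listed above, read as "bits per channel x channels".
BUS_CONFIGS = {
    "NV30": "32x4", "NV34": "64x2", "NV31": "64x2", "R300": "64x4",
    "NV25": "32x4", "R200": "64x2", "NV28": "32x4", "RV250": "64x2",
    "RV280": "64x2", "RV350": "64x2", "NV18": "64x2", "NV17": "64x2",
}


def total_bus_width(config: str) -> int:
    """Return the total width in bits for an 'AxB' bus configuration string."""
    first, second = config.lower().split("x")
    return int(first) * int(second)


for chip, config in BUS_CONFIGS.items():
    print(f"{chip}: {config} = {total_bus_width(config)}-bit")
```

Since the total is just a product, the order of the two numbers does not matter for the result; the split form only adds the information about how the bus is divided up.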

Nice to get a comparison table like that :) Where's the R350, though?


Uttar
 
R300 has a die size of 14.8 * 14.8 = 219 mm2. I seem to remember 144 mm2 for NV20/25, but can't find the source. Dunno about any others.

RV250 has 36 million transistors, which is supposedly 7 million more than NV17.

For GPUs with memory crossbars, you have multiple memory buses, so it would be more accurate to e.g. say '4x64bit' than '256bit' for R300.

I also seem to remember reading that NV20/25 could do 8 textures per pass when you don't use pixel shaders, although I may be wrong on this one.
 
Uttar said:
Don't know about NV18/NV17, it's either 8-bit or 9-bit... My guess would be 8-bit, though.

9-bit values were introduced in NV10 (of which 1 bit is the sign), so I'm pretty sure NV17/NV18 has that too.
 
Right, I've made a number of updates and additions; there are still plenty of blank spaces and older chip information that needs to be added.

R200, RV250, RV280: FX16

Not sure about that, Andy was making some indications recently that RV250 had increased precision from R200 IIRC.
 
Hmm, sounds like the bus width column is bugged now, though :(

RV250? Increased precision? Never heard of that. Would love to have more details on it. Although are you sure it's PS precision? Maybe it's sub-pixel accuracy? Don't know...


Uttar
 
According to Anand, the NV34 has 47 million transistors. IIRC, NV10, 15 and 20 have 22, 25 and 57 million transistors, respectively. VSA-100 is 14 million transistors.

Also, IIRC, VSA-100 could run at AGP4X in single-chip configurations only - in multi-chip configurations (Voodoo5 5500) it dropped down to AGP2X.

You seem to have got geometry and pixel processing swapped around for NV10 and NV15. Also, what is supposed to go into the 'integrated display' column?
 
Uttar said:
Hmm, sounds like the bus width column is bugged now, though :(

That should be fixed (although I'm trying to work out how to concatenate those strings without the spaces in PHP).

Uttar said:
RV250? Increased precision? Never heard of that. Would love to have more details on it. Although are you sure it's PS precision? Maybe it's sub-pixel accuracy? Don't know...

Oh, you might be right - http://www.beyond3d.com/forum/viewtopic.php?p=96219&highlight=#96219

It was Hyp-X saying he thought there was different precision, but Andy was saying 9000/9200 was greater than FX12.

arjan de lumens said:
According to Anand, the NV34 has 47 million transistors. IIRC, NV10, 15 and 20 have 22, 25 and 57 million transistors, respectively. VSA-100 is 14 million transistors.

Cheers.

arjan de lumens said:
Also, IIRC, VSA-100 could run at AGP4X in single-chip configurations only - in multi-chip configurations (Voodoo5 5500) it dropped down to AGP2X.

I'm just talking about chip level specs at the moment, not board level. I'm doing this relationally since much of the board level stuff is either duplicated from chip specs or can be derived from them (e.g. fillrate = board chip speed x number of pipelines, etc.)
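As a rough sketch of that kind of derived board-level figure, here is the fillrate relation in Python. The function names and the example numbers are made up for illustration, and the texel variant (multiplying in the TMUs per pipeline) is an assumption that goes beyond the formula quoted above:

```python
def pixel_fillrate_mpixels(core_clock_mhz: int, pixel_pipelines: int) -> int:
    """Peak pixel fillrate in Mpixels/s: board chip speed x number of pipelines."""
    return core_clock_mhz * pixel_pipelines


def texel_fillrate_mtexels(core_clock_mhz: int, pixel_pipelines: int,
                           tmus_per_pipeline: int) -> int:
    """Peak texel fillrate in Mtexels/s (assumed extension of the same idea)."""
    return core_clock_mhz * pixel_pipelines * tmus_per_pipeline


# Hypothetical board: 300 MHz core clock, chip with 4 pipelines and 2 TMUs each.
print(pixel_fillrate_mpixels(300, 4))     # 1200
print(texel_fillrate_mtexels(300, 4, 2))  # 2400
```

The pipeline and TMU counts come from the chip table and the clock from the board table, which is why the derived value does not need to be stored in either.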

arjan de lumens said:
You seem to have got geometry and pixel processing swapped around for NV10 and NV15. Also, what is supposed to go into the 'integrated display' column?

Num of DACs, precision of DACs, speed of DACs, num of TMDS, speed of TMDS, that type of stuff.
 
First, it seems like a good idea to have some distinction between fixed function and shader performance characteristics, and not just for pixel processing (as the "pixel pipe" and "TMU" no longer directly map), but for vertex processing.

The way to introduce that (for vertex processing) with the least changes to the chart is to move the Triangles/clock to the left of the programmability spec listings and add a new vertex shading indicator, or have something like the triangles per second show "fixed function triangle rate (programmable triangle rate)".

For pixel pipelines, something similar, but a new column would be needed, something to represent op execution. To fully express it, something like "tex op + (fp24vec3 + fp24scalar)/fp24vec4" for R300 and "texop/fp32vec2/fp16vec4 + fx12vec4 + fx12vec4" for NV30 would be necessary, but there is hardly room for that in the chart. Simply counting maximum ops is sort of a distortion for both these cases, so something illustrative seemed pertinent for efficiently showing the possibilities.

What I'd propose is a scheme where a hollow box represents integer, a solid box represents floating point, and an intuitive color progression scales from least precision to highest precision, with special colors for tex ops and scalars. All examples would be the same height, and each possibility in a pipeline would be a certain width that makes sense to express the functionality. Also, it would be nice to have a "programmable pixel" pipe count as a number in a column to the left of this one and, if possible, an "X" on the line between the columns.
For the R300's (vec3+scalar)/vec4, that would be the fp24 color at full height, with a smaller box of the special fp24 scalar color inset at about 1/4 height (leaving a border of the regular fp24 color visible). It would have 2 boxes, the tex op colored box, and this construction.
For the NV30's fp32 unit functionality, it would be the tex op color, with an inset fp16 color, with two inset 1/4 height fp32 color (all solid boxes). If the integer ops can be independent (I forget at the moment), this would accompany 2 full width FX12 colored hollow boxes (if they can't be independent, then the two hollow boxes should be "mashed" together horizontally, but perhaps still wider than the "standard" width).

Keep in mind the 1:1000 picture/word ratio when deciding whether this can be illustrative. :p I have an inkling of some color ideas in mind, but why flesh it out when people might not like the idea?
 
DaveBaumann said:
however things like clock rates are a product level item, not necessarily a chip level item, so they won't go into this table.

The r300(2) should be listed as 256 bit (4x64). Since the 4x1 product (9500 non-Pro) is available with either 256 or 128, that makes it a product level distinction, not a chip level one.
 
OK, I'm still gathering ideas here and there are still plenty of blanks to fill out as well as details on older chips.

Now, what type of information would you like to see for reference board level charts?
 
jerry_enCater said:
VSA100 Super-Sampling????
I am sure it is a big mistake.

Yes, it was super sampling. The technique boiled down to rendering into a buffer that was 4x and downsampling. It took a separate texture sample for each of the pixel subsamples.

Though getting a distinct definition of multi-sample vs. super sample is kind of difficult, the rule of thumb I've come up with is:
- If the same texture sample is used by all of the subsamples, it is multi-sampling.
- If each subsample gets a different texture sample, it is super-sampling.
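The rule of thumb above can be put as a small Python sketch. This only encodes the distinction as stated in this post, not how any particular chip implements it, and the function name and mode strings are made up for illustration:

```python
def texture_samples_per_pixel(subsamples: int, mode: str) -> int:
    """Texture samples taken for one pixel under the rule of thumb above."""
    if mode == "multi":
        # Multi-sampling: one texture sample shared by all subsamples.
        return 1
    if mode == "super":
        # Super-sampling: a separate texture sample for each subsample.
        return subsamples
    raise ValueError(f"unknown mode: {mode!r}")


print(texture_samples_per_pixel(4, "multi"))  # 1
print(texture_samples_per_pixel(4, "super"))  # 4
```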
 
RussSchultz said:
jerry_enCater said:
VSA100 Super-Sampling????
I am sure it is a big mistake.

Yes, it was super sampling. The technique boiled down to rendering into a buffer that was 4x and downsampling.

IIRC, that is not correct. While nVidia's current solutions (and probably ATI's too) use a 4x bigger buffer, the VSA-100 used 4 normal size buffers with small biases.
In theory, it rendered the same scene four times with a small "error" each time, and then merged the four buffers.


Uttar
 
Parhelia has 80 million transistors. With its quad vertex shader array, it seems plausible to assume that it can do 1 full triangle per clock, although there is no information available on whether its triangle setup unit can keep up with such a high triangle rate.

Parhelia apparently comes with 3 RAMDACs, 2 of which are 10-bit 400 MHz, with the third one being 230 MHz and an unspecified number of bits. The Parhelia board also comes with 2 TMDSes, but these seem to be external to the Parhelia chip itself.

Is it verified that the RAMDACs of R300 are in fact 10-bit?

Also, I am fairly sure that the NV20 memory controller is 32x4, not 32x2.
 
Most, if not all, of the blanks in the packaging column should be "Wirebond".

MuFu.
 
Uttar said:
IIRC, that is not correct. While nVidia's current solutions (and probably ATI's too) use a 4x bigger buffer, the VSA-100 used 4 normal size buffers with small biases.
In theory, it rendered the same scene four times with a small "error" each time, and then merged the four buffers.

Mathematically it's the same thing, which is why I said "it boiled down to".
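A small Python sketch of why the two descriptions can come out the same. It assumes the four offset buffers sample exactly the positions that a 2x2 ordered grid in a 4x bigger buffer would; the sample positions actually used by the VSA-100 are not given in this thread, and the scene function and all names here are hypothetical:

```python
def scene(x, y):
    """Hypothetical scene: a hard diagonal edge, white on one side, black on the other."""
    return 1.0 if y > 0.37 * x + 0.6 else 0.0


def render(width, height, scale, offset_x, offset_y):
    """Take one sample per buffer pixel; positions are in final-pixel units."""
    return [[scene((px + offset_x) / scale, (py + offset_y) / scale)
             for px in range(width)] for py in range(height)]


def big_buffer_ssaa(width, height):
    """Render one buffer with 4x the pixels, then average each 2x2 block."""
    big = render(2 * width, 2 * height, 2, 0.5, 0.5)
    return [[sum(big[2 * py + sy][2 * px + sx]
                 for sy in (0, 1) for sx in (0, 1)) / 4
             for px in range(width)] for py in range(height)]


def offset_buffers_ssaa(width, height):
    """Render four normal-size buffers with small offsets, then merge them."""
    buffers = [render(width, height, 1, ox, oy)
               for oy in (0.25, 0.75) for ox in (0.25, 0.75)]
    return [[sum(buf[py][px] for buf in buffers) / 4
             for px in range(width)] for py in range(height)]


# Same sample positions, same average per final pixel.
print(big_buffer_ssaa(8, 6) == offset_buffers_ssaa(8, 6))
```

With different offsets (a rotated pattern, for example) the two results would no longer match an ordered-grid big buffer pixel for pixel, although each final pixel is still the average of four separately rendered samples.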
 