NV30 Vertex program performance

eSa

Ah, please enlighten me, gurus:

From the "NV30 OpenGL extensions" PDF document:

"Vertex program performance, clock for clock

- Over 3x faster than NV20
- Over 1.5x faster than NV25

"

So, what kind of performance does the NV25 (GF4) have? How many instructions per cycle? Anyone?
 
Hopefully I have this right.

GF3 = NV20
GF4 = NV25

I believe the GF4 has two vertex pipes, though there might be some resource sharing. I'm not sure about the exact numbers, if that's what you wanted.
 
These numbers suggest three vertex pipelines for NV30, but it also says "over". So that could be four slightly less efficient pipelines.
 
That's unlikely, Dave; they would have to be superscalar (or long instruction word).

It could just be that they have 4 pipes, each as efficient as a GF4 pipe, with the numbers simply recognizing that performance can't scale perfectly linearly with the number of pipes.
 
It will be 4 bigger and better pipelines, but performance doesn't scale linearly, as MFA said.
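
To put a number on the "four pipes that don't scale linearly" idea, here is a rough sketch. It assumes the GF4 really has 2 vertex pipes and that "over 1.5x, clock for clock" can be read as a ratio of effective pipes; the function names are made up for illustration and nothing here comes from Nvidia's documentation.

```python
# Assumption: NV25 (GF4) has 2 vertex pipes, and the quoted 1.5x
# clock-for-clock speedup maps directly onto "effective" pipes.

def effective_pipes(baseline_pipes, speedup):
    """Number of baseline-equivalent pipes implied by a per-clock speedup."""
    return baseline_pipes * speedup

def per_pipe_efficiency(effective, physical_pipes):
    """Fraction of a baseline pipe's throughput each physical pipe delivers."""
    return effective / physical_pipes

nv30_effective = effective_pipes(2, 1.5)                    # 3 GF4-equivalent pipes
four_pipe_efficiency = per_pipe_efficiency(nv30_effective, 4)

print(f"Effective pipes: {nv30_effective}")
print(f"Efficiency per pipe if NV30 has 4 pipes: {four_pipe_efficiency:.0%}")
```

Since the slide says "over" 1.5x, the 75% figure would be a lower bound under these assumptions, not an exact efficiency.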
 
Interesting information. Here's an observation:

NV20: 1 vertex engine x 200 MHz = 200M instructions/sec
NV25: 2 vertex engines x 300 MHz = 600M instructions/sec
NV30: 3 vertex engines (effective) x 400 MHz (?) = 1200M instructions/sec
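
The arithmetic in that list can be reproduced with a few lines of Python. This is only a sketch of the same back-of-envelope model: it assumes one instruction per engine per clock, and the NV30 entry (3 effective engines, 400 MHz) is a guess from this thread rather than a published spec.

```python
# Peak vertex instruction rate = engines * core clock (MHz), assuming
# one instruction per engine per clock, as in the estimate above.
chips = {
    "NV20": (1, 200),
    "NV25": (2, 300),
    "NV30": (3, 400),  # speculative: effective engine count and clock are guesses
}

def peak_minstr_per_sec(engines, clock_mhz):
    """Peak rate in millions of vertex instructions per second."""
    return engines * clock_mhz

rates = {name: peak_minstr_per_sec(*spec) for name, spec in chips.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}M instructions/sec")
```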

If you look at the quoted Mtri/sec figures from Nvidia's documentation, they are:

NV20: 40+ Mtri/sec
NV25: 136 Mtri/sec

These numbers more or less line up with the predicted instructions/sec ratios given above. By extrapolation for NV30, assuming 400 MHz, you would get ~270 Mtri/sec. That's considerably less than the 325 Mtri/sec ATI is already quoting for the R300. Of course, these are all purely theoretical numbers, but interesting nonetheless.
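
For anyone who wants to check the extrapolation, here is a minimal sketch. It assumes triangle throughput scales in proportion to peak instruction rate, which is the simplification used in this post and not something Nvidia states; the NV30 instruction rate is the speculative 3 x 400 MHz figure, and the helper name is made up.

```python
# Quoted triangle rates (Mtri/sec) and estimated instruction rates (M instr/sec).
nv20_mtri, nv20_minstr = 40, 1 * 200
nv25_mtri, nv25_minstr = 136, 2 * 300
nv30_minstr = 3 * 400  # speculative: 3 effective engines at an assumed 400 MHz

def extrapolate_mtri(base_mtri, base_minstr, new_minstr):
    """Scale a known triangle rate by the ratio of instruction rates."""
    return base_mtri * new_minstr / base_minstr

# Sanity check: predict NV25 from NV20 and compare with the quoted 136.
nv25_predicted = extrapolate_mtri(nv20_mtri, nv20_minstr, nv25_minstr)

# Extrapolate NV30 from the quoted NV25 figure.
nv30_predicted = extrapolate_mtri(nv25_mtri, nv25_minstr, nv30_minstr)

print(f"NV25 predicted from NV20: {nv25_predicted} Mtri/sec (quoted: 136)")
print(f"NV30 extrapolated from NV25: {nv30_predicted} Mtri/sec")
```

The NV20-to-NV25 check comes out somewhat under the quoted 136 Mtri/sec, which is the "more or less" in the comparison; the NV30 value is the ~270 Mtri/sec mentioned above.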
 