NV30 : not so fast?

mboeller

So now, one of the famous anti-Nvidia :LOL: :LOL: threads:

Quotes from Reactorcritical (link: http://www.reactorcritical.com/#l1205 ):
In Quake III Arena in 1280x1024 with 4x FSAA enabled, NV30 is going to be 2.5 times faster than the GeForce4 Ti4600.
The R300 is 2.5 times faster than the Ti4600 at 1600x1200, so it seems both chips have the same speed.
In The Next Doom the board based on NV30 will be able to show 3.5 times or even more of the performance Nvidia's current flagship has to offer there.
They claim fps numbers from alpha software running on an emulation of the chip :rolleyes:
NV30 will score three times more than the GeForce4 Ti4600 in 3D Mark 2001.
With what CPU? The 3DMark2001 benchmark is processor-dependent, so maybe they can claim 3x the speed with a P4 overclocked to 3.5 GHz or something...
Effective HQ Pixel Fillrate (2x anisotropic filtering enabled) of the newcomer will be about 2.7 times more than that of the fastest NV25.
Ahh, they have fast anisotropic filtering now. Good! So they nearly catch up to the R300, which can do 16-tap (bilinear) anisotropic filtering for free (not taking the possibly higher bandwidth demand into account).
As for pixel-shading speed, it will be 4 times that of the NV25.
So they have 6 pixel shaders, then? The Ti4600 runs at 300 MHz and the NV30 seems to run at 400 MHz; the Ti4600 has two (?) pixel shaders, so with 400/300 x 6/2 plus tweaks you get 4 times the speed. I thought the R300 already has 8 pixel shaders at ~300 MHz? So both chips will have nearly the same speed, apart from specific performance tweaks of the different implementations.
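A quick sanity check of that scaling argument; everything below uses the rumoured figures from this thread, not confirmed specs:

```python
# Back-of-the-envelope pixel-shader scaling from the rumoured figures above
# (clocks and unit counts are this thread's speculation, not confirmed specs).
ti4600_clock_mhz, ti4600_units = 300, 2
nv30_clock_mhz, nv30_units = 400, 6   # unit count inferred from the "4x NV25" claim
r300_clock_mhz, r300_units = 300, 8

def scaling(clock_mhz: int, units: int) -> float:
    """Throughput relative to the Ti4600, assuming perfect scaling with clock and unit count."""
    return (clock_mhz / ti4600_clock_mhz) * (units / ti4600_units)

print(f"NV30 vs Ti4600: {scaling(nv30_clock_mhz, nv30_units):.1f}x")  # -> 4.0x
print(f"R300 vs Ti4600: {scaling(r300_clock_mhz, r300_units):.1f}x")  # -> 4.0x
```

Both work out to roughly 4x the Ti4600, which is what the "nearly the same speed" conclusion rests on.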
Nvidia claims that its upcoming GPU is capable of processing 200 million triangles per second.
Slow! So it's only the NV2A pipeline (+ DX9.x) at a higher clock. This also means the NV30 will run at only 400 MHz.
 
I think that if it's 2.5x faster at 1280*1024, it should be 2.5x or more faster at 1600*1200.

I don't see your point in saying "on alpha software emulation". That applies to all of them, so either you make this point and dismiss all the facts, or you are OK with it and don't bring it up.

On the 3DMark score, I would ask at which resolution and with which options ;)

On pixel shading speed, I would rather say quantity. As for speed, perhaps it comes from programmability ;)

For the rest, I'll wait for the launch in September, I think.
 
I don't see your point in saying "on alpha software emulation". That applies to all of them, so either you make this point and dismiss all the facts, or you are OK with it and don't bring it up.

Facts? What facts?? No info has officially been released yet, and it hasn't even taped out. There is also a ludicrous claim that it will have 40 GB/s of bandwidth... :rolleyes:

Whatever...
 
Hellbinder[CE] said:
I don't see your point in saying "on alpha software emulation". That applies to all of them, so either you make this point and dismiss all the facts, or you are OK with it and don't bring it up.

Facts? What facts?? No info has officially been released yet, and it hasn't even taped out. There is also a ludicrous claim that it will have 40 GB/s of bandwidth... :rolleyes:

Whatever...
Sorry, I was thinking of "figures" and not "facts"; replace "facts" with "figures" ;). You are right :)
 
I've been saying all along that the R300 and NV30 will be about the same speed, which makes these fanatic threads ridiculous. Thank your lucky stars that two great DX9 cards will be out this year.

The real difference between the two will come down to price, features, and flexibility. Each card will do some things better than the other, and I believe that neither card will be "the overall winner". It will all come down to which features you tend to like.

Of course, we will still devolve into ridiculous threads because one card does one thing better than another, and some fanatic likes running his games with that feature, and believes that this feature is the only one that matters.

I think for researchers, developers, and professionals, the NV30 might turn out to be the preferred card because of the slightly better programmability. Same for the Wildcat P10, but its performance seems to suck. For average gamers, the R300 might be the cheaper/better choice. For insane overclockers, the dudes who like to compare their dick sizes and 3DMark scores, the NV30 might be the favored choice because the 0.13um process might yield superior overclocking.

Who knows. I just don't think one card fits all personalities.
 
We were correct in almost all our predictions except the memory speed. Nvidia wants it to be about 1 GHz delivering amazing 48 GB/s bandwidth when accompanied by the 3rd generation of their LightSpeed Memory Architecture. We are not sure that Samsung will be able to deliver them 1 GHz DDR-II memory by September.

We are not at liberty to discuss this, and I doubt they are either if they gained this through official channels. They did, however, miss a very important word out when talking about that bandwidth number.
 
DaveBaumann said:
We were correct in almost all our predictions except the memory speed. Nvidia wants it to be about 1 GHz delivering amazing 48 GB/s bandwidth when accompanied by the 3rd generation of their LightSpeed Memory Architecture. We are not sure that Samsung will be able to deliver them 1 GHz DDR-II memory by September.

We are not at liberty to discuss this, and I doubt they are either if they gained this through official channels. They did, however, miss a very important word out when talking about that bandwidth number.

Let me see, my guess for that missing word would be: "effective". Effective fillrate based on an overly generous estimate of their HSR schemes.

And as for the rest, 3x the Ti4600 score in 3Dmark is sheer bullshit, I mean sheer and utter bullshit. I can't believe they'd even have the audacity to suggest that when the benchmark becomes CPU limited long before you'd achieve 30K 3dmarks.

Since it's Nvidia we can assume that they're exaggerating. Essentially it is the same speed as the R300: 2.5 times faster at 1280*1024*32*4x is more or less the same as 2.5 times faster at 1600*1200*32*4x. You might think it'd be faster at 1600*1200, but given the exaggeration factor I rather doubt it. In fact, it may be slower at higher res; maybe it chokes on its memory bandwidth? I mean, if 1600*1200 showed it in a better light, wouldn't they have said that instead (it is the highest "realistic" resolution, after all)?

By the way, how the hell does Nvidia know any of that when the card hasn't even taped out yet?

Overall the card sounds like it will be good, but right in line with the R300, like I've held all along. As for the delay, we'll see...
 
At 400 MHz they'd be a bit shy of 200 MPolygons/s (more like 180, at least if the clock-to-spec ratio is the same as for their previous chips), which would still give them only ~60% of the power of the 9700 as far as vertex shader processing goes ...
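One way to reproduce that estimate; note that the Ti4600 and Radeon 9700 reference specs below are my own assumed figures (~136 Mtriangles/s at 300 MHz and ~300 Mtriangles/s), not numbers stated in this thread:

```python
# Rough reading of the estimate above. The Ti4600 and 9700 reference figures
# are assumed marketing specs, not anything from this thread; only the 400 MHz
# clock and the 200 Mtriangles/s claim come from the rumours.
ti4600_spec_tris = 136e6      # assumed Ti4600 setup spec (~136 M/s at 300 MHz)
ti4600_clock = 300e6
ratio = ti4600_spec_tris / ti4600_clock        # ~0.45 triangles per clock

nv30_clock = 400e6
nv30_estimate = ratio * nv30_clock             # scale the same ratio to 400 MHz
print(f"NV30 estimate: {nv30_estimate / 1e6:.0f} Mtriangles/s")    # ~181, "a bit shy of 200"

r9700_claim = 300e6                            # assumed 9700 marketing figure
print(f"Relative to the 9700: {nv30_estimate / r9700_claim:.0%}")  # ~60%
```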
 
DaveBaumann said:
We were correct in almost all our predictions except the memory speed. Nvidia wants it to be about 1 GHz delivering amazing 48 GB/s bandwidth when accompanied by the 3rd generation of their LightSpeed Memory Architecture. We are not sure that Samsung will be able to deliver them 1 GHz DDR-II memory by September.

We are not at liberty to discuss this, and I doubt they are either if they gained this through official channels. They did, however, miss a very important word out when talking about that bandwidth number.

Dave, are you on NDA ?
I don't understand their 48 GB/s number ... With 1 GHz DDR-II (500 MHz in reality, i.e. 1 Gbit/s per pin) we have either 16 GB/s with a 128-bit bus or 32 GB/s with a 256-bit bus. Either they're wrong or ...

1. they talked about the multichip configuration
2. they talked about the bandwidth of a 'big cache' (I was thinking about an off chip cache maybe in the same package of the GPU with a 384 bit bus)

Guillaume
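For reference, a minimal sketch of the raw-bandwidth arithmetic behind that objection, using only the per-pin data rate and bus widths quoted above:

```python
# Peak memory bandwidth from the per-pin data rate and bus widths quoted above:
# 1 GHz effective DDR-II = 1 Gbit/s per pin.
def bandwidth_gb_per_s(gbit_per_pin: float, bus_width_bits: int) -> float:
    """Raw peak bandwidth in GB/s (bits -> bytes, hence the divide by 8)."""
    return gbit_per_pin * bus_width_bits / 8

for width_bits in (128, 256):
    print(f"{width_bits}-bit bus: {bandwidth_gb_per_s(1.0, width_bits):.0f} GB/s")
# -> 16 GB/s and 32 GB/s; neither reaches 48 GB/s on raw bandwidth alone.
```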
 
Lessard said:
Dave, are you on NDA ?
I don't understand their 48 GB/s number ... With 1 GHz DDR-II (500 MHz in reality, i.e. 1 Gbit/s per pin) we have either 16 GB/s with a 128-bit bus or 32 GB/s with a 256-bit bus. Either they're wrong or ...

1. they talked about the multichip configuration
2. they talked about the bandwidth of a 'big cache' (I was thinking about an off chip cache maybe in the same package of the GPU with a 384 bit bus)

Guillaume

The way I see it, they could mean that it's ~32 GB/sec * 1.5 (from occlusion culling / Lightspeed 3, etc.) = ~48 GB/sec.
Or the NV30 really is an architecture with several buses and chips :eek:
 
Poo - I missed this post because of its title, which is why I created a duplicate two hours after the train had left the station - sorry!

My thoughts - a lot of interesting hype. I calculated a 32 GB/sec effective fill rate too last week - so there must be a 50% lift factor in their calculations, given (theoretically, in some circumstances) by LMA III and new coding routines, if depth testing is implemented the way NVidia advises.

A 3DMark2001 score of 30,000 would be impressive to see. Is this on a big Hammer configuration, or does the NV30 take so much load off the CPU (compared to the NV25) that all the benchmarks are lifted that high, or are the video-card-specific tests lifted so high that even when you add in the CPU constraints it's still 3x faster?

Good to hear it's not going to be that late ;)

Also as said I am glad to see they know just how fast this beast will be on the final Doom 3 code - I wonder if they told John Carmack that too :)

* * * * * * * * * *

I see no end to this marketing speak until the product is delivered and thoroughly benchmarked.

But I smile thinking where this will raise the high-water mark to - imagine in one or two years, when cards are 2-3 times faster than the NV30 or Radeon 9700, and entry-level cards have about the NV30's specs.

As the song says, "The future's so bright, I gotta wear 8) "
 
eSa said:
Lessard said:
Dave, are you on NDA ?
I don't understand their 48 GB/s number ... With 1 GHz DDR-II (500 MHz in reality, i.e. 1 Gbit/s per pin) we have either 16 GB/s with a 128-bit bus or 32 GB/s with a 256-bit bus. Either they're wrong or ...

1. they talked about the multichip configuration
2. they talked about the bandwidth of a 'big cache' (I was thinking about an off chip cache maybe in the same package of the GPU with a 384 bit bus)

Guillaume

The way I see it, they could mean that it's ~32 GB/sec * 1.5 (from occlusion culling / Lightspeed 3, etc.) = ~48 GB/sec.
Or the NV30 really is an architecture with several buses and chips :eek:

Knowing how companies like to skew the data, probably more like 16 GB/s * 3x given overdraw.
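Both of these readings amount to "raw bus bandwidth times a claimed efficiency multiplier"; a sketch of the two candidates floated in this thread (the multipliers are pure speculation):

```python
# The 48 GB/s claim read as "raw bandwidth x efficiency multiplier", using the
# two interpretations suggested in this thread. The multipliers are speculation.
readings = [
    ("256-bit bus + 1.5x from LMA III / occlusion culling", 32.0, 1.5),
    ("128-bit bus + 3x from assumed overdraw savings",      16.0, 3.0),
]
for label, raw_gb_s, multiplier in readings:
    effective = raw_gb_s * multiplier
    print(f"{label}: {raw_gb_s:.0f} GB/s x {multiplier} = {effective:.0f} GB/s 'effective'")
```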
 
Interesting speculation: 16 GB/s of data bus (1 GHz, 128-bit DDR-II) multiplied by an overdraw of 3.

Will the NV30 use deferred rendering? Could this be the reason Nvidia said it will not need a 256-bit bus?
 
Dave, are you on NDA ?

Yes, I have been for some time now - as I said, there are some things we can't discuss and some we can (witness the NV30 news entry on the main page some time ago).
 
pascal said:
Interesting speculation: 16 GB/s of data bus (1 GHz, 128-bit DDR-II) multiplied by an overdraw of 3.

Will the NV30 use deferred rendering? Could this be the reason Nvidia said it will not need a 256-bit bus?

OK, this is a shot in the dark, but here goes... At the Assembly 2002 demoparty (where Bitboys held their presentation), there were a few other presentations too. One of them was from Hybrid, a Finnish company that has a high-end visibility/occlusion culling solution called DPVS (software based!). The presentation was very basic, but there was an interesting bit when the guy talking about the current state of hardware first mentioned ATI's HyperZ and then said something like "some chips next year will have more complete occlusion culling in hw", and then mentioned something about starting to use it in their engine. I wouldn't be surprised if they have some kind of "inside info" about future GPUs ;)

I know the GF4 has rudimentary occlusion culling support as an extension, but maybe there is more to come?! After all, Ned Greene is the father of basic occlusion culling techniques and he has done some work for Nvidia. Of course, those high bandwidth figures could just come from the fact that if you simply draw from front to back you will remove a lot of overdraw (and that works even with the GF4!).
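A minimal illustration of that last point: just sorting opaque draws front to back lets the depth test reject most occluded pixels before they cost shading or bandwidth, with no new hardware feature required. Everything in the sketch (the Draw record and submit function) is a hypothetical stand-in, not a real API:

```python
# Illustrative only: sort opaque objects front to back so that by the time the
# farther ones are drawn, the Z-buffer already rejects their hidden pixels.
# 'Draw' and 'submit' are hypothetical stand-ins, not a real graphics API.
from dataclasses import dataclass

@dataclass
class Draw:
    name: str
    view_depth: float   # distance from the camera

def submit(d: Draw) -> None:
    print(f"drawing {d.name} at depth {d.view_depth}")

draws = [Draw("far mountains", 900.0), Draw("mid building", 40.0), Draw("near crate", 2.5)]

# Nearer objects go first and populate the Z-buffer; later, farther fragments
# then fail the depth test instead of being shaded and written out.
for d in sorted(draws, key=lambda d: d.view_depth):
    submit(d)
```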
 
If it needs developer support it obviously isn't related to the bandwidth, though. Interesting nonetheless; I hope they meant the hardware will let you have it autonomously decide whether to execute certain instructions based on an earlier bounding-box test (as opposed to it doing the bounding-box test, giving feedback, and letting the CPU decide not to send the instructions).
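To make that distinction concrete, here is a toy model of the two flows being contrasted; every function and the command-stream representation are hypothetical stand-ins, not any real driver or graphics API:

```python
# Toy model of the two occlusion-test flows. Nothing here is a real API.

def bounding_box_samples(obj: str) -> int:
    """Pretend to test the object's bounding box against the Z-buffer and
    return how many samples would be visible."""
    return 0 if obj == "hidden statue" else 1234

def draw(obj: str) -> None:
    print(f"drawing {obj}")

objects = ["visible wall", "hidden statue"]

# Flow 1: CPU feedback. The CPU waits for each test result and decides
# whether to send the draw commands at all (a round trip per decision).
for obj in objects:
    if bounding_box_samples(obj) > 0:
        draw(obj)

# Flow 2: autonomous execution. The CPU always submits the (predicated) draw;
# the "GPU" resolves the predicate itself while consuming the stream, so no
# result ever has to travel back to the CPU.
stream = []
for obj in objects:
    stream.append(("query", obj))
    stream.append(("draw_if_visible", obj))   # predicated on the query above

visible = {}
for op, obj in stream:
    if op == "query":
        visible[obj] = bounding_box_samples(obj) > 0
    elif op == "draw_if_visible" and visible[obj]:
        draw(obj)
```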
 
MfA said:
If it needs developer support it obviously isn't related to the bandwidth, though. Interesting nonetheless; I hope they meant the hardware will let you have it autonomously decide whether to execute certain instructions based on an earlier bounding-box test (as opposed to it doing the bounding-box test, giving feedback, and letting the CPU decide not to send the instructions).

The main architect of DPVS (formerly known as "Umbra") wrote his Master's thesis on the subject. It's available here: http://www.hybrid.fi/research/

Rather nice Master's thesis, about 140 pages :)
And for the trivia fans out there, Timo Aila used to be known years ago as "Tsunami" of Virtual Dreams / Fairlight - one of the most famous Amiga democoders, that is.
 
MfA said:
At 400 MHz they'd be a bit shy of 200 MPolygons/s (more like 180, at least if the clock-to-spec ratio is the same as for their previous chips), which would still give them only ~60% of the power of the 9700 as far as vertex shader processing goes ...

Only 200M polygons? That does seem like an awfully low polygon count compared to the Radeon 9700. It seems the NV30 will not be the Radeon 9700 "killer" if these specs are anywhere close to correct. Looks as though the NV30's performance will be on par with or slightly less than the Radeon 9700's from what I have read here, but this is all speculation on NV30 vapourware. Bah, it is definitely a wait-and-see situation, IMHO of course.
 
That all depends. Wasn't the GeForce4 still a better performer when it came to geometry throughput compared to the Radeon 8500, even though the Radeon had higher poly performance specs on the box? If this is true, it could be the same case here.
 
Didn't Pixar say that you needed just over 20 GB/s to render Toy Story in real time? Does anyone remember what that quote mentioned?
 