NV30 has 125 million transistors

well, it probably does.

http://www.nvnews.net/vbulletin/showthread.php?s=&threadid=3635

pretty likely I'd say. Unless that's a fake nvidia chart.

125M transistors is nothing to scoff at. That's about 15-18 million more than Radeon 9700 (R300), and more than was originally expected. Nvidia's CEO said 120M transistors in Wired Magazine a while back. With 125M, NV30 has almost exactly twice as many as NV25 aka GeForce4 Ti. :eek:

Now if we could just have confirmation on memory bus width....
Ah well, just two more weeks til we know everything, unless someone leaks it before :)
 
If you go to the Korean site and look at all the graphs, they also quote 51 GFlops for the NV30. That seems like a very large number to me. ASCI-White (the big iron behind the US's atomic weapons projects) hit ~5000 GFlops in 2000. That's about two orders of magnitude more, but it also takes up 2 basketball courts.

Guess we'll know a lot more in 2 weeks.
 
I wonder where the 51 GFlops in pixel shaders come from.

At 400 MHz, with 8 pipes, 4 32-bit FP components per vector and single instruction issue per cycle, that would just add up to 12.8 GFlops. If they can use 16-bit FP components and work as 16 pipes (as has been rumoured) they would just do 25.6 GFlops. Doubling that you could get 51.2 GFlops, but where would they get that extra performance? Two pixel instructions issued each cycle per pipe?
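
For anyone who wants to play with the numbers, here is a rough Python sketch of that arithmetic. It assumes peak rate is simply clock x pipes x components x instructions issued per clock. The function and parameter names are made up for illustration, and the 400 MHz / 8 pipe / 16 pipe figures are the rumoured ones from this thread, not confirmed specs.

```python
def peak_gflops(clock_mhz, pipes, components=4, instr_per_clock=1):
    """Theoretical peak in GFlops, counting one flop per component per instruction.

    All inputs are guesses taken from the thread, not official NV30 figures.
    """
    flops_per_clock = pipes * components * instr_per_clock
    # MHz * flops-per-clock gives MFlops; divide by 1000 for GFlops
    return clock_mhz * flops_per_clock / 1000


if __name__ == "__main__":
    print("8 pipes, FP32, single issue:", peak_gflops(400, 8))
    print("16 pipes, FP16, single issue:", peak_gflops(400, 16))
    print("16 pipes, FP16, dual issue:", peak_gflops(400, 16, instr_per_clock=2))
```

The three calls correspond to the 12.8, 25.6 and 51.2 GFlops cases above; changing `clock_mhz` shows how sensitive the headline number is to the clock guess.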
 
I believe ATI issues more than one instruction per cycle per pipe, so there's no reason to believe nVidia can't do it as well.
 
antlers4 said:
I believe ATI issues more than one instruction per cycle per pipe, so there's no reason to believe nVidia can't do it as well.

Sure, but it seems they can issue scalar/texture/vector instructions at the same time. For 51 GFlops it would need 2 vector instructions per cycle. A vector unit is HUGE.
 
RoOoBo said:
I wonder where the 51 GFlops in pixel shaders come from.

At 400 MHz, with 8 pipes, 4 32-bit FP components per vector and single instruction issue per cycle, that would just add up to 12.8 GFlops. If they can use 16-bit FP components and work as 16 pipes (as has been rumoured) they would just do 25.6 GFlops. Doubling that you could get 51.2 GFlops, but where would they get that extra performance? Two pixel instructions issued each cycle per pipe?

Interesting point. Let's do a simple calculation. Assume (because I am lazy) that NV30 runs at 500 MHz and can do 50 GFlops. That means it would need to complete 100 floating-point operations every clock cycle. Seems rather strange. That's a lot of computations each cycle. Going to be interesting to see how they define things to get that number.

Another way to look at it is to take 400 MHz and the 51 GFlops number. That works out to about 128 flops/cycle. Dividing by 8, that gives 16. Meaning some combination of units and operations per unit, 8 of one and 16 of the other, would put us at that number. And yet again, what do they mean by "in pixel shader alone"?
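
The same back-of-envelope division as a small Python sketch, going from a quoted GFlops figure to flops per clock. The helper name is made up, and the clock speeds are guesses from this thread rather than known NV30 specs.

```python
def flops_per_clock(gflops, clock_mhz):
    """Floating-point operations that must complete each clock cycle
    to reach the quoted GFlops figure at the assumed clock speed."""
    # GFlops * 1000 gives MFlops; dividing by MHz leaves flops per cycle
    return gflops * 1000 / clock_mhz


if __name__ == "__main__":
    print("50 GFlops at 500 MHz:", flops_per_clock(50, 500))
    print("51 GFlops at 400 MHz:", flops_per_clock(51, 400))
    print("per pipe with 8 pipes:", flops_per_clock(51.2, 400) / 8)
```

Using 51.2 instead of 51 makes the per-pipe figure come out at a round 16, which is why the "real" claimed number is probably 51.2.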
 
Most people think Nvidia's FLOP ratings are utterly ridiculous:
16 GFLOPS for Riva128
50 GFLOPs for GeForce256


But anyway, it's interesting to see more info on NV30.


Oh and that ASCI-White supercomputer that takes up 2 basketball courts, is that the one with 8,192 CPUs (Power3 i think) arranged in 512 air-conditioner sized nodes with 16 CPUs each?
 
I wonder what kind of bus (definitely a big one, most likely a double-decker ;) ) you would actually need to feed a 50 GFlops pixel shader with the data it needs?? :LOL:
 
megadrive0088 said:
Oh and that ASCI-White supercomputer that takes up 2 basketball courts, is that the one with 8,192 CPUs (Power3 i think) arranged in 512 air-conditioner sized nodes with 16 CPUs each?

Yup. I think the ASCI program may have built an even more powerful one. I lost track.
 
yeah because I heard about ASCI-White over a year ago, and then it was already over a year old, at least.

I believe IBM and the government (NSA or whatever) want a 100 TFLOPs computer within a few years, and then a several-hundred-TFLOPs computer soon after. I heard that NEC was soon going to have a 30 TFLOPs machine, while ASCI-White is 10-12 TFLOPs.

I'd imagine IBM will do something massively powerful with CELL, no doubt. :)

If Cell is 1-2 TFLOPs on a single chip for normal high-end apps, and perhaps a specialized PS3 version will be 4-6 TFLOPs (only wild speculation), one could easily envision a supercomputer with 1000s of 1-2 TFLOP Cells.
Holy Christ. :eek: :eek: :eek: :eek:

Getting back on topic, I wonder how feasible it would be for Nvidia to place multiple NV30 or future NV40/NV50 cores onto a single die.
 
Well, as stated, at 400 MHz with 8 pipes you would need 16 flops per pipe per cycle. Assuming 8 half-float ops in the programmable part of the pipe, isn't it conceivable that the other 8 come from the two TMUs?
 
I find this slide as interesting as any other:

[Image: 33_s.jpg]


This is the first "official" statement I've seen of nVidia's "PR interpretation" of NV30 performance. ("More than 2X GeForce4").

That statement could be read in many different ways. (Which GeForce4? 4200? 4600? In what circumstances? AA? High resolution? Only Shader performance?)

However, I would certainly classify Radeon 9700's performance as also "More than 2X GeForce4" from a marketing perspective as well. (You can easily create a situation with common benchmarks that backs that statement up.)

So in all, this more or less confirms to me to expect NV30 to perform pretty much on par with Radeon 9700. We'll likely see some cases where NV30 beats R300 and vice-versa.
 
Joe DeFuria said:
I find this slide as interesting as any other:

This is the first "official" statement I've seen of nVidia's "PR interpretation" of NV30 performance. ("More than 2X GeForce4").

That statement could be read in many different ways. (Which GeForce4? 4200? 4600? In what circumstances? AA? High resolution? Only Shader performance?)

A GeForce4 MX420 :p
 
Joe DeFuria said:
This is the first "official" statement I've seen of nVidia's "PR interpretation" of NV30 performance. ("More than 2X GeForce4").

That statement could be read in many different ways. (Which GeForce4? 4200? 4600? In what circumstances? AA? High resolution? Only Shader performance?)

However, I would certainly classify Radeon 9700's performance as also "More than 2X GeForce4" from a marketing perspective as well. (You can easily create a situation with common benchmarks that backs that statement up.)

So in all, this more or less confirms to me to expect NV30 to perform pretty much on par with Radeon 9700. We'll likely see some cases where NV30 beats R300 and vice-versa.

In what sense? All you really need is to find one situation in which the NV30 doubles the performance of the GF4 Ti 4?00. Beyond that, the NV30 could be much faster than, as fast as, or slower than the R300...
 
If they can do two instructions/clock (at 4x16bit?) they've got 51.2 GFLOPS. Remember that a 4-component "MAD" (multiply-add) counts as 8 FLOP: 400 MHz x 8 pipes x 2 instructions x 8 FLOP = 51.2 GFLOPS.
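
To see which of the guesses in this thread actually land on that figure, here is a quick Python sketch that tries each combination. The candidate values (8 or 16 pipes, single or dual issue, 1 or 2 flops per component depending on whether a MAD counts double) are just the rumours discussed above, and the function name is made up.

```python
from itertools import product

COMPONENTS = 4          # assumed 4-component vectors
TARGET_PER_CLOCK = 128  # 51.2 GFLOPS / 400 MHz


def configs_hitting_target(target=TARGET_PER_CLOCK):
    """Return (pipes, instr_per_clock, flops_per_component) combinations
    whose flops per clock match the target. Purely speculative inputs."""
    hits = []
    for pipes, issue, ops in product((8, 16), (1, 2), (1, 2)):
        if pipes * issue * COMPONENTS * ops == target:
            hits.append((pipes, issue, ops))
    return hits


if __name__ == "__main__":
    for pipes, issue, ops in configs_hitting_target():
        print(f"{pipes} pipes, {issue} instr/clock, {ops} flop(s) per component")
```

Three of the eight combinations reach 128 flops per clock, including 8 pipes with dual-issue MADs, so the marketing number alone can't tell us which layout is the real one.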
 