NVIDIA GT200 Rumours & Speculation Thread

Well, you don't have an instruction pointer per element in an SSE vector, but yeah - it's a massive load of bullshit if I say so myself. The 'stream processor' nomenclature is already a bit stupid, but it's far from the 'errr that's FUD' line from a legal POV. Calling them cores, on the other hand, is really pushing it IMO...

Going back to GT200, could it be that both 240 SPs and 384 SPs are correct? i.e. GT200 would have 80 TMUs, 240 SPs, 32 ROPs/GDDR3, and a refresh coming out later would have 96 TMUs, 384 SPs and 32 ROPs/GDDR5? That would be quite aggressive on the same process node (or just one half-node ahead, i.e. 65->55nm) but not entirely impossible, especially given that GT200 apparently did get delayed.

I listened to all ~7 hours of the Financial Analyst presentation given by NVIDIA, and there was some very interesting stuff presented.

Did you notice that JHH spoke about how NV would move from hundreds to thousands of "cores" during the presentation?

He also said they were working on things that were miles ahead of the competition. Surely he must have meant being able to use CUDA to put all those parallel processors to good use in computational finance, medicine, weather, etc.?

No wonder NV is so confident moving forward. Their architecture is solid enough that they can scale to thousands of cores over time, they have a really solid programming tool in CUDA to take full advantage of advanced GPU parallel processing, they have PhysX processing which can be incorporated into the GPU via CUDA, and they have designed incredibly clever low-power, high-performance devices that could go straight into next-gen iPhones and the like, in addition to everything else we don't know about.

Will be fascinating to see how things work out, but I can tell from the presentation that NVIDIA is amped about the future.
 
Here is further information regarding the release date. It is definitely set for July, with more information about GT200 expected from Nvidia in May:

http://xbitlabs.com/news/video/disp...ribes_Next_Generation_Graphics_Processor.html


Looks like Computex is going to be quite interesting... I'm only a $250 flight from there, so maybe I should fly over and check it out for everyone. I would really like to know if this is DX10 or DX10.1... not that it's a deal breaker, to be sure.
For G92b most likely...
GT200 is looking like a late Q3/Q4 launch.
It always amazes me how these news articles always make so many assumptions based off a single quote.

I listened to all ~7 hours of the Financial Analyst presentation given by NVIDIA, and there was some very interesting stuff presented.

Did you notice that JHH spoke about how NV would move from hundreds to thousands of "cores" during the presentation?

He also said they were working on things that were miles ahead of the competition. Surely he must have meant being able to use CUDA to put all those parallel processors to good use in computational finance, medicine, weather, etc.?

No wonder NV is so confident moving forward. Their architecture is solid enough that they can scale to thousands of cores over time, they have a really solid programming tool in CUDA to take full advantage of advanced GPU parallel processing, they have PhysX processing which can be incorporated into the GPU via CUDA, and they have designed incredibly clever low-power, high-performance devices that could go straight into next-gen iPhones and the like, in addition to everything else we don't know about.

Will be fascinating to see how things work out, but I can tell from the presentation that NVIDIA is amped about the future.

How is CUDA "miles ahead" of ATi's GPGPUs?
ATi has had its GPGPU solution out for quite a while, with much better price/performance and performance/watt, and it even has double precision, which Nvidia still hasn't managed to push out the door.
 
For G92b most likely...
GT200 is looking like a late Q3/Q4 launch.
It always amazes me how these news articles always make so many assumptions based off a single quote.

Indeed... which is why I'm not holding my breath over any of it. Computex should reveal more, certainly.
 
Did you notice that JHH spoke about how NV would move from hundreds to thousands of "cores" during the presentation?
Was that said in the context of single cor--er, GPUs, or is it safe to assume that falls under SLI (where 256 * 4 cards would qualify as "thousands")?

Assuming "GT200" is packing "only" 256 "thingamabobs," that "is."
 
For G92b most likely...
GT200 is looking like a late Q3/Q4 launch.
It always amazes me how these news articles always make so many assumptions based off a single quote.

why do you say that?

How is CUDA "miles ahead" of ATi's GPGPUs?
ATi has had its GPGPU solution out for quite a while, with much better price/performance and performance/watt, and it even has double precision, which Nvidia still hasn't managed to push out the door.

CUDA as an SDK is miles ahead.
 
For G92b most likely...
GT200 is looking like a late Q3/Q4 launch.
Actually it looks more and more like Q2/Q3 launch.
And more and more like 240 SPs (in 10 clusters) with 512-bit bus.
55nm refresh should be out in Q4 and will use 256-bit bus with GDDR5.
Something like that...
 
...
How is CUDA "miles ahead" of ATi's GPGPUs?
ATi has had its GPGPU solution out for quite a while, with much better price/performance and performance/watt, and it even has double precision, which Nvidia still hasn't managed to push out the door.

IMLO JHH didn't mean ATI/AMD, but INTEL. :smile:
 
Was that said in the context of single cor--er, GPUs, or is it safe to assume that falls under SLI (where 256 * 4 cards would qualify as "thousands")?

Assuming "GT200" is packing "only" 256 "thingamabobs," that "is."

I got the sense that JHH was almost certainly not referring to SLI, which makes things rather interesting given that they were talking about thousands of cores and several teraflops of computing power in the near future. Also, I seem to recall reading something a little while ago where NVIDIA talked about continuing to develop "monolithic" GPUs for their high end moving forward.
 
IMLO JHH didn't mean ATI/AMD, but INTEL. :smile:

Yes, I think JHH in almost every circumstance was referring to Intel when he referred to the "competition". Only in a few brief comments did he make any mention of AMD/ATI. In fact, he even stated that Dave Orton was "by far" the best competitor that NVIDIA has ever faced in their 15 year history, and he complimented the skills of the GPU architects at ATI.
 
NH is guessing (?) too.
http://www.nordichardware.com/news,7644.html

Unlike AMD/ATI, NVIDIA is not expected to use GDDR5. Instead it will go for a more complex PCB with GDDR3, which means a wider bus instead of higher memory frequency. GDDR3 chips top out at 1100MHz today. We expect GT200 to use these chips, and with a 512-bit bus that puts it on par with RV770 and its 256-bit bus and 4GHz GDDR5 memory.

The clusters have changed from 16 shaders per cluster to 24 per cluster, most likely MADD + MUL, and 10 clusters with the high-end part, which means 240 shaders all in all. These will be accompanied by 32 ROPs, 120 TMUs and 120 TFUs. The core clock will hopefully be in the area of 600-650MHz, which means shader clocks around 1500MHz.

We're guessing late Q3 or early Q4, time will tell. And this is not the GeForce 9900 series. The 9900 series will be based on the 55nm G92b core.

All I want is a very fast card for the new Stalker game and Far Cry 2. :smile:
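
For what it's worth, here's a quick back-of-the-envelope check of that "on par" claim; a minimal sketch (Python), assuming the figures quoted above (1100MHz GDDR3, i.e. 2200MHz effective, on a 512-bit bus for GT200, and ~4GHz effective GDDR5 on a 256-bit bus for RV770):

```python
# Peak memory bandwidth from bus width and effective data rate.
# Figures are the rumoured ones quoted above, not confirmed specs.

def bandwidth_gb_s(bus_width_bits: int, effective_mhz: float) -> float:
    """GB/s = (bus width in bytes) * (effective transfer rate in GHz)."""
    return (bus_width_bits / 8) * (effective_mhz / 1000)

gt200_gddr3 = bandwidth_gb_s(512, 2200)  # rumoured GT200: 512-bit, 1100MHz GDDR3
rv770_gddr5 = bandwidth_gb_s(256, 4000)  # rumoured RV770: 256-bit, ~4GHz effective GDDR5

print(f"GT200 (512-bit GDDR3): {gt200_gddr3:.1f} GB/s")  # ~140.8 GB/s
print(f"RV770 (256-bit GDDR5): {rv770_gddr5:.1f} GB/s")  # ~128.0 GB/s
```

So if those numbers hold, the wider GDDR3 bus actually ends up slightly ahead, at the cost of the more complex PCB.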
 
The clusters have changed from 16 shaders per cluster to 24 per cluster,
I know G80 has 8 clusters of 16 ALUs, but is each cluster one SIMD (a total of 8 processors, 16 ALUs wide) or does a cluster consist of two SIMDs (16 processors, 8 wide)?

With regards to the bandwidth of "GT200". GDDR3 at even 2100 MHz and a 512bit bus doesn't seem like a major increase over 8800GTX. It would even be slower than 8800Ultra. Would that be enough to feed 240 or even 256 ALUs? How did G80 fare with regards to bandwidth? I know that G92 seems to be quite restricted by its 256bit bus. Won't similar problems arise if GT200 uses GDDR3?
 
I know G80 has 8 clusters of 16 ALUs, but is each cluster one SIMD (a total of 8 processors, 16 ALUs wide) or does a cluster consist of two SIMDs (16 processors, 8 wide)?

With regards to the bandwidth of "GT200". GDDR3 at even 2100 MHz and a 512bit bus doesn't seem like a major increase over 8800GTX. It would even be slower than 8800Ultra. Would that be enough to feed 240 or even 256 ALUs? How did G80 fare with regards to bandwidth? I know that G92 seems to be quite restricted by its 256bit bus. Won't similar problems arise if GT200 uses GDDR3?

1100MHz GDDR3 (2200MHz effective) gives 140.8GB/s on a 512-bit bus.

That's quite a bit higher than the Ultra with only 103.7GB/s.

If these specs are true this thing sounds like a beast! 1.08 TFLOPS of shader power at 1500MHz!!
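
That 1.08 TFLOPS figure checks out if you count MADD+MUL as 3 flops per SP per clock, which is how these peak numbers are usually quoted; a quick sketch (Python), assuming the rumoured 240 SPs at a 1500MHz shader clock:

```python
# Peak shader throughput, counting MADD (2 flops) + MUL (1 flop) per SP per clock.
sps = 240               # rumoured SP count
shader_clock_ghz = 1.5  # rumoured shader clock
flops_per_sp = 3        # MADD + MUL

print(f"GT200 (rumoured): {sps * flops_per_sp * shader_clock_ghz:.0f} GFLOPS")  # 1080 ~= 1.08 TFLOPS
print(f"8800 Ultra:       {128 * flops_per_sp * 1.512:.0f} GFLOPS")             # ~581 GFLOPS
```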
 
I got the sense that JHH was almost certainly not referring to SLI, which makes things rather interesting given that they were talking about thousands of cores and several teraflops of computing power in the near future. Also, I seem to recall reading something a little while ago where NVIDIA talked about continuing to develop "monolithic" GPUs for their high end moving forward.

JHH wouldn't know thousands of cores if it bit him on the ass. He certainly doesn't know anything about hundreds, or even a hundred cores.

Really he's still in the low teens.

Aaron spink
speaking for myself inc.
 
The current 128 "cores" (SPs) in G80 and G92 GPUs (256 in 9800 GX2 cards) aren't directly comparable to Intel's 16 to 24 cores in Larrabee.

Nvidia will soon jump to 200+ SPs in the GT200 / NV55 GPU, and with GX2 cards based on GT200 / NV55 we're looking at 400-500 SPs. By the time Larrabee ships in 2010 Nvidia will have products with 1000 or more SPs.

That's Nvidia's 1000 SPs vs Intel's 16-24 (or maybe even 32) cores.

It would seem like Nvidia has an overwhelming advantage, yet the numbers aren't really comparable.

It's like saying Xbox 360's Xenos GPU has 48 shader pipelines when it really only has 8 (ROPs). Those "pipelines" are really just ALUs.

Obviously what it'll come down to is not counting SPs, ALUs and cores, but overall final graphics performance. Larrabee might be better than current NV50/G80/G92 based products and even upcoming NV55/GT200 based products. It'll be interesting to see how Larrabee compares to Nvidia's true next-generation architecture (NV60, for lack of a better name, since Nvidia changes the way it names GPUs every month now), both in ray-tracing (if that happens) but more importantly in rasterization, or even hybrid rendering, which will be more practical than ray-tracing alone.
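
To make the "not really comparable" point concrete, here's a rough peak-FLOPS sketch (Python). The G80 numbers are the usual published ones; the Larrabee line is pure assumption on my part (16-wide SIMD per core, 2 flops per lane per clock, a guessed 2GHz clock), since Intel hasn't disclosed anything:

```python
# Raw "core" counts vs. what each unit actually does per clock.

def peak_gflops(units: int, flops_per_unit_per_clock: float, clock_ghz: float) -> float:
    return units * flops_per_unit_per_clock * clock_ghz

# G80 (8800 GTX): 128 scalar SPs, MADD+MUL = 3 flops/clock, 1.35GHz shader clock
g80 = peak_gflops(128, 3, 1.35)          # ~518 GFLOPS

# Hypothetical Larrabee: 24 cores, 16-wide SIMD, FMA = 2 flops/lane, assumed 2GHz
larrabee = peak_gflops(24, 16 * 2, 2.0)  # ~1536 GFLOPS -- guesswork, not a spec

print(f"G80:      {g80:.0f} GFLOPS from 128 'cores'")
print(f"Larrabee: {larrabee:.0f} GFLOPS from 24 cores (assumed width and clock)")
```

Counting "cores" alone, 128 vs 24 looks like a blowout; counting what each unit does per clock, the gap narrows or even reverses, which is exactly why the raw counts tell you very little.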
 
Actually it looks more and more like Q2/Q3 launch.
And more and more like 240 SPs (in 10 clusters) with 512-bit bus.
55nm refresh should be out in Q4 and will use 256-bit bus with GDDR5.
Something like that...

http://www.overclockers.ru/hardnews/28879.shtml
They say: GT200 is a monster...
10 clusters with 24 "SPs" and 8 TMUs each -> 240 SPs and 80 TMUs in total
512-bit memory interface with GDDR3
32 ROPs
Some kind of CFAA, possibly D3D10.1
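
Putting those per-cluster numbers together with the 600-650MHz core clock guessed earlier in the thread (a quick sketch in Python; the single-texel-per-TMU-per-clock assumption is mine):

```python
# Totals and rough texel fill rate from the rumoured per-cluster breakdown.
clusters = 10
sps_per_cluster = 24
tmus_per_cluster = 8

total_sps = clusters * sps_per_cluster    # 240
total_tmus = clusters * tmus_per_cluster  # 80

for core_mhz in (600, 650):                 # rumoured core clock range
    gtexels = total_tmus * core_mhz / 1000  # assumes 1 bilinear texel per TMU per clock
    print(f"{total_sps} SPs / {total_tmus} TMUs -> ~{gtexels:.0f} GTexels/s at {core_mhz}MHz")
```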
 
JHH wouldn't know thousands of cores if it bit him on the ass. He certainly doesn't know anything about hundreds, or even a hundred cores.

Really he's still in the low teens.

Aaron spink
speaking for myself inc.

har har :D

Would it be accurate to describe them as thousands of parallel processing units?
 
The current 128 "cores" (SPs) in G80 and G92 GPUs (256 in 9800 GX2 cards) aren't directly comparable to Intel's 16 to 24 cores in Larrabee.

Nvidia will soon jump to 200+ SPs in the GT200 / NV55 GPU, and with GX2 cards based on GT200 / NV55 we're looking at 400-500 SPs. By the time Larrabee ships in 2010 Nvidia will have products with 1000 or more SPs.

That's Nvidia's 1000 SPs vs Intel's 16-24 (or maybe even 32) cores.

It would seem like Nvidia has an overwhelming advantage, yet the numbers aren't really comparable.

It's like saying Xbox 360's Xenos GPU has 48 shader pipelines when it really only has 8 (ROPs). Those "pipelines" are really just ALUs.

Obviously what it'll come down to is not counting SPs, ALUs and cores, but overall final graphics performance. Larrabee might be better than current NV50/G80/G92 based products and even upcoming NV55/GT200 based products. It'll be interesting to see how Larrabee compares to Nvidia's true next-generation architecture (NV60, for lack of a better name, since Nvidia changes the way it names GPUs every month now), both in ray-tracing (if that happens) but more importantly in rasterization, or even hybrid rendering, which will be more practical than ray-tracing alone.

This is very true. But regardless of the nomenclature NVIDIA uses for its GPUs ("cores", stream processors, parallel processing units, etc.), it's noteworthy that performance in many of the applications discussed during the 7-hour marathon session was said to scale linearly with the number of "parallel processing units" on the NVIDIA GPU. So starting with the current 128, going to 256 would be a 2x improvement, 512 would be 4x, and 1024 would be 8x. So if a current NVIDIA high-end GPU is already 200x faster than an Intel Core 2 Duo in some applications, then an NVIDIA GPU in the next two or three years would be 1600x faster than a Core 2 Duo. That's a pretty incredible gap that Intel needs to bridge over the next few years with Larrabee.
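
A minimal sketch (Python) of that scaling arithmetic, taking the linear-scaling claim and the 200x-over-a-Core-2-Duo figure from the presentation at face value:

```python
# Linear scaling with the number of parallel processing units,
# starting from 128 units claimed to be ~200x a Core 2 Duo in some applications.
baseline_units = 128
baseline_speedup = 200

for units in (256, 512, 1024):
    scale = units / baseline_units
    print(f"{units} units: {scale:.0f}x today's GPU, ~{scale * baseline_speedup:.0f}x a Core 2 Duo")
```

Of course that assumes perfect scaling and a CPU that stands still over those two or three years, which it won't.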
 
This is very true. But regardless of the nomenclature NVIDIA uses for its GPUs ("cores", stream processors, parallel processing units, etc.), it's noteworthy that performance in many of the applications discussed during the 7-hour marathon session was said to scale linearly with the number of "parallel processing units" on the NVIDIA GPU. So starting with the current 128, going to 256 would be a 2x improvement, 512 would be 4x, and 1024 would be 8x. So if a current NVIDIA high-end GPU is already 200x faster than an Intel Core 2 Duo in some applications, then an NVIDIA GPU in the next two or three years would be 1600x faster than a Core 2 Duo. That's a pretty incredible gap that Intel needs to bridge over the next few years with Larrabee.

True, but Nvidia isn't offering anything really new. Their basic TNT2 graphics accelerator was much, much faster at rendering graphics than a Pentium III processor. Larrabee will have to be optimized for rendering graphics, be it rasterization, ray-tracing, hybrid raster/ray-tracing or other methods, so that its 16-24 or 32 cores can keep up with Nvidia's many hundreds or a thousand or so parallel processing units. We've seen that even 3 or 4 CELL processors in parallel are not up to the task of matching in software what a 4-year-old 6800 can do in hardware, as far as traditional rasterization goes (I'm thinking back to one particular demo from last year or 2006).
 