The G92 Architecture Rumours & Speculation Thread

Status
Not open for further replies.
However, in its single-chip configuration, it is more high-end in the sense of the 8800 GTS than in the sense of the 8800 GTX or Ultra.

So all that hullabaloo about 1Tflop was talking about multi-GPU products? I hope there is going to be a single-chip G9x variant that can bring the business and trounce G80.
 
Well, maybe it really does have 2x the gflops, but only a 256 bit bus. It would almost always be limited by memory bandwidth/texturing, hence explaining the GTS level performance on current games. That doesn't really sound like nvidia, but maybe ALUs are cheap enough and nvidia wants to feed the GPGPU people a really big bowl of flops for breakfast.

Hmm... we need further prognostications from Arun :)
 
So all that hullabaloo about 1Tflop was talking about multi-GPU products?
No, I think that's definitely talking about single-chip. I'd like to point out that NVIDIA themselves never said 1TFlop though, they said 'nearly 1TFlop'. However, at that stage, they probably didn't precisely know the final clocks, so that has to be taken into consideration too. And everything also depends on how you count your flops (for the better, or for the worse).

And it's certainly worth pointing out that I have no good idea about the bus width, or precise performance estimates... So once again, don't read too much into what I say. If you did do so, the only things you risk uncovering are my own personal guesses. I'm not sure what would be the point of that!
 
psurge said:
Well, maybe it really does have 2x the gflops, but only a 256 bit bus. It would almost always be limited by memory bandwidth/texturing, hence explaining the GTS level performance on current games. That doesn't really sound like nvidia, but maybe ALUs are cheap enough and nvidia wants to feed the GPGPU people a really big bowl of flops for breakfast.
Second R600? :LOL:

At the moment it seems that a single G92 is far from reaching 1 TFLOPs; the other possibility is that NV counts the SFUs, which is not very usual.

With 2.4GHz you simply could save some die-space... ;)
 
Second R600? :LOL:
When adding more TMUs and ROPs would just make you hit massive bandwidth limitations, it seems reasonable to me to increase the number of ALUs and/or the performance of the ALUs, since ironically that would *always* be the best way to improve your perf/mm2 *within* those bandwidth constraints. This is all theory, of course, and may not apply perfectly in practice. Units other than TMUs/ROPs/ALUs would have to be considered too.

At the moment it seems that a single G92 is far from reaching 1 TFLOPs; the other possibility is that NV counts the SFUs, which is not very usual.
The G80 launch material did count the MUL... And the Tesla launch material does too. Only the CUDA Beta material doesn't.

With 2.4GHz you simply could save some die-space... ;)
And with more ALUs *and* 2.4GHz, you could make CELL DP cry... Decisions, decisions! :) (I'm just pointing out that we don't know, and that what might seem like a ridiculous number of FLOPs could be justified by GPGPU - nothing else!)
 
The G80 launch material did count the MUL... And the Tesla launch material does too. Only the CUDA Beta material doesn't.

That they will count the MUL (which is usable for general shading) is clear. But I meant the SFU-only unit (sin, cos, ...), which on G80 gives you one flop every 4 cycles.

So let's count: 96 (SPs) * (2 flop (MADD) + 1 flop (MUL) + 0.25 flop (SFUs)) * 2.4GHz = ~750 GFLOPs.

But maybe G92 can use the SFUs more often, like G86 with one SFU op every cycle -> 922 GFLOPs if you use 1 flop for the SFU in the calculation above.
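
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. The function name `peak_gflops` and its parameters are made up for illustration, and the 96 SPs and 2.4GHz are the speculated figures from this post, not confirmed specs. Whether the MUL and SFU terms can be added at all is disputed in the reply that follows.

```python
def peak_gflops(sps, flops_per_sp_per_clock, shader_clock_ghz):
    """Theoretical peak in GFLOPs: units * flops per unit per clock * clock in GHz."""
    return sps * flops_per_sp_per_clock * shader_clock_ghz


# Speculated figures taken from the post, not confirmed specs.
SPS = 96
SHADER_CLOCK_GHZ = 2.4

# MADD (2 flops) + MUL (1 flop) + SFU at one op every 4 cycles (0.25 flop).
quarter_rate_sfu = peak_gflops(SPS, 2 + 1 + 0.25, SHADER_CLOCK_GHZ)

# MADD (2 flops) + MUL (1 flop) + SFU at one op every cycle (1 flop).
full_rate_sfu = peak_gflops(SPS, 2 + 1 + 1, SHADER_CLOCK_GHZ)

print(f"SFU every 4 cycles: {quarter_rate_sfu:.1f} GFLOPs")
print(f"SFU every cycle:    {full_rate_sfu:.1f} GFLOPs")
```

The two results land at roughly 750 and 922 GFLOPs, matching the figures quoted in the post.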


So how do you think NV will reach the promised ~1 TFLOPs on G92?
 
That they will count the MUL (which is usable for general shading) is clear. But I meant the SFU-only unit (sin, cos, ...), which on G80 gives you one flop every 4 cycles.
If and only if you don't use the MUL. You can't add these two figures...

But maybe G92 can use the SFUs more often, like G86 every cycle one SFU -> 922 GFLOPs if you calculate with 1 flop SFU in the calculation ahead.
What the? Since when can G86 do one SFU per cycle? That makes no sense! It must be one MUL/cycle you are thinking of (vs. 0.5/cycle truly exposable max on G80).

So how do you think NV will reach the promised ~1 TFLOPs on G92?
I think they'll send the bill to Andy Keane (NVIDIA), who'll pay it out of his own pocket with borrowed money, and become a billionaire afterwards. Okay, I'm exaggerating, but you get my point.

EDIT: I also think that it is possible that we are thinking about this wrong: Why must the shader core be so amazingly similar to the one in the G80? There could be some not-so-obvious differences that make raw GFlops comparisons not even that interesting. But that's just speculation, obviously...
 
The combination of a potential 512-bit bus and the GDDR4 standard should allow for more ROPs and TMUs than G80. However, since Nvidia is already well ahead of ATI in the # of ROPs & TMUs, increasing the ALUs and ALU performance sounds right.

As far as "nearly 1 TFLOP" goes, well, 1 TFLOP would be 2 to 3 times as much floating point performance as the G80, depending on how flops are counted. The ~1 TFLOP G92 could be because they're going back to "Nvflop" counting, or is that programmable flops?
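
A rough sketch of where that "2 to 3 times" range comes from, assuming G80 means the 8800 GTX with 128 SPs at a 1.35GHz shader clock. Those two figures are not given in this thread and are filled in here as assumptions; the helper name `g80_gflops` is likewise just illustrative.

```python
# Assumed 8800 GTX figures; they are not stated anywhere in this thread.
G80_SPS = 128
G80_SHADER_CLOCK_GHZ = 1.35
TARGET_GFLOPS = 1000.0  # "1 TFLOP"


def g80_gflops(flops_per_sp_per_clock):
    """Peak GFLOPs for the assumed G80 under a given counting convention."""
    return G80_SPS * flops_per_sp_per_clock * G80_SHADER_CLOCK_GHZ


madd_only = g80_gflops(2)      # count the MADD only
madd_plus_mul = g80_gflops(3)  # count the MADD and the extra MUL

ratio_madd_only = TARGET_GFLOPS / madd_only
ratio_madd_plus_mul = TARGET_GFLOPS / madd_plus_mul

print(f"MADD only:  {madd_only:.1f} GFLOPs, 1 TFLOP is {ratio_madd_only:.2f}x")
print(f"MADD + MUL: {madd_plus_mul:.1f} GFLOPs, 1 TFLOP is {ratio_madd_plus_mul:.2f}x")
```

Under these assumptions the ratio comes out a bit under 3x when only the MADD is counted and a bit under 2x when the MUL is included.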
 
Or the 1TFLOP G92 could simply be the result of the massive telephone game we call "the Internet". It amazes me how quickly the 1TFLOP/double-precision statement came to be inextricably linked with the then-new rumours of a chip called "G92". It was on the order of hours..

Nvidia is going to have a PR disaster on their hands if G92 is not the chip they were talking about, as the web explodes in accusations of broken promises.</melodrama>

(Anyway, count me in among the skeptics.)
 
Did NV ever go public saying that G92 is the high-end chip?
Did they ever say 1 GFlop would be reached by a single chip soon ?
 
No.. Nvidia never said they would have a 1 GFlop gpu out by year's end. However they did say they would have a 1 TFlop out by year's end ;) Hey I'm all for that not being a high-end card though :D

The quote/rumor started at the Inquirer:

"NVIDIA says its G92 high-end graphics card will deliver almost a teraflop of computing performance. In an analyst webcast, Nvidian Michael Hara says that the chip will be ready for Christmas, a release cycle the company adopted with G80, where high-end products come out for Chrimbo and the mid-range and low-end products hit in the spring."

It really depends on your thoughts on the Inquirer; from what I've seen of them, they aren't terribly inaccurate. I've yet to find another source that isn't just a copy of the Inquirer's. Perhaps others have?
 
So is it the basic consensus that Nvidia is also going with a dual GPU solution for its next high end, à la ATI's R700?

If so, it looks like for the first time in almost a decade I'm going to have to forgo getting any competitor's high end video card. Unless multi-GPU algorithms and implementations get an order of magnitude better than what is currently being done.

And here, I was hoping to be able to pick up Nvidia's card if indeed the high end ATI card was going to be 2xR700's.

Regards,
SB
 
My calculated guess would be for the G92 to be a single 800M transistor die at about 850MHz, with a 512 bit memory bus, 32 ROPs, 40 TMUs, and 160 ALUs at about 2GHz.

The G98 could then simply be half: 256 bit bus, 16 ROP, 20 TMU, 80 ALU.
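
To see how this guess lines up with the "nearly 1 TFLOP" talk, here is a small Python sketch. The class `GuessedChip` and its methods are hypothetical names for illustration, the unit counts are the guesses above, and giving the halved G98 the same 2GHz shader clock is an extra assumption the post does not make.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GuessedChip:
    """A guessed configuration from this post, not a confirmed spec."""
    name: str
    bus_bits: int
    rops: int
    tmus: int
    alus: int
    shader_clock_ghz: float

    def peak_gflops(self, flops_per_alu_per_clock: int) -> float:
        # ALUs * flops per ALU per clock * shader clock in GHz.
        return self.alus * flops_per_alu_per_clock * self.shader_clock_ghz

    def halved(self, name: str) -> "GuessedChip":
        # Halve every unit count and the bus; the clock is kept (an assumption).
        return replace(
            self,
            name=name,
            bus_bits=self.bus_bits // 2,
            rops=self.rops // 2,
            tmus=self.tmus // 2,
            alus=self.alus // 2,
        )


g92 = GuessedChip("G92", bus_bits=512, rops=32, tmus=40, alus=160, shader_clock_ghz=2.0)
g98 = g92.halved("G98")

for chip in (g92, g98):
    print(chip.name, "MADD only:", chip.peak_gflops(2), "MADD + MUL:", chip.peak_gflops(3))
```

With the MUL counted, the guessed G92 works out to 960 GFLOPs, which would fit "nearly 1 TFLOP"; with the MADD alone it is 640 GFLOPs.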
 
The G98 could then simply be half: 256 bit bus, 16 ROP, 20 TMU, 80 ALU.

Why should G98 be 1/2 of a high-end G9x? :???: NV needs an opponent for RV610. ;)

G92 is a performance-GPU.
Those who want 1 TFLOPs in one chip and more units than on G80 will have to look for another codename...
 