NVIDIA GF100 & Friends speculation

so is it another chip with two geometry cluster things? something huge like 288SP?

GF108 could then be a third of GF106 (96SP, 64bit bus)

288SPs would be a bit weird, and way too close to GF104, especially since the most popular variant only has 336SPs enabled.

My money's on 256.
 
288SPs would be a bit weird, and way too close to GF104, especially since the most popular variant only has 336SPs enabled.

My money's on 256.
I severely doubt GF106 will again use SMs with 32 SPs. And 256 isn't a multiple of 48. Half a GF104 would be 192 SPs, but maybe it will still have 2 rasterizers -umm- GPCs and the rumored 192-bit memory interface. I don't think we will see a 5-SM GF106 with 240 SPs.
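A quick divisibility check of the SP counts being tossed around, assuming GF100-style 32-SP SMs vs GF104-style 48-SP SMs (the helper is hypothetical, just to make the arithmetic explicit):

```python
# Which rumored SP totals divide evenly into 32-SP SMs (GF100 style)
# vs 48-SP SMs (GF104 style)?
def sm_breakdown(total_sps):
    return {sm_size: total_sps // sm_size
            for sm_size in (32, 48)
            if total_sps % sm_size == 0}

for total in (192, 240, 256, 288, 336):
    print(total, sm_breakdown(total))
# 192 -> {32: 6, 48: 4}   half a GF104 either way
# 240 -> {48: 5}          the 5-SM option
# 256 -> {32: 8}          only works with 32-SP SMs
# 288 -> {32: 9, 48: 6}
# 336 -> {48: 7}          the popular GF104 variant's enabled count
```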
 
so is it another chip with two geometry cluster things? something huge like 288SP?

GF108 could then be a third of GF106 (96SP, 64bit bus)

I dare to say that's impossible.

GF104 has that unusual shape because it has two GPCs sitting next to each other.
Since GF106 is (almost) a nice square again, I'm sure it has only 1 GPC.


Alexko said:
My money's on 256.
At that die size it had better be, along with 24 ROPs/192-bit; otherwise its perf./mm² will be more like GF100's than GF104's.
 
So if it is 256SPs then what kind of performance would be expected?

Would it be on par with Juniper or worse? Very interested for HTPC; I'd like a nice card for it, for casual games and Flash acceleration for Flash games.
 
I severely doubt GF106 will again use SMs with 32 SPs. And 256 isn't a multiple of 48. Half a GF104 would be 192 SPs, but maybe it will still have 2 rasterizers -umm- GPCs and the rumored 192-bit memory interface. I don't think we will see a 5-SM GF106 with 240 SPs.

Hmmm, I was actually betting on 5 of 6 SMs enabled for a 240 SP / 40 TMU part. It certainly seems bigger than the typical x6 part, so I wouldn't rule out yield harvesting by disabling an SM, just like they did for GF100 and GF104.
 
Would it be on par with Juniper or worse? Very interested for HTPC; I'd like a nice card for it, for casual games and Flash acceleration for Flash games.

For HTPC it'll be interesting if they finally support bitstreaming. As well, it's still questionable if they can approach the perf/watt of the AMD chips.

GF104 compares quite well in its price segment because it's in direct competition with a salvage chip (the 5830), which often has worse power consumption than the 5850 or even the 5870.

Once you dip below that you're now going against more power efficient chips once again.

Regards,
SB
 
G92 and G94 both have the same GDDR bus width, 256-bit. Seems possible that GF104 and GF106 are the same, both with 256 bits. Fillrate is still king.

Then GF106 would have a single GPC, maximum of 192 MADs per clock.

GF108 would be 128-bit with a single GPC, too, I suppose, perhaps with only 2 SMs.
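The "192 MADs per clock" figure falls straight out of the GPC arithmetic; a small sketch (the hot clock is a hypothetical GTX 460-class value, not a figure from the thread):

```python
# Peak-MAD math for a single GF104-style GPC: 4 SMs of 48 ALUs each.
sms_per_gpc = 4
alus_per_sm = 48
shader_clock_ghz = 1.35  # hypothetical GTX 460-class hot clock

mads_per_clock = sms_per_gpc * alus_per_sm            # 192
peak_gflops = mads_per_clock * 2 * shader_clock_ghz   # a MAD counts as 2 flops
print(mads_per_clock, round(peak_gflops, 1))          # 192 518.4
```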
 
For HTPC it'll be interesting if they finally support bitstreaming. As well, it's still questionable if they can approach the perf/watt of the AMD chips.

What does perf/watt mean for an HTPC card?
Because the GTX 460 needs less power while idling (on par with a 5770) and during Blu-ray playback (lower than a 5770): http://ht4u.net/reviews/2010/zotac_geforce_gtx_460/index12.php
http://ht4u.net/reviews/2010/zotac_geforce_gtx_460/index13.php
I think nobody cares that a GTX 460 needs 10 watts more while it's 6% slower...
 
G92 and G94 both have the same GDDR bus width, 256-bit. Seems possible that GF104 and GF106 are the same, both with 256 bits. Fillrate is still king.

Then GF106 would have a single GPC, maximum of 192 MADs per clock.
I think that makes zero sense. GF104 isn't particularly fillrate/ROP limited to begin with (there's not really that much difference between the 768MB and 1024MB versions), and if you only have 4 SMs you can only output 8 (color) pixels per clock anyway.
I totally don't understand why a 128-bit / 1 GPC / 4 × 48-core-SM chip would be that big though...
 
I think that makes zero sense. GF104 isn't particularly fillrate/ROP limited to begin with (there's not really that much difference between the 768MB and 1024MB versions),
Yes, 4-10% margin in favour of the latter is not much. Maybe running both at ~800MHz would show more difference?

and if you only have 4 SMs you can only output 8 (color) pixels per clock anyway.
GF100 is a bit strange like this...

I totally don't understand why a 128-bit / 1 GPC / 4 × 48-core-SM chip would be that big though...
I think the ROPs/MCs (and related stuff that only appears 3 times) in the central area of GF100 are only about 25% of that central area, i.e. the central area (excluding ROPs/MCs etc.) in GF104 is really costly, roughly as much as a GPC.
 
Thank you for your feedback.

In one of our previous articles several years ago we carried out an image quality investigation, and the results were that ATI and Nvidia use different methods of transparency antialiasing (TAA). Due to those differences, the actual quality of ATI's "quality" (super-sampling) setting is closer to Nvidia's multi-sampling. In fact, ATI's TAA SS is not pure super-sampling.

We also found out that no matter what test you are running, the worst performance hit with TAA enabled is around 2%. Modern graphics cards have a very advanced AA-optimized architecture, which makes not only TAA but also some FSAA modes have minimal impact on performance. Since not all scenes feature alpha textures in substantial enough quantities to have a tangible impact on performance, the TAA mode is hardly crucial at all, which is why we simply tend to set image quality to similar levels.

We stay committed to our previous findings. Since it is impossible to make games look the same on ATI Radeon and Nvidia GeForce, we simply tend to set driver settings so that image quality is as close as possible on both.
Xbitlabs official response from the comment section..

http://www.xbitlabs.com/discussion/6404.html

Maybe they were referring to this one:
http://www.xbitlabs.com/articles/video/display/3way-sli-crossfire_5.html

The link compares GT200 vs. HD4xxx; they should have made a contemporary comparison, because HD5xxx supposedly improved SSAA quality.

Also:
We also found out that no matter what test you are running, the worst performance hit with TAA enabled is around 2%.
Could anyone validate this claim?
 
Yes, 4-10% margin in favour of the latter is not much. Maybe running both at ~800MHz would show more difference?
Likely, but probably not that much. Besides, probably nothing stops nvidia from upping the memory clock to 1GHz on GF106-based products (if they want to use the same RAM chips - apparently faster ones can't be that much more expensive, considering AMD is using them on the HD5770). Of course that won't increase the ROP throughput - but this is already overkill anyway imho.
I think the ROPs/MCs (and related stuff that only appears 3 times) in the central area of GF100 are only about 25% of that central area, i.e. the central area (excluding ROPs/MCs etc.) in GF104 is really costly, roughly as much as a GPC.
Hmm so what's taking up all the space? In theory scaling back GF104 to GF106 should shrink size a bit more than Cypress -> Juniper - less shared logic. But maybe that's not the case... If, however, GF106 is larger because it's more than a half GF104, my bet would be an additional SM, not 256bit memory interface.
 
That 2% figure maybe happened in a game with no foliage? Because AAA can cause big drops on my HD 5770. It's less of a hit than on my HD 4670, but it can still be quite a difference. I fired up some GPU-limited scenes in Mass Effect, and there was definitely a difference. With 8x MSAA I got 60 fps almost everywhere at 1440x900. With AAA added I dropped to 30, 20, even 15 fps in places.
 
http://www.xtremesystems.org/forums/showthread.php?t=256740

[attached image: 1zdp3xx.jpg - board photo]
 

OK, so the variable part of the PCI-E connector is supposed to be 71.65mm long according to Wikipedia, and in this picture I measured it at 973 pixels. So that's 1 mm = 13.58 pixels.

And I measured the die at 151 × 154 pixels, or 11.2 × 11.34 mm = 127 mm². That's pretty big, 27% over Redwood (probably closer to ~23% without the packaging, and accounting for a slight distortion of the image). I guess NVIDIA doesn't intend to compete with Cedar at all.
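The pixel-to-mm conversion above, reproduced as a sketch (the reference length and all pixel counts are the ones quoted in the post):

```python
# Estimate die area from a board photo using the PCI-E edge connector
# as the scale reference.
PCIE_EDGE_MM = 71.65                # variable part of the PCI-E connector
edge_px = 973                       # that same edge measured in the photo

px_per_mm = edge_px / PCIE_EDGE_MM  # ~13.58 px/mm

die_w_mm = 151 / px_per_mm          # ~11.1 mm
die_h_mm = 154 / px_per_mm          # ~11.3 mm
die_area = die_w_mm * die_h_mm      # ~126-127 mm²
print(round(px_per_mm, 2), round(die_area, 1))
```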
 
The memory chips are interesting: 2Gbit DDR3, 800MHz, 16-bit. Either this board has another 4 of these chips on the back (and 2GB of memory, which is total overkill for this performance class), or it's going to be very bandwidth-constrained (64-bit DDR3 interface - with such a memory interface it couldn't be more than a Cedar competitor no matter the die size...). Maybe the chip supports much faster GDDR5, though, and this is just the low-end board.
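For scale, a rough bandwidth sketch of the configurations mentioned, assuming the 800MHz figure is the DDR3 I/O clock (so 1600 MT/s effective); the 4 Gbps GDDR5 rate is purely hypothetical:

```python
# Peak memory bandwidth = bus width (bytes) x effective transfer rate.
def bandwidth_gb_s(bus_bits, mt_per_s):
    return bus_bits / 8 * mt_per_s / 1000

ddr3_64   = bandwidth_gb_s(64, 1600)   # 4 chips x 16-bit  -> 12.8 GB/s
ddr3_128  = bandwidth_gb_s(128, 1600)  # 4 more chips on the back -> 25.6 GB/s
gddr5_128 = bandwidth_gb_s(128, 4000)  # hypothetical GDDR5 variant -> 64.0 GB/s
print(ddr3_64, ddr3_128, gddr5_128)
```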
 
OK, so the variable part of the PCI-E connector is supposed to be 71.65mm long according to Wikipedia, and in this picture I measured it at 973 pixels. So that's 1 mm = 13.58 pixels.

And I measured the die at 151 × 154 pixels, or 11.2 × 11.34 mm = 127 mm². That's pretty big, 27% over Redwood (probably closer to ~23% without the packaging, and accounting for a slight distortion of the image). I guess NVIDIA doesn't intend to compete with Cedar at all.
Well, that would be smaller than GT215. So if this can indeed compete with Redwood (read: it needs to be a bit faster than GT215 in its fastest possible configuration), I think that would be very good - more features, more performance, smaller die. A ~25% or ~20mm² die-size difference versus the competition isn't going to cost that much more.
The board shown, though, certainly isn't an HD5670 competitor...
 
Well, that would be smaller than GT215. So if this can indeed compete with Redwood (read: it needs to be a bit faster than GT215 in its fastest possible configuration), I think that would be very good - more features, more performance, smaller die. A ~25% or ~20mm² die-size difference versus the competition isn't going to cost that much more.
The board shown, though, certainly isn't an HD5670 competitor...

With the update to GDDR5 for the 5500 series, I wonder if it would compete with that either in such a configuration. Anywho, maybe it's just me and my aging eyes, but the text on the GPU - "GF108-200-A1 QUAL SAMPLE" - looks a bit odd even considering the image distortion; the "shadow"-like effect doesn't look like it's at the correct perspective given the camera lens position... but as I said, hey, it could be my eyes. In particular, the "QUAL SAMPLE" shouldn't have much, if any, apparent "shadow" (impression/etching), as it would be almost directly under the lens.
 