Hmm, could it be that some of the quads (clusters) are disabled for yield reasons?
Since G80 is, quoting B3D, "..an 8-way MIMD setup of 16-way SIMD SP clusters. Inwardly, each 16 SP cluster is further organised in two pairs of 8.."
Couldn't G84 be a 4-way MIMD setup of 16-way SIMD SP clusters, with each 16 SP cluster likewise organised in two pairs of 8? Given the 80nm process and a die size of around 160mm^2 (not sure about this though), this seems plausible, just like how the G73 was really 4 quads.
So, just pure speculation:
G84
80nm
64 scalar shaders
divided into 4 clusters
16 SP per cluster (8x2)
1 TMU quad per cluster (i.e. per pair of 8-SP arrays), for a total of 16 TMUs
12 ROPs (where G84 is capable of using a 192-bit memory interface - more on this later)
So basically, this is 1/2 of G80. However, from what we can gather from the current rumours:
8600GTS/8600GT
G84
80nm
32 scalar shaders (sp)
divided into 2 clusters (2 out of 4 clusters disabled)
8 TMUs
8 ROPs (two partitions of 4, each partition dedicated to a 64-bit memory channel, i.e. a 128-bit memory interface)
8500GT
G84
80nm
16 scalar shaders (sp)
divided into 1 cluster (3 out of 4 clusters disabled)
4 TMUs
8 ROPs (two partitions of 4, each partition dedicated to a 64-bit memory channel)
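In other words, the unit counts in all three lists fall straight out of the cluster math. Here's a minimal sketch of that arithmetic; the names and the 4-ROPs-per-64-bit-partition figure are my own assumptions, following how G80 is organised:

```python
# Speculative G84 building blocks (my assumptions, following G80's layout):
# one cluster = 16 SPs (two 8-SP arrays) + one 4-TMU quad,
# one ROP partition = 4 ROPs tied to one 64-bit memory channel.
SP_PER_CLUSTER, TMU_PER_CLUSTER, ROP_PER_PARTITION = 16, 4, 4

def g84_config(clusters, partitions):
    return (clusters * SP_PER_CLUSTER,       # scalar shaders
            clusters * TMU_PER_CLUSTER,      # TMUs
            partitions * ROP_PER_PARTITION,  # ROPs
            partitions * 64)                 # memory bus width in bits

print(g84_config(4, 3))  # full G84:   (64, 16, 12, 192)
print(g84_config(2, 2))  # 8600GTS/GT: (32,  8,  8, 128)
print(g84_config(1, 2))  # 8500GT:     (16,  4,  8, 128)
```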
i.e. the current 8600/8500 series are in fact G84s with 2 or 3 clusters disabled. However, it gets more interesting when the latest drivers list these:
// 0400 - NVIDIA GeForce 8600 GTS
// 0401 - NVIDIA G84-350
// 0402 - NVIDIA GeForce 8600 GT
// 0403 - NVIDIA G84-200
// 0404 - NVIDIA G84-100
// 0405 - NVIDIA G84-50
Going by the initial performance benchmarks against last-gen high-end cards, the 8600GTS is clearly struggling to keep up with even the slowest of them, e.g. the 7900GT and X1950pro. Most notably, the gap to the 8800GTS 320MB is quite big, which makes the idea of the 8600GTS being the bridge product somewhat illogical. This is somewhat backed up by the fact that the initial GTS versions will only have 256MB. (Could also be for marketing reasons..)
So my theory right now is that nVIDIA actually has faster versions of the G84 core that aren't scheduled for launch just yet. The reason could be yields, or they see no point in releasing a product that performs similarly to the 8800GTS 320MB, or they are planning to release it when the RV630 hits the market, etc.
These faster G84s could be any of the unnamed G84-xxx entries in the list above, and one or two of them could be the products slotting in between the 8600GTS and the 8800GTS 320MB. There are also rumours of a possible "ultra" version of the 8600 series, and there may well be a GTX version as well (similar to the monikers used back in the FX days, where the high/mid/low ranges each had their own flagship card in the form of the 5900/5700/5200 Ultra).
So, on pure speculation:
8600ultra
G84
80nm
core clock 800MHz? (800MHz was possible on stock volts/stock cooling, but this version could potentially employ a dual-slot cooler, e.g. just like the RV630XT)
64 scalar shaders
16 TMUs
12 ROPs
192-bit memory interface (if they managed 384-bit for the high end, why not for the mid range? A move to 256-bit seems unrealistic for cost reasons, and 128-bit sounds unrealistic for a next-gen DX10 mid-range card that might be bottlenecked by bandwidth and is expected to perform much faster than last-gen high-end cards) - i.e. this means a new PCB, with 6 memory chips in total
GDDR4, 384 or 768MB? (64 or 128MB per memory chip?)
memory clock 2400MHz (effective)?
bandwidth of ~57.6GB/s
shader clock of roughly 1500MHz or more (not quite sure)
note - this could potentially replace the 8800GTS 320MB, given its performance and the coming replacement of the G80 core (we should hopefully be seeing the G80 refresh sometime soon)
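For the bandwidth figure above, a quick sanity check (assuming 2400MHz is the effective data rate of the GDDR4):

```python
# 192-bit bus at 2400MHz effective: GB/s = (bits / 8) * MT/s / 1000
bus_bits, effective_mhz = 192, 2400
print(bus_bits / 8 * effective_mhz / 1000)  # 57.6 (GB/s)
```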
So, using Jawed's method of comparison (the FLOPs figure is shaky since I'm not sure what the shader clock would be; a quick sketch of the arithmetic follows the list), the 8600 ultra has:
- bandwidth of 57.6GB/s
- fillrate of 9600MP/s
- AA fillrate of 38400MP/s
- and a zixel rate of 76800MP/s
- bilinear rate of 12800MT/s
- trilinear rate of 25600MT/s
- 64 SPs @ 1750MHz = 224 GFLOPs (for comparison's sake; I arrived at 1750MHz because the shader clock on every G8x so far seems to follow the equation: shader clock = (core clock × 2) + 150MHz)
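To show where those numbers come from, here's the per-clock arithmetic; the 4 AA samples and 8 zixels per ROP per clock, the doubled trilinear figure, and counting the MADD as 2 flops are my assumptions based on how G80's units are usually counted:

```python
core_mhz = 800
shader_mhz = core_mhz * 2 + 150          # the (core x 2) + 150 rule -> 1750MHz
sps, tmus, rops = 64, 16, 12

fillrate   = rops * core_mhz             # 9600 MP/s
aa_rate    = fillrate * 4                # 38400 MP/s (assuming 4 AA samples/ROP/clock)
zixel_rate = fillrate * 8                # 76800 MP/s (assuming 8 zixels/ROP/clock)
bilinear   = tmus * core_mhz             # 12800 MT/s
trilinear  = bilinear * 2                # 25600 MT/s (assuming 2 filter units per TMU)
gflops     = sps * shader_mhz * 2 / 1000 # 224 GFLOPs (MADD counted as 2 flops)
```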
So in percentages, 8600 ultra vs 8800GTS:
- bandwidth = 90%
- fillrate = 96%
- AA fillrate = 96%
- zixel rate = 96%
- bilinear rate = 107%
- trilinear rate = 107%
- GFLOPs = 97%
Finally, stock 8600GTS (675/2000) vs 8600 ultra (see the sketch after this list):
- bandwidth = 56%
- fillrate = 56%
- AA fillrate = 56%
- zixel rate = 56%
- bilinear rate = 42%
- trilinear rate = 42%
- GFLOPS = 43%
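And a sketch reproducing both percentage tables. The 8800GTS figures (500MHz core, 1200MHz shader, 96 SPs, 24 TMUs, 20 ROPs, 320-bit @ 1600MHz effective) are the well-known specs; for the stock 8600GTS I've assumed a 1500MHz shader clock via the (core × 2) + 150 rule:

```python
def metrics(core, shader, sps, tmus, rops, bus_bits, mem_mhz):
    fill = rops * core
    return [bus_bits / 8 * mem_mhz / 1000,  # bandwidth, GB/s
            fill, fill * 4, fill * 8,       # fill / AA / zixel rates
            tmus * core,                    # bilinear rate
            sps * shader * 2 / 1000]        # GFLOPs

ultra    = metrics(800, 1750, 64, 16, 12, 192, 2400)
gts_8800 = metrics(500, 1200, 96, 24, 20, 320, 1600)
gts_8600 = metrics(675, 1500, 32,  8,  8, 128, 2000)

for u, hi, lo in zip(ultra, gts_8800, gts_8600):
    print(f"ultra/8800GTS: {u/hi:.0%}   8600GTS/ultra: {lo/u:.0%}")
# prints 90/96/96/96/107/97 in the first column and 56/56/56/56/42/43 in the second
```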
And in between the 8600 ultra and the 8600GTS there could simply be more 8600 series parts to fill in that gap, e.g. a 3-cluster G84, and so on.
Ok, I think I went a little too far with the speculation, so half of it probably doesn't make much sense.
I think this has to do with the G84 being so underwhelming for us, and that nVIDIA might have underestimated the mid range this gen. Even the 7600GT was quite an impressive mid-range card (beating the 6800 Ultra across the board, even at high res/AA/AF).
All this is moot if the bridge product is in fact a 256-bit variant of G8x (maybe a refresh, or another cut-down G80), and if the RV630XT does in fact perform similarly to the 8600GTS, trading blows with the competition.
Off to bed.