NVIDIA GF100 & Friends speculation

I'd always gone with the outer ring being for power and ground plane(s), with the inner block for I/O, but with the gap still marking the chip edge. Throw out ~500mm² again then, I guess. You both make good points.

It's actually the opposite. The outer ring is for I/O because signals can escape more easily out at the edge, while the inner block is power and ground because that is what is left. If you put the I/O on the inner block you'd have to add another couple of board layers just to get it out of there.
 
I'd always gone with the outer ring being for power and ground plane(s), with the inner block for I/O, but with the gap still marking the chip edge.
On the chip itself I believe the centre is normally purely power/ground. The edges of the chip (where all the I/O lives) are also where the I/O pads are.

Normally I/O pad density is limited by also having to deliver power to the periphery of the chip (quite a bit of power, too), which is where I believe this patent comes in:

http://v3.espacenet.com/publication...T=D&date=20090205&CC=US&NR=2009032941A1&KC=A1

So, I would expect the substrate to follow a similar pattern, power only in the centre and mixed power and I/O around the periphery.

This patent shows the IO Ring clearly in figure 1:

http://v3.espacenet.com/publication...T=D&date=20050428&CC=US&NR=2005087888A1&KC=A1

(The forum software is still fucking up pasted URLs - so no link text for those URLs)

and goes on to show how the area of I/O can be optimised.

Jawed
 
If the 470 is somewhere between a 5850 and a 5870, then is this GF104 @330mm² thing going to compete with Juniper @181mm² (or the 5830 for that matter)? Is the entire architecture fucked up?
GF100's ROPs/Memory looks like a bit of a kludge: more fillrate than rasterisation rate. Perhaps GF100 is 384-bit because NVidia was hedging its bets on bandwidth (time to market, chips available at time of launch)?

Look how close 8800GTS-512 is to 8800GTX. Sure the latter had some advantages.

How much die size can NVidia save simply by deleting 128 bits of memory interface + 16 ROPs/L2? 50mm²?

Faster GDDR5 chips would result in HD5870-like bandwidth, obviously.
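
As a rough check on that bandwidth point, here is a minimal sketch of the usual peak-bandwidth arithmetic (bus width in bytes times per-pin data rate). The HD 5870 figures used (256-bit bus, 4.8 Gbps GDDR5) are its published specs; the function names are made up for illustration, and nothing here is a known GF104 spec.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def required_data_rate_gbps(bus_width_bits: int, target_gb_s: float) -> float:
    """Per-pin data rate (Gbps) a bus of this width needs to reach the target bandwidth."""
    return target_gb_s * 8 / bus_width_bits


# HD 5870: 256-bit bus with 4.8 Gbps GDDR5 (published spec).
hd5870 = bandwidth_gb_s(256, 4.8)

# Hypothetical comparison: data rate needed to match that on a 384-bit
# bus versus a 256-bit bus (the "delete 128 bits" case from the post).
for width in (384, 256):
    rate = required_data_rate_gbps(width, hd5870)
    print(f"{width}-bit bus needs {rate:.2f} Gbps for {hd5870:.1f} GB/s")
```

A 384-bit bus reaches the same bandwidth with much slower memory, which fits the "hedging its bets" reading; a 256-bit part would need HD 5870-class GDDR5 speeds.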

NVidia could delete a GPC, or make them all 3 SMs. Either way, increase clocks. An entire GPC is ~60mm² I guess.

So that's 110mm² in cuts (1 GPC, 16 ROPs, portion of L2). Assuming GF100 is 480mm², that leaves 370mm² for GF104 :LOL:

Jawed
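
The area arithmetic in the post above can be sketched as follows. Every number is one of the post's own guesses (480mm² for GF100, ~60mm² per GPC, ~50mm² for 128 bits of memory interface plus 16 ROPs/L2), not a measured die size, and the names are hypothetical.

```python
# All figures are guesses taken from the post, not measurements.
GF100_AREA_MM2 = 480.0

cuts_mm2 = {
    "one GPC": 60.0,                      # "~60mm² I guess"
    "128-bit memory + 16 ROPs/L2": 50.0,  # "50mm²?"
}


def cut_down_area(base_mm2: float, cuts: dict) -> float:
    """Estimated die area after removing the listed blocks."""
    return base_mm2 - sum(cuts.values())


gf104_estimate = cut_down_area(GF100_AREA_MM2, cuts_mm2)
print(f"total cuts: {sum(cuts_mm2.values()):.0f} mm²")
print(f"estimated cut-down die: {gf104_estimate:.0f} mm²")
```

With these inputs the estimate lands at 370mm², still above the ~330mm² figure floated for GF104 earlier in the thread, so the guesses would need to be more aggressive to reach it.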
 
The question is also what area could they save without the DP and ECC bloat which sits there for nothing. And maybe the absence of it could help increase the clocks too.
 
Were these already posted?

NVIDIA GeForce GTX 470 benchmarks surfaces | IT SHOW 2010 | VR-Zone | Gadgets | PC Enthusiasts

gtx4703dmark.jpg


gtx470benchs.jpg
 
GF100's ROPs/Memory looks like a bit of a kludge: more fillrate than rasterisation rate. Perhaps GF100 is 384-bit because NVidia was hedging its bets on bandwidth (time to market, chips available at time of launch)?

Look how close 8800GTS-512 is to 8800GTX. Sure the latter had some advantages.

How much die size can NVidia save simply by deleting 128 bits of memory interface + 16 ROPs/L2? 50mm²?

Faster GDDR5 chips would result in HD5870-like bandwidth, obviously.

NVidia could delete a GPC, or make them all 3 SMs. Either way, increase clocks. An entire GPC is ~60mm² I guess.

So that's 110mm² in cuts (1 GPC, 16 ROPs, portion of L2). Assuming GF100 is 480mm², that leaves 370mm² for GF104 :LOL:

Jawed

Yeah, but by all accounts, a 3-GPC chip will be substantially faster than Juniper. I guess it will fit the 5770-5850 hole.

My understanding is that putting 3 SMs in each GPC will involve more work than is suggested by the gap between GF10x and GF100. Doing it makes the whole Fermi modularity kinda redundant.
 
The question is also what area could they save without the DP and ECC bloat which sits there for nothing. And maybe the absence of it could help increase the clocks too.

That "bloat" is likely <5% overall.
 
Yeah, but by all accounts, a 3-GPC chip will be substantially faster than Juniper. I guess it will fit the 5770-5850 hole.
That's where I intended, or a bit faster.

My understanding is that putting 3 SMs in each GPC will involve more work than is suggested by the gap between GF10x and GF100. Doing it makes the whole Fermi modularity kinda redundant.
It's a matter of trading tessellation/setup/rasterisation rates. Don't have a good feel for where the correct ratios lie. e.g. rasterisation:fillrate is "crooked" in GF100.

Jawed
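
To put a number on the "crooked" rasterisation:fillrate ratio, here is a small sketch. It assumes the per-clock figures from NVIDIA's public GF100 description: one raster engine per GPC at 8 pixels/clock, and 48 ROPs at one pixel per clock each. Those constants and the helper name are assumptions for illustration, and the 3-GPC/32-ROP configuration is only the hypothetical cut discussed earlier in the thread.

```python
PIXELS_PER_RASTER_ENGINE_PER_CLK = 8  # assumed: one raster engine per GPC


def fill_to_raster_ratio(rops: int, gpcs: int) -> float:
    """ROP pixels/clock divided by rasterised pixels/clock."""
    raster_rate = gpcs * PIXELS_PER_RASTER_ENGINE_PER_CLK
    return rops / raster_rate


configs = {
    "GF100 (4 GPCs, 48 ROPs)": (48, 4),
    "hypothetical cut (3 GPCs, 32 ROPs)": (32, 3),
}

for name, (rops, gpcs) in configs.items():
    print(f"{name}: fill:raster = {fill_to_raster_ratio(rops, gpcs):.2f}")
```

Under these assumptions both configurations still have more ROP throughput than rasterisation throughput per clock, so the cut narrows the imbalance rather than removing it. Clock domains are ignored here.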
 