The G92 Architecture Rumours & Speculation Thread

Status
Not open for further replies.
Actually, looking at the "spiritual" predecessor of G92 (the G84), it seems bandwidth is key here, but it's not the only factor:

G92 (8800GT) -- 57.6 GB/s, achieving 75% of its 33.6 GTex/s theoretical fillrate;
G84 (8600GTS) -- 32 GB/s, achieving 70% of its 10.8 GTex/s theoretical fillrate;
Eh, so these numbers show that bandwidth has nothing to do with the achieved multi-texturing fillrate (and in fact I think that's rarely the case for most cards, at least without using float textures).
8600GTS has more than half the bandwidth of 8800GT, but achieves less than a third of the multi-texture fillrate.
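A quick back-of-the-envelope check of that comparison, using the figures quoted above (the "achieved" rates are simply the quoted percentages of theoretical, so treat this as a sketch rather than measured data):

```python
# Bandwidth vs. achieved multi-texture fillrate, per the figures quoted above.
bw_8800gt, bw_8600gts = 57.6, 32.0   # GB/s
fill_8800gt = 0.75 * 33.6            # ~25.2 GTex/s achieved (8800GT)
fill_8600gts = 0.70 * 10.8           # ~7.56 GTex/s achieved (8600GTS)

print(bw_8600gts / bw_8800gt)        # ~0.56: more than half the bandwidth...
print(fill_8600gts / fill_8800gt)    # ~0.30: ...but less than a third of the fillrate
```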
 
25 GTex/s is no mystery!

Guys, AnarchX already explained the 3DMark texturing rate. It has nothing to do with bandwidth.

You need U,V interpolation for each texture you fetch. Interpolation is done in the special function unit, which has 1/4 the throughput of the SPs. 1.8GHz * 112 * (1/4) / (2 interpolations per fetch) = 25.2GTex/s
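That arithmetic as a minimal sketch (the 1/4 SFU throughput ratio and the two interpolations per fetch are as stated above; the 1.8 GHz clock and 112-SP count are the poster's assumptions):

```python
def interp_limited_gtex(shader_clock_ghz, num_sps,
                        sfu_ratio=0.25, interps_per_fetch=2):
    """Texel-rate ceiling (GTex/s) if attribute interpolation in the SFU
    is the bottleneck: clock * SPs * (SFU ratio) / (interps per fetch)."""
    return shader_clock_ghz * num_sps * sfu_ratio / interps_per_fetch

print(interp_limited_gtex(1.8, 112))  # ~25.2 GTex/s, the figure above
```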

EDIT: G86 results are still weird, though.
Very low additional cost eh? :D
Yup. I'm assuming dependent texturing isn't made any faster (i.e. same register file), so basically G92 can fetch pairs of bilinear textures as fast as G80 can do single fetches. It won't give twice the speed in all scenarios, even when iterators aren't a problem, but it should be helpful nonetheless.

The only extra requirements are texcoord->mem location calcs (including LOD). Everything else is already there from the double TF units in G80.

Unfortunately, it means that G92 will benefit from brilinear mipmap optimization where G80 didn't, so we might see some decreased IQ.
 
Mintmaster said:
You need U,V interpolation for each texture you fetch. Interpolation is done in the special function unit, which has 1/4 the throughput of the SPs. 1.8GHz * 112 * (1/4) / (2 interpolations per fetch) = 25.2GTex/s
I thought they were 1.5GHz? /boggle

Mintmaster said:
Unfortunately, it means that G92 will benefit from brilinear mipmap optimization where G80 didn't, so we might see some decreased IQ.
Surely, that won't be much of a performance issue.
 
Hence my original point about the apparent wastefulness of G80's TA:TF ratio.
Well, we don't know the impact yet, do we? It's unlikely that there are many pixels in most games where G80 is both texture bound and only needs one bilinear sample per fetch. Marginal gain at a marginal cost.
 
Yes, it seems that Pete's table has been updated.
Sorry, I initially entered the memory clock for some reason.

Thanks for the attr. interp. calc.; assuming it's correct (as you noted, it doesn't quite line up with available test results :)), plugging in the 1.5GHz SP clock gives only 21 GT/s theoretical for the 8800GT. Things are even worse for the 8600GTS, for which your equation yields only 5.4 GT/s versus the 7.6 GT/s it achieves in 3DM06. Could tex. coord. reuse be so high? Are you underestimating the amount of work the SFU can do per clock? (I'm trying to remember the complaint about the validity of 3DM's fillrate numbers, though I don't recall whether it concerned pixel or texel fillrate--maybe that could explain the discrepancy, too.)

RightMark's fillrate numbers are closer to your equation: they saw 5.8 GT/s max for a 675/1350 8600GTS (though they doubt their own test, given the discrepancy on NV's GPUs). RightMark's fillrate test yields lower rates than 3DM06's, and ATI's GPUs score closer to their theoretical tex. units * clock numbers.
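The same equation plugged with the clocks being debated here (a sketch: 32 SPs for G84 and the 1.35 GHz shader clock of RightMark's 675/1350 part are my assumptions, not confirmed figures):

```python
def interp_limited_gtex(shader_clock_ghz, num_sps):
    """Interpolation-limited texel rate: clock * SPs * (1/4 SFU ratio) / 2 interps."""
    return shader_clock_ghz * num_sps / 4 / 2

# 8800GT (112 SPs) at the two disputed shader clocks
print(interp_limited_gtex(1.8, 112))   # ~25.2 GTex/s
print(interp_limited_gtex(1.5, 112))   # ~21.0 GTex/s
# 8600GTS (assumed 32 SPs) at 1.35 GHz, i.e. the 675/1350 part
print(interp_limited_gtex(1.35, 32))   # ~5.4 GT/s, vs 7.6 GT/s measured in 3DM06
```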
 
Very basic review, but at least we can be certain the 3DMark fillrate test from the random Asian forum was legit, given the 8800GT's incredible performance without filtering.

Also, with some good cooling I think 750MHz on the core should be common. But if that's true, will it even matter for performance, given memory bandwidth limitations?
 
Quite weird performance numbers in this review, but nonetheless looking at the temperature tests G92 finally "beats" the 2900 line in heat output. That confirms some reports of overheating (read: 100+°C) in synthetic testing. Bring back the dual-slot goodie from 7900GTX!
 
Yeah, but power consumption is less; I guess single-slot cooling was more important for Nvidia than heat output at the back. I wonder if XFX will modify the reference cooling, because I don't think the stock cooler would be worthy of the XXX editions. Heck, before Newegg took it down they had an eVGA card with 700/2000 clocks.
 
The fillrate question is now settled for certain: buy.com has updated its PNY 8800GT listing and added a picture.

Tech Specs:

BUS Technology: PCI Express 2.0 (backwards compatible with PCI Express)
Memory Amount: 512MB
Memory Interface: 256-bit
Memory Bandwidth: 57.6 GB/sec
Fill Rate: 33.6 billion texels/sec
Stream Processors: 112
Shader Clock: 1500 MHz
Core Clock: 600 MHz
Memory Frequency (effective): 1800 MHz
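For what it's worth, the listed clocks are self-consistent with the bandwidth and fillrate figures. A quick sanity check (the 56-TMU count for G92 is my assumption, not part of the listing):

```python
# Derive the listing's headline numbers from its clocks.
mem_clock_mhz = 1800      # effective memory frequency
bus_width_bits = 256
core_clock_mhz = 600
tmus = 56                 # assumed: 7 clusters x 8 texture address units

bandwidth_gb_s = mem_clock_mhz * 1e6 * bus_width_bits / 8 / 1e9
fillrate_gtex_s = core_clock_mhz * 1e6 * tmus / 1e9
print(bandwidth_gb_s)     # 57.6, matching the listed bandwidth
print(fillrate_gtex_s)    # 33.6, matching the listed fill rate
```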

http://www.buy.com/prod/geforce-xlr8-8800gt-pcie-512mb-ctlrdvi-i-hdtv/q/loc/101/206166489.html
 
It's because of the temperature/power that big OEMs like Dell decided to go in favor of the RV670.

However, for us enthusiasts, we can slap on a Thermalright HR-03 or even the Zalman VF1000 and OC this sucker to a whole new level of awesomeness :devilish:

IMO they should have just gone with a dual-slot card if there are thermal issues, though.
 