On paper maybe, but in real-world apps there are worlds between the efficiency of both solutions.
Would this chip not have the potential to go toe-to-toe with GTX 280?
Looks like a typo to me (should maybe be 2065?), in all other tests the 8800U is very close to the 9800GTX.
The 256mm² die size is just what GPU-Z shows, so it could be wrong/off.
The stand-out there is 8800U scoring 3065 in Vantage Extreme. That's quite a bit higher than the seemingly rumoured 2800 for HD4870, which would indicate that ATI is still far off being able to fully use ~124GB/s.
The indication is also that GTX 280 will end up ~60%+ faster than 8800U. In this case that's scaling that's better than the increase in bandwidth.
Jawed
Yeah, it's been bugging me too. So 5 SIMDs of 160 ALUs with the horrible batch size is still a contender.
My problem with this ALU number is this. I'm assuming it's still 16 shaders / 80 ALUs per processor. So we're looking at 10 processors - a 150% increase in shading and control logic.
Now my question is... if there's a 150% increase in control + shading logic accompanied by a 25% increase in die size, that means control+ALU in RV670 is taking up very little space. So what's taking up the rest?
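To put a rough number on that question, here is a minimal sketch. It assumes (purely for illustration, none of this is confirmed) that area grows linearly with unit count, that nothing besides control+ALU logic grows, and that both chips are on the same process. The function name is made up for this example.

```python
def max_logic_fraction(logic_growth: float, die_growth: float) -> float:
    """Upper bound on the share of the old die taken by the logic that grew.

    If that logic occupied fraction f of the old die and grows by
    logic_growth (1.5 means +150%), it adds f * logic_growth to the die.
    That addition has to fit inside die_growth (0.25 means +25%),
    so f <= die_growth / logic_growth.
    """
    return die_growth / logic_growth


# Figures from the post: +150% control + shading logic, +25% die size.
fraction = max_logic_fraction(logic_growth=1.5, die_growth=0.25)
print(f"control+ALU share of the RV670 die: at most {fraction:.1%}")
```

Under those assumptions control+ALU could be no more than about a sixth of RV670, which is why the question of what fills the rest comes up.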
In my opinion reducing the clocking of batches is as unlikely as a different clock for the ALUs. I think it would be a huge architectural change.
In terms of R580 they only increased ALUs and control was untouched, so batch size tripled as well. Of course batch size on RV770 might have gone up too, or batches could run for less than 4 clocks, which would offset any increases in width.
Register file memory should be very dense. Though I'm guessing that for porting reasons there are multiple copies of the register file (writes go to all copies - in reading, separate instances of the register file fetch distinct operands for the same clock).
That's another thing... wouldn't RV770's global register file be much larger than RV670's to support that kind of increase in arithmetic? Like Jawed, I would be gobsmacked if this is confirmed, but I'm still a bit hesitant.
2933 reported on this page:
Looks like a typo to me (should maybe be 2065?), in all other tests the 8800U is very close to the 9800GTX.
Ah, you're right (http://www.digit-life.com/articles3/video/vantage2.html) - 9800GTX around ~2600 (looks like the sucky memory management of the G80/G92 is an issue here with "only" 512MB of video RAM) vs. ~2900 for the 8800U. But in any case the difference shouldn't be as large as in this table.
Nope, the Ultra is quite a bit faster than the GTX at Vantage Extreme. The GTX score looks too low though; it should be around 2500.
You don't really know this. You could be correct that control logic could be quite large, but I've got some doubts about the TMU size being that big, and am pretty sure you're dead wrong about cache size using a large amount of die space (unless you also consider the register file as cache, but I count that as part of the ALU). (Compared to CPUs, GPU caches are still small. Think about it: Intel fits 4MB in about 70 mm^2 on a 65nm node, and RV670 only has 256KB L2 and 4x32KB L1 texture cache. Sure, it has other caches too (for ROPs for instance), but probably nowhere near 4MB in total.)
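A quick back-of-the-envelope version of that cache argument, as a sketch only. It assumes GPU cache SRAM is as dense as the Intel figure quoted above (heavily ported GPU caches are probably less dense) and it counts only the L2 and L1 texture caches named in the post, not the ROP or other caches.

```python
# Reference density taken from the post: 4MB in roughly 70 mm^2 at 65nm.
INTEL_CACHE_KB = 4 * 1024
INTEL_CACHE_MM2 = 70.0


def cache_area_mm2(size_kb: float) -> float:
    """Area a cache of size_kb would need at the reference density."""
    return size_kb * INTEL_CACHE_MM2 / INTEL_CACHE_KB


# RV670 caches named in the post: 256KB L2 plus 4 x 32KB L1 texture cache.
rv670_cache_kb = 256 + 4 * 32
area = cache_area_mm2(rv670_cache_kb)
print(f"{rv670_cache_kb}KB of cache at that density: about {area:.1f} mm^2")
```

Even if the real density were several times worse than assumed here, those caches would still be a small slice of the die.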
Now that you mention it - that seems about the biggest point against 800/40/16 in RV770 - the 256mm² as a given.
Finally, look at the RV635->RV670 comparison: size goes up only 70 mm^2 and you have almost 2.5 times the shaders, double the TMUs and double the bus width.
What about the additional 12 ROPs?
Because going from RV635 to RV670 you'd cram 200 ALUs, 8 TMUs and a 128-bit memory interface into the 70mm² (which I take for granted; I don't know the die size of RV635 myself).
50% faster? That's what whocares' numbers are suggesting, and there's no data out there supporting that assertion.
Nope, the Ultra is quite a bit faster than the GTX at Vantage Extreme. The GTX score looks too low though; it should be around 2500.
Somehow I counted them under the memory interface, but you're right, they should be mentioned separately. But nonetheless - the ROPs seem to need a little rework from RV670 to RV770 too, so that does not change my overall view.
What about the additional 12 ROPs?
Dave has explained multiple times that the ALU configuration of R6xx is superscalar.
I thought R6xx are still Vec4+scalar but in some cases more flexible.
Maybe in other words:
If you go to Octa-TMUs, you would also go to Octa-ALUs?
Now, RV770 is supposed to be about 64mm² larger than RV670, and the only thing you'd take off that list is the 128-bit memory interface, which allegedly stays the same in RV770 (except for the mysterious non-CrossFire CrossFire card R700, which according to some would need some kind of chip-to-chip interconnect through the ring buses).
So, with even less additional die space, you not only want to add 200 shaders and 8 TMUs but at least 480 shaders and 16 TMUs. Hm.
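The comparison can be sketched in numbers, using only the figures quoted in the thread. The die-size deltas are the posters' estimates, not confirmed values, and the RV635->RV670 step also had to pay for the wider bus and the extra ROPs, so its ALU density is understated here.

```python
def alus_per_mm2(added_alus: int, added_area_mm2: float) -> float:
    """Added ALUs per mm^2 of added die area (ignores everything else added)."""
    return added_alus / added_area_mm2


# RV635 -> RV670: +200 ALUs (and +8 TMUs, +128-bit bus, +12 ROPs) in ~70 mm^2.
step_rv670 = alus_per_mm2(added_alus=200, added_area_mm2=70.0)

# RV670 -> rumoured RV770: +480 ALUs (and +16 TMUs) in ~64 mm^2.
step_rv770 = alus_per_mm2(added_alus=480, added_area_mm2=64.0)

ratio = step_rv770 / step_rv670
print(f"RV635->RV670: {step_rv670:.2f} ALUs per added mm^2")
print(f"RV670->RV770: {step_rv770:.2f} ALUs per added mm^2")
print(f"required density increase: {ratio:.2f}x")
```

So the rumoured configuration would need well over twice the ALUs per added mm² that the previous step managed, before even counting the doubled TMU addition.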
I guess that control logic doesn't grow very much, e.g. there's only 1 more SIMD in RV670, although the SIMDs do get twice as wide. The latter would indicate the SIMD control logic cost is halved per ALU.
I have no idea how cache/register files/control logic work, but is it possible ATI built surplus amounts into their architecture for easy scaling (like RV635 to RV670 for example) and thus didn't really need to change it all that much?
They put excess bandwidth into R600, so is it possible, or doesn't it work like that?