> How many trannies on the NVIO are doing real work, vs just filler die space?

I wrote back in December that "a large part" of NVIO is just filler to get the pad count needed. We knew the percentage at the time but didn't quote it, and I can't seem to find it in a quick check of my notes. I'm confident it's more than 50% though.
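For anyone unsure what "filler to get the pad count" means: a chip that needs a lot of I/O pads can be forced to a minimum die size by the pad ring alone, and whatever the logic doesn't fill is dead area. A minimal sketch of that effect is below; every number in it (pad count, pitch, logic area) is invented purely for illustration, none of them are real NVIO figures.

```python
# Toy illustration of a pad-limited die (all numbers are made up;
# none of them are real NVIO figures).
#
# If a chip needs P pads at a minimum pad pitch, the pad ring fixes a
# minimum die perimeter, and therefore a minimum area, regardless of
# how little logic sits inside it. Whatever the logic doesn't use is
# "filler".

def pad_limited_die(num_pads: int, pad_pitch_um: float, logic_area_mm2: float):
    # Perimeter needed to fit all pads in a single ring around the edge.
    perimeter_mm = num_pads * pad_pitch_um / 1000.0
    # Assume a square die: side = perimeter / 4.
    side_mm = perimeter_mm / 4.0
    min_area_mm2 = side_mm ** 2
    filler_fraction = max(0.0, 1.0 - logic_area_mm2 / min_area_mm2)
    return min_area_mm2, filler_fraction

# Hypothetical example: 300 pads at 80 um pitch, 15 mm^2 of actual logic.
area, filler = pad_limited_die(num_pads=300, pad_pitch_um=80.0, logic_area_mm2=15.0)
print(f"pad-limited die area ~{area:.0f} mm^2, ~{filler:.0%} of it filler")
```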
> Zero evidence? How about the fact that, as you pointed out, there are cases where R600 underperforms R580? And that these cases scale with resolution? It is far more ridiculous for you to say that all performance problems must be driver related and it's impossible for the chip to have a design flaw than for me to say it's a possibility.

I'm sure Eric appreciates being called a liar. The ATI guys have been honest about the problems they've encountered these past few years.
> None of us have any way of quantifying this, so there's not much to say.

No but you and plenty of others keep talking in these terms without having any facts.
> Geez, why can't you get this through your head? Who the hell cares about "equal theoretical rates" comparisons?

I'm interested in architecture, technology, scalability, futures. Apparently you're not.
> ATI and NVidia sure as hell don't. With the same chunk of silicon and same memory BW, NVidia destroyed ATI in the low end last gen, so they got to sell their part at a higher price.

Whoops there you go again, with zero facts.
> And please, stop posting your cherry-picked computerbase.de benchmarks. They gimp NVidia G7x cards by disabling filtering optimizations. ATI is affected too, but not nearly as much. If you don't like English websites, quote digit-life (translated from the Russian ixbt, AFAIK).

They're not cherry-picked. They're about the only benchmarks that cover a lot of recent games for a decent range of cards.
> The fact is that the target market reads anandtech, firingsquad, tomshardware, etc., orders of magnitude more than that niche site, and they also keep their settings at factory default aside from the AA/AF slider, just like these sites do.

And the target market sucks up whatever they're told.
> A half G80 on 65nm is going to be the same size as RV630, give or take.

Do you think when they've added in all the extra ALUs (two of these have gotta get close to 1 TFLOP) and D3D10.1 functionality that'll still be true?
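A quick back-of-envelope on the "two of these have gotta get close to 1 TFLOP" remark, using G80's commonly quoted shader specs (128 scalar ALUs at 1.35 GHz, with the marketing peak counting a MAD plus the co-issued MUL). Treat it as a rough sanity check, not an official figure.

```python
# Back-of-envelope for the "two of these get close to 1 TFLOP" remark,
# using G80's commonly quoted shader specs (128 scalar ALUs at 1.35 GHz,
# a MAD plus a co-issued MUL per clock in the marketing numbers).

def gflops(alus: int, clock_ghz: float, flops_per_clock: int) -> float:
    return alus * clock_ghz * flops_per_clock

g80_mad_only = gflops(128, 1.35, 2)   # MAD counted as 2 flops
g80_mad_mul  = gflops(128, 1.35, 3)   # MAD + MUL counted as 3 flops

print(f"one G80:  {g80_mad_only:.0f} - {g80_mad_mul:.0f} GFLOPS")
print(f"two G80s: {2 * g80_mad_only:.0f} - {2 * g80_mad_mul:.0f} GFLOPS")
# -> roughly 346-518 GFLOPS each, so a doubled part lands around 0.7-1.0 TFLOPS.
```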
> A few days ago, somebody posted a link to graphs that illustrate performance vs memory clock speed. There were ugly effects in there with negative correlation. That's often a sign of chaos theory at work and very difficult to design away.

Yeah. We saw some signs of this with R580+. We also have clearly documented cases where R5xx performance in AF/AA improves at the same time as non-AF/AA performance degrades. There's a pile of "chaos" that seemingly has to be laid at the feet of the memory controllers in R5xx and R6xx.
> The unusually high performance drops when switching on AA are suspicious also.

You're ignoring the theoretically small fillrate advantage that R600 has over R580 with AA on, 14% - the no-AA case is where R600's 128% higher z-only fillrate distorts things. Of course that's going to make the AA-drop look big.
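For reference, here's one plausible reconstruction of where the 14% and 128% figures come from. The clocks are the stock ones (742 MHz R600, 650 MHz R580), but the per-clock z rates are my assumptions, so take it as a sketch rather than confirmed specs.

```python
# Where the 14% / 128% figures plausibly come from (my reconstruction;
# the per-clock z rates below are assumptions, not confirmed specs):
#   R600: 16 RBEs, 2 z-samples/clock each, with or without AA, at 742 MHz
#   R580: 16 ROPs, 1 z-sample/clock without AA, 2 with AA, at 650 MHz

def z_rate(units, z_per_clock, clock_mhz):
    return units * z_per_clock * clock_mhz / 1000.0   # Gsamples/s

r600_noaa = z_rate(16, 2, 742)
r580_noaa = z_rate(16, 1, 650)
r600_aa   = z_rate(16, 2, 742)
r580_aa   = z_rate(16, 2, 650)

print(f"no AA: R600 {r600_noaa:.1f} vs R580 {r580_noaa:.1f} "
      f"(+{(r600_noaa / r580_noaa - 1):.0%})")   # ~ +128%
print(f"AA:    R600 {r600_aa:.1f} vs R580 {r580_aa:.1f} "
      f"(+{(r600_aa / r580_aa - 1):.0%})")        # ~ +14%
```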
> A lot of resources are fighting in parallel for the ALUs and the memory controllers. It's extremely easy to overlook secondary effects during design that can cut a large slice of your theoretical performance (especially with decentralized bus architectures).

Not sure why you mention ALUs. If you'd mentioned TUs and RBEs, then fair enough. R5xx's long history of driver performance tweaks centred on the MC seems evidence enough.
> Several times, ATI has praised their ability to tweak their arbiters for individual performance cases, which has been interpreted by many as a great advantage. A different point of view would be that this exposes a significant weakness in their architecture. If you've ever been involved in the design of anything arbitration-related, you'll know that software driver guys hate to mess with knobs like these that they don't understand, and usually leave them at one standard setting. It also gives little hope to those who want to see a game optimized that's not part of the top-20 marketing line-up.

I've concluded much the same, that ATI's built a bit of a Frankenstein. Sort of like MS Office, way beyond what people wanted or needed or could fathom...
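To illustrate why hand-tuned arbiter knobs don't generalise beyond a short list of titles, here's a toy model. The client names, bandwidth demands and default weights are all invented; the only point is that one fixed profile can't suit two different workload mixes.

```python
# Toy model of why hand-tuned arbiter weights don't generalise. Purely
# illustrative: the client names, demands and weights are invented.
#
# Model: each memory client needs some fraction of total bandwidth; the
# arbiter hands out fixed shares; the chip runs at the pace of whichever
# client is most starved relative to its demand.

def relative_perf(demand: dict, weights: dict) -> float:
    total_w = sum(weights.values())
    return min(1.0, min((weights[c] / total_w) / demand[c] for c in demand))

# Hypothetical bandwidth demand per client (fraction of peak) for two games.
game_a = {"texture": 0.45, "colour": 0.25, "z": 0.20, "vertex": 0.05}
game_b = {"texture": 0.20, "colour": 0.35, "z": 0.35, "vertex": 0.05}

default_weights = {"texture": 4, "colour": 2, "z": 2, "vertex": 1}

for name, demand in [("game_a", game_a), ("game_b", game_b)]:
    print(name, f"{relative_perf(demand, default_weights):.0%} of peak")
# One profile can't be right for both: whichever game the defaults weren't
# tuned for leaves bandwidth on the table, hence per-title tweaking.
```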
> All we know is that they have 1 spare for every 16 ALUs.

At one point the patent application talks about 1 spare per 64!
> We've yet to see an example of other places with redundancy,

But how could we? It's logically invisible unless you have the right diagnostics or can find some trace of this in the BIOS.
> unlike, say, an 8800GTS which has both cluster and (a first) fine-grained MC redundancy.

Turning off an entire MC (along with its associated ROPs and L2) is not fine-grained.
> As we discussed a long time ago, it is nice to have extremely fine-grained redundancy, but it's more important first to make sure that as much area of the chip as possible is potentially redundant while still having a nice ratio of active vs disabled units. With the ALUs of R600 alone, the overall area part is not covered.

In R5xx it would seem that ATI proved the concept of fine-grained redundancy solely using ALUs. If fine-grained redundancy is widespread within R600 then the "overall area" problem is solved.
> And the R600 configuration with a 256-bit MC doesn't have a nice ratio.

Coarse-grained redundancy's gotta hurt. What's puzzling me is the idea of an R600Pro based on R600 with bits turned off - surely the required volume of this part can't be sustained long term. Perhaps it's like X1950GT, an interim solution (that only lasted several months).
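A minimal yield sketch of the trade-off being argued here (how much of the die the redundancy covers versus how fine-grained it is). Defect density, die area and coverage fractions are invented numbers, not real R600 or G80 data.

```python
# Minimal yield sketch for the redundancy discussion. Defect density,
# die area and the split between "covered by redundancy" and "must be
# perfect" are all invented numbers; the point is only the shape of
# the trade-off, not real R600/G80 yields.

import math

def poisson_at_most(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson defect count with mean lam."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def yield_estimate(area_mm2, defect_per_mm2, covered_frac, tolerated_defects):
    lam_covered   = defect_per_mm2 * area_mm2 * covered_frac
    lam_uncovered = defect_per_mm2 * area_mm2 * (1.0 - covered_frac)
    # Uncovered area must be defect-free; covered area can absorb a few hits.
    return math.exp(-lam_uncovered) * poisson_at_most(tolerated_defects, lam_covered)

area, d0 = 420.0, 0.004            # hypothetical ~420 mm^2 die, defects per mm^2
print(f"no redundancy:        {yield_estimate(area, d0, 0.0, 0):.0%}")
print(f"ALUs only (~30%):     {yield_estimate(area, d0, 0.3, 4):.0%}")
print(f"most of chip (~70%):  {yield_estimate(area, d0, 0.7, 4):.0%}")
# Covering more of the die matters more than making the spares ever finer.
```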
Last night I found some Call of Juarez D3D10 benchmarks that make a mockery of Anandtech's, for example. But there's no point me linking them because this website's seemingly more thorough methodology is not what the target market wants.
Jawed
G80 SLI under Vista is still not working. Do you think it'll ever work on G80 or will G92 be the first GPU where it works properly?
Jawed
Jawed said:
> Turning off an entire MC (along with its associated ROPs and L2) is not fine-grained.

As opposed to what? Prior architectures could either not turn off MCs/ROPs/L2s at all, or could only go down to the next lower power of 2 (i.e., turn off half the MCs).
> I'm sure Eric appreciates being called a liar. The ATI guys have been honest about the problems they've encountered these past few years.

Where the hell did anyone call Eric a liar? You often don't find hardware issues for a while and just assume it's drivers. The bug I was telling you about with R200 was the same way.
> No but you and plenty of others keep talking in these terms without having any facts.

I just gave you some, but you ignored them. NVidia's margins are way above ATI's. Obviously ATI's yield advantage is not a big enough issue to negate their larger die size for equally priced parts and lower price (due to demand) for equally sized parts.
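Rough arithmetic behind the margins point: cost per good die is wafer cost divided by good dies per wafer, so a bigger die can cost more per chip even with a yield advantage. All the numbers below are invented round figures, not actual ATI or NVidia costs.

```python
# Rough cost-per-good-die arithmetic behind the margins argument. Wafer
# cost, die sizes and yields below are invented round numbers, not real
# ATI/NVidia figures; the point is only the direction of the effect.

import math

WAFER_DIAMETER_MM = 300.0
WAFER_COST = 5000.0                      # hypothetical cost per wafer

def cost_per_good_die(die_area_mm2: float, yield_frac: float) -> float:
    wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
    gross_dies = wafer_area / die_area_mm2          # ignores edge loss
    return WAFER_COST / (gross_dies * yield_frac)

small_die = cost_per_good_die(die_area_mm2=200.0, yield_frac=0.70)
big_die   = cost_per_good_die(die_area_mm2=300.0, yield_frac=0.80)  # better yield

print(f"200 mm^2 @ 70% yield: ${small_die:.0f} per good die")
print(f"300 mm^2 @ 80% yield: ${big_die:.0f} per good die")
# Even with the better yield, the bigger die costs more per chip, so if
# both end up in boards at the same street price, its margin is thinner.
```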
> I'm interested in architecture, technology, scalability, futures. Apparently you're not.

That has nothing to do with any of your replies to me or nAo. Any discussion of performance on parallel workloads is utterly meaningless without the context of cost.
> Whoops there you go again, with zero facts.

Whoops, there you go again ignoring the facts. The 7600 always sold at a higher price than the X1600. Why? Because it blew the X1600 away in performance.
> They're not cherry-picked. They're about the only benchmarks that cover a lot of recent games for a decent range of cards.

They absolutely are cherry-picked. No other site shows G7x in such a bad light by gimping it so thoroughly. You won't find 0.1% of G7x buyers running their video card with the settings of that site. Sites like xbitlabs test even more games. For any game that both computerbase.de and other sites test, the results from the former are completely out of line with everyone else's. When 10 sites have mostly agreeable results and computerbase.de deviates from them so heavily in favour of ATI (when compared to G7x), how can you not call it cherry-picking?
> But there's no point me linking them because this website's seemingly more thorough methodology is not what the target market wants.

There's nothing more thorough about their G7x testing methodology. They arbitrarily decide that viewers are interested in G7x performance when gimped by 50% from image quality settings that barely improve the gaming experience. It's absurd.
> Do you think when they've added in all the extra ALUs (two of these have gotta get close to 1 TFLOP) and D3D10.1 functionality that'll still be true?

What? Read my post. I'm talking about a theoretical half G80 on 65nm. Nothing more. But if you want to go down this road, fine. If increasing the ALU:TEX ratio makes NVidia's chips even stronger per mm2, then that makes my claim even more justified.
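A quick sanity check on the half-G80-at-65nm claim, using the commonly quoted ballpark die sizes (G80 roughly 480 mm^2 at 90nm, RV630 roughly 150 mm^2 at 65nm - treat both as approximate).

```python
# Sanity check on the "half a G80 at 65 nm ~ RV630-sized" claim, using
# commonly quoted ballpark die sizes (treat them as approximate):
#   G80  ~ 480 mm^2 at 90 nm,  RV630 ~ 150 mm^2 at 65 nm.

g80_area_90nm = 480.0
rv630_area_65nm = 150.0

ideal_shrink = (65.0 / 90.0) ** 2          # ~0.52 if everything scaled perfectly
half_g80_ideal = (g80_area_90nm / 2) * ideal_shrink

print(f"ideal shrink factor:       {ideal_shrink:.2f}")
print(f"half G80 at 65 nm (ideal): ~{half_g80_ideal:.0f} mm^2 vs RV630 ~{rv630_area_65nm:.0f} mm^2")
# I/O, analogue and memory structures never shrink this well, so the real
# figure lands somewhere between the ideal ~125 mm^2 and RV630's ~150 mm^2,
# which makes "the same size, give or take" a fair characterisation.
```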
Jawed
> If you continue to do so then there is no point for me or anyone else to debate 3D hardware performance with you.

Mutual.
> This seems to me to be bass ackwards. The G80 design, if anything, seems elegant, a fine example of KISS principles in action.

Hmm, that's what I've been saying, keeping a lid on change to produce the minimal architecture/tech to produce a D3D10 GPU.
> At least you finally seem to be admitting that writing drivers for the R600 is more difficult. The last time I was involved in one of these threads, you were strongly arguing that there is no inherent advantage in the G80's scalar approach vis-a-vis driver compiler vs the R600.

Hey? What has a driver compiler for ALU-utilisation got to do with the optimisation of the use of TUs, RBEs and bandwidth?
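Since the scalar-vs-VLIW compiler argument keeps coming up, here's a toy of the packing problem. It's my own illustration, not ATI's or NVidia's actual scheduler: assume an R600-style 5-wide instruction word that the compiler has to fill with independent scalar ops, versus a G80-style scalar pipe that just issues them back to back.

```python
# Toy of the compiler problem behind the scalar-vs-VLIW argument (my own
# illustration, not ATI's or NVidia's actual scheduler). R600-style
# hardware issues a 5-wide instruction word per thread per cycle, so the
# driver's shader compiler has to find 5 *independent* scalar ops to fill
# it; a G80-style scalar pipe just issues the ops one after another and
# stays fully busy regardless of the dependency pattern.

def vliw_bundles(ops, width=5):
    """Greedily pack ops (name -> set of dependencies) into VLIW bundles."""
    done, bundles = set(), 0
    while len(done) < len(ops):
        ready = [o for o, deps in ops.items() if o not in done and deps <= done]
        done.update(ready[:width])       # fill at most `width` slots this cycle
        bundles += 1
    return bundles

# A dependent chain (each op needs the previous result): worst case for VLIW.
chain = {f"op{i}": ({f"op{i-1}"} if i else set()) for i in range(10)}

b = vliw_bundles(chain)
print(f"10-op dependent chain: {b} VLIW bundles, "
      f"{len(chain) / (b * 5):.0%} slot utilisation")   # -> 10 bundles, 20%
# A scalar pipe runs the same 10 ops in 10 issue slots at 100% utilisation;
# the burden of finding the parallelism moves from hardware to the compiler.
```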
> To me, blunt power is when you load up on raw computation resources, in order to overcompensate for weak ability to maximize utilization over those resources. The G80 obtains high utilization rates, so it really seems absurd to call it 'a brute force' approach.

G80's high utilisation only comes in single-function synthetics. A nice example is the z-only fillrate, which is comically high (in a fuck-me, that's incredible, sense) and under-utilised in games.
> Complexity in and of itself neither makes a superior design, nor an elegant, less brute force one. Only complexity that is spent on 'saving' work and increasing effective utilization is evidence of a good design, and for me the value of the R600's design -- well, the jury is still out on that one. Likewise for so-called margin/yield advantages. nVidia has consistently shown good margin management in recent years, so unless we see evidence to the contrary, I wouldn't give any props to the R600 on this either.

I'm not giving props to R600 generally - I'm suggesting that ATI's focus is to lay foundations - they're running to a very different technology/architecture timetable than NVidia.
> As opposed to what?

Fine-grained as in the SKU is still fully capable after the redundancy has kicked in, which is what R600 is doing, apparently. Not resulting in a second SKU to mop up faulty dies with <100% capability.
Jawed said:
> Fine-grained as in the SKU is still fully capable after the redundancy has kicked in,

And you would do this how, exactly? And how would that scale when 2 MCs are faulty?