NVIDIA GF100 & Friends speculation

Doesn't look good for who exactly? And it really depends on its price.
For me. PR numbers are usually much better than the real ones. I would prefer to see something really exciting on the 26th. And I think we can say that we already know the "official" MSRP. Combine that with probably weak availability, which can push the price well above MSRP, and voila! Something not really exciting comes out of the oven!
 
Depends on when it's 10% and when it's 50-60%. If the 60% is only when both cards are unplayable then yeah that's irrelevant. Otherwise it might make a case for itself.
True, if those results came from an independent source.
I would also dare to say that +50-60% performance would be incredible if achieved in the most demanding benchmarks. But we are talking about NV's FTP... :cry: I am not expecting to find unbiased info there.
 
Nah, the first post is misleading, assuming Quads' posts are true anyway.
The benches he's talking about were made by some normal reviewer.
Nope, they're nVidia benchmarks, but they look legit to me, at least the ATi numbers. They're reference benchmarks for reviewers.
That's from post #17.
Anyway, I would be surprised if an independent reviewer already had a large number of benches and they somehow found their way onto NV's FTP (20 games benched up to 2560x1600 with various filtering settings). I thought reviewers got their cards not that long ago?
 
I'm sorry, I'm not following at all here. Imagine a hypothetical architecture that has 2x the ALUs/TMUs/ROPs of 5870, while maintaining the exact same memory bandwidth. Would you consider it pointless to claim that the architecture is inefficient based on the (likely many!) games that scale very badly on it?
Assuming you have games that are capable of scaling decently, the hypothetical card will be faster, because bandwidth isn't a bottleneck 100% of the time. Some games will scale by quite a lot.

You can already see this effect with HD5770 in Metro 2033, which is faster than HD4890 despite having ~60% of the bandwidth.
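For reference, the rough bandwidth arithmetic behind that ~60% figure (bus widths and memory data rates quoted from memory, so treat them as approximate):

```python
# Rough memory bandwidth: bus width (bits) / 8 * effective data rate (GT/s) -> GB/s.
# The bus widths and data rates below are approximate reference specs, not measured.
def bandwidth_gb_s(bus_bits, data_rate_gt_s):
    return bus_bits / 8 * data_rate_gt_s

hd5770 = bandwidth_gb_s(128, 4.8)   # ~76.8 GB/s
hd4890 = bandwidth_gb_s(256, 3.9)   # ~124.8 GB/s
print(hd5770 / hd4890)              # ~0.62, i.e. the "~60% of the bandwidth"
```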

This hypothetical SKU (let's say it's the end of 2011, and we're talking about a $200 card), might be considered unbalanced on today's games. But that SKU will be considered good value, because newer games will probably be fine. It might have even less bandwidth than HD5870. Of course, by then, the architecture may have enjoyed other tweaks that make it more bandwidth-efficient. (Cypress appears to have some such tweaks.)

There are some suggestions that Cypress was meant to be 384-bit, with presumably 48 ROPs. Instead this 256-bit chip is stuck in no-man's-land with considerably less bandwidth, despite having a die area that could accommodate a 384-bit bus. Some rumours suggest the ALUs/TMUs in that chip would have been the same, other rumours suggest it would have been 1920.

Taking the best case (1920 ALUs, 48 ROPs, 384 bits) would seem to imply a better-balanced chip, i.e. ROPs/bandwidth scaled by 50% over Cypress with 20% more math/texturing. One might argue that existing games are most sensitive to fillrate, and so that chip's bias towards fillrate/bandwidth makes for a more efficient SKU.
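Spelling out those ratios against the shipping Cypress (HD5870: 1600 ALUs, 32 ROPs, 256-bit), with the rumoured figures taken straight from the post above:

```python
# Rumoured "best case" config vs. shipping Cypress (HD5870).
# The rumoured numbers are the ones floated in this thread, not confirmed specs.
cypress  = {"alus": 1600, "rops": 32, "bus_bits": 256}
rumoured = {"alus": 1920, "rops": 48, "bus_bits": 384}

for unit in cypress:
    print(unit, rumoured[unit] / cypress[unit])
# alus     1.2 -> 20% more math/texturing (TMUs assumed to scale with the ALUs)
# rops     1.5 -> 50% more fillrate
# bus_bits 1.5 -> 50% more bandwidth at the same memory clock
```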

But, using games that are scaling poorly anyway doesn't make a good basis for saying the architecture is or isn't efficient. D3D10's more finely-grained state management and the multi-threaded CPU side of state construction in D3D11 both demonstrate that graphics performance scaling is a prisoner of more than just the graphics card.

I'm not saying that Cypress is efficient - tessellation appears questionable for example. I am saying that scaling comparisons with HD4890 that exclude bandwidth are ill-judged and the poor scaling of lots of games makes that comparison even worse. Using them to generalise about efficiency going forwards is pointless.

See HD5830 for a great example of junk. HD4890 has vast amounts of bandwidth that it generally uses fairly poorly. R600 is a disaster zone.

So, to answer your question: well, such a chip isn't technically possible right now on the current ATI architecture; it would need 28nm (maybe it would squeeze in below 600mm²?). A chip of that kind of specification is very likely to turn up eventually (compare HD4850 and HD2900XT). Regardless, such a chip would be wasteful for "today's games" assuming no bandwidth efficiency gains (since bandwidth is increasing so slowly - and even then that's assuming games that scale well). It is possible to scale up an architecture too far and fall victim to architecture-intrinsic limitations, e.g. tessellation in Cypress may be showing up the rasterisers as unfit. Or maybe it's the ALU thread scheduling that's hit the end stops?

Chips of a given architecture can't scale linearly with unit count; that's Amdahl's law, as D3D isn't perfectly parallel. If people want to assert that Cypress is beyond the pale and is an inefficient configuration, then they have to show it failing to scale when another architecture (or at least a better configuration of the architecture at a similar die size) continues to scale, on the same games.
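As a rough sketch of that Amdahl's-law point (the parallel fraction here is an assumed number, purely for illustration):

```python
# Amdahl's law: best-case speedup when only a fraction p of the work
# scales with the number of units n.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# If, say, 90% of a frame's work scales with GPU unit count (assumed figure):
print(amdahl_speedup(0.90, 2))    # ~1.82x from doubling the units
print(amdahl_speedup(0.90, 4))    # ~3.08x from quadrupling them
print(amdahl_speedup(0.90, 1e9))  # asymptote: ~10x, no matter how many units
```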

Your hypothetical chip seems likely to be crap if it were to appear now - too unbalanced (the opposite of HD2900XT). R580 had a mildly similar problem, appearing "over specified" for its time. Eventually games catch up with these increased ratios of ALU:GB/s and TEX:GB/s and fillrate:GB/s, but by then the chip turns out to have too little bandwidth for anything but budget gaming. The architecture will have been revised.

GTX480 could be a chance to see where Cypress fails, if it shows good scaling in places that Cypress is rubbish, on existing games. Not long to go...

Jawed
 
Well, we'll just have to disagree then. Several people have already pointed out the error in this approach. You can't eliminate a game from the evaluation because it doesn't scale the way you think it should. That is selection bias to the extreme. You can't simply say the architecture is efficient, but then restrict your test cases to workloads that are perfectly suited to it. You're feeding the dependent variable back into the equation!
Well, as I said a while back, feel free to evaluate graphics architecture efficiency with Quake 3.

Jawed
 
Supposed to show what?
75% and 80% scaling figures appear there.

There is still a larger delta from the CPU than from the GPU?

And patch 1.(0)3 is old; we run patch 1.05 now, which did wonders for performance (and the AI)... but this game is still a master CPU hog ;)

In my book it's a better CPU bench than Supreme Commander...
I'm stuck with an A64 3500X2, which is why I haven't bought the game yet.

Didn't stop me hosting an insanely-popular custom mission for the Arma 2 demo for a couple of months last year (I had people queueing for hours to play :p), with about 80 enemy AI...

Jawed
 
There are some suggestions that Cypress was meant to be 384-bit, with presumably 48 ROPs. Instead this 256-bit chip is stuck in no-man's-land with considerably less bandwidth, despite having a die area that could accommodate a 384-bit bus. Some rumours suggest the ALUs/TMUs in that chip would have been the same, other rumours suggest it would have been 1920.

Jawed

You can reach a given target ALU/TMU/ROP performance through clocks too. Someone posted in the R8xx thread that some of the partner cards seem to hit 1150MHz on air.
Nvidia's 48 ROPs running at 600-700 MHz wouldn't be enough for that monster fillrate (rough numbers below). :p
Let's hope they don't go with 256-bit for a third time.
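The rough fillrate arithmetic alluded to above, assuming pixel fillrate ~ ROPs * core clock and using only the clocks mentioned in this thread:

```python
# Back-of-the-envelope pixel fillrate in Gpixels/s: ROPs * core clock (MHz) / 1000.
# The clocks below are just the figures floated in this thread, not confirmed specs.
def gpixels_per_s(rops, clock_mhz):
    return rops * clock_mhz / 1000.0

print(gpixels_per_s(32, 1150))  # ~36.8 -- Cypress at the rumoured 1150MHz partner clock
print(gpixels_per_s(48, 600))   # ~28.8
print(gpixels_per_s(48, 700))   # ~33.6
```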
 
Whether AMD's overclocked Radeon HD 5870 card will be able to stand up to NVIDIA's new flagship remains to be seen, but AMD doesn't seem that worried about NVIDIA's first GeForce Fermi card.

 