NVIDIA GF100 & Friends speculation

Yes, but the 3870 wasn't competitive enough with G92. Things changed from RV770 onwards, and those points are more applicable after that. GF104 vs 5830/5770 will be a better decision point this gen.

Not only was 3870 not competitive enough with G92 when it came out, but it fares even worse now. Metro 2033 shows a 9600 GT (G94) slaughtering a 3870, let alone what any G92 derivative does in comparison.
 
If it is GF104 vs 5830/5770, it could easily be GF104 vs 5830/5850.

Based on the current estimates of 7/8 GF100 being slightly faster (10-30% depending on estimates) than 5850, I doubt half GF100 could be anywhere near 5850.
 
That's not what I remember. Quite the contrary: IIRC it is very, very rare that HD5870 is more than 80% faster than HD5770 (except in some synthetics).
Sorry, you're right, 80-100% faster is a better description. I'm basing my position on the results in this review:

http://www.xbitlabs.com/articles/video/display/radeon-hd5770-hd5750-crossfirex_11.html

Obviously there are cases where the scaling is much worse.

Not even the Metro link you posted exceeds that.
Yeah, that's only 73-76% scaling, while HD5770 is 6-12% faster than HD4890. Whatever is stopping HD5870 from being ~100% faster than HD5770 isn't stopping it from being ~90% faster than RV790 (let alone RV770).
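
To spell out the arithmetic (a quick sketch; the only inputs are the rough percentages above, and RV790 is the HD4890 chip):

```python
# Two stacked speedups multiply rather than add: if HD5870 is ~75% faster
# than HD5770, and HD5770 is 6-12% faster than HD4890 (RV790), then...
def compose(a, b):
    return (1.0 + a) * (1.0 + b) - 1.0

low  = compose(0.73, 0.06)   # worst-case scaling x smallest 5770 lead
high = compose(0.76, 0.12)   # best-case scaling x largest 5770 lead
print(f"HD5870 over HD4890: +{low:.0%} to +{high:.0%}")  # ~+83% to +97%
```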

Including games whose scaling isn't a function of the 2x ALUs, 2x TMUs and 2x ROPs, in an argument that says the architecture is inefficient, is just completely pointless.

Even if the new architecture is more bandwidth-efficient (is that why HD5770 is faster than HD4890?), the overall lack of bandwidth scaling undermines the argument that 2x those units are "pointless". Particularly as there are games that can use that extra capability.

Jawed
 
Based on the current estimates of 7/8 GF100 being slightly faster (10-30% depending on estimates) than 5850, I doubt half GF100 could be anywhere near 5850.

On paper, half gf100 should edge out gtx285 on raw specs, even if marginally. Smaller die => lower intra-die variation => somewhat better clocks. Anandtech said 5850 is ~10-15% ahead of 285.
 
Based on the current estimates of 7/8 GF100 being slightly faster (10-30% depending on estimates) than 5850, I doubt half GF100 could be anywhere near 5850.

I'm talking in terms of pricing. By the time GF104 comes out this summer, I don't expect the 5770 to still be priced at $160, or the 5830 at $250.
 
Including games whose scaling isn't a function of the 2x ALUs, 2x TMUs and 2x ROPs, in an argument that says the architecture is inefficient, is just completely pointless.
I'm sorry, I'm not following at all here. Imagine a hypothetical architecture that has 2x the ALUs/TMUs/ROPs of 5870, while maintaining the exact same memory bandwidth. Would you consider it pointless to claim that the architecture is inefficient based on the (likely many!) games that scale very badly on it?
 
The window for the GF100 products seems to be very small

Looks that way. The only question that remains is how the timing of the refreshes will line up. Well, assuming that AMD is working on a refresh and not some monster DX11 chip of their own.

This doesn't account for sunk costs (R&D) at all, nor variable costs such as manufacturing. AMD's GPG had good margins during the RV770 timeframe but failed to realize a profit, not through any lack of margins, but through lack of the volume necessary to overcome those costs.
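
To make that concrete, a toy break-even sketch; every number below is invented purely for illustration:

```python
# Healthy per-unit margins can still mean a net loss if volume doesn't
# cover the sunk R&D. All figures are made up for illustration.
sunk_rnd = 200e6          # fixed R&D cost, $
margin_per_unit = 40.0    # price minus variable (manufacturing) cost, $
for units in (1e6, 4e6, 8e6):
    profit = units * margin_per_unit - sunk_rnd
    print(f"{units/1e6:.0f}M units: {profit/1e6:+.0f}M profit")
```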

Good point. But it all feeds into the notion that you can't simply look at the area of two chips from different companies and predict their profitability. If that were the case, AMD would have been out of business long ago, considering their vast CPU die-size disadvantage.

On paper, half gf100 should edge out gtx285 on raw specs, even if marginally. Smaller die => lower intra-die variation => somewhat better clocks. Anandtech said 5850 is ~10-15% ahead of 285.

But also less than half the TMUs and probably much less bandwidth than the 285. If I had to guess I'd put it around 5830 performance in older games.
 
What's stopping them from implementing tessellation detail levels?
Most of what's out there with DX11 tessellation right now is already unplayable on anything less than a 5850. So why would developers bother with these $100 to $200 DX11 products of yours at all?
Plus, having half the geometry engines of GF100 is still twice what Cypress has.

I made the same point as Dave like 50 pages ago, but I don't think anyone really got it. The point is, if their low-end parts are designed with disabled GPCs, as would seem logical, then their geometry rate goes down accordingly, because they can only do 8-pixel triangles per GPC. So while theoretically Cypress, and any product all the way down from it, is doing one 32-pixel tri/clock, GF100 is also doing only one 32-pixel tri/clock, but GF100 is really doing four 8-pixel tris/clock, one for each GPC. A 2-GPC part (256 shaders) would then only do one 16-pixel tri/clock, or one normal 32-pixel tri every 2 clocks: half the geometry rate compared to the Cypress family of products. Same for the low end: 1 GPC will be doing one 8-pixel tri/clock, or one normal 32-pixel tri every 4 clocks, essentially quarter rate compared to Cypress products.

If all polys were 8 pixels or less, this wouldn't be the case, but find a wireframe of any game (I think there was a Heaven one back in the thread somewhere?) and see how many of the polys cover less than 8 pixels. The majority of the screen seems to be covered with >8-pixel polygons, even in heavy tessellation cases that are unrealistic in games today.
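
To make the rate arithmetic concrete, here's a toy model in Python. The 8 pixels/clock per GPC and 32 pixels/clock for Cypress are this post's working assumptions, not spec-sheet figures; setup limits, clocks, and everything else are ignored:

```python
import math

def tris_per_clock(tri_pixels, rasterisers, px_per_raster_per_clk):
    # Each rasteriser walks one triangle at a time, a few pixels per clock.
    clocks_per_tri = math.ceil(tri_pixels / px_per_raster_per_clk)
    return rasterisers / clocks_per_tri

for tri_px in (8, 16, 32, 64):
    cypress = tris_per_clock(tri_px, 1, 32)          # one 32px/clk rasteriser
    fermi = {g: tris_per_clock(tri_px, g, 8) for g in (4, 2, 1)}
    print(f"{tri_px:>2}px tris: Cypress {cypress:.2f}/clk, "
          f"4/2/1 GPCs {fermi[4]:.2f}/{fermi[2]:.2f}/{fermi[1]:.2f}/clk")
```

At 32-pixel triangles this reproduces the half-rate (2 GPCs) and quarter-rate (1 GPC) figures above; only as triangles shrink towards 8 pixels does the multi-GPC setup pull ahead.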
 
More leaks, but I have no idea if they're legit:
http://www.hexus.net/content/item.php?item=23032

We've long since seen the Far Cry 2 numbers and they're not a surprise, so I'm not sure why that poster would bench it at 3 different resolutions.

Dirt 2 is new and suspicious due to the lack of details, and 20-25% faster doesn't appear to coincide with the 5870 vs GTX 470 benchmarks we saw earlier in this thread, where the 5870 was also 20% faster in that game. Those two ratios compound, so unless the 480 is ~45% faster than the 470, one of those benchmarks is wrong or hiding something.
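
A quick sanity check on how those leaked percentages compound (the inputs are just the rumoured figures above, not measurements):

```python
# If GTX 480 leads HD5870 by 20-25% (Dirt 2 leak) and HD5870 led GTX 470
# by 20% (earlier benchmark), the implied 480-over-470 gap is the product.
for lead_480_over_5870 in (0.20, 0.25):
    implied = (1.0 + lead_480_over_5870) * 1.20 - 1.0
    print(f"+{lead_480_over_5870:.0%} over 5870 -> +{implied:.0%} over 470")
# -> +44% and +50%, hence the ~45% figure above.
```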
 
So while theoretically Cypress, and any product all the way down from it, is doing one 32-pixel tri/clock

You sure Dave was talking about rasterization rate and not triangle rate? Why would an 8- or 16-ROP Cypress derivative rasterize 32 pixels per clock? In terms of triangle rate, one Fermi GPC = Cypress = 1 tri/clk, so there's no disadvantage there.
 
On paper, half gf100 should edge out gtx285 on raw specs, even if marginally. Smaller die => lower intra-die variation => somewhat better clocks. Anandtech said 5850 is ~10-15% ahead of 285.
Not necessarily. Even on raw ALU power, it only edges out GTX 285 if it really has those "somewhat better clocks" (though I think that should be doable). However, it'll only have half the SFUs, for instance, and less than half the TMUs (unless those magic missing TMUs turn up); even if they are more efficient, that is a pretty big gap (of course, that's assuming the basic clusters aren't changed).
It is hard to tell, though, without knowing what the ROP count / memory interface will be. I haven't really seen anything even claiming to be a rumour, so IMHO it could be anything from 128-bit / 16 ROPs to 256-bit / 32 ROPs (the obvious solution, half that of GF100, might not be desirable due to the odd memory size for a mainstream card). I have my doubts, though, that in any configuration it could challenge HD5850 with only 256 ALUs; I think the deficit in ALU/TEX is just too big (unless NVIDIA could raise the shader clock to 9800 GTX levels, maybe). HD5830 might be doable, since there's quite a gap between that and HD5850 (though that wouldn't be great, as it's a bit too close to HD5770 really).
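
For reference, a rough sketch of the paper numbers in that comparison. The unit counts for the shipping cards are public; everything in the half-GF100 row, clocks included, is an assumption for illustration:

```python
# Paper throughput for the cards being compared. Unit counts for GTX 285
# and HD5850 are public; the half-GF100 row (clocks included) is purely
# an assumption. GT200's dual-issue MUL is ignored (MAD/FMA only).
specs = {
    # name: (ALUs, shader clock MHz, TMUs, texture clock MHz)
    "GTX 285":     (240, 1476, 80, 648),
    "half GF100?": (256, 1400, 32, 700),   # assumed clocks
    "HD5850":      (1440, 725, 72, 725),
}
for name, (alus, sclk, tmus, tclk) in specs.items():
    gflops = alus * sclk * 2 / 1000   # 2 flops (MAD/FMA) per ALU per clock
    gtexels = tmus * tclk / 1000
    print(f"{name:>12}: {gflops:4.0f} GFLOPS, {gtexels:4.1f} GTex/s")
```

On those assumed clocks the ALU side only barely edges out GTX 285, while the texel rate is well under half of either card, which is the gap being described.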
 
I made the same point as Dave like 50 pages ago, but I don't think anyone really got it. The point is, if their low-end parts are designed with disabled GPCs, as would seem logical, then their geometry rate goes down accordingly, because they can only do 8-pixel triangles per GPC. So while theoretically Cypress, and any product all the way down from it, is doing one 32-pixel tri/clock, GF100 is also doing only one 32-pixel tri/clock, but GF100 is really doing four 8-pixel tris/clock, one for each GPC. A 2-GPC part (256 shaders) would then only do one 16-pixel tri/clock, or one normal 32-pixel tri every 2 clocks: half the geometry rate compared to the Cypress family of products. Same for the low end: 1 GPC will be doing one 8-pixel tri/clock, or one normal 32-pixel tri every 4 clocks, essentially quarter rate compared to Cypress products.

Why would GF100 be faster than Cypress with only one 32-pixel tri/clock? :?:
 
Dave Baumann said:
So, do we really think it makes sense for developers to solely focus on one single high-end product?

Developers already face these decisions today with the huge gulf in performance between high-end and low-end parts, especially given they target hardware going back several generations. I'm not buying this spin; taken to its absurd conclusion, the obvious course would be to eliminate all SKUs but one to maximally help developers.

Scaling back tessellation factors is easy to do as well, just like chopping down resolution, so yes, I expect developers to build games that won't run acceptably well with all settings maxed on all but the top-end systems, which is the situation we have today with many titles.
 
Good point. But it all feeds into the notion that you can't simply look at the area of two chips from different companies and predict their profitability. If that were the case, AMD would have been out of business long ago, considering their vast CPU die-size disadvantage.
IMHO, it is a miracle they have managed to stay in the CPU business all this time while battling so many disadvantages.
 
What else besides Metro 2033 did you bring up?
Well, until Metro 2033 is proven to be a driver fluke favouring Evergreen over R7xx (certainly possible), it's a datapoint you ignore at your peril. If Metro 2033 can scale this well, why shouldn't GTX480 scale nicely (albeit the texturing situation is questionable).

If games scale badly it's not NVidia's fault if GTX480's performance is "boring". (Though competition with HD5870 is a slightly different question.)

And what does XS have to do with anything? :???:
The signature pix ("We're not ready" etc.) and the original tags (before they were cleaned up) on the Fermi threads are by far the best feature of the graphics discussion over there.

Jawed
 
Well, until Metro 2033 is proven to be a driver fluke favouring Evergreen over R7xx (certainly possible), it's a datapoint you ignore at your peril. If Metro 2033 can scale this well, why shouldn't GTX480 scale nicely (albeit the texturing situation is questionable).

Oh, I'm not dismissing it at all. If you look at an earlier post of mine, I said I'm waiting to see performance in titles where it matters - Metro 2033 being one of them. I was simply saying that we were discussing average performance, and there will of course be outliers, but those don't invalidate the average.

If games scale badly it's not NVidia's fault if GTX480's performance is "boring". (Though competition with HD5870 is a slightly different question.)

Well, we'll just have to disagree then. Several people have already pointed out the error in this approach. You can't eliminate a game from the evaluation because it doesn't scale the way you think it should; that is selection bias to the extreme. You can't simply say the architecture is efficient but then restrict your test cases to workloads that are perfectly suited to it. You're feeding the dependent variable back into the equation!
 