NVIDIA GF100 & Friends speculation

Discussion in 'Architecture and Products' started by Arty, Oct 1, 2009.

  1. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,027
    Likes Received:
    90
    Not only was 3870 not competitive enough with G92 when it came out, but it fares even worse now. Metro 2033 shows a 9600 GT (G94) slaughtering a 3870, let alone what any G92 derivative does in comparison.
     
  2. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,244
    Likes Received:
    4,465
    Location:
    Finland
Based on the current estimates of 7/8 GF100 being slightly faster than the 5850 (10-30%, depending on the estimate), I doubt half a GF100 could be anywhere near the 5850.
     
  3. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
    Sorry, you're right, 80-100% faster is a better description. I'm basing my position on the results in this review:

    http://www.xbitlabs.com/articles/video/display/radeon-hd5770-hd5750-crossfirex_11.html

    Obviously there are cases where the scaling is much worse.

    Yeah, that's only 73-76% scaling, while HD5770 is 6-12% faster than HD4890. Whatever is stopping HD5870 from being ~100% faster than HD5770 isn't stopping it from being ~90% faster than RV790 (let alone RV770).
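    (Rough arithmetic, for illustration: combining the 73-76% CrossFire scaling with the 6-12% lead over HD4890 gives the ~90% figure.)

    ```python
    # HD5770 CrossFireX scales 73-76% over a single HD5770, which is itself
    # 6-12% faster than HD4890. Multiplying the two gives the implied
    # advantage of the doubled configuration over RV790:
    for scaling, lead in ((0.73, 1.06), (0.76, 1.12)):
        cfx_vs_rv790 = (1 + scaling) * lead
        print(f"{cfx_vs_rv790:.2f}x RV790")
    # low end: 1.73 * 1.06 ≈ 1.83x; high end: 1.76 * 1.12 ≈ 1.97x,
    # i.e. roughly 90% faster than RV790.
    ```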

    Including games whose scaling isn't a function of the 2x ALUs, 2x TMUs and 2x ROPs, in an argument that says the architecture is inefficient, is just completely pointless.

    Even if the new architecture is more bandwidth efficient (is that why HD5770 is faster than HD4890?) the overall lack of bandwidth scaling undermines the argument that 2x those units are "pointless". Particularly as there are games that can use that extra capability.

    Jawed
     
  4. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    On paper, half gf100 should edge out gtx285 on raw specs, even if marginally. Smaller die => lower intra-die variation => somewhat better clocks. Anandtech said 5850 is ~10-15% ahead of 285.
     
  5. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    13,878
    Likes Received:
    4,724
I'm talking in terms of pricing. By the time GF104 comes out this summer, I don't think the 5770 ($160) or even the 5830 ($250) will still be at those prices, so it won't be priced against them.
     
  6. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
  7. PeterT

    Regular

    Joined:
    May 14, 2002
    Messages:
    702
    Likes Received:
    14
    Location:
    Austria
    I'm sorry, I'm not following at all here. Imagine a hypothetical architecture that has 2x the ALUs/TMUs/ROPs of 5870, while maintaining the exact same memory bandwidth. Would you consider it pointless to claim that the architecture is inefficient based on the (likely many!) games that scale very badly on it?
     
  8. jimbo75

    Veteran

    Joined:
    Jan 17, 2010
    Messages:
    1,211
    Likes Received:
    0
    Same could have been said for 5770 vs 4890.

    edit - 5770 has 50% less bandwidth of course duh.
     
  9. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
  10. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,057
    Likes Received:
    3,114
    Location:
    New York
    Looks that way. Only question that remains is how the timing of the refreshes is going to line up. Well assuming that AMD is working on a refresh and not some monster DX11 chip of their own.

    Good point. But it all feeds into the notion that you can't simply look at the area of two chips from different companies and predict their profitability. If that was the case then AMD would long be out of business considering their vast CPU die-size disadvantage.

    But also less than half the TMUs and probably much less bandwidth than the 285. If I had to guess I'd put it around 5830 performance in older games.
     
  11. Ninjaprime

    Regular

    Joined:
    Jun 8, 2008
    Messages:
    337
    Likes Received:
    1
    I made the same point as Dave like 50 pages ago, but I don't think anyone really got it. The point is, if their low-end parts are designed with disabled GPCs, as would seem logical, then their geometry rate goes down accordingly, because each GPC can only rasterize an 8-pixel chunk of a triangle per clock. So while Cypress, and every product derived from it, theoretically does one 32-pixel tri/clock, GF100 also does only one 32-pixel tri/clock; what GF100 really does is four 8-pixel tris/clock, one per GPC. A 2-GPC part (256 shaders) would then only do one 16-pixel tri/clock, or one normal 32-pixel tri every 2 clocks: half the geometry rate of the Cypress family of products. Same at the low end: 1 GPC does one 8-pixel tri/clock, or one normal 32-pixel tri every 4 clocks, essentially quarter rate compared to Cypress products.

    If all polys were 8 pixels or less this wouldn't be the case, but find a wireframe of any game (I think there was a Heaven one back in the thread somewhere?) and see how many of the polys cover less than 8 pixels. The majority of the screen seems to be covered with >8-pixel polygons, even in heavy tessellation cases that are unrealistic in games today.
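    The arithmetic can be sketched like this; it's a toy model assuming, as above, 8 pixels rasterized per GPC per clock and one triangle in flight per rasterizer, not a statement of how the real hardware actually schedules work:

    ```python
    # Toy model of effective rasterization throughput per clock.
    # Assumption: each GF100 GPC rasterizes up to 8 pixels of one triangle
    # per clock; Cypress rasterizes up to 32 pixels of one triangle per clock.

    def pixels_per_clock(gpcs: int, tri_pixels: int, per_gpc_rate: int = 8) -> float:
        """Pixels/clock when every GPC works on an independent triangle.

        A triangle of tri_pixels pixels occupies one rasterizer for
        ceil(tri_pixels / per_gpc_rate) clocks; throughput scales linearly
        with the number of parallel rasterizers (GPCs).
        """
        clocks_per_tri = -(-tri_pixels // per_gpc_rate)  # ceiling division
        return gpcs * tri_pixels / clocks_per_tri

    # Large (32-pixel) triangles: 4 GPCs match Cypress, cut-down parts fall behind.
    for gpcs in (4, 2, 1):
        print(gpcs, "GPC(s):", pixels_per_clock(gpcs, tri_pixels=32))
    # Cypress-style 32-wide rasterizer for comparison:
    print("Cypress:", pixels_per_clock(1, tri_pixels=32, per_gpc_rate=32))
    ```

    With 32-pixel triangles this gives 32, 16, and 8 pixels/clock for 4, 2, and 1 GPCs: the full, half, and quarter rates argued above. With 8-pixel triangles even a single GPC sustains its full 8 pixels/clock, which is why the argument hinges on typical polygon size.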
     
  12. jimbo75

    Veteran

    Joined:
    Jan 17, 2010
    Messages:
    1,211
    Likes Received:
    0
    We've seen the Far Cry 2 ones long since and they're no surprise, so I'm not sure why that poster would bench it at 3 different resolutions.

    Dirt 2 is new and is suspicious due to the lack of details, and "20-25% faster" doesn't coincide with the 5870 vs GTX470 benchmarks we saw earlier in this thread, where the 5870 was also 20% faster in that game. Unless the 480 is 45% faster than the 470, one of those benchmarks is wrong or hiding something.
     
  13. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,057
    Likes Received:
    3,114
    Location:
    New York
    You sure Dave was talking about rasterization rate and not triangle rate? Why would a 8 or 16 ROP Cypress derivative rasterize 32 pixels per clock? In terms of triangle rate one Fermi GPC = Cypress = 1 tri/clk so there's no disadvantage there.
     
  14. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Not necessarily. Even on raw ALU power, it only edges out the GTX285 if it really has those "somewhat better clocks" (though I think that should be doable). However, it'll only have half the SFUs, for instance, and less than half the TMUs (unless those magic missing TMUs turn up); even if they are more efficient, that is a pretty big gap (assuming the basic clusters aren't changed, of course).

    It's hard to tell without knowing what the ROP count / memory interface will be. I haven't really seen anything even claiming to be a rumor, so IMHO it could be anything from 128-bit / 16 ROPs to 256-bit / 32 ROPs (the obvious solution, half that of GF100, might not be desirable due to the odd memory size for a mainstream card). I have my doubts, though, that in any configuration it could challenge the HD5850 with 256 ALUs; I think the deficit in ALU/TEX is just too big (unless NVIDIA could raise the shader clock to 9800 GTX levels, maybe). HD5830 might be doable, and there's quite a gap between that and the HD5850 (though that wouldn't be great, as it's a bit too close to the HD5770 really).
     
  15. Sontin

    Banned

    Joined:
    Dec 9, 2009
    Messages:
    399
    Likes Received:
    0
Why would GF100 be faster than Cypress if it's only doing one 32-pixel tri/clock? :?:
     
  16. DemoCoder

    Veteran

    Joined:
    Feb 9, 2002
    Messages:
    4,733
    Likes Received:
    81
    Location:
    California
    Developers already face these decisions today with the huge gulf in performance between high-end and low-end parts, especially given they target hardware going back several generations. I'm not buying this spin; taken to its absurd conclusion, the obvious course would be to eliminate all SKUs but one to maximally help developers.

    Scaling back tessellation factors is easy to do as well, just like chopping down resolution, so yes, I expect developers to build games that won't run acceptably well with all settings maxed on anything but top-end systems, which is the situation we have today with many titles.
     
  17. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
IMHO, it is a miracle they have managed to stay in the CPU business all this while, battling so many disadvantages.
     
  18. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,714
    Likes Received:
    2,135
    Location:
    London
Well, until Metro 2033 is proven to be a driver fluke favouring Evergreen over R7xx (certainly possible), it's a datapoint you ignore at your peril. If Metro 2033 can scale this well, why shouldn't GTX480 scale nicely (though the texturing situation is questionable)?

    If games scale badly it's not NVidia's fault if GTX480's performance is "boring". (Though competition with HD5870 is a slightly different question.)

    The signature pix ("We're not ready" etc.) and the original tags (before they were cleaned up) on the Fermi threads are by far the best feature of the graphics discussion over there.

    Jawed
     
  19. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,057
    Likes Received:
    3,114
    Location:
    New York
    Oh, I'm not dismissing it at all. If you see an earlier post of mine I said I'm waiting to see performance in titles where it matters - Metro 2033 being one of them. I was simply saying that we were discussing average performance and there will of course be outliers but those don't invalidate the average.

    Well we'll have to just disagree then. Several people have already pointed out the error in this approach. You can't eliminate a game from the evaluation because it doesn't scale the way you think it should. That is selection bias to the extreme. You can't simply say the architecture is efficient, but then restrict your test cases to workloads that are perfectly suited to it. You're feeding the dependent variable back into the equation!
     
  20. Lonbjerg

    Newcomer

    Joined:
    Jan 8, 2010
    Messages:
    197
    Likes Received:
    0