NVIDIA Kepler speculation thread

Discussion in 'Architecture and Products' started by Kaotik, Sep 21, 2010.

  1. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    6,719
    Location:
    Well within 3d
    It's more accurate data, but with the latency of the off-die control loop it is tens of millions to a hundred million cycles out of date.
    A measurement can be several frames out of date, in more relatable terms.

    The comparison is between having an accurate thermometer that gives you the temperature of last week and a somewhat less precise model that tells you how hot it is today.

    A more advanced scheme might be a less conservative counter-based system with feedback from voltage and current measurements to tamp it down if it gets too aggressive.

    Possibly better might be a version of the initial non-digital Foxton that Intel almost used, or if AMD or Nvidia ever manage to get on-die voltage control, the loop can be sped up significantly.
     
  2. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,617
    Location:
    London
    Is NVidia measuring current and volts or just current and assuming volts? And how does current off-die tell you about the heating effects of that current on die?
     
  3. NathansFortune

    Regular

    Joined:
    Mar 3, 2009
    Messages:
    559
    If it is 40% faster than GK104 it makes GK110 around 50-60% faster than 7970 and up to 90% faster than GTX580.

    In terms of raw performance that is far from fail. Does that raw performance come at too high a die size? Sure, but that has been the case for all Nvidia big die chips. What makes 7970 such a big fail is that a 50-60% performance delta hasn't been seen since the days of the G80. Ever since then Nvidia have taken compute hits to push their Tesla line while AMD didn't, reducing the performance gap all the way down to 20% with the 5870 and GTX480. Now that AMD have taken the same compute hit as Nvidia, the 50-60% larger die will have 50-60% better performance...
     
  4. AlphaWolf

    AlphaWolf Specious Misanthrope
    Legend

    Joined:
    May 28, 2003
    Messages:
    8,170
    Location:
    Treading Water
    Is that some new kind of math? For gk110 to be 90% faster than the 580 it would need to be a lot more than 40% faster than the 680. The 680 is not 66% faster than the 580 (not in general anyway).
     
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    6,719
    Location:
    Well within 3d
    I've not seen a reference to what measurements Nvidia uses.
    Off-die measurements won't give much detail on localized thermal effects, but Nvidia's tech is described as taking thermal data into account as well.
     
  6. Chalnoth

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,706
    Location:
    New York, NY
    That's not necessarily important. It all depends upon how rapidly heat is dissipated. If the heat is dissipated from the die more slowly than the measurement frequency, then it really does not matter that there is a significant delay there: any current fluctuations on time scales shorter than the dissipation rate are going to get averaged together anyway.

    I'd be really, really surprised if the heat dissipation rate was fast enough for short-term current fluctuations to make a significant difference to die temperature.
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    6,719
    Location:
    Well within 3d
    I suspect that would make things worse. If the chip is measured as being very close to the limit at step N, there's a chance that it will go over the limit for some amount of time prior to step N+1.
    The slower heat moves off-chip, the faster it accumulates from transient activity spikes, which is risky if regions of the chip are already toeing the line and the time steps are long.

    Part of the weakness of the long control loop is that the time periods in question are so long that they are thermally significant from the POV of the cooling solution. They can't say everything averages together because their spikes can last longer than what is considered transient.

    All of this can be avoided by inserting a decent guard band, which Nvidia has done.
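The argument about sample interval and guard band can be sketched with a toy first-order thermal model. Everything here is an assumption for illustration: the function name `peak_temperature`, the thermal resistance, time constant, power levels and the 85 degree limit are made-up numbers, not anything Nvidia has published, and the controller is a simple on/off throttle that only reacts at sample boundaries.

```python
def peak_temperature(sample_interval_ms, guard_band=0.0, limit=85.0,
                     ambient=40.0, r_th=0.25, tau_ms=50.0,
                     base_power=150.0, spike_power=250.0,
                     dt_ms=0.1, total_ms=400.0):
    """Return the peak temperature of a toy single-node thermal model.

    All parameters are hypothetical. The controller looks at the temperature
    only once per sample interval and throttles from spike_power down to
    base_power when the reading exceeds (limit - guard_band).
    """
    steps_per_sample = max(1, round(sample_interval_ms / dt_ms))
    temp = ambient + base_power * r_th   # start at steady state for base power
    peak = temp
    throttled = False
    for step in range(round(total_ms / dt_ms)):
        # The control loop only sees the temperature at sample boundaries.
        if step % steps_per_sample == 0:
            throttled = temp > limit - guard_band
        power = base_power if throttled else spike_power
        target = ambient + power * r_th  # steady-state temperature for this power
        temp += (target - temp) * dt_ms / tau_ms  # explicit Euler step
        peak = max(peak, temp)
    return peak


fast = peak_temperature(sample_interval_ms=1.0)       # short control loop
slow = peak_temperature(sample_interval_ms=100.0)     # long control loop
guarded = peak_temperature(sample_interval_ms=100.0, guard_band=10.0)
print(fast, slow, guarded)
```

In this sketch the long loop overshoots the limit well before the next sample arrives, the short loop stays close to it, and the long loop with a guard band never gets near it, at the cost of throttling earlier than strictly necessary.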
     
  8. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
  9. NathansFortune

    Regular

    Joined:
    Mar 3, 2009
    Messages:
    559
    I take it you have never heard of compounding...

    7970 > 580 by 20%
    680 > 7970 by 15%
    GK110 > 680 by 40%

    580 = 100
    7970 = 120
    680 = 138
    GK110 = 193

    Don't get me wrong, that is a best case scenario which is why I said up to, chances are it will be more like 80-85%...
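The compounding above can be checked with a few lines. This is only a sketch of the arithmetic in this post; the helper name `compound` is made up, and the percentage gains are the post's own claims, not measured data.

```python
def compound(baseline, gains):
    """Apply a chain of relative gains (0.20 means +20%) to a baseline index."""
    index = baseline
    for gain in gains:
        index *= 1.0 + gain
    return index


# 580 = 100, then 7970 (+20%), 680 (+15%), GK110 (+40%), as claimed in the post.
gk110_index = compound(100.0, [0.20, 0.15, 0.40])
print(round(gk110_index, 1))
```

The gains multiply rather than add, which is why the chain lands near 193 instead of 100 + 20 + 15 + 40 = 175.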
     
  10. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    6,466
    Err, the difference between 680 & 7970 is hardly 15%, unless you're only counting low resolutions.
    TPU has the largest game selection used by review sites, and there the 680 comes out only 7.5% faster than the 7970 at a mere 1920x1200, while at 2560x1600 the difference shrinks to a mere 4%. Even if you take the 7.5% difference, it would already drop the GK110 number by 13 points, to 180% of the 580.
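As a rough check of that drop, the sketch below redoes the chain with 7.5% in place of 15% and reports the gap both in index points (580 = 100) and as a relative change. The variable names are made up for illustration, and the 20% and 40% steps are simply carried over from the post being answered.

```python
# Index with 580 = 100; the 20% and 40% steps come from the earlier post.
claimed = 100.0 * 1.20 * 1.15 * 1.40    # 680 assumed 15% ahead of the 7970
revised = 100.0 * 1.20 * 1.075 * 1.40   # 680 assumed 7.5% ahead of the 7970

drop_in_points = claimed - revised                # difference on the 580 = 100 index
relative_drop = (claimed - revised) / claimed     # fraction of the claimed figure

print(round(revised, 1), round(drop_in_points, 1), round(relative_drop * 100, 1))
```

The gap is about 13 points on the index, which is a smaller number when expressed relative to the claimed GK110 figure.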
     
  11. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    330
    (1) We're talking about performance @2560x1600 resolution here - as that's the setting upon which that ominous slide is based (possibly because the performance delta between GTX580 and GTX780 won't be that big at lower resolutions - always assuming that the slide actually is real, of course). GK104 is about as fast as HD7970 in that scenario.

    (2) So we're talking about 40% performance advantage over Tahiti - which should end up about in line with the expected difference in die size. So we can speculate that GK110 might yield about the same perf/mm² as Tahiti. That's a great improvement for Nvidia (who always struggled in that respect) - but no humiliation for AMD.

    (3) 40% better performance than Tahiti is just within reach of an upcoming HD8970 (or an HD7870 crossfire solution, for that matter) - which again brings us back to the timing problem. Even if GTX780 launches very early this summer - it's still at least 6 months behind Tahiti. By the time GTX780 is out, HD8970 won't be too far away.

    I'm not saying that GK110 won't be an impressive product. I'm just trying to point out that some people seem to apply double standards.
     
  12. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,309
    Location:
    New York
    It's all about expectations. I've met women who were floored because I opened a door for them or asked if they got home ok cause their expectation is that all men are douchebags (as if opening doors changes that fact :lol: ).

    People expect nVidia to make big, power hungry compute focused chips with relatively low gaming efficiency because nVidia has made its priorities blatantly clear. This is why Kepler was a positive surprise - it broke that expectation. Pitcairn is equally impressive yet got nowhere near the same reaction because it "only" met expectations.
     
  13. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    6,719
    Location:
    Well within 3d
    It's not like most reviews didn't test the 580 in the same benchmarks.
    Skip compounding the error with an extra comparison to a dissimilar architecture and just go 580-680-780.
    It's handwaving within handwaving already.
     
  14. Dooby

    Regular

    Joined:
    Jul 21, 2003
    Messages:
    478
    I just laughed as hard at this as when people were insisting that the 7970 beats the 580 by 50%, when it was at most 15-20%. Go go inflated numbers!
     
  15. Arty

    Arty KEPLER
    Veteran

    Joined:
    Jun 16, 2005
    Messages:
    1,906
    So you are saying women opening doors for women will not earn them any brownie points? Damn sexism or prejudice for that matter.
     
  16. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    330
    Both GTX680 and HD7970 are about ~30% faster than GTX580 @2560x1600.

    Some review sites say a little less (e.g. techpowerup.com @ 27%), some say a bit more (e.g. computerbase.de @ 33%). But ~30% should be a very good number to go by.

    Lower resolutions are a totally different story, of course.
     
  17. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    6,719
    Location:
    Well within 3d
    There are swizzling ops indicated for GCN that are LDS instructions that don't take up any LDS, but must be using the crossbar logic to move data between lanes.
    That might be similar, or a subset of Nvidia's instruction.
     
  19. Dade

    Newcomer

    Joined:
    Dec 20, 2009
    Messages:
    206
    Please keep in mind that the LuxMark kernel is several thousand lines long, while a Mandelbrot kernel is just 10-20 lines of code. The kinds of load and bandwidth requirements are simply too different to be compared.

    Blender/Cycles (a CUDA path tracer) users are reporting the same kind of results shown by LuxMark: the 580 is faster than the 680. So it doesn't look like an OpenCL-specific problem.
     
