NVIDIA Maxwell Speculation Thread

Discussion in 'Architecture and Products' started by Arun, Feb 9, 2011.

  1. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,503
    Likes Received:
    420
    Location:
    Varna, Bulgaria
    Source

    Why is the shared memory still reported as 48K?
     
  2. Bob

    Bob
    Regular

    Joined:
    Apr 22, 2004
    Messages:
    424
    Likes Received:
    47
    Shared memory per block is limited to a smaller value than the total shared memory capacity per SM.
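To make the distinction concrete, here is a minimal Python sketch of how a per-block limit and a per-SM capacity interact. The 48 KB per-block figure is the one reported above. The 64 KB per-SM capacity is an assumption based on NVIDIA's published GM107 figures, not something stated in this thread, and `blocks_per_sm` is a hypothetical helper that ignores every other occupancy limit (registers, maximum resident blocks, and so on).

```python
# Hypothetical helper: how many thread blocks could be resident on one SM
# if shared memory were the only limiting resource.
PER_BLOCK_LIMIT_KB = 48  # limit per block, as reported in the thread
PER_SM_CAPACITY_KB = 64  # assumed GM107 per-SM capacity (not from the thread)


def blocks_per_sm(shared_kb_per_block,
                  per_block_limit_kb=PER_BLOCK_LIMIT_KB,
                  per_sm_capacity_kb=PER_SM_CAPACITY_KB):
    """Return the number of blocks that fit in one SM's shared memory."""
    if shared_kb_per_block <= 0:
        raise ValueError("shared memory request must be positive")
    if shared_kb_per_block > per_block_limit_kb:
        # A single block may not ask for more than the per-block limit,
        # even though the SM as a whole has more capacity.
        raise ValueError("request exceeds the per-block shared memory limit")
    return per_sm_capacity_kb // shared_kb_per_block


for request_kb in (16, 32, 48):
    print(request_kb, "KB per block ->", blocks_per_sm(request_kb), "block(s) per SM")
```

Under these assumptions a block that asks for the full 48 KB still leaves 16 KB of the SM's capacity for nothing else, while smaller requests let several blocks share the same SM.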
     
  3. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    I have no idea: it's the first time I've seen these latency graphs for a GPU. Very surprising to see them on TomsHardware of all places.
     
  4. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
  5. pixelio

    Newcomer

    Joined:
    Feb 17, 2014
    Messages:
    47
    Likes Received:
    75
    Location:
    Seattle, WA
  6. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
  7. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Sorry. I should have come back and corrected my post. True, it's overpriced, which means having to wait six months or a year. Nvidia pulled the same thing with the GTX 650.
    Nvidia does that all the time: it introduces new GPUs at a high price, then prices them down when AMD can compete or when the previous Nvidia generation is EOL. AMD introduces stuff at more affordable prices from the start.
     
  8. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,512
    Likes Received:
    930
    You've either said far too much or far too little. Far too little for my taste. :D How is the result bogus? How will it be addressed? Why the quotation marks?
     
  9. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,296
    Likes Received:
    3,626
    Location:
    Well within 3d
    The Sandra latency benchmarks have been questionable before.
    Their L1 numbers for AMD's VLIW GPUs are about double the latencies measured in arithmetic loops, for example. The measured numbers were still very high, just not that high.
    I haven't come across a reason why that was the case, or why this pattern of overestimation persists with the 270.
     
  10. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
    Can I ask: does GM107 have dedicated DP cores? I ask because I read this here:

    We're not sure what to think about GM107's increasingly hobbled FP64 capabilities. You can either say double-precision performance is really bad, or the single-precision numbers are really good. Regardless, at the end of the day, artificial limitations meant to prevent cheap desktop cards from being viable workstation parts are no less irritating.

    http://www.tomshardware.com/reviews/geforce-gtx-750-ti-review,3750-16.html
     
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,429
    Likes Received:
    181
    Location:
    Chania
    From what I've understood so far there are 4 FP64 SPs in each SMM. With 5 clusters you'd have 20 FP64 SPs for the GM107. It's not much more compared to the 16 SPs in GK107 but it isn't less either.

    And I'm obviously missing something in that Tom's link, since the DP results they're showing are anything but bad for such a humble GPU, especially if I consider how damn expensive a Quadro K5000 is and where that one lands compared to a 750Ti. If the entire thing is based on the 1:24 vs. 1:32 thingy then it's rather nonsense. With 2.5x more clusters than a GK107 it isn't necessarily an absurd design decision, always of course under NV's typical reasoning for those kinds of things.
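The arithmetic behind those figures can be sketched as follows. The FP64 totals (20 for GM107, 16 for GK107) and the 4-per-SMM count come from the post above. The FP32 counts per cluster (128 per SMM, 192 per SMX) and the 2-cluster GK107 layout are assumptions taken from NVIDIA's published specifications, and `fp64_summary` is a hypothetical helper.

```python
from fractions import Fraction


def fp64_summary(clusters, fp32_per_cluster, fp64_per_cluster):
    """Return (total FP64 units, FP64:FP32 unit ratio) for a chip."""
    total_fp64 = clusters * fp64_per_cluster
    ratio = Fraction(fp64_per_cluster, fp32_per_cluster)
    return total_fp64, ratio


# GM107: 5 SMMs with 4 FP64 units each (from the post);
# 128 FP32 units per SMM is an assumption.
gm107_total, gm107_ratio = fp64_summary(clusters=5, fp32_per_cluster=128, fp64_per_cluster=4)

# GK107: 16 FP64 units in total (from the post), assumed to be split over
# 2 SMX clusters with 192 FP32 units each.
gk107_total, gk107_ratio = fp64_summary(clusters=2, fp32_per_cluster=192, fp64_per_cluster=8)

print("GM107:", gm107_total, "FP64 units, ratio", gm107_ratio)
print("GK107:", gk107_total, "FP64 units, ratio", gk107_ratio)
print("Cluster count ratio:", 5 / 2)
```

Under these assumptions the per-cluster ratio drops from 1:24 to 1:32, yet the total number of FP64 units still rises from 16 to 20 because the cluster count goes up 2.5x.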
     
  12. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    From the wording of Tomshardware ('artificially hobbled') it's almost as if they believe that gm107 has more DP units than reported.
     
  13. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,018
    Likes Received:
    114
    You have to keep in mind that this test doesn't really depend on DP rate all that much. If you compare Titan to 780Ti, for instance, the former is faster, yes, but merely by 40% or so.
    That said, you'd think at least all the low-DP-performance chips (that is, those with 1:16-1:32 DP:SP performance) would rank somewhere close to their DP capability (since you'd think that with such low DP performance other bottlenecks would disappear), but that's not the case either. The 750Ti is indeed still very close to the HD 7790 even though the latter has more than twice the DP flops (though the 7790 does make up some ground there compared to the single-precision results, where the 750Ti beats it).

    I think for technical analysis you shouldn't trust Tomshardware :).
     
  14. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    Just when I thought they redeemed themselves a bit by being the only ones to publish this latency graph. Must be that analysis is harder than running a benchmark... :wink:
     
  15. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
    Ok, thank you very much.
     
  16. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    946
    Likes Received:
    46
    Location:
    LA, California
    rpg.314's point about L1 misses mainly going to L2 also makes sense, and I also didn't know that L1 mainly cached stacks. Sounds like maybe things might be different for later chips.

    Well, that settles that then. Thanks for the clarifications!
     
    #1156 psurge, Feb 19, 2014
    Last edited by a moderator: Feb 19, 2014
  17. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    946
    Likes Received:
    46
    Location:
    LA, California
    Comparing the Tom's Hardware latency graph with the Haswell numbers here, it looks like NVIDIA's 24KB L1 latency is about 3x Haswell's 6MB L3 latency, or comparable to Intel's 128MB off-chip L4 latency (for in-page random loads, which apparently means an access pattern that avoids TLB misses). And that's comparing just cycle counts; Haswell's cycle time is less than half that of the GM107.

    I'm wondering if Sandra might be launching a bunch of warps/wavefronts that all do memory access and end up getting scheduled in round robin fashion, so that what Sandra really measures is something more like the size of the scheduler's queue of warps/wavefronts rather than actual cache latency.
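This hypothesis can be illustrated with a toy model. It is only a sketch under stated assumptions, not a description of how Sandra or the hardware scheduler actually works. It assumes one scheduler visits warps in strict round-robin order, each issue slot costs `issue_cycles`, and each warp runs a dependent load chain, so it cannot issue its next load until the previous one has returned after `true_latency` cycles. All names and numbers are hypothetical.

```python
def apparent_latency(n_warps, true_latency, issue_cycles, rounds=8):
    """Average cycles between consecutive loads of warp 0 in a toy
    strict round-robin scheduler (hypothetical model)."""
    ready = [0] * n_warps  # cycle at which each warp's previous load returns
    t = 0                  # cycle at which the scheduler is next free
    warp0_issues = []      # issue times of warp 0, one per round
    for _ in range(rounds):
        for w in range(n_warps):
            # Strict round-robin: wait for this warp if its load is still in flight.
            start = max(t, ready[w])
            if w == 0:
                warp0_issues.append(start)
            ready[w] = start + true_latency
            t = start + issue_cycles
    return (warp0_issues[-1] - warp0_issues[0]) / (rounds - 1)


# Hypothetical numbers: a 100-cycle cache and a 4-cycle issue slot.
for n in (1, 10, 64):
    print(n, "warps ->", apparent_latency(n, true_latency=100, issue_cycles=4), "cycles per load")
```

In this model a benchmark that times one warp's load chain sees the true latency only while `n_warps * issue_cycles` stays below it. Beyond that point the measured figure tracks the length of the scheduler's queue, which is the effect the post speculates about.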
     
  18. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland

    To be honest, I have some real doubts about the SP (and DP) compute benchmark numbers we have seen so far for Maxwell (well, the 750TI)... I have a hard time seeing how a 750TI, with half the CUDA cores of the GTX680, can get the same numbers in the SP tests in F@H... with similar clock speeds... Especially when this is absolutely not reflected in any graphics benchmark... We are talking about OpenCL here and not CUDA, so the CUDA version is not the cause... So drivers? Software? A bug? In reality, the compute benchmarks I have seen are absolutely all over the place: from expected SP performance (given that the GPU sits between a 750TI and 750TI boost) to the extremely surprising F@H SP results (close to the 680).
     
    #1158 lanek, Feb 19, 2014
    Last edited by a moderator: Feb 19, 2014
  19. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,429
    Likes Received:
    181
    Location:
    Chania
    If NV keeps up that pace and Maxwell cores below the GM200 have slightly more DP units than their predecessors, then GM204 might end up way more powerful than I had imagined so far.
     
  20. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Not on GM204... I'm 99.99% sure that they will cripple DP again and disable DP units.
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.