NVIDIA Kepler speculation thread

Discussion in 'Architecture and Products' started by Kaotik, Sep 21, 2010.

  1. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Look at the die shot. It matches with 4 schedulers as well.


    Dual issue means executing 2x more instructions. SFU/L/S ops, in all probability, cannot be issued on top of the 192 ALUs present. If single issue feeds 128 ALUs, then what about the 192 ALUs?
     
  2. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    If that is the case then why have 4 schedulers?
     
  3. denev2004

    Newcomer

    Joined:
    Apr 28, 2010
    Messages:
    143
    Likes Received:
    0
    Location:
    China
    I've even forgotten how many threads a scheduler issues in one cycle...
     
  4. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    I'm not sure what you're asking. The number of ALUs alone doesn't dictate the maximum number of instructions issued per clock. Each of the four schedulers can issue 2 instructions per clock for a total of 64 across the chip. Granted it can't do this on every cycle because some execution units can't accept a new instruction every clock (SFU, L/S) and there aren't enough SIMDs to go around. There might not be enough instructions to go around either.
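The arithmetic above can be sketched roughly as follows. This is a back-of-envelope model, not a description of the real hardware: the count of 8 SMX units per chip is taken from GK104's public spec rather than from this thread, the function names are hypothetical, and it assumes each group of 32 ALUs accepts one warp-wide instruction per clock.

```python
# Back-of-envelope issue-rate model for a GK104-like chip.
# Assumption (not stated in the thread): 8 SMX units per chip.
SMX_PER_CHIP = 8
SCHEDULERS_PER_SMX = 4   # the "4 schedulers" discussed above
ISSUE_PER_SCHEDULER = 2  # dual issue: up to 2 instructions per clock
ALUS_PER_SMX = 192
WARP_WIDTH = 32          # threads covered by one warp-wide instruction


def peak_issue_per_clock(smx=SMX_PER_CHIP,
                         schedulers=SCHEDULERS_PER_SMX,
                         issue=ISSUE_PER_SCHEDULER):
    """Upper bound on instructions issued per clock across the chip."""
    return smx * schedulers * issue


def alu_slots_per_smx(alus=ALUS_PER_SMX, warp_width=WARP_WIDTH):
    """Warp-wide ALU instructions one SMX can accept per clock (simplified)."""
    return alus // warp_width


print(peak_issue_per_clock())                    # chip-wide issue ceiling
print(SCHEDULERS_PER_SMX * ISSUE_PER_SCHEDULER)  # issue slots per SMX
print(alu_slots_per_smx())                       # ALU slots per SMX
```

Under these assumptions each SMX has 8 issue slots but only 6 warp-wide ALU slots per clock, so the remaining slots would have to be filled by SFU or L/S instructions, if at all.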
     
  5. Ninjaprime

    Regular

    Joined:
    Jun 8, 2008
    Messages:
    337
    Likes Received:
    1
    I'm starting to wonder if there is a "big Kepler" for the desktop or if this is just a Tesla/compute-oriented modification of the current Kepler. With the 680 at $500, it's not like they can really charge more for a single-GPU card... Would they even bother making a bigger GPU for a card that might sell 100k units or less? I suspect a dual-GPU 690 card is the new planned high end and GK110 is a Tesla-specific part.
     
  6. AlphaWolf

    AlphaWolf Specious Misanthrope
    Legend

    Joined:
    May 28, 2003
    Messages:
    9,470
    Likes Received:
    1,686
    Location:
    Treading Water
    Selling products for more than $500 has never been a problem for them in the past. And it's possible that the 680 won't be positioned where it is now when GK110 is ready.
     
  7. AndrewM

    Newcomer

    Joined:
    May 28, 2003
    Messages:
    219
    Likes Received:
    2
    Location:
    Brisbane, QLD, Australia
    What if GK110 is marketed as the 780, and GK114 as the 760?
     
  8. kalelovil

    Regular

    Joined:
    Sep 8, 2011
    Messages:
    568
    Likes Received:
    104
    Current prices will come down as the manufacturing process matures.
    No matter how good the 680 is, the market for it will be limited so long as the price remains at US$500.
    It is better for Nvidia if they can sell 200,000 GK104s at an average of $350 each rather than 40,000 GK104s at $500 each (using completely hypothetical numbers).
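To make the comparison concrete, here is a minimal sketch using the same hypothetical numbers. It only compares gross revenue; per-chip cost and margin are not given in the thread and are ignored, and the function name is made up for illustration.

```python
def revenue(units, avg_price_usd):
    """Gross revenue in USD; ignores manufacturing cost and margins."""
    return units * avg_price_usd


# Hypothetical figures from the post above, not real sales data.
high_volume = revenue(200_000, 350)
low_volume = revenue(40_000, 500)

print(f"200,000 units at $350: ${high_volume:,}")
print(f" 40,000 units at $500: ${low_volume:,}")
print(f"ratio: {high_volume / low_volume:.1f}x")
```

With these inputs the lower price brings in 3.5 times the revenue, although whether it also brings in more profit depends on the cost per chip.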
     
  9. seahawk

    Regular

    Joined:
    May 18, 2004
    Messages:
    511
    Likes Received:
    141
    Only if you get enough wafers to make 200,000 GK104s while honoring the OEM deals for the smaller chips as well. And before the end of the year, it is unlikely that TSMC will be able to meet the demand for 28nm wafers from AMD, Qualcomm and NV.
     
  10. UniversalTruth

    Veteran

    Joined:
    Sep 5, 2010
    Messages:
    1,747
    Likes Received:
    22
    Yes, and 780's performance might be something like this:

    [IMG]

    Well, if the performance improvements presented in the previous slide are close to real (even if the slide is fake, that doesn't mean Nvidia can't launch such a monster), then AMD can do almost nothing to escape the humiliation (let's use this wordie). ;)
     
  11. NathansFortune

    Regular

    Joined:
    Mar 3, 2009
    Messages:
    559
    Likes Received:
    0
    GK110 will be the GTX780 ready for the end of this year (probably around November) and the 104 will be bumped down to GTX760Ti.
     
  12. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    I haven't tracked down all the benchmarks, and that helpful watermark blots out some pretty important settings. However, from the few I've compared, it looks like the big chip is 30-50% better in several of those games, which seems to fit the likely bump in die size and TDP.
     
  13. jlippo

    Veteran

    Joined:
    Oct 7, 2004
    Messages:
    1,744
    Likes Received:
    1,090
    Location:
    Finland
    Did some tests.
    Results in fps, listed as GTX480, GTX680:

    Fluid3D_2
    11, 40
    ------------

    Mandeldx11

    iterations = 2048
    vector: 178, 372
    Scalar: 195, 404
    Double: 18, 15
    ------------

    Julia4D
    dx11 compute shader

    full detail
    without shadows: 146, 300
    with shadows: 99, 210
    ---

    Seems like doubles took a hit, but otherwise there seems to be some progress.
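For reference, the relative speedups implied by those figures can be computed with a short script. The fps values are copied from the results above; the dictionary labels and the `speedup` helper are just illustrative names.

```python
# fps pairs as (GTX480, GTX680), copied from the results above.
results = {
    "Fluid3D_2": (11, 40),
    "Mandeldx11 vector": (178, 372),
    "Mandeldx11 scalar": (195, 404),
    "Mandeldx11 double": (18, 15),
    "Julia4D without shadows": (146, 300),
    "Julia4D with shadows": (99, 210),
}


def speedup(gtx480_fps, gtx680_fps):
    """GTX680 fps relative to GTX480 fps; below 1.0 means a regression."""
    return gtx680_fps / gtx480_fps


for name, (gtx480, gtx680) in results.items():
    print(f"{name:<26} {speedup(gtx480, gtx680):.2f}x")
```

The single-precision tests come out at roughly 2x or better, while the double-precision Mandelbrot run drops below 1x.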
     
  14. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    There's progress in singles as long as the compiler has been optimized for it, or for the things the program does. If it hasn't, LuxMark happens.
     
  15. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Interesting. Are those CUDA, OCL, or DirectCompute? (edit: never mind, it's DirectCompute.)
    The numbers here are more or less what you'd expect given the clocks and numbers of ALUs.

    Is there something particularly taxing on LuxMark that's not present in these tests?
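As a rough check on the "clocks and numbers of ALUs" expectation, here is a sketch of the theoretical single-precision throughput ratio. The ALU counts and clocks are assumptions taken from the public spec sheets, not from this thread: 480 ALUs at a 1401 MHz shader clock for the GTX 480, and 1536 ALUs at a 1006 MHz base clock (boost ignored) for the GTX 680.

```python
# Assumed specs from public spec sheets, not from this thread.
GTX480 = {"alus": 480, "clock_mhz": 1401}   # shader (hot) clock
GTX680 = {"alus": 1536, "clock_mhz": 1006}  # base clock, boost ignored


def peak_gflops(card):
    """Theoretical FP32 peak, counting one FMA (2 flops) per ALU per clock."""
    return 2 * card["alus"] * card["clock_mhz"] / 1000


ratio = peak_gflops(GTX680) / peak_gflops(GTX480)
print(f"GTX480: {peak_gflops(GTX480):.0f} GFLOPS")
print(f"GTX680: {peak_gflops(GTX680):.0f} GFLOPS")
print(f"ratio:  {ratio:.2f}x")
```

Under these assumptions the theoretical ratio is about 2.3x, in the same range as the roughly 2x measured in the Mandelbrot and Julia4D tests.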
     
  16. Mianca

    Regular

    Joined:
    Aug 7, 2010
    Messages:
    333
    Likes Received:
    19
    Tahiti is about 70% bigger than Pitcairn, and about 30% faster @2560x1600 in real gaming benchmarks (while running at a mere 925 MHz).

    Some people consider that a complete and utter fail.

    Now GK110 is rumored to be about 80% bigger than GK104, and there's an alleged slide that says it's about 40% faster than GK104 @2560x1600 in some marketing-picked benchmarks.

    The same people who consider Tahiti an utter fail now proclaim that GK110 is a chip of many wonders, and ponder that it's about to totally humiliate AMD.

    EDIT: Did I mention that Tahiti was 3 months early (compared to Pitcairn) - and GK110 will probably be 6-9 months late?
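The comparison can be put in numbers with a small sketch. It uses only the approximate percentages quoted above (the GK110 figures being rumor), and "scaling efficiency" is a made-up name for relative performance divided by relative die area.

```python
def scaling_efficiency(area_gain, perf_gain):
    """Relative performance per relative die area versus the smaller chip.

    Both gains are fractions: 0.70 means "70% bigger" or "70% faster".
    """
    return (1 + perf_gain) / (1 + area_gain)


# Approximate figures quoted in the post; the GK110 numbers are rumored.
tahiti_vs_pitcairn = scaling_efficiency(area_gain=0.70, perf_gain=0.30)
gk110_vs_gk104 = scaling_efficiency(area_gain=0.80, perf_gain=0.40)

print(f"Tahiti vs Pitcairn: {tahiti_vs_pitcairn:.3f}")
print(f"GK110 vs GK104:     {gk110_vs_gk104:.3f}")
```

Both ratios land at roughly 0.77, so on these figures the two big chips scale about equally well relative to their smaller siblings.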
     
    #3756 Mianca, Mar 27, 2012
    Last edited by a moderator: Mar 27, 2012
  17. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    I believe that ray tracing, generally speaking, is heavily reliant on cache, which GK104 has little of. But given the extent to which the GTX 680 tanks in LuxMark, there might be other factors at play.
     
  18. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    The bandwidth to the L2 in Kepler has been doubled, though. The larger RF guarantees more threads in flight, but the unchanged L1 data cache size won't make spilling any more graceful.
    Sadly, with the poor state of NV's OpenCL 1.1 runtime, there's no easy way to qualify how the new architecture handles more complex compute tasks using this API. And I think there's still a pending update of CUDA that covers the new Compute Capability 3.0 for Kepler.
     
  19. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    This is incorrect IMHO. You're jumping to a rather strong conclusion based on the simple fact that their CL stack is somewhat bugged, and not quite a priority. Whilst this looks bad for LuxMark and OpenCLBench, its relevance in the real world is pretty tame, and I wouldn't look at it for actual insight into how the constraints on their compiler efforts have shifted.
     
  20. Sxotty

    Legend

    Joined:
    Dec 11, 2002
    Messages:
    5,496
    Likes Received:
    866
    Location:
    PA USA
    That was how I read it too which is confusing. I would rather have actual data vs. inferred data any day of the week.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.