NVIDIA Kepler speculation thread

Discussion in 'Architecture and Products' started by Kaotik, Sep 21, 2010.

Tags:
  1. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
  2. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
I really like the techreport reviews, always go there first. I just wish they would look at more games. It's enough to get a trend, but if you go into gory perf detail like they do, five games is too small a sample to draw broad conclusions.

    There seem to be 2 different issues at play: one huge hitch in Arkham and some warmup slowdown in Crysis 2.

(If anyone at TR is reading this: please normalize the horizontal axis. Right now, it's impossible to correlate spikes between different GPUs.)
     
  3. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Huh? Not saying it's the compiler, but if you have on-demand compilation of shaders as you move along in a level, this is the kind of stuff you could see.
That being said, a compiler issue doesn't explain why you see spikes on the GTX 5xx and not on the GTX 680 in a number of tests, so maybe memory management is a more likely cause?
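The on-demand compilation scenario above can be sketched as a lazy cache. This is purely illustrative (no real driver works exactly like this; all names are made up): the first draw that uses a shader pays the full compile cost inside the frame, which is exactly the kind of one-off hitch that shows up in frame-time plots, while later uses hit the cache.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical sketch, not a real driver: on-demand shader compilation means
// the first use of a shader pays the compile cost in-frame (a spike); later
// uses are a cheap cache lookup.
struct CompiledShader {
    std::string isa;  // stand-in for GPU-specific machine code
};

class LazyShaderCache {
public:
    int compiles = 0;  // number of in-frame compiles (potential hitches)

    const CompiledShader& get(const std::string& source) {
        auto it = cache_.find(source);
        if (it == cache_.end()) {
            ++compiles;  // this is where the frame-time spike would occur
            it = cache_.emplace(source, CompiledShader{"isa:" + source}).first;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, CompiledShader> cache_;
};
```

Under this model the spike happens once per shader, which matches a hitch that appears the first time a new area of a level is rendered and never again.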
     
  4. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Nobody does that. Real time rendering is hard enough without trying to do real time compilation.

    For that, I'd say 680 is better than 5xx.
     
  5. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
TR is doing a great thing with its 99th-percentile measurements, but they need to stop putting out only frame times and start publishing 99th-percentile frame rates in all the reviews, not just in the conclusion.
     
  6. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Can DP co-issue with CUDA SP cores?

    I can certainly make quite the list for why this seems like a bad idea.
    I wonder what the good reason was....
     
  7. dkanter

    Regular

    Joined:
    Jan 19, 2008
    Messages:
    360
    Likes Received:
    20
    If you are running code that is sensitive to DP performance on a Kepler, then you are doing something wrong.

    DP is to Kepler as x87 is to Sandy Bridge.

    DK
     
  8. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    That doesn't make sense: how does a compiler compile the shaders if it hasn't seen them?

It's up to the application to decide when to present a particular shader program to the GPU (at least for OpenGL it is; see here for example: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=279913 ). There is no requirement to list all shaders to be used at context creation.

    I'm sure it's more predictable to declare all shaders up front, but in large worlds with many different material shaders, that would be prohibitively costly by itself.
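The "prohibitively costly" point comes from combinatorics: material shaders are typically generated from independent feature toggles, so the number of variants to compile up front is the product of the option counts. A toy illustration (all feature counts here are made up):

```cpp
// Toy illustration of shader-variant explosion: if materials are generated
// from independent feature toggles, the up-front compile count is the product
// of the option counts. The numbers are invented for illustration only.
long long shaderVariantCount(int lightTypes, int shadowModes,
                             int skinningModes, int fogModes) {
    return 1LL * lightTypes * shadowModes * skinningModes * fogModes;
}
```

Even a handful of toggles multiplies out quickly, which is why compiling every possible variant at startup can dwarf the cost of compiling only the ones a level actually uses.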
     
  9. dkanter

    Regular

    Joined:
    Jan 19, 2008
    Messages:
    360
    Likes Received:
    20
  10. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    If you've got so many shaders that it makes level load times unacceptably long, why not just compile them all at game install time and cache the binaries somewhere?
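That suggestion can be sketched with a stubbed compiler: compile each shader once (e.g. at install time), key the binary by a hash of the source, and make level loads a pure lookup. In real OpenGL the binary would come from glGetProgramBinary and be restored with glProgramBinary; everything below (names, the string "binary", the map standing in for a disk cache) is illustrative.

```cpp
#include <functional>
#include <map>
#include <string>

// Sketch of install-time shader compilation with a disk cache. The backend
// compiler is a stub; in a real engine the cached blob would come from
// glGetProgramBinary / glProgramBinary (or an equivalent driver facility).
static int g_backendCompiles = 0;                        // counts slow compiles
static std::map<std::size_t, std::string> g_binaryCache; // stand-in for "on disk"

static std::string compileBackend(const std::string& src) {
    ++g_backendCompiles;          // the slow, GPU-specific step
    return "bin:" + src;
}

const std::string& loadShaderBinary(const std::string& src) {
    std::size_t key = std::hash<std::string>{}(src);
    auto it = g_binaryCache.find(key);
    if (it == g_binaryCache.end())
        it = g_binaryCache.emplace(key, compileBackend(src)).first; // install time
    return it->second;            // level load: lookup only, no compile
}
```

The catch, as discussed later in the thread, is whether the API lets you get at a GPU-specific binary at all; with only source-to-IR precompilation, the final backend compile still happens in the driver.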
     
  11. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    When you load a level, compile all the shaders you need, load all the textures you need ...

Read that thread through. That person was using the legacy fixed-function pipeline, and nobody uses that anymore. Changing state there forces regeneration/recompilation of shaders.


Less costly than shader compilation in-frame. Not sure if DX offers the option of offline compilation.
     
  12. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Well, yes, that's what I suggested myself. But it's not an API requirement and nobody prevents you from doing otherwise.

    See also here, http://developer.amd.com/afds/assets/presentations/2902_2_final.pdf, slide 4: Could happen at any time before “dispatch/draw”

    Gathering all shaders and textures of a full world may not be practical.

There's the option to precompile the textual source code into Microsoft's intermediate format, of course, but I doubt you can precompile the final GPU-specific assembly.
     
  13. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    That does seem to be the fugly wart bit. BTW -- Kepler, or just GK104, or maybe GK10*? Hard to believe they'd let compute sit in the cold for a whole release cycle.

    -Dave
     
  14. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
It's not required, but it would be dumb not to. Games go to great lengths to avoid touching the driver in-frame.
That's for first shader compilation, not recompilation. Worst case, it affects the first frame.
I don't see any other choice.
AFAIK, on DX the driver never sees the shader text, only the IR. Maybe a DX dev can tell us if there is a way to cache the final GPU-specific code.
     
  15. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Ok. So we agree there, then.

    AFAIK, there is no restriction about when it can happen. Again, that doesn't make it wise to do it in the middle of things.

    I know for a fact that the OpenGL driver sees the full text, because that's precisely what happens for iPhone code.

DX has the following call to convert from source to bytecode:
    http://msdn.microsoft.com/en-us/library/windows/desktop/dd607324(v=vs.85).aspx
     
  16. PowerK

    Regular

    Joined:
    Jun 12, 2004
    Messages:
    312
    Likes Received:
    0
    Location:
    Seoul, Korea
The second-hand VGA market is flooded with 7970s...
     
  17. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    4
[Images: overclocking results charts]
Interesting results in overclocking the Radeon 7970 and GeForce GTX 680.
     
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
Anandtech's BF3 results with FXAA show some of the biggest margins of improvement over the 580, and also over the 7970. Could this be due to the 680's FP16 throughput advantage?

    Really? Where are those numbers being pulled from?
    I'm sure there are those itching to get the latest, but I'd almost wait for the big daddy to come out before making a modest hop to the side.
     
  19. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    4,309
    Likes Received:
    1,107
    Location:
    35.1415,-90.056
I'm quite sure he was joking; it would make no sense for someone to buy a 7970 on the first day it was available, then sell it at a loss two months later, only to buy a 680 on the first day it was available.

While the 680 is arguably faster than the 7970, it isn't that much faster. If you were an epic NV fanboy, you'd never purchase the AMD card. If you were an epic power-miser, you'd never buy something in the top tier. And if you were an epic AMD fanboy, the 680 doesn't bring enough to the table (IMO) to swing your vote.

I like the 7970 because of AMD's proven implementation of PowerTune and how overclocking fits into it. I'm very hesitant about NV's very first incarnation of this Boost Clock business, especially when it comes to the oddities involved in overclocking. That is basically the only reason why I'd lean towards the 7970; pretty much everything else looks better on the 680 front.

I'm pretty sure Boost Clock is going to get better, maybe even this generation with just better drivers, but certainly in later generations with more feedback from users and VARs. I'm just not sure I'd want to be an early adopter of this particular tech, based on what little I've seen so far.
     
  20. cal_guy

    Newcomer

    Joined:
    Jun 27, 2008
    Messages:
    217
    Likes Received:
    3
Speaking of turbo: some of AMD's Bobcat-based CPUs have GPU turbo. How does AMD's method work?
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.