NVIDIA Kepler speculation thread

Discussion in 'Architecture and Products' started by Kaotik, Sep 21, 2010.

  1. xDxD

    Regular

    Joined:
    Jun 7, 2010
    Messages:
    412
    Likes Received:
    1
  2. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Maybe the driver still needs tuning for Kepler, since the scheduling now must be JIT-ed?
     
  3. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    Or... memory access isn't very coherent, and smaller cache size/compute results in more misses. Also, long kernels should have high register usage, which might limit the number of warps per SMX and accordingly hinder latency hiding ability, perhaps more than on 580. (I could of course be wrong, but the static scheduling thing doesn't sound so incredibly complicated compared to e.g. scheduling in AMD's VLIW4/5 regime).
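psurge's register-pressure point can be sketched numerically. The register-file and warp limits below are the published per-SM figures for GF110 (the GTX 580's Fermi SM) and GK104 (the Kepler SMX); the 63-registers-per-thread kernel is a made-up example of a "long kernel" with high register usage, not a measured workload.

```python
# Sketch: resident warps per SM are capped by the register file, and fewer
# resident warps per ALU means less ability to hide memory latency.

def resident_warps(regfile_regs, max_warps, regs_per_thread, warp_size=32):
    """Warps that fit in one SM given per-thread register usage."""
    limited_by_registers = regfile_regs // (regs_per_thread * warp_size)
    return min(limited_by_registers, max_warps)

# Hypothetical long kernel using 63 registers per thread:
fermi_warps  = resident_warps(32768, 48, 63)   # GF110 SM: 32K regs, 48 warps max
kepler_warps = resident_warps(65536, 64, 63)   # GK104 SMX: 64K regs, 64 warps max

# Warps available per ALU lane (32 ALUs per GF110 SM, 192 per GK104 SMX):
# Kepler holds more warps in absolute terms, but far fewer per ALU.
print(fermi_warps / 32, kepler_warps / 192)
```

Under these assumptions the SMX holds twice the warps of a Fermi SM, but spread over six times the ALUs, which is one way to read the "hindered latency hiding" claim.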
     
  4. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
I think the compiler is being hyped far beyond what it's actually responsible for. From what NV has described, it doesn't appear to be anything more than what GCN does.

It's simple dual issue, for crying out loud. That's pretty simple with a half-decent ISA, assuming the workload allows it at all, which it should, since AMD was using VLIW5 not too long ago.

Far bigger is the castration of latency hiding. Vis-à-vis GF104, there's 4x more compute and only 2x more registers, and shared memory/L1$ hasn't increased at all. It's not surprising that it's getting hammered in memory-intensive benchmarks like LuxMark.
    Their CL stack must be equally borked for both 580 and 680, since it generates PTX.
     
  5. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
Yeah, but does Pitcairn suck at compute? 'Cause the 680 does.

    I mean what I gather is, 680 is so efficient because Nvidia decided to finally separate "game focused GPU" and "compute focused GPU", with 680 being the former.

7970 still seems to follow the "compute+game together" model, therefore its gaming efficiency is lower.

But my question is, does Pitcairn suck at compute, relatively speaking? If not, then it's definitely pretty impressive, as it would appear to have 680-class gaming efficiency while retaining strong compute ability.

    Overall, I think what Nvidia did splitting compute/game was smart, and it will be best for AMD to follow suit (especially if process shrinks are stalling, it will be necessary), and hey, it's time for AMD to copy Nvidia for a change :razz:

It also might sadly be a necessity for AMD to unveil its own boost-type system, not because it's really better, but because it gives you a benchmark edge for reviews, which, as SB has been pointing out, is a big deal even if it's only 5%. But on that front, I think it's best to see how enthusiasts cotton to boost. So far it doesn't seem to be hurting them; the desire to own the card with the longer bars in all those review graphs far outweighs any user trepidation at the loss of traditional overclocking.


I still don't think the 7970 is too bad at all; it's pretty close to the 680 despite the compute burden. And again, AMD really needs to kick the default clock up to 1GHz, which would help close the perceived gap.

My guess is that with its compute part already close to the 680, a game-centric AMD part would once again trounce Nvidia in perf/mm. It doesn't appear AMD engineering has lost its superiority.

Edit: well, I could have googled it before asking, but it does appear Pitcairn's compute is just fine. Impressive indeed, then: http://images.anandtech.com/graphs/graph5699/45164.png
     
  6. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,426
    Likes Received:
    10,320
We can't really say yet whether the GTX 680 sucks at consumer-focused compute, as there are scenarios where it does quite well.

It's too early to determine whether its tragic failings in other situations are due to shortcomings/compromises in the architecture, or whether Nvidia just didn't care enough, or didn't have the time, to work on the driver frontend for it.

    Obviously HPC/professional level of compute is certainly compromised, but that has little impact on the consumer market.

    Regards,
    SB
     
  7. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    Well I mean, this was posted on GAF, and they really did consciously take compute out of 680. And kudos, it was smart.

    http://www.embedded.com/electronics...s-Kepler-to-get-compute-cousin--says-analyst-


     
  8. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,426
    Likes Received:
    10,320
    Yes, we already know that GTX 680 isn't appropriate for the HPC/professional market when it comes to compute. That link just reinforces that.

    What we don't know is whether any of that impacts the consumer compute workloads. Or if those corner cases are just due to lack of attention by Nvidia within the driver frontend. Or if it's a direct result of those compromises in making a more efficient consumer oriented GPU.

Hence why Pitcairn is a more appropriate architectural comparison despite the size and marketplace discrepancy. And in those areas where Pitcairn doesn't fall off a cliff like the GTX 680 does, we don't know whether it's purely architecture or insufficient work on the driver/software frontend of GK104 that is causing those pitfalls.

    Regards,
    SB
     
  9. vking

    Newcomer

    Joined:
    Jun 17, 2007
    Messages:
    15
    Likes Received:
    2
    Thermal management and power throttling are two different issues. So better to not combine them into a single discussion.

    Current measurement helps with staying within the bounds of EDP/TDP. Thermal is altogether another story.
     
  10. Bob

    Bob
    Regular

    Joined:
    Apr 22, 2004
    Messages:
    424
    Likes Received:
    47
    18 months ago, all we heard was how NVIDIA was abandoning graphics and was going to build HPC-only parts (or somesuch silliness). Today, I hear the opposite. Times sure change.
     
  11. Ninjaprime

    Regular

    Joined:
    Jun 8, 2008
    Messages:
    337
    Likes Received:
    1
    This sounds vaguely familiar...

    :razz:
     
  12. Cookie Monster

    Newcomer

    Joined:
    Sep 12, 2008
    Messages:
    167
    Likes Received:
    8
    Location:
    Down Under
Well, nVIDIA does have a history of selling single-GPU video cards above the $499~$599 mark, à la the 8800 Ultra ($799?), 7800 GTX 512MB ($999?), 6800 Ultra EE, etc.
     
  13. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    4
    #3793 Shtal, Mar 28, 2012
    Last edited by a moderator: Mar 28, 2012
  14. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    Will be interesting to see that benched against Kepler.

    And of course the Kepler overclocked variants too.
     
  15. RecessionCone

    Regular Subscriber

    Joined:
    Feb 27, 2010
    Messages:
    505
    Likes Received:
    189
    Don't forget GF104 had its ALUs running at 2x frequency.
    48 ALUs * 2 Ops/Hz = 96 ALU Ops/Hz (GF104)
    Versus
    192 ALUs * 1 Op/Hz = 192 ALU Ops/Hz
    (GK104)

    So GK104 doubled the compute and also doubled the registers.
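The hot-clock arithmetic above can be put side by side with the register numbers. The ALU and register-file counts below are the published per-SM figures (48 ALUs and 32K 32-bit registers per GF104 SM; 192 ALUs and 64K registers per GK104 SMX); normalizing to the core clock, as the post does, is just a modeling choice.

```python
# Per-SM ALU throughput normalized to the core clock, as in the post above.
# GF104's ALUs ran on a 2x "hot clock"; GK104's run at the core clock.

def ops_per_core_clock(alus, clock_multiplier):
    return alus * clock_multiplier

gf104_sm  = ops_per_core_clock(48, 2)    # 96 ALU ops per core clock
gk104_smx = ops_per_core_clock(192, 1)   # 192 ALU ops per core clock

# Register files: 32K 32-bit registers per GF104 SM, 64K per GK104 SMX.
regs_per_op_gf104 = 32768 / gf104_sm
regs_per_op_gk104 = 65536 / gk104_smx

print(gk104_smx / gf104_sm)   # 2.0: compute per SM doubled, not quadrupled
print(regs_per_op_gk104)      # same as GF104: registers kept pace per op
```

On this reckoning, registers per ALU op are unchanged from GF104 to GK104, which is the post's counterpoint to the "4x compute, 2x registers" framing.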
     
  16. Shtal

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    1,344
    Likes Received:
    4
I think AMD gave partners the green light to factory-overclock beyond the 1GHz limit because of Kepler's performance.
     
  17. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
Pitcairn is essentially the same as Tahiti: it has the same Compute Units with the same scheduler and the same memory hierarchy (scaled, of course), but it simply lacks ECC and has slower DP support. In SP workloads it should perform just like Tahiti (proportionately, obviously).
     
    #3797 Alexko, Mar 28, 2012
    Last edited by a moderator: Mar 28, 2012
  18. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
  19. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
An AMD gaming-centric part would require going back to the VLIW5 arch, and we are not sure that design can properly utilize resources beyond a certain limit, given that the ALU count is constantly on the rise while other aspects, like software, are not.
     
  20. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,791
    Likes Received:
    1,596
    Just saw this...(Nvidia future roadmap) http://forums.overclockers.co.uk/showpost.php?p=21532597&postcount=29

    Week old post but new to me.

    Not very exciting if true. Maybe I'll pull the trigger on that 7850 then.

Wish it had projected USA prices, though, instead of just UK ones, as I'm not sure what exactly those translate to. I'm assuming a roughly straight 1:1 conversion to dollars, though.

    Edit: appears USA dollar prices are somewhat higher, so it's even worse. >> http://videocardz.com/31551/geforce-600-roadmap-partially-exposed-gtx-670-ti-coming-in-may
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.