NVIDIA Maxwell Speculation Thread

Discussion in 'Architecture and Products' started by Arun, Feb 9, 2011.

  1. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    I agree -- that's where I would place my bets as well. The numbers don't work favorably for dp anything. If we assume 40W for the GPU, we wind up with ~10W for dp, and for the alu section, a LOT less. Full-rate dp is 20pJ/op, which would be ~13W. Area-wise, we'd be looking at ~64mm^2 for 640 units, which also seems too large. From a business perspective, this is a scaled-up mobile design, and dp alus in mobile are pointless. I don't see room for full-rate dp at all.
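A quick back-of-the-envelope check of the full-rate dp figures in the paragraph above, as a rough sketch only. The post does not state a clock speed or issue rate; the ~1 GHz clock and one dp op per unit per cycle used here are assumptions chosen because they reproduce the quoted ~13W, and the function names are purely illustrative.

```python
# Sanity check of the full-rate dp estimate from the post.
# From the post: 640 units, 20 pJ per dp op, ~64 mm^2 total, ~10 W dp budget.
# ASSUMPTIONS (not in the post): ~1 GHz clock, one dp op per unit per cycle.

PICOJOULE = 1e-12  # joules


def dp_power_watts(units, clock_hz, energy_per_op_j, ops_per_unit_per_cycle=1):
    """Power = ops per second * energy per op."""
    ops_per_second = units * ops_per_unit_per_cycle * clock_hz
    return ops_per_second * energy_per_op_j


def area_per_unit_mm2(total_area_mm2, units):
    """Average area attributed to each dp unit."""
    return total_area_mm2 / units


power = dp_power_watts(units=640, clock_hz=1.0e9, energy_per_op_j=20 * PICOJOULE)
budget_w = 10.0  # the post's rough dp share of an assumed 40 W GPU

print(f"full-rate dp power: {power:.1f} W")
print(f"relative to ~10 W budget: {power / budget_w:.2f}x")
print(f"area per unit: {area_per_unit_mm2(64.0, 640):.2f} mm^2")
```

Under those assumptions the estimate comes out around 12.8 W, i.e. over the ~10 W the post allots to dp as a whole, which is the point being made.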

    In theory, quarter-rate dp mul is pretty cheap on top of sp mads, but it's a pain to optimize for power using that kind of design for HPC (hence the dedicated units in Kepler, presumably). NV's two optimization problems are at the extremes, and they have two different market needs -- one would expect the designs in the middle borrow from one side or the other and don't have their own market-specific optimizations. Partial-rate dp seems unlikely from that perspective.

    One is left to wonder what the marketing guy was smoking when they used a GK110 block diagram instead of a GK107 one. The Anand article is interesting as well, because I don't see how HPC is really a scaled mobile design given the different market needs. There's a tension between optimizing for mobile and hpc, and reusing work across the product line. Given they're stuck on 28nm for a while, I'd be surprised if they pushed aggressively on the reuse side at this point. Pleasantly surprised, but surprised nonetheless.
     
  2. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Yep G7x & G80 were both on 90nm.
     
  3. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    Eh, not really.... Perf/Watt is your sun, moon, and stars in either case.
     
  4. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Yes, but your measurement of perf is different in the two different scenarios. All of NV's slides up to this point were talking about dp-ops/w, which makes sense for hpc. The optimal dp rate on mobile is epsilon of zero....
    Mobile cares very much about extremely low idle, hpc not so much. HPC couldn't care less about better nvenc implementations, while mobile would benefit from a more complex, inclusive system for video taking and photo enhancement. Once you get rid of the low-hanging fruit, optimization becomes highly domain specific.

    The stated low-hanging fruit was operation gathering, and presumably the register file has moved closer to the alus with the re-partitioning. But there has to be a path beyond the low-hanging fruit....
     
  5. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    This doesn't really impact architectural design.

    It isn't as if this is particularly costly (especially when it isn't being used) and will only become less significant over time.

    You seem to be focusing on things which are order(s) of magnitude beneath primary concerns.
     
  6. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    It's rather critical for the part of architectural design under debate -- whether or not dp is implemented.

    Oh, certainly possible. You're right, from a power budget standpoint, nvenc is de minimis in hpc. From an implementation (worker resource) budget, better camera support is pretty important in phone design, and people hired to work on that are not working on (say) hpc-specific problems (maybe >64bit fp support, which would be a boon for EM/plasma physics, at least). That said, low idle is absolutely critical for mobile, and I wouldn't consider that to be a non-primary concern at all. In fact, it kind of demonstrates my point -- the items of primary concern in mobile are not the same as the items of primary concern in hpc.
     
  7. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    They wouldn't be anyway...

    Well, we will have to agree to disagree on this one.

    I will say I find your perception of the importance of DP in mobile applications to be rather short-sighted.
     
  8. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    nvenc is most needed for Grid. The current GRID product (http://www.techpowerup.com/gpudb/1699/grid-k1.html) uses four GK107 chips. It stands to reason that GM107 will be used for this in the future and that this is the main reason for improving nvenc.
     
  9. no-X

    Veteran

    Joined:
    May 28, 2005
    Messages:
    2,455
    Likes Received:
    471
    RV670 -> RV770 (HD 3000 -> HD 4000)
     
  10. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    9,045
    Likes Received:
    1,119
    Location:
    WI, USA
    I think I'd go with Cayman. 40nm. A lot different. Initially planned for 32nm.


    I'm really curious about all this talk of significant perf/watt improvement... Can't wait to see a review.
     
  11. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Ha, yes. G92 to GT200 also, but the core architecture was similar. Perf/W wasn't as big a deal then as it is now, so I don't remember to what extent it went up or down.
     
  12. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    How do you figure 25%?
     
  13. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    The interconnect network is obviously there, just not shown.

    The absence of uniform/constant cache and the decoupling of L1 and shared memory is interesting. Perhaps the biggest changes would be on the memory side.

    I am disappointed at the meagre increase in shared memory. Maybe they'll surprise us with the L1 sizes.
     
  14. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    (0.9/0.85)^2. Note I explicitly wrote 'all other things equal' ! I also should have added 'for dynamic power'.
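The expression above follows the usual dynamic-power relation, where power scales with capacitance times voltage squared times frequency. The sketch below only evaluates the expression as written; it assumes 0.9 and 0.85 are supply voltages in volts (the post does not say so) and that capacitance and frequency are held equal, per the 'all other things equal' caveat. The function name is illustrative.

```python
# Dynamic power scales roughly as P_dyn ~ C * V^2 * f.
# ASSUMPTION: 0.9 and 0.85 are supply voltages (V); the post does not label them.


def dynamic_power_ratio(v_a, v_b, f_a=1.0, f_b=1.0):
    """Ratio of dynamic power at (v_a, f_a) to that at (v_b, f_b), same capacitance."""
    return (v_a / v_b) ** 2 * (f_a / f_b)


ratio = dynamic_power_ratio(0.9, 0.85)
print(f"(0.9/0.85)^2 = {ratio:.3f}")
```

Leakage (static power) does not follow this quadratic relation, which is why the post adds 'for dynamic power'.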
     
  15. DSC

    DSC
    Banned

    Joined:
    Jul 12, 2003
    Messages:
    689
    Likes Received:
    3
  16. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
  17. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    Is there something bad about having SFUs?
     
  18. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    DDR4 support on GM107: would that be likely? It would give a useful bandwidth increase.
    Then in 2015, when availability is better, DDR4 would go on low-end consumer cards.
     
  19. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    Not inherently, it's just worth noting that AMD integrated the special function handling into the main ALUs back with VLIW4 (it never was fully separate with VLIW5 either, but clearly the 5th alu lane there you could consider as a SFU), whereas they remain separate in NVIDIA's design, apparently. Could be for similar reasons to why DP units aren't integrated, though maybe that has changed with Maxwell.
     
  20. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    I always wondered why NVIDIA never added SFUs to their ALU count. The only reason I can think of is that they are very limited in function.

    I also wonder if AMD is counting SFUs in their GCN lineup.
     
