Why is AMD losing the next gen race to Nvidia?

Discussion in 'Architecture and Products' started by gongo, Aug 18, 2016.

  1. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    I found an article that has the 1070 at double, or almost double, the hash rate of the 480. Where did you get your benches from?

    http://cryptomining-blog.com/tag/rx-480-hashrate/
     
  2. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    647
    Likes Received:
    92
    Good points. Not to mention the Automotive side..there are certain dedicated resources for that segment as well. Plus AFAIK Nvidia spends a lot more on Software than AMD does.
    Chicken..meet egg.
     
  3. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I do not do mining, but isn't the issue more to do with Ethereum specifically, and needing to use Linux to work around it?
    Under Windows, Ethereum has low hashrates on Nvidia cards, including Pascal.
    Cheers
     
    #43 CSI PC, Aug 20, 2016
    Last edited: Aug 20, 2016
  4. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    I think I read that in one of the articles I was browsing, but they were waiting for a driver fix from Nvidia. But my point is that itsmydamnation seems to imply that a 1070 only matches a 470 in compute power in cryptocurrency and that the performance might be entirely ROP related.
     
  5. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,296
    Likes Received:
    395
    Location:
    Australia
    That's not what I said....lol
    I said (maybe very poorly) that, from what I have seen on the latest drivers, a 1070 and a 470 both get around 27 MH/s, both being limited by memory. If you OC the memory of either, they both go up. So I should be more specific and say 470s that have Samsung 8 Gbps memory modules on them: most people can take them to a clock of 2.2 and end up around 27-30 MH/s.

    Given that both the 1070 and 470 with equal memory can do equal hash rate, I was using that as a basis to say that in gaming workloads (you know, the things that use ROPs...lol) a lot of NV's advantage in terms of power consumption and need for less memory bandwidth could largely be from improved ROPs. But I don't think it's from compression; I think* it would be from not needing to move as much data around, even on chip.
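The "equal memory, equal hash rate" argument can be checked with a back-of-the-envelope bound. The sketch below assumes the access pattern described in the public Ethash specification (64 lookups of 128 bytes per hash) and assumes a 256-bit memory bus on both cards; the 8.8 Gbps figure is one reading of "a clock of 2.2" (quad-pumped GDDR5) and is not stated in the thread. It gives a theoretical ceiling, not a measurement. For reference, Ethereum hash rates on cards of this class are in the MH/s range.

```python
# Rough upper bound on Ethash hash rate for a purely bandwidth-bound GPU.
# Constants are taken from the public Ethash specification.
ETHASH_ACCESSES = 64    # DAG lookups per hash
ETHASH_MIX_BYTES = 128  # bytes fetched per lookup


def peak_bandwidth_bytes_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical memory bandwidth: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps * 1e9


def ethash_ceiling_mhs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Hash-rate ceiling in MH/s if every hash must stream 64 * 128 bytes from VRAM."""
    bytes_per_hash = ETHASH_ACCESSES * ETHASH_MIX_BYTES
    bandwidth = peak_bandwidth_bytes_per_s(bus_width_bits, data_rate_gbps)
    return bandwidth / bytes_per_hash / 1e6


if __name__ == "__main__":
    # Assumed configurations: 256-bit bus, stock 8 Gbps and an overclocked 8.8 Gbps.
    for label, rate in (("8.0 Gbps", 8.0), ("8.8 Gbps (assumed OC)", 8.8)):
        print(f"256-bit @ {label}: ceiling {ethash_ceiling_mhs(256, rate):.1f} MH/s")
```

Under these assumptions the ceiling is a little over 31 MH/s at 8 Gbps regardless of which GPU sits behind the memory, which fits with two very different chips landing in the same 27-30 range once their memory is matched.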

    I wasn't aware that the 1070 does better in Linux.

    * please note I'm a complete layman :)


    edit: person getting 27-28 MH/s on a 1070 in Windows on the very latest drivers
    https://forums.anandtech.com/threads/ethereum-gpu-mining.2463816/page-81#post-38424395

    more discussion around that post.
     
    #45 itsmydamnation, Aug 20, 2016
    Last edited: Aug 20, 2016
  6. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    Well, compression will not be effective at all for compute tasks, so if bandwidth is the limitation there is no real way around it. As you stated, it all depends on the VRAM modules being used.
     
  7. firstminion

    Newcomer

    Joined:
    Aug 7, 2013
    Messages:
    217
    Likes Received:
    46
    Completely agree, man-hours are the most valuable resource.

    But a hammer will sound overrated to you if you got used to driving nails with a screwdriver. :-|
     
  8. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    337
    Likes Received:
    294
    At least in the case of Ethereum, I doubt it's the compression which scales so horribly.
    But rather that Ethereum, by design, prevents coalesced read and write transactions by accessing pseudo-random memory locations which don't fit into an on-chip cache. The GPU is constantly starved by memory latency, and AFAIK Pascal consumer GPUs still don't reach the same concurrency levels as the GCN equivalents (due to the smaller register files?), so there is nothing to hide the stall.

    I'm also suspecting that Maxwell and Pascal are actually rather weak in terms of raw, sustained memory write transactions per second, not just a "mismatch" between concurrency and available memory bandwidth. That is hidden perfectly well if coalescing works in the L2 cache or in the ROPs, but once you miss that, say goodbye to efficiency.
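To illustrate why pseudo-random addressing defeats coalescing, here is a toy model that counts how many memory segments one warp touches. The 128-byte segment size, the 32-lane warp, and the 4-byte word are assumptions for illustration, and the per-lane random addressing is a simplification rather than the actual Ethash access pattern.

```python
import random

SEGMENT_BYTES = 128  # assumed size of one memory transaction
WARP_SIZE = 32       # assumed number of lanes issuing together
WORD_BYTES = 4       # each lane reads one 32-bit word


def segments_touched(addresses, segment_bytes=SEGMENT_BYTES):
    """Number of distinct segments (and so transactions) needed for these addresses."""
    return len({addr // segment_bytes for addr in addresses})


def coalesced_addresses(base, lanes=WARP_SIZE):
    """Neighbouring lanes read neighbouring words starting at an aligned base."""
    return [base + lane * WORD_BYTES for lane in range(lanes)]


def scattered_addresses(buffer_bytes, lanes=WARP_SIZE, seed=0):
    """Each lane reads a pseudo-random word somewhere in a large buffer."""
    rng = random.Random(seed)
    words = buffer_bytes // WORD_BYTES
    return [rng.randrange(words) * WORD_BYTES for _ in range(lanes)]


if __name__ == "__main__":
    print("coalesced:", segments_touched(coalesced_addresses(0)))
    print("scattered:", segments_touched(scattered_addresses(1 << 30)))
```

In this model the coalesced warp is served by a single segment, while the scattered warp needs close to one segment per lane, so most of each fetched segment goes unused and latency has to be hidden by having more work in flight.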
     
    #48 Ext3h, Aug 22, 2016
    Last edited: Aug 22, 2016
    Razor1 likes this.
  9. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Are you sure about that? Sebbi has been saying that register file pressure is one of his complaints about GCN.

    Maybe it's different for later GCN versions, but the GCN white paper says 64KB per CU, where Maxwell has 256KB per SM. Now Maxwell has 4 sub-SMs, so that's really 64KB per sub-SM. But GCN has a warp that is 64 wide, while Maxwell's is 32 wide. So you'd think that Maxwell can have more warps in flight.

    Or am I looking at this the wrong way?
     
  10. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    GCN has 64KB of registers per SIMD, so 256KB per CU:
    4 SIMDs per CU * 64 threads per SIMD * 256 registers per thread per SIMD * 4 bytes per register = 256 KB per CU.
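The arithmetic above, and the "warps in flight" question it answers, can be sketched as follows. The function names are hypothetical, and the model deliberately ignores hardware caps on resident wavefronts/warps and register allocation granularity, so treat the occupancy numbers as rough upper bounds only.

```python
def register_file_bytes(simds, lanes_per_simd, regs_per_lane, bytes_per_reg=4):
    """Total vector register storage: SIMDs x lanes x registers x bytes."""
    return simds * lanes_per_simd * regs_per_lane * bytes_per_reg


def resident_groups(file_bytes, group_width, regs_per_thread, bytes_per_reg=4):
    """How many wavefronts/warps fit in a register file if each thread uses
    regs_per_thread registers (no caps, no allocation granularity)."""
    return file_bytes // (group_width * regs_per_thread * bytes_per_reg)


if __name__ == "__main__":
    gcn_cu = register_file_bytes(simds=4, lanes_per_simd=64, regs_per_lane=256)
    print("GCN registers per CU:", gcn_cu // 1024, "KB")

    per_unit = 64 * 1024  # 64 KB per GCN SIMD and per Maxwell sub-SM, as quoted in the thread
    regs = 64             # hypothetical kernel using 64 registers per thread
    print("GCN 64-wide groups per SIMD:", resident_groups(per_unit, 64, regs))
    print("Maxwell 32-wide groups per sub-SM:", resident_groups(per_unit, 32, regs))
```

With equal register bytes per scheduling unit, the 32-wide design fits twice as many groups, but each group is half as wide, so the number of threads in flight comes out the same in this simplified model.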
     
    Razor1 and Heinrich04 like this.
  11. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    I read figures in some of the other forums showing that under Windows the hashrate was between 24 and 27 MH/s, but using Linux the figure increases to the mid 40s, which is why Linux is recommended on Nvidia cards if mining Ethereum.
    The example you give should also affect Linux builds?

    Cheers
     
  12. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    Thanks!
     
  13. CSI PC

    Veteran Newcomer

    Joined:
    Sep 2, 2015
    Messages:
    2,050
    Likes Received:
    844
    Just to show how well Pascal 1070 does with other mining algorithms (not Ethereum) compared to AMD 480, this crypto mining site benchmarked both and the 1070 manages double the performance, which surprised them.
    http://www.cryptocoinupdates.com/performance-of-the-amd-radeon-rx-480-for-other-algorithms/

    They will be re-evaluating the Pascal 1070 with Ethereum and Linux at a future date as they also found the performance issue under Windows.
    Worth noting: while Nvidia has problems with Ethereum under Windows, Polaris (the 480), it seems, has problems with sgminer under Windows (they have not tested it under Linux yet).

    Cheers
     
    #53 CSI PC, Aug 22, 2016
    Last edited: Aug 22, 2016
  14. gongo

    Regular

    Joined:
    Jan 26, 2008
    Messages:
    582
    Likes Received:
    12
    So AMD's 'bigger' GPU, Vega, is 1H17, not even 1Q17; that is so disappointing....not even the big one with HBM.. :(
    Can we say they have lost this round already?

    Just what is wrong with their 14nm GCN3.0? The process or the architecture? Who wants to put in some speculation with Polaris as a clue..?
     
    homerdog likes this.
  15. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Actually, I read different reports, with articles saying October 2016 for Vega 10 and 2017 for Vega 11... H1 doesn't necessarily mean June; it could just mean sometime in the first half of 2017. (I need to watch the investor video; the slides are just there for background.)

    But the rumors are completely mixed, so it's hard to really know.
     
  16. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,056
    Likes Received:
    1,020
    I can see several obvious possibilities.
    One is yield. Lisa Su made a comment about 14nm yield issues regarding Polaris just over a week ago, so it may be that Vega needs to be respun, and that they can't fully trust the respins to yield sufficiently well to dare push the "launch" button yet.
    Another, more speculative, is that they are waiting for a more suitable process for high-power chips than 14nm LPP, which would perform better in the rather extreme high-power GPU segment. (I'd love to see how the 14nm HP process from IBM would perform.)
    And again even more speculatively, HBM2 may be yielding too poorly to allow large volumes at decent cost to be counted on, which would also make being a bit cautious in their roadmappery wise.

    We just don't know. We may never get to know.
     
  17. Anarchist4000

    Veteran Regular

    Joined:
    May 8, 2004
    Messages:
    1,439
    Likes Received:
    359
    It may be as simple as this came out during an investor meeting and revenue takes time. If they were expecting significant revenue 1Q17 they'd likely have to launch before Christmas. The slide I saw also specifically mentioned "enthusiast" which likely doesn't include both of the Vega chips. Speculating here, but if they are bonding dies together that puts HPC Zen and Vega11 in roughly the same timeframe. Seems reasonable the two could be related as Zen should feature leading graphics IP so some tech is likely shared. HBM2 is the other potential culprit because I'd have expected the Pascal Titan to be launched with it if readily available considering the price. Little reason not to add HBM2 and charge whatever you want on that market.
     
  18. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    For Titan, either they had already planned from the start to launch the consumer-grade part with GDDR5X, and so a different SKU, GP102, to have it out as fast as possible, or they changed plans in between, something AMD was not as fast to do. (This could also explain why GP102 has some little things that GP100 doesn't have.)
     
  19. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,172
    Location:
    La-la land
    Um, wasn't it obvious they've lost this round when they first announced Polaris as a mid-range GPU? The 480 is barely faster than the 390X, which in its original incarnation is nearly three years old by now, and there's no faster product anywhere on the horizon. If they had anything, AMD would have been hyping it to try and prevent gamers from buying NV high-end boards.
     
  20. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    nV is already charging whatever they want for GP102 with GDDR5X, so why would they need GP100 and HBM2? To cut margins, or to make a product that cuts into their professional card market and thus cuts down margins there too?

    nV's stance on not using HBM or HBM2 in enthusiast-level products tells a great deal about how much more expensive that memory is to use.

    It would have been cheaper for them to use GP100 and HBM2 because they wouldn't have needed to spend the money on the design of GP102, but margins must be quite different between HBM2 and GDDR5X for them to plan like this. nV even noted the reason why, in the past, they used professional chips in their top-end enthusiast cards; this is the first launch where they have branched away from that. (You also have to factor in the cost of the die: GP102 is a smaller die, but a GP102 with HBM would have been doable, and they still didn't do that either, so it comes down to the cost of HBM and its needs.)
     
    homerdog, Grall and spworley like this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.