AMD: Speculation, Rumors, and Discussion (Archive)

Discussion in 'Architecture and Products' started by iMacmatician, Mar 30, 2015.

Thread Status:
Not open for further replies.
  1. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    10,055
    Likes Received:
    1,557
    I doubt it, as the 7850 isn't in the mainline driver profile anymore. But the 290 is GCN, as is the RX.
     
  2. xEx

    xEx
    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    939
    Likes Received:
    398
    Well, the 7850 is GCN too, just the first iteration. I'm really thinking of getting the 480. I still play on a 7850, since I play at 1080p and don't mind tweaking settings to hit 60 or 30 FPS, but now I'm playing Paragon and my card just can't handle it even at 720p, all low, no shadows... The game isn't fully optimized yet, but I feel the need to be able to play at full settings, and the 480 seems able to do that (or a 970/980 or 380/390).
     
  3. eastmen

    Legend Subscriber

    Joined:
    Mar 17, 2008
    Messages:
    10,055
    Likes Received:
    1,557
    You'd see a decrease in performance, I'd wager. There are tests where they paired a Radeon 290X with the APU graphics and saw speedups, but I think the gain would be very little; it's just a better idea to jump completely to an RX.
     
  4. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Let's say that mixing generations, memory capacities and performance levels in CFX/SLI has never really given good results. Memory capacity is reduced to that of the lower-capacity card, and in CFX the faster GPU will always finish rendering its frame first, so the second one ends up nearly unused. It should work, but don't expect much from it in terms of performance.

    The only time I mixed different GPUs was the X1950XTX (GDDR4) with the X1900XTX (GDDR3), with excellent results, but both GPUs were overclocked to the same core speed, and the memory capacity was the same apart from its speed.

    Basically, if both GPUs have close performance (say a 290 and a 290X), with the same memory capacity of course, this works well.
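    The min-capacity and pacing behavior described above can be sketched as a toy model (Python; a back-of-the-envelope sketch assuming pure alternate-frame rendering with no driver-side frame-pacing tricks, and the numbers are invented, not measurements):

```python
# Rough model of AFR CrossFire with mismatched cards.
def cfx_afr(mem_a_gb, mem_b_gb, fps_a, fps_b):
    # Usable VRAM collapses to the smaller card's capacity,
    # because each GPU keeps its own full copy of the frame data.
    effective_mem = min(mem_a_gb, mem_b_gb)
    # Each GPU renders every other frame, so the sustained rate is
    # capped by the slower card: it must finish its frame before
    # its next turn comes around.
    sustained_fps = 2 * min(fps_a, fps_b)
    return effective_mem, sustained_fps

# A fast card paired with a much slower one: the pair barely beats
# the fast card running alone (100 fps vs 90 fps), and loses VRAM.
print(cfx_afr(4, 2, 90, 50))   # -> (2, 100)
```

    With well-matched cards the model gives the familiar near-2x scaling, which is why same-tier pairs (290 + 290X) were the only setups that made sense.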
     
  5. I.S.T.

    Veteran

    Joined:
    Feb 21, 2004
    Messages:
    3,174
    Likes Received:
    389
    So, has a review date for these cards been announced?
     
  6. Pressure

    Veteran Regular

    Joined:
    Mar 30, 2004
    Messages:
    1,389
    Likes Received:
    320
    I believe the embargo lifts on the 29th, no?
     
  7. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,183
    Location:
    Germany
    #2367 CarstenS, Jun 8, 2016
    Last edited: Jun 8, 2016
  8. xEx

    xEx
    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    939
    Likes Received:
    398
    Yes, the NDA ends on the 29th.

    Well, talking about CF, I was referring to the new DX12 functionality, explicit multi-GPU. That way I could keep the 7850 doing post-processing or adding effects while the new card takes on the hard work. But I don't know if it will work on non-DX12 GPUs.
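    As a rough sketch of why that kind of split could pay off, here is a toy timing model (Python; all numbers are hypothetical, and it assumes the two passes can be pipelined across frames, something real explicit multi-adapter code would have to arrange itself):

```python
# Toy timing model of DX12 explicit multi-adapter offload.
def frame_time_single(main_ms, post_ms):
    # One GPU does everything in sequence.
    return main_ms + post_ms

def frame_time_split(main_ms, post_ms_on_old, copy_ms):
    # Strong GPU renders frame N while the old card post-processes
    # frame N-1; whichever stage is longer paces the pipeline, and
    # the copy across PCIe is charged to the secondary GPU's stage.
    return max(main_ms, post_ms_on_old + copy_ms)

# Hypothetical numbers: 12 ms main pass, 3 ms post on the new card,
# 6 ms post on the 7850 plus 2 ms to copy the frame across.
single = frame_time_single(12, 3)    # 15 ms per frame
split = frame_time_split(12, 6, 2)   # 12 ms: the 7850's work is hidden
```

    The model also shows the failure mode: if the old card's post pass plus the copy exceeds the main pass, the weak GPU becomes the bottleneck and the split makes things worse.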

    Sent from my HTC One using Tapatalk
     
  9. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,183
    Location:
    Germany
    You will need title-by-title developer support for this. And yeah, both GPUs have to understand and speak the DX12 API.
     
  10. xEx

    xEx
    Regular Newcomer

    Joined:
    Feb 2, 2012
    Messages:
    939
    Likes Received:
    398
    So, time to sell my VGA for a ridiculously low price :( but I guess that's life ;)
     
    Lightman likes this.
  11. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    690
    Likes Received:
    425
    Location:
    Slovenia
    ...Which is a superb example of people not getting what DX12 actually is and what it entails (not that it has been any different with DX11, DX9, DX8, ...).
    The 7850 can work just fine with the DX12 API. It just doesn't support all the new features.
     
  12. TomRL

    Newcomer

    Joined:
    Apr 19, 2014
    Messages:
    183
    Likes Received:
    4
    I still don't see how this contradicts what I said. I never said that Nvidia absolutely doesn't support async; we're talking about which one is better at it. AMD GPUs have ACEs, which are apparently important for async, but that's all I know.
     
  13. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,183
    Location:
    Germany
    So, from the second quote posting the relevant part with a bit of bolding on my part:
    Let me try and rephrase it: With AC, you get a higher percentage of your theoretical TFLOPS throughput in, say, games' frames per second. Given that card, driver, API and application use this feature. What could be better than that, right?

    Maybe, if you got this higher percentage of your TFLOPS regardless of API and application?

    Hence the analogy to hyperthreading. It's good that it's there, but it can only exploit bubbles in the execution pipeline. If there weren't any bubbles, hyperthreading would yield nothing at best. And it has the requirement that there's enough multithreading going on. In a single-thread application, hyperthreading couldn't do squat, regardless of how many bubbles of emptiness are in the pipeline.
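    The hyperthreading analogy can be made concrete with a tiny idealized model (Python; an oversimplified 2-way SMT sketch with invented numbers, purely to illustrate the bubble-filling point):

```python
# Idealized pipeline: one op issues per cycle, but a "bubble" (stall)
# wastes the slot unless a second ready thread can fill it.
def cycles(ops_per_thread, bubble_every, threads):
    if threads == 1:
        # Single thread: every stall is a wasted cycle.
        stalls = ops_per_thread // bubble_every
        return ops_per_thread + stalls
    # Two threads (best case): one thread's bubbles are filled with
    # the other thread's ready ops, so no issue slots are wasted.
    return ops_per_thread * threads

one_thread = cycles(100, 5, 1)       # 120 cycles for 100 ops (1.2/op)
two_threads = cycles(100, 5, 2)      # 200 cycles for 200 ops (1.0/op)
no_bubbles = cycles(100, 10**9, 1)   # 100 cycles: nothing for SMT to fill
```

    With bubbles and enough threads, throughput per cycle goes up; with no bubbles, or with only a single thread, the mechanism has nothing to work with, which is exactly the analogy to async compute above.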
     
    #2373 CarstenS, Jun 8, 2016
    Last edited: Jun 8, 2016
    DavidGraham likes this.
  14. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    You can easily turn this argument around. There is also some inherent (hardware) "overhead" in building an architecture that provides high performance with fewer threads. And as SMT/hyperthreading generally provides a performance uplift for throughput tasks, it is basically always preferable from a performance perspective, even on a design A that delivers higher performance with a low number of threads than another design B does. The two are basically independent. Whether the integration is worth the effort on design A or design B is another question, and it is basically the same kind of question as whether the effort of changing design B so that it gets design A's low-thread-count performance characteristics is worth it ;).
     
    CarstenS likes this.
  15. TomRL

    Newcomer

    Joined:
    Apr 19, 2014
    Messages:
    183
    Likes Received:
    4
    So you're saying async is like a fix for already existing problems, the 'bubbles', that Nvidia doesn't have. But if that's the case, why would Nvidia GPUs perform worse with async on than with it off?
     
  16. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    The "bubbles" aren't the same "bubbles" as on AMD GPUs. ACEs don't mean much to async compute; they might be important to AMD hardware, but Intel's and nV's architectures don't need them, as they use different command processors to do the same things as the various command units in GCN.
     
  17. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    There are occasional pipeline bubbles on nV GPUs, too. The question is how common they are, and how much the GPUs could profit from running compute shaders simultaneously on the same SMs to fill some of them. They would benefit from this capability, I'm sure of it. Maybe not as much as Radeons, but they would benefit too.
    The command processors don't have much to do with the pipeline bubbles within the SMs. There is untapped performance potential in nV GPUs here.
     
    pharma and Razor1 like this.
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,297
    Likes Received:
    3,629
    Location:
    Well within 3d
    I think there is room for a lot of nuance here.
    There is a debate about a more workload- or algorithm-level form of overhead, and then there is hardware overhead. The hardware is, to a significant extent, engineered to make this a "so?" kind of overhead, where it is only permitted so many nanoseconds, so many mm2, or so much energy out of a budget that scales somewhat fitfully.

    The "overhead" for high performance in one thread is a known area of seriously diminishing returns.
    However, the debate between "fewer" and "more" threads is much less clear cut in the middle, since these GPUs are heavily SMT through much of their processing. There are very clear downsides if you wander into "too many": caches, interconnects, memory controllers and control logic can thrash or congest in ways where serialization turns out to be preferable, and the parallel resources' contribution to hardware footprint hits diminishing returns (if the front end is highly parallel, there is generally a back end broad enough to support it).
    The SMT analogy taken from a CPU context breaks down because so much of this is so vastly parallel in the back end, while the front end pipelines so deeply that internally it turns into a question of how well two different resource types are load-balanced, and how much of the overall scalar component each one contributes if you try to apply Amdahl's law to parallel systems that, aside from specific points, are very close in overall parallelism.

    Some of the tweets and back-and-forth over AOTS seem to indicate that there is an additional device check, besides the async flag, that is effectively disabling it for Nvidia anyway.
    The game is also rather variable, although the minor loss seems to be mostly consistent for whatever reason.
    At least some of the 1080 results actually made it a wash or vacillated with tiny losses or gains, and one answer about this indicated that Pascal actually might have architectural quirks that benefit from some of the changes in the game's behavior with async on despite the device check.
    However, the messaging on this has been inconsistent.
    If there were a demerit for AOTS as a DX12 benchmark, or as an experimental tool in my opinion, it's this sort of potential non-orthogonality and inability to really control for the factors the knobs are labeled for. Some of the confused discussion about it also makes me uncertain how the different paths are structured, and whether they are comparable. Having a flag for async that can be overridden by the software is one thing, but that it might be overridden imperfectly makes the innards look a bit "leaky" for drawing conclusions, particularly if we find out that different vendors/chips behave unexpectedly in different ways (and they have).
     
    DavidGraham likes this.
  19. TomRL

    Newcomer

    Joined:
    Apr 19, 2014
    Messages:
    183
    Likes Received:
    4
    Sorry to keep harping on, but you're all saying I should stop fretting about async when looking at the 1070 or 1080? Because if that's the case, it's hard not to get the Nvidia card, since they generally have way better past-proofing with DX11 and they won't be behind on future-proofing with DX12.
     
  20. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    With the limited number of DX12 games out so far, and what we know about those games and who put more effort into dev support, we can't really get a good picture of what Pascal can do with async compute yet.
     
    #2380 Razor1, Jun 8, 2016
    Last edited: Jun 8, 2016

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.