DX12 Performance Discussion And Analysis Thread

Discussion in 'Rendering Technology and APIs' started by A1xLLcqAgt0qc2RyMz0y, Jul 29, 2015.

  1. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,743
    Likes Received:
    2,587
    Location:
    Finland
    I'm quite sure, without double-checking, that according to AMD the command processor can handle graphics and compute queues, while the ACEs can independently handle compute queues at the same time.
     
  2. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    The command processor is the one that assigns what the ACEs and graphics command processors do.
     
  3. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,743
    Likes Received:
    2,587
    Location:
    Finland
    At the "block diagram" level, AMD has only ACEs and a GCP, not a separate "command processor".
     
  4. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    That is from AMD's GCN whitepaper.

    Pretty much all three processors must work in unison to achieve asynchronicity.
     
    pharma likes this.
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,297
    Likes Received:
    3,629
    Location:
    Well within 3d
    Technically, it might be argued that if they were doing things in unison, it might not be asynchronous.

    To the broader debate over whether ACEs make it possible: asynchronous from the standpoint of the software doesn't need independent processors any more than asynchronous functionality did back when there was only one CPU core in a system. A processor can be made to juggle multiple queues if need be, and this can actually happen in the case of runlist execution with HWS and virtualization handling.

    That appears to have been AMD's choice in this case, but the presence of multiple other vendors with DX12 support shows it wasn't the only one.
    The ACEs are rather over-engineered for the purpose DX12 happens to use them for.
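
    As a rough illustration of the point about one processor juggling several queues, here is a minimal sketch in Python. It is a toy model, not a description of how any GPU front end or the HWS works: the queue names, the pass names and the round-robin policy are all assumptions made up for the example.

```python
from collections import deque

def run_round_robin(queues):
    """Drain several named work queues with ONE executor, one item per turn."""
    # Hypothetical model: each queue is just an ordered list of work items.
    pending = {name: deque(items) for name, items in queues.items()}
    order = []
    while any(pending.values()):          # an empty deque is falsy
        for name, q in pending.items():
            if q:
                order.append((name, q.popleft()))
    return order

# Queue and pass names are invented for illustration.
trace = run_round_robin({
    "graphics": ["shadow", "gbuffer", "lighting"],
    "compute": ["culling", "particles"],
})
print(trace)
```

    From the software side both queues make progress independently of each other, even though only a single executor exists in this model.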
     
    vLaDv, pharma, Ext3h and 1 other person like this.
  6. BRiT

    BRiT Verified (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    14,216
    Likes Received:
    11,782
    Location:
    Cleveland
    Lazy Devs.
    Lazy Hardware Devs.
    Lazy Devs.

    Got it.
     
    vLaDv likes this.
  7. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    Well, that wasn't what I meant. I actually meant the opposite: they are not underutilized by any more than 10% if things are done right, which they seem to be, and I haven't seen anything that would show that the hardware is crap, the drivers are crap, and the software is crap.
     
  8. AnomalousEntity

    Newcomer

    Joined:
    Jun 6, 2016
    Messages:
    38
    Likes Received:
    25
    Location:
    Silicon Valley
    All games run drawcalls which cannot use 100% of your HW units - you will be limited by either shaders, memory bandwidth, geometry processing or some obscure FF unit like blending. No drawcall will utilize 100% of all your GPU units as each has different workloads. For example postprocessing shaders are mostly bandwidth limited which means your compute units could do more while waiting for memory to be fetched.

    Devs have gotten as much as a 6-7 ms perf improvement using Async Compute, which is HUGE! And no, AMD didn't do any marketing dump on async - those are real gains achieved in real games. It paves the way for doing more on the GPU, like culling, particles, etc.
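
    A back-of-the-envelope sketch of the bottleneck argument above, in Python. All numbers are invented for illustration and do not come from any real game or GPU. The model assumes each pass keeps each unit busy for a fixed number of milliseconds and that two overlapped passes share the units perfectly, which real hardware will not achieve.

```python
def pass_time(busy_ms):
    """Time of a pass run alone: the busiest unit sets the pace."""
    return max(busy_ms.values())

def overlapped_time(a, b):
    """Idealized time when two passes run together: per-unit demand adds up."""
    units = set(a) | set(b)
    return max(a.get(u, 0.0) + b.get(u, 0.0) for u in units)

# Hypothetical per-unit busy times in milliseconds.
postprocess = {"alu": 1.0, "bandwidth": 4.0}   # bandwidth-limited
particles = {"alu": 3.0, "bandwidth": 1.0}     # shader-limited

serial = pass_time(postprocess) + pass_time(particles)
overlapped = overlapped_time(postprocess, particles)
print(serial, overlapped, serial - overlapped)
```

    In this toy case the two passes take 7 ms back to back and 5 ms when overlapped, because the ALUs do the particle work while the post-process pass waits on memory.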
     
  9. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    365
    Likes Received:
    319
    Please don't use the term "asynchronous" when what you really mean is simultaneous/parallel execution.

    Asynchronicity is not an attribute of the hardware. It's an attribute of the API.
    It only means that the order of execution is not defined implicitly by the invocation pattern, but instead modeled explicitly by the use of signals and fences.

    This can be implemented either using cooperative scheduling on fences, or with a sufficient number of monitors in hardware when opting for simultaneous execution or low-latency scheduling.
    In the first case, the hardware does not need any support for that at all.
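
    To illustrate ordering modeled explicitly by signals and fences, and the cooperative-scheduling case, here is a small Python sketch. It is not the DX12 API: the command tuples, fence names and the scheduler are hypothetical, and a single executor only switches queues when a wait is not yet satisfied.

```python
def run_cooperative(queues):
    """Run command queues on one executor, switching only at unsatisfied waits."""
    fences = {}                              # fence name -> last signaled value
    cursor = {name: 0 for name in queues}    # next command index per queue
    log = []
    progressed = True
    while progressed:
        progressed = False
        for name, cmds in queues.items():
            while cursor[name] < len(cmds):
                op, *args = cmds[cursor[name]]
                if op == "wait":
                    fence, value = args
                    if fences.get(fence, 0) < value:
                        break                # blocked: yield to the next queue
                elif op == "signal":
                    fence, value = args
                    fences[fence] = value
                else:                        # "run": record the executed work
                    log.append((name, args[0]))
                cursor[name] += 1
                progressed = True
    stuck = [n for n in queues if cursor[n] < len(queues[n])]
    if stuck:
        raise RuntimeError(f"deadlock: {stuck} wait on fences never signaled")
    return log

# Pass and fence names are made up for the example.
log = run_cooperative({
    "graphics": [("run", "gbuffer"), ("signal", "F", 1), ("run", "shadows"),
                 ("wait", "G", 1), ("run", "composite")],
    "compute": [("wait", "F", 1), ("run", "ssao"), ("signal", "G", 1)],
})
print(log)
```

    The only ordering guarantees are the ones expressed through fences F and G; nothing in the model requires the two queues to execute at the same time.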

    Even GCN "degrades" to cooperative scheduling if you exceed the number of monitored queues, even though I have yet to see a legit real-life example where that actually happened...
    Doing things "right" isn't easy. At least not if you define "right" as achieving a constant utilization of all possible bottlenecks, while also keeping the working set below cache sizes and the like.

    This has been said a couple of times in this and other threads. The current design of the render paths is still a straightforward evolution from the old fixed-function setup, where you would treat the rendering process as a set of operations, each applied sequentially to the whole frame. We are yet to see a widespread move over to tile-based renderers, and a departure from the use of overly expensive full-screen-space effects.

    As it stands, you just can't achieve an even/constant load on all subsystems of the GPU.
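
    For a sense of scale on the working-set point, here is a small arithmetic sketch in Python. The resolution, the 8-byte-per-pixel format, the 64x64 tile size and the 2 MiB cache size are all assumptions chosen for illustration, not figures for any particular GPU.

```python
def working_set_bytes(width, height, bytes_per_pixel, targets=1):
    """Bytes touched when a pass reads or writes `targets` full surfaces."""
    return width * height * bytes_per_pixel * targets

ASSUMED_CACHE_BYTES = 2 * 1024 * 1024   # hypothetical cache size

full_frame = working_set_bytes(1920, 1080, 8)   # whole 1080p surface
one_tile = working_set_bytes(64, 64, 8)         # a single 64x64 tile

# Ceiling division gives the number of tiles needed to cover the frame.
tiles = -(-1920 // 64) * -(-1080 // 64)

print(full_frame, full_frame > ASSUMED_CACHE_BYTES)   # 16588800 True
print(one_tile, one_tile < ASSUMED_CACHE_BYTES)       # 32768 True
print(tiles)                                          # 510
```

    Under these assumptions a full-screen pass streams roughly 16 MB per surface through the cache, while a per-tile pass works on 32 KiB at a time.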
     
    vLaDv, Kej, Silent_Buddha and 3 others like this.
  10. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    What do draw calls have to do with async compute? I thought we were talking about async compute only.

    6-7 ms on which systems? Without that context, that saving is meaningless.
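
    A quick Python sketch of why the context matters. The 6.5 ms figure is just the midpoint of the quoted range, and the 30 and 60 fps targets are assumptions, since the target frame rate is not stated here.

```python
def frame_budget_ms(fps):
    """Time available per frame at a given frame rate."""
    return 1000.0 / fps

def saving_share(saved_ms, fps):
    """Fraction of the frame budget that a saving of `saved_ms` represents."""
    return saved_ms / frame_budget_ms(fps)

for fps in (30, 60):
    print(f"{fps} fps: budget {frame_budget_ms(fps):.1f} ms, "
          f"6.5 ms is {saving_share(6.5, fps):.1%} of it")
```

    The same absolute saving is about a fifth of a 30 fps frame but well over a third of a 60 fps frame, so the number means little without the target and the hardware it was measured on.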
     
  11. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    LOL sorry, just going along the lines of the discussion, but yeah.

    I completely agree about it being an attribute of the API, but of course the hardware is made based on the API.

    Yeah, you won't get constant utilization with today's hardware, nor will it really ever happen, even in tile-based rendering, but it should be considerably better.
     
  12. AnomalousEntity

    Newcomer

    Joined:
    Jun 6, 2016
    Messages:
    38
    Likes Received:
    25
    Location:
    Silicon Valley
    Umm, you can only send work to the GPU in drawcalls - even compute shaders are converted to drawcalls at ISA level. Async compute/shaders is just a scheduling optimization to run them asynchronously and better utilize GPU units.

    Regarding the 6-7 ms, the devs of The Tomorrow Children claim this: https://twitter.com/selfresonating/status/738470011065372672
     
  13. MistaPi

    Regular

    Joined:
    Jun 12, 2002
    Messages:
    362
    Likes Received:
    7
    Location:
    Norway
    Do you guys think DX12 will be a bigger benefit for AMD in general because of Xbox One? DX12 opens up closer-to-hardware coding, and since game developers already develop for GCN on XBO, will the code fit GCN on PC better than Nvidia architectures? I'm thinking the developers have a big enough incentive (Nvidia market share) and enough support from Nvidia to make optimized code for Nvidia GPUs that there won't be any big difference, except for bad console ports.
     
  14. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,743
    Likes Received:
    2,587
    Location:
    Finland
    It's not only that, but also the fact that all of GCN simply couldn't be utilized under DX11, and that AMD's DX11 drivers don't support multithreading properly, making the cards more CPU-dependent.
     
  15. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY

    Right, but now with DX12 you don't need to worry as much about draw call counts anyway ;)

    And again, we don't know anything about the system/systems they are talking about.
     
  16. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    Doing ports is one thing, but doing bad ports is another. Will devs that choose to do a PC port stick with no optimizations for nV hardware, when nV has so much market share they can't be ignored? I don't buy the whole idea that console wins will drive PC market share through games optimized for said hardware. It never worked in the past, and it won't really work now. You might see slight fluctuations but nothing major. nV didn't build their market share based on consoles, did they?
     
  17. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    This doesn't sound right at all. What does the ISA have to do with anything? If anything, it has to do with the command processor.
     
  18. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    I think he was repeating what Oxide said about what they call draw calls.
     
  19. AnomalousEntity

    Newcomer

    Joined:
    Jun 6, 2016
    Messages:
    38
    Likes Received:
    25
    Location:
    Silicon Valley
    The command buffer generated by the driver is just a bunch of GPU-consumable commands, which include state changes and GPU ISA (not OpenGL calls). This is read by the command processor to tell what each unit will be doing. At that level it's just a bunch of instructions for the GPU, and there is no difference between a graphics drawcall and a compute shader.
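
    A toy Python model of the description above. The packet types here (SetState, Launch) and the processing loop are hypothetical and are not the real packet format of any GPU or driver; they only show a front end walking a flat list of state changes and launches and treating draws and dispatches the same way.

```python
from dataclasses import dataclass

@dataclass
class SetState:
    """Hypothetical state-change packet."""
    name: str
    value: object

@dataclass
class Launch:
    """Hypothetical work-launch packet."""
    kind: str     # "draw" or "dispatch"
    shader: str   # stands in for a pointer to compiled GPU ISA
    count: int    # vertices or thread groups

def process(command_buffer):
    """Walk the buffer in order, applying state and collecting launches."""
    state = {}
    launches = []
    for packet in command_buffer:
        if isinstance(packet, SetState):
            state[packet.name] = packet.value
        else:
            # Draws and dispatches take the same path in this model.
            launches.append((packet.kind, packet.shader, packet.count, dict(state)))
    return launches

launches = process([
    SetState("blend", "alpha"),
    Launch("draw", "mesh_vs_ps", 36),
    SetState("blend", "off"),
    Launch("dispatch", "particles_cs", 64),
])
print(launches)
```

    Each launch is recorded together with a snapshot of the state that was current when it was issued, which is all the "command processor" in this sketch needs to know.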
     
  20. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    Well compute shaders can issue additional draw calls, but I don't think it happens all the time. You can't draw anything from compute shaders without a draw call being issued by the pixel or vertex shader prior.
     