Yup. I think this is something that DX12 does handle for you; hence the effort to place it into the API and call it a feature.
Maybe I'm wrong, but the impression I got from the earlier comment was a handled structure, although that wouldn't really be the memory system as I called it but, as Iroboto points out, the API. If the memory system manages this automagically, it's the first I've read of it.
Or you need more draw calls if you want individuality in things like exploding walls and debris, all acting on their own, each piece affected by its own physics, everywhere.

Which would require more processing power and potentially resources.
Or if you want tons of different objects, each with their own material and shaders.

Which would require more textures and resources.
Or if you want to just ramp up performance faster, like being able to fill a command queue quickly if it gets empty.

Yep.
On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.
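Just to illustrate the difference (a toy sketch only - this is not GNM, not DX, and the opcode, packet layout and helper names here are completely made up): on a thin path a draw is a handful of DWORD stores into GPU-visible memory, while a thick API pays for validation and bookkeeping on every call before any DWORD reaches the buffer.

#include <cstdint>

// Toy command buffer: appending a command is just a few 32-bit stores.
// (Illustrative only - real opcodes, packet sizes and buffer management are hardware-specific.)
struct CommandBuffer {
    uint32_t* cur;                                  // write pointer into GPU-visible memory
    void emit(uint32_t dw) { *cur++ = dw; }         // assumes space has already been reserved
};

constexpr uint32_t OP_DRAW_INDEXED = 0x2D;          // made-up opcode

// "Thin" path: the whole draw call amounts to ~4 DWORDs written to memory.
void drawIndexedThin(CommandBuffer& cb, uint32_t indexCount, uint32_t firstIndex)
{
    cb.emit(OP_DRAW_INDEXED);
    cb.emit(indexCount);
    cb.emit(firstIndex);
    cb.emit(0);                                     // flags / padding
}

// "Thick" path: the same draw, but each call first goes through per-call
// validation, hazard tracking and residency bookkeeping in the runtime/driver.
void drawIndexedThick(CommandBuffer& cb, uint32_t indexCount, uint32_t firstIndex)
{
    // validatePipelineState();                     // state deltas, error checking (hypothetical)
    // trackResourceResidency();                    // paging / residency bookkeeping (hypothetical)
    // patchDescriptors();                          // handle -> GPU address translation (hypothetical)
    drawIndexedThin(cb, indexCount, firstIndex);
}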
Basically, in all the years of developers talking about bottlenecks, of GPU power limits and triangle setup limits and CPU limits and memory limits, and them eking out every ounce of usability from the finite RAM and bandwidth, draw calls have never been an issue. The lack of variety in games has never been because there aren't enough draw calls to go around, but because the RAM is full already or the cost of creating assets is prohibitive. Eliminating the draw-call cap is going to be a plus for devs as it makes their lives easier, but it surely can't be a world-changing event. Especially coupled with async compute for maximal utilisation of GPU resources.
Of course draw calls have been an issue. That's why new APIs always try to address the issue. You can find comments about DX10 and DX11 reducing the draw call overhead. DX12 is like a step function here. Things like instancing came about at least partially to reduce draw overhead.
On console?
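On the instancing point above, here's a rough D3D11-style sketch of why it helps (state and buffer setup omitted; ctx, indexCount and kNumRocks are hypothetical placeholders): one call with an instance count replaces thousands of individual draws, so the per-call CPU cost is paid once.

#include <d3d11.h>

// Naive path: one draw call (plus per-object constant updates) per rock.
void DrawRocksNaive(ID3D11DeviceContext* ctx, UINT indexCount, UINT kNumRocks)
{
    for (UINT i = 0; i < kNumRocks; ++i)
    {
        // UpdatePerObjectConstants(ctx, i);        // app-defined per-draw constant update (hypothetical)
        ctx->DrawIndexed(indexCount, 0, 0);         // CPU/API overhead paid kNumRocks times
    }
}

// Instanced path: per-instance transforms come from a second vertex buffer
// (or a structured buffer), so the CPU issues a single call.
void DrawRocksInstanced(ID3D11DeviceContext* ctx, UINT indexCount, UINT kNumRocks)
{
    ctx->DrawIndexedInstanced(indexCount, kNumRocks, 0, 0, 0);
}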
I would expect a software API (for DirectX/OpenGL libraries) to schedule and queue calls like this but I've not read anything to suggest that GPU hardware does. What GNM has is anybody's guess - we only have "most GPU commands are just a few DWORDs written into the command buffer". Most. I must admit I assumed "command buffer" was a reference to the GPU itself but it could be the API.
Interesting. I'd like to see more details on implementation. How are shaders assigned to queues, and at what granularity? Do you assign priorities to queues, or are the queues for each ACE of a fixed priority (e.g. 0 lowest, 7 highest)?

From what I understand so far, you are responsible for the CPU side of multithreading, so you are responsible in this case for the assigning of queues. DX12 will be responsible for the management of it.
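For what it's worth, this is roughly what the CPU side looks like in D3D12: the application creates the queues itself and picks a type and a priority per queue; how those map onto the hardware ACEs is up to the driver. A minimal sketch (error handling omitted, and the priority choice is just an example):

#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create one graphics (direct) queue and one compute queue with a higher priority.
// 'device' is assumed to be a valid ID3D12Device*.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& gfxQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type     = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
    gfxDesc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

    D3D12_COMMAND_QUEUE_DESC compDesc = {};
    compDesc.Type     = D3D12_COMMAND_LIST_TYPE_COMPUTE;    // compute + copy only
    compDesc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_HIGH;  // chosen by the app, not fixed per ACE
    device->CreateCommandQueue(&compDesc, IID_PPV_ARGS(&computeQueue));
}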
Also up at AnandTech, but @Ryan Smith seems to have made some mistakes, namely the number of queues on GCN (which should be 64 for GCN 1.1/1.2 GPUs with 8 ACEs, not 8 - it's also not clear whether the compute queues actually eat any graphics queues on AMD, as AnandTech's article claims).

This is a cross post from @Kaotik 's post in the other thread - except this accompanying write-up is in English and not in Chinese:
Asynchronous Shaders in DX12
http://www.tomshardware.com/news/amd-dx12-asynchronous-shaders-gcn,28844.html
Today must have been embargo day. Interesting.
http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading
Why Asynchronous Shading Wasn’t Accessible Before
AMD has offered multiple Asynchronous Compute Engines (ACEs) since the very first GCN part in 2011, the Tahiti-powered Radeon HD 7970. However prior to now the technical focus on the ACEs was for pure compute workloads, which true to their name allow GCN GPUs to execute compute tasks from multiple queues. It wasn’t until very recently that the ACEs became important for graphical (or rather mixed graphics + compute) workloads.
Why? Well the short answer is that in another stake in the heart of DirectX 11, DirectX 11 wasn’t well suited for asynchronous shading. The same heavily abstracted, driver & OS controlled rendering path that gave DX11 its relatively high CPU overhead and poor multi-core command buffer submission also enforced very stringent processing requirements. DX11 was a serial API through and through, both for command buffer execution and as it turned out shader execution.
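To sketch what mixed graphics + compute looks like from the application side in D3D12 (illustrative only; command-list recording is omitted and this fence scheme is just one way to synchronize): work submitted on a separate compute queue is free to overlap the graphics queue, which is exactly what DX11's serial model never exposed.

#include <d3d12.h>

// Queues, lists and fence are assumed to have been created earlier; the command lists
// (computeList, gfxList, dependentList) are already recorded and closed.
void SubmitFrame(ID3D12CommandQueue* gfxQueue,
                 ID3D12CommandQueue* computeQueue,
                 ID3D12CommandList*  computeList,
                 ID3D12CommandList*  gfxList,
                 ID3D12CommandList*  dependentList,
                 ID3D12Fence*        fence,
                 UINT64&             fenceValue)
{
    // Async compute (e.g. particle sim, light culling) goes to its own queue.
    ID3D12CommandList* comp[] = { computeList };
    computeQueue->ExecuteCommandLists(1, comp);
    computeQueue->Signal(fence, ++fenceValue);     // fence marks "compute done"

    // Graphics work that doesn't depend on the compute results runs in parallel.
    ID3D12CommandList* gfx[] = { gfxList };
    gfxQueue->ExecuteCommandLists(1, gfx);

    // Only the work that consumes the compute results waits on the fence -
    // a GPU-side wait; the CPU never blocks here.
    gfxQueue->Wait(fence, fenceValue);
    ID3D12CommandList* dep[] = { dependentList };
    gfxQueue->ExecuteCommandLists(1, dep);
}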
Yeah.
You are referring to the table with graphics + compute queues, right?
260-series should be 1 gfx + 7 or 8 compute (depending if the gfx actually eats 1 slot or not).

AMD GCN 1.2 (285): 1 Graphics + 7 Compute / 8 Compute
AMD GCN 1.1 (290 Series): 1 Graphics + 7 Compute / 8 Compute
AMD GCN 1.1 (260 Series): 1 Graphics + 1 Compute / 2 Compute
AMD GCN 1.0 (7000/200 Series): 1 Graphics + 1 Compute / 2 Compute
Anandtech:
lol ouch, but we all knew this to be true.
This is starting to make a lot of sense. I think this is good evidence that upon release GNM was already closer to full DX12 functionality.
It would explain why I:SS reportedly runs hotter than other games do.
Well, I'm a guy that likes to fail fast: it looks like the PS4 has really had both a major API and a hardware advantage. I'm surprised Drive Club and Order1886 didn't leverage this - they may have, though; it's just not listed on the slide.
I think the weather uses compute in Drive Club.

Using compute for something and using async shaders are two different things. I doubt you could find a(n AAA) game these days that doesn't use compute.
The PS4 version of Battlefield 4 runs at 900p and Xbox One at 720p... you would think there would be even more of a gap though, wouldn't you?

Unfortunately, no one knows the extent as to how much was actually done on any of those titles. And once again, optimization time must have been fairly limited back then: five different platforms etc., trying to make it for launch date.

900p is 56% more pixels than 720p. That's more than the theoretical difference in GPU power.
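For reference: 1600×900 = 1,440,000 pixels versus 1280×720 = 921,600 pixels, i.e. about 56% more, while the commonly cited theoretical compute gap (roughly 1.84 TFLOPS vs 1.31 TFLOPS) is only around 40%.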
I'm going to wait to see later releases of Frostbite (by DICE) to compare how far that engine has come.