DirectX 12: its future in the console gaming space (specifically the XB1)

Is someone going to explain what UAVs are?

GPU resources, like buffers and textures, can be accessed in an unordered way (from any position, in any order, etc.) for reading and writing by multiple threads at once.

A "View" lets you access these "unordered resources" in " general purpose memory formats so that they can be shared by multiple pipeline stages" ..


There are other GPU resources that have views that let you access them, e.g.:

Shader Resource (SR) --> Shader Resource View (SRV)
Render Target (RT) --> Render Target View (RTV)
Constant Buffer (CB) --> Constant Buffer View (CBV)
Unordered Access Resource (UA) --> Unordered Access View (UAV)
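
To make that concrete, here's a minimal sketch of creating a UAV over a structured buffer with the public D3D12 API. The device, buffer and descriptor heap objects are assumed to already exist, and the function name is just for illustration:

Code:
// Minimal sketch: creating an unordered access view (UAV) over a structured
// buffer in D3D12 so a shader can read/write it at arbitrary positions.
// `device`, `buffer` (created with D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS)
// and `heap` (a CBV/SRV/UAV descriptor heap) are assumed to already exist.
#include <d3d12.h>

void CreateBufferUav(ID3D12Device* device,
                     ID3D12Resource* buffer,
                     ID3D12DescriptorHeap* heap,
                     UINT elementCount,
                     UINT elementStride)
{
    D3D12_UNORDERED_ACCESS_VIEW_DESC desc = {};
    desc.Format        = DXGI_FORMAT_UNKNOWN;            // structured buffer: no typed format
    desc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER;
    desc.Buffer.FirstElement        = 0;
    desc.Buffer.NumElements         = elementCount;
    desc.Buffer.StructureByteStride = elementStride;

    // The view is just a descriptor telling the GPU how to interpret the
    // resource; it doesn't copy or own the underlying memory.
    device->CreateUnorderedAccessView(buffer, nullptr, &desc,
                                      heap->GetCPUDescriptorHandleForHeapStart());
}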
 
You could start by reading Digital Foundry's interview with 4A Games after they finished Metro Redux.

Always love those interviews with that guy.
Will be interesting to see how much improvement the Xb1 will gain and if it leapfrogs PS4.
What will/can Sony do with GNM in the future? Is it already feature-complete compared to DX12, or do they need to add to it, etc.?
Fun times coming on these boards.
Is it just me, or is there always more info/discussion on the XB API? It feels that way thinking back on last gen as well.
 
Will be interesting to see how much improvement the Xb1 will gain and if it leapfrogs PS4.
What will/can Sony do with GNM in the future? Is it already feature-complete compared to DX12, or do they need to add to it, etc.?
If it's already feature-complete, XB1 cannot possibly leapfrog PS4. You can't make a >40% gain in hardware performance from an API change that isn't unique to your platform. If it's not already feature comparable, Sony will bring it in. And the 40% GPU advantage will still be there. If anything, the changes in graphics APIs favour PS4's advancement because it's more CPU-bound than XB1, and alleviating that CPU pressure will free up a larger percentage of CPU resources.

But as far as we know, there's not much to address here on the CPU side. Consoles have been able to use tens of thousands of drawcalls, so have clearly been doing something different to PC. ;)

Is it just me, or is there always more info/discussion on the XB API? It feels that way thinking back on last gen as well.
The XB API is pretty much public as it parallels DirectX on PC. Any talk on DX on PC was thus relevant for XBox. Sony's APIs are 100% behind NDAs and we have to wait for Sony/devs to discuss it.
 
If it's already feature-complete, XB1 cannot possibly leapfrog PS4. You can't make a >40% gain in hardware performance from an API change that isn't unique to your platform. If it's not already feature comparable, Sony will bring it in. And the 40% GPU advantage will still be there. If anything, the changes in graphics APIs favour PS4's advancement because it's more CPU-bound than XB1, and alleviating that CPU pressure will free up a larger percentage of CPU resources.

Not claiming the hardware will improve, but if there is overhead in both platforms' tools/SDKs, and DX12 shrinks the bloat/overhead for the Xb1, it will gain on PS4.
The question is, can it catch up and/or get ahead? It might not be "physically" possible, but if the PS4 SDK/tools also have some amount of overhead/bloat, then there will be some changes at least. Also, the extra-fast RAM thingy (esram/edram?) might alleviate some of that 40% you mention once DX12 arrives.
Anyway, it will be interesting to see how the platforms compare after DX12 arrives on the Xb1, and it will be interesting to see if Sony implements some features from DX12.
 
There is no proof yet that X1 is on a completely different feature level than PS4, therefore what Shifty writes is correct: if there's no difference hardware-wise, then any feature difference can be made up by Sony just by developing it, should they choose to. However, this is where my agreement begins to diverge: the notion that the API changes will benefit both systems symmetrically.

The introduction of DX12 will undoubtedly change the way games are programmed, and as such the system bottlenecks are expected to move. Past performance in games will not be a good indicator for the future, and past bottlenecks may not be future bottlenecks. DX12 shows a strong movement towards reducing CPU overhead while simultaneously increasing GPU efficiency/CU saturation by reducing the amount of time the CUs sit idle waiting for work to arrive. However, GPU efficiency comes at a cost; it is not free. Improving GPU efficiency has three additional requirements: first, increased sustained bandwidth to feed the CUs with more work; second, the heat/longevity of the silicon; and lastly, the increased power draw. PS4 and XO approach these limitations differently, and since their approaches differ they cannot gain equally. My position is that Xbox One is better designed in these respects for DX12 than PS4 is, and thus has more to gain moving to DX12 than PS4 would. [Please note this does not mean I expect X1 to outperform PS4, but I do believe the gap will shrink.]

When looking at available bandwidth, this AMD paper [http://research.cs.wisc.edu/multifacet/papers/micro13_hsc.pdf] specifically notes
Page 4: Specifically, for a GPU composed of 32 CUs, we found that 700 GB/s eliminated the memory bandwidth bottleneck for all our workloads.
Doing some math indicates that at 853 MHz with 12 CUs, the bandwidth required to remove the CU bandwidth bottleneck will be approximately ~240 GB/s. This happens to be approximately the total combined bandwidth of Xbox One (192 GB/s + 67 GB/s). Looking at PS4, we have a theoretical max of 176 GB/s, and as such bandwidth will become a bottleneck for all 18 CUs; the math shows there is only enough bandwidth to fully saturate about 9 CUs. From a design perspective, Xbox One is better built for sustained high-bandwidth work: the less idle time, the more performance you can obtain from the Xbox. So much so that in its perfect, impossible form it can fully saturate 12 CUs vs PS4's 9 CUs. This is not a strong argument for comparing performance between the two systems, but it is a strong argument about how the systems are designed for the new API.
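
For transparency, here is the rough back-of-the-envelope scaling behind those numbers as I read them. The linear scaling by CU count and clock, and the ~925 MHz baseline clock assumed for the paper's GPU, are my assumptions rather than figures stated in the paper:

Code:
// Back-of-the-envelope scaling of the paper's "700 GB/s for 32 CUs" figure.
// Assumptions (mine, not the paper's): bandwidth demand scales linearly with
// CU count and clock, and the paper's simulated GPU runs at roughly 925 MHz.
#include <cstdio>

int main()
{
    const double paper_bw  = 700.0;   // GB/s needed for 32 CUs (paper, page 4)
    const double paper_cus = 32.0;
    const double paper_clk = 925.0;   // MHz -- assumed baseline clock

    const double per_cu_xb1 = paper_bw / paper_cus * (853.0 / paper_clk); // ~20.2 GB/s per CU
    const double per_cu_ps4 = paper_bw / paper_cus * (800.0 / paper_clk); // ~18.9 GB/s per CU

    std::printf("XB1, 12 CUs @ 853 MHz: ~%.0f GB/s needed\n", per_cu_xb1 * 12.0); // ~242
    std::printf("PS4, 18 CUs @ 800 MHz: ~%.0f GB/s needed\n", per_cu_ps4 * 18.0); // ~341
    std::printf("CUs that PS4's 176 GB/s can feed: ~%.1f\n", 176.0 / per_cu_ps4); // ~9.3
    return 0;
}

With a 1 GHz baseline clock instead, the XB1 figure drops to roughly 224 GB/s, so treat these as ballpark numbers only.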

The second factor is thermal. GPU efficiency is bound to increase as idle time for the CUs drops, and this has the side effect of increased thermals. Xbox One reportedly had very loud fans on developer units because the sensors weren't ready, so they ran the fans at 100%. To date, no game has caused the Xbox to exhibit enough extra noise to become more audible. A simple look at the Xbox One hardware shows an enthusiast-level cooler for an SoC that requires significantly less power than its PC counterparts.
The Xbox One does use less energy than the PS4 when it's playing games (112 watts versus 137 watts) or streaming videos (74 watts versus 89 watts). Polygon, http://www.nrdc.org/media/2014/140516.asp
The PS4 does not have such a cooler, and today it exhibits increased fan noise when playing specific games. The question becomes: how well tested is the hardware for this level of GPU efficiency?
Can it take the heat? Excessive heat and voltage shorten a chip's lifespan, so are the chips ready for 7 years of torture? I'm confident the Xbox One can: its heatsink cools only the SoC, and the majority of its bandwidth comes from esram, which is centralized on the SoC. I'm not sure the PS4 can; its bandwidth is off-SoC, so increased pressure will heat up the RAM as well (there are no heat sinks on GDDR5 like on PC parts). If you like to overclock you know Prime95, and when running Prime95, if stability is an issue you need to underclock, increase voltage, or reduce speeds on the bus or the RAM, essentially wherever the weakness is. In this scenario, I believe once again that Xbox One is better designed for the highly saturated games we should see coming with DX12.

Lastly, the final factor is power. Not only does Xbox One use less power than PS4, it also has a large dedicated external power brick versus the PS4's internal power supply. Overtaxed power supplies are a less common cause of crashing in Prime95, but they can still be a factor. Very few programs ever push your chips to 100%, but when a power supply is pushed to its limits over sustained periods, the increased heat inside it can cause voltage drops, resulting in crashes and general instability. The Xbox One has an external brick with its own dedicated fan versus the PS4's internal supply sharing the SoC fan. So once again, from a power perspective, Xbox One is ready for DX12; PS4 is not as prepared.

When we consider these points together, it is clear to me at least that MS has designed the console to be ready for DX12. They knew it was coming, and they knew what it would do to the silicon. All areas of the console have been beefed up to prepare for it, and these design choices reflect that DX12 was meant to be the end state for Xbox One; ultimately we should see the console's performance improve greatly as engines move towards DX12-based design. For PS4, I don't see a console that is ready for the additional burden. Sony could choose to implement all the features found in DX12, but they must determine whether the hardware is suited for it for the remaining lifespan of the console. Will their units begin to fail because of the way newer games are designed? If so, the way to control it is to not allow full access to these features and to optimize the API only as much as their console can take. The newer games will still drive PS4 to its limits, but only the limits Sony wants (whatever those may be).

edit: tl;dr: You can't put something into the SDK and then take it out at a later date. If they never designed the console with this particular level of stress in mind, then they need to be very thorough in testing what to enable. Sony doesn't want to be in a situation where all the gen-1 consoles start dying by year 4-5. This isn't applicable to Xbox One, however; it is fully aligned for this and has been stress-tested accordingly.
 
It's complicated, and no single figure can represent the available bandwidth. There's no meaningful average, save perhaps the general-case aggregate of mean accessed BWs in existing titles.
 
I think iroboto is right when he says the Xbox One probably has more available bandwidth/CU than the PS4, provided it's been programmed to make effective use of the ESRAM. I'd hazard a guess that the benefits drop rapidly the closer you get to peak bandwidth, so it doesn't mean the Xbox will outperform the PlayStation.
 
Is that fair/valid, as less than 0.5% of the memory is capable of 192 GB/s?
Wouldn't the average BW be just under 68 GB/s?

That's a good question. I'm fairly certain the average bandwidth for Xbox One isn't only slightly larger than 68 GB/s; that would put it in approximately Radeon 4850 territory. You also have to factor in CPU contention for that part of the memory as well; it would be operating well below 30 GB/s then. The performance of titles today suggests they are using much more, therefore esram utilization must be better than we expect. Peak bandwidth in game code has been recorded at ~140 GB/s in esram, so it comes into question how often that number is sustained.
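
As a toy illustration of why a capacity-weighted average is misleading here, this is the kind of traffic-weighted blend I have in mind; the 60/40 split is purely hypothetical, not a measurement:

Code:
// Toy model: average bandwidth should be weighted by where the traffic goes,
// not by how much memory each pool holds. The 60/40 split is hypothetical.
#include <cstdio>

int main()
{
    const double esram_bw = 140.0;  // GB/s, the reported in-game esram peak
    const double ddr3_bw  = 68.0;   // GB/s, DDR3 theoretical peak
    const double esram_capacity = 32.0 / (8.0 * 1024.0); // ~0.4% of total memory

    // Capacity-weighted "average" -- the logic behind the question quoted above.
    const double by_capacity = esram_capacity * esram_bw + (1.0 - esram_capacity) * ddr3_bw;

    // Traffic-weighted blend, assuming (hypothetically) 60% of GPU traffic hits esram.
    const double esram_traffic = 0.6;
    const double by_traffic = esram_traffic * esram_bw + (1.0 - esram_traffic) * ddr3_bw;

    std::printf("Capacity-weighted: ~%.0f GB/s\n", by_capacity); // ~68
    std::printf("Traffic-weighted:  ~%.0f GB/s\n", by_traffic);  // ~111
    return 0;
}

The point is just that the average has to be weighted by where the traffic actually goes, which we can't know from the outside.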

It only occurs to me that this is like a reverse GeForce 970 situation.
 
When we consider these points together, it is clear to me at least that MS has designed the console to be ready for DX12.
That's an interesting theory. A counterpoint is that Sony were/are expecting GPU utilisation to shoot up thanks to compute. If AMD provided accurate data on high-utilisation, it should have been factored into the design. Of course, mistakes happen and there's the possibility that Sony didn't really design with suitable future-proofing for changes in software and hardware utilisation.
 
That's a good question. I'm fairly certain the average bandwidth for Xbox One isn't only slightly larger than 68 GB/s; that would put it in approximately Radeon 4850 territory.

Why is that a problem or surprising? The XB1 GPU is what, 853 MHz and 768 SPs? A 7770 (250X) is 1000 MHz, 640 SPs and 72 GB/s with no ESRAM.
 
That's an interesting theory. A counterpoint is that Sony were/are expecting GPU utilisation to shoot up thanks to compute. If AMD provided accurate data on high-utilisation, it should have been factored into the design. Of course, mistakes happen and there's the possibility that Sony didn't really design with suitable future-proofing for changes in software and hardware utilisation.
Yes, and I agree with this - we will unfortunately never know, as GNM will stay under lockdown. My only fear is that if they stress-tested for a specific profile and tuned the entire system around it, and now there is an even higher performance profile available to them, then careful consideration needs to happen before implementing it; there could still be room for it, but that's up to Sony.

Another counterpoint is if GNM already had DX12 features built in and that was the profile they tested against. However, in this scenario I would expect the gap between the two consoles to be larger than a canyon. Instead, we see the pattern of a performance gap, particularly 900p vs 1080p, which is accounted for by hardware alone. I don't think GNM is quite DX12 yet, so the question becomes how the stress-test profiles are managed.
 
Why is that a problem or surprising? The XB1 GPU is what, 853 MHz and 768 SPs? A 7770 (250X) is 1000 MHz, 640 SPs and 72 GB/s with no ESRAM.
Well, if esram wasn't used at all, the GPU would be operating at around 30 GB/s or less as it fights with the CPU for memory access.
If esram is only being leveraged at 30-68 GB/s of its 192 GB/s, then you're likely not going to obtain a lot of performance from the hardware; in theory, I guess, you'd be obtaining close to 7770-level performance.

My interest is piqued, though; I am curious how much bandwidth is really required to run some of these games.

I'll be careful with how I word this, so as not to incite forum wars, but if The Order: 1886 is the definitive graphics title for PS4 at roughly 900p or above, and Quantum Break looks similar, perhaps not as refined, but still operates at 900p, there's no way Xbox is pulling that off on a quarter of PS4's bandwidth. Something is missing from the equation, so it's clear esram is performing for the system.
 
Another counterpoint is if GNM already had DX12 features built in and that was the profile they tested against. However, in this scenario I would expect the gap between the two consoles to be larger than a canyon. Instead, we see the pattern of a performance gap, particularly 900p vs 1080p, which is accounted for by hardware alone. I don't think GNM is quite DX12 yet, so the question becomes how the stress-test profiles are managed.
I consider that a given. 'DX12', this new style of graphics hardware implementation, started with Mantle a year or so ago. It didn't exist when XB1 and PS4 launched and likely doesn't exist in their current SDKs.
 
MS tested the 360S against a CPU and GPU power virus, to make sure it could handle any code thrown at the new SoC. One would hope they were just as rigorous with XBone. That's a huge cooler and fan for a chip with power draw that sees Intel speccing a shite blue aluminium noise-generator.

Is that fair/valid, as less than 0.5% of the memory is capable of 192 GB/s?
Wouldn't the average BW be just under 68 GB/s?

Well, probably only 1% of the memory needs more than 68 GB/s. With only 32MB of esram, Xbone seems to fall into that uncomfortable 0.4 ~ 1% need that it can't quite meet.

PS4, OTOH, has many GBs of GDDR5 burning power and $ where DDR3 would easily suffice.
 
I consider that a given. 'DX12', this new style of graphics hardware implementation, started with Mantle a year or so ago. It didn't exist when XB1 and PS4 launched and likely doesn't exist in their current SDKs.
Perhaps - but then there is this slide that likely MJP won't talk about ;)
[GDC slide image: yq4pkk2.jpg]


fixed, moved the GDC slide to imgur. Redtechgaming didn't link the direct link I made.

edit: There appears to be some form of improved multi-threaded command buffer generation - the question is how much improved.
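
For anyone wondering what "multi-threaded command buffer generation" looks like in practice, here's a rough sketch using the public D3D12 pattern: each worker thread records into its own allocator/command list, and the completed lists are submitted together. This is a generic illustration, not anything taken from the slide or from the Xbox SDK:

Code:
// Rough sketch of multi-threaded command buffer generation in D3D12:
// one command allocator + command list per worker thread, single submission.
// `device` and `queue` are assumed to already exist; error handling omitted.
#include <d3d12.h>
#include <thread>
#include <vector>

void RecordOnWorkers(ID3D12Device* device, ID3D12CommandQueue* queue, int workerCount)
{
    std::vector<ID3D12CommandAllocator*>    allocators(workerCount);
    std::vector<ID3D12GraphicsCommandList*> lists(workerCount);
    std::vector<std::thread>                workers;

    for (int i = 0; i < workerCount; ++i)
    {
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
            __uuidof(ID3D12CommandAllocator), (void**)&allocators[i]);
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, allocators[i],
            nullptr, __uuidof(ID3D12GraphicsCommandList), (void**)&lists[i]);

        // Each worker records into its own list; nothing is shared between
        // threads, so no driver-side lock is needed while recording.
        workers.emplace_back([list = lists[i]]
        {
            // ... list->SetPipelineState(...), list->DrawInstanced(...), etc.
            list->Close();
        });
    }

    for (auto& t : workers) t.join();

    // Single submission point: the finished lists go to the GPU queue together.
    std::vector<ID3D12CommandList*> submit(lists.begin(), lists.end());
    queue->ExecuteCommandLists(static_cast<UINT>(submit.size()), submit.data());
}

The contrast with DX11-era deferred contexts is that recording here involves essentially no driver serialisation, which is presumably the kind of improvement the slide is alluding to.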
 