Digital Foundry Microsoft Xbox Scorpio Reveal [2017: 04-06, 04-11, 04-15, 04-16]

Update to the original Digital Foundry article:
http://www.eurogamer.net/articles/digitalfoundry-2017-project-scorpio-tech-revealed

[UPDATE 7/4/17 20:44: Microsoft's Andrew Goossen has been in touch to clarify that D3D12 support at the hardware level is actually a part of the existing Xbox One and Xbox One S too. "Scorpio builds on the Command Processor capability present in the original Xbox One," we're told. "Our implementation of D3D12 supports all Xbox Ones, and games have already shipped that use it. When a game using D3D12 starts up, we reprogram the GPU's Command Processor front-end. The 50 per cent CPU rendering overhead improvement was reported by shipping games. The amount of win is dependent on the game engine and content, and not all games will see that size of improvement. Scorpio's Command Processor provides additional capability and programmability beyond what Xbox One/Xbox One S can do. We plan to take advantage of this in the future."]

It seems DX12 support at the hardware level was already present in the Xbox One.



I thought DX12 into GPU was the "killer customization" of Scorpio, now we find out this thing was already in Xbox One/One S
 
Perhaps a simple litmus test is to perform a DX12 draw call test on the same setup with a Radeon 7970 versus an RX 480. Polaris should have released with that customized command processor, and the 7970 would have released with one designed for DX11 (assuming this is all real).
Wouldn't that depend on whether AMD is permitted to use Microsoft's custom microcode payload in discrete products, assuming DX12 on the PC can transparently work with it?
Since it is a custom feature, Microsoft might be able to reserve it for private use, or restrict its use under Windows so it doesn't disadvantage other IHVs.

The troubling (?) thing is this: the supposedly custom-built Jaguar cores in Scorpio show these wonderful gains because DX12 is built into an additional compute unit closer to the metal on the GPU, one that offloads draw functions from the CPU and shows 30% (?) gains in the Forza tech demo, letting Scorpio run equivalent to a GTX 1070 at PC Ultra settings in 4K. But it only does so using ExecuteIndirect, and only because it's a first-party MS game. The XB1 never saw gains from its "DX12 secret sauce" because third-party developers didn't use it. What would motivate them to use it for Scorpio if they didn't for XB1?
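For context, ExecuteIndirect is the D3D12 call that lets the GPU consume a buffer of draw arguments itself instead of the CPU issuing every draw. A minimal sketch of the mechanism, illustrative only; the argument buffer is assumed to be filled elsewhere (e.g. by the CPU or a compute shader):

```cpp
// Minimal ExecuteIndirect sketch (illustrative; setup objects assumed).
#include <d3d12.h>

// One entry per GPU-driven draw; layout must match the command signature.
struct IndirectDrawArgs {
    D3D12_DRAW_ARGUMENTS draw; // VertexCountPerInstance, InstanceCount, ...
};

void RecordIndirectDraws(ID3D12Device* device,
                         ID3D12GraphicsCommandList* cmdList,
                         ID3D12Resource* argumentBuffer, // filled elsewhere
                         UINT maxDraws)
{
    // Describe the per-draw argument layout: here just a plain draw.
    D3D12_INDIRECT_ARGUMENT_DESC arg = {};
    arg.Type = D3D12_INDIRECT_ARGUMENT_TYPE_DRAW;

    D3D12_COMMAND_SIGNATURE_DESC desc = {};
    desc.ByteStride = sizeof(IndirectDrawArgs);
    desc.NumArgumentDescs = 1;
    desc.pArgumentDescs = &arg;

    ID3D12CommandSignature* cmdSig = nullptr;
    device->CreateCommandSignature(&desc, nullptr, IID_PPV_ARGS(&cmdSig));

    // One API call submits up to maxDraws draws; the command processor
    // reads the arguments itself rather than the CPU issuing each draw.
    cmdList->ExecuteIndirect(cmdSig, maxDraws, argumentBuffer, 0, nullptr, 0);
    cmdSig->Release();
}
```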

I don't see where the DX12-optimized Jaguar concept comes from. They were re-implemented for the process node and may have some secondary tweaks, but it's the GPU's command processor and its software payload that are involved in the "hardware" D3D12 functionality.
The Xbox One's custom DX11 implementation had accelerated functions already before DX12 was brought in.
 
I thought DX12 into GPU was the "killer customization" of Scorpio, now we find out this thing was already in Xbox One/One S
No.
The killer customization of Scorpio is the profiling done to alter the specs of the overall system to support Xbox One games at 4K. It's true that what they've done could only be done at a mid-gen refresh; without the data points it would be hard to profile.
 
I thought DX12 into GPU was the "killer customization" of Scorpio, now we find out this thing was already in Xbox One/One S

It might have been a misunderstanding or miscommunication between Goossen and Leadbetter. The article provides sections of the conversation that might have lost some context, and Goossen's original quote prior to the update is ambiguous in that it describes Xbox rather than Scorpio in particular, and can be interpreted as discussing gains developers have already seen.

From http://www.eurogamer.net/articles/digitalfoundry-2017-project-scorpio-tech-revealed
"It's a massive win for us and for the developers who've adopted D3D12 on Xbox, they've told us they've been able to cut their CPU rendering overhead by half, which is pretty amazing because now the driver portion of that is such a tiny fraction," adds Goossen.

That this capability is described as being expanded with Scorpio may be in line with how GCN 3 and higher GPUs were more flexible in the microcode features they could load or update, due to microcode capacity or specific hardware features.
 
Yeah, it's hard to know with the snippet about DirectX 12 in that article. Who knows in what context they were talking or what question was asked. Re-reading it, it sounds like it could have been a general tech answer.
 
That this capability is described as being expanded with Scorpio may be in line with how GCN 3 and higher GPUs were more flexible in the microcode features they could load or update, due to microcode capacity or specific hardware features.
Thinking about it, the ACE/HWS addition would be new relative to the XB1 and would likely reduce CPU usage, as it handles synchronization. I'll admit I don't know how async draw calls were handled on console, but comparable PC hardware isn't receiving support anymore. This may be what the CP optimization brought, as you suggested.

It's also possible it's fixed on Scorpio, as the HSA features AMD swapped in aren't relevant there. If the implementation was somewhat final, a hardwired element, along with a smaller programmable element for indirect execution or new features, might be what they described.

The async scheduling hardware would be the major DX12 hardware requirement relevant to the command processor.
 
4K televisions are being pushed by big businesses. The simple truth is that a consumer gets very little more visible resolution from a 4K television than from a 1080p television at ordinary viewing distances. However, eager to take people's money, big companies try to make 4K seem absolutely amazing so people will think they need something better than the television already in their living rooms. I'm trying to look out for the consumer. If consumers -- who aren't rich with money to burn on any random thing they see -- were not so manipulated by advertising and these big businesses, far fewer 4K televisions would have been sold. Part of the reason for these 4K consoles (which I'm not even convinced are powerful enough to push native 4K with zero compromises on textures or other issues) is to create another need for 4K televisions. It's a self-perpetuating circle that sucks more money out of the consumer, who is NOT winning (unless they're rich with money to burn).
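A rough back-of-the-envelope supports the viewing-distance point, assuming normal 20/20 acuity of about one arcminute per resolvable detail (the screen size and distance below are just illustrative numbers):

```cpp
// Back-of-envelope acuity check: angular size of one pixel at couch distance.
#include <cmath>
#include <cstdio>

int main() {
    const double kPi = 3.14159265358979;
    const double diagonalIn = 55.0;  // illustrative living-room TV size
    const double distanceIn = 108.0; // ~9 feet, a typical viewing distance

    // 16:9 panel width from the diagonal.
    double widthIn = diagonalIn * 16.0 / std::sqrt(16.0 * 16.0 + 9.0 * 9.0);

    const int resolutions[] = {1920, 3840};
    for (int hres : resolutions) {
        double pixelIn = widthIn / hres;
        double arcmin = std::atan(pixelIn / distanceIn) * (180.0 / kPi) * 60.0;
        printf("%4d px wide: %.2f arcmin per pixel\n", hres, arcmin);
    }
    // Prints ~0.79 arcmin for 1080p and ~0.40 for 4K: both at or below the
    // ~1 arcmin limit, so the extra pixels are hard to see from the couch.
    return 0;
}
```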

Console buyers who know what is going on need to reject these consoles, unless MS and Sony dramatically discount them. If Sony and Microsoft both want to upgrade mid-cycle, offering consoles with maybe half the power increase of a typical refresh, the price should be reduced accordingly.

I don't know. Gamers are buying 4K monitors and expensive GTX 1070 and GTX 1080 GPUs to drive them. Then you have people who want UHD Blu-ray, etc. There are people who care about increased resolution, and I don't know if they've all been tricked into it. That's kind of where PS4 Pro and Scorpio fit: the high-end gamer niche. Sony pretty much explicitly said they made the Pro to retain consumers who would otherwise switch to PC mid-gen. DICE even said console gaming was due for a resolution upgrade in one of their GDC presentations, the one where they talk about checkerboard rendering in Mass Effect. I've seen some devs who aren't that hyped about it, but that's kind of the point: there is a market for it. Some people want it. Not everyone has to. If you don't want it now, skip the "mid-gen" or whatever you want to call it and just buy the next console.

I also don't really know if Microsoft has any incentive to sell 4K TVs. Sony does, but I don't think I'm cynical enough to believe that's primarily why they made the PS4 Pro. I think their story about keeping people on the platform instead of migrating to PC mid-gen is pretty believable.
 
Any thoughts on why they need 362 mm^2 for the SoC, when an RX 480 with 36 CUs needs 232 mm^2? Eight Jaguar cores should be around 12 mm^2 at 16nm (3.1 mm^2 per core at 28nm), and the 4 extra CUs, the extra GDDR5 controllers and the small stuff we know about can't take almost 120 mm^2. They say there's no L3 for the CPU, so what are we missing here? Extra blown-up caches versus the RX 480, or just more redundancy to compensate for the higher clock affecting yield?
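A rough accounting with the numbers above (the reported figures are public; the per-block guesses in the comments are mine and purely illustrative):

```cpp
// Rough die-area accounting using the estimates from the post above.
#include <cstdio>

int main() {
    const double socTotal  = 362.0;    // Scorpio SoC (reported), mm^2
    const double rx480Die  = 232.0;    // Polaris 10: 36 CUs, 256-bit bus
    const double jaguars   = 12.0;     // 8 cores at ~1.5 mm^2 each at 16nm
    const double extraCUs  = 4 * 3.0;  // guess: ~3 mm^2 per extra CU
    const double extraPhy  = 15.0;     // guess: widening to a 384-bit GDDR5 bus

    double accounted = rx480Die + jaguars + extraCUs + extraPhy;
    printf("accounted: ~%.0f mm^2, unexplained: ~%.0f mm^2\n",
           accounted, socTotal - accounted);
    // Roughly 90 mm^2 left for CPU L2, MS-specific blocks, redundancy, I/O
    // and any deliberate dead space, which is the gap being asked about.
    return 0;
}
```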
 
The 3.1 mm^2 figure for a 28nm Jaguar core does not include the L2 cache. Just measure the two quads in the Chipworks photo.

IIRC, 16nmFF is slightly less dense than 14nmFF as well.

There's probably other additional hardware from MS.

I believe it was mentioned that there are 4 disabled CUs.
 
Some of the discussion I've seen was more against the 4K+ resolutions the panel manufacturers were pushing, as opposed to HDR, color depth, variable refresh, etc., which are arguably more meaningful upgrades. Pixel density on 4K is pretty good except for large TVs.
I think it was against locking HDR to 4K panels. If there were a quality screen with HDR that was cheaper than the 4K variant, it would sell very well.
 
Wouldn't that depend on whether AMD is permitted to use Microsoft's custom microcode payload in discrete products, assuming DX12 on the PC can transparently work with it?
Since it is a custom feature, Microsoft might be able to reserve it for private use, or restrict its use under Windows so it doesn't disadvantage other IHVs.
Agreed. I was thinking along the lines of the release of the Radeon HD 7790, which contained all the features found in Xbox One. I was under the impression that something similar might have happened with Polaris. It would be next to impossible to test this feature, though.

I'm not sure how a development studio would be able to confirm the increased output resulting from these customizations. I assume one way could be to compare a PC built to Scorpio-equivalent specs against the actual dev kit and see where they land in draw-call performance.
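For what it's worth, a draw-call throughput comparison along those lines doesn't need much: record a command list full of tiny draws and time how fast the front-end chews through them. A minimal D3D12 sketch, illustrative only; device, queue, PSO, root signature, vertex buffer and fence plumbing are assumed to exist already:

```cpp
// Hypothetical D3D12 draw-call throughput test (illustrative sketch).
// Assumes device, queue, PSO, root signature and vertex buffer exist.
#include <d3d12.h>
#include <chrono>
#include <cstdio>

void MeasureDrawCallRate(ID3D12GraphicsCommandList* cmdList,
                         ID3D12CommandQueue* queue,
                         ID3D12PipelineState* pso,
                         ID3D12RootSignature* rootSig,
                         const D3D12_VERTEX_BUFFER_VIEW& vbv)
{
    const int kDraws = 100000; // enough tiny draws to be front-end bound

    auto t0 = std::chrono::high_resolution_clock::now();

    cmdList->SetPipelineState(pso);
    cmdList->SetGraphicsRootSignature(rootSig);
    cmdList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    cmdList->IASetVertexBuffers(0, 1, &vbv);
    for (int i = 0; i < kDraws; ++i)
        cmdList->DrawInstanced(3, 1, 0, 0); // one tiny triangle per draw
    cmdList->Close();

    ID3D12CommandList* lists[] = { cmdList };
    queue->ExecuteCommandLists(1, lists);
    // ... wait on a fence here so GPU execution is inside the timed window ...

    auto t1 = std::chrono::high_resolution_clock::now();
    double secs = std::chrono::duration<double>(t1 - t0).count();
    printf("%.1fk draws/s\n", kDraws / secs / 1000.0);
}
```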
 
The 3.1 mm^2 figure for a 28nm Jaguar core does not include the L2 cache. Just measure the two quads in the Chipworks photo.

IIRC, 16nmFF is slightly less dense than 14nmFF as well.

There's probably other additional hardware from MS.

I believe it was mentioned that there are 4 disabled CUs.

There are also disabled units in the RX 480, so that's pretty much a wash, at least if we assume the same kind of redundancy. Even with an extra 2x2MB of L2 that's still a lot of unexplained room.
 
There are no disabled units in the 480. There are only 36CUs.

Define "a lot" ? It's known that MS added some additional bits. A high speed GDDR5 bus isn't going to be small either, so +50% is going to be considerable.

Perhaps there is some amount of dead space for thermal/power density reasons as well. See Durango/Liverpool layouts.
 
"Those are the big ticket items, but there's a lot of other configuration that we had to do as well," says Goossen, pointing to a layout of the Scorpio Engine processor. "As you can see, we doubled the amount of shader engines. That has the effect of improvement of boosting our triangle and vertex rate by 2.7x when you include the clock boost as well. We doubled the number of render back-ends, which has the effect of increasing our fill-rate by 2.7x. We quadrupled the GPU L2 cache size, again for targeting the 4K performance."

Xbox One has 4 render back-ends, right?
Scorpio now has 8 render back-ends... but why?
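The 2.7x in the quote falls straight out of unit counts times clocks (853MHz on Xbox One versus 1172MHz on Scorpio), and the doubling makes sense given that 4K is four times the pixels of 1080p. A quick check:

```cpp
// Where the quoted 2.7x fill-rate figure comes from: units x clock.
#include <cstdio>

int main() {
    const double xb1Clk = 853.0, scoClk = 1172.0; // MHz
    const int xb1RBs = 4, scoRBs = 8;             // render back-ends
    // Each GCN render back-end writes 4 pixels per clock.
    double ratio = (scoRBs * 4 * scoClk) / (xb1RBs * 4 * xb1Clk);
    printf("fill-rate ratio: %.2fx\n", ratio);     // ~2.75x
    printf("4K vs 1080p pixels: %.1fx\n",
           (3840.0 * 2160.0) / (1920.0 * 1080.0)); // 4.0x
    return 0;
}
```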
 
Wouldn't that depend on whether AMD is permitted to use Microsoft's custom microcode payload in discrete products, assuming DX12 on the PC can transparently work with it?
Since it is a custom feature, Microsoft might be able to reserve it for private use, or restrict its use under Windows so it doesn't disadvantage other IHVs.

This quote is from Andrew Goossen's 2013 interview with DF:

We also took the opportunity to go and highly customise the command processor on the GPU. Again concentrating on CPU performance... The command processor block's interface is a very key component in making the CPU overhead of graphics quite efficient. We know the AMD architecture pretty well - we had AMD graphics on the Xbox 360 and there were a number of features we used there. We had features like pre-compiled command buffers where developers would go and pre-build a lot of their states at the object level where they would [simply] say, "run this". We implemented it on Xbox 360 and had a whole lot of ideas on how to make that more efficient [and with] a cleaner API, so we took that opportunity with Xbox One and with our customised command processor we've created extensions on top of D3D which fit very nicely into the D3D model and this is something that we'd like to integrate back into mainline 3D on the PC too - this small, very low-level, very efficient object-orientated submission of your draw [and state] commands.

http://www.eurogamer.net/articles/digitalfoundry-the-complete-xbox-one-interview

So MS wanted this custom feature on the PC side, too.
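And on the PC side, D3D12's bundles look like the descendant of exactly that idea: record an object's state and draws once, then "run this" each frame with a single call. A minimal sketch, assuming the bundle allocator, PSO and root signature are created elsewhere:

```cpp
// D3D12 bundle: pre-record an object's state and draws once, replay cheaply.
#include <d3d12.h>

ID3D12GraphicsCommandList* BuildObjectBundle(
    ID3D12Device* device,
    ID3D12CommandAllocator* bundleAlloc, // created with TYPE_BUNDLE
    ID3D12PipelineState* pso,
    ID3D12RootSignature* rootSig,
    const D3D12_VERTEX_BUFFER_VIEW& vbv)
{
    ID3D12GraphicsCommandList* bundle = nullptr;
    device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_BUNDLE,
                              bundleAlloc, pso, IID_PPV_ARGS(&bundle));
    // Bake the object's state and draw once, up front.
    bundle->SetGraphicsRootSignature(rootSig);
    bundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    bundle->IASetVertexBuffers(0, 1, &vbv);
    bundle->DrawInstanced(36, 1, 0, 0);
    bundle->Close();
    return bundle;
}

// Per frame, the "run this" part is a single call on the direct list:
//   directList->ExecuteBundle(objectBundle);
```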

There are no disabled units in the 480. There are only 36CUs.

Define "a lot" ? It's known that MS added some additional bits. A high speed GDDR5 bus isn't going to be small either, so +50% is going to be considerable.

Perhaps there is some amount of dead space for thermal/power density reasons as well. See Durango/Liverpool layouts.

If I remember correctly, Xbox One has 2MB of SRAM between the two clusters of Jaguar CPUs. This might be the case for Scorpio too, but at higher capacity (e.g. 4/8MB or higher, because of 4K content).
 
A Zen and Vega combination would have represented pushing the cutting edge -- at least a little. We haven't seen any console maker incorporate anything revolutionary or highly novel since the PS3 launched with the Cell processor. Since then, we've seen consoles taking little baby steps forward without any of the main players (Sony, Microsoft, or Nintendo) taking a single risk. Scorpio is a disappointment for me because it is a further sign of the new trend of console makers being perfectly happy with mediocrity.

If Scorpio had added SOMETHING significant to the current combination of the Jaguar and Polaris, I wouldn't be so irritated. Instead, they did nothing.

Happy with mediocrity? Where did Cell get Sony? The reality is that regardless of how cutting edge you as a console manufacturer may want to be, you can't step up every 4-8 years and expect to rival the latest and greatest from Intel/AMD/Nvidia with tech that's meant for a $300-$500 device.

Smart engineering in the console space went a long way in the 80s and 90s, when the general-purpose CPU was the primary workhorse for PC rendering and GPUs were a new thing centered around small companies mostly servicing a fledgling market. But we are past those times, and now we have a few companies that collectively generate billions annually by producing processors specifically designed for gaming. Showing up every few years to outsmart those who have spent every day for decades trying to outsmart each other isn't a sound strategy. It's smarter to appropriate their designs than to try to invent tech that's more performant while being cheaper, because in the long run the former strategy will prove more fruitful than the latter.
 
Nice DF article. It certainly seems to me that Microsoft has made some smart choices without complicating things unnecessarily. It is all about making the right compromises in a Console within a silicon and cost budget. I might very well be in the market for one for the family. :yes:
 
Is it just me, or did MS knock it out of the park features-wise? Not so much the hardware itself, I don't care about that... but their commitment to making a statement, to making sure the whole library is treated with consistency and respect on the more capable hardware, is great to see.

Is this the advantage of being a software oriented company over a hardware oriented one?
 