Xbox Scorpio Hardware Prediction Thread [April 2017 Revision]

Can the B3D gestalt accurately predict Scorpio hardware?


It's funny that they're doing things in reverse.
Before:
Games Console -> Games & Media Console

Now:
Games & Media Console -> Games Console

They end up missing out on a couple of sales on each side now.

For the dash, with the added power they could switch the static game tiles back to how they were, and HDMI-in PiP would be another way to set it apart. But as said, they seem to be rolling everything back.
 
PS4 still suffers contention as OS, CPU, GPU all share the same GDDR5

To what degree does PS4 (and Pro) "suffer" due to contention because the CPU and GPU are both pulling from the same pool of bandwidth and capacity and using the same physical interfaces? If contention is a major issue with the unified setup in PS4 and Pro, is creating separate pools of memory with different physical interfaces the best, most efficient, solution to the problem? Isn't presenting a bigger pool of unified memory with a wider interface (allowing more accesses in-flight at any given time) also a solution to that problem?

For some rough perspective, an RX480 @ 5.5TF is served by 256GB/s of memory bandwidth on a 256-bit bus. For the sake of this, let's assume that memory bandwidth needs scale proportionately with the TF rating and aren't widely different between Polaris and the Scorpio GPU arch. 6.0TF/5.5TF × 256GB/s gives us ~280GB/s of memory bandwidth needed to serve a 6.0TF GPU. Scorpio has 40GB/s on top of this and, assuming a 384-bit interface to the unified GDDR5, the benefits of the wider bus in allowing more concurrent accesses. Is adding an additional 64-bit bus so you can have DDR4 going to deliver enough of an additional benefit to be worthwhile?

Edit: To add another data point: the 8C/16T @ 3.2GHz (3.7 Turbo) Core i7-6900K has 76.8GB/s of memory bandwidth when paired with 4 channels of DDR4-2400.
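
To make that back-of-the-envelope math explicit, here's a quick Python sketch; the proportional-scaling assumption is the same rough one stated above:

Code:
# Scaling the RX 480's bandwidth-to-TF ratio up to a 6.0TF GPU.
rx480_tf = 5.5        # TFLOPS
rx480_bw = 256.0      # GB/s on a 256-bit bus
scorpio_tf = 6.0      # rumored

needed_bw = scorpio_tf / rx480_tf * rx480_bw
print(round(needed_bw))                 # ~279 GB/s, i.e. roughly 280

# The i7-6900K data point: 4 channels x 64-bit x 2400 MT/s
ddr4_bw = 4 * 64 * 2400e6 / 8 / 1e9
print(ddr4_bw)                          # 76.8 GB/s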
 
Slightly OT?
I thought contention was caused by read/write switching / turnaround time? I didn't think the size of the bus was the cause of heavy contention, though that also makes sense to me. But I don't see why the bandwidth would drop from 176GB/s to 140GB/s if other factors weren't involved?
 

Providing background for my speculation that there will be just the one pool of GDDR5. So, OnTopic, I think? I'll let the mods decide. Not sure what you mean by the bolded.
 
The infamous slide. It applies to all shared pools of memory. These are effective bandwidth numbers, so 176GB/s (theoretical PS4 max) is probably all reads or all writes. 140GB/s is some sort of optimized mixture, I imagine. But once the CPU gets in there, the bandwidth begins to drop drastically; I assumed this was a result of additional small reads/writes being inserted in there and taking priority over the GPU.
[Image: slide showing PS4 GPU effective bandwidth at ~140GB/s rather than the 176GB/s peak]
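
To picture the shape of that slide, here's a toy model (purely illustrative; the penalty factor is a made-up knob, not a measured PS4 number):

Code:
# Toy model: every GB/s the CPU pulls costs the GPU more than 1 GB/s,
# because the small latency-sensitive CPU accesses force extra
# read/write turnarounds on the shared GDDR5 bus.
peak_mixed = 140.0   # GB/s with no CPU load, roughly what the slide shows
penalty = 2.5        # hypothetical GB/s of GPU bandwidth lost per CPU GB/s

for cpu_bw in (0, 5, 10, 20):
    gpu_bw = peak_mixed - penalty * cpu_bw
    print(f"CPU using {cpu_bw:>2} GB/s -> GPU sees ~{gpu_bw:.0f} GB/s")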
 
12GB GDDR5
2880 GCN (Vega) shaders × 1050MHz clock speed
4-core/8-thread Zen @ 2.0-2.4GHz (that's a maybe, but definitely possible)

$449-499
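
For what it's worth, that shader count and clock land almost exactly on 6TF (a quick check, using the standard 2 FLOPs per shader per clock for FMA):

Code:
shaders = 2880
clock_mhz = 1050
tflops = shaders * 2 * clock_mhz / 1e6   # 2 FLOPs/shader/clock (FMA)
print(f"{tflops:.3f} TF")                # 6.048 TF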
 
Thanks, I tried to quickly find that slide when on break but couldn't.

But considering 90% of us here currently believe that it's using Vega, and Vega has HBCC as part of its selling point, which can also automatically page between a small fast memory pool and a slower, bigger pool, I'm amazed I'm one of the very few who likes the couple-gigs-of-GDDR5-plus-lots-of-DDR4 layout.
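
To illustrate what that automatic paging means, here's a minimal sketch of the idea (my own toy LRU cache, not AMD's actual HBCC logic; page granularity and pool sizes are made up):

Code:
from collections import OrderedDict

FAST_PAGES = 4              # stand-in for a couple of gigs of GDDR5
fast_pool = OrderedDict()   # page -> data, ordered by recency

def access(page, slow_pool):
    # Touch a page: serve it from the fast pool if resident, otherwise
    # fetch from the big slow pool (DDR4) and evict the least-recently-used.
    if page in fast_pool:
        fast_pool.move_to_end(page)
        return fast_pool[page]
    fast_pool[page] = slow_pool[page]
    if len(fast_pool) > FAST_PAGES:
        fast_pool.popitem(last=False)
    return fast_pool[page]

slow_pool = {n: f"texture-{n}" for n in range(16)}
for n in (0, 1, 2, 0, 7, 9, 0):
    access(n, slow_pool)
print(list(fast_pool))      # the hot pages ended up resident in the fast pool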
 
The infamous slide [...] once the CPU gets in there, the bandwidth begins to drop drastically; I assumed this was a result of additional small reads/writes being inserted in there and taking priority over the GPU.
Ah, I see. Makes sense, especially since it's been said that CPU operations are more latency-sensitive, so they are probably allowed to cut in line and, outside of SIMD instructions, are not going to tend to be working on large data sets. I would think that the additional "slots" enabled by the wider bus would help here, though.

I could also see a virtual partitioning of the memory resources, creating different affinities for different regions of memory depending on whether the CPU or GPU is writing the data. My thinking being that this would cause CPU data to tend to cluster around certain memory addresses and the specific hardware that services them. You have a better ability to give the GPU or CPU more memory resources if they need them, though, than if you had a fixed partitioning in hardware.
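
A rough sketch of that soft-affinity idea (a purely hypothetical allocator, just to contrast it with a hard hardware split):

Code:
# CPU and GPU allocations are steered toward opposite ends of one unified
# address space, so each client's traffic clusters on particular addresses
# (and hence particular channels), but either side can grow past its
# nominal share if it needs to, unlike a fixed hardware partition.
TOTAL_MB = 12 * 1024
cpu_next = 0          # CPU allocations grow up from the bottom
gpu_next = TOTAL_MB   # GPU allocations grow down from the top

def alloc(size_mb, client):
    global cpu_next, gpu_next
    if cpu_next + size_mb > gpu_next:
        raise MemoryError("pools met in the middle")
    if client == "cpu":
        base, cpu_next = cpu_next, cpu_next + size_mb
        return base
    gpu_next -= size_mb
    return gpu_next

print(alloc(512, "cpu"), alloc(4096, "gpu"))   # 0 and 8192: opposite ends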
 
[...] I'm amazed I'm one of the very few who likes the couple-gigs-of-GDDR5-plus-lots-of-DDR4 layout.

Mainly because developers say things like this (quoted from this DF article)

Marco Thrush - CTO Bluepoint Games said:
Speaking to Marco Thrush, CTO and owner of Bluepoint Games (known for its excellent Uncharted: the Nathan Drake Collection remaster project), brings us closer to an understanding. Though PS4 and Xbox One bear many overlaps in design with modern PC architecture - more so than earlier console generations - direct comparisons aren't always appropriate. Integrating CPU and GPU into one piece of silicon, then giving both components access to one large pool of memory is an example of how PC tech has been streamlined for its migration into the new wave of consoles. It offers fundamental advantages, but it comes with challenges too.

"The amount of AF [anisotropic filtering] has a big impact on memory throughput," Thrush says. "On PCs, lots of memory bandwidth is usually available because it's fully isolated to the graphics card. On consoles with shared memory architecture, that isn't quite the case, but the benefits you get from having shared memory architecture far outweigh the drawbacks."

Edit: Wanted to add to this a bit.

In the case of discrete cards, the additional pool of memory is a sunk cost. It is present whether the GPU is set up to use it or not, so the calculus of leveraging that resource is different than it is for a console, which would have to factor in the cost of adding that secondary pool vs. expanding the capabilities of the unified pool. The HBCC is also very important in the HPC and content creation markets, where being able to deal with huge data sets in a performant way again changes the calculus. AMD previously stuck a freaking SSD on a graphics card (the Radeon Pro SSG) to address this, so something like the HBCC is sorely needed there.
 
Phone is playing up so I can't reply properly.
I would say that the GPU memory is only a sunk cost if the overall cost of having it is greater than not having it.
If you could have more memory overall for the same cost, then it's a win in that regard.
12GB GDDR5 may cost more than 4GB GDDR5 + 10-12GB DDR4 (unsure of costings).
HBCC can be used as you've highlighted, but that's not its only use, as demonstrated by AMD themselves with the Deus Ex demo.
They also have split memory on their roadmap for laptop-class chips, I believe.
Usually a single pool would be the ideal for a console, but that was prior to things like HBCC; it also wouldn't stop devs manually handling it if they really felt the need.
I think handling 2 pools in the past was a lot more work and trouble than it would be now.
2 days to go, gonna be one of the craziest reveals, more so due to reactions
 

Think about it, though. To get the bandwidth you want you either have to go wide or go fast. Using less memory to get to the same bandwidth would therefore require you to:
  • Use smaller chips on the same size bus, meaning you're not getting as much as you could out of the investment in that bus and the mainboard real estate that the chips and traces take up, and then adding *another* bus and taking up yet more mainboard real estate with your secondary memory chips and traces
  • Run faster, more expensive chips on a smaller bus to get your bandwidth (paying more per GB for that memory) so some of your memory can be cheaper.
I can't say it's impossible that either of those approaches ends up cheaper, but it seems a little shaky to me.
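
Putting rough numbers on that (the per-pin speeds here are illustrative picks, not quotes): peak bandwidth is just bus width times per-pin data rate, so the split layout has to fight uphill:

Code:
def bw_gbs(bus_bits, gbps_per_pin):
    # Peak bandwidth = bus width (bits) * per-pin rate / 8 bits-per-byte
    return bus_bits * gbps_per_pin / 8

# One wide unified pool: 384-bit GDDR5 at 6.8Gbps/pin
print(bw_gbs(384, 6.8))                     # 326.4 GB/s

# Small fast pool + DDR4 on the side: 128-bit GDDR5 at a pricier 8Gbps/pin,
# plus a 64-bit DDR4-2400 channel
print(bw_gbs(128, 8.0) + bw_gbs(64, 2.4))   # 147.2 GB/s combined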
 
And wide tends to be the story of consoles. Wide = less power (for the same bandwidth, a wider bus can run at lower clocks). Less power = less heat. Less heat = more stability. More stability = more reliability.
Just harder to code for ;)
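
The arithmetic behind "wide = less power": for the same peak bandwidth, a wider bus lets every pin run slower (the target figure is illustrative):

Code:
target_bw = 326.4   # GB/s, an illustrative target

for bus_bits in (256, 384):
    gbps_per_pin = target_bw * 8 / bus_bits
    print(f"{bus_bits}-bit bus needs {gbps_per_pin:.1f} Gbps per pin")
# 256-bit would need 10.2 Gbps/pin (past what GDDR5 ships at);
# 384-bit needs only 6.8 Gbps/pin, i.e. lower, cooler clocks.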
 
If it IS Polaris and Jaguar (which I am 93% certain of), then MS will have DF showing examples like Crackdown 3, where the Xbox Scorpio is put next to an i7, with Scorpio offloading physics calculations to its GPU and beating the i7 on frame rate when calculating physics.
So they will push (ironically) the compute angle. I expect the design to focus on compute as well, just as PS4 did 4 years earlier; even if that didn't really come to fruition there, it will be used to combat the 'weak' Jaguar perception.
 
I was thinking 10GB for gaming and 2GB for the OS: 2GB of DDR4 dedicated to the CPU to avoid contention, 8GB GDDR5 for the GPU, and 2GB DDR4 dedicated to the OS. The two pools would be unified virtually, so to the developer it will look as though it is one pool of RAM. I have fairly limited knowledge so I could be way off here :) Most of my knowledge comes from reading this forum for the last three years. I always enjoy your posts Shifty.
Xbox Shifty. :) Now seriously, what about the name? I have a feeling it is going to be called Xbox 10 or Xbox Eco (because of the ecosystem).
 
If it IS Polaris and Jaguar (which I am 93% certain of), then MS will have DF showing examples like Crackdown 3 [...] it will be used to combat the 'weak' Jaguar perception.

You're suggesting Microsoft went to Digital Foundry, of all places, to attempt to pull the wool over people's eyes on the specs/performance? Most who even know about this reveal, or watch/read Digital Foundry, are the most hardcore of fans and tech enthusiasts.

I am hoping for a simple breakdown and analysis of the specs whatever they end up being.
 