Next Generation Hardware Speculation with a Technical Spin [2018]

The Pro came out because there was a market for a better PS4; the marketing wasn't 'VR' or even explicitly '4K', and the reason given was to offer a higher tier to stop bleeding to PC. Unless you can present hard numbers correlating 4K/VR with Pro sales, it's safe to assume it would likely have sold the same numbers without the existence of 4K TVs and VR.
Agreed, 1080p@30fps was starting to look very dated compared to PC but both console providers missed the mark. IMO the best way to prevent bleeding to PC is to encourage solid 60 FPS modes and provide the necessary hardware (GoW is just shy of greatness). VRR would be the icing on the cake. I would much prefer to play the console versions of most games, but I really have a hard time with 30 FPS. H:ZD wasn't too bad, although I never went back to finish side quests or for NG+, and 30 FPS has a lot to do with that. Once I experienced the story, grinding again with constant blur wasn't worth it. GoW's high-res mode gives me a headache, so I deal with the judder in the high-perf mode.

I think there is a decent market for "elite" console SKUs with larger SoCs or at least higher APU/memory clocks and proper cooling. I don't see it being any smaller than the high end GPU market and Nvidia moves a lot of 1080s. I would certainly pay $500-700 for a rock solid 60 FPS experience.
 
Agreed, 1080p@30fps was starting to look very dated compared to PC but both console providers missed the mark.
With launch hardware?
I don't see it being any smaller than the high end GPU market and Nvidia moves a lot of 1080s.
How many 1080s are used for gaming? The Steam survey says about 8% for 1070s and 1080s. 8% of 80 million Steam accounts is 6.4 million, which is about 4-5% of a total 150 million console generation install base.
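As a quick back-of-envelope in Python, if it helps; the 80 million Steam accounts and the 150 million console install base are the rough figures used above, not official numbers:

```python
# Rough sizing of the 1070/1080 gaming population vs. the console install base.
# All inputs are the approximate figures quoted above, not official numbers.
steam_accounts = 80_000_000           # assumed active Steam accounts
gtx_1070_1080_share = 0.08            # ~8% combined share in the Steam survey
console_install_base = 150_000_000    # rough size of a full console generation

high_end_pcs = steam_accounts * gtx_1070_1080_share
print(f"1070/1080 owners: {high_end_pcs / 1e6:.1f} million")                    # 6.4 million
print(f"Share of the console base: {high_end_pcs / console_install_base:.1%}")  # 4.3%
```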
I would certainly pay $500-700 for a rock solid 60 FPS experience.
It would probably have to cost a lot more than that, $700 minimum, and history shows $700+ consoles don't sell.
 
With launch hardware?
With refresh hardware. I think Sony did a good job with the launch hardware for the price. For the refresh there were clearly CPU tech constraints. I'm not saying they did a bad job; I still bought a PS4 Pro and I'm glad to have it. I'm saying if the goal was to prevent bleeding to PC, I don't see how it achieves that. People weren't migrating to PC for 4K. 4K just happened to be the main marketable improvement for both Sony and MS without a newer CPU in the mid-gen SKU.
How many 1080s are used for gaming? The Steam survey says about 8% for 1070s and 1080s. 8% of 80 million Steam accounts is 6.4 million, which is about 4-5% of a total 150 million console generation install base.
I don't have figures, but I would suspect most 1080s are absolutely used for gaming, as mining tends to scale better on multiple smaller GPUs. I should have been clearer about the TAM: I would count the 970, 980, 980 Ti, and 1060 or greater as well. Chances are those people have a CPU that can handle 1080p@60. Don't forget that China also skews Steam stats, so those percentages of the real TAM are likely higher.
It would probably have to cost a lot more than that, $700 minimum, and history shows $700+ consoles don't sell.
Why would it have to cost more than a "high settings 1080p@60" PC BoM? Also, $700 isn't what it used to be. As for $700 consoles not selling, it would not be the only SKU, much like last gen, where the 360 had a superior architecture for half the price.
 
Well, the $700 price was for a launch console that's significantly more powerful. A refresh wouldn't need to be as pricey. Still, the cost of a new machine includes the R&D to develop it. If you're selling 70 million units, that R&D gets spread pretty thin. If you're selling 20 million units (PS4 Pro), cost per unit is higher, so reusing existing architecture helps. Creating a completely different box that's compatible using a different architecture is very expensive, and if you're only going to sell 10 million, you need to pass those costs on as a notable addition to the price tag. Basic economics also tells us that the more something costs, the less it sells, so a more powerful PS4 Pro costing more would sell even less than the current Pro, making it even more niche.
 
Fair enough, Sony probably did the smart thing with the Pro. A better CPU wasn't available, and additional CUs or higher GPU clocks without increased bandwidth make little sense. Xbox One X is kinda in no-man's land: more expensive, CPU-limited for 60 FPS, and still not a viable PC replacement. I wonder if they should have waited a year or two and started fresh with Zen. They could have gotten the jump on the next gen, like they did with the 360.
Good news is next gen should have a great CPU baseline. Later upgrades can lead to higher resolutions/settings for those who want them, but we can all enjoy 60 FPS with dynamic scaling. Pretty please with sugar on top :D.
 
Curious what everyone’s thoughts are on the assumption of FP16 improving performance on next gen.

https://gpuopen.com/first-steps-implementing-fp16/

10% gains in their engine, not bad. That's impressive as FP16 isn't suitable for all tasks. I can't remember which developer it was, but I believe they mentioned that with optimal use you could potentially get around a 30% increase in performance over not using dual-rate FP16, with something around a 20% boost being a realistic target. Keep in mind that was just one developer; other developers may have different opinions.

Regards,
SB
 
IIRC, it was 30% increase in 1 specific situation (just a small part of the larger graphics work), not 30% overall improvement.
 
Does anyone think there’s a chance that Intel would license EMIB technology? AMD’s semi-custom division has already worked with it for HBM2 and Kaby Lake G, so it’s not completely out of the realm of possibility. My guess is the biggest hang-up would be who packages the APU. I assume Intel would want control, and that may not be a good situation for MS/Sony in terms of cost or supplier flexibility.
 
10% gains in their engine, not bad. That's impressive as FP16 isn't suitable for all tasks.
There's more to it in the article.
Apparently the current DirectX effect compiler (FXC) only supports FP16 for compatibility (I guess with DX8.1 and earlier games) and promotes everything to FP32, so devs must jump through some hoops to make effective use of the faster ALU operations. In the process, a number of cycles is spent converting FP32 to FP16. The conversion happens in both DX11.1 and DX12.
This will apparently be solved with Shader Model 6.2 (with which Vega GPUs, and maybe P100, should automatically be compatible), after which FP16 operations can be written directly into the code without going through these conversions.

I don't know if Vulkan shares the same limitations, but since it's so recent and broke backwards compatibility with OpenGL, it may not.

So the 10% performance upgrade came up despite the current conversion overhead and the fact that the author admittedly didn't spend a lot of time profiling the shaders that could use FP16.
The author concludes that geometry buffer and post-processing shaders (TAA, checkerboard) are safe to use with FP16, and it seems to be where he used it.

Also, given the author's background, it's possible that these results were achieved in F1 2017, which is a DX11 game.


IIRC, it was 30% increase in 1 specific situation (just a small part of the larger graphics work), not 30% overall improvement.

The 30% number came from slide 82 of this presentation from DICE on Frostbite:
https://www.slideshare.net/DICEStudio/4k-checkerboard-in-battlefield-1-and-mass-effect-andromeda

It doesn't really say if it refers to total performance or just a 30% faster checkerboard resolve shader.
If it's the latter, it doesn't sound like a lot, because that particular shader seems to use only FP16. One would think 100% higher compute throughput and half the required register bandwidth would result in a performance uplift higher than 30%.


Then again, almost all games are using FP32 in everything nowadays so the full scope of FP16's usage may not have been fully understood yet.
 
There are some follow-up clarifying posts from the developer either on here, Twitter, or ResetEra (or even NeoGAF, depending on the timeline). I just can't remember where the original discussion took place. I know we've already talked about it on B3D, and it's no miracle performance increase.

Perhaps it is somewhere in this thread: https://forum.beyond3d.com/threads/id-buffer-and-dr-fp16.60209/
 
Here's the part saying it was only 30% in one particular shader, the checkerboard resolve shader: https://forum.beyond3d.com/posts/1977469/

30% on a specific process... and using FP16, not specifically double-rate FP16, so this would apply to Scorpio too:
http://www.frostbite.com/2017/03/4k-checkerboard-in-battlefield-1-and-mass-effect-andromeda/


Another optimization was the usage of FP16 GCN instructions in our checkerboard resolve shader. PS4™Pro has support for FP16, and we used it throughout the checkerboard resolve shader. The largest benefit was the ability to greatly lower our LDS memory usage. This resulted in a 30% performance improvement in the resolve shader.

In the grand scheme of things...this isn't much at all...and would apply to Scorpio.
 
About RPM, the most informative explanations exposed all the caveats of trying to convert some shaders to RPM/FP16, against the early simplistic expectation of doubling the performance of those shaders.

So 10% overall is an expected figure, all things considered. Benchmarks of shaders which can heavily use RPM show only a 30% gain. So even if there is a lot of work left to increase FP16 usage, it doesn't seem likely to ever reach 20% overall. Let's say a (very) wild guess for a point of diminishing returns is 15% for next gen, after all the tricks are figured out and integrated into AAA engines.
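As a rough sanity check of those percentages, here is a minimal Amdahl-style sketch in Python. It assumes only a fraction of GPU frame time is convertible to double-rate FP16 and that this portion simply runs twice as fast, which is an idealisation; the fractions are illustrative guesses, not profiled numbers:

```python
# Toy Amdahl-style model: a fraction f of GPU frame time is converted to
# double-rate FP16 and runs twice as fast; everything else is unchanged.
# The fractions below are illustrative guesses, not profiled data.
def overall_speedup(fp16_fraction: float) -> float:
    return 1.0 / (1.0 - fp16_fraction / 2.0)

for f in (0.2, 0.3, 0.5):
    gain = overall_speedup(f) - 1.0
    print(f"{f:.0%} of frame time on FP16 -> ~{gain:.0%} faster overall")
# 20% -> ~11%, 30% -> ~18%, 50% -> ~33%
```

Under that toy model, a 10% overall gain corresponds to roughly a fifth of frame time being converted, and even converting half of the frame only buys about a third, which fits the ~15-20% ceiling guessed above.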

If we want 8x this gen's effective GPU performance to call it a true next gen:
PS4/XB1 average x8 = 12.4 TF
Pro equivalent with RPM (15%) = 4.83 TF
Clocking 911 MHz -> 1200 MHz = 6.36 TF
Doubled density = 12.72 TF (i.e. 11 TF FP32 and 22 TF FP16)
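For anyone who wants to play with the assumptions, here is the same arithmetic as a small Python script; the 8x target, the 15% RPM credit, and the clock/density figures are the guesses from above, not anything confirmed:

```python
# Reproducing the estimate above; every input is a speculative assumption.
target_tf = 12.4                      # ~8x the PS4/XB1 average, as stated above

pro_fp32_tf = 4.2                     # PS4 Pro: 36 CUs @ 911 MHz
rpm_credit = 1.15                     # credit RPM/FP16 as ~15% extra effective TF

effective_tf = pro_fp32_tf * rpm_credit      # ~4.83 TF "Pro equivalent"
effective_tf *= 1200 / 911                   # upclock 911 MHz -> 1200 MHz, ~6.36 TF
effective_tf *= 2                            # double the CU count/density, ~12.72 TF

raw_fp32_tf = effective_tf / rpm_credit      # ~11 TF FP32, i.e. ~22 TF FP16
print(f"{effective_tf:.2f} TF effective vs. target {target_tf} TF "
      f"({raw_fp32_tf:.1f} TF FP32 / {2 * raw_fp32_tf:.0f} TF FP16)")
```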

So with a reasonable upclock to keep power in check, the GPU density must be able to double without breaking the bank, and power efficiency must improve by more than what 7nm alone brings. Otherwise it needs a new architecture which can reach much higher clocks than GCN at the same power. I don't know if next gen can happen at 7nm without AMD pulling a rabbit out of a hat.

It also needs 550GB/s memory, which is less of a problem compared to power consumption.
 
It also needs 550GB/s memory, which is less of a problem compared to power consumption.

256-bit GDDR6 is 576 GB/s at 18Gbps per pin, which is on the high end of the spec. 17.2 Gbps would be near spot on. Still, that’s preferable to slower clocks at 384-bit.
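The arithmetic behind those numbers, as a quick check; the 550 GB/s target is the figure from the quoted post, and the bus width and pin speeds are the ones discussed above:

```python
# GDDR6 bandwidth: (bus width in bits / 8) bytes moved per pin transfer.
BUS_BITS = 256

def bandwidth_gb_s(gbps_per_pin: float) -> float:
    return BUS_BITS / 8 * gbps_per_pin   # GB/s

print(bandwidth_gb_s(18.0))              # 576.0 GB/s at the top 18 Gbps bin
print(bandwidth_gb_s(17.2))              # 550.4 GB/s, right at the target

target_gb_s = 550
print(f"Pin speed needed: {target_gb_s / (BUS_BITS / 8):.2f} Gbps")  # 17.19 Gbps
```

A 384-bit bus would hit the same 550 GB/s at only ~11.5 Gbps per pin, which is the trade-off mentioned above.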
 
Does anyone think there’s a chance that Intel would license EMIB technology? AMD’s semi-custom division has already worked with it for HBM2 and Kaby Lake G, so it’s not completely out of the realm of possibility. My guess is the biggest hang-up would be who packages the APU. I assume Intel would want control, and that may not be a good situation for MS/Sony in terms of cost or supplier flexibility.

Possible, but I also wouldn't be surprised if AMD's contract with Intel had some sort of stipulation about what kind of products AMD could release themselves. AMD effectively doesn't need EMIB, given their experience with interposers and the time they've spent researching on-die stackable HBM. A Ryzen APU with a few more CUs and HBM stacked on top would be a direct threat to Kaby Lake G. I can only imagine the contractual obligations set out by both companies when signing the deal. AMD probably demanded that a certain minimum number of Vega M GPUs be bought at a certain price, and in turn Intel likely demanded that AMD not release an HBM equipped APU in the PC space for a certain amount of time. Otherwise I could've never seen the agreement going through.
 
A Ryzen APU with a few more CUs and HBM stacked on top would be a direct threat to Kaby Lake G.
For an up to ~35W APU I'd agree, but go higher than that (i.e. home consoles) and I can see how temperatures could become a problem for the lower layers of the HBM stack.
(This is also why I think a PS4 Go seems so irresistible at the moment: at 7nm, AMD could make a 5-10W APU with base PS4 performance and a single HBM stack on top.)

Regardless, as you stated, many companies have been looking into methods to make lower-cost interposers, so it's not like Intel is the only one researching the subject.
2019 will probably already see final products using much cheaper interposers and/or alternatives with similar functionality.

I can only imagine the contractual obligations set out by both companies when signing the deal. AMD probably demanded that a certain minimum number of Vega M GPUs be bought at a certain price, and in turn Intel likely demanded that AMD not release an HBM equipped APU in the PC space for a certain amount of time. Otherwise I could've never seen the agreement going through.
Agreed, but that would only last until Intel launches their discrete graphics solutions, which could be as early as 2019.
Besides, that "non-compete clause" would probably only refer to markets where AMD and Intel actually compete (PC desktops, servers, laptops, etc.). It wouldn't apply to consoles.
 
Great news; this makes me believe even more in a 2019 launch. With Sony already working on dev tools right now, I wonder if we will hear some news about dev kits in the coming months.
 
It would be in AMD's interest to solicit the best and broadest range of input on the compiler. Low-level console experts would certainly be worth consulting on compilers, especially as game code is important to Zen.

I fully expect CPUs from the Zen line to be integrated into the next major hardware iterations from Sony and MS, but this may not have been work done specifically for a PS5 chip / devkit.
 