Next Generation Hardware Speculation with a Technical Spin [2018]

I'm not convinced a 50% size reduction is on the cards. Xbox One's SoC/APU reduced by 33% going from 28nm to 16nm (363mm^2 to 240mm^2) so expect a similar reduction going to 7nm.
High density SRAM cell is 0.127µm^2 in TSMC's 28nm process and 0.074µm^2 in their 16nm, a 1.7x density improvement.
High density SRAM cell is 0.08µm^2 in GF's 14nm; 7nm promises 0.027µm^2 in both GF's and TSMC's, almost a 3x density improvement.
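
For anyone who wants to sanity-check those ratios, here's a quick back-of-envelope script (just a sketch re-using the die and bitcell figures quoted in this thread, nothing new):

Code:
# Quick check of the figures quoted above (illustrative only; the inputs are
# the numbers from this thread, not new measurements).

def shrink(old_area, new_area):
    """Return the fractional area reduction and the density ratio."""
    return 1 - new_area / old_area, old_area / new_area

# Xbox One SoC: 28nm -> 16nm (mm^2)
reduction, ratio = shrink(363, 240)
print(f"XB1 SoC 28nm -> 16nm: {reduction:.0%} smaller, {ratio:.2f}x denser")

# TSMC HD SRAM bitcell: 28nm -> 16nm (um^2)
print(f"TSMC HD SRAM 28nm -> 16nm: {shrink(0.127, 0.074)[1]:.2f}x denser")

# GF 14nm -> 7nm HD SRAM bitcell (um^2)
print(f"HD SRAM 14nm -> 7nm: {shrink(0.080, 0.027)[1]:.2f}x denser")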

TSMC's 16nm process used the same metal pitches as 20nm, which is why you saw sub-optimal density improvement.

Cheers
 
My thanks to Shifty Geezer for the answer to my question. I must say, this is all rather depressing from a hardware standpoint.
 
Given Mark Cerny's emphasis on "time to triangle" with the PS4, is there an argument that, whilst offloading CPU tasks to the GPU is possible and prevalent in bigger studios, it's worth going with a substantial CPU upgrade for ease of development across the wider development community?
Perhaps OT but I’m not sure if that should be a major factor for console development.

You learn to code for one profile on console. But at least it’s only one profile, versus easier development but the headache of a lot of profiles (i.e. PC).

We’ve seen lots of indies manage themselves over time with PS3. Offloading to GPU shouldn’t be a worse experience than that of PS3 era.
 
High density SRAM cell is 0.127µm^2 in TSMC's 28nm process and 0.074µm^2 in their 16nm, a 1.7x density improvement.
High density SRAM cell is 0.08µm^2 in GF's 14nm; 7nm promises 0.027µm^2 in both GF's and TSMC's, almost a 3x density improvement.

TSMC's 16nm process used the same metal pitches as 20nm, which is why you saw sub-optimal density improvement.

Cheers

It will be interesting to see what percentage reduction we actually get for a whole chip/SoC moving to 7nm. I don't even think Intel claim 3x density improvement from 14>10nm?
 
There's a whole discussion on AI. It's not an easy thing to change. Bethesda had to tone down the Radiant AI in Oblivion because it screwed with the gameplay. We don't want AI that's too smart or it'd be unbeatable.

The important thing is to have versatile hardware that can be used as needed. I'll repeat my desire for a volumetric modelling of the world that can be used for graphics, audio, and AI. Whether that warrants a ray-tracing accelerator, or just a fast GPU with loads of RAM and bandwidth, I don't know.
I don't see specific hardware for raytracing anytime soon because there isn't much of a silver bullet in that area yet. Right now, or in the near future, I think a faster GPU, CPU, memory etc. allows for more research and experimentation. I just see a continuation of the current trend of more emphasis on general compute and gradual deprecation of the more fixed-function aspects of hardware.
 
I don't think there's any chance of either Sony or Microsoft going with an ARM-based CPU solution with PS5 and Xbox Next. Nor do I think they'll stick with Jaguar.

Given the amount of time that's passed since the PS4 and Xbox One released in late 2013, and with PS4 Pro having been released just over 14 months ago, I'm sure Mark Cerny has long since conducted a postmortem on PS4 with his team and already asked developers what they'd like to see in next-generation hardware. I'll bet that at the top of the list was CPU performance. With that in mind, I'm expecting PS5 (and the next Xbox) to use an 8 core, 16 thread Zen 2 CPU at 7nm. The only question in my mind is the clock frequency they'll be able to achieve in a console. I also highly doubt the new consoles will get released before November 2020 at the earliest, but that's also the most likely time frame IMHO. I think Fall 2019 is just too soon, especially for Microsoft, as that would only be 2 years after Xbox One X, and the PS4 is still selling extremely well, with a lot of AAA games likely to come out that year.
 
Megadrive1988 said:
PS4 is still selling extremely well, with a lot of AAA games likely to come out that [2019] year

Actually this could just as easily indicate new hardware will be released in 2019. Recent history shows lots of AAA releases even when new hardware is announced/released.

2013 first/second party AAA games:

GT6
Beyond: Two Souls
God of War: Ascension
TLoU

On top of which there was the small matter of GTA5 releasing... PS4 being announced in February 2013 and releasing in November didn't seem to matter at all.
 
It will be interesting to see what percentage reduction we actually get for a whole chip/SoC moving to 7nm. I don't even think Intel claim 3x density improvement from 14>10nm?

Intel’s 14nm is probably closer to what the others would call 10nm in terms of the sizes.

While 10nm from TSMC and Samsung has only been used in mobile applications, it does have nice density scaling, just not the performance gains. In the ideal world, you’d get a 4x scaling by going to 7 from 14 (in area 14*14=196, 10*10=100, 7*7=49), so 3x is already suboptimal.
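
A minimal sketch of that ideal-scaling arithmetic, taking the node names as if they were literal linear dimensions (which marketing node names haven't been for years, so treat it purely as the upper bound being referred to):

Code:
# Naive area scaling if node names were literal linear dimensions.
def ideal_scaling(old_nm, new_nm):
    return (old_nm ** 2) / (new_nm ** 2)

print(ideal_scaling(14, 10))  # 1.96x  (14*14=196 vs 10*10=100)
print(ideal_scaling(10, 7))   # ~2.04x (10*10=100 vs 7*7=49)
print(ideal_scaling(14, 7))   # 4.0x   -> the quoted ~3x is already below this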
 
High density SRAM cell is 0.127µm^2 in TSMC's 28nm process and 0.074µm^2 in their 16nm, a 1.7x density improvement.
High density SRAM cell is 0.08µm^2 in GF's 14nm; 7nm promises 0.027µm^2 in both GF's and TSMC's, almost a 3x density improvement.

TSMC's 16nm process used the same metal pitches as 20nm, which is why you saw sub-optimal density improvement.

Cheers

Intel’s 14nm is probably closer to what the others would call 10nm in terms of the sizes.

While 10nm from TSMC and Samsung has only been used in mobile applications, it does have nice density scaling, just not the performance gains. In the ideal world, you’d get a 4x scaling by going to 7 from 14 (in area 14*14=196, 10*10=100, 7*7=49), so 3x is already suboptimal.
3x the density but not 2x the transistor count on those mobile chips, I'm guessing? If cost per transistor isn't decreasing by a significant margin, we may get denser, smaller, more energy-efficient chips, but we need cost per transistor to decline so we get more shader cores, TMUs, more capable geometry engines, larger GPU caches, more sophisticated command processors, etc.
 
3x the density but not 2x the transistor count on those mobile chips, I'm guessing? If cost per transistor isn't decreasing by a significant margin, we may get denser, smaller, more energy-efficient chips, but we need cost per transistor to decline so we get more shader cores, TMUs, more capable geometry engines, larger GPU caches, more sophisticated command processors, etc.
From what I recall, the current trend is that cost per transistor has been increasing since 28nm as a result of all the additional processing required to make the smaller nodes happen. If the goal is to use one of the smaller nodes, say 7nm, we may need to wait a significant amount of time for the process to mature enough for substantial gains.
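
As a purely hypothetical illustration of why that matters (the per-mm^2 cost and density numbers below are made-up placeholders, not real foundry pricing):

Code:
# Hypothetical illustration: if cost per mm^2 rises faster than density,
# cost per transistor goes up despite the shrink. All numbers are placeholders.

def cost_per_mtx(cost_per_mm2, mtx_per_mm2):
    """Relative cost per million transistors."""
    return cost_per_mm2 / mtx_per_mm2

old_node = cost_per_mtx(cost_per_mm2=1.00, mtx_per_mm2=1.0)  # baseline node
new_node = cost_per_mtx(cost_per_mm2=2.60, mtx_per_mm2=2.5)  # denser but pricier

print(f"relative cost per transistor: {new_node / old_node:.2f}x")  # 1.04x -> slightly worse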
 
From what I recall, the current trend is that cost per transistor has been increasing since 28nm as a result of all the additional processing required to make the smaller nodes happen. If the goal is to use one of the smaller nodes, say 7nm, we may need to wait a significant amount of time for the process to mature enough for substantial gains.
From what I read, using 250W EUV frickin' laser beams, the cost reduction should continue.

20nm, 14nm and 10nm were the worst, especially at their introduction. All processes still see cost reductions over time.
 
If Sony and Microsoft were smart enough, they would offer two models upfront during launch, and possibly a third during the PS5/NextBox midlife cycle. The launch models would consist of an affordable entry model ($399) and premium ($599). Later during the midlife cycle (4-5 years), they'll introduce another premium Pro/X model ($599) with suitable price cuts on previous models ($249 EM / $349 OG PM).

Entry Model Specs
8 Core CPU @3.2GHz
8-9 TF GPU Performance
16GB of GDDR6
1TB Hybrid HDD

Premium Model Specs
8 Core CPU @3.4GHz
12 TF GPU Performance
16GB of GDDR6
2TB Hybrid HDD

Midcycle Model Specs
8 Core CPU @3.8GHz
16 TF GPU Performance
20 GB of GDDR6
4TB Hybrid HDD
 
@3dilettante Would more memory controllers have any benefit for the APU setup? Was just thinking how GDDR6 is organized as 16-bit per chan (32-bit dual chan) whereas GDDR5 is 32-bit per chan organized as 64-bit MC for Radeons.
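
To make that comparison concrete, here's a rough channel-count sketch for a hypothetical 256-bit bus (the device/channel widths follow the GDDR5/GDDR6 organisation described above; the bus width itself is just an example):

Code:
# Rough channel-count comparison for a hypothetical 256-bit memory bus.
# GDDR5: one 32-bit channel per device, paired under 64-bit MCs on Radeons.
# GDDR6: each 32-bit device exposes two independent 16-bit channels.

BUS_WIDTH = 256  # example only

gddr5_devices  = BUS_WIDTH // 32
gddr5_channels = gddr5_devices        # 8 channels of 32 bits
gddr5_mcs      = BUS_WIDTH // 64      # 4 controllers, Radeon-style

gddr6_devices  = BUS_WIDTH // 32      # devices are still 32 bits wide in total
gddr6_channels = gddr6_devices * 2    # 16 independent 16-bit channels

print(f"GDDR5: {gddr5_devices} devices, {gddr5_channels} ch, {gddr5_mcs} 64-bit MCs")
print(f"GDDR6: {gddr6_devices} devices, {gddr6_channels} ch")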
 
20nm, 14nm and 10nm were the worst, especially at their introduction. All processes still see cost reductions over time.
I guess the question is how long we need to wait for it to be cheap enough to sell 2x/3x the transistor density at the same price point as an older chip. 4 years seems both short and long. I'm honestly having problems predicting the rate at which technology can move. It seems exponential, but at the same time it feels like we're hitting this weird wall.
 
mm... indeed. It was roughly 4 years between 28nm and 16nmFF if we go by desktop GPUs.

The Slim consoles at least were introduced at a smaller chip size, which probably offset the premium that TSMC was demanding at the time. Neo is closer to the original chip size, but it was also produced in more limited quantities.
 
If Sony and Microsoft were smart enough, they would offer two models upfront during launch, and possibly a third during the PS5/NextBox midlife cycle. The launch models would consist of an affordable entry model ($399) and premium ($599). Later during the midlife cycle (4-5 years), they'll introduce another premium Pro/X model ($599) with suitable price cuts on previous models ($249 EM / $349 OG PM).

Entry Model Specs
8 Core CPU @3.2GHz
8-9 TF GPU Performance
16GB of GDDR6
1TB Hybrid HDD

Premium Model Specs
8 Core CPU @3.4GHz
12 TF GPU Performance
16GB of GDDR6
2TB Hybrid HDD

Midcycle Model Specs
8 Core CPU @3.8GHz
16 TF GPU Performance
20 GB of GDDR6
4TB Hybrid HDD


Still not sure why people are expecting desktop frequencies in a closed-box machine....
 
Still not sure why people are expecting desktop frequencies in a closed-box machine....

Why not? The XB360/PS3 CPUs had respectable clock frequencies (3.2GHz). And today's cooling solutions are far more robust than yesterday's designs (heatpipe/vapor-chamber designs have gotten much better).
 
@3dilettante Would more memory controllers have any benefit for the APU setup? Was just thinking how GDDR6 is organized as 16-bit per chan (32-bit dual chan) whereas GDDR5 is 32-bit per chan organized as 64-bit MC for Radeons.

I know that increased channel count has been cited as being beneficial, while the number of separate controllers hasn't been discussed as much. In the CPU space, the ability to operate the 64-bit channels in unganged mode has usually provided better utilization in more irregular workloads.
For GPUs, the 64-bit controller seems to have been preferable since GPUs frequently stripe data at a multiple of DRAM burst length, and the ratio of cycles for commands vs data transfers was apparently low enough that a controller could juggle multiple channels. I believe there was some discussion that HBM might have had more channels per controller, given the sheer number of them and their modest speed, but I wasn't able to find a clear reference.

GDDR6 would seem to provide less slack for the controller to juggle multiple GDDR devices, since the time available before a channel needs more command input would be smaller as the clock increases. That might create a need for one controller per DRAM, or some hierarchy of DRAM management. A 1:1 controller/channel arrangement might be too simple, since the GDDR6 device itself would have more global needs that would introduce a linkage between controllers.

More controllers means more flexibility and scheduling resources for accesses. If trying to balance the GPU and CPU, there's more buffer capacity and more possibilities for allocating memory that can help cater to the different priorities of the two processor types.
For an APU with coherent accesses, AMD's protocols also rely on the controller or its associated logic for broadcasting snoops and determining the global order of accesses to their associated DRAM. Having more of them would mean being able to sustain more concurrent traffic. Other features in the CPU space like memory encryption would scale with the controllers. I'm still not entirely sure where the GPU's compression logic lies, so this might have an association with controller count.

The downside would be that this scales up the global amount of controller hardware and interconnect cost. The CPU and GPU don't have the same desired level of complexity per controller or client count, which may be an area of conflict.
AMD's allegedly leaked HPC "APU" slides sidestepped this with separate CPUs with DDR4 and a GPU with HBM.
A single-chip console APU might not have that option, though it could dispense with some of the high-end features of a server-bound chip. High-latency memory is already a feature of current console APUs, so plugging into a Vega-like mesh might be an option. Alternately, APUs show a willingness to have more complex controller layouts, which may produce a hierarchy or intermediate fabric topology between the full crossbar of Zen and Vega's mesh.
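
For what it's worth, a toy sketch of the striping being described, i.e. interleaving addresses across channels at a multiple of the DRAM burst size (the stripe size and channel count are arbitrary examples, not how any actual AMD controller hashes addresses):

Code:
# Toy address-interleaving sketch: stripe physical addresses across channels
# at a multiple of the DRAM burst size. Constants are arbitrary examples;
# real controllers use more elaborate (often hashed) mappings.

BURST_BYTES  = 32                 # e.g. a 16n burst on a 16-bit GDDR6 channel
STRIPE_BYTES = 8 * BURST_BYTES    # stripe granularity = multiple of the burst
NUM_CHANNELS = 16

def channel_of(addr: int) -> int:
    """Which channel a physical address lands on under simple interleaving."""
    return (addr // STRIPE_BYTES) % NUM_CHANNELS

# Consecutive stripes walk across the channels, spreading bandwidth demand.
for a in range(0, 4 * STRIPE_BYTES, STRIPE_BYTES):
    print(hex(a), "-> channel", channel_of(a))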
 