Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

I'm going to leave this premature BOM estimate of the PS5, based on past iSuppli analyses, DRAMeXchange, and a bit of wild guessing...

$72 - 512Gbit NAND x 12 ($6 each)
$10 - Custom NAND controller
$100 - 310mm2 7nm SoC (edited... convincing arguments below)
$25 - UHD Blu-ray drive
$100 - 16GB GDDR6 (depending on whether Sony managed a good deal this time too)
$25 - PSU (I expect the same power but higher efficiency, compact)
$75 - Mechanical, cooling, PCB, other...
$20 - Controller
$6 - Box contents

$433 Total

It could go up from GDDR6 prices, or down from 7nm getting more affordable. Or both might happen, cancelling out. I don't think the rest can move much. Maybe the SoC yield suffers because of the clocks, maybe the mechanicals get expensive from crazy cooling materials/size....
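Just as a sanity check on the arithmetic, a throwaway C snippet that adds up the line items above (the figures are only this post's estimates, nothing more):

Code:
#include <stdio.h>

int main(void)
{
    /* Line items from the BOM estimate above, in USD. */
    double bom[] = { 72, 10, 100, 25, 100, 25, 75, 20, 6 };
    double total = 0;
    for (int i = 0; i < (int)(sizeof bom / sizeof bom[0]); i++)
        total += bom[i];
    printf("Estimated BOM total: $%.0f\n", total); /* prints $433 */
    return 0;
}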
 
The context is in the transcript I posted.

The downclock is to avoid having to design the entire power delivery/heatsink to allow AVX256 all the time, which would require a substantial margin (we don't know what MS does about this). The reason given for SmartShift is to take "unused" power from the CPU, which statistically allows the GPU to peak more often, "to squeeze every last drop of power available".

This makes sense if we think about what happens without SmartShift. It's invariably better with SmartShift, because when the CPU is waiting on a bunch of cache misses, or is in a less compute-intensive part of the pipeline, the GPU can be signalled to use more power than it would normally be limited to, so it stays at its peak more often. Hence "to squeeze every last drop".
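Purely to illustrate the idea (my own sketch of a SmartShift-style arbiter in C, not AMD's actual implementation): with a fixed total SoC budget, whatever the CPU isn't drawing this control tick gets granted to the GPU, which is why the GPU can hold its peak clock more often than a static split would allow.

Code:
/* Hypothetical fixed-budget power arbiter, evaluated every control tick.
   All names and numbers are illustrative, not real PS5 values. */
typedef struct {
    double total_budget_w; /* fixed SoC power budget             */
    double cpu_floor_w;    /* minimum power reserved for the CPU */
} soc_budget;

double gpu_power_grant(const soc_budget *b, double cpu_draw_w)
{
    /* The CPU keeps what it is actually drawing (never less than its
       floor); everything left over goes to the GPU for this tick. */
    double cpu_reserved = (cpu_draw_w > b->cpu_floor_w) ? cpu_draw_w
                                                        : b->cpu_floor_w;
    return b->total_budget_w - cpu_reserved;
}

When the CPU is stalled on cache misses its draw drops, the grant grows and the GPU stays at its cap; when the CPU ramps up AVX-heavy work, the grant shrinks and the GPU has to back off instead.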
We actually know this: Mini PC tower shape + huge heatsink + big fan.
 
I'm going to leave this premature BOM estimate of the PS5, based on past iSuppli analyses, DRAMeXchange, and a bit of wild guessing...

$72 - 512Gbit NAND x 12 ($6 each)
$10 - Custom NAND controller
$142 - 310mm2 7nm SoC (because the PS4's 348mm2 was $100; scaled to 310mm2 plus 60% for 7nm? Is it?)
$25 - UHD Blu-ray drive
$100 - 16GB GDDR6 (depending on whether Sony managed a good deal this time too)
$25 - PSU (I expect the same power but higher efficiency, compact)
$75 - Mechanical, cooling, PCB, other...
$20 - Controller
$6 - Box contents

$475 Total

It could go up from GDDR6 prices, or down from 7nm getting more affordable. Or both might happen, cancelling out. I don't think the rest can move much. Maybe the SoC yield suffers because of the clocks, maybe the mechanicals get expensive from crazy cooling materials/size....
With your die size, assuming a square die and the known figures of $9000 per wafer and 0.09 defect density, I get a 76% yield and $66.66 per part.

Of course, not all parts will yield out performance wise, but it’s a far cry from your dollar figure.
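For reference, here's roughly how that falls out of one common gross-die formula plus a simple Poisson yield model (assuming the 0.09 is defects per cm^2, a 300mm wafer, and the $9000 wafer price quoted above); the exact dies-per-wafer figure depends on the calculator used, so the cost per good die lands a few dollars under the $66.66 above, but the 76% yield matches:

Code:
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double pi          = 3.14159265358979;
    const double wafer_cost  = 9000.0; /* USD, figure discussed above  */
    const double wafer_d     = 300.0;  /* mm, standard wafer diameter  */
    const double die_area    = 310.0;  /* mm^2, square die assumed     */
    const double defect_dens = 0.09;   /* defects per cm^2 (assumed)   */

    /* Common gross dies-per-wafer approximation. */
    double gross = pi * wafer_d * wafer_d / (4.0 * die_area)
                 - pi * wafer_d / sqrt(2.0 * die_area);

    /* Poisson yield model: Y = exp(-D * A), area converted to cm^2. */
    double yield = exp(-defect_dens * die_area / 100.0);

    printf("gross dies per wafer: %.0f\n", gross);            /* ~190 */
    printf("yield:                %.0f%%\n", 100.0 * yield);  /* ~76  */
    printf("cost per good die:    $%.2f\n",
           wafer_cost / (gross * yield));                     /* ~$63 */
    return 0;
}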
 
With your die size, assuming a square die and the known figures of $9000 per wafer and 0.09 defect density, I get a 76% yield and $66.66 per part.

Of course, not all parts will yield out performance wise, but it’s a far cry from your dollar figure.
Has that price per wafer been confirmed somewhere by TSMC? I mean, we know from several sources that 7nm is about double the price per mm^2 compared to 16/14(/12)nm.
Quick googling revealed one tweet claiming $9965 per wafer, but that's as close to your $9000 as I quickly found, and that doesn't really fit the nearly-double price, since the same tweet says $5912 for 16nm.
 
We actually know this: Mini PC tower shape + huge heatsink + big fan.
It's not huge; the heatsink is similar in size to the XB1X's and could be done in the same form factor with only slightly increased depth for a larger fan. It's a restrictive airflow path, not taking much advantage of the form factor, at least not the way it was speculated. It's only 20% more airflow than the XB1X through the heatsink (from the DF interview). The PSU indicates a probable 225W max versus the 200W max measured for the XB1X. A 130mm fan isn't "huge" compared to 120mm.

It's not the "monster" 400+mm2 die either. It's 360mm2, which is in the same ballpark as all the console launches and mid-gen refreshes of the PS4/XB1 era.

However, if we find out it's pulling 300W, it would be a different discussion. The 225W remains an estimate based on their past consoles.
 
Has that price per wafer been confirmed somewhere by TSMC? I mean, we know from several sources that 7nm is about double the price per mm^2 compared to 16/14(/12)nm.
Quick googling revealed one tweet claiming $9965 per wafer, but that's as close to your $9000 as I quickly found, and that doesn't really fit the nearly-double price, since the same tweet says $5912 for 16nm.
Yes, that $9K number was confirmed. I've posted it here before but I'll have to dig to find it again.

That 2x number was from AMD back in 2018.
 
With your die size, assuming a square die and the known figures of $9000 per wafer and 0.09 defect density, I get a 76% yield and $66.66 per part.

Of course, not all parts will yield out performance wise, but it’s a far cry from your dollar figure.
There's yield, yes, and I'm also curious what additional packaging costs, and maybe margins for different intermediaries (AMD?), are incurred. I just used a shortcut: the $100 iSuppli figure for the PS4's 348mm2 as a reference, plus 60% for 7nm. Maybe that was a bit too crude, but OTOH we don't have the rest of the cost structure for the final part.

If we had some historic wafer cost from 2013 we could see how iSuppli built up from the base wafer cost to estimate the PS4 SoC at $100.
 
There's yield, yes, and I'm also curious what additional packaging costs, and maybe margins for different intermediaries (AMD?), are incurred. I just used a shortcut: the $100 iSuppli figure for the PS4's 348mm2 as a reference, plus 60% for 7nm. Maybe that was a bit too crude, but OTOH we don't have the rest of the cost structure for the final part.

If we had some historic wafer cost from 2013 we could see how iSuppli built up from the base wafer cost to estimate the PS4 SoC at $100.
Agree, we’re lacking the historical comparison to give the needed context. It’s probably over $100, but $140 seems extreme.
 
Agree, we’re lacking the historical comparison to give the needed context. It’s probably over $100, but $140 seems extreme.
Yeah, now that you've pointed out the base 7nm die cost, it looks like I should drop it to $100; there's no way the rest costs that much, nor is the yield that bad!

So I'm at $433 for a $450 retail.
 
Yeah, now that you've pointed out the base 7nm die cost, it looks like I should drop it to $100; there's no way the rest costs that much, nor is the yield that bad!

So I'm at $433 for a $450 retail.
I believe the Bloomberg estimate was around $450, so I'd say you're pretty close.
 
Sony needs to bin for peak boost clocks being advertised. From a binning perspective that doesn’t change anything. The yields will depend on voltage tolerance range for the bin, assuming the chip can hit the peak frequencies in the first place.
Sony needs to bin for the clocks the chips can reach and the power they consume at those speeds, otherwise they would compromise the consistent behavior at constant power Cerny touted. A chip that can hit those clocks but pulls excess power would need to be excluded.
AMD's binning also makes allowances for different cooler capabilities, seemingly on a per-SKU basis. The PS5 could include a cooler generous enough for all chips that pass a more forgiving set of criteria, or be more stringent in what chips it can accept. A more complicated method would be to continue to bin internally to a limited extent, with the initial console board revisions biased towards supporting chips that, in the worst case, need additional VRM and cooler investment. They could then monitor improvements in binning as manufacturing goes up the learning curve and the mix of produced chips starts to have fewer marginal examples and more higher-quality ones. At some point, it could become economical to tighten the bin requirements and pare down the cooling and power limits of the console.
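As a toy illustration of that two-axis bin (my own C sketch; the clock targets are the advertised PS5 peaks, the power cap is a pure placeholder): a part only passes if it reaches the advertised clocks and also stays inside the power the VRMs and cooler are provisioned for at those clocks.

Code:
/* Hypothetical bin criteria; the power cap is a placeholder, not a leak. */
typedef struct {
    double cpu_peak_ghz;    /* best stable CPU clock for this part        */
    double gpu_peak_ghz;    /* best stable GPU clock for this part        */
    double power_at_peak_w; /* SoC draw with both running at those clocks */
} chip_sample;

int passes_bin(const chip_sample *c)
{
    const double cpu_req   = 3.5;   /* advertised CPU peak, GHz      */
    const double gpu_req   = 2.23;  /* advertised GPU peak, GHz      */
    const double power_cap = 200.0; /* placeholder board budget, W   */

    /* Reaching the clocks is not enough; the part must also do it
       within the power the board and cooler were designed around. */
    return c->cpu_peak_ghz >= cpu_req
        && c->gpu_peak_ghz >= gpu_req
        && c->power_at_peak_w <= power_cap;
}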
 
The best possible texture that you can possibly see is dependent on your display resolution

Not even close.
Otherwise the Blu-ray Avatar movie at 720p would look worse than Doom Eternal at 4K, which is not the case.
All raster graphics have problems with aliasing. Unless you defeat aliasing, any texture will do.
You can do it the naive way: just render at 8K and then downscale. Or you can do it in smarter ways (which is what approximately every game tries to do).
In fact, any game that pursues high-IQ 3D graphics is essentially fighting aliasing in its own game-specific way.
Obviously fighting it only with higher-res textures will not do. But if we're talking about naive approaches, it's the first one (after rendering at 8K and downscaling).
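Just to pin down what the naive "render high and downscale" means in code, a minimal C sketch (my own, single-channel only for brevity) of the 2x2 box-filter resolve you would run on an image rendered at twice the target resolution in each axis:

Code:
#include <stddef.h>

/* Naive supersampling resolve: average each 2x2 block of the
   high-res image into one output pixel. src is (2*w) x (2*h),
   dst is w x h, both single-channel, row-major. */
void box_downsample_2x(const float *src, float *dst, size_t w, size_t h)
{
    const size_t src_w = 2 * w;
    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            const float *p = src + (2 * y) * src_w + 2 * x;
            dst[y * w + x] = 0.25f * (p[0] + p[1] + p[src_w] + p[src_w + 1]);
        }
    }
}

Doing this at 8K-to-4K is of course brutally expensive, which is why the smarter anti-aliasing approaches mentioned above exist.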
 
As a GPU extreme case one can think of FurMark. On the CPU side, Cerny specifically mentioned AVX2 being power hungry and not being widely/heavily used in current engines. It really comes down to how well game developers manage to optimize their code, i.e. CPU/GPU utilization and heavy use of specific power-hungry instructions. Most developers likely will not manage to load the CPU and GPU 100%, so there is wiggle room in power draw. Some developers later in the cycle might be able to create insane loads with very few bubbles and hit max power draw, at which point either the CPU or the GPU is preferred.

Cerny also specifically said the GPU clock speed is not limited by power draw. He said there is something else that imposes a hard ceiling on the max clock. So hitting max clock is not the same as hitting max power draw. The same applies to the CPU: just hitting max clock is not hitting max power draw. To hit max power draw, a very specific and likely bubble-free stream of instructions is needed.

I do not know what the maximum power draw is. But clock speeds are not limited by it, or by the cooling solution. They are limited by internal timings with other hardware.

CPU speed will most certainly decrease if AVX256 is used. Even Intel CPUs do that.
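To make "a very specific and likely bubble-free stream of instructions" concrete, here's a minimal C sketch (my own, not from any real engine) of the kind of dense 256-bit FMA loop that keeps a Zen 2 core's FMA pipes saturated with independent accumulator chains; run something like this on every core simultaneously and you're near the worst-case CPU power the budget has to be sized against.

Code:
/* Build with -mavx2 -mfma. Four independent 256-bit FMA chains per
   iteration leave no dependency bubbles between instructions. */
#include <immintrin.h>
#include <stddef.h>

float avx256_burn(const float *src, size_t n)
{
    __m256 acc0 = _mm256_setzero_ps(), acc1 = _mm256_setzero_ps();
    __m256 acc2 = _mm256_setzero_ps(), acc3 = _mm256_setzero_ps();
    const __m256 k = _mm256_set1_ps(1.0001f);

    for (size_t i = 0; i + 32 <= n; i += 32) {
        acc0 = _mm256_fmadd_ps(_mm256_loadu_ps(src + i +  0), k, acc0);
        acc1 = _mm256_fmadd_ps(_mm256_loadu_ps(src + i +  8), k, acc1);
        acc2 = _mm256_fmadd_ps(_mm256_loadu_ps(src + i + 16), k, acc2);
        acc3 = _mm256_fmadd_ps(_mm256_loadu_ps(src + i + 24), k, acc3);
    }

    /* Horizontal reduction so the work is not optimized away. */
    float out[8];
    _mm256_storeu_ps(out, _mm256_add_ps(_mm256_add_ps(acc0, acc1),
                                        _mm256_add_ps(acc2, acc3)));
    return out[0] + out[1] + out[2] + out[3]
         + out[4] + out[5] + out[6] + out[7];
}

By contrast, typical game code is full of branches, scalar work and cache misses, which is exactly the "wiggle room" in power draw described above.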
 
How do you get FLOPS from INTs, though? By definition, FLOPS are floats and INTS are integer.


If you are talking about the Gears benchmark, I'm not sure I would draw any broad performance metrics from that, simply because it runs so well on everything. A Vega 56 can hit 4K 30FPS (console framerates!) on ultra.


On PC? That's because AMD CPUs in the desktop space basically didn't support AVX256 until Zen. AMD APUs released after 2015 or so included support for them, but those were slower parts, not meant for gaming. So unless you wanted to make games only for Intel CPUs, you stuck with AVX128 or avoided them completely. It wasn't until about a year ago that a large number of PC games required AVX instructions at all. There are a bunch of YouTube channels where guys bench modern games on older rigs, and I saw a couple of them finally upgrade from Phenom IIs within the last year because enough games just won't launch on them.

The thing with consoles, though, is that if you are trying to squeeze every bit of performance out of a box made 4 years ago, you will leverage the hardware that's there. If AVX256 has a performance benefit, developers will use it.


Memory buses are described by a bus width, in bits. In the case of Xbox Series X, it's 320 bits. But it's really ten 32-bit connections, one to each of the 10 memory chips. Those 32 bits multiplied by the 10 connections give you the 320-bit bus. Each of the memory chips can transfer 56 GB per second, so reading from all chips gives you 560 GB/s of total bandwidth (10 connections at 56GB/s each). Here's where things get complicated, though. Not all of the memory chips are the same size. There are 1GB and 2GB chips, all with the same speed connection: six 2GB chips and four 1GB chips for a total of 16GB.

So if you are reading all 10 chips at the same time, you can get 560GB/s, but if you only need data stored on those 2GB chips, you are limited to 336 GB/s. There isn't a dedicated path for the CPU or the GPU, but you can store less used data in a way that it only resides on those 2GB chips, so frequently used data can take advantage of the full bus speed most of the time.
Sure, if the game doesn't use the CPU much. But is that really going to happen in graphically intensive games? What's going to happen if a game saturates the CPU at 90% most of the time? Because from what I gathered, ideally all 10GB of fast memory should be dedicated to the GPU.
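For what it's worth, the bus arithmetic in the quote checks out; a quick C version of the same numbers (14Gbps GDDR6, one 32-bit channel per chip, as quoted above):

Code:
#include <stdio.h>

int main(void)
{
    const double pin_rate_gbps = 14.0; /* GDDR6 data rate per pin       */
    const int    bits_per_chip = 32;   /* one 32-bit channel per chip   */
    const int    chips_total   = 10;   /* 6 x 2GB + 4 x 1GB = 16GB      */
    const int    chips_2gb     = 6;    /* only these hold the upper 6GB */

    double per_chip = pin_rate_gbps * bits_per_chip / 8.0; /* GB/s */
    printf("per chip:            %3.0f GB/s\n", per_chip);                /*  56 */
    printf("all 10 chips (10GB): %3.0f GB/s\n", per_chip * chips_total);  /* 560 */
    printf("upper 6GB region:    %3.0f GB/s\n", per_chip * chips_2gb);    /* 336 */
    return 0;
}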
 
One question... looking at Intel charts we see CPU clock speeds decrease with AVX usage.
The PS5 may also have its CPU clocks decrease.
But Xbox Series X has locked speeds. Does this mean it cannot use AVX256?
 
One question... looking at Intel charts we see CPU clock speeds decrease with AVX usage.
The PS5 may also have its CPU clocks decrease.
But Xbox Series X has locked speeds. Does this mean it cannot use AVX256?
We currently do not have all the information to know what the strategy is for the Series X. The power supply and VRMs as described are generous for a CPU alone, but we don't have a reference for the GPU. The CPU doesn't go into the 4GHz+ turbo range, and it's SMT-off at 3.8, which can reduce power as well.
Without knowing what the budget is and getting a good profile of what heavy AVX256 usage would look like, it may be something they've made an allowance for.
Also, there are other ways of paring back power demand if instructions are problematic, such as reducing their issue rate or other ways of slowing the ramp-up of their usage. However, that may include advice to developers that the benefits of those instructions could be more complicated to realize.

This isn't a new issue for Intel chips, although there are a lot of variables like power supply, TDP, core count, SMT, specific core version, specific instructions, etc.

One thing I am curious about the consoles, and the PS5 in particular, is how readily clocks might shift on a per-core or per-CCX basis.
In mixed loads, other threads not running AVX can be penalized if their cores are downclocked. If there are tasks that depend on one another, and a neighboring thread running AVX drops the speed at which locks or dependencies can be cleared, there can be more penalties to overall performance than just the slowdown of the wide AVX thread itself.
 
Not even close.
Otherwise the Blu-ray Avatar movie at 720p would look worse than Doom Eternal at 4K, which is not the case.
All raster graphics have problems with aliasing. Unless you defeat aliasing, any texture will do.
You can do it the naive way: just render at 8K and then downscale. Or you can do it in smarter ways (which is what approximately every game tries to do).
In fact, any game that pursues high-IQ 3D graphics is essentially fighting aliasing in its own game-specific way.
Obviously fighting it only with higher-res textures will not do. But if we're talking about naive approaches, it's the first one (after rendering at 8K and downscaling).
So you're suggesting that PS5 is going to render everything in greater than 4K native and scale down to 4K? Just to take advantage of their hard drive throughput?
 