Predict: The Next Generation Console Tech

An Intel CPU is great at turning paper FLOPS into real FLOPS.
Early on you had an embarrassingly slow Quake 4 on the X360, though that's what you get for making no use of multithreading.

So I guess it depends on what parts of the code you consider: there's the GFLOPS-friendly, vector-friendly code, and there's the code that runs faster on a fat core with all the hardware tricks that make it efficient.

Price is an interesting point, but unrelated to production cost. If the 40€ Celeron G530 were unlocked, we'd all overclock it past 4GHz and game on it :).
 
Wow.

That's rather sad, smart, and confusing all at the same time.

The Good:
-Software > Insta-ports of existing PC games should be a snap and help fill the gap while developers come to grips with using the hardware in more meaningful ways.

Yes, just like every game gets insta-ported to OSX and Linux too... oh wait, they don't; only a few exceptions ever do.
PC hardware != Windows PC
 
Power consumption doesn't fall at the same rate, so we're looking at an 800W console if we want 16x the performance.

Since the PS3 launched, we've gone from 90nm processes to 28nm processes. That's about 10% of the area and a fraction of the power, since dynamic power P = fCV^2, and both C (capacitance) and V (voltage) fall as the process shrinks.
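
A quick back-of-envelope sketch of that claim in Python; the capacitance and voltage figures below are made up purely for illustration:

```python
# Back-of-envelope check of the 90nm -> 28nm claim above.
def area_ratio(node_old_nm, node_new_nm):
    """Ideal area scaling: both dimensions shrink linearly with the node."""
    return (node_new_nm / node_old_nm) ** 2

def dynamic_power(f_hz, c_farads, v_volts):
    """Dynamic switching power: P = f * C * V^2."""
    return f_hz * c_farads * v_volts ** 2

print(area_ratio(90, 28))               # ~0.097, i.e. roughly 10% of the area

# Hypothetical values, only the ratio matters: C halves, V drops 1.1V -> 0.9V.
p_old = dynamic_power(3.2e9, 1.0, 1.1)
p_new = dynamic_power(3.2e9, 0.5, 0.9)
print(p_new / p_old)                    # ~0.33 under these assumptions
```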
 
Are you so sure the latest revision of the chips has scaled accordingly? You can't blindly rely upon the equations here; they only hold under ideal conditions. Are the PS3 chips even at 28nm currently?
 
Top-end PC GPUs are already 20-25x more powerful than what's in the PS3, 7 years after it launched.
25x the performance on what kind of code?

As for power usage, didn't first-release console power usage rise several times from PS2 to PS3? I'm fairly certain the original PS2 consumed considerably less than 100W.
 
Are you so sure the latest chips have scaled accordingly? You can't blindly rely upon the equations here. Are the PS3 chips even at 28nm?

No, you can't rely on them, but they're still a good general way of approximating performance improvements.

PS3 chips are at 45nm right now, but what does that have to do with anything? For reference, since RSX and Cell have gone from 90nm to 45nm, the PS3's power usage has decreased by over 50%. I've found that, roughly, you can use a factor of 0.7 when shrinking a chip from one process to the next (full node drop, not half node). The Xbox 360 has seen a similar power reduction.

Extrapolating, if they shrank to the 32nm process, the chips would be at ~35% of original power usage (0.7^3 ≈ 0.34). Of course, there's a non-linear contribution from the other components in the system too.
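
A minimal sketch of that extrapolation, assuming the ~0.7 factor holds per full node:

```python
SHRINK_FACTOR = 0.7  # empirical power factor per full-node shrink (see above)

def power_fraction(full_nodes):
    """Fraction of original power after a number of full-node shrinks."""
    return SHRINK_FACTOR ** full_nodes

print(power_fraction(2))  # 90nm -> 45nm: ~0.49, matching the >50% drop seen
print(power_fraction(3))  # 90nm -> ~32nm: ~0.34, i.e. the ~35% figure above
```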
 
The 90nm Cell SPE could run at 3.2 GHz at a little over 0.9V, possibly a bit higher with margin.
A 28nm ARM test chip that was touted a while back operates at 0.85V.

Voltage is not scaling all that well, particularly at the multi-GHz range.
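
To put a number on it, here's what that near-flat voltage curve does to the V^2 term of P = fCV^2:

```python
# Voltage barely scaling means the quadratic term barely helps.
v_old, v_new = 0.90, 0.85    # 90nm Cell SPE vs. the 28nm ARM test chip above
print((v_new / v_old) ** 2)  # ~0.89: only an ~11% power saving from voltage
```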
 
Some of you guys are getting lost in this without stepping back and looking at the big picture.

If Sony is going with an APU from AMD, the likelihood of that being a custom part is practically nil.

Think about it. When was the last time AMD (not ATI) hit a deadline?

Now mesh this with the comments from Sony WRT not wanting to launch significantly later than MS as they did with PS3.

That spells off-the-shelf components to me.

If PS4 = AMD APU then we are looking at Trinity or Llano. Or if MS waits long enough, Sony might have the option of going with Kaveri.

The whole gameplan seems to be centered around Time to Market, Low R&D, and Ease of Development.

Trying to get a custom APU that fits those criteria (especially from AMD, of all sources) will result in failure.

One interesting spin on this whole thing may be liolio's idea of SoC+GPU.
That way, they can still hit the criteria above while not being limited to relatively gimped performance. A Trinity+GCN combo wouldn't be out of the question. Time to market would still be relatively fast. Ease of development is still there (though maximizing performance gets a bit trickier). And R&D stays pretty low if we're talking off-the-shelf components.

An interesting option for avoiding the relatively pathetic Llano performance.
 
I've found that, roughly, you can use a factor of 0.7 when shrinking a chip from one process to the next (full node drop, not half node).

90nm to 45nm is two full-node jumps (via 65nm); the half-nodes are 80nm, 55nm, etc. Transistors scale in more than one dimension, so ideally you would expect a 50% reduction in area with each full node. A 0.7 factor is actually very shitty.
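
For concreteness, the ideal full-node math over those two jumps:

```python
# Ideal full-node scaling: linear dimensions x~0.7, so area x~0.5 per node.
nodes_nm = [90, 65, 45]  # the two full-node jumps; 80nm/55nm are half-nodes

for old, new in zip(nodes_nm, nodes_nm[1:]):
    linear = new / old
    print(f"{old}nm -> {new}nm: linear x{linear:.2f}, area x{linear ** 2:.2f}")
# 90nm -> 65nm: linear x0.72, area x0.52
# 65nm -> 45nm: linear x0.69, area x0.48
```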
 
Transistors scale in more than one dimension, so ideally you would expect a 50% reduction in area with each full node. A 0.7 factor is actually very shitty.
Isn't it the case that basically only SRAM scales that well, while all sorts of wiring and some of the more complex units are far from ideal, bringing the actual full-node scaling short of the ideal factor of 0.5?
 
Isn't it the case that basically only SRAM scales that well, while all sorts of wiring and some of the more complex units are far from ideal, bringing the actual full-node scaling short of the ideal factor of 0.5?

Pretty much, which is why the equations aren't the most reliable, especially as we go smaller and smaller. The design of the pipeline for a target clock can be an obstacle as well. Then there are all sorts of problems with analog components preventing a chip from being shrunk further. You can still do a bit better than 70%, but you're ultimately relying on empirical evidence to give you that number.
 
Is yield proportional to area or to transistor count? Since yield is essentially a probability of defects, if there are four times as many transistors per unit area going from 90nm to 45nm, does the yield logically get worse "per area" when shrinking?

I remember IBM giving up on their 4 PPE / 32 SPE design because of major yield issues. At 45nm it would have been a similar die size to the original 90nm Cell, about 235mm². But they moved on to the A2, which is something like 359mm² at 45nm, so it's clearly not an apples-to-apples comparison. They don't have yield issues with a larger but different design, so thinking in reverse, can we assume that the Cell design cannot shrink well, while the A2 can?
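
One way to frame it: in the classic Poisson defect-yield model, yield depends on die area and the process's defect density, not on transistor count directly; a new process starts out with a higher defect density, which is where the "per area" worsening comes from. A rough sketch with a made-up defect density:

```python
import math

def poisson_yield(defect_density_per_mm2, die_area_mm2):
    """Classic Poisson defect-yield model: Y = exp(-D * A)."""
    return math.exp(-defect_density_per_mm2 * die_area_mm2)

d0 = 0.004  # assumed defects/mm^2 on a mature process, purely illustrative
print(poisson_yield(d0, 235))  # ~0.39 for a 235mm^2 die
print(poisson_yield(d0, 359))  # ~0.24 for a 359mm^2 die
```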
 
Isn't it the case that basically only SRAM scales that well, while all sorts of wiring and some of the more complex units are far from ideal, bringing the actual full-node scaling short of the ideal factor of 0.5?

SRAM is most ideal, but GPUs also scale very well.

X1800XT density was 1.11 million transistors/mm² on 90nm
HD7970 density is 12.25 million transistors/mm² on 28nm

AMD GPU average density across nodes, for reference (million transistors/mm²):
90nm: 1.11
80nm: 1.67
65nm: 2.33
55nm: 3.29
40nm: 6.07
28nm: 12.13


GFLOPs per watt has also scaled very well over the years:
90nm: 1.07
28nm: 16.13
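
For what it's worth, here are the per-step factors implied by those averages (quick sketch, densities taken from the list above):

```python
# Per-step density factors implied by the averages above (Mtransistors/mm^2).
density = [(90, 1.11), (80, 1.67), (65, 2.33),
           (55, 3.29), (40, 6.07), (28, 12.13)]

for (n_old, d_old), (n_new, d_new) in zip(density, density[1:]):
    print(f"{n_old}nm -> {n_new}nm: density x{d_new / d_old:.2f}")

# Overall 90nm -> 28nm: x10.9 actual vs. (90/28)^2 = x10.3 ideal area scaling,
# which is why GPU logic is said to scale almost as well as SRAM.
print(12.13 / 1.11, (90 / 28) ** 2)
```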
 
Pretty much, which is why the equations aren't the most reliable, especially as we go smaller and smaller. The design of the pipeline for a target clock can be an obstacle as well. Then there are all sorts of problems with analog components preventing a chip from being shrunk further. You can still do a bit better than 70%, but you're ultimately relying on empirical evidence to give you that number.
Wiring is going to be more of a limiter as processes shrink below 28nm, because wires don't scale like transistors do.
 