Predict: The Next Generation Console Tech

I can't seem to find it now, but someone was asking about the stacked/interposer timetable. Here are a couple more references.

Hynix's outlook:

Kim said that critical to lowering cost will be depreciation (of equipment) and improved yield. Design optimization will help, as will reducing process turn-around time to increase productivity. He shared Hynix's 3D roadmap, saying that volume TSV production will officially start after 2013:

DRAM on Logic for mobile applications in a known good stacked die (KGSD), driven by form factor and power, is in development in 2012, with low production expected early 2013 and ramping to volume late 2014.
DRAM on interposer in a 2.5D configuration for graphics applications, driven by bandwidth and capacity, is in development in 2012, with low production expected by the end of the year and ramping to HVM early in 2014.
3D DRAM on substrate for high performance computing (HPC), driven by bandwidth and capacity, is in development in 2012, with low production expected early 2013 and ramping to volume late 2014.

And another roadmap:

[Yole roadmap image: yole.jpg]


The interesting thing to note on the roadmap is that Sony (and their manufacturing partner Toshiba) dominate the two lower categories, CMOS imagers and MEMS sensors. They are also now involved with Power & RF components. While they aren't involved in manufacturing memory, they would be considered a large consumer and thus an influencer in that category, which also ties back to the top category, which is what has been discussed here as a Sony interest as well.

All in all, when looking at this roadmap, it's no wonder Sony would be mentioned as being active.
 
Well, Bulldozer would be terrible in a console, and Piledriver will be based on it.

Why?

Bulldozer (or more likely its offspring in Piledriver/Steamroller) only looks poor against Intel. Assuming a CPU power allocation in the 40-70W range for a console-ized Bulldozer (with less cache), what else is there? ARM, PPC, Cell, etc. all have major drawbacks. Where are the 8+ high-performance ARM cores with beefy float performance to compare against?

Let's say Sony goes with AMD/"Bulldozer". AMD is already working on some of the performance issues (already claiming a 10% improvement). There is also info from ISSCC that future iterations will scale much, much higher in frequency at the same power (basically, Bulldozer didn't scale in frequency as AMD expected/designed). Stripped-down "Bulldozer" cores aren't going to be huge--not as small as ARM or SPEs, for certain, but not necessarily unruly large. The benefits, of course, are that you get processor cores that perform better on poor code, access to the huge x86 tools and developer base, more synergy with the PC/development, and so forth.

Not all effective strategies need be laid out on a flop/cycle basis. If going with an x86 chip leverages other development resources (PC), helps bring down some development times/issues, etc., in ways that impact the end product, it is a win. Seeing as a 32-SPE Cell isn't in the wings (and who is going to develop such? And what would be the industry response?), the performance gap between platforms will be smaller per mm^2 than this generation "on paper", where "on paper" Cell walks away from Xenon. And yet we have seen time and time again that as a platform it didn't look like a 3x faster system. This coming gen we would be looking at a similar core count between such a Bulldozer and a PPC solution of ballpark performance per core.
 
then.. why not a VIA CPU? :D

I'm not fond of a sea of weak cores. The most demanding thread becomes a bottleneck for everything else (as with Amdahl's law).
 
Is it possible to modify a BD core to make it run SPE code? If not, is it feasible to integrate Cell into the PS4 as a secondary CPU?

One last question: would you let Sony be the only one with an x86 CPU next-gen, if you were MS?
 
If Sony really is using an AMD CPU, Jaguar is the most energy efficient, and it is ready to be produced at TSMC.
Just saying.
A bunch of 3GHz Jaguar cores would leave a lot of budget for the GPU, or even open the possibility of integrating them into the GPU.
 
Bobcat isn't designed to hit frequencies that high. It exists in the roughly 1-10W range, while cores that hit 3GHz are in the 10-100W range.
It's mostly impractical to have a design that tries to span more than one order of magnitude, so the next low-power core from AMD most likely won't try.
If anything, AMD is more interested in cutting the power use, which makes 3 GHz less reachable.
 
then.. why not a VIA CPU? :D

I'm not fond of a sea of weak cores. The most demanding thread becomes a bottleneck for everything else (as with Amdahl's law).

Yeah, while it sounds nice to have a "sea" of smaller cores, how do you efficiently connect them? That is a problem. And I was going to mention that last bit about Amdahl's law, as a lot of code that can be made to run across multiple cores sees a reduction in return per core. So while you may see an 80% speed-up going from 1 core to 2 cores, moving to 3 cores may only see a 50% improvement (2.3x the performance of 1 core), and the next may drop down to 30% (2.6x for 4 cores), and so forth. This should be kept in mind if we see scenarios where one manufacturer has a robust 6-core solution and the other a 12- or 16-core one. At face value it looks like the 6-core solution would be mincemeat, but if those 6 cores are a solid OoOE solution that eats through sloppy code and has great single-threaded performance, that isn't necessarily a horrible option. Both solutions would need attention, just in different ways (and they would probably benefit each other: optimizations for the 12-core solution benefit the 6-core, and focusing on the heavy code segments for the 6-core would do vice versa).
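To put rough numbers on that diminishing-returns point, here's a quick Amdahl's-law sketch in Python. The parallel fraction (0.9) is just an assumption picked to roughly match the 80%/50%/30% figures above, not a measured number.

Code:
# Amdahl's law: total speedup on n cores when a fraction p of the work
# parallelizes perfectly. p = 0.9 is an illustrative assumption only.
def speedup(n, p=0.9):
    return 1.0 / ((1.0 - p) + p / n)

for n in (1, 2, 3, 4, 6, 8, 12, 16):
    print(f"{n:2d} cores -> {speedup(n):.2f}x")

With p = 0.9, going from 8 to 16 cores only buys roughly another 1.4x, which is why a smaller number of stronger cores isn't automatically the losing option.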
 
I wonder how complex it would be to remove the x86->internal microcode translation part and have the code run directly on (really :D) bare metal.

The bare metal instruction lengths are on the order of hundreds of bits. That would be awfully slow.

On the other hand, they could design an entirely new encoding for the frontend. Just replace x86 with an instruction set with a reasonable encoding and exactly the same semantics. Shouldn't be hard. :)
 
Speaking of the "Samaritan" demo, Rein noted that when they showed it off last year, it took three Nvidia cards and a massive power supply to run. However, they showed the demo off again, this time running on a new, not-yet-released Nvidia card and one 200 watt power supply.

Rein said that graphics technology has advanced even faster than Epic thought it would. He, not surprisingly, said that Unreal Engine 4 is "blowing people's socks off" and that there was no going back after seeing that level of graphics... however, Unreal Engine 4 was not shown off to the press.
http://www.gamesindustry.biz/articles/2012-03-08-gdc-epic-aiming-to-get-samaritan-into-flash

:runaway: [it has to be a misquote, a 200W GPU fits]
 
Yeah, while it sounds nice to have a "sea" of smaller cores how do you efficiently connect them?
Perhaps with a mesh like Tilera's: http://semiaccurate.com/2009/10/29/look-100-core-tilera-gx/.

Amdahl's law is not a serious problem for games; outside scientific computing and digital signal processing there can't be many applications with more inherent parallelism. The problem with fine-grained parallelism isn't the application, it's the programming language. Leaving C++ will be painful, but staying might not be much better.
 
Here is the info about the updated version of the Samaritan demo [and a video of it]:
http://www.geforce.com/News/article...-running-on-next-generation-nvidia-kepler-gpu

They replaced 4xMSAA with FXAA.

"Without anti-aliasing, Samaritan’s lighting pass uses about 120MB of GPU memory. Enabling 4x MSAA consumes close to 500MB, or a third of what's available on the GTX 580. This increased memory pressure makes it more challenging to fit the demo’s highly detailed textures into the GPU’s available VRAM, and led to increased paging and GPU memory thrashing, which can sometimes decrease framerates.”

"FXAA is a shader-based anti-aliasing technique,” however, and as such “doesn't require additional memory so it's much more performance friendly for deferred renderers such as Samaritan." By freeing up this additional memory developers will have the option of reinvesting it in additional textures or other niceties, increasing graphical fidelity even further.
 
OK, to answer my own question, they did just show UE4 to a selected few behind closed doors. Here are a few excerpts from an interview:
http://www.gametrailers.com/side-mi...rything-i-could-get-out-of-epic-vp-mark-rein/
"We're doing [Unreal Engine 4] and I don't think anybody else is doing anything as incredible as that," Rein boasted. A developer who had seen Epic's UE4 demo was sitting nearby, eavesdropping. He vigorously nodded his head in agreement.

"Right? You've seen it?" Rein asked him. That developer gave us his non-verbal impression of Unreal Engine 4 by offering us an eyes-glazed-over, jaw-dropped look.
I asked Rein at one point during our interview if Epic Games was going to make hardware spec recommendations to video game hardware manufacturers and, once again, potentially cost Microsoft another $1 billion in RAM costs, as it did with the Xbox 360. Mark declined to give hints about next-gen game hardware, but did assure me that "We are constantly pushing everybody to give us everything, for the benefit of ourselves, our licensees and consumers."
I really hate it when Rein teases like that.
Where's Anonymous when you need them to hack his PC? :LOL:
 