I still believe in the idea of a single, unified, scalable architecture that serves all purposes; a pool of processing resources to be used however the software requires, with maximum flexibility and zero wastage.
Define Wastage.
To me, dropping your Texture Units for "a pool of processing resources to be used however the software requires" that is an order of magnitude slower for a task repeated many times in every frame is a good example of Wastage. This would be allowing Philosophical Idealism to rule good Silicon utilization.
You have yourself recognised the excellent IQ of GOW. Are you saying you'd rather this was not possible and we should stick with 4xMSAA?
Devil's Advocate: 1) essentially no games use this technique, so if the measure of good design is the rare exception then, yes, it is probably not a good thing to enable; 2) no insult to the GOW3 guys, but the PS3 has 6 available SPEs, so why weren't those resources spent on the game?; and 3) 4 SPEs is a ton of resource footprint, and it's hard to argue against a ton more general system bandwidth and pure GPU power instead (something the PS3 lacks and that the majority of games would benefit from), as these could and would be of more benefit in more games on a more frequent basis.
Or the increased demands of 16xMSAA? Yes, one way of looking at it is trying to find something useful for the SPEs to do, although I feel that somewhat belittles the GOW and associated teams' efforts in squeezing the technique onto already very occupied hardware. It's not like they had 4 idle SPEs doing absolutely nothing, were just trying out things to fill them up, and found MLAA was a great processing hog that let them max out the platform so they could claim 100% utilisation!
I like the results I saw, but I think it is worth noting that going from 4x to 8x MSAA isn't a 2x cost: it doesn't halve the performance of GPU-bound software, nor does it require 2x the hardware to reach performance parity. If your bandwidth is a general resource (unlike Xenos) and you have the goal of quality IQ (which, as this generation shows, most developers and console makers aren't actually married to: clean textures and smooth edges... stupid assumption on my part!!), then the cost of MSAA in the chip is low, the IQ is high, and when implemented correctly it avoids nasty IQ issues (see: repi's complaints about MLAA and why it was NOT used in BFBC2). It is accessible to developers for the purpose, it offers performant solutions to other IQ-related issues rather than costly brute-force approaches, and the bandwidth is a sharable resource. We have already seen that MSAA can be adjusted on the fly to adapt to framerate, so we could see situations where a bandwidth-bound game scales its MSAA to match the resources present.
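To make that concrete, here's a rough C++ sketch of the kind of adaptive heuristic I mean; the thresholds, sample counts and function name are purely illustrative, not taken from any shipping engine:

```cpp
#include <algorithm>

// Hypothetical heuristic: pick next frame's MSAA sample count based on how
// much of the frame budget the GPU is currently using. The thresholds and
// levels are made up for illustration only.
int ChooseMsaaSamples(float gpuFrameMs, float targetFrameMs, int currentSamples)
{
    const float load = gpuFrameMs / targetFrameMs;

    if (load > 0.95f)                        // about to blow the budget: back off
        return std::max(1, currentSamples / 2);
    if (load < 0.70f && currentSamples < 8)  // plenty of headroom: raise quality
        return currentSamples * 2;
    return currentSamples;                   // otherwise leave it alone
}
```

The point being that the MSAA hardware is already there and its level is just a knob the engine can turn per frame.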
In general I don't think we will see MSAA dropped from hardware anytime soon, because it has earned its place there. Let's not get ahead of ourselves over 1 or 2 games where 1) the resources were otherwise not utilized, 2) the approach happened to work well with the game, and 3) the system has some SERIOUS gotchas in terms of other AA approaches. As noted above, repi didn't think much of these alternative approaches for BFBC2, so it may not be a good general solution... yet.
However, another way to look at it is that MLAA was a solution looking for a platform that could pull it off.
This reminds me slightly of the Ray Tracing versus Rasterizing debate. RT is always a technique waiting for a platform to pull it off, yet the resources to do so are always much, much higher than those needed to do better-looking rasterization. This isn't as extreme, and as noted I liked the effect a lot in GOW3, but it seems the cost is quite high (why not go with robust MSAA hardware + kick-ass CPUs that developers can actually use easily... and make more games better overall?), and some developers have already given a thumbs DOWN to the technique.
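For anyone following along, here's a heavily simplified C++ sketch of the family of post-process filters MLAA belongs to: detect luminance discontinuities in the final image, then blend across them. Real MLAA goes much further (classifying L/Z/U edge shapes and computing coverage-based blend weights per pixel, which is where the SPE time goes), so treat the names and threshold here as hypothetical:

```cpp
#include <cmath>
#include <vector>

// Heavily simplified post-process AA on a luminance buffer: find edges,
// blend across them. Real MLAA instead classifies edge patterns and
// derives coverage-based blend weights per pixel.
struct LumImage {
    int w = 0, h = 0;
    std::vector<float> lum;                         // one luminance value per pixel
    float at(int x, int y) const { return lum[y * w + x]; }
};

LumImage CheapPostAA(const LumImage& in, float edgeThreshold = 0.1f)
{
    LumImage out = in;
    for (int y = 1; y < in.h - 1; ++y) {
        for (int x = 1; x < in.w - 1; ++x) {
            const float c = in.at(x, y);
            const bool edgeRight = std::fabs(c - in.at(x + 1, y)) > edgeThreshold;
            const bool edgeDown  = std::fabs(c - in.at(x, y + 1)) > edgeThreshold;
            if (edgeRight || edgeDown)              // naive 4-neighbour blend where
                out.lum[y * in.w + x] =             // MLAA would use pattern weights
                    0.25f * (in.at(x - 1, y) + in.at(x + 1, y) +
                             in.at(x, y - 1) + in.at(x, y + 1));
        }
    }
    return out;
}
```

Even in this toy form you can see why it eats a whole image's worth of reads per frame, and the pattern-search step in the real thing is far less regular than this.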
T.B. has mentioned in that MLAA thread that the solution as featured in GOW3 doesn't map well to current GPUs, meaning it's not an option.
The current solution. We gave PS3 developers 4+ years to figure this one out; let's give the GPU guys the same time frame to justify their hardware.
As you say, MSAA hardware can be repurposed by clever coders, but that's them working against/around the limits of the design. Wouldn't it be better if, instead of cleverly using custom hardware for unconventional workloads (the whole basis of GPGPU performance), the hardware were fully programmable and there were no architectural limits that either restrict your options (not being able to use MLAA if you want it) or add complexity (finding a way to re-engineer existing MLAA methods to fit a GPU's design)?
I wouldn't say using MSAA hardware to allow significantly cheaper soft shadow edges, or A2C with passable IQ (versus a complete game redesign to remove heavy alpha usage), is hacking the hardware or unconventional. The problem is that, set against the hardware as it stands, SPEs and all, this little amount of dedicated hardware offers big IQ and performance increases. I defer back to the Texture Unit example previously given: Larrabee wanted to be the pie-in-the-sky programmable platform, but even Intel couldn't stomach the thought of dumping those units (which are large and take away from a lot of potential programmable units!).
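For context, A2C is literally a single render-state toggle on MSAA-capable hardware; shown here with the standard OpenGL enable (assuming headers exposing GL 1.3+ / ARB_multisample), not any particular engine's code:

```cpp
// Alpha-to-coverage rides on the MSAA hardware that's already there: the
// fragment's alpha is turned into a per-sample coverage mask, so alpha-tested
// foliage and fences pick up smoothed edges on a multisampled render target.
#include <GL/gl.h>

void EnableAlphaToCoverage()  { glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE); }
void DisableAlphaToCoverage() { glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE); }
```

D3D exposes the same thing as an alpha-to-coverage blend-state flag; either way the cost is paid by silicon the chip already carries.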
It all sounds good in theory--and you can find corner cases to prove your point--but if this generation tells us anything, there are bigger fish to fry. Approachable hardware that gets quality products out with robust content, on schedule and on budget, is a more important metric. Yeah, it sucks that real-world business dictates the coolness of the industry, but I think the reason we saw the hardware you envision (Intel's, for practical purposes) get its v.1 canned was for the very issues the industry faces: cool concept, too slow, too far out of the box, trying to find problems for the solution instead of addressing the core issues.
The idea of software rasterisers means zero architectural limits. No requirement to use one or other AA method, or one or other lighting method, or to do rasterisation at all when your game would benefit more from ray tracing. It throws the doors wide open, and I believe the advancements in software solutions would be dramatic, leading to efficiencies that outweigh the brute-force method of the current GPU structure.
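To be clear about what "software rasteriser" means in practice, here's the textbook half-space triangle loop in C++; it has nothing to do with Larrabee's actual code, but it shows that every stage is just code you could swap for something else (different AA, different shading, even ray tracing):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Textbook half-space triangle rasteriser: the core of any software pipeline.
// Every stage here is plain code, which is the whole point of the argument.
struct Vec2 { float x, y; };

static float Edge(const Vec2& a, const Vec2& b, const Vec2& p)
{
    return (p.x - a.x) * (b.y - a.y) - (p.y - a.y) * (b.x - a.x);
}

void DrawTriangle(std::vector<uint32_t>& fb, int width, int height,
                  Vec2 v0, Vec2 v1, Vec2 v2, uint32_t colour)
{
    // Bounding box of the triangle, clamped to the framebuffer.
    const int minX = std::max(0, (int)std::floor(std::min({v0.x, v1.x, v2.x})));
    const int maxX = std::min(width - 1, (int)std::ceil(std::max({v0.x, v1.x, v2.x})));
    const int minY = std::max(0, (int)std::floor(std::min({v0.y, v1.y, v2.y})));
    const int maxY = std::min(height - 1, (int)std::ceil(std::max({v0.y, v1.y, v2.y})));

    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Vec2 p{ x + 0.5f, y + 0.5f };
            // The pixel centre is inside if all three edge functions agree in sign.
            const float w0 = Edge(v1, v2, p);
            const float w1 = Edge(v2, v0, p);
            const float w2 = Edge(v0, v1, p);
            if ((w0 >= 0 && w1 >= 0 && w2 >= 0) || (w0 <= 0 && w1 <= 0 && w2 <= 0))
                fb[y * width + x] = colour;
        }
    }
}
```

The counter-argument, of course, is exactly the one above: dedicated raster and texture hardware does this same inner loop orders of magnitude more efficiently per watt.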
You should be writing the Intel GPU blog