liolio, one big advantage of a fully integrated GPU is that it does not need to pass intermediates around via memory. If the vertex processing immediately feeds the rasterizer and that immediately feeds the pixel processing and that immediately feeds the ROPs, you avoid a ton of storage and bandwidth needs for intermediates. If you split the work along the way, you also have to solve (new) storage and transfer issues.
Simple case: forward render, diffuse+specular.
A traditional GPU stores color and Z per pixel in a single pass and is done.
Split it after rasterization and you suddenly need a place to store at least the interpolated texture coordinates to feed your screen-space shading pass, and you also need to budget bandwidth to write them out and read them back in. The cost goes up further for every additional vertex attribute you need during shading.
In terms of silicon cost, a traditional GPU doesn't get this completely for free either, but the machinery to buffer and pass this data around is all inside the chip. There's no intermediate data going out to memory and back in, only the final results.
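To put rough numbers on that, here's a back-of-the-envelope sketch of the extra traffic a split pipeline could generate by round-tripping interpolated attributes through memory. The resolution, attribute count and frame rate are purely illustrative assumptions, not measurements of any real part:

```python
# Rough estimate of the traffic added by writing interpolated attributes to
# memory and reading them back for a separate shading pass.
# All figures below are illustrative assumptions.

width, height  = 1280, 720     # 720p render target
fps            = 60
attributes     = 4             # e.g. position, normal, 2x texcoords
bytes_per_attr = 16            # 4 x FP32 per attribute

bytes_per_pixel = attributes * bytes_per_attr
frame_bytes     = width * height * bytes_per_pixel

# Written once by the rasterization side, read once by the shading pass.
traffic_gb_s = frame_bytes * 2 * fps / 1e9
print(f"{bytes_per_pixel} B/pixel of intermediates, "
      f"~{traffic_gb_s:.1f} GB/s of extra traffic at {fps} fps")
```

With these assumptions it lands around 7 GB/s just for the intermediates, before any overdraw, which is the bandwidth the integrated pipeline never has to spend.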
Thanks for the insight
I was a bit concerned about it, as you'd have to store quite some data per pixel if you want the vector units to "resume" shading, until I realized the situation would not be worse than for deferred shading: you store a lot of data per pixel in your G-buffer anyway. There would be more to the pool of fixed hardware than rasterizer(s) and texture units: some form of fixed pixel pipeline (or, as I said, some shaders if you want to keep more choices/programmability). So I don't think it's a problem.
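A quick sketch of that comparison; both layouts below are made up for illustration, but they're in the usual ballpark for this kind of renderer:

```python
# Per-pixel storage: attributes parked so the vector units can "resume"
# shading vs. a typical deferred-shading G-buffer.
# Layouts are assumptions, not any particular engine's format.

width, height = 1280, 720
mb = lambda bpp: width * height * bpp / 2**20

# Parked attributes: depth + packed normal + 2x texcoords + tangent (mostly FP16).
resume_bpp  = 4 + 8 + 8 + 8        # 28 bytes per pixel

# Common deferred G-buffer: depth + albedo + normal + specular/roughness + motion.
gbuffer_bpp = 4 + 4 + 8 + 4 + 4    # 24 bytes per pixel

print(f"resume buffer: {resume_bpp} B/px, {mb(resume_bpp):.0f} MB at 720p")
print(f"G-buffer     : {gbuffer_bpp} B/px, {mb(gbuffer_bpp):.0f} MB at 720p")
```

Under those assumptions both come out in the low tens of MB at 720p, so parking attributes for a later pass is roughly the same order of cost as a G-buffer.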
I don't see on-chip communication being more of a problem than it is now; whether the "remains of the GPU" are fixed function or not, there would be buffers.
In a console set-up, freed from DirectX compliance, a manufacturer could pass on ROPs altogether and favor MLAA; whatever its drawbacks, it's quite a saviour in memory consumption and the overall quality is great. Actually, I would see the thing working a lot like Larrabee.
The vector units would process geometry in bins like in Larrabee and then send it to the "pixel pipeline", which creates the G-buffer in RAM; then the vector units would once again act the way Larrabee is intended to, processing tiles that fit within their local memory. Actually, it could save external bandwidth.
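A quick check of whether a screen tile's worth of G-buffer actually fits in a Larrabee-style local store. The 256 KB figure is roughly what was quoted per Larrabee core, and the per-pixel layout is the same assumed one as above:

```python
# Does one screen tile of G-buffer data fit in a core's local memory?
# 256 KB local store and 28 B/pixel (G-buffer + accumulation) are assumptions.

local_store_bytes = 256 * 1024
bytes_per_pixel   = 24 + 4     # assumed G-buffer + color accumulation target

for tile in (32, 64, 128):
    tile_bytes = tile * tile * bytes_per_pixel
    verdict = "fits" if tile_bytes <= local_store_bytes else "does not fit"
    print(f"{tile}x{tile} tile: {tile_bytes / 1024:.0f} KB -> {verdict}")
```

With those numbers a 64x64 tile fits comfortably, so once a tile is resident the shading passes wouldn't need to touch external memory at all, which is where the bandwidth saving would come from.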
I'm concerned about things like vertex texturing: say you handle tessellation with the vector units and then want to use displacement mapping, you have to send data to the pool of fixed-function hardware for texturing. The result would be put in a buffer on chip or in RAM, and the vector units would process it when available. But is that really different from what happens today?
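To make that round trip concrete, here's a toy trace of the data flow being described: vector units tessellate, hand the texture fetches to the fixed-function units, and resume once the sampled heights land in a buffer. Every name and value here is hypothetical; it only illustrates where the data sits at each step, not any real hardware path:

```python
# Toy trace of the tessellation + displacement-mapping round trip.
# All data and functions are hypothetical illustrations of the data flow.

height_map = [[(x + y) % 8 / 8.0 for x in range(8)] for y in range(8)]

def tessellate(level):
    """Vector units: emit (u, v) vertices for one patch."""
    step = 1.0 / level
    return [(i * step, j * step) for j in range(level + 1) for i in range(level + 1)]

def texture_fetch(uvs):
    """Fixed-function texture units: point-sample the height map."""
    size = len(height_map)
    return [height_map[min(int(v * size), size - 1)][min(int(u * size), size - 1)]
            for u, v in uvs]

# 1. Vector units tessellate the patch.
verts = tessellate(level=4)
# 2. The fetch requests go to the texture units; results land in a buffer
#    (on chip or in RAM) until the vector units pick them up.
heights = texture_fetch(verts)
# 3. Vector units resume and apply the displacement (here simply along +z).
displaced = [(u, v, h) for (u, v), h in zip(verts, heights)]
print(len(displaced), "displaced vertices")
```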
I realize that texture units sit next to the SIMD arrays for a reason, so once again there would be hardware on top of the triangle set-up/rasterizer, texture units and command processor.
That's why I'm considering fixed-function hardware: to minimize the cost.
The point is that the number of pixels won't increase much in the near future, even less so in the console realm, so the cost would go down.
Say the cost were as much as 50% of the die now at 40nm; it would already be down to 25-30% in 2 or 3 years at 32nm, when next-generation systems launch.
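Roughly how that scaling works out, assuming pure area scaling with the node and a constant total die size (real shrinks are messier, so this is only a sanity check on the figure above):

```python
# Back-of-the-envelope: the fixed-function block stays functionally the same,
# so its area shrinks with the node while the freed area goes to more vector
# units. Ideal area scaling only; real processes don't shrink this cleanly.

fixed_fraction_40nm = 0.50           # the "as much as 50%" figure above
area_scale = (32 / 40) ** 2          # ideal 40nm -> 32nm area scaling, ~0.64

fixed_fraction_32nm = fixed_fraction_40nm * area_scale
print(f"~{fixed_fraction_32nm:.0%} of the die at 32nm")   # ~32%
```

Pure area scaling already lands close to that range, and the fraction drops further if the starting figure is lower than 50% in the first place.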
EDIT 1
I think I should have described the thing from scratch as "having a VPU and a GPU on the same die and trying as hard as possible to minimize the GPU cost"; it may have been clearer.
EDIT 2
Regarding the 50%, it's just for the sake of the discussion; I think the cost would be way lower.