Predict: The Next Generation Console Tech

Discussion in 'Console Technology' started by Acert93, Jun 12, 2006.

Thread Status:
Not open for further replies.
  1. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,574
    Likes Received:
    546
    Location:
    Somewhere over the ocean
    dumb question

    Considering that maybe this time Sony and Microsoft will go for a relatively small chip, and that a wide memory controller can cause pin-layout problems on a small chip at a smaller process node:

    is there the possibility of a fused (not Fusion) CPU + GPU, putting all the silicon into one bigger chip? I'm not thinking of anything too elaborate like a shared cache; just use the same node and technology for both components and then tell the GPU guys "this is the CPU, attach it like a Lego brick". That way you could eliminate the (small) interconnection silicon and implement a 256-bit unified memory controller.
    Can I have this pony?

    And another question, this time less dumb.
    To increase memory bandwidth you can widen the memory controller or use faster memory.
    I know it's difficult to make a memory controller capable of reaching very high speeds, and doing so makes it bigger, more complex, hotter and so on.
    A wider memory controller is easier to develop, and you can use more relaxed memory chips to reach high bandwidth, but it takes a lot of space too, and the board traces plus the additional memory chips increase the cost.
    But if I understand correctly, it's the only way to get to 4 GB of GDDR5 (a feature I'd appreciate).

    What is the best option for making a fast, cheap system?
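The width-versus-speed trade-off in the question can be sketched with back-of-the-envelope arithmetic. The bus widths and per-pin rates below are made-up illustrative figures, not any console's actual spec:

```python
# Peak theoretical bandwidth = bus width (bits) * per-pin data rate / 8.
# Illustrative figures: a wide bus of relaxed memory chips can match
# a narrow bus of aggressive ones.

def peak_bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak theoretical bandwidth in GB/s."""
    return bus_width_bits * gbps_per_pin / 8

wide_slow = peak_bandwidth_gb_s(256, 6.0)     # 256-bit bus, relaxed 6 Gbps chips
narrow_fast = peak_bandwidth_gb_s(128, 12.0)  # 128-bit bus, hypothetical 12 Gbps chips
print(wide_slow, narrow_fast)                 # both reach 192.0 GB/s
```

The wide/slow option spends pins and board area; the narrow/fast option spends memory-controller complexity, exactly the tension the post describes.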
     
  2. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    21,138
    Likes Received:
    6,468
    Location:
    ಠ_ಠ
    Well, it's not a question of whether or not they can make a big enough chip. Chip cost is directly related to die size by way of the number of usable chips produced from a single wafer. There are power and thermal considerations as well, but more importantly, the whole goal is to eventually shrink the chip, and a 256-bit bus would make life extremely annoying down the road. As it is, you're doubling the number of trace paths to the memory chips -> more signal noise, a more complex motherboard (than it would have been), and what if they want to halve the number of memory chips to further reduce material cost...

    All the little things add up.
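The die-size/cost relationship above can be sketched with the standard gross-dies-per-wafer approximation. All numbers here (wafer cost, die areas, yields) are made up for illustration:

```python
import math

# Gross dies per wafer ~= (pi*r^2 / A) - (pi*d / sqrt(2A)): wafer area over
# die area, minus a correction for partial dies lost at the wafer edge.

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    r = wafer_diameter_mm / 2
    return int(math.pi * r ** 2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def cost_per_good_die(wafer_cost: float, die_area_mm2: float,
                      yield_frac: float) -> float:
    # 300 mm wafer assumed; yield figures are invented for the example.
    return wafer_cost / (dies_per_wafer(300, die_area_mm2) * yield_frac)

small = cost_per_good_die(5000, 180, 0.8)  # hypothetical 180 mm^2 die
large = cost_per_good_die(5000, 360, 0.6)  # doubling area also tends to hurt yield
print(round(small, 2), round(large, 2))    # the big die costs well over 2x as much
```

Fewer candidate dies per wafer plus worse yield is why cost grows faster than area, which is what makes a never-shrinkable 256-bit chip so unattractive.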

    If you look at the current die sizes of the 360 slim chips, you wouldn't get to that on a 256-bit bus. So there goes potential power/thermal reductions and chip cost savings. It's a cascading issue that extends to the motherboard components, the case size, the power supply....

    It's possible that they could go with 256-bit now and wait for some mythical/unannounced RAM tech that lets them double the bandwidth per bit, but that's years down the road and high risk. The chip's memory controller would also need to be updated. In theory it's something that could happen at a full process node transition, but they already need to deal with keeping the signalling 100% identical to the previous node, since you're dealing with smaller transistors and potentially shorter paths between chips and memory. So it really depends on how much money they're willing to spend on a process meant to save money.

    The density of the memory chips will be the main factor in hitting higher memory capacities, just by way of the number of chips they are willing to solder to the motherboard's surface. I suppose DIMMs are an option, but that'll have its own caveats with the space taken up within the console as well as with signalling/latency/performance.
     
  3. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    Would there be any cost/benefit improvement in fitting the RAM chips onto the respective packaging of the CPU/GPU (or CGPU, whatever it may be)? I always wondered why Sony had their GDDR3 chips on-package whereas Microsoft never did (I know the eDRAM complicates things). Would it be easier or harder to do things that way? If everything is on the same package, I would think the overall simplification of the motherboard would make up for the more complicated package, but is there something I'm missing?

    Edit: Assuming that they are going to use no more than four 32-bit chips overall, or for either the CPU or GPU.
     
  4. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
    XDR2 has more than twice the bandwidth of GDDR5 per pin, but unfortunately everyone else has shunned them.
     
  5. MarkoIt

    Regular

    Joined:
    Mar 1, 2007
    Messages:
    392
    Likes Received:
    0
    Maybe because it's shared, whereas in the PS3 each chip has its own RAM pool.

    Any chance that Nvidia or AMD will design a specific architecture for next generation, maybe using some of their current technology? For example, take Cayman and replace the 4-way ALUs with 16-way ones. It should be smaller, right? (Fewer redundant transistors.)
    I was reading that some developers are asking for Vec16 ALUs.
     
  6. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    There are probably some very important reasons why they 'just don't use it'. It may cost a lot more, it may involve a lot more board complexity, or maybe they just hate Rambus as a company in general and prefer not to deal with them. If the deal were 'pay slightly more and get a far better technology', I don't think they would shelve it for no good reason. The fact that GDDR5 is very difficult to get to its rated speeds without high board complexity may be a clue, based on what Dave has said in the past regarding traces. Fewer traces may just mean it is harder by another order of magnitude to produce, especially within the constraints of a typical GPU PCB.
     
  7. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    I wouldn't think that'd make any difference, personally. Whether it is shared or not at a chip level doesn't change the basic memory interface architecture.

    Well... since Nvidia uses something entirely different, I don't think they'd do it. As for AMD, well, they reduced the width to 4 rather than increasing it beyond 5. I think that may be a clue as to the current direction of shader workloads. AMD said that the average lane utilisation was in fact about 3.5, I believe, which shows that 5-wide was indeed too wide.
     
  8. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
    Rambus, a US company, sues Taiwanese memory manufacturers in US court over patents it questionably obtained, and it's very likely to win. Therefore the Taiwanese companies collude to sell their memory very cheaply, pricing Rambus out.

    Tech-wise, XDR works just fine in the PS3, and I have little doubt that XDR2 would work fine too. It's all about Rambus and their legal team and how they alienated themselves from everyone else. I'm sure if Sony and MS approached Rambus for XDR2 IP licensing they'd get good deals, since there are no other customers. If nothing else, it might force GDDR5 suppliers to give better prices.
     
  9. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    21,138
    Likes Received:
    6,468
    Location:
    ಠ_ಠ
    mm... hard to say. The whole packaging process will be automated anyway. It's just the wire tracing that will be the main issue, and I'd worry about signal noise with such a packed configuration, particularly with higher-frequency RAM.

    I'd wonder if the eDRAM was the sole reason for not packaging the GDDR3 next to Xenos, considering that's where the memory controller is and Waternoose necessarily goes through the GPU to access memory anyway. However, signal noise would be something to consider as well, because you'd be packing all these high-frequency components near one another. I have to wonder if that played a role in the memory clock reduction for the PS3.

    Thermal dissipation would be another concern as they'd more than likely be using heat spreaders; were it up to me, I'd cool the chips separately, but space is limited too.


    But IIRC, each bit requires a second wire trace anyway for the signalling. I'd have to look it up again.
     
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,258
    Likes Received:
    3,425
    Location:
    Well within 3d
    The 4-way ALU clusters are part of a 16-wide SIMD.
    A 16-ALU cluster would have 256 units in the SIMD.
    It would also be more challenging to connect 16 units to the per-cluster register file.
     
  11. kyetech

    Regular

    Joined:
    Sep 10, 2004
    Messages:
    532
    Likes Received:
    0
    Well, TSMC is skipping 22nm for a slight bump to 20nm:

    http://www.eetimes.com/electronics-news/4088580/TSMC-skips-22-nm-rolls-20-nm-process

    They are targeting H2 2012, so I would expect consoles to be targeted for this node.

    Incidentally, the scaling from 90nm (which the Xbox 360 launched at) to 20nm is 4.5 in x and 4.5 in y, which means you can fit about 20x more transistors into the same space on a 20nm process.

    Even if it's not 20x, 15x more transistors than the Xbox 360 is a lot!
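The scaling claim is easy to check: an ideal linear shrink applies in both x and y, so transistor density goes with the square (real designs never scale quite this perfectly, hence the 15x hedge):

```python
# Ideal area scaling from a 90nm node to a 20nm node.
old_nm, new_nm = 90, 20
linear = old_nm / new_nm   # 4.5x shrink in each dimension
density = linear ** 2      # 20.25x transistors in the same area
print(linear, density)     # 4.5 20.25
```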
     
  12. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,574
    Likes Received:
    546
    Location:
    Somewhere over the ocean
    I think you can't base your worldwide launch on the hope that this time TSMC isn't late and isn't a mess.
    If the best available is 20nm, they will go for the safer 28nm, IMHO.
     
  13. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    We'll have to wait and see if TSMC is on time with a good process. On the other hand, we should expect smaller chips that cost as much as the larger ones did, simply because power requirements have gone up considerably and the price per transistor hasn't fallen as rapidly as in the past, due to increased costs at each subsequent node.
     
  14. (((interference)))

    Veteran

    Joined:
    Sep 10, 2009
    Messages:
    2,499
    Likes Received:
    70
    Is there a limit to the amount of RAM these machines will need, i.e. a point where the law of diminishing returns sets in?
    Or will consoles in 2020 have 32 GB of RAM?

    I would think that with streaming and other LOD techniques, virtual textures and procedural generation, RAM would become less of an issue as time went on.

    The only real problem with the 360/PS3's memory is that at launch they already had less memory than contemporary PCs (while their CPUs and GPUs were generally as good as or better than what was seen in the PC space).

    If the machines had a GB of RAM, we'd see console titles look far closer to what we see on PC.

    So I think the next-gen machines should ship with at least 4 GB of memory, maybe less if they're going to use SSDs for data storage instead of HDDs.

    With eDRAM how much is necessary for a 1080p framebuffer with 4xMSAA and full range HDR?
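One rough answer to the eDRAM question, assuming FP16 RGBA colour (8 bytes per sample), a 32-bit depth/stencil buffer, and MSAA storing every sub-sample (i.e. no framebuffer compression):

```python
# Framebuffer size for 1080p with 4xMSAA and FP16 HDR colour.

def framebuffer_mb(width: int, height: int, msaa: int,
                   bytes_color: int, bytes_depth: int) -> float:
    samples = width * height * msaa          # every sub-sample is stored
    return samples * (bytes_color + bytes_depth) / 2 ** 20

size = framebuffer_mb(1920, 1080, 4, 8, 4)   # FP16 RGBA + D24S8-style depth
print(round(size, 1))  # ~94.9 MB, an order of magnitude beyond the 360's 10 MB
```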
     
  15. Deviousb33r

    Regular

    Joined:
    May 19, 2010
    Messages:
    355
    Likes Received:
    80
    Location:
    California
  16. corduroygt

    Banned

    Joined:
    Nov 26, 2008
    Messages:
    1,390
    Likes Received:
    0
    I can't help but think it has to be cheaper to just go with 128-bit XDR2 for >200GB/s of bandwidth for all of the memory than to reserve a huge chunk of silicon for a sufficiently large eDRAM. It'd also be a lot more flexible for developers, since they could use that bandwidth for anything they please, decreasing development costs as well.
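The >200 GB/s figure checks out if you take a headline XDR2 per-pin rate as the assumption (Rambus has quoted rates up to 12.8 Gbps per pin):

```python
# Peak bandwidth for a hypothetical 128-bit XDR2 interface.
bus_width_bits = 128
gbps_per_pin = 12.8          # assumed headline XDR2 per-pin rate
bandwidth_gb_s = bus_width_bits * gbps_per_pin / 8
print(bandwidth_gb_s)        # 204.8
```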
     
  17. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,910
    Likes Received:
    501
    You say >200 GB/s like that is supposed to be an impressive number.
     
  18. Squilliam

    Squilliam Beyond3d isn't defined yet
    Veteran

    Joined:
    Jan 11, 2008
    Messages:
    3,495
    Likes Received:
    114
    Location:
    New Zealand
    It is impressive compared to 25GB/s, isn't it?
     
  19. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,567
    Likes Received:
    950
    I wouldn't rule out that we will see a DRAM die directly attached to the GPU, either stacked on top, or edge-attached. It could solve a lot of bandwidth problems.

    A 2 or 4 Gbit die could hold all the render targets for 1920x1080 with a decent amount of AA *and* still have room left for a bunch of texture data.

    Cheers
     
  20. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    21,138
    Likes Received:
    6,468
    Location:
    ಠ_ಠ
    Time once was... lol

    Anyway, those numbers were assuming a simple forward renderer (depth + backbuffer, 32bpp/RGBA8). The numbers would be higher with multiple render targets & FP16.
     