Predict: The Next Generation Console Tech

Well, actually AMD Fusion could remove the need for external eDRAM. If they use a much bigger L3 on die (32 to 64 MB), both the GPU and the CPU could access that data with extremely low latency and at very high speed. Developers would be free to use it as they want (framebuffer or whatever).
But since Microsoft is probably going to launch in 2011, I can't see a custom APU solution coming so soon.
 

32MB of cache would take up a lot of room even on the 32nm process, wouldn't it? The only chip with that much on-board cache is the POWER7, and that hasn't been released yet! In addition, it's simply a very big chip, much larger than you would expect a console to employ.

Are there any even denser cache architectures which could be employed on die?
 

IBM estimates that the eDRAM has a 6:1 Latency improvement for L3 accesses relative to an external L3. Relative to an internal SRAM array, eDRAM takes about one third the space and consumes about one fifth the standby power. As for performance, IBM characterizes it as "almost as fast" and says that it handles the memory refreshes required by DRAM--memory contents have to be periodically written or they will decay--during "windows of opportunity" and generally won't have much of an impact on system performance.
http://news.cnet.com/8301-13556_3-10316305-61.html

http://arstechnica.com/hardware/new...er7-twice-the-muscle-half-the-transistors.ars


So 32 MB of L3 eDRAM takes about the same space as 10 MB of SRAM.
If 10 MB of eDRAM took about 70 mm^2 at 90nm, the same amount would take roughly a ninth of that space at 32nm. So for 32 MB we are talking about 20 mm^2.
Actually, it would be nice to have a daughter die of around 80 mm^2 with 128 MB of eDRAM, shared between the GPU and the CPU (no need for on-die L3 in that case).
So developers could have 128 MB of really fast shared cache (which could also be used for the framebuffer), and this would reduce the need for a big, fast memory pool. 2 GB of slower, cheaper GDDR5 would do the job.
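
To make the numbers explicit, here is a rough sketch in Python. The 70 mm^2 for 10 MB at 90nm comes from the figures above; the square-law area scaling with feature size and linear scaling with capacity are simplifying assumptions, so treat the output as ballpark only.

```python
# Rough eDRAM area scaling, mirroring the arithmetic above: ~70 mm^2 for
# 10 MB of eDRAM at 90nm, with area assumed to scale with the square of
# the feature size and linearly with capacity (idealised, of course).
BASE_CAPACITY_MB = 10
BASE_AREA_MM2 = 70.0   # ~70 mm^2 at 90nm (figure quoted above)

def edram_area_mm2(capacity_mb, node_nm=32, base_node_nm=90):
    """Estimate eDRAM area (mm^2) for a given capacity at a target node."""
    scale = (node_nm / base_node_nm) ** 2          # ~1/8 going 90nm -> 32nm
    area_per_mb = (BASE_AREA_MM2 / BASE_CAPACITY_MB) * scale
    return capacity_mb * area_per_mb

for mb in (32, 128):
    print(f"{mb:>3} MB of eDRAM at 32nm: ~{edram_area_mm2(mb):.0f} mm^2")
# -> roughly 28 mm^2 for 32 MB and 113 mm^2 for 128 MB, the same ballpark
#    as the ~20 mm^2 and ~80 mm^2 guesses above.
```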




a) One GPU and one CPU
  • Two separate memory pools. Pros: less expensive, more memory. Cons: loss of UMA advantages, lack of backward compatibility.
  • A single memory pool. No eDRAM.
  • A single memory pool + eDRAM.
b) Two GPUs and one CPU
  • One memory pool. Pros: it would solve the bandwidth problem, and using two 128-bit buses instead of a single 256-bit bus would lower the complexity of the motherboard layout (see the bandwidth sketch after this list). Cons: three dies, heat and power problems, less memory or higher cost.
c) A custom APU.
Probably the best in terms of efficiency, but not the fastest solution due to the constraints of die size and cost. Something like 4 GPU cores and 4 CPU cores with a big (32 MB) shared L3, and one memory pool.
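
As a quick sanity check on option (b), here is a small Python sketch. The 4.0 Gbit/s-per-pin rate is just an assumed figure in the range GDDR5 was shipping at around this time; the point is only that two 128-bit buses and one 256-bit bus give the same aggregate bandwidth, and the difference is in board routing.

```python
# Peak-bandwidth comparison: two 128-bit GDDR5 buses vs. one 256-bit bus.
# GBPS_PER_PIN is an assumed effective data rate, not a spec of any product.
GBPS_PER_PIN = 4.0   # assumed GDDR5 data rate in Gbit/s per pin

def bandwidth_gb_s(bus_width_bits, gbps_per_pin=GBPS_PER_PIN):
    """Peak memory bandwidth in GB/s for a bus of the given width."""
    return bus_width_bits * gbps_per_pin / 8

print(f"1 x 256-bit: {bandwidth_gb_s(256):.0f} GB/s")      # 128 GB/s
print(f"2 x 128-bit: {2 * bandwidth_gb_s(128):.0f} GB/s")  # 128 GB/s
```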



By the way, Llano's GPU is not powerful enough. AMD has stated it's a "gigaflops-class GPU".
 

Nice bit of work there MarkoIt. How about using the Llano GPU for GPGPU only and then having a discrete GPU in there too, like the equivalent of a 6770 or something of its ilk?
 
So 32 MB of L3 eDRAM takes about the same space as 10 MB of SRAM.
If 10 MB of eDRAM took about 70 mm^2 at 90nm, the same amount would take roughly a ninth of that space at 32nm. So for 32 MB we are talking about 20 mm^2.
Actually, it would be nice to have a daughter die of around 80 mm^2 with 128 MB of eDRAM, shared between the GPU and the CPU (no need for on-die L3 in that case).
So developers could have 128 MB of really fast shared cache (which could also be used for the framebuffer), and this would reduce the need for a big, fast memory pool. 2 GB of slower, cheaper GDDR5 would do the job.

I would definitely be thinking 'on-die' framebuffer more than anything else. If 20-25 mm^2 for 32MB is true, then it would seem like a no-brainer to implement for a console targeting reasonably high performance. You could fit a 1920/1080 framebuffer in that space and the cost savings relative to off-chip bandwidth should definitely be worth it.

I don't see it working as an external cache for the GPU and CPU, because it sounds like a nightmare trying to get data to a hungry GPU while trying to feed a CPU at the same time. It would also add latency to the CPU -> eDRAM -> GPU link unless they duplicate the traces, which means extra pads for signalling and extra board complexity.

I simply don't believe that the next generation will really have the TDP available for chips to get big enough that they need to be split off for yields/performance. It's pretty easy nowadays to hit 100W on a 200mm^2 chip, and that's the limit for a safe and practical high(er)-performance system, because once you add in PSU inefficiencies, RAM, HDDs, optical drives and fans, you're easily getting to about 140-150W draw overall with a single 100W central CPU. Then you can start to think of the possibilities of having the GPU -> CPU latencies so low.
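
For illustration, here is that power budget as a small Python sketch. Only the ~100W chip and the 140-150W overall figure come from the post above; the other component draws and the PSU efficiency are assumed, plausible-looking numbers, not measurements.

```python
# Illustrative system power budget: a single ~100W chip plus assumed
# figures for the rest of the box, with PSU efficiency applied to get
# the draw at the wall.
COMPONENTS_W = {
    "CPU/GPU chip": 100,   # figure from the post above
    "RAM": 10,             # assumed
    "HDD": 8,              # assumed
    "Optical drive": 5,    # assumed
    "Fans/misc": 5,        # assumed
}
PSU_EFFICIENCY = 0.85      # assumed

dc_load = sum(COMPONENTS_W.values())
wall_draw = dc_load / PSU_EFFICIENCY
print(f"DC load:   {dc_load} W")         # 128 W
print(f"Wall draw: {wall_draw:.0f} W")   # ~150 W, in line with 140-150W
```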
 
You could fit a 1920/1080 framebuffer in that space

1920*1080*32 is 36.480, so maybe that isn't enough, and programmers may also want to use that eDRAM to store something else temporarily.
Does compression apply to this? Is it transparent? And what about MRTs messing with the rendering pipeline?
 
Eh? You can fit three 1080p render targets plus the depth/stencil buffer (all 32bpp) in under 32MB.
 
There's not enough real-world experience with IBM's eDRAM technology in a market segment within a light year of the console space.

As nifty as it is in a 500+ mm2 chip that goes into systems costing hundreds of thousands of times what a console costs, with service contracts that cost even more than that, we haven't seen it proven to work well for the manufacturability of a product that must care about yield and measures its profit and loss per unit in what would be less than a rounding error for IBM.
 
Heh, indeed. That L3 is not an insignificant portion of the chip in the die shot. Even if they cut out half the Power7 cores there, the raw space taken up by the full eDRAM portion would still make for a pretty big chip.

1920*1080*4*(32+32) = 530 Mbit = 66 Megabyte
No one mentioned AA, I certainly didn't. It's simply just under 8MB per 1080p buffer. Simple.
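
For the record, here is the arithmetic behind those figures as a small Python snippet. It assumes 32bpp targets, and "MB" here means 2^20 bytes, whereas the 66-megabyte figure quoted above is in decimal megabytes.

```python
# Render-target size arithmetic for the 1080p figures discussed above.
# 32bpp = 4 bytes per pixel; sizes reported in MB = 2**20 bytes.
WIDTH, HEIGHT, BPP = 1920, 1080, 32

def target_mb(width=WIDTH, height=HEIGHT, bpp=BPP, samples=1):
    """Size of one render target in MB, optionally with MSAA samples."""
    return width * height * (bpp // 8) * samples / 2**20

print(f"Single 32bpp 1080p buffer:    {target_mb():.1f} MB")             # ~7.9 MB
print(f"3 colour RTs + depth/stencil: {4 * target_mb():.1f} MB")          # ~31.6 MB
print(f"4xMSAA colour + depth:        {2 * target_mb(samples=4):.1f} MB") # ~63.3 MB
```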
 
I don't want to attract the hardcore SEGA fans, but Hitachi and SEGA had a good relationship back in the post-Genesis/Mega Drive days, culminating in the SH-4, and who can forget the golden goodness of the dual-core SH-2 in the SEGA Saturn.

Hate to nitpick here, but it was two full CPUs, not a dual-core solution.
 
Depends on the terminology. Whether on-die or off-die, there were two SH-2s, even though they were not on the same die. Dual- and quad-core CPUs nowadays also come with essentially complete CPUs too.

I believe the story goes that SEGA added the second CPU after learning about the PlayStation's overall 3D power and knowing they would not be able to compete with a single SH-2.
 
I would like to know if the next box will be equal to an ATI 5970 or 2x that?

Also, what kind of graphics can I expect from a $400 console in 2012?
 
I would like to know if the next box will be equal to an ATI 5970 or 2x that?

Nobody can answer that question. First, because consoles are different from PCs. Second, because the next gen is 2 to 3 years away, and in that time many things can change.
 

Thanks. It's just that I have heard next-gen consoles launch with either tech that is a year old or new (modified/new) hardware.

I just want to say I joined this site because this thread seems to be the only one I have noticed on the net that actually gets into what hardware may be around for consoles, besides just naming possible ATI cards (Northern Islands etc.).

Does a console based on GPGPU need a CPU, or can it function with only a GPGPU plus a graphics card?
 

Wouldn't the GPGPU be the graphics card already?

What you're basically saying is that you would like to see a console that scares away each and every developer because it has no development history at all? Even the PS3 had a basic dual-thread PPC, so you could at least code something on it.
 
Does a console based on GPGPU need a CPU, or can it function with only a GPGPU plus a graphics card?
GPGPU is "graphics card", so you couldn't have GPGPU + GPU. Well, you could in theory, but that'd be very mental! Future GPUs won't be able to run all desired code, and will still need a CPU of sorts, kinda like how SPUs need a PPU or CPU (in the case of Toshiba's SPURSEngine). Depending on the flexibility of future GPUs (by which point the 'G' in the name will need to be dropped. Is the industry showing signs of any proper naming convention?), the performance wanted of the CPU could vary considerably.
 

OK. No insult intended, but I am learning here, so thank you. All I knew about GPUs, besides graphics cards, was something about parallel tasks vs. the CPU (from an NVIDIA presentation).

I thought a GPGPU was something different (vs. a GPU), because there was an article complaining about dev costs going from $20 million to maybe $60 million, and also that not many devs know how to code for it.

Does the CPU really make a difference (yes, I know the PS3's CPU can offload work from the GPU)? What is the deal with, for example, a quad-core at 4.7 GHz vs. say a 16-core at 3.0 GHz?

Is it better because multiple cores mean more tasks can be assigned?
 
Wouldn't the GPGPU be the graphics card already?

What you're basically saying is that you would like to see a console that scares away each and every developer because it has no development history at all? Even the PS3 had a basic dual-thread PPC, so you could at least code something on it.


Thanks.
 
Thanks. It's just that I have heard next-gen consoles launch with either tech that is a year old or new (modified/new) hardware.

I just want to say I joined this site because this thread seems to be the only one I have noticed on the net that actually gets into what hardware may be around for consoles, besides just naming possible ATI cards (Northern Islands etc.).

Does a console based on GPGPU need a CPU, or can it function with only a GPGPU plus a graphics card?

Usually the final specifications are locked at least 6-8 months before launch. If MS is planning to launch in November 2011 (which seems likely), the final specifications will be locked during spring 2011. By that time, ATI should have their new architecture out, so Xbox Next could have a tuned version of Northern Islands or an improved RV870. If they launch in November 2012, Xbox Next will for sure have a Northern Islands derivative.


Let's make one thing clear: next-gen consoles won't be powerful enough for realtime radiosity. But all the assets developed for next-gen games will be of CGI quality: high-resolution textures, models, shaders. And one thing next-gen must focus on is image quality: I'd rather have a slightly less complex engine, but with no aliasing at all and with awesome post-processing effects. So the two areas MS/Sony/etc. engineers have to wrap their minds around are:
  • Huge amount of bandwidth
  • High amount of memory

I think both eDRAM and fast streaming are the proper way to address them. It will be nice to see how MS and Sony use different technologies and ideas. :smile:
 