Predict: The Next Generation Console Tech

Status
Not open for further replies.
But what about Cell with DP? (Sorry, I still have Cell on my mind... I don't think Sony has spent billions on factories and development just to throw it in the trash...)
Good business sense says that if continued investment is bad for future business, you ignore all prior investment (it's a sunk cost) and change to the better solution. So yes, it is quite possible that after spending all that on Cell in the hope of it becoming an important product line, Sony now sees it as a dead end and will drop it.

As for texturing, that's perhaps the GPU task Cell is least well suited for. There's nothing Cell can do with textures that a modern GPU won't do far, far better.
 
Just some questions to put up for discussion... so we can then judge whether Cell is well accepted by developers, and whether, if it had the chance to evolve to a larger scale (higher clocks, 16 SPUs, 2-4 PPUs more efficient in single-threaded code, with more cache, lower latency, OoO execution), it would be the best option for Sony (performance/wattage/size)?

Could Sony, IBM, and AMD work together to provide an SoC combining Cell with a GPU (or an APU)?

Which would be more acceptable to developers: Cell or PowerPC A2*?

* At 45nm it's only 428mm² and 1.43 billion transistors... maybe Cell would have a better performance/transistor/size/wattage ratio at the same transistor count (if I'm not mistaken, the PS3 Cell's PPU + 7 SPUs is 228-235 million transistors and something like 110-120mm² at 45nm).

http://en.wikipedia.org/wiki/PowerPC_A2
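To make the footnote's back-of-envelope comparison concrete, here's a rough sketch using the thread's own ballpark figures (these are forum estimates, not verified specifications):

```python
# Rough transistor-density comparison between the PS3 Cell and PowerPC A2,
# using the thread's own ballpark figures (not verified specs).

a2_transistors_m = 1430       # PowerPC A2 at 45nm, millions of transistors
a2_area_mm2 = 428             # quoted die area in mm^2

cell_transistors_m = 232      # PPU + 7 SPUs, midpoint of the quoted 228-235M
cell_area_mm2 = 115           # midpoint of the quoted 110-120 mm^2 at 45nm

# How many PS3-class Cell blocks fit in the A2's transistor budget?
cells_per_a2_budget = a2_transistors_m / cell_transistors_m
print(f"~{cells_per_a2_budget:.1f} Cell-sized blocks per A2 transistor budget")

# Density (millions of transistors per mm^2) for each design
print(f"A2 density:   {a2_transistors_m / a2_area_mm2:.2f} Mtr/mm^2")
print(f"Cell density: {cell_transistors_m / cell_area_mm2:.2f} Mtr/mm^2")
```

By this crude measure, roughly six PS3-class Cells fit in the A2's transistor budget, though as the replies below note, die shots show fixed-function hardware taking a fair share of the A2 die, so density alone is misleading.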


Off topic: an interesting paper about Cell clusters / stream processing:
http://people.cs.vt.edu/butta/docs/ipdps09-cellmr.pdf
Well, I don't expect POWER EN/A2 to match Cell on compute density, but I believe your calculation is misguided: if you take a look at the die shot, the fixed-function hardware takes its fair share of die space. Just eyeballing it, I would say IBM could fit 24 cores (so two more A2 modules) within the overall POWER A2 die size. At the same time, it's unclear whether the POWER A2 includes SIMD units; the cores have an FPU that can handle 32-bit and 64-bit FP calculations, but my bet is that it doesn't include SIMD.
Overall I don't believe Cell vs. POWER A2 is a really good comparison.

At the same time, they are both throughput-oriented architectures intended to deal with quite a lot of data; the difference lies, I guess, in the amount and type of calculations. I don't know what kind of IPC Cell SPUs achieve on average, but for the POWER EN/A2 they are aiming at a sustained 0.8 instructions per cycle (out of a max of 2). I'm not sure I remember correctly, but in some presentation (from Gamefest, I believe) they stated that on Xenon one could expect 0.2 sustained instructions per cycle. It's possible that comparing the two numbers turns into an apples-to-oranges comparison, but it could be that POWER A2 would be a meaty improvement over the PS3/360 PowerPC cores (Xenon and the PPU; I leave SPUs out of the picture), and it may not need much (a 4-wide SIMD) to be good enough for the code you would usually run on a CPU. To compete with a hypothetical Cell successor it would need much more muscle (more or wider SIMD units); it's clear compute density would suffer, and more importantly power consumption would not be in the same ballpark.
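Taking the quoted sustained-IPC figures at face value (they are recollections from a presentation, not benchmarks), a quick per-core throughput comparison looks like this; the 3.2 GHz clock for A2 is my assumption to keep the comparison like-for-like:

```python
# Per-core sustained instruction throughput from the figures quoted above.
# All numbers are rough estimates cited in the discussion, not measurements.

clock_ghz = 3.2               # PS3/360-era clock; assumed for A2 here too

xenon_sustained_ipc = 0.2     # quoted sustained IPC on Xenon game code
a2_sustained_ipc = 0.8        # IBM's stated POWER EN/A2 target (max 2)

xenon_gips = xenon_sustained_ipc * clock_ghz   # billions of instrs/sec per core
a2_gips = a2_sustained_ipc * clock_ghz

print(f"Xenon core: ~{xenon_gips:.2f} GIPS sustained")
print(f"A2 core:    ~{a2_gips:.2f} GIPS sustained ({a2_gips / xenon_gips:.0f}x)")
```

If both numbers were measured the same way (a big if), that would be a 4x sustained-throughput advantage per core at equal clocks.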

As for Cell, the fact that it would imply three different types of resources to code for is a bit bothersome. By the time next-generation systems are released there should be mature languages that address both the CPU and the GPU, and I don't expect anyone to go through the expense of extending one of those languages to the SPUs.
I was thinking yesterday, after reading the comments in this thread and the RWT Intel Gen6 GPU analysis, about how (or whether) SPUs could replace shader cores altogether. The link formed in my mind because of how the texture units (and other fixed-function units) in Gen6 are completely "decoupled" from the shader cores.
At the same time, I remember one of nAo's ideas about how to improve the SPUs (within a supposedly limited silicon and power budget). It was about adding some light multi-threading with an instruction cache, and making the SPUs wider. I believe that, along with the same mechanism Gen6 uses to communicate with fixed-function hardware, it could work decently.
Either way, I thought a bit more and realized it would be better to design a GPGPU from scratch... Especially as the hardware in GPUs/GPGPUs removes a lot of pressure on software to make things "work".
It also got me wondering what devs would do with Gen6 if they were to use its resources "directly", as they do with SPUs, and how they would perceive the relative advantages of the two architectures (so, versus SPUs).

So Cell is stuck as it is; it sounds tough to evolve it in any significant way other than growth, without profound changes and most likely breaking BC. I still believe it would make sense for Sony to include at least 6 SPUs in their next system: they are cheap, powerful, cool, a known quantity by now, and they would make for easy BC.
 

Very interesting thoughts you bring us here, especially the possibilities of Intel's post-Larrabee universe* (CPU + GPU synergy... now present in Gen6, Sandy Bridge, and Ivy Bridge ahead).

Sorry for the off topic... some time ago I read FFT benchmarks where Cell reaches 95-98% efficiency, but Xenon only 0.2 instructions per cycle!
So what some devs said, that Xenon would not even have the power of a single SPU in single-precision FP, may be true?

About fixed function... in a CPU? Because I imagine it mostly present in the GPU.


I still have the impression that the A2, as an heir to the PPU/Xenon, could be customized with SIMD units, more cache, lower latencies, etc., and because of its flexibility would be (though not all devs agree) one of the best options available for a closed-box console. Perhaps the Wii U is using it now.

Another option would be for Sony to build a Cell with 4 PPUs (A2 with SIMD, more cache, etc.?) and 6 to 8 SPUs, so developers can use them for more physics, more interactivity, or AI if they want; it's an extra, besides serving for BC.

* In the picture the GPU is only about 20% of the die area!

http://www.anandtech.com/show/4083/...core-i7-2600k-i5-2500k-core-i3-2100-tested/10
 
Sorry for the off topic... some time ago I read FFT benchmarks where Cell reaches 95-98% efficiency, but Xenon only 0.2 instructions per cycle!
So what some devs said, that Xenon would not even have the power of a single SPU in single-precision FP, may be true?
Well, I guess you can't compare 0.2 IPC with 95-98% "efficiency".
I wonder what they mean by efficiency: 95%+ of the peak FLOPS figure? Or, as the SPU is dual-issue, 1.8+ IPC? I guess it's the former.
I fully expect Xenon not to get anywhere close to its peak FLOPS, that's for sure, but at the same time the 0.2 IPC includes spaghetti code that, on Cell, would run on the PPU.
Overall, as I said, I'm not even sure I remember the figures correctly or that my comparison was valid; clearly SPUs are beasts of their own kind.
About fixed function .. you could die on me on these exclarecer a cpu,? Because I imagine most present in the gpu.
Sorry, I guess there's a typo or something, but I can't figure out what you're trying to say :(

I still have the impression that the A2, as an heir to the PPU/Xenon, could be customized with SIMD units, more cache, lower latencies, etc., and because of its flexibility would be (though not all devs agree) one of the best options available for a closed-box console. Perhaps the Wii U is using it now.
Honestly I don't know; going by Sebbbi's opinion, for example, he would favor more cores and threads (so TLP) over fewer cores exploiting ILP and thus achieving higher IPC.
In regard to SIMD units, they can surely be added to the design, but I would not go with that much horsepower. I've read many times that in games (AI, physics, etc.) the amount of SIMD code is not that high, so it could be a good idea to share 1 or 2 good SIMD units per module.
I find a few power-efficient OoO cores more Nintendo-like.
Another option would be for Sony to build a Cell with 4 improved PPUs (A2 with SIMD, more cache, etc.?) and 6 to 8 SPUs, so developers can use them for more physics, more interactivity, or AI if they want; it's an extra, besides serving for BC.
IMHO they would be better off with four high-IPC cores + 6-8 SPUs for extra juice and/or BC.
In transistor count, Gen6 has to be in the same ballpark as a whole Cell or 8 SPUs, and its peak FLOPS is lower; I would indeed really like to see how it would compare to SPUs if devs were to code for it at a lower level of abstraction.
 

Sorry liolio, many, many typos... I was late for my work and used a translator, with very bad results (thanks for trying to understand me)... forgive me. :oops:


I tried the following hypothesis with "spaghetti code": Xenon = 24 flops/cycle (3 cores × 8 flops each via SIMD/VMX128) × 0.2 IPC × 3.2 GHz ≈ 15.36 GFLOPS.

Cell at 95-98% ≈ 210-213 GFLOPS total, or ≈ 25 GFLOPS per SPU... I remember nAo talking about Cell running circles around the Xenon CPU (Joker said the same thing some time ago, if I'm not mistaken)... but all of this is off topic, and sorry all for the GFLOP talk.
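The two estimates above can be reproduced in a few lines; the figures are the thread's own back-of-envelope numbers (0.2 sustained IPC, 95-98% FFT efficiency), not measured results:

```python
# Reproducing the back-of-envelope GFLOPS estimates from the posts above.
# These are illustrative forum figures, not benchmarks.

clock_ghz = 3.2

# Xenon: 3 cores * 8 flops/cycle (VMX128), derated by the quoted 0.2 sustained IPC
xenon_peak_flops_per_cycle = 3 * 8
xenon_sustained_gflops = xenon_peak_flops_per_cycle * 0.2 * clock_ghz
print(f"Xenon sustained: ~{xenon_sustained_gflops:.2f} GFLOPS")  # ~15.36

# Cell SPU: 4-wide SIMD with fused multiply-add = 8 flops/cycle,
# scaled by the lower bound of the quoted 95-98% FFT efficiency
spu_peak_gflops = 8 * clock_ghz                  # 25.6 GFLOPS peak per SPU
spu_sustained_gflops = spu_peak_gflops * 0.95
print(f"Per-SPU sustained: ~{spu_sustained_gflops:.1f} GFLOPS")  # ~24.3
```

On these (very rough) numbers, a single SPU on well-suited code would indeed outrun all three Xenon cores on typical game code, which is what the "one SPU beats Xenon in single-precision FP" claim amounts to.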

I agree with you, believe me, but I was surprised by fixed functions on a CPU (it's not common to me), because we usually see talk of fixed functions only in the GPU universe.

About A2, IPC, etc... maybe Sebbbi, as a master, could speak better than anyone here, but maybe he'd like the idea of an evolved, customized PPU with many of the enhancements we see in POWER7 (much higher IPC, an L3 cache!).

My idea was 4 high-IPC cores (with many of the enhancements found in POWER6 or 7?) to make life easier for developers, plus 6-8 SPUs, leaving in developers' hands the option of using the SPUs or not (and they would certainly serve for BC)... I think we totally agree here.

What you said makes me think... what's best for game developers: more FPU (more flexible? better for AI, physics, etc.?) or more SIMD (graphics, even sound?) on the CPU?
 



It may not have to throw it away... Shifty, it's possible there may be a marriage of the two paradigms: the use of more robust PPUs (4?) alongside SPUs as found in current Cell processors. I think that would be the most rational and logical decision in my opinion... unless AMD offers something very substantial in terms of an APU... or Sony is thinking of synergy with the mobile/cell-phone universe, using ARM cores (as PPU + SPUs?) and PowerVR Series6/Rogue.

And indeed, fully agreed: Cell can perform tasks like texturing, but it doesn't compare to a GPU whose execution units/TMUs do it almost for free. But maybe at some point in a 5-6 year development cycle it's possible to invent or evolve some kind of processing that wasn't originally envisaged; who would have imagined in 2005 that the Cell processor (driven, of course, by the extreme weakness of the RSX GPU) would be doing what we see in Frostbite 2 (shaders, texturing, etc.), deferred rendering/shading, and MLAA?

I'm not a developer, and I certainly have an infinitesimal fraction of their knowledge, but I think it's always good to have on hand a reasonable floating-point processing capacity on the CPU; its flexibility may eventually help the GPU of a next-gen console, over its long 5-6 year cycle, in the fight against the early obsolescence that PCs will impose.
 
It may not have to throw it away ...
I agree, and I can see one or more designs where Cell would be effective (although I don't know if it'd be cost effective). However, the reason to choose one design over another should never be "well, we've already spent so much on this that we may as well carry on using it". At any given point in time, the management should look at what they want to do, what options they have, and which will give the best return on investment. If, having spent $5 billion on Cell, they have to spend another $1 billion to make PS4, or go with another tech for $1 billion, and the other tech will result in more profits, it's better to drop Cell. Even if changing tech midway through design would cost more than carrying on with the current design, if the profits outweigh the costs it's the right choice. Basically, the total investment is going to include whatever you've already spent whether you continue with that tech or change to another, so you can't mitigate it. You just have to focus on maximising profits. The only case for going with a tech based on previous investment is if the end result is good enough to carry you forwards.
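The cancellation argument can be made explicit with a toy calculation; all the dollar figures here are the hypothetical ones from the post (and my made-up profit numbers), purely for illustration:

```python
# Toy sunk-cost comparison using the post's hypothetical figures.
# The $5B already spent on Cell appears in BOTH totals, so it cancels
# out of the decision; only future cost vs. future profit matters.

sunk_cell_rd = 5.0            # $B already spent on Cell (hypothetical)

cell_future_cost = 1.0        # $B to build PS4 around Cell (hypothetical)
cell_profit = 3.0             # $B expected profit with Cell (made up)

other_future_cost = 1.0       # $B for the alternative tech (hypothetical)
other_profit = 4.0            # $B expected profit with it (made up)

# Totals including the sunk cost differ by exactly the same amount as
# the forward-looking nets, so the sunk cost never changes the answer.
cell_net = cell_profit - cell_future_cost
other_net = other_profit - other_future_cost
print("Drop Cell" if other_net > cell_net else "Keep Cell")
```

Whatever value `sunk_cell_rd` takes, it shifts both options equally, which is the whole point of the argument.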

If Cell is good enough for PS4, then Sony can go with it, but whether Cell makes it into PS4 or not should be independent of how much Sony has already invested in the tech, and should be entirely about what is best for the platform (cost to make, ease of development, etc.) in terms of generating profits.
 
Well Sony tools are all geared towards the Cell, no?

Though maybe with the Vita, they could leverage tools which will target ARM and SGX.

Can the tools used for iOS and Android development, especially cross-platform tools, be leveraged for Vita and then maybe a PS4 which uses some super array of ARM and SGX?
 
Well Sony tools are all geared towards the Cell, no?
That's an argument in its favour. However, if the issues the devs face result in poor use of the CPU, and the CPU is costly to manufacture, it may well be better value to rewrite the tools for a new architecture.

Can the tools used for iOS and Android development, especially cross-platform tools, be leveraged for Vita and then maybe a PS4 which uses some super array of ARM and SGX?
There are tools and then there are tools. Compilers are generic and can be reused for the same chips. Things like GUI tools won't work. Well, Sony could design the frontend to use the same data structures as Android, say, so an interface could be ported. Or they could go Android. That's another good case against Cell: use a more common, standard hardware architecture and a cross-platform OS, and thus abandon those billions of dollars of Cell investment.
 
Sorry liolio, many, many typos... I was late for my work and used a translator, with very bad results (thanks for trying to understand me)... forgive me. :oops:
Don't excuse yourself; I'm in no position to criticize your English. Shit happens :LOL:
I tried the following hypothesis with "spaghetti code": Xenon = 24 flops/cycle (3 cores × 8 flops each via SIMD/VMX128) × 0.2 IPC × 3.2 GHz ≈ 15.36 GFLOPS.
I'm not sure that on average there is that much FP code; I would rather count MIPS.
I agree with you, believe me, but I was surprised by fixed functions on a CPU (it's not common to me), because we usually see talk of fixed functions only in the GPU universe.
To me, SPUs are not really CPUs; what I said is that they could (perhaps) be used as shader cores. Read the last RWT review of Intel Gen6, and especially how the shader cores communicate with the fixed-function hardware. I might be wrong (I'm not an engineer), but I thought that if Cell were to evolve significantly this could be a good direction; thinking about it more, it might not be worth it, and IBM had better start designing a GPGPU.
About A2, IPC, etc... maybe Sebbbi, as a master, could speak better than anyone here, but maybe he'd like the idea of an evolved, customized PPU with many of the enhancements we see in POWER7 (much higher IPC, an L3 cache!).
I don't want to speak for him, but the idea behind Xenon and POWER A2 is to maximize the number of execution units/resources for cheap and aim at pretty high utilization of those EUs. I believe Sebbbi's point some pages ago was that in the long run he favors more threads, more resources, etc., even if it takes more effort to make the most of them.
My idea was 4 high-IPC cores (with many of the enhancements found in POWER6 or 7?) to make life easier for developers, plus 6-8 SPUs, leaving in developers' hands the option of using the SPUs or not (and they would certainly serve for BC)... I think we totally agree here.
That sounds like a nice option for Sony; 6-8 SPUs should not add significantly to the CPU size/cost.
What you said makes me think... what's best for game developers: more FPU (more flexible? better for AI, physics, etc.?) or more SIMD (graphics, even sound?) on the CPU?
They are the best ones to ask about what they want ;)
Sweeney did, and even though he was calling for something like Larrabee, he could get what he wanted in another form (a more standard language to code in, common to all platforms).
Repi spoke about it in one DICE presentation.
Crytek gave us a clear idea of what they would want too.
I'm not sure about Crytek, but the first two would favor a single chip with fast communication between resources.
 
There are tools and then there are tools. Compilers are generic and can be reused for the same chips. Things like GUI tools won't work. Well, Sony could design the frontend to use the same data structures as Android, say, so an interface could be ported. Or they could go Android. That's another good case against Cell: use a more common, standard hardware architecture and a cross-platform OS, and thus abandon those billions of dollars of Cell investment.

Right, but what if the alternative is to invest billions more in R&D to make Cell competitive for the future, versus tapping into the R&D done by a very competitive industry (mobile SoCs)?

OTOH, if they tap into a standard which eventually becomes commoditized, then how do they differentiate, or more importantly justify charging $300 and up for a dedicated console with the same silicon as smartphones and tablets? Or justify charging $60 for games?

The console business model as we've known it for the past two decades is by no means assured over the next 5-10 years. That may have to be taken into consideration as the design process advances.
 
Right but if the alternative is to invest billions more in R&D...etc...
Yeah, I wasn't saying that Cell is a dead end to be dropped. There are possibilities. My only point was that the idea "Sony must use Cell because they've already invested billions in it" is poor management, and prior investment alone should not be the reason to go a specific route. No doubt that investment will have moved your tech in a useful direction and maybe made it competitive, but the fact that a company has spent billions on an idea doesn't mean it has to carry on down that path, as Heinrich4 was suggesting.
 
Interesting: the next Xbox could come sooner than expected (fall 2012, or H1 2013).
It should be a SoC, though AMD's terminology fits better, either Fusion or APU; I don't expect MS (or any other manufacturer) to do a real SoC, i.e. include everything on top of the CPU and GPU.
I wonder about EDRAM; IBM's 32nm process could be ready, but it looks a bit on the expensive side. I would bet on an external chip, especially as MS might make 1080p mandatory (at first, at least). So, taking into account that MS should include a potent CPU and a potent GPU, that doesn't leave much room for the EDRAM.
Nice to hear (if true, obviously) that IBM has control of the chip design; a new RRoD incident would be... bothersome. I wonder if they will include some network/security accelerators on the chip, as in the POWER A2/EN. I suspect MS will go heavy on security measures, both locally and while accessing Live.

In regard to the GPU, taking the time frame into account, I think we can expect at least a GCN derivative.
 
It'll be months from production to shipping.
Indeed. I re-read the news; if the moles are right, the first chips should be ready in Q1 2012, and one year to launch sounds right (assuming a re-spin, system testing, and building a decent inventory).
By the way, I'm starting to wonder about Nintendo's next chip; we'd better hear something about it being taped out soon, as I start to wonder when they will launch.
 
Eight years after the 360, it had better have >100MB of EDRAM, or else I wouldn't care.

PS: It's a console, so I wouldn't care anyway, except for the tech.

PPS: I am kinda sceptical about this.
 
This "Obed", if it really exists, could be based on the Trinity APU* if MS wants to return to the x86 architecture, but if it has something like EDRAM, maybe it has PPC cores with a large shared L2 or even an L3, who knows.

The mid-2013 date seems plausible, because I don't believe MS wants to leave Nintendo alone in the next-gen console market for too long.

* [Image: AMD_APU_Mobile_roadmap.jpg]
 