Predict: The Next Generation Console Tech

Wii... have just begun!

Ok, I may not have an accurate estimate of specs for the MS/Sony/Nintendo/Phantom platforms, but it seems like many members are ignoring Nintendo when it comes to predicting next-gen console hardware. Sony's PS4 is definitely seen as a next-gen Cell with whatever GPU and memory, and MS's Xbox3 is seen as a direct derivative of today's Xbox 360 tech. Nintendo's Wii2 is usually seen as a cheap derivative of the Wii, still not as fast as the PS3 or Xbox 360.

Now, after the Wii's success, big N has money pouring in through the doors and windows, so could we see Nintendo making a console that competes with a Cell2-powered PS4 from a performance point of view? Laying down big bucks for IBM/Intel/Nvidia and bringing us the TRUE 4D Mario! :) What do you think: Nintendo being king of the hill again from a CPU/GPU POV, just like in the N64 days?

Ok, let's predict some:

Sony: Play, year 2011
- 8GB UMA XDR2: 512GB/s
- Mid-end Nvidia GPU
- Cell2 4GHz, 20SPE w/ 2MB local store each, 500 DP-GFLOP/s
- Blu-Ray 12x
- 250GB SSD

Nintendo: Wii too, year 2011
- 4GB DDR7: 500GB/s
- 128MB eDRAM: 4TB/s
- Lower-high-end Nvidia GPU
- IBM RPMC* POWER7 @ 7GHz: DP-1TFLOP/s
- Blu-ray 12x

Ok, I have given it some thought and here is my Xbox-cos(pi*n) prediction (yeah, you heard that name from me first! :) ). Microsoft will be unsatisfied because this was their second generation and there is still no world domination. They go the Wii route (cheaper is better) but still keep the hardware strong enough for 1080p games/video and other media-center stuff.

Microsoft: Xbox-cos(pi*n), year 2010
- 2GB DDR4: 75GB/s
- 96MB eDRAM: 1TB/s
- lower-mid-end GPU
- IBM octa-core POWER6 derivative @ 4GHz: 100 DP-GFLOP/s, cheap and lots of flops per buck (rough math sketched below)
- Blu-ray 6x
- 250GB SSD
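
Just to sanity-check the peak-FLOP guesses above, the usual back-of-envelope formula is peak = cores x clock x FLOPs per core per cycle. A minimal sketch, assuming made-up FLOPs-per-cycle figures and core counts (none of these are real specs of any announced chip):

Code:
// Back-of-envelope peak DP-FLOPS for the predicted CPUs above.
// Core counts and FLOPs-per-core-per-cycle are assumptions for illustration.
#include <cstdio>

static double peak_gflops(int cores, double ghz, int flops_per_cycle) {
    return cores * ghz * flops_per_cycle;   // GHz x FLOPs/cycle -> GFLOP/s
}

int main() {
    // "Cell2": 20 SPEs @ 4 GHz, assuming ~6 DP FLOPs/cycle per SPE
    std::printf("Cell2    : %6.0f DP-GFLOP/s\n", peak_gflops(20, 4.0, 6));
    // "RPMC POWER7": assuming 16 cores @ 7 GHz, ~9 DP FLOPs/cycle each
    std::printf("RPMC     : %6.0f DP-GFLOP/s\n", peak_gflops(16, 7.0, 9));
    // Octa-core POWER6 derivative @ 4 GHz, assuming ~3 DP FLOPs/cycle each
    std::printf("Octa-core: %6.0f DP-GFLOP/s\n", peak_gflops( 8, 4.0, 3));
}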

Power word Kil... no, MERGE! ;)


* RPMC = Ridiculously Parallel Multi-Core
 
Just a note, Jugix, you won't have editing capability yet, but I'll merge your update of your Xbox prediction when you post it. :)
 
Cell development has existed for a couple of years now. Add another couple of years' development before GPUs are really offering effective all-purpose processing, and the reality is Cell will probably be in a much stronger development position than GPUs.
GPUs aren't going to offer a seamless, easy development system. Every new multicore architecture that's doing parallel processing is going to face the same issues. A Cell-based PS4 will offer in effect the same programming difficulties as Wii did this gen...none ;) Unless there's a radical shift, which is unlikely, code can be copied over exactly from PS3 to PS4. This maintains BC and ease of development, while more cores etc. provide an excellent scaling of the already developed techniques (assuming developers have got to grips with scheduling systems and work distribution models). Compare that to designing your engines from scratch for whole new hardware, and the advantage is obvious.
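
To make that last point concrete, here is a minimal sketch of the kind of job-queue model those scheduling systems boil down to: game code only ever submits jobs, and the worker count (6 SPEs today, 20 tomorrow) is just a constructor parameter. The names and structure are hypothetical, plain C++ threads rather than any actual PS3 SDK API.

Code:
// Hypothetical sketch: game code targets a job queue, not a core count.
// The same submission code runs unchanged whether 6 or 20 workers exist.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class JobQueue {
public:
    explicit JobQueue(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            pool_.emplace_back([this] { run(); });
    }
    ~JobQueue() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : pool_) t.join();       // drain remaining jobs, then stop
    }
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // e.g. a skinning, culling or particle batch
        }
    }
    std::vector<std::thread> pool_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};

int main() {
    JobQueue queue(6);   // "PS3": 6 workers; bump to 20 for a hypothetical PS4,
                         // the submission code below never changes.
    for (int i = 0; i < 100; ++i)
        queue.submit([i] { /* process batch i of the frame's work */ (void)i; });
}

The submission side never changes, which is exactly why scaling an already-written engine to more cores is a far smaller job than porting to a whole new architecture.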

I think this point is really debatable ;)
GPGPU is not really new either, even if GPU capabilities are only just starting to shine.
It's true that GPUs still have some way to go before offering the same flexibility as Cell, but on really heavily parallel workloads they are already matching what Cell can offer.

So far Cell adoption is nothing close to impressive, and by the time the next systems are available I wouldn't bet a lot on Cell blades being more widespread than GPU clusters.

Speaking of the development environment, it's the same thing here.
Nothing proves that what works on a 6-SPE Cell will scale nicely to a 20-SPE design without major rework. The jump from 6 cores to twenty will be pretty coarse.
Middleware providers will need to do some significant rework.
Anyway, for the sake of discussion, let's assume this will go smoothly.

How does this look on the GPU side?
In my opinion, better.
First, while the presence of a Cell2 in the next Sony system is debatable, the presence of a proper GPU isn't.
All three next-gen systems will have a GPU, and so do PCs.
GPU vendors have an edge there: they can leverage their software efforts over a bigger volume. The same can be said for middleware vendors, who are more likely to put effort into something standard (as the PS4 will have a GPU).
Moreover, they get the chance to run their software on evolving hardware.
Their software is already working on "many core" systems (say the 8800 is made of 16 cores), and they will have the opportunity to test on GPUs with more cores.
They are even gaining experience on the SLI/CrossFire front, which allows them to push their experiments further.
The tools (Havok/PhysX and the languages) will be there from day one on any system that includes a GPU.
They will be available from the start and working; no advantage to Sony here if they stick with Cell.

On a side note, all these efforts will come for FREE to anyone putting a GPU in their system.
Intel and Nvidia will push Havok and PhysX, and they will want them to be effective.

Middleware vendors will also have the opportunity to get their hands on the hardware, as GPUs are "standard", and they will have kept pace with their evolution; the software will have been tested along the way, with no coarse jump from x cores to 4x cores.

So there is no advantage to sticking with Cell from a software perspective; I would dare to say there will be more tools available (and working properly) for GPUs than for the next Cell iteration.

And I haven't even brought up the cost of R&D compared to just purchasing a mostly standard GPU.

The only con I can see (a pretty big one) is if GPUs fail to deliver on their promises, but a lot of members here tend to think that DirectX 11 GPUs are likely to be able to deal with most of the workloads the SPUs would have to deal with: AI, animation, collisions, etc.
Nevertheless, AI seems the most "iffy".

But if GPUs are up to the task, I don't think a lot of people would be happy to have to deal with three different kinds of cores.
 
I thought Cell was quite a bit slower than R580 at folding? And a lot more so in comparison to the R6xx generation?
I know that for heavily parallel workloads GPUs are better.
But GPUs could have to deal with less heavily parallel workloads, and Cell would likely come out on top there (my guess, nothing that technical ;) anyway, if some members want to chime in they are welcome).
Overall, for the sake of discussion, as stated, they are more or less as efficient.
(Say the Cell and the GPU work on different kinds of workload simultaneously; maybe the Cell will be done with AI quickly, but the GPU might do the same with particles.)

Even if they reach the same perf per mm² for their intended workloads, I think Sony would in fact have no advantage in sticking with Cell.

And I can bring another point into the discussion: I'm sure Intel, Nvidia and even ATI are likely to put more effort ($ in R&D) into their GPUs than IBM/Sony into Cell, IMO. (Debatable too, but that's the point of a forum, no?)

EDIT
About folding, I'm not sure that GPUs are better in fact, if we consider chips of equal die size (and on the same process, obviously). Anyway, insight welcome.
 
My half a cent:

First off, I believe the approach Sony took with the PS3 will pay off in the long run. The console had a rough start, but it's doing fairly well at the moment. Once some of the heavy hitters for the system are out, and once it's had one or more price cuts under its belt, forgeddaboutit. It'll do well. With that assumption in place, I don't believe Sony will be so turned off by the PS3's big up-front investment as to turn 180 degrees and attempt to pull a Wii for their next console.

The two most exotic/expensive components in the system were the BD drive and the Cell, correct? By the time PS4 comes around, BD drives will be dirt cheap, and Cells will hopefully also be more entrenched and have benefitted from economies of scale (assuming they actually start being used in TVs and other devices). I think this would afford Sony the ability to load up the PS4 with nice goodies and still have it retail for less than $450.
A setup like this maybe; just pure uneducated speculation and wishful thinking :smile::

multiple Cells (would leverage the existing experience of PS3 devs)
custom nvidia part
4GB system ram
1-2GB vram
???MB edram
BD drive
standard or laptop sata hd
10Gb lan
external sata port too perhaps?

Cheers.

I pretty much agree.

PS4:
1 CPU chip/die that has the equivalent of at least 4 CELLs (4 beefier PPEs + 32 advanced SPEs minimum)
4GB of Rambus next-gen XDR system RAM
500 GB/sec bandwidth minimum
Custom Nvidia GPU, something that is not as outdated as RSX was when PS3 launched; uses eDRAM this time
1-2 GB dedicated external graphics memory
much faster BD drive
standard 200 GB HD
USB 3.0 and all the standard ports one would expect of the time
 
How about particles? Collisions? Fluid simulations, cloth simulations?
Those are heavily parallel workloads.
And it looks like a lot more workloads could find their place on the GPU, the same way they found their place on the SPUs.

But if you think that GPGPU is (or will be) useless, there are better places to discuss the issue ;) as quite a few members will disagree.

In fact the GPU cores/multiprocessors could act as SPUs and offload a lot of calculation from the CPU.
Ok, these cores may be more specialized (they can run graphics well) than SPUs, but that's nothing shocking as we're speaking of a system supporting video games, and that's effectively more relevant than folding or HD decoding...
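
For what it's worth, the shape of those workloads is a plain data-parallel loop where every element is independent, which is exactly what maps equally well onto SPUs or onto GPU shader cores. A minimal sketch in plain C++ (hypothetical, not any particular SPU intrinsic or shader API):

Code:
// Hypothetical per-particle update: every iteration is independent,
// so the loop can be sliced across SPUs or across GPU threads unchanged.
#include <cstddef>
#include <vector>

struct Particle { float x, y, z, vx, vy, vz; };

void integrate(std::vector<Particle>& ps, float dt, float gravity) {
    for (std::size_t i = 0; i < ps.size(); ++i) {  // one slice per SPU / one GPU thread per element
        Particle& p = ps[i];
        p.vy -= gravity * dt;   // no data is shared between particles,
        p.x  += p.vx * dt;      // so there is nothing to synchronise:
        p.y  += p.vy * dt;      // ideal for either kind of vector hardware
        p.z  += p.vz * dt;
    }
}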
 
I thought Cell was quite a bit slower than R580 at folding? And a lot more so in comparison to the R6xx generation?
Folding@Home is not a good indicator since it's not a straight benchmark, not to mention it's hard to connect it to realtime game performance.

http://folding.stanford.edu/English/FAQ-PS3
What type of calculations the PS3 client is capable of running?

The PS3 right now runs what are called implicit solvation calculations, including some simple ones (sigmodal dependent dielectric) and some more sophisticated ones (AGBNP, a type of Generalized Born method from Prof. Ron Levy's group at Rutgers). In this respect, the PS3 client is much like our GPU client. However, the PS3 client is more flexible, in that it can also run explicit solvent calculations as well, although not at the same speed increase relative to PC's. We are working to increase the speed of explicit solvent on the PS3 and would then run these calculations on the PS3 as well. In a nutshell, the PS3 takes the middle ground between GPU's (extreme speed, but at limited types of WU's) and CPU's (less speed, but more flexibility in types of WU's).
How should the FLOPS per client be interpreted?

We stress that one should not divide "current TFLOPS" by "active clients" to estimate the performance of that hardware running without interruption. Note that if donors suspend the FAH client (e.g. to play a game, watch a movie, etc) they enlarge the time between getting the WU and delivering the result. This in turn reduces the FLOPS value, as more time was needed to deliver the result.

It seems that the PS3 is more than 10X as powerful as an average PC. Why doesn't it get 10X PPD as well?

We balance the points based on both speed and the flexibility of the client. The GPU client is still the fastest, but it is the least flexible and can only run a very, very limited set of WUs. Thus, its points are not linearly proportional to the speed increase. The PS3 takes the middle ground between GPUs (extreme speed, but at limited types of WU's) and CPU's (less speed, but more flexibility in types of WUs). We have picked the PS3 as the natural benchmark machine for PS3 calculations and set its points per day to 900 to reflect this middle ground between speed (faster than CPU, but slower than GPU) and flexibility (more flexible than GPU, less than CPU).
 
Folding@Home is not a good indicator since it's not a straight benchmark, not to mention it's hard to connect it to realtime game performance.

http://folding.stanford.edu/English/FAQ-PS3

Interesting reading One ;)

It's clear that GPUs still don't offer as many possibilities as Cell (they may never offer them, in fact), but they are likely to become good enough for a lot of stuff and really excellent at some other things.

It's about balancing the workloads between the multiprocessors/SIMD Array.
 
The new GPU2 client looks like it's going to narrow the gap between Cell and GPUs in terms of what work units can be performed. It's also going to increase in raw speed (significantly) on the R6xx series GPUs, so that would definitely make for an interesting comparison point.

EDIT: In fact the gap has been more than closed; the GPU2 client now supports a broader range of work units than Cell! Although it sounds like Cell may also support some or all of these in a future release.

http://www.extremetech.com/article2/0,1697,2284065,00.asp
 
How about particles? Collisions? Fluid simulations, cloth simulations?
Those are heavily parallel workloads.
And it looks like a lot more workloads could find their place on the GPU, the same way they found their place on the SPUs.
Sure they'll run.. But you can definitely churn through a lot more per frame using the Cell than most GPUs & in real-world scenarios it makes sense to use Cell for what it's good at & not waste GPU resources leaving your beefy floating point monster CPU sitting idle..

But if you think that GPGPU is (or will be) useless, there are better places to discuss the issue ;) as quite a few members will disagree.
I never ever "ever" mentioned gpgpu at all.. Where in the world did you get the idea that I said anything of the sort..?

In fact the GPU cores/multiprocessors could act as SPUs and offload a lot of calculation from the CPU.
But you lose a lot of flexibility running on the GPU, many tasks just wouldn't run efficiently enough to get comparable performance & with things like collision, if you already have a Cell sitting alongside your GPU then why in the world wouldn't you use it instead..?

Ok, these cores may be more specialized (they can run graphics well) than SPUs, but that's nothing shocking as we're speaking of a system supporting video games, and that's effectively more relevant than folding or HD decoding...
Sure..
What I don't get is this strange mentality you seem to have that for some reason developers would want to offload general game processing tasks to the GPU when you have a big fat multi-core CPU that can process the same data a lot more efficiently..?
In practice a console game is A LOT more likely to be GPU-limited than CPU-limited, so it would make sense going forward for the focus to rest on improving the CPU's capacity to help the GPU with graphics-related processing tasks (calculating radiosity for example) as well as doing everything it already does (or just making the GPU & CPU both faster & more flexible), a lot faster & more abundantly than trying to do the opposite..

I don't envisage technology like GPGPU (as wonderful as it may be) making much of an impact in the console space as console GPUs aren't at all in danger of being starved of work in the vast majority of cases, so there's little incentive to "offload more work onto it"..
Whereas keeping all those little SPUs busy with enough useful work per frame is a much harder & more complicated ordeal for example..
 
In practice a console game is A LOT more likely to be GPU-limited than CPU-limited, so it would make sense going forward for the focus to rest on improving the CPU's capacity to help the GPU with graphics-related processing tasks (calculating radiosity for example) as well as doing everything it already does (or just making the GPU & CPU both faster & more flexible), a lot faster & more abundantly than trying to do the opposite..

...

Whereas keeping all those little SPUs busy with enough useful work per frame is a much harder & more complicated ordeal for example..


hm... how many SPEs do you think would be enough? As you say, the CPU will be able to assist the GPU in certain tasks, and I have to imagine at some point that's what all the "extra" SPEs/cores will be doing, because there aren't that many other CPU-only tasks to be parallelized. I'd be interested to know how well developers are utilizing the 6 SPEs at the moment, and whether there are enough tasks to consume a significant amount of processing time per core.

What I'm getting at is: what would strike a good balance between hardware and software for the future? Factoring in the die size and power consumption, would it be worth it to have 32 or more SPEs? Do developers want that many hardware threads?

Or should they instead increase the cache, clock speed, IPC etc. to something stupidly awesome, with a more conservative number of cores (16-24 SPEs)?
 
But you lose a lot of flexibility running on the GPU, many tasks just wouldn't run efficiently enough to get comparable performance & with things like collision, if you already have a Cell sitting alongside your GPU then why in the world wouldn't you use it instead..?

What I don't get is this strange mentality you seem to have that for some reason developers would want to offload general game processing tasks to the GPU when you have a big fat multi-core CPU that can process the same data a lot more efficiently..?
I think you've got your wires crossed. This is about the future consoles, not the current ones. PS3 is a good architecture. Going forwards, would it be worth having a big CPU/Cell in a console, or will a 'GPU' be better at most of the processing heavy workloads, and a CPU will only need to be something fairly simplistic - in essence the CPU being PPE and the GPU providing lots of vector units as SPEs?

liolio's faith is in the development of GPUs as versatile processing engines, more so than GPUs are capable of now. Looking at the situation now, Cell is a no-brainer as GPUs just can't compete with the flexibility and performance, but that isn't going to remain that way. As for GPUs being tied up with graphics work, if we lose the preconception-forming naming conventions and don't call them GPUs but VPUs (vector processing units), then the choice is to spend your transistor budget either on CPU+GPU, or on CPU+VPU. If you could have 600 M transistors, would you get bang for your buck from 300 M on CPU and 300 M on GPU, or 50 M on CPU and 550 M on VPUs, across multiple dies if necessary? This is similar to the idea of doing graphics solely on Cell or Larrabee. If you have 600 M transistors of Cell/Larrabee, will you get overall a better result with those transistors churning through game code and graphics rendering, versus a budget split over discrete chips dedicated to specific roles?
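
To put rough numbers on that trade-off, here is a toy sketch that simply assumes "VPU-style" logic delivers several times the raw throughput per transistor of general-purpose CPU logic; the 4x figure and the units are invented purely for illustration, not real estimates for any chip.

Code:
// Toy illustration of the 600M-transistor split.
// Assumption (made up): VPU-style logic yields ~4x the raw FLOPS per
// transistor of general-purpose CPU logic.
#include <cstdio>

int main() {
    const double cpu_per_Mtrans = 1.0;   // arbitrary throughput units
    const double vpu_per_Mtrans = 4.0;   // assumed 4x denser in math

    struct Split { const char* name; double cpu_M, vpu_M; };
    const Split splits[] = {
        { "300M CPU + 300M VPU", 300.0, 300.0 },
        { " 50M CPU + 550M VPU",  50.0, 550.0 },
    };
    for (const Split& s : splits) {
        double total = s.cpu_M * cpu_per_Mtrans + s.vpu_M * vpu_per_Mtrans;
        std::printf("%s -> %6.0f units of raw throughput\n", s.name, total);
    }
    // The lopsided split wins on raw math (2250 vs 1500 here), but only if
    // the workloads actually fit the VPU, which is exactly the open question.
}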

It's an interesting time, with three (at least!) different approaches to the 'lots of processing power' target, and a choice of how much workload to place on CPUs and GPUs, also giving developers the option of flexibility or performance.
 
Sure they'll run.. But you can definitely churn through a lot more per frame using the Cell than most GPUs & in real-world scenarios it makes sense to use Cell for what it's good at & not waste GPU resources leaving your beefy floating point monster CPU sitting idle..
Why would a console manufacturer spend more money (and silicon budget) than needed if they intend to use the GPU as a general-purpose accelerator?

About the perf, that's not so sure; look at the last Nvidia presentation. Ok, the GPU is way bigger than Cell, BUT I'm not sure that Cell would lead in perf/mm².
I'm not sure that Cell and GPU goals are that different, i.e. dealing with parallel workloads heavy on computation ;)

I never ever "ever" mentioned gpgpu at all.. Where in the world did you get the idea that I said anything of the sort..?
Point taken, but you seem to imply that using the GPU for things outside of graphics is not a good idea. Doing calculations on the GPU, even if game-related, is GPGPU.

But you lose a lot of flexibility running on the GPU, many tasks just wouldn't run efficiently enough to get comparable performance & with things like collision, if you already have a Cell sitting alongside your GPU then why in the world wouldn't you use it instead..?
Look at the link above; GPUs are catching up.
And they benefit from more R&D than IBM or Sony is likely to spend, mostly due to volume.
I'm not sure that many workloads would work that much better on Cell.
I guess it would be a mixed bag.
So I can reverse the argument: why have a huge Cell-like CPU when you can offload a lot of calculations onto the GPU? Depending on your goals you can balance the graphics workloads against the GPGPU ones, if the game doesn't need to push the graphics.

In the PS3 you have a lot of different resources: SPUs, vertex units, pixel units.
If in the PS4 (or whatever system) you have only "GPU cores", I see a lot of opportunities to balance your workload and keep the hardware busy, which can turn out to be a huge advantage.

Sure..
What I don't get is this strange mentality you seem to have that for some reason developers would want to offload general game processing tasks to the GPU when you have a big fat multi-core CPU that can process the same data a lot more efficiently..?
In practice a console game is A LOT more likely to be GPU-limited than CPU-limited, so it would make sense going forward for the focus to rest on improving the CPU's capacity to help the GPU with graphics-related processing tasks (calculating radiosity for example) as well as doing everything it already does (or just making the GPU & CPU both faster & more flexible), a lot faster & more abundantly than trying to do the opposite..
Once again, why have a huge-ass CPU? And there's no proof the big-ass CPU is better for the tasks where SPUs/GPUs are really likely to shine.
One can focus on making the PPC cores better. ATI/Nvidia will take care of making the GPU better at their own cost.
And mixing different workloads on the GPU is perhaps a great way to push its efficiency even further (keep the ALUs busier).

I don't envisage technology like GPGPU (as wonderful as it may be) making much of an impact in the console space as console GPUs aren't at all in danger of being starved of work in the vast majority of cases, so there's little incentive to "offload more work onto it"..
Whereas keeping all those little SPUs busy with enough useful work per frame is a much harder & more complicated ordeal for example..
Your argument is broken:
Mostly everybody agrees on a 4-PPU + x-SPU design.
If one chooses 4 PPUs (even slightly bigger/better ones), they are left with a healthier silicon budget for the GPU. Then people have the choice, which is a strength not a weakness, to run whatever they want on the GPU.
If I follow this logic, Nvidia and ATI should never have gone with unified designs, to prevent anyone from saying "you spend too much power on vertex shading, when in the end people only see pixels".

And all this is on the "technical side"; there is a lot of money (R&D) saved in the process by purchasing a bigger GPU.
And my point on software still stands too: GPUs are standard, so the tools to use them will be there.

The more I think about it, the less obvious the choice of Cell is, but as a powerhouse it could still be a good choice if GPUs somewhat fail or lag behind their promises ;)
 

Possibly. It is increasingly looking as if 32nm might be viable for 2011 volume production, and it should allow some decent advances within a reasonable cost/power envelope.

However, any predictions that only take what might be technically possible into account, and disregard R&D costs, advantages to extending the existing infrastructure in software tools and libraries, market positioning in terms of cost, infrastructure limits (realistic max target resolution of 1920x1080 for instance), human engineering in size/power draw/noise, plays for unique selling points, et cetera - are simply not making a worthwhile effort.
 
I think you've got your wires crossed. This is about the future consoles, not the current ones. PS3 is a good architecture. Going forwards, would it be worth having a big CPU/Cell in a console, or will a 'GPU' be better at most of the processing heavy workloads, and a CPU will only need to be something fairly simplistic - in essence the CPU being PPE and the GPU providing lots of vector units as SPEs?

liolio's faith is in the development of GPUs as versatile processing engines, more so than GPUs are capable of now. Looking at the situation now, Cell is a no-brainer as GPUs just can't compete with the flexibility and performance, but that isn't going to remain that way. As for GPUs being tied up with graphics work, if we lose the preconception-forming naming conventions and don't call them GPUs but VPUs (vector processing units), then the choice is to spend your transistor budget either on CPU+GPU, or on CPU+VPU. If you could have 600 M transistors, would you get bang for your buck from 300 M on CPU and 300 M on GPU, or 50 M on CPU and 550 M on VPUs, across multiple dies if necessary? This is similar to the idea of doing graphics solely on Cell or Larrabee. If you have 600 M transistors of Cell/Larrabee, will you get overall a better result with those transistors churning through game code and graphics rendering, versus a budget split over discrete chips dedicated to specific roles?

It's an interesting time, with three (at least!) different approaches to the 'lots of processing power' target, and a choice of how much workload to place on CPUs and GPUs, also giving developers the option of flexibility or performance.
Really well put together Shifty :cool:

That's exactly what I wanted to express!
Also, one of my main points was that if the software and experience will be there for Cell, they will be there for GPU/VPU too.
And while it sounds easy for Sony to leverage the R&D already put into Cell (both software and hardware), we (people on this board) have too easily dismissed other alternatives for Sony.
The R&D behind GPUs comes for free (ok, you buy the IP or pay royalties, but you will have to anyway) => you can focus on the CPU or other things.

archangelmorph, don't get me wrong, sticking with Cell could be interesting; it would provide different resources for different workloads, and it makes a lot of sense in fact.
But it's not enough to dismiss the alternatives ;)

By the way, it looks like 32nm could be available in the timeframe the next systems are likely to launch! Cool news for the enthusiasts: a healthier silicon budget :)

If GPUs/VPUs deliver on their promises, I could see at least MS (as Sony has a choice ;) ) go with an even more unbalanced system in regard to CPU/GPU die size.
With the 360 they spent twice as much silicon on the GPU (taking into account the eDRAM in the daughter die) as on the CPU.

They could end up with:
A tiny (really tiny @ 32nm) CPU, a quad-core supporting 8 threads.
Think a Xenon/PPU with its weaknesses (bugs, AltiVec implementation, cache size, cache latency, lack of OoO support, ...) addressed.

They would be left with a bunch of silicon for the GPU.
They could go with one huge GPU, but they would be more likely to go with two mid-size dies.
This could easily lead to a tricky (costly) mobo design, especially if they want to keep a UMA design.
So they would have to put a lot of effort into making the communication between the different chips one of the main strengths of their design.
Including a daughter die à la Xenos could be even trickier.
 
Too Tricky...
Fewer chips on the board = better board.

Start with a mature die process and design a complementary internal bus architecture.
Join separate chips into System on Chip units as production and die processes mature.
 
Going forwards, would it be worth having a big CPU/Cell in a console, or will a 'GPU' be better at most of the processing heavy workloads, and a CPU will only need to be something fairly simplistic - in essence the CPU being PPE and the GPU providing lots of vector units as SPEs?
That's the technical side of console design; Cell is special for Sony because its IP cost is supposed to be cheaper than IP entirely developed by a third party. In other words, SCE has to recover the development cost of Cell by reusing its IP in a game console, unless it ends up embedded in every TV. In the early phase of a console's life cycle the transistor cost dominates, but after shrinks the IP cost becomes a more visible part of any cost cut.
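
A toy illustration of that shift, with entirely made-up numbers: per-unit silicon cost falls with every shrink while the amortised IP/royalty charge per console stays roughly flat, so the IP share of the cost grows over the console's life.

Code:
// Toy cost model with invented numbers: silicon cost drops ~45% per shrink,
// the per-unit IP/royalty charge does not, so its share of the total grows.
#include <cstdio>

int main() {
    double silicon_cost = 100.0;    // $ of silicon per console at launch (made up)
    const double ip_cost = 15.0;    // $ amortised IP/royalty per console (made up)

    const char* nodes[] = { "90nm", "65nm", "45nm", "32nm" };
    for (const char* node : nodes) {
        double total = silicon_cost + ip_cost;
        std::printf("%s: silicon $%5.1f + IP $%4.1f -> IP is %4.1f%% of chip cost\n",
                    node, silicon_cost, ip_cost, 100.0 * ip_cost / total);
        silicon_cost *= 0.55;       // assume each shrink cuts silicon cost ~45%
    }
}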
 