Predict: The Next Generation Console Tech

I really don't know.
Reading in another thread about the next GDDR+ standard, I've found some articles about differential signaling, but the most recent was an AMD slide dated November 2008.
 
I just thought, how about the other way around? Toshiba provides an extended SpursEngine processor, anyone can provide a standard main processor, and the SPU code developers have written will remain portable.
 
I just thought, how about the other way around? Toshiba provides an extended SpursEngine processor, anyone can provide a standard main processor, and the SPU code developers have written will remain portable.

Or to go in another direction, couldn't the SPUs be added to a customized GPU, which Toshiba not only has experience doing but also has the research from their initial try? If you look at the current and future designs of ATI SPs and Intel and NV cores, they have more in common with SPUs than they do with their offerings at the outset of this gen.
 
I just thought, how about the other way around? Toshiba provides an extended SpursEngine processor, anyone can provide a standard main processor, and the SPU code developers have written will remain portable.

Yeah, something like that might work. I was thinking more of recycling some generic MIPS core or similar. I expect the code running on the PPC to be pretty generic and mostly written in a high-level language, so swapping out the PPC core shouldn't really bother the programmers that much as long as the performance is acceptable.

However, the SpursEngine is designed for lower operational frequencies than the Cell, so it may not be a perfect fit straight out of the box, but Toshiba seems to have capable engineers, so I guess anything could be possible.
 
NV, ATI (now AMD), and Microsoft already knew what was likely coming at the time; Sony, on the other hand, likely didn't. Look at the hardware they had before the PS3. It's a good thing that the Cell was powerful enough to make up for the RSX's shortcomings, but imagine what things would be like if the Cell wasn't needed to help the PS3 keep up with the Xenos in the 360. Imagine if they had had a scaled-down G80, something that would have been fully DX10 compliant. Something like that would not only have benefited Sony but PC gaming as well. In other words, it would have been much closer to the vision laid out before developers and the public when the system was first announced.

That's what I'm saying. They saw that the GS was a fillrate monster, and that basically let them keep up in some ways with the NV2A in the Xbox.

I'm sure they were looking at the huge fillrate advantage in the G70 and thought it would do it again, not really knowing that shaders would be a huge deal this generation.


I'm sure if they had taken a G80 and cut it in half, or taken two-thirds of the final G80, it would certainly have allowed the PS3 to graphically outmuscle the 360.

There are a lot of other things that I'm sure hurt them: MS putting in more RAM based on feedback from the devs, and the Blu-ray drives being so expensive at the start.
 
Sony have a habit of using lots of different chips.
PS2 = MIPS CPU, VU0 and VU1
PS3 = PPE CPU, SPUs, RSX

The traditional PC model seems more elegant, i.e. a strong CPU and GPU.

In the long run it works out cheaper too, if you look at the Xbox 360 and PS3 BOMs excluding the additional features and concentrating just on the motherboard, CPU + additional processing units (aka SPUs), GPU, and RAM.

It would be in Sony's interest to design a new console with a more homogeneous architecture and stop wasting money on trying to reinvent the wheel. After all the millions spent on R&D by Sony, the Xenos and Xenon hold their own against the PS3 quite admirably.

I would love to see a console design based around an AMD CPU and GPU for 2012. I believe it would be a very powerful and efficient design.
 
Sony have a habit of using lots of different chips.
PS2 = MIPS CPU, VU0 and VU1
PS3 = PPE CPU, SPUs, RSX

The traditional PC model seems more elegant, i.e. a strong CPU and GPU.

In the long run it works out cheaper too, if you look at the Xbox 360 and PS3 BOMs excluding the additional features and concentrating just on the motherboard, CPU + additional processing units (aka SPUs), GPU, and RAM.

It would be in Sony's interest to design a new console with a more homogeneous architecture and stop wasting money on trying to reinvent the wheel. After all the millions spent on R&D by Sony, the Xenos and Xenon hold their own against the PS3 quite admirably.

I would love to see a console design based around an AMD CPU and GPU for 2012. I believe it would be a very powerful and efficient design.

I'd much prefer a console based around a Cell iteration + a competitive AMD/NV GPU solution.
 
(Larrabee + SPU) * 2

This will make everyone happy in 2012, no?

Why would you need SPEs if you had Larrabee? IIRC Larrabee's CPUs are fairly simple, with a short pipeline, 4 HW threads, and a honking vector unit. And a big difference is they have access to TMUs. It's been a while since I read anything on how it is shaping up, but I believe they have their own L2 cache (most devs would consider that a move up from SPEs in terms of accessibility) and a global L3.

Not sure why Intel would justify an entirely different processor (+complexity). I would guess SPEs are denser cores (more per chip, all things equal) and the LS is faster. But what you gain with Larrabee in terms of uniformity (CPU and GPU all using the same language), a standard cache model, and texture processing makes some big inroads.

Do developers really want a 256K LS limit, heterogeneous cores (SPEs, PPEs + GPU), and manually managing their DMA requests when they could just go Larrabee outright?

Lose some peak performance and gain it back, maybe more, in everything being on one code base and one chip type.
 
Why would you need SPEs if you had Larrabee? IIRC Larrabee's CPUs are fairly simple, with a short pipeline, 4 HW threads, and a honking vector unit. And a big difference is they have access to TMUs. It's been a while since I read anything on how it is shaping up, but I believe they have their own L2 cache (most devs would consider that a move up from SPEs in terms of accessibility) and a global L3.
There is no L3. Each core has its own 256KiB subset of the L2.
These tiles are kept coherent with one another, but each core can only write directly to its own tile.
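
If that layout is right, the practical upshot is that the blocking discipline is the same one Cell developers already use, except the hardware does the fetching. A rough sketch in plain C (the tile size comes from the figures quoted in this thread, not from any official spec):

```c
/* Illustrative only: blocking a large array so each core's working set
 * stays inside its 256 KiB L2 tile. */
#include <stddef.h>

#define L2_TILE     (256 * 1024)                   /* one core's L2 subset */
#define BLOCK_ELEMS (L2_TILE / sizeof(float) / 2)  /* headroom for code/stack */

/* Stand-in for real per-element work; after the first touch the block
 * is resident in the local tile and reuse is cheap. */
static void process_block(float *data, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= 2.0f;
}

void process_array(float *data, size_t total)
{
    /* In a real system each core would grab blocks from a shared work
     * queue; shown sequentially here for clarity. */
    for (size_t off = 0; off < total; off += BLOCK_ELEMS) {
        size_t n = (total - off < BLOCK_ELEMS) ? total - off : BLOCK_ELEMS;
        process_block(data + off, n);
    }
}
```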
 
Why would you need SPEs if you had Larrabee? IIRC Larrabee's CPUs are fairly simple, with a short pipeline, 4 HW threads, and a honking vector unit. And a big difference is they have access to TMUs. It's been a while since I read anything on how it is shaping up, but I believe they have their own L2 cache (most devs would consider that a move up from SPEs in terms of accessibility) and a global L3.

How so? It's essentially the same thing.

Not sure why Intel would justify an entirely different processor (+complexity). I would guess SPEs are denser cores (more per chip, all things equal) and the LS is faster. But what you gain with Larrabee in terms of uniformity (CPU and GPU all using the same language), a standard cache model, and texture processing makes some big inroads.

SPEs are there to assist the CPU, not a Larrabee GPU. Now if you're talking about adding SPUs to a Larrabee GPU, then yeah, it's pretty hard to justify that. But who's talking about that?

Do developers really want a 256K LS limit, heterogeneous cores (SPEs, PPEs + GPU), and manually managing their DMA requests when they could just go Larrabee outright?

Lose some peak performance and gain it back, maybe more, in everything being on one code base and one chip type.

Again, Larrabee cores are also limited to 256K, so if devs don't like SPU LSes they ain't going to like Larrabee either.
 
Why is it a 256K limit?

LS isn't cache, it is memory. It is fast, but you also need to manually manage your memory. Cache is simpler and more user-friendly. When games have a two-year development window there are always compromises, and not every compromise is "What is the best performer?" but "How can we get the performance we need in the time window allocated?" While SPEs have encouraged a task model and better data formats, SPEs do pose challenges related to LS size and the heterogeneous nature of the cores.

As for the 256KB limit, besides not being cache (more work to manage LS), increasing the size means lowering LS performance.
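
For anyone who hasn't touched SPE code, "manually managing your memory" looks roughly like the classic double-buffered DMA loop below. It's a hedged sketch built on the Cell SDK's spu_mfcio.h intrinsics; the chunk size and the process() kernel are made up, and total is assumed to be a multiple of CHUNK:

```c
/* Hedged sketch of manual DMA management: double-buffered streaming
 * on an SPE via the Cell SDK's spu_mfcio.h intrinsics. */
#include <spu_mfcio.h>

#define CHUNK 16384  /* 16 KB, the MFC's maximum single-transfer size */

static char buf[2][CHUNK] __attribute__((aligned(128)));

extern void process(char *data, unsigned int size);  /* hypothetical kernel */

void stream(unsigned long long ea, unsigned int total)
{
    unsigned int cur = 0;

    /* Kick off the first transfer, tagged with the buffer index. */
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);

    for (unsigned int off = 0; off < total; off += CHUNK) {
        unsigned int next = cur ^ 1;

        /* Prefetch the next chunk while working on the current one. */
        if (off + CHUNK < total)
            mfc_get(buf[next], ea + off + CHUNK, CHUNK, next, 0, 0);

        /* Block until the current chunk's tag group completes. */
        mfc_write_tag_mask(1 << cur);
        mfc_read_tag_status_all();

        process(buf[cur], CHUNK);
        cur = next;
    }
}
```

Note how much of the code is transfer bookkeeping rather than actual work; that's the overhead a hardware-managed cache hides.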

How so? It's essentially the same thing.

Not according to lazy developers.

SPEs are there to assist the CPU, not a Larrabee GPU. Now if you're talking about adding SPUs to a Larrabee GPU, then yeah, it's pretty hard to justify that. But who's talking about that?

Why would you ever go with a CELL (with SPEs) + Larrabee when you could go 2x Larrabee?

Why use "SPEs to assist the CPU" when each Larrabee has 32 4-thread CPUs?

All you are doing is adding yet another programming model/complexity. If you are going to have:

CPU
-2 x PPEs
-16 x SPEs

GPU
-32 x Larrabee Cores

That is 3 different CPU types and potential code bases. What happens when your nifty GPU code needs to fall back to the SPEs and isn't friendly to the LS?

Why not 2 Larrabee chips for 64 CPU cores (256 HW threads)? All your code could be the same, be it graphics, physics, or general game code. 1 CPU type, 1 memory model, all code works on EVERY CPU.
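
To make the "one code base" point concrete, here's an illustrative-only sketch of what that buys you: every core runs the identical worker loop, and any task type can land anywhere. The task names, core count, and pthreads framing are invented for the example, not a claim about Larrabee's actual programming model:

```c
/* Illustrative-only sketch of "one CPU type, one code base":
 * identical worker code on every core, any task runs anywhere. */
#include <pthread.h>

#define NUM_WORKERS 8   /* stand-in for 64 homogeneous cores */

typedef struct {
    void (*run)(void);
} task_t;

static void shade(void)    { /* graphics work */ }
static void simulate(void) { /* physics work  */ }
static void think(void)    { /* game AI work  */ }

static task_t queue[] = { { shade }, { simulate }, { think } };
static const int n_tasks = sizeof(queue) / sizeof(queue[0]);

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int next_task = 0;

/* The same binary runs on every core: no LS-friendly rewrite and no
 * fallback path when "GPU" work lands on a "CPU" core. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        int i = (next_task < n_tasks) ? next_task++ : -1;
        pthread_mutex_unlock(&lock);
        if (i < 0)
            return NULL;
        queue[i].run();
    }
}

int main(void)
{
    pthread_t t[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; ++i)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_WORKERS; ++i)
        pthread_join(t[i], NULL);
    return 0;
}
```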

The point is this: a PS4 with PPEs (PPC), SPEs, and a Larrabee chip with x86 cores is too complex and specialized.

Why would you need SPEs if you have Larrabee CPUs available? Besides raw potential peak speed, name one reason to thrust this burden on developers.

Again, Larrabee cores are also limited to 256K, so if devs don't like SPU LSes they ain't going to like Larrabee either.

It is cache, not memory.
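
The difference is worth spelling out. If the 256K is a hardware-managed cache, the streaming job from the earlier DMA sketch collapses to a plain loop, with no tags or transfer sizes anywhere. A minimal illustration (not real Larrabee code):

```c
/* Not real Larrabee code: just what the same streaming job looks like
 * when the 256K is a hardware-managed cache instead of an explicit LS. */
#include <stddef.h>

float sum(const float *data, size_t n)
{
    float s = 0.0f;
    /* The hardware pulls lines into the core's L2 tile on demand and
     * evicts them as needed; the working set may exceed 256 KiB with
     * no change to the code, and there are no tags to wait on. */
    for (size_t i = 0; i < n; ++i)
        s += data[i];
    return s;
}
```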
 
Why not 2 Larrabee chips for 64 CPU cores (256 HW threads)? All your code could be the same, be it graphics, physics, or general game code. 1 CPU type, 1 memory model, all code works on EVERY CPU.
That makes a great deal of sense but sounds like a very expensive thing to do.
 
Why would you ever go with a CELL (with SPEs) + Larrabee when you could go 2x Larrabee?
If Larrabee is viewed as a versatile GPU rather than a CPU/GPU hybrid, a Cell + Larrabee PS4 would allow developers to port existing code and practices, provide BC, and still use Larrabee as the GPU. If you lose Cell completely, developers are starting completely from scratch yet again!

Why use "SPEs to assist the CPU" when each Larrabee has 32 4-thread CPUs?
upnorthsox means the SPEs are there for CPU workloads, not graphics rendering. Leave that to whatever GPU is used: nVidia, AMD, or Larrabee.

All you are doing is adding yet another programming model/complexity. If you are going to have:
...
That is 3 different CPU types and potential code bases. What happens when your nifty GPU code needs to fall back to the SPEs and isn't friendly to the LS?
The CPU aspect is well known and understood at this point. Writing Larrabee code is a complete unknown. At best you just chuck shader code at it like any other GPU.

Why would you need SPEs if you have Larrabee CPUs available? Besides raw potential peak speed, name one reason to thrust this burden on developers.
Because the Larrabee cores are busy rendering graphics. That's one system model, where the Cell is the CPU and the Larrabee is a GPU, ignoring Larrabee's potential functionality as a CPU. If you want to write program code in x86 as well as your graphics, then you'd want a pure Larrabee system. Alternatively you could go with Larrabee as a very multicore CPU and put in an nVidia GPU.

At this moment the simplest future for PS4 developers has got to be a Cell CPU and some GPU. Perhaps change the LS to cache for devs who can't get their head/budget around managing the data that closely. But IMO a Larrabee-only system at this point is going to be as bottlenecked by code as PS3 was. Not quite so bad, because Cell has got everyone thinking many-core and parallelising tasks, but still.
 
I agree, and I have my doubts about Larrabee. (Hopefully a couple of chaps here will prove my doubts wrong. Good luck, guys!)

On the other hand, my only real point is that a CELL CPU (PPEs+SPEs) + Larrabee makes no sense. What are SPEs doing that Larrabee cores cannot? What you lose in LS speed and space efficiency you gain in theory by having 1 CPU type, as well as totally blurring the distinction between GPU and CPU. All this hogwash of using SPEs to do work for RSX (work, mind you, that Xenos does on a smaller die!) can be flushed down the toilet: you can slowly tear down the distinction between CPU and GPU.

I guess where I am going with this is that if you have a normal GPU it makes sense to have a normal CPU, but if you use Larrabee you really should have an x86 main CPU, or "go broke" with all Larrabee, or put an OoOE core in a sea of Larrabee cores.

SPEs (CPU) + Larrabee (GPU) = Makes no sense

Too much overlap, too complex.
 
On the other hand, my only real point is that a CELL CPU (PPEs+SPEs) + Larrabee makes no sense. What are SPEs doing that Larrabee cores cannot?
I think it only makes sense if Larrabee turns out to be the bee's knees as a GPU. If the programmability pays huge dividends, and the tools make it easy to harness, choosing to use Larrabee as a standalone GPU makes sense, and then you leave Cell to do the program code. That's not really what LRB is about, though. If you're going to use Larrabee at all in a system, like yourself I think you'd want just LRB. There is still a case for a Cell + LRB system though, in this current era of unproven tech and guesswork!
 