Be thankful for Cell ... *mitosis*


psorcerer

I guess Sony took a gamble on Cell and had their fingers burnt so won't go there again!

I would say that the modern console games are improved only because of Cell.
1. Cell taught developers how to parallelize their code
2. Cell taught developers how to use data-driven software design.
3. Cell taught developers how to use asynchronous compute.
Without Cell we would still be getting single-thread games that relied heavily on texture mapping.
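
To make point 2 concrete, here's a rough sketch (a made-up particle example, not code from any shipped engine) of the data-oriented layout that the SPU era pushed people toward:

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: touching just positions still drags every other field
// through the cache.
struct ParticleAoS { float px, py, pz, vx, vy, vz; int material; float age; };

// Struct-of-arrays: the data a pass actually reads is contiguous, so it
// streams cleanly through a cache, an SPE local store, or a GPU wavefront,
// and any [begin, end) range is a self-contained job for another core.
struct ParticlesSoA {
    std::vector<float> px, py, pz;
    std::vector<float> vx, vy, vz;
};

void integrate(ParticlesSoA& p, std::size_t begin, std::size_t end, float dt)
{
    for (std::size_t i = begin; i < end; ++i) {
        p.px[i] += p.vx[i] * dt;
        p.py[i] += p.vy[i] * dt;
        p.pz[i] += p.vz[i] * dt;
    }
}
```

Hand each contiguous index range to a different core and you have both the parallelism and the data-driven layout in one move.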
 
no evidence that sufficient work being undertaken now on the CPU can be deployed to the GPU's compute

There is evidence. But you're right, we are in the middle of the 8th generation and the majority of people still do not know how to program GPUs. I would say that the main problem is that gamers still buy games with bad graphics, so why bother?
 
I would say that the modern console games are improved only because of Cell.
1. Cell taught developers how to parallelize their code
2. Cell taught developers how to use data-driven software design.
3. Cell taught developers how to use asynchronous compute.
Without Cell we would still be getting single-thread games that relied heavily on texture mapping.

yeah, I just meant that from a financial PoV it cost them last gen
 
yeah, I just meant that from a financial PoV it cost them last gen

The current generation looks very good from a financial point of view, but will probably not bring any real advances in graphics.
I hope that the next generation will have more technology (and having a 20% stronger main CPU with 2x more cache doesn't count as more technology).
 
Xenon is a stripped-down Cell; Microsoft were very proud of getting it from Sony for free.

And they even managed to release the Xenon before Sony released their own Cell. A win for Microsoft.

One whole year before even...Fatality!

EDIT:
It means that for the purpose of my statement Cell = Cell + Xenon
You mean Cell = SPEs + (Xenon / 3) ?
 
And they even managed to release the Xenon before Sony released their own Cell. A win for Microsoft.

Yeah, it was a real win. That's why Sony did what it did this generation, and so far it looks like it worked fine.
P.S. Cell was ready much earlier than the PS3; the PS3 was delayed by RSX, not Cell.

You mean Cell = SPEs + (Xenon / 3) ?

I mean that "parallelising code" works for both Cell and Xenon, but "use async compute" does not (nothing to async on Xenon, sadly).
 
Optimizing SPEs + local store has similarities with optimizing shaders for local cache, I think. Also, the SPEs were used a lot to aid RSX.
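
Roughly what I mean, as a plain C++ stand-in (the memcpy is where an SPE would issue a DMA into local store, and where a compute shader would stage into group-shared memory; the buffer size is arbitrary):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Process a big array in fixed-size tiles staged into a small scratch buffer.
// The discipline is the same on an SPE or in a shader: decide up front which
// working set lives close to the ALUs and stream it in and out in blocks.
constexpr std::size_t kTile = 4096;

void brighten(std::vector<float>& pixels, float gain)
{
    float scratch[kTile];                                                 // stand-in for local store / LDS
    for (std::size_t base = 0; base < pixels.size(); base += kTile) {
        const std::size_t n = std::min(kTile, pixels.size() - base);
        std::memcpy(scratch, pixels.data() + base, n * sizeof(float));   // "DMA in"
        for (std::size_t i = 0; i < n; ++i)
            scratch[i] *= gain;                                          // compute on the local copy
        std::memcpy(pixels.data() + base, scratch, n * sizeof(float));   // "DMA out"
    }
}
```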
 
You mean they couldn't learn that from using Xenon? :???:
In the case of game devs:
Or Sega Saturn?
Or multisocket PCs?
Or the contemporaneous dual-cores?

Xenon is a stripped-down Cell; Microsoft were very proud of getting it from Sony for free.
Source?
Last I read up on this, Microsoft paid IBM money to design Xenon. IBM could pat itself on the back for getting Sony and Microsoft to pay for similar cores.

Optimizing SPEs + local store has similarities with optimizing shaders for local cache, I think. Also, the SPEs were used a lot to aid RSX.
Cache optimizations mattered for other architectures as well.
The specific way Cell implemented its system could be credited for hindering multicore development for devs on the PS3 early on.
Xenon could be used similarly, but it didn't have a massive chunk of its die actively intimidating people from using it.

Some of the technically weaker early PS3 games skated by on the PPE alone, and the SPE's DMA to local store and lack of even an instruction cache are things even GPUs avoid.
 

http://www.amazon.com/The-Race-New-Game-Machine/dp/0806531010

Cache optimizations mattered for other architectures as well.

But they were absolutely essential to get good performance from the PS3.

actively intimidating people from using it

That's not how it works. Hardware must be intimidating, otherwise nobody will bother to extract more performance.
This is the case this generation. People could extract at least 10x more performance from async compute, but they don't, because it looks OK anyway.
The PS4 has emulation layers for some DX11 concepts (created to aid porting older games), and guess what? Everybody uses that emulation in every new game (although it's slow as hell).
People are lazy; if you don't stand over them with a stimulus in hand, they won't do anything. :)
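
Getting work onto a separate compute queue isn't even hard. In Vulkan terms it's roughly this (a sketch; the helper name and structure are mine):

```cpp
#include <cstdint>
#include <optional>
#include <vector>
#include <vulkan/vulkan.h>

// Find a queue family that supports compute but not graphics. Work submitted
// to a queue from such a family can overlap with the graphics queue, which is
// the whole point of "async compute".
std::optional<uint32_t> findDedicatedComputeQueueFamily(VkPhysicalDevice gpu)
{
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
    std::vector<VkQueueFamilyProperties> families(count);
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, families.data());

    for (uint32_t i = 0; i < count; ++i) {
        const bool compute  = (families[i].queueFlags & VK_QUEUE_COMPUTE_BIT)  != 0;
        const bool graphics = (families[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) != 0;
        if (compute && !graphics)
            return i;        // dedicated compute family: submissions here can run alongside graphics
    }
    return std::nullopt;     // fall back to the combined graphics+compute family
}
```

Filling that queue with useful work is the part people skip.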
 
I would say that the modern console games are improved only because of Cell.
1. Cell taught developers how to parallelize their code
2. Cell taught developers how to use data-driven software design.
3. Cell taught developers how to use asynchronous compute.
Without Cell we would still be getting single-thread games that relied heavily on texture mapping.

Poppycock.
 
Is there a line where Microsoft congratulated itself on getting Xenon for free?

But they were absolutely essential to get good performance from the PS3.
Because you had to use the SPEs to get good performance, and their memory model required more than just cache optimizations.
Some of those Cell optimizations are things where vast portions of it wouldn't function if you didn't do the special dance. Some people would call that "broken" or at least a "multi-billion dollar mistake".

That's not how it works. Hardware must be intimidating, otherwise nobody will bother to extract more performance.
This is why Cell is a widely used high-performance processor with an active development pipeline and ecosystem, and not a discard that was kept alive only because it was a required element of an established player in a multi-billion dollar market.

This is the case this generation. People could extract at least 10x more performance from async compute, but they don't, because it looks OK anyway.
This is on a GPU design that has made serious efforts to make itself more accessible and compatible with CPU development: aligned page formats, shared memory, caches, and leveraging existing APIs and/or working to provide toolsets based on standards.
The planned endgame with HSA, which breathless PS4 fans were implicitly harping on back when hUMA support was the rumor buzzword of the day, is to align the programmer-facing architecture further.

The PS4 has emulation layers for some DX11 concepts (created to aid porting older games), and guess what? Everybody uses that emulation in every new game (although it's slow as hell).
Which things are slow? Certain functions?
What is "slow"? 30-60 FPS?

People are lazy; if you don't stand over them with a stimulus in hand, they won't do anything. :)
They can be lazy, and they can also have budgets and limited time to reinvent the wheel or repeat mistakes of the past. Cell's particular choice was not a first, but there's a reason why that form of heterogeneous implementation seemed new to people when it came out.
 
Is there a line where Microsoft congratulated itself on getting Xenon for free?

Obviously not; they were internally proud of it, using words like "we are the software company, yet".

Some of those Cell optimizations are things where vast portions of it wouldn't function if you didn't do the special dance.

AFAIK, all techniques used for SPEs were highly beneficial for multicore CPU and GPU programming as well.

not a discard

Nobody said that it was a good financial move.
But it was a good educational move, and somebody must pay for that education.

This is on a GPU design that has made serious efforts to make itself more accessible and compatible with CPU development: aligned page formats, shared memory, caches, and leveraging existing APIs and/or working to provide toolsets based on standards.

You're mixing things up here. Leveraging existing APIs is clearly a bad move (in an educational sense), but reducing CPU/GPU synchronization is a good one.

Which things are slow? Certain functions?

Registers and state changes. "Slow" means "GPU functionality emulated on the CPU", i.e. it depends on synchronization.
And the main problem: it allows people to think in a DX11 way, using things like "scene", "render state", "pipeline", etc., when that is obviously not how modern hardware works.
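
The contrast I mean, in caricature (made-up types, not any real API):

```cpp
#include <cstdint>

// DX11-style: a mutable state machine. Every draw has to resolve whatever
// combination of states happens to be set at that moment, so the driver
// re-validates (and sometimes re-compiles) behind the application's back.
struct LegacyContext {
    uint32_t blend = 0, depth = 0, vs = 0, ps = 0;
    void setBlendState(uint32_t s)              { blend = s; }
    void setDepthState(uint32_t s)              { depth = s; }
    void setShaders(uint32_t v, uint32_t p)     { vs = v; ps = p; }
    void draw(uint32_t /*vertexCount*/)         { /* resolve blend/depth/vs/ps now, per draw */ }
};

// Explicit style (DX12/Vulkan-ish): the same state is baked once into an
// immutable pipeline object at load time; a draw just references it, so the
// per-draw CPU cost is close to "write a few words into a command buffer".
struct PipelineDesc { uint32_t blend, depth, vs, ps; };
struct Pipeline     { PipelineDesc baked; /* plus compiled microcode, in reality */ };

inline Pipeline buildPipeline(const PipelineDesc& d)                 { return Pipeline{d}; }
inline void recordDraw(const Pipeline& /*pso*/, uint32_t /*count*/)  { /* append to command list */ }
```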

budgets and limited time

We are talking about a pure technical/engineering problem. Let's not mix it with business. An engineer who wants to "think business" probably needs to be fired or promoted to manager; it's not their job.
 
Obviously not; they were internally proud of it, using words like "we are the software company, yet".
I can draw no conclusions from that sentence fragment.

AFAIK, all techniques used for SPEs were highly beneficial for multicore CPU and GPU programming as well.
The DMA list functionality is something that is missed by some when it comes to CPU multicore programming.
The disparate memory spaces are something being moved away from rapidly, and the required care in managing the instruction payload, because it could readily impact data capacity and/or data addressing, has not carried over.
The lack of coherence meant certain behaviors in modern systems, such as false sharing and management of the cacheability of data, were not represented well by Cell.
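
For anyone who hasn't hit it: a small, self-contained illustration of false sharing, the kind of coherent-multicore behavior the SPEs' private local stores never exposed (a toy benchmark, timings will vary by machine):

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

// Two counters bumped by two threads. In the packed layout both counters sit
// on one cache line, so the cores ping-pong that line between them; the padded
// layout gives each counter its own line.
struct Packed { std::atomic<long> a{0}; std::atomic<long> b{0}; };
struct Padded { alignas(64) std::atomic<long> a{0}; alignas(64) std::atomic<long> b{0}; };

template <typename Counters>
double hammer(Counters& c)
{
    auto t0 = std::chrono::steady_clock::now();
    std::thread t1([&] { for (long i = 0; i < 20'000'000; ++i) c.a.fetch_add(1, std::memory_order_relaxed); });
    std::thread t2([&] { for (long i = 0; i < 20'000'000; ++i) c.b.fetch_add(1, std::memory_order_relaxed); });
    t1.join();
    t2.join();
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
}

int main()
{
    Packed p;
    Padded q;
    // The packed version is typically noticeably slower, purely due to coherence traffic.
    std::printf("packed: %.3fs  padded: %.3fs\n", hammer(p), hammer(q));
}
```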

But it was a good educational move, and somebody must pay for that education.
The additional complexity, poor tools, and lack of a good reference for the exotic design stymied a number of developers and led to conservative early releases that leveraged a single core on the PS3.
The PC and Xbox 360 could provide multicore development opportunities, and did so sooner. Even if the PS3 weren't delayed, and I've seen more blame put on Blu-ray than RSX, the set of early PS3 games that outright avoided the SPEs meant that it took another round of iteration or developer flameouts before true multi-core adoption would have taken place for a number of developers.

Having to dedicate optimization and programming for a specific model of developer-antagonistic multicore was a source of drag for multiplatform development.
Cell had an impact, but it was not the sole driver of the advancement of multicore development knowledge, nor is it fair to say Xenon was Cell. What made Cell what it was versus Xenon is the part that was disposed of by everyone going forward.

You're mixing things up here. Leveraging existing APIs is clearly a bad move (in an educational sense), but reducing CPU/GPU synchronization is a good one.
What made Cell what it was was a heterogeneous multiprocessing system, where the biggest differences were multiple isolated memory spaces, explicit DMA, and incompatible instruction formats.
GCN strives heavily to remove the first two, and the desired end goal for AMD is to provide sufficient software tools and abstractions to make the third problem no worse than it is for separate APIs, with the hope that an HSA programming model that is somewhat transparent to the programmer will make it even less obvious.
Reducing synchronization in this case is bringing the GPU closer to what CPUs already do.

And the main problem: it allows people to think in a DX11 way, using things like "scene", "render state", "pipeline", etc., when that is obviously not how modern hardware works.
The hardware doesn't fully match what the APIs express, but that's not to say that those concepts are foreign to it.
It allows for the deployment of a platform that is economical and commercially successful, and one that can be done now as opposed to years in the future. It allows iteration and research now, as opposed to years retreading all the old ground to get to the new.
The legacy APIs are not ideal, but they are at the same time not devoid of reason and practical benefit, and this is not an industry of volunteers that do not require food and shelter.
You can aim for perfection that iterates never, or you iterate as much as you can on what you can afford to work on.

We are talking about a pure technical/engineering problem. Let's not mix it with business. An engineer who wants to "think business" probably needs to be fired or promoted to manager; it's not their job.
Engineers cannot neglect practical and material constraints. Making choices to meet goals and trading off between what you want versus what it costs, or how it can be made acceptable to the market is part of the job description.
 
An engineering problem without time/budget goals and constraints?

I would say that time constraints are vastly exaggerated here.

The lack of coherence meant certain behaviors in modern systems, such as false sharing and management of the cacheability of data, were not represented well by Cell.

I hope you are referring to async GPU compute here as the "modern system"?

Having to dedicate optimization and programming for a specific model of developer-antagonistic multicore was a source of drag for multiplatform development.

You are still focusing too much on the financials here.
I'm talking purely about the engineering/educational aspect. Before Cell, multicore CPUs were abundant on PC, yet nobody even remotely managed to use them in games (even two threads).
It was partly because drivers/APIs were strictly single-threaded, but I would argue that Cell created the needed paradigm shift here.
And there are no drivers for any paradigm shift right now (I thought DX12/Vulkan would be, but diving deeper leaves me with an "oh god, they just could not throw out old habits" feeling).

What made Cell what it was was a heterogeneous multiprocessing system, where the biggest differences were multiple isolated memory spaces, explicit DMA, and incompatible instruction formats.

The biggest Cell difference was: you need to know how to write multi-threaded code, and how to manually manage its execution.
All other things are irrelevant small nuances.
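
And by "manually manage its execution" I mean something as basic as this, a bare-bones job queue sketch of my own with nothing Cell-specific left in it:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// The application decides what the units of work are and hands them to a
// fixed pool of workers, instead of hoping one big thread (or the driver)
// sorts it out. This is the habit the SPEs forced on everyone.
class JobQueue {
public:
    explicit JobQueue(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { run(); });
    }
    ~JobQueue() {                                      // drains remaining jobs, then joins workers
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();                                     // execute outside the lock
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> threads_;
    bool done_ = false;
};

// Usage sketch:
//   JobQueue q(std::thread::hardware_concurrency());
//   q.submit([] { /* animate a chunk of skeletons */ });
//   q.submit([] { /* cull a chunk of the scene    */ });
```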

goal for AMD is to provide sufficient software tools and abstractions

I hope not. Tools need to be written by game developers, for their specific development needs.
Using fat APIs was never a good idea, and a vendor API cannot be thin by definition.

but that's not to say that those concepts are foreign to it

I think they are. Current hardware knows how to draw triangles, one by one. That's it.
Scene, iterative changes, and all other DX crap are very foreign to it.

The legacy APIs are not ideal, but they are at the same time not devoid of reason and practical benefit, and this is not an industry of volunteers that do not require food and shelter.

They are impractical crap. It was clear in 2007, and it's even clearer now. There is huge demand for very low-level, thin APIs: DX12/Vulkan/Mantle/Swift. But the main player, MSFT, chickened out at the last moment and still uses too much CPU for render-related tasks.

You can aim for perfection that iterates never, or you iterate as much as you can on what you can afford to work on.

Iterations are cheap now. Software development is much easier than it was in the PS2 era. There are huge problems with asset development and art, but software is just incredibly easy to write. And the tricks haven't changed much.
 