Shifty Geezer said:
Shaderguy :
1 : How could MS look at Cell and base their decisions on it, when Cell's design was unknown during XeCPU's development?
Because Sony and IBM publicly announced much of the Cell architecture and the PS3's performance targets several years ago. There were also various papers and patent applications publicly available that gave many details. It's true that many of the specifics were not revealed, but the overall performance targets, as well as the overall architecture, were well known. (For example, Sony was saying that 4 Cells == 1 Teraflop, from which one could determine 1 Cell == 256 GFlops. And that a Cell would be 1 CPU + a large number of DSPs. And that there was a high-speed Rambus-designed interconnect between Cells, and that one Cell would be used as a CPU while another would be used as a GPU, and so on.)
Shifty Geezer said:
2 : Can we really say XeCPU has 3x the GP performance? Faf points out the SPEs aren't any slouches in this regard (though we don't know how cache/LS management impacts things), and MS's statement that they have 3x the performance is based on 1 PPE core vs. 3 (though they aren't the same cores in all respects) and totally discounts the worth of the SPEs in GP. I think what we're hearing regarding GP performance is rather nonsensical and unfounded FUD and shouldn't be taken as valid, unless someone can present some hard facts on the matter.
Since the SPEs don't have general-purpose access to main memory, I think their performance on general purpose code has to be discounted quite a bit. Wouldn't you have to implement some form of software cache? Wouldn't that make a main memory read access look like this:
int Peek(void* address)
{
    uintptr_t addr = (uintptr_t) address;
    TLBEntry e = TLB[TLBHash(addr)];
    if (e.base != (addr & PAGE_MASK))  // parentheses needed: & binds looser than !=
    {
        // Miss: schedule a DMA transfer from main memory into the local store,
        // possibly context-switching while waiting, then refill the TLB entry.
    }
    return *(int*) (e.cache + (addr & PAGE_OFFSET_MASK));
}
That seems like it would take at least 20 instructions, including several branches, even for the in-cache case. That seems slow enough that people wouldn't really want to run general-purpose algorithms on the SPEs. Instead, devs will write custom SPE code that reads and writes data in a more stream-oriented fashion.
Given that a single PPC core is not too wimpy, I predict many PS3 devs will just ignore the SPEs and tune their game to use the single PPC core and the GPU. That's effectively what Epic is doing with their Unreal Engine. (Of course, they're very diplomatic about ignoring the SPEs, saying "we're leaving the SPEs free for middleware to use".)