GDC Europe

Panajev2001a said:
Deano, you say no special stuff for branches, but what do you think about this:

There is a difference between what is architecturally allowed and what is architecturally implemented. All available data says that no branch prediction was designed for the SPE. A snippet about branch prediction in an architectural specification has no impact on the inclusion of branch prediction in an actual design based on that specification.

Aaron Spink
speaking for myself inc.
 
version said:
Peter Hofstee:
The design philosophy of Cell was to design for as high an operating frequency as we could without making the processor inefficient ( if 1% more performance costs more than 3% power you would know for sure you've gone too far ) and then achieve maximum operating efficiency by running the processor at its minimum operating voltage.

Some of the graphs we've shown indicate an operating frequency somewhat over 3GHz at the minimum operating voltage. Personally I think that it is better to go to SMP configurations ( like the 2-way blade prototype IBM has shown ) if you have a higher power budget.

Understand that Peter Hofstee is doing marketing for Cell. Grains of salt and all that. Interesting that no actual running system has been shown running faster than 3.2 GHz, yet apparently, if we're to believe the hype, 4 GHz chips are the norm at any given nominal voltage.

3.2 GHz or thereabouts will be the operating frequency for Cell. Though you are welcome to dream about whatever you want.

Aaron Spink
speaking for myself inc.
 
xbdestroya said:
Well the heat thing isn't something I'm just taking a stab at however, it comes from this Q&A from an interview with IBM's Paul McKenney.

So the implication here is that the XeCPU is an 85-watt TDP part; well above Cell at the same frequency, I would imagine, even though we only have shmoo plots for the SPEs.

http://www-128.ibm.com/developerworks/power/library/pa-nl14-directions.html

I think you are vastly underestimating the CELL power and confusing Pmax with Pavg. CELL should have the same if not higher power than XeCPU. I fully expect both designs to be around 70-80W Pmax and 40-50W Pavg.

Aaron Spink
speaking for myself inc.
 
Well, we'll just have to take a 'wait and see' approach on that one I guess.

As for GDC Europe - the George Bain 'Architecture of PS3' presentation should have ended by now. Any word from any attendees on what was discussed?
 
DeanoC said:
This is how I work it out...

PPE can dual-issue VMX FMADDs
SPE can single-issue a SIMD FMADD
Counting each 4-wide FMADD as 8 flops (4 lanes, each a multiply plus an add):
1 * 3.2 * 2 * 8 +
7 * 3.2 * 1 * 8 = 51.2 + 179.2 = 230.4 billion FLOPs

Now I'm not sure if a GFLOP is 1 billion or 2^30 FLOPs so I'll go with a billion...

That's the maximum flops you can get (I think). It's irrelevant whether the PPE FPUs could do some work, as you have filled all the instruction slots with VMX instructions, so you can't issue any more...

I think I'll update the slide as well.

Congratulations DeanoC on an excellent PPT file and certainly a great presentation.

But will this version 2 Cell (2 PPE VMX units) be the one used in the PS3?
 
Version 2 Cell is the only version available, AFAIK. It very quickly superseded version 1, by all accounts.
 
Shifty Geezer said:
Version 2 Cell is the only version available, AFAIK. It very quickly superseded version 1, by all accounts.

I am thankful for the information.

But then what about the 217.6 GFLOPS that Sony announced? Will the clock come in slightly lower, the same as happened with the original EE of '99 (295 MHz instead of 300 MHz)?
 
aaron said:
While within a short burst it is often possible to see a linear progression of FP ops, this will only be the case because you've pre-loaded the operands.
Except that "short bursts" apply to pretty much every graphic loop/streaming process, and they aren't usually all that short.
 
Fafalada said:
Except that "short bursts" apply to pretty much every graphic loop/streaming process, and they aren't usually all that short.

Sure, you can have a short burst every 32 cycles, followed by 32 cycles of loads. Which puts you right where I said you'd be.

It's really quite simple: if you can't get the performance in Linpack/DAXPY, you sure as hell will not get it in any real code.

Aaron Spink
speaking for myself inc.
 
DAXPY is really the worst case, not the best case. I can code higher-radix FFTs or matrix-matrix multiplies to give a much better ratio of FMACs to loads/stores.

However, I agree that peak 'FP' numbers are more PR than engineering - but everyone does it.

( I can't work out 115.2 for XB360, and I can't work out 218 for PS3 - but some PR monkey somewhere has :) )
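To make the DAXPY point concrete, here is the loop in plain portable C (nothing SPE-specific). Per element it performs one FMAC against two loads and one store, which is exactly the poor compute-to-memory ratio being discussed:

```c
/* y = a*x + y: one fused multiply-add per element, but two loads and
 * one store, so memory operations outnumber FMACs three to one. */
void daxpy(int n, double a, const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

A blocked matrix-matrix multiply, by contrast, can reuse each loaded operand across many FMACs, which is why it gets much closer to peak than DAXPY ever will.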
 
aaronspink said:
There is a difference between what is architecturally allowed and what is architecturally implemented. All available data says that no branch prediction was designed for the SPE. A snippet about branch prediction in an architectural specification has no impact on the inclusion of branch prediction in an actual design based on that specification.

Aaron Spink
speaking for myself inc.

Dynamic branch hints are possible, though, and an earlier patent mentioned something along those lines.
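For illustration: since the SPE has no hardware predictor, the compiler decides where to place hbr (hint-for-branch) instructions, and `__builtin_expect` is one way C code commonly steers such hints (spu-gcc supports it; the sketch below is plain GCC/Clang C, not SPE-specific, and the `unlikely` macro name is just a convention, not part of any toolchain):

```c
/* Sketch of steering a static branch hint from C. The hint tells the
 * compiler which way the branch usually goes, so the common path can be
 * laid out as the fall-through (and, on an SPE, hinted with hbr). */
#define unlikely(x) __builtin_expect(!!(x), 0)

int clamp_positive(int value)
{
    if (unlikely(value < 0))  /* hinted as the rarely-taken path */
        return 0;
    return value;             /* hot path stays the fall-through */
}
```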
 
Crazyace said:
DAXPY is really the worst case, not the best case. I can code higher-radix FFTs or matrix-matrix multiplies to give a much better ratio of FMACs to loads/stores.

However, I agree that peak 'FP' numbers are more PR than engineering - but everyone does it.

( I can't work out 115.2 for XB360, and I can't work out 218 for PS3 - but some PR monkey somewhere has :) )

Gah.. second attempt at posting because IE just plain doesn't work for me here anymore - as soon as I submit IE just eats 99% CPU and never comes back... Yes, I use IE... I know, I know...

Anyway.

My best "PR Hat" guess as to the source of the numbers is:

1 thread dual-issues a VMX fmad and an FPU fmad
1 thread issues another FPU fmad

That would give 12 ops per cycle (8 from the 4-wide VMX fmad, plus 2 each from the two scalar FPU fmads), making the rather large assumption that such scheduling is actually possible without some form of magic fairy dust.

Note my assumption of 2 FPUs... a bit bold, but it's the only way I can see to make the numbers work. Assuming 2 VMX units is excessive - it blows the numbers, and frankly it would probably blow the yields too if they had to find space for another one of those...

12 ops per cycle is what is required for both sets of numbers - multiply by cores and clock for X360 and add a sprinkling of SPE goodness for PS3.

12 * 3.2G = 38.4
38.4 * 3cores = 115.2

12 * 3.2G = 38.4
8 * 3.2G * 7 = 179.2
38.4 + 179.2 = 217.6 ("218-ish")

We should make marketing people write code if they're so clever....
 
Interesting - the KZ AI seems to be a very nice idea, but in the end it didn't turn out very well. Does anyone know why? By the way, it seems to be pretty expensive for the CPU, but also flop-friendly, am I correct?
 
pc999 said:
Interesting - the KZ AI seems to be a very nice idea, but in the end it didn't turn out very well. Does anyone know why? By the way, it seems to be pretty expensive for the CPU, but also flop-friendly, am I correct?

Sometimes a simpler approach produces better results in terms of the end-user experience. A popular example would be Doom 3, which had a complex 3D positional sound engine that was scrapped; instead a simple, yet effective, sound system was used.

AI is difficult because it is not human. We have had decades to think and test things... and we can do totally random, unthinkable things that destroy the AI experience!

Think of it like chess.

In chess there are predetermined moves, techniques, and patterns that can be used together to form tactics. Every piece has "rules", and in general effective strategies rely on following these patterns. We are at the point where a computer can beat the best chess players--as long as they follow the rules!

Imagine a chess match where you could move 2 pieces at once! :oops: Or jump all the way across the board, or make two moves in a row. Chess is a simple game, yet if you allowed humans to break the rules and do things the computer has not been trained to expect, it would get creamed and "look dumb".

That is what happens when you stick humans in a game. If it can be broken--they will break it!

We are talking about a 3D sphere with all kinds of variables. You add physics, guns, running, jumping, doors and halls, vehicles, etc... into the mix and you have hundreds of variables... and you have to code your AI to react "intelligently" to all kinds of scenarios.

So even a really impressive AI can look pretty bad. Broken AI sticks out like a sore thumb. It can be really good... until it breaks, and then you are cursing at it for being so horrible!

So in that regard, a lot of developers have stuck to simple scripting. It may not be as dynamic as a fluid situational-awareness AI, but on the other hand you can make it do what you want it to do, when and how you want. And yet it is pretty simple.
 
Killzone's AI was quite clever. The reason it appeared stupid, imo, is that the delay between updates was too great. It was burning up too much CPU time to update any more frequently, I suppose.
 
pc999 said:
Interesting - the KZ AI seems to be a very nice idea, but in the end it didn't turn out very well. Does anyone know why? By the way, it seems to be pretty expensive for the CPU, but also flop-friendly, am I correct?

The problem with game AI in general is that if the player can't understand what is happening, it just seems random, stupid or cheap.

The biggest challenge with complex AI is explaining it to the player in the context of playing the game. It's one of the reasons that the best AI does not equate with the best gameplay experience.
 
Bohdy said:
Killzone's AI was quite clever. The reason it appeared stupid, imo, is that the delay between updates was too great. It was burning up too much CPU time to update any more frequently, I suppose.
Fine, but the problem is it only takes a single flaw to negate any cleverness. I stopped playing Killzone when I rushed a building and started climbing some steps, only to find about 8 bad guys at the top. I wanted to kill them, but they refused to budge from the top of the stairs. And they wouldn't come after me either, because apparently the AI didn't have that logic.

In the end, it looked utterly stupid, all these guys at the top, me shooting potshots at them, and they just stood there...

.Sis
 
ERP said:
The problem with game AI in general is that if the player can't understand what is happening, it just seems random, stupid or cheap.

The biggest challenge with complex AI is explaining it to the player in the context of playing the game. It's one of the reasons that the best AI does not equate with the best gameplay experience.
Hasn't the KZ AI been lauded before? I never played the game, so I don't know the problems with it, but I know they have been praised before for the AI in the game. Looking at the ppt, it seems they are really trying to recreate good tactical combat. I wonder what they're planning for KZ3 on the PS3. I want to see what people say about the AI in KZ2 first though. Hopefully they remedied some of the problems gamers had with the original game. PEACE.
 
IIRC, Killzone was panned for its AI, not lauded. Was anyone here when those videos were released a few months before launch? The AI was dumb as dirt and it raised quite a ruckus.

BTW, if any of you find this topic interesting, you might find Bungie's recollection of Halo's AI interesting. They faced the problem of making intelligent and understandable behavior and solved it in Halo, IMO.
 