I am talking about the sort of code you might get in a role-playing game, where you write code which procedurally follows decision making. The tendency there is for the pure AI parts of the code to be interspersed with code that collects the information the AI needs to process, and with code that does whatever needs to be done as a result of the decision or status the AI code comes up with. Strip out everything else, and the AI decision code itself will actually be extremely compact.
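To make that split concrete, here is a minimal, hypothetical C++ sketch (the names are invented, not from any real engine): the information gathering and the acting happen around a small, pure decision function, and only that middle part is the "pure AI".

```cpp
// Hypothetical illustration of the split described above. The engine's
// world-query code fills in NpcSenses, the compact decision function runs
// over it, and the animation/pathfinding code acts on the result.
struct NpcSenses {
    float healthRatio;      // gathered by surrounding engine code
    float distanceToEnemy;
    bool  enemyVisible;
};

enum class NpcDecision { Idle, Attack, Flee };

// The "pure AI" core: a small function from plain data to a decision.
NpcDecision Decide(const NpcSenses& s) {
    if (!s.enemyVisible)           return NpcDecision::Idle;
    if (s.healthRatio < 0.25f)     return NpcDecision::Flee;
    if (s.distanceToEnemy < 10.0f) return NpcDecision::Attack;
    return NpcDecision::Idle;
}
// In typical procedural code the world queries and the acting code are
// interleaved around this logic inside one big per-NPC update function.
```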
You mean when there's a scripting language that the designer or a player can use to define behaviors? In that case, the script is the data that gets interpreted by the game engine.
That's the point. A scripting language is more accessible and requires less expertise for a character designer or people making player-generated content. Sometimes peak performance is traded off for greater accessibility.
Perhaps that is unimportant for most PS3 games, but it is important for a number of PC games.
The approach most programmers take to coding AI is to draw out a flow chart with decision boxes and convert that to code. The resulting code works well on a conventional processor like the PPE but not on SPEs: the SPEs can't fit all the code in local store, and if you split the code between the PPE and an SPE, the SPE sits idle most of the time because the AI processing is intermittent in the procedural code, whereas a PPE running everything wouldn't be idle.
Most programmers? I'd like some stats on that.
That's just an argument for multithreading and avoiding unnecessary synchronization, not Cell specifically.
A single x86 core can play the role of the PPE and use the other cores as SPEs. Their peak rates would be lower (assuming the data is SIMD-friendly, fits in LS, the access patterns work well with DMA, the threads are relatively straight-line, etc.), but then again you can hand off more complex operations to a full core - operations that would take additional conversion work to pass to an SPE.
Writing AI code for SPEs requires a different approach - a rules-based simulation approach rather than a flow-chart approach. That isn't to say you don't need procedural AI code; certain things are easier to do that way, and you have the PPE there, which can handle that type of code efficiently while the SPE can't.
The reality is that there are many approaches to creating AI. No single solution works everywhere and no single solution is necessarily better.
For approaches that only rely on processing power and can meet all those qualifications you bring up, Cell is likely to be best.
For approaches that for whatever reason fail to fit in the predefined box, a solution other than Cell is likely to be best.
Simple AI processing stacked in layers becomes very sophisticated AI - that's how our brains work, and the results aren't simple. The main difficulty is that programmers are taught to think procedurally - flow charts and decision boxes. True, some AI problems are difficult to think of except procedurally, but others - for example crowd behaviour - are much more natural and easier to implement in terms of simple rules governing how each person interacts with others and the environment. Also, it is not one or the other; you can use both.
The more rules you slather on, the more likely you get a mess. That problem has not been solved, and it is platform-independent.
What you posit is a very low-level view of the AI implementation. It necessitates rules-based methods because systems that are more complicated become unmanageable.
Methods that try to minimize this problem sometimes have trouble running well on Cell. They don't always have the same problems if asked to run on Kentsfield.
Not all of it is. The position of execution in the code also represents the results of previous decisions in the flow chart, and therefore encodes some of the AI information. Where the AI code is designed by mapping the AI status onto a flow chart, the natural tendency is to put the code that needs to be executed directly in the path of execution, rather than saving the status as data for another process to pick up.
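As a rough, hypothetical contrast (C++ sketch, nothing engine-specific): in the flow-chart style the action code sits in the execution path, while in the data-driven style the decision pass only emits records that another system picks up later.

```cpp
#include <vector>

// Flow-chart style (sketched as a comment): decision and action are fused,
// so part of the "AI state" is simply where we are in the code:
//   if (CanSeeEnemy(npc)) { PlayAlertAnim(npc); PathTo(npc, enemyPos); ... }

// Data-driven style: the decision pass only writes out records; a later
// action pass (possibly on another core or SPE-style worker) consumes them.
struct DecisionRecord {
    int   npcId;
    int   action;          // e.g. 0 = idle, 1 = attack, 2 = flee
    float targetX, targetY;
};

void DecisionPass(const std::vector<int>& npcIds,
                  std::vector<DecisionRecord>& out) {
    for (int id : npcIds) {
        // ...pure decision logic over pre-gathered data goes here...
        out.push_back({id, 1, 0.0f, 0.0f});
    }
}
// ActionPass(out) then plays animations, requests paths, etc., so no
// action code sits in the decision code's path of execution.
```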
That's once again an argument for multithreading and intelligently written code.
AI code is not put directly into a game engine's main loop. It is already partitioned based on that.
A decent simulation is trivially capable of similar optimizations for AI modules.
Unlike with the SPE, there isn't as strong a need for another bout of low-level optimization that not all of the dev team has training in.
Yes, but you can do a lot with simple building blocks.
I really like peanut butter on my sandwiches. It's so awesome when I use it there.
My car battery died, and I needed to get to work. Since the peanut butter worked on my sandwiches, I spread it on the engine. Then my car caught on fire, and since the peanut butter works so well on my toast, I tried to spread more on.
Perhaps other constraints make a low level rules-based approach awkward, excessive, or too time consuming.
http://www.frams.alife.pl/index.html
Layering very simple rules can produce results that are much more complex than a programmer has time to hand-code. Also, there is nothing to stop you using the two together, e.g. skewing the rules of herd migration according to a programmed set of parameters, or having the programmer set the positions of the pieces of an exploding "replicator" while letting the pieces wiggle around based on simple AI interaction rules.
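For instance, a herd could be run with a couple of local rules per animal plus a programmer-set migration direction that skews them - a minimal, hypothetical C++ sketch (parameter names invented for illustration):

```cpp
#include <vector>

struct Animal { float x, y, vx, vy; };

// Two simple local rules (cohesion, separation) layered with a scripted
// migration direction (migX, migY) that skews the whole herd.
void StepHerd(std::vector<Animal>& herd, float migX, float migY, float dt) {
    if (herd.empty()) return;
    for (auto& a : herd) {
        float cx = 0, cy = 0, sx = 0, sy = 0;
        for (const auto& b : herd) {
            float dx = b.x - a.x, dy = b.y - a.y;
            float d2 = dx * dx + dy * dy;
            cx += b.x; cy += b.y;                              // rule 1: cohesion
            if (d2 > 0 && d2 < 4.0f) { sx -= dx; sy -= dy; }   // rule 2: separation
        }
        cx = cx / herd.size() - a.x;
        cy = cy / herd.size() - a.y;
        a.vx += 0.01f * cx + 0.05f * sx + 0.02f * migX;        // layer the rules
        a.vy += 0.01f * cy + 0.05f * sy + 0.02f * migY;        // plus the scripted bias
        a.x += a.vx * dt;                                      // (updated in place for
        a.y += a.vy * dt;                                      //  brevity; a real sim
    }                                                          //  would double-buffer)
}
```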
I want my trooper to secure the ammo supply at the top of my base. Do I really need the system to go back to first principles every 1/60th of a second to make sure it's a right turn instead of a left?
But what is to stop you doubling the number of factors in their calculations with two pass AI processing?
How does that stop doubling the number of calculations?
A+B=C, D+E=F
If I put the first operation in pass one and the second in pass two, it's still two operations.
Multiple passes might reduce the burden of the amount of memory that is needed at any given instant, but it incurs a cost.
The act of separating the passes means there is extra setup work.
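In code, the trivial A+B=C, D+E=F case above might look like this (a hypothetical sketch): the same number of additions either way, but the two-pass version only needs part of the data resident at once and pays the loop/setup cost twice.

```cpp
#include <cstddef>
#include <vector>

// Single pass: both additions per element, all six arrays resident at once.
void OnePass(const std::vector<float>& A, const std::vector<float>& B,
             const std::vector<float>& D, const std::vector<float>& E,
             std::vector<float>& C, std::vector<float>& F) {
    for (std::size_t i = 0; i < A.size(); ++i) {
        C[i] = A[i] + B[i];   // A+B=C
        F[i] = D[i] + E[i];   // D+E=F
    }
}

// Two passes: the same two additions per element, but each pass only needs
// half of the inputs resident (or in a small local store), and the loop /
// DMA setup cost is paid twice.
void TwoPass(const std::vector<float>& A, const std::vector<float>& B,
             const std::vector<float>& D, const std::vector<float>& E,
             std::vector<float>& C, std::vector<float>& F) {
    for (std::size_t i = 0; i < A.size(); ++i) C[i] = A[i] + B[i];  // pass one
    for (std::size_t i = 0; i < D.size(); ++i) F[i] = D[i] + E[i];  // pass two
}
// Output vectors are assumed to be pre-sized by the caller.
```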
It may be obvious, but I am pointing it out because it answers the core of the argument regarding the Cell SPE vs a conventional big-cache CPU. If you are going to traverse code once, maybe with a bit of looping, then cache + CPU always wins - run it on the PPE. If you can fit the code and the data required for processing into the SPE, AND you are going to run the code hundreds or thousands of times and then unload it to release the SPE for other things, AND you don't have to wait for anything else to do it, then the SPE always wins.
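The shape of the SPE-friendly case would be something like this (a hypothetical plain C++ sketch, not real SPU code): a compact kernel plus a fixed-size block of agent data, run over the whole batch before the worker moves on.

```cpp
// Hypothetical SPE-friendly AI job: a tiny kernel plus a fixed-size block
// of agent data, run over hundreds of agents in one go.
struct AgentState {
    float posX, posY;
    float goalX, goalY;
    int   decision;        // 0 = idle, 1 = move toward goal
};

// Small enough (code + data) to fit in a ~256 KB local store.
void RunAiBatch(AgentState* agents, int count) {
    for (int i = 0; i < count; ++i) {
        float dx = agents[i].goalX - agents[i].posX;
        float dy = agents[i].goalY - agents[i].posY;
        agents[i].decision = (dx * dx + dy * dy < 1.0f) ? 0 : 1;
    }
}
// On Cell the batch would be DMA'd into local store, processed, and DMA'd
// back out; on a big-cache CPU the same loop simply streams through cache.
```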
Traversing code once would be more of a tie or an advantage to the SPE. Cache doesn't do quite as much for first and only access.
If instruction throughput is very important (it probably isn't most of the time), it is likely the x86 would win, since its instruction cache has half the latency and the core can decode more instructions at once.
I think you are focusing too much on the instruction stream as a source of issues.
Quite likely more than 2x for a lot of things. However, this non-compute-bound AI isn't compute intensive, so getting peak performance isn't a big deal, and for efficiency (i.e. you don't want to tie up an SPE processing intermittent tasks) you would do this on the PPE along with other tasks.
I only stated that such situations will exist, not that 2x was the only amount that can be expected.
I do like how you like to dump the work that 7 SPEs don't want to do on the poor PPE that can handle less than half of what a single Kentsfield core can do.
Certainly the computational effort might be better spent on something else. However, AI number crunching isn't expensive compared to graphics number crunching - at least not to the extent it is implemented in current games - and it is the least exploited area and the one where the biggest improvements are there to be made.
That's because modern AI is stupid, not that AI can't be compute-intensive.
The chip involved has little to do with this problem.
It's not that Cell isn't good for a good portion of AI problems, but the supposed advantages it has over a multicore x86 from an AI perspective are highly variable.