How could SPEs be used in console RPGs?

SPM said:
Just try programming the crowd behaviour algorithm for several hundred people using procedural AI and you will see just how much easier the neural network AI approach is for certain things. The procedural AI approach is of course simpler for overall game logic.

Well, for flocking behaviour, neural nets don't really enter into the equation at all. You could extend the behaviour of individuals in the crowd using neural nets or any other AI technique if you wanted, but the behaviours of cohesion, alignment, separation etc. don't really have anything to do with neural nets. NNs and the like really relate to the decision-making process, whereas I guess flocking behaviours relate more to motion. A separation behaviour may get you away from a certain entity in the crowd, but some other mechanism needs to be employed first to decide that that entity is dangerous, or whatever. The flocking algos in and of themselves don't really care how you came to that decision, which is what NNs and other mechanisms would be concerned with.
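Something like this minimal C++ sketch (radii and weights invented purely for illustration) is all the "AI" there is to a flocking rule:

struct Boid { float px, py, vx, vy; };

// One steering update for boid i against its n flockmates:
// separation (back away from very close neighbours), alignment
// (match the average velocity), cohesion (drift toward the centre).
void steer(Boid* b, int n, int i)
{
    float sx = 0, sy = 0, ax = 0, ay = 0, cx = 0, cy = 0;
    int nearby = 0;
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = b[j].px - b[i].px, dy = b[j].py - b[i].py;
        float d2 = dx * dx + dy * dy;
        if (d2 > 100.0f) continue;               // outside perception range
        ++nearby;
        if (d2 < 4.0f) { sx -= dx; sy -= dy; }   // separation
        ax += b[j].vx; ay += b[j].vy;            // alignment
        cx += b[j].px; cy += b[j].py;            // cohesion
    }
    if (nearby == 0) return;
    b[i].vx += 0.05f * sx + 0.02f * (ax / nearby - b[i].vx) + 0.01f * (cx / nearby - b[i].px);
    b[i].vy += 0.05f * sy + 0.02f * (ay / nearby - b[i].vy) + 0.01f * (cy / nearby - b[i].py);
}

Note there isn't a decision anywhere in there - deciding that a neighbour is dangerous would have to come from some other system entirely.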

SPM said:
Regarding the unpredictable subtleties that arise, isn't this the very definition of intelligence - for an object to use what experience it has gained from its environment and interactions to come up with actions which could not have been predicted by its programmer? Or to put this in Star Trek speak - for an artificial creation to go beyond its original programming.

Like I said, this might be desirable or it might not. Your AI may find ways to keep to your rules but short-cut them too, leaving you with undesirable behaviour. I remember my AI lecturer talking about robots someone was working on using NNs, IIRC. They were meant for vacuuming or something, but in some of their first attempts, the robots would just end up basically humping the power sockets :p
 
Sort of an aside, but I think a Nippon Ichi RPG with fully cel-shaded graphics like in that Mistwalker 360 game (not Lost Odyssey/Blue Dragon, the other one) would be pretty awesome.
 
Well, for flocking behaviour, neural nets don't really enter into the equation at all. You could extend the behaviour of individuals in the crowd using neural nets or any other AI technique if you wanted, but the behaviours of cohesion, alignment, separation etc. don't really have anything to do with neural nets. NNs and the like really relate to the decision-making process, whereas I guess flocking behaviours relate more to motion. A separation behaviour may get you away from a certain entity in the crowd, but some other mechanism needs to be employed first to decide that that entity is dangerous, or whatever.

This is of course not strictly speaking a neural network; the term "neural network type of AI" is used for want of a better word. Basically the program works in the same way as a neural network: a network of interacting objects, each of which behaves in accordance with a simple set of rules, producing a very complex and difficult-to-predict behaviour pattern as a result.

Like I said, this might be desirable or it might not. Your AI may find ways to keep to your rules but short-cut them too, leaving you with undesirable behaviour. I remember my AI lecturer talking about robots someone was working on using NNs, IIRC. They were meant for vacuuming or something, but in some of their first attempts, the robots would just end up basically humping the power sockets

I am not disagreeing with you, but this will allow "intelligent" behaviours in RPGs that are unpredictable and unrepeatable, as well as allowing some very complex behaviours to be programmed in a simpler way than would be possible by hardcoding them in procedural AI. Maybe these are not desirable in robot vacuum cleaners, but they can add another dimension of immersion to RPGs.
 
SPM said:
This is of course not strictly speaking a neural network; the term "neural network type of AI" is used for want of a better word. Basically the program works in the same way as a neural network: a network of interacting objects, each of which behaves in accordance with a simple set of rules, producing a very complex and difficult-to-predict behaviour pattern as a result.

A neural net isn't about a network of entities in a world, but rather about the decision-making mechanism behind one or more of them. It may of course scale to more entities better, and deal with n-way interaction better than procedural stuff, potentially, but a NN may be used to drive one individual, or just one part of that individual.

I get what you're saying about bottom-up versus top-down AI, though. I'm not sure how much the former has been used in games yet, or how successful it has been to date.
 
Titanio said:
A neural net isn't about a network of entities in a world, but rather about the decision-making mechanism behind one or more of them. It may of course scale to more entities better, and deal with n-way interaction better than procedural stuff, potentially, but a NN may be used to drive one individual, or just one part of that individual.

Yeah, but there is no word for the "interacting object based AI" that I am trying to describe. The programming technique, in terms of processing algorithms, that would be used to simulate a neural network on a computer would be similar (i.e. treating the logic states as tables of boolean data), which is why I used the term.
 
SPM said:
Regarding the unpredictable subtleties that arise, isn't this the very definition of intelligence - for an object to use what experience it has gained from its environment and interactions to come up with actions which could not have been predicted by its programmer? Or to put this in Star Trek speak - for an artificial creation to go beyond its original programming.

This may make for a good research project, but gamers by and large play games to have fun and to win. This usually translates to easy and predictable opponents. Make the computer player too good or too unpredictable and people will complain that the game is too hard or broken.
 
MrWibble said:
OK, well, if we rephrase the question to be less PS3-centric (it almost reads like another "SPUs aren't general purpose/useful" thread at the moment), and simply ask:

"How could RPGs be improved with more processing power?"

I made it PS3-centric because there's been talk about the SPEs in Cell being able to double as GPU shaders. I think that feature is fairly PS3-specific.
I was wondering if it would be possible to use idle SPEs to augment RSX rendering.

MrWibble said:
Yeah, what he said - if you don't *need* the power, why go to the trouble of contriving a use for it just to give yourself more work?

It's like saying "what can I get the SPUs to do in my 2D coloured-block puzzle game?"

Modern turn-based RPGs aren't like 2D block puzzles, though. You may not need the CPU power, but you could use the added rendering power.
 
SPM said:
Why do you think more complex AI would not be beneficial to RPGs? I think that improved AI will be the most important advance in RPGs in future, rather than graphics or physics improvements.

Current games use primitive, code-branch-based, top-down pseudo-AI because of the procedural, single-threaded nature of conventional CPUs. The AI is simple and predictable because it consists simply of traversing a fixed set of states (mirrored in code branches) depending on the decisions taken. It is predictable and has a limited number of states because every state has to be programmed in explicitly by the programmer. It is also limited because it is inefficient: the logic state is stored in the program counter, requires long branches (forcing the instruction cache to be flushed), and hogs a whole execution thread. It is pseudo-AI because it isn't really AI at all, just something that has the appearance of intelligence. It is comparable to the LISP-based chat programs of the '80s, which would take the text you typed in and try to give a response by identifying the grammar and switching the verbs and word order to pose a question. It seems intelligent for a very short time and then becomes infuriating because of its limited response capability.

There is another way of doing AI, one which the SPE is ideally suited to: neural-network type AI. Examples of this type of AI have recently been demonstrated in the crowd behaviour demos.

Neural network type AI involves treating the logic status as data in a logic table and processing it as data with boolean operations rather than representing it by the position of the program counter. This has a number of advantages. You are not limited to one logic status per execution thread as you are with procedural AI - you can have thousands of logic states by creating thousands of logic status datasets each associated with a different object.

You can process the logic datasets using vector processing to some extent by using branchless boolean algorithms. You can load and stream-process the logic states of objects in turn, or distribute them to a number of SPEs to process them in parallel. Branches are short and can be limited to a large extent by using branchless algorithms that store the results of what would otherwise be branches as boolean flags in the logic tables for later action.

The SPE is perfect for this type of AI - it could have been built with it in mind. Also, for complex AI, neural network type AI is much easier to program, because the programmer just has to program the (relatively simple) rules for each object and then turn them loose to interact with the environment and with each other in complex ways, rather than having to program explicitly for all outcomes in the entire environment.

Unlike procedural AI, this is real intelligence. Take the crowd behaviour demo, for example: the programmer has defined how each person in the crowd should behave, and he/she can put each person in a particular place, but unlike procedural AI there is no way he/she can predict where they will end up - the darn things have a mind of their own, and there is an almost infinite number of ways things can end up.

What neural network type AI really is, is a realtime simulation of a complex network of interacting objects based on rules specified for the objects by the programmer. You may of course have more than one AI object per physical object. For example, the human mind may have an emotion AI object, a conscious AI object, a sub-conscious AI object, a desire AI object, and a conscience AI object, all interacting with each other and with the environment. Another possibility is interaction of AI objects over the Internet in a truly massive environment.

In Cell, you would probably run procedural AI on the PPE to do overall game AI, but the SPEs would handle the AI of individual objects in the environment.

As far as RPGs are concerned, instead of boring and predictable navigation through a finite set of possibilities, the neural network type of AI could allow far more immersive and complex RPGs than we have now. It would drive the "cheat" bloggers mad, though.

There were some old discussions on this in the IBM Cell forums:
http://www-128.ibm.com/developerworks/forums/dw_thread.jsp?forum=739&thread=93069&cat=46
http://www-128.ibm.com/developerworks/forums/dw_thread.jsp?forum=739&thread=103263&cat=46

The problem with neural nets is that during "learning" they basically construct some function F(S) -> O, where S is some set of input states and O is some output state.

The problem is that in real AI problems S and O are actually hard to describe, and in some cases it's difficult to measure the quality of O for learning. Changing S or O requires relearning from nothing.

They are ideally suited to well-defined problems like steering (as in the crowd demo), fighting games, and probably turn-based games, where closed-form solutions are not obvious.

Having said that, more often than not game AI in the good turn-based RPGs isn't about being smart - it's more about presenting a puzzle that can be solved, and simple state machines are hard to beat in that context.
 
A good game will at least have various weightings, so that if the player is melee-strong it will select a Party Weakness skill over a one-on-one attack, or if they're resistant to Fire, attack with lightning spells. Though IIRC FFX creatures just attacked with a standard pattern - zero intelligence! Which was probably good for the gameplay, because you could adapt your choices. If they adapted to you, you'd get whipped. Like most games, the computer has to be dumb, because the moment it has any sense, the fact it outnumbers you 50:1 and a quick 'surround and attack from all sides'...
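The sort of weighting I mean could be as dumb as this C++ sketch (the scores and situations are invented for illustration):

enum Action { SingleAttack, PartyWeakness, FireSpell, LightningSpell };
struct PlayerInfo { bool meleeStrong; bool fireResistant; };

// Score each option against the player's build and pick the best.
Action chooseAction(const PlayerInfo& p)
{
    int score[4] = { 50, 10, 40, 40 };                 // baseline utilities
    if (p.meleeStrong)   score[PartyWeakness] = 80;    // punish melee parties
    if (p.fireResistant) { score[FireSpell] = 5; score[LightningSpell] = 70; }
    int best = 0;
    for (int i = 1; i < 4; ++i)
        if (score[i] > score[best]) best = i;
    return static_cast<Action>(best);
}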
 
I bet that at first they'll just go all out on rendering. Maybe compute radiosity on the CELL, then paste that onto the scene from the RSX.
 
SPM said:
Current games use primitive, code-branch-based, top-down pseudo-AI because of the procedural, single-threaded nature of conventional CPUs.

That's not entirely the reason this is used most often. The real reason we have crummy AI is that it's incredibly hard to get right and fast.

The speed problem is in part due to how computer processors work, but not even a highly threaded architecture can really match the exponential increase in complexity AI involves.

Procedural branch-based AI is inflexible, but it is very easy (compared to other methods) to understand, create, and debug.
More "organic" ways of implementing AI are usually way more complex and much more prone to failure.


The AI is simple and predictable because it consists simply of traversing a fixed set of states (mirrored in code branches) depending on the decisions taken. It is predictable and has a limited number of states because every state has to be programmed in explicitly by the programmer.

If the machine runs off of transistors within hard-wired functional units, this branching behavior is unavoidable. The states in software exist because that's what electronics work with. As long as algorithms are the basis of what we use to express AI, branches will never go away.

The holy grail of course is an AI that can make things up as it goes along. Unfortunately for the developer, if it does go wrong, good luck fixing it. AI guys have been working on this since the '70s with no real robust solution.

It is also limited because it is inefficient: the logic state is stored in the program counter, requires long branches (forcing the instruction cache to be flushed), and hogs a whole execution thread.

Logic state exists in the data structures within the simulation, and depending on the implementation it can change even when the program counter exists within a narrow set of values (like a loop that changes the entire simulation). The program counter has little to do with the logic state of the AI, it just tells the processor where it is in the instruction stream.

The biggest limitation to AI isn't branch inefficiency, it's the size of the problem and the lack of robust algorithms for all but the simplest cases.

There is another way of doing AI, one which the SPE is ideally suited to: neural-network type AI. Examples of this type of AI have recently been demonstrated in the crowd behaviour demos.

The SPE is a single-threaded processor that runs on compiled procedural code. What it has as an advantage is an improved way of accessing memory and communicating with other SPEs through CELL's internal interconnect.

It still has a program counter, and it still does things through branches and locally stored instructions.
If it is ideally suited for neural networks, then so is an old Pentium.

Neural network type AI involves treating the logic status as data in a logic table and processing it as data with boolean operations rather than representing it by the position of the program counter.

Don't fixate on the program counter, it has almost nothing to do with the logic state of the AI in question.

This has a number of advantages. You are not limited to one logic status per execution thread as you are with procedural AI - you can have thousands of logic states by creating thousands of logic status datasets each associated with a different object.

There are thousands of logic states in single-threaded code. There's a difference between the conceptual state of a system and the state of the hardware that is running the simulation.

You can process the logic datasets using vector processing to some extent by using branchless boolean algorithms. You can load and stream-process the logic states of objects in turn, or distribute them to a number of SPEs to process them in parallel. Branches are short and can be limited to a large extent by using branchless algorithms that store the results of what would otherwise be branches as boolean flags in the logic tables for later action.

That really hasn't changed anything. Conditionals are still being evaluated in the software, you've only pushed complexity elsewhere in the system.

Any other processor would run this AI just the same as an SPE. The algorithm is the same, so the behavior is the same on universally capable machines.
What you get is silicon that is simulating a neural network, regardless if the machine simulating it is an SPE or a SPARC.

Using multiple SPEs is nothing that will revolutionize AI, we've had multiprocessors for nearly as long as we've had processors. It hasn't really made the computationally wasteful neural simulation any more viable.

The SPE is perfect for this type of AI - it could have been built with it in mind. Also, for complex AI, neural network type AI is much easier to program, because the programmer just has to program the (relatively simple) rules for each object and then turn them loose to interact with the environment and with each other in complex ways, rather than having to program explicitly for all outcomes in the entire environment.

A neural network is a collection of nodes that are connected to each other by varying numbers of links, each link being given a weight. Depending on the inputs being fed into a node, it will output a positive or negative result (in the simplest form) to other nodes it is connected to.
Over time, a neural network that is given training inputs begins to assign new weights to each link, with desirable results reinforcing the links that make the right decisions.

There are almost no rules in a neural network, save in the form of the boolean function each node uses to evaluate its input and the weight at each link.
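In its simplest, perceptron-like form, a node is nothing more than this (a C++ sketch assuming a hard threshold activation):

#include <vector>

// A single node: the weighted sum of its inputs against a threshold.
// Training adjusts the weights; this function is the only "rule".
int nodeOutput(const std::vector<float>& inputs,
               const std::vector<float>& weights, float threshold)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += inputs[i] * weights[i];
    return sum > threshold ? 1 : -1;   // positive or negative result
}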

What you are describing is emergent behavior, which isn't a neural network. It is a way of getting behavior from a collection of independent agents based on how they interact.

Nodes in a neural network are not independent, and they do not act. They all work towards producing an output. Some other routine must then be made to interpret its output and use it to change the overall simulation state.

It is quite possible that the agents in a flock can use neural networks in their decision making, but this is conceptually different.


Unlike procedural AI, this is real intelligence. Take the crowd behaviour demo, for example: the programmer has defined how each person in the crowd should behave, and he/she can put each person in a particular place, but unlike procedural AI there is no way he/she can predict where they will end up - the darn things have a mind of their own, and there is an almost infinite number of ways things can end up.

Unless a hardware random number generator and some kind of Monte Carlo simulation is put in each agent, the same starting conditions will lead to the same ending conditions every single time.

It's still an algorithm, and those are inherently deterministic.

What neural network type AI really is, is a realtime simulation of a complex network of interacting objects based on rules specified for the objects by the programmer.

That is emergent behavior, but not necessarily from a neural network.

In Cell, you would probably run procedural AI on the PPE to do overall game AI, but the SPEs would handle the AI of individual objects in the environment.

Probably, but caution should be taken in how the term AI is being used. Is the SPE being tasked with making critical decisions or just updating another element of the simulation?

As far as RPGs are concerned, instead of boring and predictable navigation through a finite set of possibilities, the neural network type of AI could allow far more immersive and complex RPGs than we have now. It would drive the "cheat" bloggers mad, though.

Boring and predictable, or sane and understandable?
There is no guarantee that a neural net will output an expected or sensical result. Not unless a lot of effort is put into its construction and a lot of restraint is used.
Defining sensical in algorithmic terms may be impossible.

Then there's the problem that people don't want perfect AI, they want AI that meets their expectations. A perfect AI is one that will never lose.

A few problems for AI:

1) There is no hard and fast definition of AI; a lot of AI experts don't think "true" AI is even possible
2) The difference between number crunching and decision making
a) Silicon is great at the first part, and it needs to be. A lot of number crunching goes into creating the illusion that there is decision making going on.
b) Tactical, short-term, close-proximity decisions can fall into number crunching.
c) Strategic and wide-ranging decisions (some of that "true" AI falls in here) are much harder and vastly more complex. For relatively "simple" games like chess, strategic thinking means breaking down a game board into a vast collection of decision states. The addition of a few more rules or a few more spaces leads to a gargantuan increase in complexity. This is number crunching, and it really suffers here.
Heaven help the AI if there's a time constraint.
d) The unexpected - for the developer, the player, and the software alike. The more complex a game or environment, the more that little things upset the decision-making.
3) Development. Fancy AIs are hard, and when they fail, they will fail in mysterious ways.
4) Complexity. The big problem is that while Cell can be tens of times faster than other processors, even a slight increase in problem size or complexity will need a chip 100 times stronger than Cell.
5) Return on investment. So you spend two years on an awesome learning AI for your RPG, but some guy spends three months scripting a similar game and for the most part nobody notices the difference. Worse, on rare occasions your complex AI goes Cujo on its own party, and you don't know why. Maybe you could fix it if you had another ten years...


Don't expect any majorly huge advances in AI any time soon. Cell is still silicon, and it still has to fake the same things other chips fake. It's just slightly better at it.
 
3dilettante said:
Don't expect any majorly huge advances in AI any time soon. Cell is still silicon, and it still has to fake the same things other chips fake. It's just slightly better at it.


I wonder if good middleware will appear so we get good AI.

There are these guys: http://spirops.com/ - from their PDFs it seems it could be a great solution (at least to help production), although I do wonder how the hell they'll keep the number of cases down.

Anyway, I think (or at least hope) that devs overall will be more careful with AI this gen. If things like UE3's AI are close to what they say, people will want more and better.
 
3dilettante said:
Procedural branch-based AI is inflexible, but it is very easy (compared to other methods) to understand, create, and debug. More "organic" ways of implementing AI are usually way more complex and much more prone to failure.

There are thousands of logic states in single-threaded code. There's a difference between the conceptual state of a system and the state of the hardware that is running the simulation.

The problem with procedural AI is that the programmer has to explicitly define every logic state. It is easy for simple things like the game logic for which responses are required to make doors open as you navigate a fixed maze, but other AI problems, like the herd/crowd behaviour simulation, are much simpler to program by defining the logic state in an implicit way. For example, you define a simple set of rules for how each person/chicken interacts with its neighbours, rather than attempting to hard-code a formula that realistically covers every possible logic state combination for the thousands of people/chicken objects in the environment - near impossible, given that the number of possible combinations and interactions between thousands of objects is immense. Just try programming the herd/crowd behaviour simulation as single-threaded procedural AI code, and you will find it requires an immense amount of programming effort.

This implicit method of defining the logic rules, or bottom-up AI, is the same approach used by neural nets - although in neural nets the rules are even simpler.

If the machine runs off of transistors within hard-wired functional units, this branching behavior is unavoidable. The states in software exist because that's what electronics work with. As long as algorithms are the basis of what we use to express AI, branches will never go away.

Logic state exists in the data structures within the simulation, and depending on the implementation it can change even when the program counter exists within a narrow set of values (like a loop that changes the entire simulation). The program counter has little to do with the logic state of the AI, it just tells the processor where it is in the instruction stream.

The SPE is a single-threaded processor that runs on compiled procedural code. What it has as an advantage is an improved way of accessing memory and communicating with other SPEs through CELL's internal interconnect.

It still has a program counter, and it still does things through branches and locally stored instructions.
If it is ideally suited for neural networks, then so is an old Pentium.

Don't fixate on the program counter, it has almost nothing to do with the logic state of the AI in question.

That really hasn't changed anything. Conditionals are still being evaluated in the software, you've only pushed complexity elsewhere in the system.

You are thinking with a procedural AI mindset. You have to forget the procedural AI mindset and think openly before you can appreciate bottom-up AI. Let me give a simple practical example of the difference between the procedural AI approach, where the logic state is stored in the program counter, and the bottom-up object AI approach, where the logic state is stored as data, and the consequences for performance. Consider a decision which is part of an AI routine:

Procedural AI with branching:

If A > B then goto xxxx
If A < B then goto yyyy
C = 0
store C

xxxx: C = true
store C

yyyy: C = false
store C

How do we know what the result of our test is? Answer: where the program branches and the content of the stack tell you - i.e. the logic state is stored in the program counter. More complex logic states, in which more than one test value is stored on the program counter, are possible: for example, if you call a function or subroutine at the branch, you store the current logic status (which represents the results of all the compounded branch decisions that brought execution to the current point) as the return address on the stack, and the latest test result as the address of the new branch in the program counter. If you want to store the results in a logic table, you add code to store them in the appropriate branch.

Branchless AI treating results as data:

C = A - B
sign extend C
store C

How do we know what the results are? The value of C tells us. The program counter doesn't.
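In C-like code the contrast looks like this (illustrative only):

// Branching version: the outcome lives in which path executes.
int compareBranching(int a, int b)
{
    if (a > b) return 1;
    if (a < b) return -1;
    return 0;
}

// Branchless version: the outcome lives in the data itself, and no
// control flow ever depends on it.
int compareBranchless(int a, int b)
{
    return (a > b) - (a < b);   // -1, 0 or +1, stored as plain data
}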


What is the difference between the two?
1) There are no branches in the branchless AI method. Branching is expensive - particularly the long branches which are common in AI code, because the cache gets flushed, and particularly subroutine and function calls (which carry additional overhead) and are often required for navigating complex hierarchical AI decision trees. These inefficiencies apply to all processors, but are even more true for the SPE.
2) The procedural AI method hogs an entire execution thread, meaning that the code has to be re-executed in a different object context for every object that has to be processed. The branchless AI code, on the other hand, can be executed on the SPE on 4 objects simultaneously using SIMD, because it treats the results as data - even if the results for each of the 4 objects are different. Procedural AI, however, can only handle one object at a time per thread, because it only has one program counter and results for different objects may branch different ways. The branchless AI code can go even further: if the test being performed is a set of boolean bitwise operations on flags, and a 64-bit word is used to store the flag values of 64 objects, then 64 x 4 = 256 objects can be processed in parallel.
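The bit-packed part of that might look like this in C++ (the flag names are invented; one bit per object, 64 objects per word):

#include <stdint.h>

// Evaluate "hungry AND sees food AND NOT fleeing" for 64 objects
// at once, in three bitwise operations and no branches at all.
uint64_t wantsToEat(uint64_t hungry, uint64_t seesFood, uint64_t fleeing)
{
    return hungry & seesFood & ~fleeing;
}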

The question is how far branchless and parallel AI code can or should be applied in practical situations. The answer is: whenever possible - and it is possible, and beneficial, in a large number of circumstances.

Mathematically speaking, anything that can be achieved by testing and branching can be converted into a set of inline boolean and arithmetic operations applied to data stored in tables. Running AI computations in parallel is possible with branchless operations in inline code and short loops. The inline code may be slightly longer, but it will run a lot faster. In practice some limited branching between batches of inline code will probably be required (more because the procedural mindset of microprocessor designers ensures that more attention is paid to test-and-branch instructions than to test-and-store instructions, than for any other reason).

In procedural code, branching at some point is required in order to execute code that displays or outputs something which acts on the results. However, this is not the case in event-driven code, which makes up much of the computing workload in games. In the case of overall game logic (for example, where you select the level you want to attempt, or determine your position in a fixed maze), which is basically procedural and single-threaded, it makes sense to use procedural AI and run it on the PPE, which is good at doing long branches.

In the case of object AI (as with the crowd objects in the crowd behaviour simulation), the drawing of the object is event-driven. There is no need for the AI code to act on the results by running code at the end of a branch. All it needs to do is update the data associated with the object, so the geometry program will later update the geometry and send it for rendering. In practice there will be some branching required in the algorithms that update the object data, but this can be done in such a way as to minimise branches, eliminate long branches, and maximise the potential for vector processing. For example, the AI and the updating of the AI logic tables might be done on 4 objects at a time using SIMD with mainly branchless code on one SPE, and the updating of the geometry parameters associated with each object may be done later, on an object-by-object basis, in parallel on a number of SPEs by the geometry program. Branching will be required as with all code, but it will be the short, efficient branching the SPE is good at, not the cache-flushing, parallel-processing-killing long branching and thread-hogging of procedural AI.
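In outline, the split I mean looks like this (types and names invented for illustration):

#include <stdint.h>

struct AgentAI   { float targetX, targetY; uint32_t flags; };  // AI results stored as data
struct AgentAnim { float x, y; };

// AI pass: update the decisions as plain data - table in, table out.
// (The branch-light per-agent rules would go here.)
void aiPass(AgentAI* ai, int n);

// Animation pass: act on the stored AI results, object by object.
void animPass(const AgentAI* ai, AgentAnim* anim, int n, float dt)
{
    for (int i = 0; i < n; ++i) {
        anim[i].x += (ai[i].targetX - anim[i].x) * dt;
        anim[i].y += (ai[i].targetY - anim[i].y) * dt;
    }
}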

Any other processor would run this AI just the same as an SPE. The algorithm is the same, so the behavior is the same on universally capable machines.
What you get is silicon that is simulating a neural network, regardless if the machine simulating it is an SPE or a SPARC.

Using multiple SPEs is nothing that will revolutionize AI, we've had multiprocessors for nearly as long as we've had processors. It hasn't really made the computationally wasteful neural simulation any more viable.

Bottom-up AI, which treats logic states as data in order to minimise branching and allow parallel processing of AI operations, is beneficial on all CPUs; however, it is essential on Cell, because the SPEs are poor at the long branches typical of procedural AI, and because it maximises the potential for parallel processing.

The biggest limitation to AI isn't branch inefficiency, it's the size of the problem and the lack of robust algorithms for all but the simplest cases.

Branch inefficiency and the single-threaded nature of the procedural AI used in games so far haven't been a problem for the limited level of AI we have had so far. However, they do limit what is possible in terms of making the AI more complex and believable, i.e. raising the level of intelligence.

The holy grail of course is an AI that can make things up as it goes along. Unfortunately for the developer, if it does go wrong, good luck fixing it. AI guys have been working on this since the '70s with no real robust solution.

A neural network is a collection of nodes that are connected to each other by varying numbers of links, each link being given a weight. Depending on the inputs being fed into a node, it will output a positive or negative result (in the simplest form) to other nodes it is connected to.
Over time, a neural network that is given training inputs begins to assign new weights to each link, with desirable results reinforcing the links that make the right decisions.

There are almost no rules in a neural network, save in the form of the boolean function each node uses to evaluate its input and the weight at each link.

What you are describing is emergent behavior, which isn't a neural network. It is a way of getting behavior from a collection of independent agents based on how they interact.

Nodes in a neural network are not independent, and they do not act. They all work towards producing an output. Some other routine must then be made to interpret its output and use it to change the overall simulation state.

It is quite possible that the agents in a flock can use neural networks in their decision making, but this is conceptually different.

I referred to neural networks in the sense of programming methods that simulate large numbers of objects which interact according to a set of simple rules and are processed in parallel as data objects - the same concept I am referring to here, and a completely different approach from the current procedural AI, which is based on the branching history of a single thread.

The problem with neural nets, in my mind, is that researchers are too fixated on mimicking the human brain. The predefined rules in neural networks are so simple that it takes a considerable amount of carefully directed training to achieve anything at all. Human brains have had millions of years of evolution to develop and test the basic rules, and still have to be trained in a real-life environment for about 3 years before they can do anything useful at all. Who would want to do that for a smart vacuum cleaner?

I am not suggesting that the rules used by neural network researchers be used in games, merely that the basic technique of distributed, non-procedural, rule-defined AI processing can be applied. There is no reason why more complex rules can't be used in neural-net type simulations - it is just a question of granularity. A set of such rules can be applied to a whole object, such as a person in a crowd simulation, or, as in my example, a single human brain can be simulated by a number of objects each governed by a different set of rules. The latter is a very coarse equivalent of a neural net; the former is implemented in the same non-procedural manner as a neural net.


Unless a hardware random number generator and some kind of Monte Carlo simulation is put in each agent, the same starting conditions will lead to the same ending conditions every single time.

It's still an algorithm, and those are inherently deterministic.

That is emergent behavior, but not necessarily from a neural network.

Probably, but caution should be taken in how the term AI is being used. Is the SPE being tasked with making critical decisions or just updating another element of the simulation?

Boring and predictable, or sane and understandable?
There is no guarantee that a neural net will output an expected or sensical result. Not unless a lot of effort is put into its construction and a lot of restraint is used.
Defining sensical in algorithmic terms may be impossible.

I am not suggesting for a moment that a new artificial life form is about to crawl out of your games console. It is simply a question of giving characters in a game the impression of being intelligent. A character which behaves in a more complex but rational (i.e. non-random) manner has the appearance of greater intelligence. Compare the behaviour of the herd/crowd simulation with a procedurally programmed algorithm that attempts to do the same thing. The objects respond to the environment in a way that makes them seem to have a mind of their own, rather than as if someone has programmed them to respond in a particular way.

As for being deterministic, the same can be said for the weather system of the world or the behaviour of human beings. They are determined by physical laws and so they should be utterly predictable. They aren't because they are immensely complex and so can't be predicted. The same is true for the herd/crowd simulation - remember that each object can occupy millions of possible positions and each can interact with every other object. Also if the object interacts with the user, a millisecond difference in user response may mean that objects end up in completely different locations. Try picking up the herding simulation where the herd interacts in the same way with the user, start at the same starting position, and see if you can end up twice with all the herd in the same positions - it is pretty well impossible.

Then there's the problem that people don't want perfect AI, they want AI that meets their expectations. A perfect AI is one that will never lose.

I am just talking about my preferences here. I find predictable RPGs extremely boring. Sure, it should not be completely random - you need to be able to learn how to tackle things better - but the outcome should not be exactly the same every time, and the differences should be complex but follow something that can be learned. This is where AI comes in: complex results that act according to a logical set of rules which can be deduced (as opposed to a simple set of rules plus a random number generator, which is either random or easy to predict). As for making the game harder, it doesn't have to be: you can make it harder by increasing the AI, but compensate by setting the "hardness parameters" in the game to make your opponents easier to beat.

There is no hard and fast definition of AI; a lot of AI experts don't think "true" AI is even possible

True, and what has been called AI so far isn't really intelligence. As for whether true "AI" is possible - unless you consider human beings not to be intelligent, it has already been done. Just because humans in their current state of intelligence haven't done it, or can't, doesn't mean it can't be done.
 
SPM said:
The problem with procedural AI is that the programmer has to explicitly define every logic state.

Not necessarily, it depends heavily on how the simulation is set up. A minor agent within the larger game world can work from a much smaller set of data.

There's a difference between procedural AI and procedural code. One is a higher-level type of decision making for determining the future state of a simulation.

The other is a programming paradigm where the program is composed of various procedures or methods that can be called at any point in a program. It is often preferable over sequential code because it allows for much better maintainability and structure.

Just try programming the herd/crowd behaviour simulation as single-threaded procedural AI code, and you will find it requires an immense amount of programming effort.

Not much harder in the long run, since the problem being modeled is a very simplistic stimulus-response behavior. Any procedural rules for the behavior of each individual within the herd would degenerate into simple rules.

Whether it's single-threaded is beside the point. A single-threaded versus a multi-threaded implementation makes no difference to the simulation, and procedural code is not inherently single-threaded. Procedural or not, the AI will take a bit more work if it is multithreaded.

Let me give a simple practical example of the difference between the procedural AI approach, where the logic state is stored in the program counter, and the bottom-up object AI approach, where the logic state is stored as data, and the consequences for performance. Consider a decision which is part of an AI routine:

Procedural AI with branching:

If A > B then goto xxxx
If A < B then goto yyyy
C = 0
store C

xxxx: C = true
store C

yyyy: C = false
store C

How do we know what the result of our test is? Answer: where the program branches and the content of the stack tell you - i.e. the logic state is stored in the program counter.

This example is simplistic and out of context, and depending on what state you are talking about, it may or may not be relevant to the program counter.

From the point of view of the hardware context during the execution of the program, yes, there's some information in the counter. If the AI were free to interrupt the core while it was running itself, then the counter would matter. Since it can't, or at the very least shouldn't if meaningful output is desired, it doesn't matter.
In theory, an improperly threaded AI could do something like that, and nothing but garbage would result.

From the logical state of the AI simulation, it looks like this:

Simulation time slice 1 C = whatever it was
*simulation update: everything happens here*
Simulation time slice 2 C = whatever it turns out to be

More complex logic states, in which more than one test value is stored on the program counter, are possible: for example, if you call a function or subroutine at the branch, you store the current logic status (which represents the results of all the compounded branch decisions that brought execution to the current point) as the return address on the stack, and the latest test result as the address of the new branch in the program counter. If you want to store the results in a logic table, you add code to store them in the appropriate branch.
If you want to do anything with C, there's going to be a branch or a select statement set up anyway.
In this easy 2-way situation, a select statement would work well.
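For instance (C++; a compiler will typically turn this into a conditional move on x86, or a selb on the SPE, rather than a jump):

// Two-way select: both candidate values exist up front, one is kept,
// and no branch is taken.
int select2(int a, int b, int ifGreater, int otherwise)
{
    return (a > b) ? ifGreater : otherwise;
}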

Branchless AI treating results as data:

C = A - B
sign extend C
store C

This isn't an AI concept. It's a code optimization within a procedure for a very specific (and simplistic) relationship between only two data elements.
If the relationship between the data values is that simplistic, then the use of that really bad branch example would be a sign of a coding oversight, not a limitation of procedure. Then again, a good optimizing compiler might be able to do that transformation anyway.

It has nothing to do with the conceptual layout of the AI.

How do we know what the results are? The value of C tells us. The program counter doesn't.

What about later when something is actually done with the value C? Odds are, program flow will not be identical, in which case the counter will reflect the difference.

What is the difference between the two?
1) There are no branches in the branchless AI method.
That's because you selected the sliver of code that had no branches in the program. No decision had been made, the program was going to do its own thing regardless of what C was.

It's just a meaninglessly twitching set of bits. In such a case, yes, a sign extension is better than a branch when the result isn't used for anything.
In real life, a branch would immediately follow that sign extension, because the code would then have to do something with the value of C.

2) The procedural AI method hogs an entire execution thread, meaning that the code has to be re-executed in a different object context for every object that has to be processed.

This has more to do with the data layout of the simulation. A procedural program can do the same thing with a properly made procedure. Actually, your branchless AI code snippet would probably be found in the middle of a procedure.

Of course, this is only from the standpoint of a procedural program, which, as has already been covered, is entirely unrelated to procedural AI.

The branchless AI code, on the other hand, can be executed on the SPE on 4 objects simultaneously using SIMD, because it treats the results as data - even if the results for each of the 4 objects are different.

This is a low-level hardware feature the SPE offers that the assembly code can utilize. A procedural program could do this as well, and it probably would if the simulated behavior is simple enough.

Procedural AI, however, can only handle one object at a time per thread, because it only has one program counter and results for different objects may branch different ways.

Why?

Watch this:

single version:

load A
load B
UpdateOneObject(A, B)

quad version:
load packed A
load packed B
UpdateFourObjectsUsingSIMD (A, B)

The data layout within the simulation is the limiting factor, not the programming paradigm or AI philosophy.
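Concretely, the difference is memory layout, not AI philosophy (a C++ sketch):

// Array-of-structures: one object's fields sit together, so loading
// four objects' "health" into a SIMD register means gathering.
struct AgentAoS { float health, speed, fear, hunger; };
AgentAoS agentsAoS[1024];

// Structure-of-arrays: the same field of many objects is contiguous,
// so four health values load into one SIMD register directly.
struct AgentsSoA {
    float health[1024];
    float speed[1024];
    float fear[1024];
    float hunger[1024];
};
AgentsSoA agentsSoA;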

The branchless AI code can go even further: if the test being performed is a set of boolean bitwise operations on flags, and a 64-bit word is used to store the flag values of 64 objects, then 64 x 4 = 256 objects can be processed in parallel.

I don't follow your math. The 64 boolean flags are being applied to the simple data states in the packed 4 objects. At the end of the computation, only the 4 objects would be altered.

I also don't get how it is that an SPE can handle the simultaneous computation of (64 + 2) initial data states four times in parallel. It can't, and its masking abilities are not that insanely flexible.

What set those boolean flags?
Is grouping 64 boolean flags even meaningful?
In what situation would they make any sense?
Why would a procedural program be unable to handle it?

What if the simulated environment has relationships that dynamically underfill or overflow the static data structure?

Mathematically speaking, anything that can be achieved by testing and branching can be converted into a set of inline boolean and arithmetic operations applied to data stored in tables.

The complexity and code expansion involved is enough reason to be very careful on where it is used.

Running AI computations in parallel is possible with branchless operations in inline code and short loops. The inline code may be slightly longer, but it will run a lot faster.

On something beyond a simple herd simulation, the inline code is going to get a lot longer. A lot of the code being run in parallel will wind up being discarded if you rely only on bit shifting and masking.

Cell doesn't offer a hyper exponential increase in throughput.

Branches in different threads don't affect one another, so branches don't do much to hinder multithreading. Memory aliasing, synchronization, and contention do, and these are limits of silicon and physics.

In practice some limited branching between batches of inline code will probably be required (more because the procedural mindset of microprocessor designers ensures that more attention is paid to test-and-branch instructions than to test-and-store instructions, than for any other reason).

There will be branching because that will be the only way to choose between various sets of inline code.
If code is inlined, each stream must reflect a single logical outcome, or if it uses some kind of predication or bit select, it must burn limited throughput.
This is not a trivial increase in code size and resource pressure.

Branch inefficiency and the single-threaded nature of the procedural AI used in games so far haven't been a problem for the limited level of AI we have had so far. However, they do limit what is possible in terms of making the AI more complex and believable, i.e. raising the level of intelligence.

The examples you have given for non-procedural AI are incredibly simplistic.
They are meant for behaviors that, while important, are not meant to be intelligent at all.

For number crunching, SIMD works. But the nucleus of branching can't be eliminated. You can't make a branchless strategic AI for more than a tiny problem because the amount of parallel execution needed to reflect all the outcomes of a series of branches will explode.
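To put a rough number on it: predicating away k sequential two-way decisions means computing both sides of every one of them, so the inline work grows as 2^k. Ten decisions is about a thousand paths' worth of computation; twenty is about a million. No SIMD width covers that.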

As for being deterministic, the same can be said for the weather system of the world or the behaviour of human beings. They are determined by physical laws and so they should be utterly predictable.

There's an entire realm of physics where all things are based on probabilities and are required to be uncertain.

They aren't because they are immensely complex and so can't be predicted. The same is true for the herd/crowd simulation - remember that each object can occupy millions of possible positions and each can interact with every other object.
Also if the object interacts with the user, a millisecond difference in user response may mean that objects end up in completely different locations. Try picking up the herding simulation where the herd interacts in the same way with the user, start at the same starting position, and see if you can end up twice with all the herd in the same positions - it is pretty well impossible.

Easy, do nothing at all. Zero is zero, and a deterministic system will reflect that.

The user's input is a great way to sneak in stochastic behavior. Fluctuations in reaction time or even the responses of the input device can introduce unexpected results. If you were to take a snapshot after every input buffer polling, however, every consequent action would be deterministic.
 
3dilettante said:
Not necessarily, it depends heavily on how the simulation is set up. A minor agent within the larger game world can work from a much smaller set of data.

There's a difference between procedural AI and procedural code. One is a higher-level type of decision making for determining the future state of a simulation.

The other is a programming paradigm where the program is composed of various procedures or methods that can be called at any point in a program. It is often preferable over sequential code because it allows for much better maintainability and structure.

I think we are talking on different wavelengths here. When I talk about procedural AI, what I am referring to is AI code in which the path of code execution depends on results of AI processing and reflects the logic state of the AI process. When I talk of neural network type AI or object AI, I am referring to AI code in which the AI logic state is processed purely as data using boolean operations and does not affect which way the code branches. These are just terms I have used for want of something better - I don't attach any special meaning to them beyond that.

Not much harder in the long run, since the problem being modeled is a very simplistic stimulus-response behavior. Any procedural rules for the behavior of each individual within the herd would degenerate into simple rules.

Whether it's single-threaded is beside the point. A single-threaded versus a multi-threaded implementation makes no difference to the simulation, and procedural code is not inherently single-threaded. Procedural or not, the AI will take a bit more work if it is multithreaded.

So you are simulating 1000 individuals in a herd. How are you going to handle this? With a single-threaded piece of procedural code? A better and simpler way is to use a short piece of event-driven code (as opposed to procedural code) which handles each individual's response and stores the results of the AI code as data. This simple piece of code would be triggered, say, every few seconds for every one of the 1000 individuals, and a large number of the event-driven execution threads could be run in parallel. In this case the event-driven code is much simpler, and if programmed correctly it can be run in parallel using SIMD or bit packing. Procedural code would be more complex, because you would have to find a way of updating all 1000 objects on a timely basis. This type of problem, i.e. object AI, is better suited to event-driven code.


This example is simplistic and out of context, and depending on what state you are talking about, it may or may not be relevant to the program counter.

From the point of view of the hardware context during the execution of the program, yes, there's some information in the counter. If the AI were free to interrupt the core while it was running itself, then the counter would matter. Since it can't, or at the very least shouldn't if meaningful output is desired, it doesn't matter.
In theory, an improperly threaded AI could do something like that, and nothing but garbage would result.

From the logical state of the AI simulation, it looks like this:

Simulation time slice 1 C = whatever it was
*simulation update: everything happens here*
Simulation time slice 2 C = whatever it turns out to be

I am not sure what you are getting at here. The AI code for each individual in the herd would be detached event-driven code which would be triggered by a timer, say; it would look at the neighbouring individuals and the environment, update its parameters accordingly, and then terminate. Why should this result in garbage? Why make the program complicated by trying to centralise the process and incorporate it into a single piece of procedural master code?

As for the example being oversimplistic, that may be the case, but as I said, mathematically there is nothing you can do with testing-and-branching code that you can't do with boolean and arithmetic operations acting on data. The complexity of the logic operations that have been done this way (by processing logic states as data) is way in excess of what has been done by representing logic states as branches in code - e.g. the simulation of entire CPU chips, or simulations of large neural networks.

What I am suggesting is very simple. Process and store the results of your AI code as data, and do not represent their state as branches (use test-and-store instructions like COMPARE on the SPEs rather than test-and-branch). You can have branches and loops in your code, so long as they do not reflect your AI results. If you do this, you will be able to run the AI processing of a number of objects in parallel using SIMD or, if bits are packed into larger words, by using bitwise boolean operations. With procedural AI (my definition) you can't process more than one object at a time per execution thread, simply because different objects may branch different ways and you only have one program counter per CPU. On the other hand, you have lots of registers on a CPU, especially on something like an SPE.

The way I am suggesting lends itself naturally to event-driven code (hence my suggestion that it is used for object AI processing, and that traditional procedural AI code, which runs better on the PPE, might still be appropriate for overall game AI/logic), and it is efficient enough to allow immensely complex AI processing. The advantages, besides the vector/parallel processing capability, are that the long branches typical of procedural AI decision trees, which cause the cache to miss, are avoided. The short looping/branching in AI data-processing code will not result in a cache miss, and on the SPE the code can be executed from the local store, so there is no penalty. In event-driven code, all the AI event has to do is update the AI data and terminate. The processing of the results - i.e. the individual in the herd moving in a particular direction as a result of the AI - does not need to be done by the AI code, so no branching is required for this in the AI code. Instead, the logical place for this would be the event-driven animation code, which would act on the results of the AI stored as data and move the individual accordingly. This allows a neat programmatic, as well as executionally efficient, separation of the AI code from the animation code (the AI code can be SIMD, whereas the animation code can process objects one by one in a stream).

I have read a lot of nonsense (like the nonsense about SPEs not being good at physics) about how poor the SPEs would be at AI because of their inability to do long branches associated with procedural AI efficiently. The reality is that the SPE will actually be a lot more powerful and better suited to AI and physics than conventional processors - it just requires breaking out of the straitjacket of the procedural programming mindset to realise it.

The bottom line is the fact that independent developers like AGEIA and AIseek are putting their own money into Cell-like processors and AI processing units, as well as PPUs, because they work.

http://www.aiseek.com
http://www.aiseek.com/AIseek - Intelligence for New Worlds.pdf
http://www.gamasutra.com/php-bin/news_index.php?story=8627
http://www.engenuitytech.com/pdf/press/2006/20060322-PR-AI-40-EN.pdf

One thing I have to say about the AIseek product is that it is a set of algorithms for doing things like searching for the quickest route. However, not every human is smart enough to find the absolute shortest route the way Dijkstra's algorithm does, so for realistic simulation of humans I hope there will also be algorithms that tackle the problem the way different humans might.
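For example, a deliberately human-like route chooser might simply head for whichever neighbouring tile looks closest to the goal, with no lookahead, and happily walk into dead ends. A minimal sketch of that idea - the grid layout and names are my own illustration, not anything from AIseek:

Code:
#include <float.h>

/* Pick the neighbouring tile that looks closest to the goal, as a person
   eyeballing a map might, instead of running a full Dijkstra search.
   nx/ny hold the neighbouring tiles' coordinates. */
int greedy_step(const float nx[], const float ny[], int n_neighbours,
                float goal_x, float goal_y)
{
    int best = 0;
    float best_d = FLT_MAX;
    for (int i = 0; i < n_neighbours; i++) {
        float dx = goal_x - nx[i], dy = goal_y - ny[i];
        float d = dx * dx + dy * dy;   /* straight-line guess, no lookahead */
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;   /* can walk into dead ends - which is the point */
}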
 
SPM said:
I think we are talking on different wavelengths here. When I talk about procedural AI, what I am referring to is AI code in which the path of code execution depends on results of AI processing and reflects the logic state of the AI process. When I talk of neural network type AI or object AI, I am referring to AI code in which the AI logic state is processed purely as data using boolean operations and does not affect which way the code branches.
Apparently we are on different wavelengths. I have never encountered definitions of procedural code or neural networks like the ones you are using, and they mix up higher-level conceptual objects with lower-level implementation details.

The logic state of the AI process is only evaluated between update cycles. This removes the nature of the update code completely from consideration.

So you are simulating 1000 individuals in a herd. How are you going to handle this? With a single threaded piece of procedural code?

Do what every single-threaded AI simulation has ever done. Iterate through the list of objects that store the state variables of the AI agents.

Better yet, multithread the procedural code.

A better and simpler way is to use a short piece of event driven code (as opposed to procedural code) which handles each individual's response, and stores the results of the AI code as data.

Agent state in every form of AI is stored as data. Code is then run using that data. There is no need for a direct mapping of the AI simulation to the underlying code.

Procedural code would be more complex because you would have to find a way of updating all 1000 objects on a timely basis. This type of problem, i.e. object AI, is better suited to event driven code.

Depending on what definition of procedural you are using, there may be no problem at all. I'm still not clear on why you think a procedural AI must be single-threaded or why branches have any effect on threadability at all.

I am not sure what you are getting at here. The AI code for each individual in the herd would be detached event driven code which would be triggered by a timer, say, would look at neighbouring individuals and the environment, would update its parameters accordingly, and then terminate.

Game simulations run in discrete time steps. The AI is updated after each step (or some multiple of steps). The simulation would run through the object lists and run the needed methods. This can be done in parallel, and there is no need for an unreliable timer to muck things up.

Why should this result in garbage? Why make the program complicated by trying to centralise the process and incorporate it into a single piece of procedural master code?

Thread-safe code has procedures that can be run independently. Unless a procedure is interrupted in the middle of running through all its branches by something that accesses common variables, the end result will be fine. There is no reason why there needs to be a single piece of procedural code.

As for the example being oversimplistic, that may be the case, but as I said, mathematically there is nothing you can do with testing and branching code that you can't do by boolean arithmetic operations acting on data.
They can be made equivalent in result, but relying solely on boolean operations acting on data means the data model must be more complex. The addition of a few variables leads to a sizeable increase in the amount of (probably wasted) parallel computation.

Blindly computing in parallel and then masking or selecting burns serious resources, since without some kind of smart branching every combination of states must be computed, and then the proper result masked out.
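To make the cost concrete, a toy example (the arithmetic is arbitrary; the shape is what matters) - both arms are evaluated for every agent and one is always thrown away:

Code:
/* mask is all ones or all zeros, produced by an earlier compare. */
int update(int state, int mask)
{
    int if_true  = state * 3 + 2;   /* computed for every agent */
    int if_false = state / 2 - 1;   /* also computed for every agent */
    return (if_true & mask) | (if_false & ~mask);   /* one arm discarded */
}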

What I am suggesting is very simple. Process and store the results of your AI code as data, and do not represent their state as branches (use test and store instructions like COMPARE on the SPEs rather than test and branch).

All AI simulations store agent state as data between update cycles. Code itself does not directly map onto a single object, not without self-modification (a big no-no).

With procedural AI (my definition) you can't process more than one object at a time per execution thread simply because different objects may branch different ways, and you only have one program counter per CPU.
I really don't think your definition exists. AI is not done the way you say it is, and branching behavior has no effect on threading or parallel processing.

The way I am suggesting lends itself naturally to event driven code (hence my suggestion that it is used for object AI processing and that traditional procedural AI code which runs better on the PPE might still be appropriate for overall games AI/logic), and is efficient enough to allow immensely complex AI processing.
No more complex than has been offered by any multiprocessing system before.

The advantages besides vector/parallel processing capability are that the long branches typical of procedural AI decision code, which cause the cache to miss, are avoided.
You'll have to write out exactly what you think a long branch is.
Branch instructions have a fixed penalty. Large numbers of poorly predicted branches are a performance minus. They have no length.

In event driven code, all the AI event has to do is update the AI data and terminate.
It doesn't even need to be event driven; a plain update loop is how it's usually done. However, how the updating is done is highly dependent on how complex the relationships and utility functions are. Not all AI decisions are solely of the form "if A > B, do C".

The processing of the results - i.e. the individual in the herd moving in a particular direction as a result of the AI - does not need to be done by the AI code, so no branching is required for this in the AI code.
Unless the logical code and masking are being run on a particularly complex data model.
Branches have a fixed penalty.
Data expansion and computational complexity start out small, but they grow fast.
Adding something like a variable with four possible states will quadruple the amount of data that will need to be masked.
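A toy sketch of that blow-up (again with arbitrary arithmetic): one four-state variable already means four arms evaluated and three discarded per agent, before it is combined with any other state.

Code:
int act(int state /* 0..3 */, int input)
{
    int r0 = input + 1;      /* arm for state 0 */
    int r1 = input * 2;      /* arm for state 1 */
    int r2 = input - 3;      /* arm for state 2 */
    int r3 = input ^ 0x55;   /* arm for state 3 */
    int m0 = -(state == 0), m1 = -(state == 1);
    int m2 = -(state == 2), m3 = -(state == 3);
    return (r0 & m0) | (r1 & m1) | (r2 & m2) | (r3 & m3);   /* 3 of 4 wasted */
}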

Instead the logical place for this would be the event driven animation code which would act on the results of the AI stored as data and move the individual accordingly. This allows a neat programmatic as well as executionally efficient separation of the AI code from the animation code (the AI code can be SIMD whereas the animation code can process objects one by one in a stream).

This is nothing new, that's how it's done. What you neglect to cover is exactly how well the data can be updated when the desired update is based on complex rule and variable relationships. Using selects and masking means a lot of work is done in parallel and then discarded. The wasted work takes a non-trivial amount of time, and it multiplies with every new state combination.

I have read a lot of nonsense (like the nonsense about SPEs not being good at physics) about how poor the SPEs would be at AI because of their inability to do long branches associated with procedural AI efficiently. The reality is that the SPE will actually be a lot more powerful and better suited to AI and physics than conventional processors - it just requires breaking out of the straitjacket of the procedural programming mindset to realise it.

That's mixing up the low-level and high-level types of "procedural". They are completely separate things. The SPE will be running programs written like what is seen in C and C++. That's going to be imperative and procedural code.
I am not sure where your earlier definition of "procedural" fits at this level.
 
SPM said:
The bottom line is the fact that independent developers like AGEIA and AIseek are putting their own money into Cell-like processors and AI processing units, as well as PPUs, because they work.

http://www.aiseek.com
http://www.aiseek.com/AIseek - Intelligence for New Worlds.pdf
http://www.gamasutra.com/php-bin/news_index.php?story=8627
http://www.engenuitytech.com/pdf/press/2006/20060322-PR-AI-40-EN.pdf

One thing I have to say about the AIseek product is that it is a set of algorithms for doing things like searching for the quickest route. However, not every human is smart enough to find the absolute shortest route the way Dijkstra's algorithm does, so for realistic simulation of humans I hope there will also be algorithms that tackle the problem the way different humans might.

Are you sure about this? Because on the site they talk about


this doesn't seem like Cell/PPU, which have very different configurations, unless you mean Cell in a very broad sense (one head core plus a few "inferior" ones), although it should also use some intensive maths, so maybe that is where it is similar.

Can you share a bit more info on this?
 
pc999 said:
this doesn't seem like Cell/PPU, which have very different configurations, unless you mean Cell in a very broad sense (one head core plus a few "inferior" ones), although it should also use some intensive maths, so maybe that is where it is similar.

Can you share a bit more info on this?

I'm not sure the branching discussed in AIseek's FAQ refers to branch instructions in code; it may mean branches in the graphs used to represent parts of the game simulation.

The SPE should be well suited to crunching through a graph like that.

AIseek makes the telling distinction between low-level and high-level AI. If I were snooty, I'd go so far as to say most of the low-level AI stuff isn't really AI but thinly disguised optimization, but there is more to it than that.

AIseek says they have hardware that offloads the low-level AI.
Finding the shortest path is a well-researched problem, and a game-grid is a very well-defined problem set to work from.

Other low-level tasks like sensory simulation can also be accelerated. Usually, something like a Monte Carlo simulation uses pseudorandom numbers to produce a probability of correct and incorrect input, insulating the AI agent from the game world's actual data.
This is more realistic, since in real life we cannot actually know the world to the absolute degree of certainty that game code has.
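A minimal sketch of that kind of insulated sensing, assuming a hypothetical sense() helper and the standard library's rand():

Code:
#include <stdlib.h>

/* The agent receives the true value only with probability p_correct;
   otherwise it gets a corrupted reading, so it never sees the game
   world's perfect data directly. */
float sense(float true_value, float p_correct)
{
    float roll = (float)rand() / (float)RAND_MAX;   /* pseudorandom 0..1 */
    if (roll < p_correct)
        return true_value;                          /* perceived correctly */
    float noise = ((float)rand() / (float)RAND_MAX) - 0.5f;
    return true_value * (1.0f + noise);             /* garbled perception */
}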

AIseek could have added some random-number generators (VIA did this with their processor as well) to help with this task. The SPEs are rather good at random number generation as well, though not quite in the same league as specialized hardware.
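For instance, something like Marsaglia's xorshift32 is branch-free and built entirely from shifts and XORs, which makes it trivially vectorisable and a natural fit for the SPEs (a scalar sketch, not intrinsics code):

Code:
#include <stdint.h>

/* xorshift32: state must be seeded non-zero. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}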

Because graph traversal is so well-researched, a lot of AI work has mapped decision making onto decision trees and other representations. For well-defined and somewhat simple problems this works great.

More complicated problems and dynamic environments result in graphs that explode in size or algorithms that fail to function properly, especially when making decisions based on optimal future events.

If decision trees are used, then developers work hard to pare down the number of nodes traversed by culling obviously bad decisions from consideration early (with branch instructions, for example).
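A sketch of that kind of early culling, branch-and-bound style (the node layout is invented for illustration): a child is only descended into if its optimistic bound can still beat the best score found so far.

Code:
typedef struct Node {
    float bound;             /* optimistic estimate of the best score below */
    float score;             /* score if the decision stops at this node    */
    struct Node *child[4];
    int n_children;
} Node;

float best_decision(const Node *n, float best_so_far)
{
    if (n->score > best_so_far)
        best_so_far = n->score;
    for (int i = 0; i < n->n_children; i++)
        if (n->child[i]->bound > best_so_far)   /* cull obviously bad branches */
            best_so_far = best_decision(n->child[i], best_so_far);
    return best_so_far;
}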
 