State machines.

Most methods for running multiple tasks in parallel therefore use locks and synchronization mechanisms: they layer the workload in time, executing sequences of tasks in lockstep.

But why would you? Just because a processor is essentially a state machine doesn't mean you have to use a similar model to describe your tasks. After all, that Turing machine can solve any task in a predictable time frame, as long as it isn't a hard problem.

So, instead of trying to come up with better locking mechanisms, better task division, or any other refinement that still uses the state-machine paradigm, we need to come up with something that is free-running.

Or, in other words: we need a paradigm that doesn't switch between predefined states that represent fixed lengths of time, but instead averages new states from multiple asynchronous occurrences.

That way, you need no synchronization, and you don't require atomic operations either.

Simply gather what data is available, make a best guess for the current time, and update.

An additional bonus of such a model is that you can steer it solely by priority, and aren't limited by having to process all objects each timeframe.
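To make that concrete, a minimal sketch of such a free-running value (all names hypothetical): writers append timestamped samples whenever they happen to arrive, and readers ask for a best guess at whatever "now" they care about, with no shared clock or lockstep update.

```python
class FreeRunningValue:
    """Keeps timestamped samples and estimates the current value on demand,
    instead of synchronizing every writer to a fixed tick."""

    def __init__(self):
        self.samples = []  # list of (timestamp, value), appended asynchronously

    def report(self, timestamp, value):
        self.samples.append((timestamp, value))

    def estimate(self, now):
        """Best guess for the value at 'now', from whatever data is available."""
        if not self.samples:
            return None                      # never heard of it: nothing to say
        if len(self.samples) == 1:
            return self.samples[-1][1]       # only one old sample: use it as-is
        (t0, v0), (t1, v1) = self.samples[-2], self.samples[-1]
        if t1 == t0:
            return v1
        rate = (v1 - v0) / (t1 - t0)         # linear extrapolation from the trend
        return v1 + rate * (now - t1)
```

The point of the sketch is only that "update" becomes "estimate at query time": nothing blocks, and stale data degrades gracefully into an extrapolation rather than into a lock wait.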


I've often thought about a parallel processing simulation that relies on the current time and interpolation/extrapolation. The only problem is that it still relies on synchronization; it just tries to reduce it. Each object's state is essentially double-buffered: the front buffer is the object's state at a specified time and is freely readable, while the back buffer is the object's state during processing. The object locks the front buffer and swaps the two when it's ready for an update.

And I imagine that in most game-type simulations the object threads would be sleeping the majority of the time rather than constantly locking and swapping buffers, which means that most of the active threads (i.e. objects) will have free access to the data whenever they need it.
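That double-buffering scheme might look something like this (a sketch, names hypothetical): the expensive work happens outside the lock, and the lock is held only for the brief pointer swap, so readers almost never wait.

```python
import threading

class DoubleBufferedObject:
    """Front buffer: last published (timestamp, state), freely readable.
    Back buffer: state being computed; the lock guards only the swap."""

    def __init__(self, timestamp, state):
        self._front = (timestamp, state)
        self._back = None
        self._lock = threading.Lock()

    def read(self):
        # Readers never wait on the writer's computation, only on the swap itself.
        with self._lock:
            return self._front

    def compute(self, timestamp, state):
        # Long-running work happens here, outside the lock.
        self._back = (timestamp, state)

    def publish(self):
        with self._lock:  # brief critical section: just the buffer swap
            self._front, self._back = self._back, self._front
```

Until `publish()` is called, every reader keeps seeing the old front buffer, which is exactly the "freely readable state at a specified time" described above.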
 
Scopes, domains and limits.

Let's take Russ' example of bank accounts:

There is a memory location that specifies your current bank balance. You go to the shop and order something by credit card. The CC company checks your current balance and accepts the transaction. But the balance visible to the CC company is probably not your actual balance: it is more likely a separate account, one that is only reconciled with your actual account once a month.

In that case, there is no single number that tells you the exact state of your account at any moment. Instead, there are multiple amounts that are averaged over time. If you stop spending and receiving for a few months, all those amounts are added and subtracted to give you a single number that specifies how much money you have.

But what about your house and other investments? Those don't even yield an exact amount at any specified time, only a rough approximation depending on the state of the market.

So, in reality, it is quite hard to come up with a definite number that specifies your monetary worth.
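As a toy illustration of that (all numbers and names invented): each source of worth carries its own last-update timestamp, and the total is only as fresh as the stalest source, so the honest answer is a number plus an "as of" time, not a number alone.

```python
# Each source of worth carries its own last-update timestamp; the total is
# only accurate "as of" the oldest one.
accounts = {
    "checking":  {"amount": 1200.00, "as_of": 100.0},
    "cc_shadow": {"amount": -350.00, "as_of":  70.0},  # CC company's copy, reconciled monthly
    "shares":    {"amount": 8000.00, "as_of":  95.0},  # market estimate, never exact
}

total = sum(a["amount"] for a in accounts.values())
accurate_as_of = min(a["as_of"] for a in accounts.values())
print(f"worth: {total:.2f}, accurate as of t={accurate_as_of}")
```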


For games, you might wonder whether you want the AI to be omniscient and omnipotent: knowing it all, and able to do it all flawlessly, like knowing you are there, a mile away, behind a cardboard wall, and being able to hit you in the eye.

It's all the same thing: in real life, nobody has all the data at any one time. So, why do we expect computers to be different?



Timestamps, tasklists and rescheduling.

Instead of synchronizing all actions to specified moments, and having to complete all dependencies up front, you could also accept that not all the data you would want is available at all times. Like in real life.

If you are unable to access the exact, current state of something, there are many good alternatives:

- If you never encountered the possible interaction in the past, you can simply ignore it.

- If you did encounter it once in the past, but your data is old, you can use the old data.

- If you encountered it multiple times in the past, but your data is old, you can average or extrapolate it.

To know what to do, you want all your data to carry a timestamp. And you only guarantee that your results are accurate for the timeframe covered.
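The three fallbacks above could be folded into a single lookup that also reports which timeframe its answer covers (a sketch; the `max_age` cutoff is my own invented refinement, not from the post):

```python
def best_guess(history, now, max_age=5.0):
    """history: (timestamp, value) pairs for some remote object, oldest first.
    Returns (value, as_of) so the caller knows what timeframe the answer covers."""
    if not history:
        return None                          # never encountered it: ignore the interaction
    t_last, v_last = history[-1]
    if len(history) == 1 or now - t_last > max_age:
        return (v_last, t_last)              # old data, honestly reported as old
    (t0, v0), (t1, v1) = history[-2], history[-1]
    rate = (v1 - v0) / (t1 - t0) if t1 != t0 else 0.0
    return (v1 + rate * (now - t1), now)     # multiple samples: extrapolate to 'now'
```

Returning the timestamp alongside the value is what makes the "accurate for the timeframe covered" guarantee explicit, instead of pretending every answer describes the present.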


If you require better interaction and have a way to calculate most of those results yourself, you might want to use a tasklist.

Having to wait for all tasks to finish is a waste of processing time, which is exactly what synchronizing stages amounts to. But then again, you could also add all tasks, from waiting for memory accesses to return data, to verifying that all your financial transactions are completed before calculating the total, to a tasklist.

The trick is in allowing the items on your (virtual) tasklist to execute asynchronously, within bounds.
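One way to read "asynchronously, within bounds" (my interpretation, names hypothetical): give each task a time window instead of a fixed slot. Inside its window, order is free; a task whose window has passed is simply dropped, since its result would be stale anyway.

```python
def run_tasklist(tasks, now):
    """tasks: list of (earliest, latest, fn). Runs every task whose window
    contains 'now'; drops tasks whose window has passed; keeps the rest."""
    remaining = []
    for earliest, latest, fn in tasks:
        if earliest <= now <= latest:
            fn()                                 # within bounds: run, in any order
        elif now < earliest:
            remaining.append((earliest, latest, fn))  # not due yet: keep for later
        # now > latest: window missed; the result would be stale, so skip it
    return remaining
```

Because no task promises to run at an exact instant, nothing has to wait for anything else to finish first; the bounds are the only contract.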



Good programming practices.

When writing a function, you could assume all parameters and other data are available and correct. That is a recipe for disaster if you're working on a large application, even if you wrote all of it yourself. It is much safer to check all your parameters and other data for correctness before using them. Or, alternatively, to check their timestamps and dependencies, and to use alternatives if they cannot be verified.
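A small sketch of that timestamp-checking style (function and parameter names are hypothetical): instead of trusting that a passed-in exchange rate is fresh, the function validates it, and degrades to a labeled approximation rather than failing or silently using stale data as exact.

```python
def update_total(balance, exchange_rate, rate_timestamp, now, max_age=60.0):
    """Don't assume the parameters are fresh or valid: check them first,
    and label the result when the data can't be fully verified."""
    if exchange_rate is None or exchange_rate <= 0:
        raise ValueError("exchange_rate missing or invalid")
    if now - rate_timestamp > max_age:
        # Stale rate: still usable, but flag the result as approximate.
        return (balance * exchange_rate, "approximate")
    return (balance * exchange_rate, "current")
```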



State machines.

That leaves you with a tasklist that can be as granular as you want it to be, and which defines the boundaries and limits of the accuracy of your calculations. For example, you cannot update your total monetary worth immediately when some shares soar, if the latest data you have on them is from yesterday.

Such a tasklist is like a state machine, and like Demo said, it doesn't really matter how many tapes, states or heads it has. The only thing that matters is whether your calculations will still be correct if you change the execution order.

Which you can do if you don't want all your calculations to be correct for the exact same time period. Because you really cannot do that anyway.

A better alternative is simply to reschedule each task, by adding it to the end of the list when the data it depends on is updated. Or, if it is scheduled on a timer, just wait for its next run to pick up the update. Non-critical updates can wait like that, while important updates can be prioritized to run as often as possible.
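The data-driven rescheduling part could be sketched like this (names hypothetical): tasks register an interest in a piece of data, and an update appends the dependent tasks to the end of the list instead of running everything in lockstep.

```python
from collections import deque

class Rescheduler:
    """Tasks live on a deque. When a piece of data is updated, every task
    that depends on it is appended to the end of the list, rather than all
    tasks being re-run every frame."""

    def __init__(self):
        self.tasklist = deque()
        self.dependents = {}  # data key -> tasks to requeue when it updates

    def depends_on(self, key, task):
        self.dependents.setdefault(key, []).append(task)

    def data_updated(self, key):
        for task in self.dependents.get(key, []):
            self.tasklist.append(task)   # reschedule at the end of the list

    def run_once(self):
        while self.tasklist:
            self.tasklist.popleft()()
```

Priorities could be layered on top by keeping several such lists and draining the important one more often, which matches the "run as often as possible" idea above.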



I was working on a demo to show all this in realtime, I might finish it if you want me to.
 
This post is wrong on so many levels. Please consult Sipser, Hennessy, and Patterson.


As we all know, given (close to) infinite time, a Turing machine, the simplest logical device ever devised, can solve any solvable problem. We classify a problem as "hard" when there doesn't exist a solution to it that runs within a specified time.

Processors are about maximizing throughput, so we require everything to finish as fast as possible. GPUs specifically like their problems small and fast. And while CPUs don't mind larger problems that run longer, both require execution to be atomic, or at least predictable in time.

Amdahl is famous for putting this into writing. And, as it turns out, the main bottleneck isn't so much the amount of calculations required (as you can always use a faster approximation if needed), but the amount of interaction.

Essentially, that is because the processors are still state machines, executing a multitude of actions that might depend on each others results to continue.


 