To Panajev:
Just read your first post on this thread, great read. But it got me thinking - by vectorizing, you might lose some of the data locality inherent in the game scene. To take Nao's example:
// Nao's vectorized code///////////////////////////////////////////////////////////////
common_particle_update( IN vec3 particlePosition, IN vec3 particleSpeed, UNIFORM float deltaTime, UNIFORM polysoup roomDatabase....)
{
// compute next frame particle position
vec3 nextPosition = particlePosition + particleSpeed*deltaTime
// check if this particle is going to hit something..
intersection check = trace(particlePosition, nextPosition, roomDatabase)
// create a new explosion particle if there is a hit, otherwise update particle position
if (check.outcoming == true) then create (check.intersection_data, particle_explosion)
else particlePosition = nextPosition
}
///////////////////////////////////////////////////////////////////////////////////////////
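Might be slower than code that "recognizes" data locality (the important changes are the per-cloud loop and the fetchLocalGeometry call that feeds trace a localDatabase instead of the whole roomDatabase):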
[EDIT: Sorry for no tabs...how do you get it to tab properly?]
///////////////////////////////////////////////////////////////////////////////////////////
for(each particle cloud)
{
cloud_particle_update( IN vec3 cloudParticlePosition, IN vec3 cloudParticleSpeed, UNIFORM float deltaTime, UNIFORM polysoup roomDatabase....)
{
// compute next frame particle position
vec3 nextCloudPosition = cloudParticlePosition + cloudParticleSpeed*deltaTime
// get local geometry around particle cloud
polysoup localDatabase = fetchLocalGeometry(cloudParticlePosition, nextCloudPosition, roomDatabase)
// check if this particle is going to hit something..
intersection check = trace(cloudParticlePosition, nextCloudPosition, localDatabase)
// create a new explosion particle if there is a hit, otherwise update particle position
if (check.outcoming == true) then create (check.intersection_data, particle_explosion)
else cloudParticlePosition = nextCloudPosition
}
}
///////////////////////////////////////////////////////////////////////////////////////////
This is sort of a halfway compromise between the two code samples Nao gave. It adds some special per-particle processing, but my point is that there must be a sweet spot between those two extremes. By making the code explicitly aware that each localized particle cloud depends only on the local geometry (and not the geometry on the other side of the room), I think this code can be made to run faster.
I guess a well-written trace(...) function would also solve this problem (at least in the example I give), but I'm sure there are other situations where leaving it up to the function to find data locality is not good enough.
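To make the per-cloud idea a bit more concrete, here is a rough Python sketch of one way to read it: fetch the local geometry once per cloud, then trace every particle in that cloud against the smaller set. Everything in it is my own assumption for illustration, not Nao's code: axis-aligned boxes stand in for the polysoup triangles, and the names Box, segment_hits_box, fetch_local_geometry and update_cloud are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a piece of room geometry (an axis-aligned box)."""
    lo: Vec3
    hi: Vec3

def segment_hits_box(p0: Vec3, p1: Vec3, box: Box) -> bool:
    """Slab test: does the segment p0 -> p1 touch the box?"""
    t_min, t_max = 0.0, 1.0
    for a in range(3):
        d = p1[a] - p0[a]
        if abs(d) < 1e-12:
            # Segment is parallel to this slab: reject if it lies outside it.
            if p0[a] < box.lo[a] or p0[a] > box.hi[a]:
                return False
        else:
            t0 = (box.lo[a] - p0[a]) / d
            t1 = (box.hi[a] - p0[a]) / d
            if t0 > t1:
                t0, t1 = t1, t0
            t_min = max(t_min, t0)
            t_max = min(t_max, t1)
            if t_min > t_max:
                return False
    return True

def trace(p0: Vec3, p1: Vec3, database: List[Box]) -> bool:
    """True if the segment hits anything in the given database."""
    return any(segment_hits_box(p0, p1, b) for b in database)

def fetch_local_geometry(positions: List[Vec3], next_positions: List[Vec3],
                         room_database: List[Box]) -> List[Box]:
    """Keep only the boxes overlapping the cloud's swept bounding box."""
    pts = positions + next_positions
    lo = tuple(min(p[a] for p in pts) for a in range(3))
    hi = tuple(max(p[a] for p in pts) for a in range(3))
    return [b for b in room_database
            if all(b.lo[a] <= hi[a] and b.hi[a] >= lo[a] for a in range(3))]

def update_cloud(positions: List[Vec3], speeds: List[Vec3], dt: float,
                 room_database: List[Box]):
    """Advance one cloud: one local fetch, then a per-particle trace."""
    next_positions = [tuple(p[a] + v[a] * dt for a in range(3))
                      for p, v in zip(positions, speeds)]
    local = fetch_local_geometry(positions, next_positions, room_database)
    new_positions, hits = [], []
    for p, n in zip(positions, next_positions):
        if trace(p, n, local):
            hits.append(p)            # where an explosion particle would be created
            new_positions.append(p)   # particle does not advance this frame
        else:
            new_positions.append(n)
    return new_positions, hits, len(local)

# Two walls: one right next to the cloud, one on the far side of the room.
room = [Box((1.0, -5.0, -5.0), (1.2, 5.0, 5.0)),
        Box((100.0, -5.0, -5.0), (101.0, 5.0, 5.0))]
cloud_pos = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
cloud_vel = [(2.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
new_pos, hits, local_count = update_cloud(cloud_pos, cloud_vel, 1.0, room)
```

Because every particle's path lies inside the cloud's swept bounding box, filtering by that box cannot drop a wall a particle would actually hit, so the results match tracing against the full room. Whether the extra fetch pays for itself depends on how many particles share a cloud and how expensive the trace is, which is the sweet spot I was talking about.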
BTW, nice to see you around Panajev...