Titanio said:Can this be one final blend at the end? Draw all the transparencies and then blend with RSX's buffer?
Yes, one final pass per frame. Shouldn't be a problem.
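To make the "one final pass" idea concrete, here is a minimal sketch of what that composite could look like. It assumes the transparencies were accumulated into their own layer with premultiplied alpha, and that both buffers are plain float arrays on the CPU side. The names `Rgba`, `over` and `composite_final` are made up for illustration and are not any real RSX or SPU API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pixel type: colour channels are premultiplied by alpha.
struct Rgba { float r, g, b, a; };

// Standard "over" operator for premultiplied colours:
// result = front + back * (1 - front.a)
inline Rgba over(const Rgba& front, const Rgba& back) {
    const float k = 1.0f - front.a;
    return { front.r + back.r * k,
             front.g + back.g * k,
             front.b + back.b * k,
             front.a + back.a * k };
}

// The single final pass: blend the accumulated transparency layer
// over the opaque frame, one pixel at a time.
void composite_final(const std::vector<Rgba>& layer, std::vector<Rgba>& frame) {
    const std::size_t n = layer.size() < frame.size() ? layer.size() : frame.size();
    for (std::size_t i = 0; i < n; ++i) {
        frame[i] = over(layer[i], frame[i]);
    }
}
```

The point is only that the final blend is one read of each buffer and one write per pixel, so its cost is fixed per frame regardless of how many particles went into the layer.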
Titanio said:It may not cover all cases, but sorting batches could work OK. If you have a bucket of all particles in a particular portion of the frame, you could sort them independently of others? You might have problems with shared particles, though...but I'm not sure if you'd notice in dense particle systems.
You're right. It's very rare that you'd need to sort every particle against all others. You can pre-sort them into groups that do not overlap, and then sort each group on the SPU.
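A rough sketch of the grouping idea, assuming each particle already carries the index of a non-overlapping group (how you assign groups is up to the engine and is not shown). `Particle` and `sort_by_group` are hypothetical names. On the real hardware each bucket would be handed to an SPU as an independent job; here it is just a loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical particle record: view-space depth plus a precomputed group index.
struct Particle { float depth; int group; };

// Split particles into their groups, then sort every group back-to-front
// on its own. No particle is ever compared against one from another group.
std::vector<std::vector<Particle>> sort_by_group(const std::vector<Particle>& particles,
                                                 int group_count) {
    if (group_count <= 0) return {};
    std::vector<std::vector<Particle>> buckets(static_cast<std::size_t>(group_count));
    for (const Particle& p : particles) {
        if (p.group >= 0 && p.group < group_count) {
            buckets[static_cast<std::size_t>(p.group)].push_back(p);
        }
    }
    for (auto& bucket : buckets) {
        // Each bucket is independent, so this loop body could run as a separate job.
        std::sort(bucket.begin(), bucket.end(),
                  [](const Particle& a, const Particle& b) { return a.depth > b.depth; });
    }
    return buckets;
}
```

With G roughly equal groups the comparison work drops from about N log N to about N log(N/G), and more importantly each group's working set is small enough to be sorted in isolation.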
Titanio said:Millions seems like a lot to be drawing on one frame, even with overdraw.
Actually 1 million is what people are talking about as a next-gen target.
There was a white paper on how to do 1 million sorted particles on the GPU. It's quite interesting. It required 200+ passes and runs at about 2 fps on a current-gen PC, but it has potential with some optimization plus next-gen GFX power.
Titanio said:The amount of time required is a good question, I'm not sure at all. The alpha blending itself doesn't seem particularly complicated though (I'd say perhaps the sorting and splitting into tiles might take as long if not more). I'm not really qualified to say, though, I'd leave it to Faf or Npl or other devs to comment on that.
Alpha blending is not the problem. It's the actual rasterization. It has a very unbalanced nature - one moment you have subpixel particles, the next you have a 10,000-pixel particle, or one covering the whole screen. It's a problem for the GPU too, except it has a lot of pixel arrays and super deep pipelines.
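To put numbers on how uneven the load is, here is a back-of-the-envelope sketch. It assumes a square, screen-facing particle centred on screen and a simple pinhole projection. `covered_pixels` and its parameters are invented for the example, not taken from any real engine.

```cpp
#include <algorithm>

// Approximate number of pixels a square billboard covers.
// world_size: edge length in world units, depth: distance from the camera,
// focal_px: focal length expressed in pixels (an assumption of this sketch).
inline float covered_pixels(float world_size, float depth, float focal_px,
                            float screen_w, float screen_h) {
    if (depth <= 0.0f) return 0.0f;  // at or behind the camera: not drawn here
    const float side_px = world_size * focal_px / depth;  // projected edge length
    const float w = std::min(side_px, screen_w);          // clamp to the screen
    const float h = std::min(side_px, screen_h);
    return w * h;
}
```

For a 1280x720 target with `focal_px = 500`, a unit-sized particle at depth 1000 covers a quarter of a pixel, while the same particle at depth 0.25 fills all 921,600 pixels. Same particle, same vertex work, and the fill cost differs by more than six orders of magnitude.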
That doesn't mean processing particles on the SPU is a bad idea. Actually I would generate the particles on SPU(s) and send them directly to RSX for rasterization. This would be the perfect match IMHO. I just wish RSX had EDRAM like the GS. There is no substitute for raw bandwidth and fillrate when it comes to particles.