Hi, the most expensive part is not the simulation but the volumetric lighting. I implemented slice-based volume rendering at first, then tried to move on to raytraced volume rendering. Unfortunately the result was frustrating: hardware raytraced volume rendering tended to show more artifacts than slice-based methods, and there was almost no good way to calculate volumetric lighting. So I guess the nVidia smoke demo is based on raytraced volume rendering, that is, tracing a ray for each pixel through a volume texture and accumulating sampled values along the ray.
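To make "accumulating sampled values along the ray" concrete, here is a minimal CPU-side sketch in Python. It is only an illustration, not how the nVidia demo works: the `sample_density` callable is a hypothetical stand-in for a volume-texture fetch, and the Beer-Lambert style front-to-back accumulation with early termination is an assumed choice.

```python
import math

def march_ray(sample_density, origin, direction, t_near, t_far, num_steps):
    """Accumulate opacity front to back along one ray.

    sample_density is a hypothetical callable (x, y, z) -> extinction
    coefficient; on a GPU this would be a 3D texture lookup.
    """
    dt = (t_far - t_near) / num_steps
    transmittance = 1.0  # fraction of light that still passes through
    for i in range(num_steps):
        t = t_near + (i + 0.5) * dt  # sample at the midpoint of each step
        x = origin[0] + t * direction[0]
        y = origin[1] + t * direction[1]
        z = origin[2] + t * direction[2]
        sigma = sample_density(x, y, z)
        # Attenuate by the absorption over this step (Beer-Lambert).
        transmittance *= math.exp(-sigma * dt)
        if transmittance < 1e-3:  # early ray termination: nearly opaque
            break
    return 1.0 - transmittance  # accumulated opacity for this pixel


# Usage: a uniform fog with extinction 0.5, ray travelling 2 units along +z.
opacity = march_ray(lambda x, y, z: 0.5, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                    0.0, 2.0, 64)
```

With too few steps per ray, a loop like this tends to show banding, which is one common source of artifacts in raymarched volumes.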
I am also doing volume raycasting/raytracing, but I cannot reproduce your problems. Could you explain what artifacts you encountered compared to slice-based rendering?