Well, yes, I also think they ray-trace a 'voxelized' triangle mesh and use the voxels both for empty-space skipping during ray tracing and for storing triangles. There are various ways to do this efficiently, but basically each voxel has a flag that tells whether it is empty or not. A non-empty voxel would have a limited number of triangles (like 1-4) associated with it, similar to what marching cubes produces. So instead of a BVH, 3D textures are used for accelerating the ray tracing.
Watching it again today, I think it's more just classical raytracing of a lower triangle LOD of the scene. Your idea would show more irregular tessellation considering the random orientation of the shells, and my ideas above would also show similar artifacts from the world space grid alignment.
The mention of their Total Illumination is maybe just a bit misleading towards voxels. I think they use 'voxels' only as an acceleration structure here, if at all.
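To make the "voxels as acceleration structure" idea a bit more concrete, here is a minimal sketch of how I imagine it, not what they actually do. It assumes a uniform grid with voxel size 1 and a ray origin inside the grid. The names `traverse_grid` and `voxels` are made up for illustration, and the dict stands in for the 3D texture (a missing key means "empty"). The ray steps from voxel to voxel (a standard 3D DDA) and only stops where a voxel has triangles; the actual ray-triangle test against those 1-4 triangles is left out.

```python
import math

def traverse_grid(voxels, grid_size, origin, direction, max_steps=256):
    """Step a ray through a uniform voxel grid and return the first
    non-empty voxel together with its triangle ids, or None.

    voxels: dict mapping (x, y, z) -> list of triangle ids (hypothetical layout).
    Assumes voxel size 1 and an origin inside the grid.
    """
    voxel = [int(math.floor(c)) for c in origin]
    step = [0, 0, 0]
    t_max = [math.inf] * 3    # ray parameter at the next voxel boundary, per axis
    t_delta = [math.inf] * 3  # ray parameter needed to cross one voxel, per axis
    for a in range(3):
        d = direction[a]
        if d > 0:
            step[a] = 1
            t_max[a] = (voxel[a] + 1 - origin[a]) / d
            t_delta[a] = 1.0 / d
        elif d < 0:
            step[a] = -1
            t_max[a] = (voxel[a] - origin[a]) / d
            t_delta[a] = -1.0 / d

    for _ in range(max_steps):
        if not all(0 <= voxel[a] < grid_size for a in range(3)):
            return None  # ray left the grid without hitting anything
        key = tuple(voxel)
        if key in voxels:
            # Non-empty voxel: a real tracer would now test its few triangles
            # and keep walking if none of them is actually hit.
            return key, voxels[key]
        # Empty voxel: skip to the neighbour across the nearest boundary.
        a = t_max.index(min(t_max))
        voxel[a] += step[a]
        t_max[a] += t_delta[a]
    return None

# Example: one occupied voxel holding two (hypothetical) triangle ids.
grid = {(5, 0, 0): [17, 18]}
hit = traverse_grid(grid, 8, (0.5, 0.5, 0.5), (1.0, 0.0, 0.0))
```

The point is just that empty space costs one cheap lookup per voxel and no tree traversal, which is the trade against a BVH.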
At a distance there does indeed seem to be a fallback to a voxel approximation, but there is only one spot in the video where I can see this. Otherwise it's too perfect.
The missing noise also made me think about alternative image-based tech like VSM, but now I guess the reason is simply this:
Instead of perturbing the ray directions with random angles for AA (as usual, which causes noise), they use one and the same angular offset for the whole frame (or just per normal direction, whatever).
As a result, rays reflected off the same normal (and that's common for smooth man-made materials that show sharp reflections, also for water) will stay parallel and are much more likely to traverse the same nodes in the acceleration structure.
Also, instead of noise you'd get the kind of banding / ghosting visible in the video.
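A small sketch of the difference between the two jitter strategies, under my own assumptions: the offset is simply added to the mirror direction and renormalized (a simplification of a true angular offset), and all rays share one view direction and one normal, as on a flat mirror seen from far away. The function names are hypothetical.

```python
import math
import random

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def reflect(d, n):
    """Mirror direction d about unit normal n."""
    k = 2.0 * sum(a * b for a, b in zip(d, n))
    return tuple(a - k * b for a, b in zip(d, n))

def jittered_reflection(view_dir, normal, offset):
    """Mirror reflection, nudged by a small offset vector and renormalized."""
    r = reflect(view_dir, normal)
    return normalize(tuple(a + b for a, b in zip(r, offset)))

def per_ray_offsets(count, strength, rng):
    """Usual approach: every ray gets its own random offset (gives noise)."""
    return [tuple(rng.uniform(-strength, strength) for _ in range(3))
            for _ in range(count)]

def per_frame_offsets(count, strength, rng):
    """Guessed approach: one random offset shared by all rays of the frame."""
    shared = tuple(rng.uniform(-strength, strength) for _ in range(3))
    return [shared] * count

rng = random.Random(0)
view_dir = normalize((0.0, -1.0, 1.0))
normal = (0.0, 1.0, 0.0)

noisy = [jittered_reflection(view_dir, normal, o)
         for o in per_ray_offsets(4, 0.05, rng)]
coherent = [jittered_reflection(view_dir, normal, o)
            for o in per_frame_offsets(4, 0.05, rng)]
```

With the shared offset all four reflection directions come out identical, so neighbouring rays stay parallel; with per-ray offsets they fan out, which is where both the noise and the incoherent traversal come from.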
So the missing noise is no hint of any spectacular new tech either. It really seems to be mostly triangle raytracing?
Maybe they limit themselves to only sharp reflections for now because that keeps the rays coherent and avoids the need for denoising, but the quality shown is better than needed, and more glossy reflections could be made by a bilateral blur / falling back to voxels earlier?
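For the bilateral blur option, this is roughly what I have in mind, reduced to one dimension to keep it short. It is only a sketch under my own assumptions: a row of reflection values is blurred with a spatial Gaussian, and a second weight based on a per-pixel depth value stops the blur from leaking across edges. The function name and the sigma parameters are made up.

```python
import math

def bilateral_blur_1d(color, depth, radius=2, sigma_space=1.5, sigma_depth=0.1):
    """Blur `color` along one row, but down-weight neighbours whose
    `depth` differs from the centre pixel (edge-preserving)."""
    out = []
    for i in range(len(color)):
        total = 0.0
        weight_sum = 0.0
        lo = max(0, i - radius)
        hi = min(len(color), i + radius + 1)
        for j in range(lo, hi):
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            w_depth = math.exp(-((depth[i] - depth[j]) ** 2) / (2.0 * sigma_depth ** 2))
            w = w_space * w_depth
            total += w * color[j]
            weight_sum += w
        # weight_sum is at least 1 because the centre pixel has weight 1.
        out.append(total / weight_sum)
    return out

sharp = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
same_surface = bilateral_blur_1d(sharp, [1.0] * 6)                        # edge gets softened
two_surfaces = bilateral_blur_1d(sharp, [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])  # edge is kept
```

A wider radius driven by roughness would give the glossy look, at the usual cost of wrong results where the reflected geometry changes but the depth at the reflecting surface does not.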
I'd be mostly interested in options to trade accuracy vs. performance. The video is 30fps and 2K, so upscaling the reflections could work to get a 4K game on next gen, I guess.
Anyway, it's impressive that they managed to make it that fast using general purpose compute.