Game development presentations - a useful reference

I don't know this person, but it's an interesting watch


Edit: Kind of interesting. Basically saying forward rendering is better because without SSR you can save a lot of bandwidth and scale to lower-end GPUs. The trade-off is pixel overdraw because shading is done as quads. Kind of makes me wonder about a lot of GPUs, because I feel like compute is scaling faster than memory bandwidth and memory latency.

triangle visibility buffering is not considered here.
 

Yeah this is exclusively for super, super, super low end hw targets. As soon as you get anything approaching modern triangle counts you start getting triangle overdraw using forward very quickly.
 
SIGGRAPH 2023 stuff is being posted, including Sebastian Aaltonen's talk, which was supposed to be very good.


The Call of Duty terrain one is fascinating, because it helps show why Call of Duty looks so ancient now, at least in terms of multiplayer. Hadn't quite realized how wide their platform target is, "32mb is a lot of memory on some platforms" is just, damn.

On the other hand, if visuals really do sell (I'm not convinced how much that's actually the case), it just means someone who targeted even, say, Switch 2 mobile @ 30fps as a minimum platform would easily be able to dominate CoD visually.
 
 
AMD posted an informative blog about GPU occupancy.

So occupancy going up can mean performance goes down, because of that complex interplay between getting the most out of the GPU’s execution resources while balancing its ability to service memory requests and absorb them well inside its cache hierarchy. It’s such a difficult thing to influence and balance as the GPU programmer, especially on PC where the problem space spans many GPUs from many vendors, and where the choices the shader compiler stack makes to compile your shader can change between driver updates.
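The resource-limit arithmetic behind that trade-off can be sketched roughly. This is a loose illustration only: the budgets below are made-up placeholders, not figures for any real GPU, and real hardware applies more limits than these two:

```python
def waves_per_simd(vgprs_per_wave, lds_per_workgroup, waves_per_workgroup,
                   vgpr_budget=512, lds_budget=65536, max_waves=16):
    """Rough occupancy estimate: the resident wave count per SIMD is
    capped by whichever resource runs out first. Spending more registers
    per wave (often to do more latency hiding within a wave) lowers the
    cap, which is why higher occupancy isn't automatically faster."""
    by_vgpr = vgpr_budget // vgprs_per_wave
    if lds_per_workgroup:
        # LDS is consumed per workgroup, so it caps resident workgroups,
        # and indirectly the waves those workgroups contain.
        by_lds = (lds_budget // lds_per_workgroup) * waves_per_workgroup
    else:
        by_lds = max_waves
    return min(max_waves, by_vgpr, by_lds)
```

E.g. under these invented budgets, going from 64 to 128 VGPRs per wave halves the cap from 8 to 4 waves, which may still be a net win if each wave now stalls less.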

 

The above extension makes stronger guarantees about reconvergence/divergence behaviour during subgroup operations. This prevents compilers from doing certain optimizations on SIMT architectures, since those architectures lack lockstep execution guarantees for subgroup operations in the presence of divergence ...

I also stumbled upon an interesting issue from an IHV claiming they can't sanely implement wave operations for helper lanes, since the Direct3D spec authors implicitly assume maximal reconvergence behaviour by default for their API, which by contrast is explicitly disallowed in that specific case for Vulkan, per a footnote in the aforementioned blog post ...
Alan Baker said:
  • In cases where all remaining invocations in a quad are helpers, implementations may terminate the entire quad - maximal reconvergence cannot be used to require these remain live.
 
https://lfranke.github.io/trips/

Sharper than Spherical Gaussians.

Interesting method. They splat points bilinearly to 2x2 pixels, and into two mips of a framebuffer pyramid. Larger points go to lower-res mip levels, which solves the hole problem.
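As I read it, the splat step could be sketched on the CPU roughly like this. A loose sketch only, not the paper's implementation: the level-selection rule, buffer layout, and all names here are my own guesses.

```python
import math

def build_pyramid(width, height, levels):
    # Per level: rows of [r, g, b, weight] accumulators, halving resolution.
    bufs = []
    w, h = width, height
    for _ in range(levels):
        bufs.append([[[0.0, 0.0, 0.0, 0.0] for _ in range(w)] for _ in range(h)])
        w, h = max(1, w // 2), max(1, h // 2)
    return bufs

def splat_point(pyramid, x, y, radius, color):
    """Bilinearly splat a point into the mip level whose texel size
    roughly matches the point's screen-space radius: larger points go
    to coarser mips, so sparse big points still cover the screen
    without holes."""
    level = min(len(pyramid) - 1,
                max(0, int(math.floor(math.log2(max(radius, 1.0))))))
    s = 1 << level                        # texel size of this level
    lx, ly = x / s - 0.5, y / s - 0.5     # position in the level's texel grid
    ix, iy = int(math.floor(lx)), int(math.floor(ly))
    fx, fy = lx - ix, ly - iy
    buf = pyramid[level]
    h, w = len(buf), len(buf[0])
    for dy in (0, 1):                     # 2x2 bilinear footprint
        for dx in (0, 1):
            tx, ty = ix + dx, iy + dy
            if 0 <= tx < w and 0 <= ty < h:
                wgt = (fx if dx else 1.0 - fx) * (fy if dy else 1.0 - fy)
                px = buf[ty][tx]
                for c in range(3):
                    px[c] += color[c] * wgt
                px[3] += wgt
```

A resolve pass would then blend the levels back down, dividing accumulated color by accumulated weight; that part is omitted here.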

I use the exact same thing to generate my GI environment maps. (Remembering when I asked how to get in contact with the Dreams devs, this was the proposal I wanted to make: use mip maps instead of rendering entire cubes from large points. :D )

However, to make this a general rendering method, they also have to solve transparency / depth sorting. (Which I do not, since visibility is already known in my case.)
And they use something like a list of 16 fragments per pixel in the paper, similar to OIT approaches.

That's slow, of course.
I've tried a faster option: each point goes into only one pixel, using the usual depth comparison to find the closest point.
After that, reconstruct an antialiased image by weighting the winning points from a 3x3 kernel.
I've tried this only on the CPU, and it's promising. But there is some popping, of course. TAA might help.
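That two-pass idea might look like this minimal CPU sketch. The 3x3 falloff weights here are placeholders of my own, not the actual kernel:

```python
def splat_closest(width, height, points):
    """Pass 1: each point lands in exactly one pixel; a z-test keeps only
    the nearest point per pixel (no per-pixel fragment lists, unlike OIT).
    Pass 2: reconstruct an antialiased image by blending the winning
    points in each pixel's 3x3 neighbourhood, weighted by distance."""
    INF = float("inf")
    depth = [[INF] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    for x, y, z, col in points:
        px, py = int(x), int(y)
        if 0 <= px < width and 0 <= py < height and z < depth[py][px]:
            depth[py][px] = z
            color[py][px] = col

    out = [[(0.0, 0.0, 0.0)] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            acc, wsum = [0.0, 0.0, 0.0], 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    nx, ny = px + dx, py + dy
                    if (0 <= nx < width and 0 <= ny < height
                            and color[ny][nx] is not None):
                        w = 1.0 / (1.0 + dx * dx + dy * dy)  # ad-hoc falloff
                        for c in range(3):
                            acc[c] += color[ny][nx][c] * w
                        wsum += w
            if wsum > 0.0:
                out[py][px] = tuple(a / wsum for a in acc)
    return out
```

Because pass 1 keeps only one winner per pixel, points that flip between neighbouring pixels frame to frame cause the popping mentioned above; the 3x3 reconstruction softens it but doesn't remove it.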

However, after I saw the Spherical Gaussians paper I thought that's really high quality, and it can do proper transparency. Maybe a more expensive method is worth it. Maybe it's time to tackle this transparency problem...
 