Game development presentations - a useful reference

Scott_Arm · Feb 2, 2025

A blog post detailing some of the changes in the DirectX 12 Agility SDK. A bunch of these improvements look interesting.

DirectX 12 Agility SDK 1.716.0-preview Explained

Karamazov · Feb 3, 2025

Scott_Arm · Feb 6, 2025

SDK for Neural Texture Compression released. Contains a bunch of links for more info.

GitHub - NVIDIA-RTX/RTXNTC: NVIDIA Neural Texture Compression SDK

NVIDIA Neural Texture Compression SDK. Contribute to NVIDIA-RTX/RTXNTC development by creating an account on GitHub.

github.com

Scott_Arm · Feb 6, 2025

Render Path-Traced Hair in Real Time with NVIDIA GeForce RTX 50 Series GPUs

New article on ray tracing hair and fur with hardware ray tracing support for the linear swept sphere (LSS) primitive.

Render Path-Traced Hair in Real Time with NVIDIA GeForce RTX 50 Series GPUs | NVIDIA Technical Blog

Hardware support for ray tracing triangle meshes was introduced as part of NVIDIA RTX in 2018. But ray tracing for hair and fur has remained a compute-intensive problem that has been difficult to…

developer.nvidia.com

New article on NVIDIA RTX Mega Geometry

NVIDIA RTX Mega Geometry Now Available with New Vulkan Samples | NVIDIA Technical Blog

Geometric detail in computer graphics has increased exponentially in the past 30 years. To render high quality assets with higher instance counts and greater triangle density…

developer.nvidia.com

New blog post on the RTX Neural Kit with links on sample projects for all of the different technologies.

Get Started with Neural Rendering Using NVIDIA RTX Kit | NVIDIA Technical Blog

Neural rendering is the next era of computer graphics. By integrating neural networks into the rendering process, we can take dramatic leaps forward in performance, image quality…

developer.nvidia.com

Scott_Arm · Feb 6, 2025

Solving the Dense Geometry Problem

A new article on AMD's Dense Geometry Format (DGF).

Solving the Dense Geometry Problem

Discover how AMD's Dense Geometry Compression Format (DGF) revolutionizes graphics by compressing complex models for efficient real-time rendering, bridging the gap between rasterization and ray tracing.

gpuopen.com

raytracingfan · Feb 7, 2025

It will be interesting to see how DGF fits with Mega Geometry, if both become standardized by the graphics APIs. They are both used with triangle clusters and could complement each other in theory, although I wonder if the tessellation examples Nvidia is showing would work with DGF. Also, if Linear Swept Spheres are standardized by the graphics APIs, standardized geometry compression formats like DGF should support them.

Scott_Arm said:
New article on ray tracing hair and fur with hardware ray tracing support for the linear swept sphere (LSS) primitive.

The LSS primitive is a thick, round 3D line with varying radii. Multiple linear swept spheres can be chained together to build 3D curves, sharing vertices where they overlap, similar to how triangles are used to build 3D surfaces. LSS is shaped like a cylindrical or conical tube with spheres optionally capping either end.

The earlier material published on LSS made it seem as if the radius would be constant, so only pill-shaped capsules could be represented. Varying radii makes LSS significantly more flexible. If future versions of LSS support providing an angle of revolution then semi-capsules, semi-cones, and other partially-revolved capsules and cones could be represented too. That might not provide much value if the LSS is only a few pixels wide as the article says, but neither would varying radii, and Nvidia chose to support that.
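To make the shape concrete, here is a minimal sketch of an LSS chain as described in the quote: vertices carrying a position and a radius, with neighbouring segments sharing a vertex. The names (`LssVertex`, `segments`, `distance_to_lss`, `inside_chain`) are hypothetical and not taken from any NVIDIA API. The containment test treats a segment as the union of spheres whose centre and radius are linearly interpolated between the two endpoints, and finds the closest one with a ternary search.

```python
import math
from dataclasses import dataclass


@dataclass
class LssVertex:
    position: tuple  # (x, y, z)
    radius: float    # may differ per vertex, giving cone-like segments


def segments(chain):
    """Consecutive vertex pairs; adjacent segments share a vertex."""
    return list(zip(chain[:-1], chain[1:]))


def distance_to_lss(p, a, b, iterations=60):
    """Minimum over t in [0, 1] of |p - c(t)| - r(t).

    c(t) and r(t) are the linearly interpolated centre and radius.
    The function is convex in t, so a ternary search finds its minimum.
    The result is <= 0 exactly when p lies inside the swept sphere.
    """
    def f(t):
        c = [a.position[i] + t * (b.position[i] - a.position[i]) for i in range(3)]
        r = a.radius + t * (b.radius - a.radius)
        return math.dist(p, c) - r

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2)


def inside_chain(p, chain):
    """True if p is inside any segment of the chain."""
    return any(distance_to_lss(p, a, b) <= 0.0 for a, b in segments(chain))


# A tapering strand: three vertices, two segments, shrinking radius.
strand = [
    LssVertex((0.0, 0.0, 0.0), 1.0),
    LssVertex((4.0, 0.0, 0.0), 0.5),
    LssVertex((8.0, 0.0, 0.0), 0.25),
]
```

With a constant radius this degenerates to the pill-shaped capsule; the varying radius is what gives the conical sections.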

Maybe other geometric primitives like bilinear patches will also get HWRT support one day.

MfA · Feb 7, 2025

So if I read Mega Geometry right, cluster LODs are not selected in the ray traversal, but a BLAS with the appropriate LODs is built for each frame? If so, I preferred Intel's attempt.
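A minimal sketch of that reading, assuming LOD selection happens once per frame before the BLAS build rather than per ray during traversal. All names (`Cluster`, `select_lod`, `build_frame_cluster_list`) and the distance thresholds are hypothetical; this is not the Mega Geometry API, and real cluster-LOD schemes typically pick LODs from an error metric rather than plain distance.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    group_id: int   # which part of the mesh this cluster covers
    lod: int        # 0 = finest
    center: tuple   # (x, y, z) used for the distance test


def select_lod(distance, lod_distances):
    """Index of the first threshold the distance falls under, else the coarsest LOD."""
    for lod, limit in enumerate(lod_distances):
        if distance < limit:
            return lod
    return len(lod_distances)


def build_frame_cluster_list(clusters, camera, lod_distances):
    """Pick, per frame, the clusters whose LOD matches the camera distance.

    The returned list stands in for the input of a per-frame BLAS build.
    """
    chosen = []
    for c in clusters:
        d = math.dist(camera, c.center)
        if c.lod == select_lod(d, lod_distances):
            chosen.append(c)
    return chosen


# One mesh region stored at three LODs.
region = [Cluster(0, lod, (0.0, 0.0, 0.0)) for lod in range(3)]
near = build_frame_cluster_list(region, (0.0, 0.0, 5.0), [10.0, 50.0])
far = build_frame_cluster_list(region, (0.0, 0.0, 100.0), [10.0, 50.0])
```

The point of the sketch is only where the decision sits: every ray in the frame sees the same LOD for a given region, which is the difference from a traversal-time scheme.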

MfA · Feb 7, 2025

raytracingfan said:
It will be interesting to see how DGF fits with Mega Geometry

It can be a disk format for clusters. The point of DGF is that it's trivially converted to whatever the silly raytracing APIs desire.

Honestly, I'm a bit disappointed in the second author. After wrangling raytracing into doing it right at Intel, he's now bending over backwards to facilitate raytracing doing it wrong at AMD.

Dictator · Feb 7, 2025

Scott_Arm said:
Solving the Dense Geometry Problem

A new article on AMD's Dense Geometry Format (DGF).

Not sure if I like the direction of DGF for what it means for broader adoption vs. what NV has seemingly schemed.
One requires a new hardware unit, and one does not. Realistically, only AMD will ship that hardware unit in the mid term, so no RDNA 1, no RDNA 2, no RDNA 3 (maybe not even RDNA 4?), no Turing, no Ampere, no Ada Lovelace, no Blackwell, no Arc, no Battlemage. Whereas we have seen Mega Geo at its base already working on RTX 2000, and it is not locked into adding a decompression unit to the GPU for this singular purpose.

Competing standards here with one requiring new decompression unit... yeah... not sure that is the way forward.

edit: how does DGF factor into dynamic tessellation as well?

trinibwoy · Feb 7, 2025

Can’t support legacy hardware forever. If AMD’s stuff works well then having the hardware now will speed adoption in the future similar to RT. Software alternatives are fine in the short term.

Dictator · Feb 7, 2025

trinibwoy said:
Can’t support legacy hardware forever. If AMD’s stuff works well then having the hardware now will speed adoption in the future similar to RT. Software alternatives are fine in the short term.

I think here I am just not sure making a format the basis is the right idea. Like @raytracingfan said, they kind of fit into one another to a degree and are not mutually exclusive, but one approach to solving the problem is eminently less portable.

Not 1:1 relevant, but I am thinking about DirectStorage GPU decompression (which I think has been a failure at this point in terms of adoption). It mandates a format, according to my conversation with Nixxes, and that has made its usage far more troublesome.

DXR left a lot of room for innovation below its broad spec with version 1.0 - that is how we had 3 different vendors come up with 3 different solutions within that broader spec. But with a compression format for geometry being codified in hardware, we are instead seeing one way to tackle the problem being *the only way* for the immediate XXX years after the next PlayStation comes out/next DirectX comes out. Do Intel and Nvidia then by necessity implement this hardware or format? Or do they pay the price of recompression? Ehhhh I am not liking this.

As a reminder - the consoles are setting the pacing for API advancement seemingly and AMD is *the consoles*.

Edit: Perhaps this requires a different thread before we start clogging up this one

Ethatron · Feb 7, 2025

raytracingfan said:
It will be interesting to see how DGF fits with Mega Geometry, if both become standardized by the graphics APIs. They are both used with triangle clusters and could complement each other in theory, although I wonder if the tessellation examples Nvidia is showing would work with DGF.

It's a wash, they are basically the same. The only difference is that DGF is serializable and CLAS isn't.
Tessellation is done by sub-division and re-clustering of the base mesh.

raytracingfan said:
Also, if Linear Swept Spheres are standardized by the graphics APIs, standardized geometry compression formats like DGF should support them.

There's no overlap between the two concepts. You won't construct BLASes of objects containing triangles and LSSes at the same time. You construct two separate BLASes of different types.
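A minimal sketch of that split, assuming an object's geometry arrives as a mixed list of tagged primitives. The function name and the tags are hypothetical, and the returned lists merely stand in for per-type BLAS build inputs; no real API call is shown.

```python
from collections import defaultdict


def group_by_primitive_type(primitives):
    """Split (kind, payload) pairs into one build-input list per primitive kind."""
    groups = defaultdict(list)
    for kind, payload in primitives:
        groups[kind].append(payload)
    return dict(groups)


# A character: triangle clusters for the body, LSS strands for the hair.
character = [
    ("triangles", "body_cluster_0"),
    ("lss", "hair_strand_0"),
    ("triangles", "body_cluster_1"),
    ("lss", "hair_strand_1"),
]
build_inputs = group_by_primitive_type(character)
# Each entry would feed its own BLAS; both are then instanced under the same TLAS.
```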

Dictator said:
Not sure if I like the direction of DGF for what it means for broader adoption vs. what NV has seemingly schemed.
One requires a new hardware unit, and one does not.

Neither is either-or. Nvidia emulates on older hw (or uses a programmable traversal & intersection processor; probably not the case, otherwise they'd have jumped on Intel's traversal shader proposal), in the same way that AMD suggests it could manage backwards compatibility across architectures. In both cases the actual CLAS type or DGF is custom and new, distinct from current ASes.

Dictator said:
Whereas we have seen Mega Geo at its base already working on RTX 2000, and it is not locked into adding a decompression unit to the GPU for this singular purpose.

DGF is also not locked into needing a hardware block supporting it.

Dictator said:
Competing standards here with one requiring new decompression unit... yeah... not sure that is the way forward.

I also think S3TC and BC were bad suggestions. Software lossy compression with programmable filtering would have been much better. /s
DGF proposes block compression for geometry, with selectable bitrate, while at the same time synergizing with clustering. If you want this, you have to bite the bullet and converge/cooperate on some format.
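As a generic illustration of "block compression with selectable bitrate" for geometry (explicitly not the DGF bitstream): each cluster stores one anchor plus small integer offsets, and the bit width per component is the knob. All function names here are hypothetical.

```python
def quantize_block(vertices, bits):
    """Quantize one cluster's vertices to `bits` per component.

    Returns (anchor, step, offsets): a per-block anchor at the minimum
    corner, the grid step, and integer offsets in [0, 2**bits - 1].
    """
    levels = (1 << bits) - 1
    anchor = tuple(min(v[i] for v in vertices) for i in range(3))
    extent = max(max(v[i] for v in vertices) - anchor[i] for i in range(3))
    step = extent / levels if extent > 0 else 1.0
    offsets = [
        tuple(round((v[i] - anchor[i]) / step) for i in range(3))
        for v in vertices
    ]
    return anchor, step, offsets


def dequantize_block(anchor, step, offsets):
    """Reconstruct approximate positions from the anchor and integer offsets."""
    return [tuple(anchor[i] + o[i] * step for i in range(3)) for o in offsets]


def position_bits(vertex_count, bits):
    """Bits spent on positions alone (ignores anchor, header, and connectivity)."""
    return vertex_count * 3 * bits


cluster = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (0.5, 1.5, 2.5)]
anchor, step, offsets = quantize_block(cluster, bits=8)
restored = dequantize_block(anchor, step, offsets)
```

Fewer bits shrink the block at the cost of a coarser grid; the reconstruction error per component is bounded by half the step.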

Dictator said:
edit: how does DGF factor into dynamic tessellation as well?

Exactly the same way.

DegustatoR · Feb 7, 2025

I'm not that sure that MG and DGF are tackling the same thing.
Presumably Nvidia h/w already does some sort of geo/BVH compression; DGF seems to just move that to storage instead of it being a wholly on-GPU runtime thing.
Both can co-exist and neither should prevent each other from being used IMO.

raytracingfan · Feb 7, 2025

MfA said:
So if I read Mega Geometry right, cluster LODs are not selected in the ray traversal, but a BLAS with the appropriate LODs is built for each frame? If so, I preferred Intel's attempt.

AMD has its own interesting idea for RT LOD, which allows per-ray LOD selection during traversal without needing to store multiple LOD instances in memory.

OlegSH · Feb 7, 2025

DegustatoR said:
Presumably Nvidia h/w already does some sort of geo/BVH compression; DGF seems to just move that to storage instead of it being a wholly on-GPU runtime thing.

Besides, AMD's BVH has always been 2 to 3 times the size of NVIDIA's BVH since Turing. Now, with Blackwell's claimed 75% BVH size on top, this would translate into 2.7 - 4x compression ratios for BVH when compared to the current AMD products, which should exceed or match the DGF compression ratios for BVH that AMD depicts in this image.
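Spelling out that arithmetic, assuming "claimed 75% BVH size" means a Blackwell BVH is 0.75x the size of the earlier NVIDIA one (the function name is just for illustration):

```python
def ratio_vs_blackwell(amd_over_nvidia, blackwell_fraction=0.75):
    """How many times larger the AMD BVH is than a Blackwell BVH.

    amd_over_nvidia: AMD BVH size relative to the pre-Blackwell NVIDIA BVH.
    blackwell_fraction: Blackwell BVH size relative to that same baseline.
    """
    return amd_over_nvidia / blackwell_fraction


low = ratio_vs_blackwell(2.0)
high = ratio_vs_blackwell(3.0)
print(f"{low:.2f}x to {high:.2f}x")  # prints "2.67x to 4.00x"
```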

The issue with DGF is that it requires compressing geometry, whereas MG does not require any additional compression from the engine to function.
As a result, MG has broader hardware and sw support, does not require any content preprocessing from the engine, and provides the same if not better compression ratios for BVH, so what's the point of DGF then for anyone other than AMD?

Ethatron · Feb 7, 2025

OlegSH said:
The issue with DGF is that it requires compressing geometry, whereas MG does not require any additional compression from the engine to function.
As a result, MG has broader hardware and sw support, does not require any content preprocessing from the engine, and provides the same if not better compression ratios for BVH, so what's the point of DGF then for anyone other than AMD?

Looks like a good substitute for Nanite's runtime structure with native hardware support.
Mesh-shader support is hinted at.

DegustatoR · Feb 8, 2025

OlegSH said:
so what's the point of DGF then for anyone other than AMD?

Some savings on storage and bandwidth. Dunno if it's significant for the geometry - with full-detail Nanite-like meshes it could be?

OlegSH · Feb 8, 2025

Ethatron said:
Looks like a good substitute for Nanite's runtime structure with native hardware support.

And why exactly would you want to replace something that works on practically all hardware with something that works on none?

The only possible reason could be achieving higher compression ratios with DGF or perhaps noticeably better performance.

However, the claimed 4 bytes per triangle for DGF are quite comparable to Nanite's geometry compression ratio, which should be of a similar level. I have also never seen a game where decompressing Nanite's storage format has been an issue or a performance bottleneck.

Regarding potential frame time spikes, which AMD mentions in its blog when transcoding from the storage format, the decompression speed of the early version of Nanite's storage format was around 50 GB/s on the PS5 with unoptimized software. This is already much faster than what the PS5 SSD can provide. We have yet to see a game where those rates would be fully saturated. In existing titles, there is no scenario where those sw decompression rates would cause frame time spikes.
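Rough numbers for scale, taking the 4 bytes per triangle and 50 GB/s figures above at face value. The 1 GB = 10^9 bytes convention and the 60 fps target are assumptions added here, and the sketch ignores any difference between compressed input and decoded output sizes.

```python
BYTES_PER_TRIANGLE = 4            # compressed size per triangle quoted above
DECODE_BYTES_PER_SECOND = 50e9    # software decode rate quoted above (assumes GB = 1e9 bytes)


def triangles_per_second(bytes_per_second, bytes_per_triangle):
    """Triangles that can be decoded per second at a given byte rate."""
    return bytes_per_second / bytes_per_triangle


def triangles_per_frame(bytes_per_second, bytes_per_triangle, fps=60):
    """The same budget expressed per frame; fps=60 is an assumed target."""
    return triangles_per_second(bytes_per_second, bytes_per_triangle) / fps


per_second = triangles_per_second(DECODE_BYTES_PER_SECOND, BYTES_PER_TRIANGLE)
per_frame = triangles_per_frame(DECODE_BYTES_PER_SECOND, BYTES_PER_TRIANGLE)
print(f"{per_second:.3g} triangles/s, {per_frame:.3g} triangles/frame")
```

Under those assumptions the decode budget comes to about 12.5 billion triangles per second, or roughly 208 million per frame.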

trinibwoy · Feb 8, 2025

Ethatron said:
I also think S3TC and BC were bad suggestions.

I see your point but it’s not a perfect analogy. Compressed texture mips are self-contained and don’t have to worry about continuous LOD or cluster seams etc that complicate triangle meshes. DGF also doesn’t seem to handle other geometry structures like LSS.

If DGF solves all those problems, encourages a highly efficient cache architecture, enables faster runtime tracing and raster and allows for continued innovation then wide adoption will be great for everyone. Tall order though.

Ethatron · Feb 8, 2025

OlegSH said:
And why exactly would you want to replace something that works on practically all hardware with something that works on none?

The only possible reason could be achieving higher compression ratios with DGF or perhaps noticeably better performance.

Sounds like good reasons.

OlegSH said:
However, the claimed 4 bytes per triangle for DGF are quite comparable to Nanite's geometry compression ratio, which should be of a similar level. I have also never seen a game where decompressing Nanite's storage format has been an issue or a performance bottleneck.

BVH construction is a big performance problem. CLAS construction forces one to transcode and then still build all of those Nanite clusters. With DGF neither is necessary, you put what you have on disk directly into the hierarchy.

trinibwoy said:
I see your point but it’s not a perfect analogy. Compressed texture mips are self-contained and don’t have to worry about continuous LOD or cluster seams etc that complicate triangle meshes. DGF also doesn’t seem to handle other geometry structures like LSS.

You don't mix and match triangles and spheres in the same BLAS for the same purpose. Having the equivalent of DGF for AABBs or OBBs or Spheres or LSS or Lozenges is something that can be complemented.

trinibwoy said:
If DGF solves all those problems, encourages a highly efficient cache architecture, enables faster runtime tracing and raster and allows for continued innovation then wide adoption will be great for everyone. Tall order though.

We can encourage this direction if we see the glass as half full instead of half empty and are constructive (good faith) along with criticism. Exploration is valuable, paths to solutions are not always linear, nor should they be unilateral.

An implementation of virtualized geometry is excessively complicated and intrusive. If the available support in the API/hw is such that it eventually just requires calling a handful of (portable) functions, then it's of benefit for every developer, big and small.
