AMD: RDNA 3 Speculation, Rumours and Discussion

Spider-Man: Miles Morales uses it for body deformation.

If that's all there is to it, then yes, software can be used and I understand people being underwhelmed by the tech. Reconstructing an image from 1080p to 4K with the final image looking as good as, if not better than, native 4K is a different beast. If that can be done entirely in software as well, then yes, I can see why one would wonder why there is dedicated hardware for it.
 
Reconstructing an image from 1080p to 4K with the final image looking as good as, if not better than, native 4K is a different beast. If that can be done entirely in software as well, then yes, I can see why one would wonder why there is dedicated hardware for it.
AFAIK, DLSS 2 uses temporal upscaling, which means the game renders frames with a periodic sequence of subpixel offsets such as (0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75). If you combine 4 such frames, you get a correct 4K frame (but textures appear blurrier if you do not manually decrease filter sizes accordingly). This is also how TAA works, but with motion it breaks down or causes unwanted motion blur, which is why TAA requires reprojecting the previous frame(s) with motion vectors.
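A minimal sketch of that jitter idea (the helper names are mine, not taken from any engine or from DLSS itself): each frame picks the next offset from the 2x2 pattern and converts it to an NDC offset for the projection matrix.

Code:
#include <array>
#include <cstddef>

struct Jitter { float x, y; };   // subpixel offset within a render pixel

// One offset per frame; four frames together cover the 2x2 subpixel grid of
// a 2x-upscaled target.
static const std::array<Jitter, 4> kPattern = {{
    {0.25f, 0.25f}, {0.25f, 0.75f}, {0.75f, 0.25f}, {0.75f, 0.75f}
}};

// NDC-space offset to add to the projection matrix for this frame
// (which matrix element it goes into depends on your convention).
Jitter NdcJitter(std::size_t frame, float renderW, float renderH)
{
    const Jitter j = kPattern[frame % kPattern.size()];
    return { (j.x - 0.5f) * 2.0f / renderW,
             (j.y - 0.5f) * 2.0f / renderH };
}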
Now I guess the ML part here either removes the need for motion vectors (which are difficult to create in some cases) and/or improves AA and detail with pattern-detection methods.
Such methods of combined TAA and upscaling exist. I think UE has one, and people say it's almost as good as DLSS but also more costly (hinting at least at the win from tensor cores, though that postprocess is still only a fraction of frame time).
So maybe DLSS is not that impressive / magic from the upscaling perspective, because it has those subframes, which it 'only' has to convert into a temporally stable image.

I was very impressed by the Google upscaling shown recently, which increased resolution a whole 4 times. That really feels like magic to me, and I guess it works by learning the whole scene, so pattern detection can reconstruct its detail features.
Such a method likely can't be used for a whole game with many different scenes, much less for an arbitrary game without specific training. It's also too expensive.


Skinning is an interesting application. Maybe it's the most embarrassing failure of games: we scan in Keanu and present a movie-like experience in many cases, but then show completely wrong anatomy as soon as bones move too far from the rest pose.
To fix this, we would need a whole-body simulation: bones, muscles, volume preservation, sliding skin making folds... It's a huge manual effort to code multiple simulation techniques and combine them, a parameter-tweaking hell, and in the end too expensive for current real-time tech.
So it's not possible yet, and there is not much progress on cheaper approximations. I have worked on this myself a lot, and only after decades of failures did I find something that seems to work, but I need to build artist-friendly tools to set it up before I can be sure, and then still need to add muscles and bone collisions...
It's a huge problem, and much harder / more important than e.g. hair simulation and rendering, which gets discussed far more often.

Now I can imagine ML doing this well, similar to how it can speed up fluid simulations. The training is probably much less work than developing a hand-written system to replicate it. So yeah, that's one more good argument to sell me on tensor cores.
 
Now I guess the ML part here either removes the need for motion vectors (which are difficult to create in some cases) and/or improves AA and detail with pattern-detection methods.
DLSS requires motion vectors as input. If a NN were used instead of motion vectors, DLSS could have been used in literally every game. I suspect neither optical flow nor NN approximations would have enough precision for the task.
The NN is used to combine the two aligned frames, since neighborhood clipping is where the information loss happens with TAA.
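For reference, a minimal sketch of that naive neighborhood clipping (my own simplification, not actual TAA or DLSS source): the reprojected history color is clamped to the min/max box of the current frame's 3x3 neighborhood, and anything outside the box, which is often real detail, gets thrown away.

Code:
#include <algorithm>

struct Color { float r, g, b; };

// Clamp the reprojected history sample to the AABB (in color space) of the
// 3x3 neighborhood of the current low-res frame.
Color ClampToNeighborhood(Color history, const Color n[9])
{
    Color lo = n[0], hi = n[0];
    for (int i = 1; i < 9; ++i) {
        lo = { std::min(lo.r, n[i].r), std::min(lo.g, n[i].g), std::min(lo.b, n[i].b) };
        hi = { std::max(hi.r, n[i].r), std::max(hi.g, n[i].g), std::max(hi.b, n[i].b) };
    }
    return { std::clamp(history.r, lo.r, hi.r),
             std::clamp(history.g, lo.g, hi.g),
             std::clamp(history.b, lo.b, hi.b) };
}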

Such methods of combined TAA and upscaling exist.
It has been around for a long time; it's called TAAU. The issue is that it suffers from neighborhood clipping even more than TAA at native resolution, hence the quality and detail losses are much worse.

I think UE has one, and people say it's almost as good as DLSS but also more costly (hinting at least at the win from tensor cores, though that postprocess is still only a fraction of frame time).
The current implementation in UE 4.26 is nowhere close to DLSS.
 
Well, right now only RTX GPUs let you use the DLSS 2.0 reconstruction tech, and according to NV that's due to the hardware implementation. NV could be lying or simply making things up, but this isn't the topic for that ;) I'm sure AMD will have much improved ray tracing and reconstruction tech / ML hardware in RDNA3 GPUs, which shouldn't be far off now. Seeing how hard it is to get any GPUs at the moment, it doesn't really matter that they're behind anyway.
 
The Nvidia Broadcast app uses DL for background removal. Photoshop also uses machine learning. If we had had a normal GDC this year, I bet there would have been a lot of talks about using ML for gaming. I also saw a game that was developed around GPT-3.

Neural Filters is a major breakthrough in AI-powered creativity and the beginning of a complete reimagination of filters and image manipulation inside Photoshop. This first version ships with a large set of new filters. Many of these filters are still in the beta quality state. We’ve decided to ship them to you now so you can try them out and give feedback and help shape the future of AI in Photoshop. Neural Filters is part of a new machine learning platform, which will evolve and get better over time – expanding on what’s possible exponentially.
https://blog.adobe.com/en/publish/2...st-advanced-ai-application-for-creatives.html
 
If we had had a normal GDC this year, I bet there would have been a lot of talks about using ML for gaming. I also saw a game that was developed around GPT-3.
There were a few talks at GTC.

Fully Fused Neural Network for Radiance Caching in Real Time Rendering
This one is my favourite. The quality of the NN cache seems to be better than caching radiance in probes or voxels: no light leaks, it captures diffuse and high-frequency secondary lighting, it doesn't require any prebaking, and it is also capable of accelerating RTXDI.
This cache does require tensor cores for real-time training and inference, though.
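Just to illustrate the rough idea (this is my own toy sketch, not the talk's fully fused implementation): the cache is a small MLP that maps a shading query, e.g. position plus direction packed into a few floats, to radiance, with weights trained online from traced samples instead of being stored in probes or voxels.

Code:
#include <algorithm>
#include <array>

constexpr int kIn = 6, kHidden = 32, kOut = 3;

struct TinyMlp {
    // Weights would be trained online from traced samples; here they are
    // just zero-initialized placeholders.
    std::array<float, kHidden * kIn>  w0{};
    std::array<float, kHidden>        b0{};
    std::array<float, kOut * kHidden> w1{};
    std::array<float, kOut>           b1{};

    // Query the cache: (position, direction) -> RGB radiance estimate.
    std::array<float, kOut> Query(const std::array<float, kIn>& x) const {
        std::array<float, kHidden> h{};
        for (int i = 0; i < kHidden; ++i) {
            float s = b0[i];
            for (int j = 0; j < kIn; ++j) s += w0[i * kIn + j] * x[j];
            h[i] = std::max(s, 0.0f);                  // ReLU
        }
        std::array<float, kOut> y{};
        for (int i = 0; i < kOut; ++i) {
            float s = b1[i];
            for (int j = 0; j < kHidden; ++j) s += w1[i * kHidden + j] * h[j];
            y[i] = s;                                   // cached radiance
        }
        return y;
    }
};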
 

Next year's going to be a really exciting year for PC gaming. We've got RDNA3 and Ada Lovelace with rumours of them being >2x faster than Ampere (difficult to believe), Zen 4 with rumours of a 25% uplift over Zen 3, Alder Lake with its new big.LITTLE design (and also rumours of massive performance uplifts), DDR5 which should at least double typical RAM capacities with much faster speeds, PCIe 5 (unfortunately only from Intel), and of course DirectStorage should start seeing traction by then too.

If I can get my hands on one, I think I'll just grab a 3060 Ti to tide me over until late 2022 and then stump up for a monster upgrade.
 
100 TFLOPs, that can't be correct?

Next year's going to be a really exciting year for PC gaming. We've got RDNA3 and Ada Lovelace with rumours of them being >2x faster than Ampere (difficult to believe), Zen 4 with rumours of a 25% uplift over Zen 3, Alder Lake with its new big.LITTLE design (and also rumours of massive performance uplifts), DDR5 which should at least double typical RAM capacities with much faster speeds, PCIe 5 (unfortunately only from Intel), and of course DirectStorage should start seeing traction by then too.

If I can get my hands on one, I think I'll just grab a 3060 Ti to tide me over until late 2022 and then stump up for a monster upgrade.


Oh absolutely. RDNA3 looks to be much more promising than RDNA2 ever was/is, same for Zen 3 and 4 over Zen 2. I'm hanging tight with the 2080 Ti for a good while yet; I've had it since 2018, and it's going to be a 4-5 year GPU/PC until I get a completely new gaming system.

With MS focusing on PC gaming more than ever before, and Sony as well, we're in for good times. Maybe the difficulty getting hardware right now just forces people to wait and end up with even better hardware down the line.
 
The NN is used to combine the two aligned frames, since neighborhood clipping is where the information loss happens with TAA.
So the NN improves over the naive bounding box in color space? Makes sense.
I would love to see texture compression with neural nets, research results look promising.
Yeah, this would be the killer application, but I did not follow the research. Some quick googling gave me an improvement over JPEG of about 30%, and without block artifacts.
Is there research showing much better ratios?
 
Oh absolutely. RDNA3 looks to be much more promising than RDNA2 ever was/is, same for Zen 3 and 4 over Zen 2. I'm hanging tight with the 2080 Ti for a good while yet; I've had it since 2018, and it's going to be a 4-5 year GPU/PC until I get a completely new gaming system.

With MS focusing on PC gaming more than ever before, and Sony as well, we're in for good times. Maybe the difficulty getting hardware right now just forces people to wait and end up with even better hardware down the line.

Yes, you're in a great position with a 2080 Ti. I certainly wouldn't bother upgrading in your position until at least the next-gen GPUs launch. It'll easily handle anything thrown at it for the next couple of years without having to make any serious compromises.
 
DLSS requires motion vectors as input. If a NN were used instead of motion vectors, DLSS could have been used in literally every game. I suspect neither optical flow nor NN approximations would have enough precision for the task.
The NN is used to combine the two aligned frames, since neighborhood clipping is where the information loss happens with TAA.


It has been around for a long time; it's called TAAU. The issue is that it suffers from neighborhood clipping even more than TAA at native resolution, hence the quality and detail losses are much worse.


The current implementation in UE 4.26 is nowhere close to DLSS.

I dunno, the new one has advantages and disadvantages versus DLSS; I'm not sure there's an absolutely clear winner. You can see a video below. DLSS does better with subpixel detail at the moment, but there's a tendency for variance in scale, with the neural net over-guessing, making the phone wires too big and clearly aliasing a lot in motion because of it. Meanwhile the TAAU doesn't do well with subpixel detail, making it jittery in position, but once you can see the phone wire clearly it looks a lot smoother and more stable than DLSS. Clearly TAAU needs work on subpixel detail; it's throwing out subpixel detail after it disappears from camera jitter. But when it doesn't, it looks a lot more stable, so I'm hoping future non-"experimental" versions are just an overall win.

 
I've been thinking that with the introduction of Infinity Cache, the debate between deferred vs forward renderers rages once more ...

With enough fast on-chip memory, we can afford to store our G-buffer in this memory for a deferred renderer. On GCN/RDNA, register pressure and hardware occupancy can be a frequent problem when encountering ubershaders in a forward rendering pipeline. Deferred rendering with Infinity Cache is a great combination for RDNA2, since we can use specialized shaders with lower register pressure to get potentially the most optimal performance in this configuration ...

On other architectures such as current Nvidia GPUs, where compute capabilities are vastly outgrowing memory performance, it's a less than ideal scenario to burn most of the available bandwidth on the G-buffer. If register pressure isn't as big an issue and there are lots of ALUs to spare, then forward renderers might be a better fit in their case, since running complex ubershaders doesn't have the same performance constraints for them ...

If on-chip memory like Infinity Cache were to keep growing in bandwidth/capacity in the future, we could potentially support MSAA G-buffers or deep (for transparency) G-buffer layouts (AKA "fat" G-buffers), so that deferred renderers may be able to support features like MSAA/transparency at some point!
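Back-of-the-envelope sketch of how that plays out (the per-pixel byte counts are just my assumptions, not any specific engine's layout):

Code:
#include <cstdio>

static double GBufferMB(int w, int h, int bytesPerPixel)
{
    return double(w) * h * bytesPerPixel / (1024.0 * 1024.0);
}

int main()
{
    // Lean layout: albedo (RGBA8) + normal/roughness (RGB10A2) + depth (D32)
    // = 12 bytes/pixel. "Fat" layout adds motion vectors (RG16F) and
    // emissive (RGBA16F) = 24 bytes/pixel.
    std::printf("1440p lean: %5.1f MB\n", GBufferMB(2560, 1440, 12)); // ~42 MB
    std::printf("1440p fat : %5.1f MB\n", GBufferMB(2560, 1440, 24)); // ~84 MB
    std::printf("4K lean   : %5.1f MB\n", GBufferMB(3840, 2160, 12)); // ~95 MB
    std::printf("4K fat    : %5.1f MB\n", GBufferMB(3840, 2160, 24)); // ~190 MB
    // Against 128 MB of Infinity Cache: a lean 4K or fat 1440p G-buffer fits,
    // while a fat 4K one (or MSAA/deep layouts) already spills, which is why
    // growing the on-chip capacity matters for the deferred case.
}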
 
but there's a tendency for variance in scale, with the neural net over-guessing, making the phone wires too big and clearly aliasing a lot in motion because of it
That's not over-guessing, but rather how the low-res input image looks when the temporal part fails to accumulate any pixels. If you pay attention to the background behind the wires, you will quickly spot the dramatic difference in foliage detail between DLSS and the clipping in TAAU.
TAAU does seem to fall apart more gracefully on wires, though. Instead of fully refusing to accumulate pixels like the NN in DLSS does on wires, TAAU still accumulates pixels (not that it has a choice, since its heuristics work uniformly across the image) with stochastic jittering and randomly kills signal with color clipping. This certainly helps reduce the visibility of the regular pixel-grid pattern on wires by making them more diffuse and less visible (essentially by randomly splatting the wire's pixels via the Halton-sequence jitter distribution and washing them out with background pixels via the neighborhood clipping). This effect doesn't seem intentional, because as you can see it works well only on wires and only in motion, while the rest of the image becomes very blurry and suffers greatly from detail loss both in motion and in static shots.
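For reference, the Halton-sequence jitter mentioned above is just a low-discrepancy sequence; a minimal generator (base 2 for x, base 3 for y), as commonly used for per-frame subpixel offsets, looks roughly like this:

Code:
#include <cstdio>

// Radical-inverse / Halton value for a given index and base, in [0, 1).
static float Halton(unsigned index, unsigned base)
{
    float f = 1.0f, result = 0.0f;
    while (index > 0) {
        f /= base;
        result += f * (index % base);
        index /= base;
    }
    return result;
}

int main()
{
    // First 8 jitter offsets, centered around the pixel center.
    for (unsigned i = 1; i <= 8; ++i)
        std::printf("frame %u: (%.3f, %.3f)\n",
                    i, Halton(i, 2) - 0.5f, Halton(i, 3) - 0.5f);
}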
 
This latter point holds true for growing enterprise ML needs, but not for games. NV is running an experiment to see whether it works to introduce new HW features, together with their application, into a market which did not ask for them.

We may never know how much DLSS and tensors played a role in Nvidia’s raytracing ambitions, but upscaling has dramatically increased the viability of RT on today’s hardware at modern resolutions. Maybe without DLSS Nvidia would’ve tried to push checkerboarding on PC or some other upscaling solution.

The market didn’t ask for ML-based upscaling specifically, but Nvidia didn’t invent upscaling, and there’s certainly widespread interest and investment in upscaling tech that has nothing to do with tensors. So I would say the market definitely wants better, faster upscaling, and DLSS meets that need.

DLSS may just be a consequence of gaming hardware getting tensors for “free”, so they might as well be used for something. Or maybe it was part of Nvidia’s grand design from the beginning and AMD got caught out on both the RT and upscaling fronts. Nobody is asking for ML-based upscaling per se, but AMD has to respond with some workable alternative, whether it uses ML or not.
 