AMD: RDNA 3 Speculation, Rumours and Discussion

Intel and nVidia can use their RT cores concurrently with their compute cores. So they are able to use these dedicated cores to traverse the BVH and do triangle intersection while computing and shading at the same time.
Just like AMD can? What AMD can't do, and Intel and NVIDIA can, is texturing and RT at the same time.
 
Just like AMD can? What AMD can't do, and Intel and NVIDIA can, is texturing and RT at the same time.
Can RDNA actually co-issue tex/intersection and math, or does issuing the instruction stall the math SIMDs for a cycle?

But looking only at intersection is an incredibly limited view. RDNA does traversal via normal math instructions, while Nvidia and Intel offload that to the RT cores. And more problematic than simple contention for resources is that traversal is a very branchy workload that is poorly suited to SIMD hardware.
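To illustrate the divergence point with a toy model (the step counts below are made up, and real wavefront scheduling is more involved than this): in a lockstep SIMD wave, every lane has to keep iterating until the slowest ray finishes its traversal loop, so utilization collapses as rays diverge.

```python
# Toy model of why branchy BVH traversal is a poor fit for SIMD hardware.
# Each "lane" is a ray; per-ray node-visit counts are hypothetical.
# A lockstep wave keeps executing the traversal loop until the slowest
# lane finishes, so lanes that are done early just sit idle.

traversal_steps = [3, 5, 21, 4, 7, 30, 2, 6]  # hypothetical node visits per ray

wave_iterations = max(traversal_steps)                 # lockstep: bound by the worst ray
useful_work = sum(traversal_steps)                     # node visits that were actually needed
issued_slots = wave_iterations * len(traversal_steps)  # lane-iterations the wave burns

print(f"wave runs {wave_iterations} iterations")
print(f"SIMD lane utilization: {useful_work / issued_slots:.1%}")  # ~32.5% in this toy case
```

Dedicated traversal units sidestep this by letting each ray run its loop independently of the SIMD wave, which is the advantage being described above.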
 
And if they can further add something like the SER in Nvidia GPUs, then they might even be competitive in ray tracing performance. Let's hope for the best.

If I recall correctly, one of the DF Directs took on the matter and concluded that AMD at some point has to implement RT and ML in hardware in the same vein as NV and Intel do, to be able to compete in the market's now-largest graphics features: machine learning/AI and ray tracing. I'm confident that AMD will; the question is when. Lagging behind this much isn't going to be an option, as the competition will take market share. And if we do see another generation of consoles, then whether to go with AMD might become a consideration for manufacturers, who may have to resort to Intel or even NV. Now, I know consoles aren't that high-margin a product, but it's not something AMD wants to miss out on either.
 
If I recall correctly, one of the DF Directs took on the matter and concluded that AMD at some point has to implement RT and ML in hardware in the same vein as NV and Intel do, to be able to compete in the market's now-largest graphics features: machine learning/AI and ray tracing. I'm confident that AMD will; the question is when. Lagging behind this much isn't going to be an option, as the competition will take market share. And if we do see another generation of consoles, then whether to go with AMD might become a consideration for manufacturers, who may have to resort to Intel or even NV. Now, I know consoles aren't that high-margin a product, but it's not something AMD wants to miss out on either.
"Machine learning/AI" is really getting far more credit than it deserves in gaming. It's literally used for scaling which can be done without it too, that's far cry from "largest graphic features".
AMD has implemented ML hardware where they see it matter, aka Instinct/CDNA, and they're implementing accelerators in another segment where they see it relevant, aka CPUs.

As for "same vein" implementation, again, dedicated or part of something else isn't what dictates the performance.
 
As for "same vein" implementation, again, dedicated or part of something else isn't what dictates the performance.

By dedicated I assume he means not running traversal on SIMDs. It’s not related to the shared texture unit for intersection.

With Nvidia and Intel both pushing hardware sorting, AMD will be in a tough spot if they stubbornly stick to SIMD traversal and leave it to devs to figure out. I wouldn't be surprised if Intel has stronger devrel with game developers than AMD does.
 
The sole fact that Ada is anywhere from 4 to 8 times faster in RT than RDNA2 proves that it is indeed possible to achieve.

Stop this nonsense. It's nowhere near 4-8 times faster, unless you believe crappy benchmarks like 3DMark, which has already manipulated its results (limiting Radeon cards in the DX12 Time Spy test when they turned out to be faster than GeForce) and translates less and less to actual games.


You can say it's 2.5-3 times faster (260% according to ComputerBase), but definitely not 4-8 times faster. That's simply a lie.
And most games in the ComputerBase test are nVidia-sponsored, so it's the best-case scenario.

With RDNA3, RT will probably still be slower, but the gap (50-60% against the 3090 Ti) will be smaller.
 
Intel and nVidia can use their RT cores concurrently with their compute cores. So they are able to use these dedicated cores to traverse the BVH and do triangle intersection while computing and shading at the same time.

But if they do that, they have to pay for delivering the results back to the graphics pipeline. It's as simple as that. There is no free lunch. Having separate hardware for something will always add latency to your graphics pipeline (and the pain of synchronizing the hardware).
 
Why do you say that? That would still be a 2x improvement over a 6900 XT in RT performance. Expecting more than that is extremely unrealistic.
If the rumoured specs are true and given RDNA2's low RT perf, I don't see why it's unrealistic for N31 and especially N32 (halving RT frame time, not necessarily doubling FPS with RT, especially in lighter RT games). Apparently it's 96 CUs/48 WGPs vs 80/40, a 1.2x increase for N31 vs N21, and 60 CUs/30 WGPs vs 40/20, a 1.5x increase for N32 vs N22. Clock speeds are apparently in the 3-3.5 GHz range; N21 averages about 2.3 GHz and N22 about 2.6 GHz, so assume 3 GHz for N31 and 3.3 GHz for N32, which is roughly 1.3x and 1.27x respectively. That's about 1.56x for N31 vs N21 and 1.9x for N32 vs N22 per RT unit, excluding all other perf improvements/arch gains like the RT units themselves, extra compute potentially helping (although this matters less if they offload more work to dedicated units), extra memory bandwidth, etc. We know performance doesn't scale perfectly, but even if their RT units are "only" 1.5x faster per clock vs RDNA2, then 2x RT performance should be possible for N31, which I don't think is unrealistic given their lower starting point.
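To make that back-of-the-envelope arithmetic explicit (all unit counts and clocks here are the rumoured figures from the post, not confirmed specs):

```python
# Back-of-the-envelope scaling from the rumoured specs above.
# All unit counts and clock speeds are rumours/assumptions, not confirmed figures.

def scaling(cus_new, cus_old, clk_new_ghz, clk_old_ghz):
    """Naive throughput scaling: unit-count ratio times clock ratio."""
    return (cus_new / cus_old) * (clk_new_ghz / clk_old_ghz)

n31_vs_n21 = scaling(96, 80, 3.0, 2.3)   # ~1.2x units * ~1.30x clock ≈ 1.57x
n32_vs_n22 = scaling(60, 40, 3.3, 2.6)   # ~1.5x units * ~1.27x clock ≈ 1.90x

print(f"N31 vs N21: {n31_vs_n21:.2f}x")
print(f"N32 vs N22: {n32_vs_n22:.2f}x")

# On top of that, if each RT unit were also ~1.5x faster per clock:
print(f"N31 with 1.5x per-unit RT gain: {n31_vs_n21 * 1.5:.2f}x")  # ~2.35x
```

The point being that unit count and clock scaling alone get you to roughly 1.6-1.9x before any per-unit RT improvement is counted.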

If they have many big arch improvements like HW BVH acceleration, coherency sorting (I highly doubt this, but we can dream), much smarter caching, etc., then it'd be more like 3x, but that's a wishlist rather than an expectation. With Nvidia improving significantly again and Intel's first-gen RT HW being similar to Ampere, AMD has to make big gains to not fall further behind, and if they can't get close to Ampere's RT perf per unit, then that's bad for us.
 
But if they do that, they have to pay for delivering the results back to the graphics pipeline. It's as simple as that. There is no free lunch. Having separate hardware for something will always add latency to your graphics pipeline (and the pain of synchronizing the hardware).
That is only true for RDNA2. BVH traversal happens on the compute cores, and the result always has to be checked for triangle intersection. With nVidia and Intel this happens in the RT cores, so the RT cores give the result back to the compute cores, which is the same as with AMD.
 
You can say it's 2.5-3 times faster (260% according to ComputerBase), but definitely not 4-8 times faster. That's simply a lie.
And most games in the ComputerBase test are nVidia-sponsored, so it's the best-case scenario.
With RDNA3, RT will probably still be slower, but the gap (50-60% against the 3090 Ti) will be smaller.
It will invariably depend on the reviewer and the games tested. Here they tested ray tracing across 40+ games using three APIs (Vulkan, DX12, and DX11) at both 4K and 1440p at the highest settings.
I imagine some of these games have to be AMD-sponsored, with their watered-down ray tracing effects.
In 4K gaming with ray tracing enabled, the RTX 4090's average frame rates are 104% higher than the RX 6900 XT's and 70% better than the RTX 3090's. It boasts close to twice the performance of Ampere at the same power, with excellent scalability and overclocking headroom as power consumption increases.
 
"Machine learning/AI" is really getting far more credit than it deserves in gaming. It's literally used for scaling which can be done without it too, that's far cry from "largest graphic features".
AMD has implemented ML hardware where they see it matter, aka Instinct/CDNA, and they're implementing accelerators in another segment where they see it relevant, aka CPUs.

As for "same vein" implementation, again, dedicated or part of something else isn't what dictates the performance.

Without HW ML, reconstruction technologies are going to perform slower or with worse quality. The tensor hardware and Intel's XMX cores do help out there. AMD will have to follow suit to not fall further behind in that area. It's not a question of if, but when.

So yes, AMD will at some point implement RT and ML in the same vein as Intel and NV.

The DF crew has just enough technical chops to make reasonable observations, but not enough to conclude much of anything.

There's enough value there to mention their findings. They have direct connections and talks with Intel, for example, and with game devs and IHVs in some cases.
 
It would presumably be impossible to implement frame generation without AI. I realise the current implementation has some weaknesses, but they will eventually be ironed out, and additional frames will be inserted to make the performance multiplier so great that any GPU not using it will be effectively obsolete, especially when future consoles start using it.

So yes, I absolutely expect AMD to get on the AI bandwagon at some point.
 
I really think you don't know how to count.

So explain Vega on TSMC 7nm vs N21: a ~60% bigger die for ~150% more performance.
Or N22 being the same size on the same 7nm process, with half the memory bandwidth, while having 50% more performance.

By your logic AMD must be GODS amongst men. How could they make such a performance jump? After all, only process matters, right?

Now they get approximately double the logic transistors (7nm to 5nm) with approximately the same die area dedicated to ALU/ROP/TMU/PCI/display etc. when comparing N21 to N31. They'll be walking it in, yeah?
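Using just the ratios quoted above (a ~1.6x larger die for ~2.5x the performance on the same 7nm node; these are the post's round numbers, not measured die sizes or benchmark figures), the architecture alone is worth roughly 1.5-1.6x performance per unit of area:

```python
# Sanity check on the Vega 7nm vs N21 comparison above, using only the
# ratios quoted in the post rather than exact die sizes or benchmarks.

die_area_ratio = 1.6     # N21 die ~60% bigger, both parts on TSMC 7nm
performance_ratio = 2.5  # "~150% more performance" => 2.5x

perf_per_area_gain = performance_ratio / die_area_ratio
print(f"Perf-per-area gain from architecture alone: {perf_per_area_gain:.2f}x")  # ~1.56x
```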
Vega was a buggy iteration of a GPU architecture dating back to 2012. Hardly a relevant comparison to a well-functioning one dating back to only 2020.

So which past features were equivalent to RT?

With SER, Nvidia is already quoting a 44% improvement in Cyberpunk's new ray tracing mode and a 29% improvement in Portal RTX. So the premise about >2X being impossible may be wrong even on Nvidia hardware.

However, Nvidia already accelerates more of the RT pipeline and is already 2x faster or more in RT alone. All AMD need to do is close the gap a little, for example by adding hardware acceleration of ray traversal, which Intel managed on their first generation of RT hardware! Again, if they are 2x faster in rasterization, *any* reduction in the relative cost of ray tracing will increase performance by more than 2x. So essentially your claim must be that it's just impossible for AMD to improve their ray tracing capabilities and that Intel has them beat.
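A quick frame-time sketch shows why that holds (the millisecond figures are purely illustrative, not measurements): if rasterization gets 2x faster and ray tracing merely scales along with it, you get exactly 2x overall; any further cut to the RT portion pushes the total past 2x.

```python
# Illustrative frame-time math for the ">2x" argument above.
# All millisecond figures are made up purely for illustration.

raster_ms, rt_ms = 10.0, 10.0      # hypothetical baseline frame: 20 ms total
baseline = raster_ms + rt_ms

# 2x faster rasterization, with RT merely scaling along -> exactly 2x overall.
even_scaling = raster_ms / 2 + rt_ms / 2
print(f"RT scales with raster:      {baseline / even_scaling:.2f}x")   # 2.00x

# Same 2x raster uplift, but the RT portion additionally cut by 1.5x -> >2x overall.
extra_rt_gain = raster_ms / 2 + (rt_ms / 2) / 1.5
print(f"RT improved beyond raster:  {baseline / extra_rt_gain:.2f}x")  # ~2.40x
```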
It’s difficult to say what feature would be equivalent. Maybe programmable shaders?

Let's wait and see what SER actually delivers. But keep in mind this is Nvidia's 3rd iteration of the RT core. I don't expect AMD's first iteration to compete, if they even decide to add one in the first place. It's entirely possible they just keep all processing in the shader core.
 
Jeez.
The only way to get an inference engine anywhere in client AMD is the laptop SoCs, and not even all of them, only the premium and halo ones.
If you want it inside your GPU, you need a semicustom design.

That patent(?) on a chiplet inference engine possibly hooking up to a GPU (I think I'm remembering this right) sounds possible. Intel and Nvidia already have their inference parts; AMD now has the IP and the chiplet infrastructure to just tack it on.

What they need is a reason to do so. The recent proof that deep learning isn't some magic black box, and can be disentangled from a mess of matrix ops into a direct translation of a more understandable, traditional algorithm like decision trees, shows there really isn't any magic to DLSS/XeSS. Any improvement to upscaling, RT denoising, etc. may not need an inference engine, assuming the previous work can be carried forward towards some sort of generic translation into code that runs just as fast or faster using other instruction types.

Best of both worlds: get the AI to guess the best, most optimized solution for you, then just run it anywhere.
 
It’s difficult to say what feature would be equivalent. Maybe programmable shaders?
So in that case what about the 8500 vs 9700 Pro?

Let's wait and see what SER actually delivers. But keep in mind this is Nvidia's 3rd iteration of the RT core. I don't expect AMD's first iteration to compete, if they even decide to add one in the first place. It's entirely possible they just keep all processing in the shader core.
We're not talking about AMD competing in RT though. We're talking about them reducing the performance hit of ray tracing by (potentially) accelerating more of the RT pipeline. I mean, they literally told us they have enhanced the ray tracing capabilities of each CU. Presumably they managed to get some benefit from this and it wasn't a complete waste of time?
 