AMD RDNA3 Specifications Discussion Thread

Do we have performance per watt figures at 300W for the 7900 XTX?

It'd be interesting if they specifically chose 300W for the figure if [1] that's below the knee of the power curve for 7900 XTX and [2] above the knee of the power curve for 6900 XT. That'd be a classic case of manipulating the numbers while simultaneously using real data which isn't terribly relevant to the shipped product as it's spec'd as a 355W part.
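To make the knee argument concrete, here's a toy sketch with completely invented curves (the knee positions and scaling are made up for illustration, not measured data):

#include <cstdio>
#include <cmath>

// Toy model: performance scales linearly with power up to the knee, then
// with strongly diminishing returns past it. All numbers are invented.
double perf(double watts, double knee)
{
    return watts <= knee ? watts : knee + 8.0 * std::cbrt(watts - knee);
}

int main()
{
    // Pretend 300W sits below the 7900 XTX's knee (320W here) but above
    // the 6900 XT's (250W here). Measuring both at 300W then flatters
    // the XTX more than comparing at the shipped 355W does.
    double xtx300 = perf(300, 320), old300 = perf(300, 250);
    double xtx355 = perf(355, 320);
    std::printf("perf/W ratio, both at 300W: %.2f\n", (xtx300 / 300) / (old300 / 300));
    std::printf("perf/W ratio, 355W vs 300W: %.2f\n", (xtx355 / 355) / (old300 / 300));
}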

Although I still don't think that'd necessarily change things much. I'm just guessing something went horribly horribly wrong and the shipping product ended up being something other than what they used for the testing.

Regards,
SB
 
Anyone know what this tasty little tidbit of RDNA3 driver code means?
#if VKI_BUILD_GFX11
    bool enableRayTracingHwTraversalStack;  ///< Enable using hardware accelerated traversal stack
#endif
 
Anyone know what this tasty little tidbit of RDNA3 driver code means?
#if VKI_BUILD_GFX11
    bool enableRayTracingHwTraversalStack;  ///< Enable using hardware accelerated traversal stack
#endif
I found these too:
RtIp2_0 = 0x3,  ///< Added more Hardware RayTracing features, such as BoxSort, PointerFlag, etc
supportRayTraversalStack : 1;  ///< HW assisted ray tracing traversal stack support
supportPointerFlags : 1;  ///< Ray tracing HW supports flags embedded in the node
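Fragments like these usually just gate a feature on a capability bit plus a settings toggle. Here's a minimal, compilable sketch of that pattern; the struct definitions and names below are stand-ins I invented to mirror the fragments, not the real PAL/XGL code:

#include <cstdio>

// Invented stand-ins for the driver structs the snippets come from.
struct RayTracingIpProperties {
    unsigned supportRayTraversalStack : 1;  // HW-assisted traversal stack
    unsigned supportPointerFlags      : 1;  // flags embedded in BVH node pointers
};

struct RuntimeSettings {
    bool enableRayTracingHwTraversalStack;  // the VKI_BUILD_GFX11 toggle above
};

// The usual gating pattern: the feature is only used when the hardware
// reports the capability AND the build/settings toggle enables it.
bool useHwTraversalStack(const RayTracingIpProperties& hw, const RuntimeSettings& s)
{
    return hw.supportRayTraversalStack && s.enableRayTracingHwTraversalStack;
}

int main()
{
    RayTracingIpProperties gfx11{1, 1};
    RuntimeSettings settings{true};
    std::printf("HW traversal stack in use: %s\n",
                useHwTraversalStack(gfx11, settings) ? "yes" : "no");
}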

With all these fancy new RT features, why isn’t RDNA3 any faster per clock at RT than RDNA2? Are AMD’s drivers not making use of them yet, or is something else broken?
 
The MultiDrawIndirect accelerator is also not working well. It's better than the 6900 XT, but not the 2.3x that was stated in the slides.
 

Attachment: polygons6.jpg
The 7900 XTX is close to two times faster than the 6900 XT. Factoring in the slightly higher clock and shader counts, I'd still put it at a minimum of 1.5x as a rough rule of thumb.
Quite different in Cyberpunk 2077, which runs 27 percent faster on RDNA 3 than on RDNA 2 at the same compute throughput, and there is even a 30 percent increase in percentile FPS. Doom Eternal is up 32 and 40 percent, Spider-Man Remastered is up 25 and 33 percent, and Metro Exodus still manages a good 22 and 21 percent.
 
It'd be interesting if they specifically chose 300W for the figure if [1] that's below the knee of the power curve for 7900 XTX and [2] above the knee of the power curve for 6900 XT. That'd be a classic case of manipulating the numbers while simultaneously using real data which isn't terribly relevant to the shipped product as it's spec'd as a 355W part.

Although I still don't think that'd necessarily change things much. I'm just guessing something went horribly horribly wrong and the shipping product ended up being something other than what they used for the testing.

Regards,
SB
I think it's above the knee of the power curve for the 6900 XT, since performance per watt degrades on the 6950 XT. Though perhaps it degrades more on the 7900 XTX and that's why they didn't compare to the 6950 XT at 335 Watts.

In any case, without testing at 300W I don't think we can confirm that AMD "lied".
 
One thing I thought was weird was Tomshardware saying, "The GPU shader counts are where things start to get a bit different from other architectures. AMD says there are still 64 Streaming Processors (SP) per CU, but there are now four SIMD32 vector units per CU as well — two of which can only process FP32 or Matrix operations and not INT32."
If you have four SIMDs, wouldn't you want them to be flexible and handle all types of operations? The stated justification is power savings.
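The lane math as I read it (my interpretation of the quote, not anything AMD has spelled out):

#include <cstdio>

int main()
{
    // Per-CU numbers from the Tom's Hardware quote; which SIMDs count toward
    // the official SP figure is my reading, not an AMD statement.
    const int lanesPerSimd32 = 32;
    const int fullSimds      = 2;  // handle FP32, INT32 and matrix ops
    const int fp32OnlySimds  = 2;  // FP32/matrix only

    int officialSPs = fullSimds * lanesPerSimd32;                    // 64, AMD's SP count
    int peakFp32    = (fullSimds + fp32OnlySimds) * lanesPerSimd32;  // 128 lanes, best case
    std::printf("official SPs/CU: %d, peak FP32 lanes/CU: %d\n", officialSPs, peakFp32);
}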

Also, I believe it was LTT or GN that was talking about how the thermistor on the fan is too close to the heatsink. They measured it and it was at least 10C off from ambient. Heat soak causing poor readings isn't doing it any favors.
 
I think it's above the knee of the power curve for the 6900 XT, since performance per watt degrades on the 6950 XT.
All of this is extremely workload dependent. In most AAA games my 6900 XT (watercooled, limit set at 520W/475A) consumes around 350W (if fully loaded, which is hard to do at 1080p), while in Time Spy it is usually well above 390W with peaks of 470W and higher. Clock/FPS scales linearly most of the time (provided 100% true GPU load is achievable), but power scaling is mostly unpredictable.

Also, Time Spy has a weird interdependence between performance and the number of active monitors (more than one tanks the graphics score by about 1K points, a very significant loss), while this never happens in TSE or most games (in some of them, like SOTTR, there is a small loss of about 5-10 fps at 1440p in the benchmark). There's also the case of FurMark, which easily hits the power limit at 720p with no AA and causes the GPU core to draw more than 400A.

In any case, testing power efficiency at capped FPS is a very Huang-esque move: by manipulating the limit you can arbitrarily make the difference between a more powerful and a less powerful GPU as wide as you want (which was used to prop up Ampere and Ada in their presentations). In the same fashion I could show that my old Vega 56 was 2-3x more efficient than Hawaii in, say, Witcher 3, by setting a limit at which the Vega was still in a low-load scenario while the Hawaii would be 100% loaded at max core voltage.
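A toy example of the trick, with invented wattages:

#include <cstdio>

int main()
{
    // Cap the framerate below what the faster card needs full power for
    // and its "efficiency" advantage becomes whatever you want it to be.
    const double capFps        = 60.0;
    const double fastCardWatts = 120.0;  // cruising far below its limit
    const double slowCardWatts = 280.0;  // flat out just to hold the cap

    double fastEff = capFps / fastCardWatts;  // fps per watt
    double slowEff = capFps / slowCardWatts;
    std::printf("\"efficiency\" advantage at the cap: %.1fx\n", fastEff / slowEff);
    // Uncapped, both cards run into their limits and the gap shrinks back
    // to the real perf/W difference between the architectures.
}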
 
It looks like the scheduling is totally broken? In the benchmark I did, it is interesting that the compute shaders cannot handle the simple "software rasterizer task". I'm wondering why this simple and clear task is not 2x as fast, since the 7900 XTX has double the shader performance...
Also, the MultiDrawIndirect accelerator is not working properly. I can't see the 2.3x performance gain.

Attachment: polygons5.jpg
 
Also, the MultiDrawIndirect accelerator is not working properly. I can't see the 2.3x performance gain.
Slide numbers are always measured under specific conditions and should be read as "up to xyz", not "xyz". Not seeing 2.3 times the performance in one test doesn't mean anything for any other scenario using said feature.
 
Slide numbers are always measured under specific conditions and should be read as "up to xyz", not "xyz". Not seeing 2.3 times the performance in one test doesn't mean anything for any other scenario using said feature.
I agree, but this benchmark was written to pull the most out of the MultiDrawIndirect feature. The feature is part of the DX12 and Vulkan APIs, so if AMD has an accelerator it should automatically kick in whenever somebody uses MultiDrawIndirect through the API. That is what's confusing.
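For reference, this is a minimal sketch of the standard Vulkan path (pipeline, descriptor and buffer setup assumed to happen elsewhere); if there is a draw-command accelerator, this call is what it would have to trigger on:

#include <vulkan/vulkan.h>

// Records one MultiDrawIndirect call: 'drawCount' draws sourced from a
// GPU-side buffer of VkDrawIndexedIndirectCommand records.
void recordMultiDrawIndirect(VkCommandBuffer cmd,
                             VkBuffer        indirectBuffer,
                             uint32_t        drawCount)
{
    vkCmdDrawIndexedIndirect(cmd,
                             indirectBuffer,
                             0,                                      // byte offset into the buffer
                             drawCount,                              // number of packed draw records
                             sizeof(VkDrawIndexedIndirectCommand));  // stride between records
}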
 