AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

The bulk of the video is that the 6800 XTs are as fast as the 30x0-series cards, but RT performance isn't as good and there's no DLSS.

I think AMD really needed to have a DLSS alternative at launch, even if it was just a demo of it on one game.

He seems to be more upbeat about the 6800 non-XT.
 
I like the fact that he was "torn" about how to review the cards, since one has an upscaling solution and the other doesn't. It was interesting to hear his opinion on that.
 
I'm still not convinced these are "real" clocks per se - why does AMD's own presentation list much lower clocks for both the 6800 and 6800 XT when in reality they run 200-300 MHz higher even with the stock limits / heatsink? In poorly optimized titles (like the FF XV demo with its "inside-out" tessellated cows), my Vega boosts upwards of 1700 MHz, while in more demanding games it drops to 1650-1680 MHz, depending on the resolution etc. What I mean is: if the load is not high enough, the card boosts very high, but it's not a "real" frequency, in the sense that it literally can't sustain such high clocks in any real task.

Of course, this discrepancy between advertised and actual clocks could just be a mind trick to make people believe the card is outperforming their expectations (like NV's boost since Maxwell 2.0), but given it's AMD we're talking about, it might just be a glitch or some sort of "peak" boost clock within a given second. Hopefully power tables will become available at some point; it seems the GPUs are power-limited most of the time, not just max-frequency limited.

The clock is as real as it gets. It's no different from CPU turbo, where on lighter (less parallel) loads the clock can hit higher frequencies thanks to power and temperature headroom. My 6800 XT can boost to 2720 MHz in certain simpler tasks, like some scenes from older 3DMarks or Unigine Heaven, but will limit itself to 2350 MHz in really heavy scenes in other engines which use a lot of complex shaders and transparencies. In my case I'm hitting the power limit: by undervolting from 1.15 V to 1.075 V I've moved that bottom clock up to almost 2400 MHz. On the other hand, the same drop also limits my max clock in light scenes to about 2600 MHz.

In real games like Assetto Corsa, Doom Eternal or Far Cry 4, I see my card averaging much closer to the upper bound of its 2450-2650 MHz range at my overclock than to the worst-case 2350 MHz.

Once BIOSes with higher power limits and voltages are out, I'm sure my card will reach higher average clocks at the cost of power, but I'll water-cool it when my EK block arrives, ready for a 400 W heat source ;)
 
I found the animated graphic about Infinity Cache + VRAM being equivalent to 1664 GB/s of bandwidth interesting. I assume that came from AMD themselves, so I wonder how it's calculated. Taking the ~2 TB/s multiplied by the 4K hit rate of 58%, plus the 512 GB/s of the VRAM, comes close.

So the 1664 GB/s is a worst-case figure, since resolutions lower than 4K should have a higher hit rate.
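
For what it's worth, the blend does roughly add up. A quick sanity check in Python; how AMD actually built the slide number is my assumption, the ~2 TB/s IC figure, the 58% 4K hit rate and the 512 GB/s VRAM bandwidth are the values quoted above:

```python
# Check whether 1664 GB/s is roughly IC bandwidth x 4K hit rate + VRAM bandwidth
# (my assumption about how the slide number was built, not AMD's stated formula).
ic_bw = 2000.0       # GB/s, the ~2 TB/s Infinity Cache figure from the slides
hit_rate_4k = 0.58   # hit rate AMD quotes at 4K
vram_bw = 512.0      # GB/s, 256-bit GDDR6 @ 16 Gbps

effective = ic_bw * hit_rate_4k + vram_bw
print(f"effective bandwidth: ~{effective:.0f} GB/s")   # -> ~1672 GB/s, close to 1664
```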
 
Has anyone tried to run benchmarks testing what happens with Infinity Cache when a workload tries to utilize the full 16 GB of RAM? This could simulate future games with potentially larger asset sizes and more memory accesses per frame. Maybe something like Blender could be usable for this purpose (same scene with lower/higher quality assets).
 

I'd imagine the Infinity Cache is used for pixel/compute shaders that run mostly on a per-pixel basis, so they depend directly on the number of pixels to render.
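
As for actually probing the original question, one rough way without a full game would be a synthetic sweep: time a streaming kernel over working sets below and above 128 MB and watch where effective bandwidth falls back toward the 512 GB/s of the GDDR6. A sketch, assuming PyOpenCL and a working OpenCL runtime for the card; the kernel, buffer sizes and iteration counts are my own choices, not a validated methodology:

```python
# Time a streaming-read kernel over working sets straddling the 128 MB Infinity
# Cache; effective read bandwidth should fall back toward GDDR6 speed once the
# working set no longer fits. Only a crude probe, not a proper cache benchmark.
import time
import numpy as np
import pyopencl as cl

KERNEL_SRC = """
__kernel void stream_read(__global const float *src, __global float *dst, uint n)
{
    uint gid = get_global_id(0);
    float acc = 0.0f;
    // Each work-item strides over the buffer so the whole working set is touched.
    for (uint i = gid; i < n; i += get_global_size(0))
        acc += src[i];
    dst[gid] = acc;
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL_SRC).build()

GLOBAL_SIZE = 1 << 20   # work-items
REPS = 20               # repetitions to average over

# Working sets in MB; push these larger (or spread over several buffers) to
# approach the full 16 GB the question is about.
for mb in (64, 128, 256, 512, 2048):
    n = mb * 1024 * 1024 // 4                     # float32 elements
    src = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, size=n * 4)
    dst = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, size=GLOBAL_SIZE * 4)
    cl.enqueue_fill_buffer(queue, src, np.float32(1.0).tobytes(), 0, n * 4)
    queue.finish()

    t0 = time.perf_counter()
    for _ in range(REPS):
        prg.stream_read(queue, (GLOBAL_SIZE,), None, src, dst, np.uint32(n))
    queue.finish()
    secs = (time.perf_counter() - t0) / REPS

    print(f"{mb:5d} MB working set: ~{n * 4 / secs / 1e9:6.1f} GB/s effective read")
```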
 
The clock is as real as it gets. It's no different from CPU turbo, where on lighter (less parallel) loads the clock can hit higher frequencies thanks to power and temperature headroom.
I truly hope the clocks are real and it's just a memory bottleneck limiting the performance, but here are my thoughts:

It's hard to see the clock uplift without actually measuring performance and comparing one result to another. For example, I can force my Vega to run at a ~1700 MHz clock with the Liquid edition SPPT (mostly for that 1.25 V max), but the performance (as measured in Fire Strike GS) actually ends up lower (and my card is definitely not thermal throttling; it's very far away from the hotspot Tjmax). What happens, I guess, is that the GPU spends more time in the P6 state rather than the P7 state due to micro-instabilities (which we can't see without special measuring tools like a very good oscilloscope, as the frequency is updated 10,000 times a second), which ultimately lowers the actual clocks without the monitoring software noticing it.
 
Some folks seem to be reporting the “max” clock from the overclocking software and not the actual measured clock. While that’s helpful for comparison with other people’s max clock, it may not tell you what clock the cards are actually running at.
Yeah, I wish reviewers actually measured real clocks on all GPUs; otherwise we get situations where a thermally throttling reference Vega 56 (which actually runs at 1.25 GHz) is compared to a "2 GHz" Pascal GPU that routinely runs 100-250 MHz above its official max boost clock. It's a shame that virtually no one is really interested anymore in either architectural performance or the hardware engineering side.
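
One crude way to get an average instead of the peak the OC tool reports is to poll what the driver exposes while a benchmark loops. A rough sketch, assuming Linux with the amdgpu driver; the card index, the pp_dpm_sclk line format and the sampling interval are assumptions about the setup, and it still can't see the sub-millisecond p-state switching mentioned above:

```python
# Poll the amdgpu-reported shader clock during a run and report average/min/max
# rather than a single peak. Assumes Linux + amdgpu; "card0" and the "NNNMhz *"
# format of pp_dpm_sclk are assumptions about the particular system.
import re
import time
import statistics

SCLK_NODE = "/sys/class/drm/card0/device/pp_dpm_sclk"

def current_sclk_mhz():
    """Return the currently selected sclk level in MHz, or None if not found."""
    with open(SCLK_NODE) as f:
        for line in f:
            if "*" in line:                        # active level is marked with '*'
                m = re.search(r"(\d+)\s*mhz", line, re.IGNORECASE)
                if m:
                    return int(m.group(1))
    return None

samples = []
t_end = time.time() + 60                           # sample for 60 s while the game runs
while time.time() < t_end:
    mhz = current_sclk_mhz()
    if mhz is not None:
        samples.append(mhz)
    time.sleep(0.05)                               # ~20 samples per second

if samples:
    print(f"{len(samples)} samples: avg {statistics.mean(samples):.0f} MHz, "
          f"min {min(samples)} MHz, max {max(samples)} MHz")
else:
    print("no readings - check the sysfs path for your card")
```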
 
There’s probably no money in the deep-dive tech stuff like we got in the early 2000s. Ad revenue determines what content we see, and there’s nobody doing articles as a hobby anymore like young Anand back in the day.
 
That was explored a bit earlier in the thread here:
https://forum.beyond3d.com/posts/2168954/

Thanks, I'd forgotten about that. It does seem to add up perfectly, but then it also conflicts with the other AMD slides showing the IC providing ~2 TB/s of bandwidth from a 1024-bit interface at 1.94 GHz.

But then we know the IC can be overclocked automatically as needed, so perhaps 1.94 GHz is the max boost while the base speed is half the GPU game clock, i.e. 1.125 GHz.
 
1.94 GHz is the max boost value; the standard IC clock is 1.4 GHz.

So the 1664 GB/s is a worst-case figure, since resolutions lower than 4K should have a higher hit rate.
The worst case is actually 512 GB/s (plus a minuscule amount from the IC) when the data set massively exceeds the 128 MB (like Dagger Hashimoto). For gaming it depends on how close the processing elements get to your power limit and what priority the IC is given in that case.
Assuming the core is running at the ASIC's power limit and the IC is not throttled, the IC should run at least at 1.43 TB/s; × 0.58 = 831.5 GB/s; + 512 GB/s = ~1.34 TB/s effective transfer rate in 4K gaming while at the power limit.
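
A small sketch of how that effective figure moves with hit rate under those assumptions; the ~1.43 TB/s power-limited IC number (which I'm assuming comes from 1.4 GHz × 1024 B/clk ≈ 1433.6 GB/s) and the 512 GB/s VRAM bandwidth are taken from the post above:

```python
# Effective bandwidth vs. Infinity Cache hit rate under the power-limited figures
# above: ~1.43 TB/s from the IC and 512 GB/s from GDDR6. A hit rate of 0 is the
# Dagger Hashimoto-style worst case; 0.58 is AMD's quoted 4K average.
ic_bw = 1433.6     # GB/s (the "at least 1.43 TB/s" above, before rounding - assumed)
vram_bw = 512.0    # GB/s

for hit_rate in (0.0, 0.58, 0.80):
    eff = ic_bw * hit_rate + vram_bw
    print(f"hit rate {hit_rate:.0%}: ~{eff:.0f} GB/s effective")
# -> 512 GB/s, ~1343 GB/s (the ~1.34 TB/s above) and ~1659 GB/s
```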
 
I truly hope the clocks are real and it's just a memory bottleneck limiting the performance, but here are my thoughts:

It's hard to see the clock uplift without actually measuring performance and comparing one result to another.
I think the mistake you're making is to assume that AMD was stupid enough to carry the Vega architecture's power/clocking behaviour forward.

AMD has done the work to catch up with, seemingly, Maxwell in terms of power usage per ALU or per fixed-function unit. We can argue the details, but AMD has finally arrived at a competitive position, something RDNA 1 clearly wasn't, with the 5700 XT using as much power as a 2080 Ti for substantially less performance.
 