AMD: Navi Speculation, Rumours and Discussion [2019-2020]

The 2080 Ti is 23% faster than rated while the 3080 is 5% faster (both compared with boost clock). So a 14% cut-down GA102 is throttling dramatically more than a 6% cut-down TU102.

Boost clock on the box is pretty meaningless. That just means Nvidia left more boost headroom on Turing; it doesn't have anything to do with throttling.
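For reference, those percentages are just the observed sustained clock divided by the advertised boost clock. A quick sketch (the rated boosts are the reference specs, 1545 MHz for the 2080 Ti and 1710 MHz for the 3080; the observed clocks here are only illustrative values chosen to reproduce the quoted percentages):

```python
# Percent above rated boost = observed sustained clock / advertised boost - 1.
# Rated boosts are the reference specs; observed clocks are illustrative only.
cards = {
    "2080 Ti (TU102)": (1545, 1900),   # (rated boost MHz, example observed MHz)
    "3080 (GA102)":    (1710, 1795),
}
for name, (rated, observed) in cards.items():
    print(f"{name}: {(observed / rated - 1) * 100:.0f}% above rated boost")
```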

Is that sustained or over quickly? If it's not sustained, then left for longer, GA102 would get slower and slower.
Not sure why that matters. Power limits are instantaneous.

So the evidence right now appears to indicate that compute is throttling GA102 and that Furmark doesn't.

Not quite. We don't know what clocks AIDA64 ran at.
 
Arcturus has 8192 FP32 ALU lanes, probably with 1:2 FP64 throughput. And if RDNA2's clocks are any indication, it too should clock at around 2GHz for >32 TFLOPs.
There's also the introduction of matrix multiplication instructions, which may indicate significantly more hardware alongside the SIMDs. The expectation for the target market is much higher sustained ALU throughput, and there were some driver changes suggesting that things like boost might be disabled because thermal throttling was considered likely.
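A quick sanity check on that throughput figure (all inputs are the rumoured numbers from this post, not confirmed specs):

```python
# Back-of-the-envelope FP32/FP64 throughput for the rumoured Arcturus config.
fp32_lanes = 8192        # rumoured FP32 ALU lane count
clock_ghz = 2.0          # assumed clock, extrapolated from RDNA2 rumours
flops_per_lane = 2       # one FMA counts as 2 FLOPs per cycle

fp32_tflops = fp32_lanes * flops_per_lane * clock_ghz / 1000
fp64_tflops = fp32_tflops / 2          # assuming a 1:2 FP64 rate
print(f"FP32: {fp32_tflops:.1f} TFLOPs, FP64: {fp64_tflops:.1f} TFLOPs")
# -> FP32: 32.8 TFLOPs, FP64: 16.4 TFLOPs
```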
 
Clock is locked at 1980 MHz (far above base clock) here on a 3080 stress test:
Fun fact: With 8xMSAA, core clock stays at 19xx MHz on the card I currently have here. Without it, clocks go down to 15xx-ish MHz while fps go up 200% (90-ish to 280-ish).
edit: clocks go up with MSAA levels; MSAA is not part of any Furmark preset

So the evidence right now appears to indicate that compute is throttling GA102 and that Furmark doesn't.
I'm seeing an average of 1750 MHz on a partner 3080 (OC model) during the few seconds of the SPFLOPS test in AIDA64. The card is power-limited at that time. Data based on GPU-Z at 0.1s intervals.
That's still above the advertised boost clock, just not max boost.
 
The current hardware stores the vertex attributes that are used for interpolation in LDS during a pixel shader invocation. (For up to 16 triangles per PS wave; you interpolate them with V_INTERP_P1_F32 and V_INTERP_P2_F32 instructions in the shader, which read directly from LDS.)
Looking at some of my ancient posts it seems I knew this, but forgot. Sigh.
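For anyone else who has forgotten how those instructions fit together: V_INTERP_P1_F32 and V_INTERP_P2_F32 evaluate the attribute in two FMA-like steps from per-triangle setup terms held in LDS. A rough sketch of the math in Python (this mirrors the general idea, not the exact ISA semantics or LDS layout):

```python
# Two-step attribute interpolation as done by V_INTERP_P1_F32 / V_INTERP_P2_F32.
# p0, p10, p20 are per-triangle attribute setup terms the hardware keeps in LDS;
# i, j are the pixel's barycentric coordinates.
def v_interp_p1(p0, p10, i):
    return p10 * i + p0            # first step: partial = p10 * i + p0

def v_interp_p2(partial, p20, j):
    return p20 * j + partial       # second step: attr = p20 * j + partial

# Made-up values for one texcoord component of one pixel:
p0, p10, p20 = 0.25, 0.50, -0.10
i, j = 0.3, 0.6
attr = v_interp_p2(v_interp_p1(p0, p10, i), p20, j)
print(attr)                        # 0.25 + 0.5*0.3 - 0.1*0.6 = 0.34
```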

Fun fact: With 8xMSAA, core clock stays at 19xx MHz on the card I currently have here. Without it, clocks go down to 15xx-ish MHz while fps go up 200% (90-ish to 280-ish).
edit: clocks go up with MSAA levels; MSAA is not part of any Furmark preset
So, to do a stress test with Furmark, MSAA must be switched off?

Does the clock immediately hit base clock ("power virus clock"), or does the clock start high then gradually decline over time?

Does the clock ever sit at base clock on Ampere during a Furmark stress test?

Which GPUs sit at base clock during a Furmark stress test?

I'm seeing an average of 1750 MHz on a partner 3080 (OC model) during the few seconds of the SPFLOPS test in AIDA64. The card is power-limited at that time. Data based on GPU-Z at 0.1s intervals.
That's still above the advertised boost clock, just not max boost.
So the problem here is that GPUs, like CPUs, limit thermal load by downclocking over time with sustained work. A 5s (guess) FP32 test isn't "sustained". This is complicated by the overall capability of the cooling system (and fan speed will vary).
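The easiest way to settle the sustained-vs-transient question would be to just log clock and power over a long run. A hypothetical logging sketch using nvidia-smi's --query-gpu interface (the field names are standard; the 10-minute duration and 0.5s interval are arbitrary choices):

```python
# Hypothetical logger: sample GPU clock, power and temperature while a stress
# test runs, so you can see whether clocks settle instantly or decay over time.
import subprocess, time

QUERY = "clocks.sm,power.draw,temperature.gpu"

def sample():
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True)
    sm_mhz, watts, temp_c = (float(x) for x in out.strip().split(", "))
    return sm_mhz, watts, temp_c

start = time.time()
while time.time() - start < 600:                 # log for 10 minutes
    sm, w, c = sample()
    print(f"{time.time() - start:7.1f}s  {sm:5.0f} MHz  {w:6.1f} W  {c:3.0f} C")
    time.sleep(0.5)
```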
 
So, to do a stress test with Furmark, MSAA must be switched off?
Does the clock immediately hit base clock ("power virus clock"), or does the clock start high then gradually decline over time?
Does the clock ever sit at base clock on Ampere during a Furmark stress test?
Which GPUs sit at base clock during a Furmark stress test?

1) None of the presets use MSAA, and from what I saw while playing around for a few minutes: yes.
2) The clock pretty much instantly hits its low but mostly sustained level.
3) It dips down to near-base clocks at irregular intervals, yes.
4) I don't know off the top of my head. The older the card, the more likely, I guess, since power control has greatly improved over time.
 
In the videos I linked, Furmark framerates varied massively. Any ideas why?

I wonder if the peaks in framerate coincide with the maximum power ("dips down to near-base clocks").

I continue to contend that the reason there's rumoured to be about a 900MHz range in clocks for Navi 21 is that power consumption will be massive when 160 CUs are running intensively for sustained periods.
 
225W would be a weird board limit. That would be like a 100% performance per watt improvement. Actually more than that if it's 72 CUs.

edit: wow, totally misread that. I thought he said the 6700 XT was 225W. I'm tired.

Anyway, it seems like it'll be close to 300W, because the boost clocks are putting it at something like a 140% performance increase with less than a 100% increase in CUs, and the architecture is only supposed to be a 50% performance per watt increase.
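Spelling out that reasoning: relative power is roughly relative performance divided by the perf-per-watt gain. A hedged sketch (the 140% figure is the guess above, and the result depends heavily on which card the comparison is measured against):

```python
# relative power ~= relative performance / perf-per-watt improvement
perf_gain = 2.4      # "140% performance increase" guess derived from boost clocks
ppw_gain = 1.5       # AMD's advertised +50% perf/W for RDNA2
power_ratio = perf_gain / ppw_gain               # = 1.6x the baseline power

for name, base_w in [("5700 (180 W)", 180), ("5700 XT (225 W)", 225)]:
    print(f"vs {name}: ~{base_w * power_ratio:.0f} W")
# -> ~288 W or ~360 W depending on the baseline, which is why a figure
#    somewhere around 300 W looks plausible.
```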
 
Let's do some naive napkin math.

Let's assume the 6900 XT is a 72 CU variant. The 5700 is a 36 CU card with a 180W TBP. Say the PCB is 40W.

180W - 40W = 140W for the 5700 chip. That should be at game clocks, because everything I've seen shows typical board power being pretty close to sustained power draw during gaming.

Let's say rogame is right and the 6900 XT game clock is 2000-2100 MHz. 2000 MHz / 1655 MHz = 1.21 and 2100 MHz / 1655 MHz = 1.27. These numbers look pretty good, because AMD is advertising a 50% performance per watt increase, so it would seem like they've sunk some of that into performance and some into lowering power consumption.

Let's just do a naive doubling of the 5700, assuming all of the performance per watt improvement went into lowering power.

140 / 1.5 = 93W
93W x 2 = 186W

We haven't considered the 1.21 - 1.27 clock increase yet. This is where I'm not sure what the power increase would be. The 5700 XT has a 6% clock advantage (game clock) over the 5700 at the cost of an extra 25% total power. I know the relationship is roughly voltage squared x frequency plus some static power loss for a processor, but in this case we have no idea what the voltage is, or what the voltage vs. frequency curve looks like.

The PCB here is probably going to be a bit bigger, so let's say 50W.
186W + 50W = 236W

That means we're greater than 236W. How much, I'm not really sure. I was expecting close to 300W, but maybe not. That's 65W to play with. Could clocks and PCB eat up that much ... probably, but can't really tell.

5700 gets ~35 avg in Gears 5 4k ultra. A naive doubling gives you 70 fps.
5700 gets ~30 avg in Borderlands 3 4k badass. A naive doubling gives you 60 fps.

The performance numbers AMD showed were:
Gears 5 4k ultra 73 fps
Borderlands 3 4k badass 60 fps.

Those numbers match up pretty well. Given that we haven't factored in the 21 - 27% game clock increase yet, these numbers could make a lot of sense. Scaling will not be perfect when doubling CUs, and there could be other things limiting it, like VRAM bandwidth.
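The same napkin math in one place, so the guesses are easy to tweak (every input here is one of the assumptions above: 40W/50W PCB, the advertised 50% perf/W gain, rogame's 2000-2100 MHz game clock; the clock/voltage cost is deliberately left out because the V/f curve is unknown):

```python
# Naive napkin math: build a 72 CU "big Navi" estimate from 2x a 5700 (36 CU).
tbp_5700 = 180                       # W, 5700 total board power
pcb_5700 = 40                        # W, guessed non-GPU board power
chip_5700 = tbp_5700 - pcb_5700      # 140 W for the chip at game clocks

ppw_gain = 1.5                       # AMD's advertised RDNA2 perf/W claim
chip_72cu = chip_5700 / ppw_gain * 2 # double the CUs at unchanged clocks, ~187 W
pcb_big = 50                         # W, guessed bigger board
print(f"board estimate before any clock bump: {chip_72cu + pcb_big:.0f} W")

# Clock ratios vs the 5700's 1655 MHz game clock, if rogame's numbers hold:
for target in (2000, 2100):
    print(f"{target} MHz is {target / 1655:.2f}x the 5700 game clock")

# Naive fps doubling vs AMD's shown numbers (the 5700 figures are the ones quoted above):
for game, fps_5700, fps_amd in [("Gears 5 4K ultra", 35, 73),
                                ("Borderlands 3 4K badass", 30, 60)]:
    print(f"{game}: naive 2x = {fps_5700 * 2} fps, AMD showed {fps_amd} fps")
```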
 
The reference cooler we saw can't really handle more than 300W, can it?

The Radeon VII heatsink was already capable of handling 300W. Arguably, even the old blowers were kinda able to handle 300W (depending on how you want to define "handle"). Unless there is some severe internal flaw with the design (and we haven't seen one), it should be able to do 300W.

With the market now accepting 3-fan, 2+ slot heatsinks as the standard for high-end graphics cards, the reality is that the old de facto limit of 250W is way too low. Even 300W is low, as those designs should scale up to 350W+.

In my opinion it's also setting up a possible future of MCM-based designs, since if we ever move to those, we might be seeing 400W+ at the high end, as anything lower will likely be leaving significant performance on the table that people are willing to pay for.
 
The higher clocks will then result in going above the original performance/power point and result in the higher power usage.
What is the original performance/power point?

I just don't see why the clocks the cards are shipping with aren't what was originally intended.
We do have a PS5 clocked at 2.23GHz on what seems to be a ~150-200W power budget for the APU.
 
With half of the CUs..
Then take away the power allocated to the CPU by default (45-50W?) and double the resulting number.

Question is "why would 2.1GHz core clock not be AMD's original intent for Navi 21"? Or rather, why are we already following the suggestion that AMD had to react by "over"clocking their new graphics cards?
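Spelling out that extrapolation (the PS5 APU budget and the CPU share are both guesses from this exchange, so the output is very rough):

```python
# Rough GPU power extrapolation from the PS5: 36 CUs at 2.23 GHz,
# APU guessed at ~150-200 W, CPU share guessed at 45-50 W.
for apu_w, cpu_w in [(150, 50), (200, 45)]:
    gpu_w = apu_w - cpu_w          # GPU share of the APU budget
    doubled = gpu_w * 2            # naive doubling to 72 CUs at similar clocks
    print(f"APU {apu_w} W, CPU {cpu_w} W -> GPU ~{gpu_w} W, doubled ~{doubled} W")
# -> roughly 200-310 W for a 72 CU part at PS5-like clocks, before accounting
#    for any differences between console and desktop silicon or binning.
```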
 
Even if this rumor turns out to be true and the top-end 6900 XT is 320W (350W or more for AIBs), it could still be a different Navi 21 from the one AMD showed at the Ryzen keynote, and that one could be in RTX 3090 territory at 4K, or faster at 1080p and 1440p: the one AMD would call the halo part. According to redgaming, AMD will launch three Navi 21 models next week: 6900 XT, 6800 XT and 6800, so the 6800 XT could be a 64/72 CU part that is more efficient while close to RTX 3080 performance and matches what AMD showed. We'll see how it turns out. Another rumor said the top-end model would launch as an AMD-only design with no AIBs at launch, so...
 
With half of the CUs..

Yeah, as above, my guess would be around 235W for a 72 CU big Navi @ 1655 MHz game clock. Put that up to 2000-2100 MHz and you've got to be pushing closer to 300W. It's a 345-445 MHz increase. It's going to be substantial in terms of power, because the voltage will have to go up.
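To put a hedged range on "the voltage will have to go up": a lower bound where power scales only linearly with clock (voltage flat), and an upper bound using the usual dynamic-power approximation P ~ f * V^2 with voltage assumed to rise proportionally with frequency. Both are assumptions, since the real Navi 21 V/f curve is unknown; the 187W/50W inputs are the napkin-math guesses from earlier:

```python
# Rough bounds for scaling the ~187 W @ 1655 MHz chip-power guess to 2000-2100 MHz.
chip_w = 187         # chip-power guess at 1655 MHz game clock (napkin math above)
pcb_w = 50           # guessed board overhead

for target_mhz in (2000, 2100):
    r = target_mhz / 1655
    low = chip_w * r + pcb_w         # voltage unchanged: power ~ f
    high = chip_w * r**3 + pcb_w     # voltage scaling with f: power ~ f * V^2 ~ f^3
    print(f"{target_mhz} MHz: roughly {low:.0f}-{high:.0f} W board power")
# -> ballpark 275-430 W, which at least brackets the ~300-320 W rumours.
```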
 