The AMD 9070 / 9070XT Reviews and Discussion Thread

What AMD needs to focus on is communicating and advertising the rate of FSR4 adoption so people have confidence to purchase.
Yep. Really FSR 3.1 adoption, though, so that owners of RX 6000 and 7000 cards (and compatible GPUs from competitors) continue to get the benefit of some sort of upscaling, while RX 90xx owners get the quality improvement.

The much bigger installed base would make adoption an easier sell.
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?
To my understanding it's not strictly required, but running it on older hardware would be roughly twice as heavy. It uses INT8, which RX 7000 technically supports but only runs as fast as FP16, while RX 9000 does double.
edit:
scratch that, I think it does quadruple what RX 7000 does
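For a rough sense of scale, here's a back-of-the-envelope sketch; the per-CU rates below are my assumptions from public RDNA 3/4 discussion, not confirmed AMD specs:

```python
# Rough per-CU matrix-op rates (ops/CU/clock). These figures are
# assumptions from public RDNA 3/4 discussion, not confirmed specs.
rates = {
    "RDNA 3 (RX 7000)": {"FP16": 512, "INT8": 512},    # INT8 no faster than FP16
    "RDNA 4 (RX 9000)": {"FP16": 1024, "INT8": 2048},  # dense rate, no sparsity
}

base_int8 = rates["RDNA 3 (RX 7000)"]["INT8"]
for arch, r in rates.items():
    print(f"{arch}: INT8 = {r['INT8']} ops/CU/clk "
          f"({r['INT8'] / base_int8:.0f}x RDNA 3)")
```

If those rates are right, that's where the "quadruple" figure comes from.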
 
Random? Arithmetic, texturing and bandwidth have been the pillars of traditional rendering performance for decades.

Not when discussing architectural efficiency. The pillars there are PPA and PPW: performance per area and performance per watt. Peak flops haven't been correlated with performance for some time, and certainly not when comparing different architectures; they're mostly marketing numbers at this point. The bandwidth comparison is interesting, but without knowing how each chip would perform with similar off-chip bandwidth, any conclusions there are speculative at best.

You can’t argue with power or area though.
 
Or 3dfx. That worked well for them.
Why not both? Microsoft has its Surface line, and HP, Dell, and others still produce PCs and laptops.

AMD could make cards exactly to base spec and let the AIBs make custom cards that are priced higher. Isn't that sorta the point of the NVIDIA Founders Editions?
 
Why not both? Microsoft has its Surface line, and HP, Dell, and others still produce PCs and laptops.

AMD could make cards exactly to base spec and let the AIBs make custom cards that are priced higher. Isn't that sorta the point of the NVIDIA Founders Editions?

I think the problem is NVIDIA's FE cards come in limited quantities. They are generally only available around a GPU's release and can be difficult to find a few months later.
If NVIDIA or AMD started making first-party cards in large quantities, it would create problems in their relationships with AIB partners. Surface and Surface Pro are more available, but they are still just a drop in the bucket.
I also don't think it's good for consumers if there are only first-party cards.
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?

From Digital Foundry's video, about 1 minute in. Maybe as they tune the algorithm they can reduce the requirements, but it seems like previous RDNA 3 chips aren't capable of it.

I couldn't find much about INT8 on the previous chips, but according to Tom's Hardware: https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus

The 7900 XTX is able to do 122 INT16 TOPS. If the performance doubles like it does from INT32 (61.4) to INT16 in the list, then we are looking at 244 INT8 TOPS for the 7900 XTX and 206 for the 7900 XT. Going further down the stack would only give less and less performance.

So I think even if they tune the algorithm, it may be impossible on RDNA 3.
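Spelled out as a quick sketch (the 7900 XT's 103 INT16 TOPS is inferred by halving the 206 figure above, and the INT16-to-INT8 doubling is exactly the assumption that may not hold on RDNA 3):

```python
# Extrapolate INT8 TOPS for RDNA 3, assuming the rate doubles with each
# halving of precision, as it does from INT32 (61.4 TOPS) to INT16
# (122 TOPS) in Tom's Hardware's table. The 7900 XT figure is inferred.
int16_tops = {"7900 XTX": 122.0, "7900 XT": 103.0}

for card, tops in int16_tops.items():
    # Optimistic: RDNA 3 may actually run INT8 no faster than INT16/FP16.
    int8_est = tops * 2
    print(f"{card}: ~{int8_est:.0f} INT8 TOPS (if the doubling held)")
```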
 
I think the problem is NVIDIA's FE is sort of in limited quantities. They are generally only available when a GPU was released and it can be difficult to find a few months later.
If NVIDIA or AMD started to make first party cards in large quantities, this will create problems in relationships with AIB partners. Surface and Surface Pro are more available but they are still just a drop in the bucket.
I also don't think it's good for consumers if there's only first party cards.
I don't think AMD-only cards would work either. But if they made a generic card that strictly adheres to the base spec, there shouldn't be much of an issue. Third parties can add better cooling and higher clock speeds while pricing their cards higher. I'd certainly pay a premium for a more compact card that offers better cooling than the stock one, or for one with a built-in water block, and I think a lot of people would.
 
Perf per off-chip byte also matters; memory ain't growing on trees anymore.

Yeah we don’t have the SKUs to compare though. We don’t know how bandwidth limited N48 is and we don’t know how much excess bandwidth GB203 has. Best we can say is fast GDDR7 helps. How much is anyone’s guess.
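For reference, the raw pin-speed-to-bandwidth arithmetic is at least straightforward (both cards use a 256-bit bus; the pin speeds are the advertised ones):

```python
# Effective memory bandwidth: GB/s = Gbps per pin * bus width (bits) / 8
def bandwidth_gbs(gbps_per_pin: float, bus_bits: int) -> float:
    return gbps_per_pin * bus_bits / 8

cards = {
    "9070 XT (20 Gbps GDDR6, 256-bit)": bandwidth_gbs(20, 256),  # 640 GB/s
    "5070 Ti (28 Gbps GDDR7, 256-bit)": bandwidth_gbs(28, 256),  # 896 GB/s
}
for name, bw in cards.items():
    print(f"{name}: {bw:.0f} GB/s")
# 896 / 640 - 1 = 40% more raw bandwidth for the GDDR7 card.
```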
 
Yeah we don’t have the SKUs to compare though. We don’t know how bandwidth limited N48 is and we don’t know how much excess bandwidth GB203 has. Best we can say is fast GDDR7 helps. How much is anyone’s guess.
Given that 20 Gbps memory can normally hit about 22 Gbps overclocked, we should be able to get a pretty good idea just by testing 18, 20, 21, and 22 Gbps at the same core clock.

But either way: a 7900 XTX running its memory at 2700 MHz in CP2077 sees about an 8-10% performance increase (50 ±1 to 55 ±1 fps) over 2500 MHz (I have lots of mods and optimised RT and quality settings for my preferences, so mileage may vary; upscaling and frame gen were disabled for the test), and it has the same CU L0/L1/L2 and LLC-to-memory bandwidth ratios as the 9070 XT. With the improvements to ALU/CU utilisation, out-of-order memory access, etc., it is very likely a significant amount of performance is left on the table at 20 Gbps for the 9070 XT.

If I were a betting man I would say it's more than likely more bandwidth constrained than the 7900 XTX is. Hell, Chips and Cheese hit 3.8 GHz in an ALU-only workload; do you think that if they could feed the beast they wouldn't run a decoupled shader clock like the 7900 XTX, blow the power budget, etc.?

You seem to be trying quite hard to downplay / exclude this parameter. If a 5070 Ti didn't need 40% more bandwidth, why would NVIDIA equip it with GDDR7 when there is a significant premium on it over GDDR6? You think NV's a charity? Why then would they run the 5080 with an almost identical shader-to-bandwidth ratio as the 5070 Ti?
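To make that scaling concrete, here's a crude sensitivity check on the CP2077 data point above; one game, one scene, my settings, so treat it as an anecdote rather than a benchmark:

```python
# Crude bandwidth-sensitivity check from the memory-OC numbers above.
mem_before, mem_after = 2500, 2700   # memory clock in MHz
fps_before, fps_after = 50.0, 55.0   # midpoints of the quoted 50 +-1 / 55 +-1

bw_gain = mem_after / mem_before - 1     # ~8% more bandwidth
fps_gain = fps_after / fps_before - 1    # ~10% more performance

# A scaling ratio near (or above) 1.0 suggests the scene is strongly
# bandwidth limited; well below 1.0 would point at other bottlenecks.
print(f"bandwidth +{bw_gain:.0%}, fps +{fps_gain:.0%}, "
      f"ratio {fps_gain / bw_gain:.2f}")
```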
 
Given that 20 Gbps memory can normally hit about 22 Gbps overclocked, we should be able to get a pretty good idea just by testing 18, 20, 21, and 22 Gbps at the same core clock.

But either way: a 7900 XTX running its memory at 2700 MHz in CP2077 sees about an 8-10% performance increase (50 ±1 to 55 ±1 fps) over 2500 MHz (I have lots of mods and optimised RT and quality settings for my preferences, so mileage may vary; upscaling and frame gen were disabled for the test), and it has the same CU L0/L1/L2 and LLC-to-memory bandwidth ratios as the 9070 XT. With the improvements to ALU/CU utilisation, out-of-order memory access, etc., it is very likely a significant amount of performance is left on the table at 20 Gbps for the 9070 XT.

If I were a betting man I would say it's more than likely more bandwidth constrained than the 7900 XTX is. Hell, Chips and Cheese hit 3.8 GHz in an ALU-only workload; do you think that if they could feed the beast they wouldn't run a decoupled shader clock like the 7900 XTX, blow the power budget, etc.?

You seem to be trying quite hard to downplay / exclude this parameter. If a 5070 Ti didn't need 40% more bandwidth, why would NVIDIA equip it with GDDR7 when there is a significant premium on it over GDDR6? You think NV's a charity? Why then would they run the 5080 with an almost identical shader-to-bandwidth ratio as the 5070 Ti?

How are you factoring in improvements to N48's cache implementation? Are you sure you're not trying hard to overestimate N48's bandwidth limits? I really don't see the point in making stuff up out of thin air and trying to rationalize it based on pure guesswork.

To expand on your point: if N48 is severely bandwidth limited, why would AMD ship at clocks and power that can't reach its potential with GDDR6? It goes both ways.
 