The AMD 9070 / 9070XT Reviews and Discussion Thread

What AMD needs to focus on is communicating and advertising the rate of FSR4 adoption so people have confidence to purchase.
Yep. Really FSR 3.1 adoption, though, so that owners of RX 6000 and 7000 cards (and compatible GPUs from competitors) continue to get the benefit of some sort of upscaling, while RX 90xx owners get the quality improvement.

The much bigger installed base would make adoption an easier sell.
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?
To my understanding it's not strictly required, but running it on older hardware would be roughly twice as heavy. It uses INT8, which RX 7000 technically supports but only runs as fast as FP16, while RX 9000 does double.
edit:
scratch that, I think it does quadruple what RX 7000 does
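For a rough sense of scale, here's a back-of-the-envelope sketch; the per-CU rates below are my assumptions from public RDNA 3/4 discussion, not confirmed AMD specs:

```python
# Rough per-CU matrix-op rates (ops/CU/clock). These figures are
# assumptions from public RDNA 3/4 discussion, not confirmed specs.
rates = {
    "RDNA 3 (RX 7000)": {"FP16": 512, "INT8": 512},    # INT8 no faster than FP16
    "RDNA 4 (RX 9000)": {"FP16": 1024, "INT8": 2048},  # dense rate, no sparsity
}

base_int8 = rates["RDNA 3 (RX 7000)"]["INT8"]
for arch, r in rates.items():
    print(f"{arch}: INT8 = {r['INT8']} ops/CU/clk "
          f"({r['INT8'] / base_int8:.0f}x RDNA 3)")
```

If those rates are right, that's where the "quadruple" figure comes from.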
 
Random? Arithmetic, texturing and bandwidth have been the pillars of traditional rendering performance for decades.

Not when discussing architectural efficiency. The pillars there are PPA and PPW: performance per area and performance per watt. Peak flops haven't been correlated with performance for some time, and certainly not when comparing different architectures; they're mostly marketing numbers at this point. The bandwidth comparison is interesting, but without knowing how each chip would perform with similar off-chip bandwidth, any conclusions there are speculative at best.

You can’t argue with power or area though.
 
Or 3dfx. That worked well for them.
Why not both? Microsoft has its Surface line, and HP, Dell, and others still produce PCs and laptops.

AMD could make cards exactly to base spec and let the AIBs make custom cards that are priced higher. Isn't that sorta the point of the NVIDIA Founders Editions?
 
Why not both? Microsoft has its Surface line, and HP, Dell, and others still produce PCs and laptops.

AMD could make cards exactly to base spec and let the AIBs make custom cards that are priced higher. Isn't that sorta the point of the NVIDIA Founders Editions?

I think the problem is NVIDIA's FE cards come in limited quantities. They are generally only available around a GPU's release and can be difficult to find a few months later.
If NVIDIA or AMD started making first-party cards in large quantities, it would create problems in their relationships with AIB partners. Surface and Surface Pro are more available, but they are still just a drop in the bucket.
I also don't think it's good for consumers if there are only first-party cards.
 
This has probably already been discussed, but what hardware in the 9000 series is required for FSR4?

From Digital Foundry's video, about 1 minute in. Maybe as they tune the algorithm they can reduce the requirements, but it seems like previous RDNA 3 chips aren't capable of it.

I couldn't find much about INT8 on the previous chips, but according to Tom's Hardware: https://www.tomshardware.com/news/amd-rdna-3-gpu-architecture-deep-dive-the-ryzen-moment-for-gpus

The 7900 XTX is able to do 122 INT16 TOPS. If the performance doubles like it does from INT32 (61.4) to INT16 in the list, then we are looking at 244 INT8 TOPS for the 7900 XTX and 206 for the 7900 XT. Going further down the stack would only give less and less performance.

So I think even if they tune the algorithm, it may be impossible on RDNA 3.
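Spelled out as a quick sketch (the 7900 XT's 103 INT16 TOPS is inferred by halving the 206 figure above, and the INT16-to-INT8 doubling is exactly the assumption that may not hold on RDNA 3):

```python
# Extrapolate INT8 TOPS for RDNA 3, assuming the rate doubles with each
# halving of precision, as it does from INT32 (61.4 TOPS) to INT16
# (122 TOPS) in Tom's Hardware's table. The 7900 XT figure is inferred.
int16_tops = {"7900 XTX": 122.0, "7900 XT": 103.0}

for card, tops in int16_tops.items():
    # Optimistic: RDNA 3 may actually run INT8 no faster than INT16/FP16.
    int8_est = tops * 2
    print(f"{card}: ~{int8_est:.0f} INT8 TOPS (if the doubling held)")
```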
 
I think the problem is NVIDIA's FE is sort of in limited quantities. They are generally only available when a GPU was released and it can be difficult to find a few months later.
If NVIDIA or AMD started to make first party cards in large quantities, this will create problems in relationships with AIB partners. Surface and Surface Pro are more available but they are still just a drop in the bucket.
I also don't think it's good for consumers if there's only first party cards.
I don't think AMD-only cards would work either. But if they made a generic card that strictly adheres to the base spec, there shouldn't be much of an issue. Third parties can add better cooling and higher clock speeds while pricing their cards higher. I'd certainly pay a premium for a more compact card that offers better cooling than the stock one, or for one with a built-in water block, and I think a lot of people would.
 
Perf per off-chip byte also matters; memory ain't growing on trees anymore.

Yeah we don’t have the SKUs to compare though. We don’t know how bandwidth limited N48 is and we don’t know how much excess bandwidth GB203 has. Best we can say is fast GDDR7 helps. How much is anyone’s guess.
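For reference, the raw pin-speed-to-bandwidth arithmetic is at least straightforward (both cards use a 256-bit bus; the pin speeds are the advertised ones):

```python
# Effective memory bandwidth: GB/s = Gbps per pin * bus width (bits) / 8
def bandwidth_gbs(gbps_per_pin: float, bus_bits: int) -> float:
    return gbps_per_pin * bus_bits / 8

cards = {
    "9070 XT (20 Gbps GDDR6, 256-bit)": bandwidth_gbs(20, 256),  # 640 GB/s
    "5070 Ti (28 Gbps GDDR7, 256-bit)": bandwidth_gbs(28, 256),  # 896 GB/s
}
for name, bw in cards.items():
    print(f"{name}: {bw:.0f} GB/s")
# 896 / 640 - 1 = 40% more raw bandwidth for the GDDR7 card.
```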
 
Yeah we don’t have the SKUs to compare though. We don’t know how bandwidth limited N48 is and we don’t know how much excess bandwidth GB203 has. Best we can say is fast GDDR7 helps. How much is anyone’s guess.
Given that 20 Gbps memory can normally hit about 22 Gbps overclocked, we should be able to get a pretty good idea just by testing 18, 20, 21, and 22 Gbps at the same core clock.

But either way: a 7900 XTX running its memory at 2700 MHz in CP2077 sees about an 8-10% performance increase (50 ±1 to 55 ±1 fps) over 2500 MHz (I have lots of mods and optimised RT and quality settings for my preferences, so mileage may vary; upscaling and frame gen were disabled for the test), and it has the same CU L0/L1/L2 and LLC-to-memory bandwidth ratios as the 9070 XT. With the improvements to ALU/CU utilisation, out-of-order memory access, etc., it is very likely a significant amount of performance is left on the table at 20 Gbps for the 9070 XT.

If I were a betting man I would say it's more than likely more bandwidth constrained than the 7900 XTX is. Hell, Chips and Cheese hit 3.8 GHz in an ALU-only workload; do you think that if they could feed the beast they wouldn't run a decoupled shader clock like the 7900 XTX, blow the power budget, etc.?

You seem to be trying quite hard to downplay / exclude this parameter. If a 5070 Ti didn't need 40% more bandwidth, why would NVIDIA equip it with GDDR7 when there is a significant premium on it over GDDR6? You think NV's a charity? Why then would they run the 5080 with an almost identical shader-to-bandwidth ratio as the 5070 Ti?
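To make that scaling concrete, here's a crude sensitivity check on the CP2077 data point above; one game, one scene, my settings, so treat it as an anecdote rather than a benchmark:

```python
# Crude bandwidth-sensitivity check from the memory-OC numbers above.
mem_before, mem_after = 2500, 2700   # memory clock in MHz
fps_before, fps_after = 50.0, 55.0   # midpoints of the quoted 50 +-1 / 55 +-1

bw_gain = mem_after / mem_before - 1     # ~8% more bandwidth
fps_gain = fps_after / fps_before - 1    # ~10% more performance

# A scaling ratio near (or above) 1.0 suggests the scene is strongly
# bandwidth limited; well below 1.0 would point at other bottlenecks.
print(f"bandwidth +{bw_gain:.0%}, fps +{fps_gain:.0%}, "
      f"ratio {fps_gain / bw_gain:.2f}")
```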
 
Given that 20 Gbps memory can normally hit about 22 Gbps overclocked, we should be able to get a pretty good idea just by testing 18, 20, 21, and 22 Gbps at the same core clock.

But either way: a 7900 XTX running its memory at 2700 MHz in CP2077 sees about an 8-10% performance increase (50 ±1 to 55 ±1 fps) over 2500 MHz (I have lots of mods and optimised RT and quality settings for my preferences, so mileage may vary; upscaling and frame gen were disabled for the test), and it has the same CU L0/L1/L2 and LLC-to-memory bandwidth ratios as the 9070 XT. With the improvements to ALU/CU utilisation, out-of-order memory access, etc., it is very likely a significant amount of performance is left on the table at 20 Gbps for the 9070 XT.

If I were a betting man I would say it's more than likely more bandwidth constrained than the 7900 XTX is. Hell, Chips and Cheese hit 3.8 GHz in an ALU-only workload; do you think that if they could feed the beast they wouldn't run a decoupled shader clock like the 7900 XTX, blow the power budget, etc.?

You seem to be trying quite hard to downplay / exclude this parameter. If a 5070 Ti didn't need 40% more bandwidth, why would NVIDIA equip it with GDDR7 when there is a significant premium on it over GDDR6? You think NV's a charity? Why then would they run the 5080 with an almost identical shader-to-bandwidth ratio as the 5070 Ti?

How are you factoring in improvements to N48's cache implementation? Are you sure you're not trying hard to overestimate N48's bandwidth limits? I really don't see the point in making stuff up out of thin air and trying to rationalize it based on pure guesswork.

To expand on your point: if N48 is severely bandwidth limited, why would AMD ship at clocks and power that can't reach its potential with GDDR6? It goes both ways.
 