AMD RX 7900XTX and RX 7900XT Reviews

Also, when it can't dual issue from the same wavefront, why can't they make it so that it could fill the other ALU from another wavefront that is ready to go?

Seems like a no-brainer at a high level, so there must be some difficult technical hurdle to overcome to make it work.
Beyond the doubled scheduler throughput, issuing from two wavefronts at once means maintaining twice as many wavefronts in flight in the CU in order to keep up throughput. That also means doubling register size, you'll want to increase cache in line, and eventually you've just doubled the CU count.

Simply doubling up the ALUs certainly isn't going to bring anywhere near double performance, but it also comes with a much smaller increase in die size and power draw. Those are the variables you're trying to optimise your performance against; metrics like realised IPC/peak IPC are at best a means rather than an end.
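To put rough numbers on the "twice as many wavefronts means twice the registers" point, here is a back-of-envelope sketch. The wavefront count, VGPRs per lane, lane width and register width are illustrative assumptions, not figures from this thread or from AMD.

```python
def vgpr_file_bytes(wavefronts_in_flight: int,
                    vgprs_per_lane: int,
                    lanes_per_wavefront: int = 32,   # assumed wave32
                    bytes_per_register: int = 4) -> int:
    """Register storage needed to keep the given wavefronts resident."""
    return (wavefronts_in_flight * vgprs_per_lane
            * lanes_per_wavefront * bytes_per_register)


# Hypothetical baseline: 16 resident wavefronts, 64 VGPRs per lane each.
baseline = vgpr_file_bytes(wavefronts_in_flight=16, vgprs_per_lane=64)

# Same kernel, but twice the wavefronts kept in flight to feed both ALUs.
doubled = vgpr_file_bytes(wavefronts_in_flight=32, vgprs_per_lane=64)

print(baseline // 1024, "KiB ->", doubled // 1024, "KiB")
```

Storage scales linearly with resident wavefronts, so feeding the second ALU from a different wavefront pushes you toward a register file (and then cache) roughly the size of a second CU's.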
 
Also, when it can't dual issue from the same wavefront, why can't they make it so that it could fill the other ALU from another wavefront that is ready to go?

Seems like a no-brainer at a high level, so there must be some difficult technical hurdle to overcome to make it work.
You need to expand the VRF and operand network to sustain up to 2x arbitrary operands from up to 2x VALU instructions co-issuing at full rate.

Look at the register usage limitations imposed on VOPD. That is a clear sign that they did not do so, at least in RDNA 3. It would also require them to have hardware checks to detect and schedule around these limitations across wavefronts (beyond just VRF bank conflicts).
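As a toy illustration of what such a check looks like, here is a minimal sketch. The rules in it are simplifying assumptions made for the example (four VGPR banks selected by register index modulo 4, matching source slots of the two ops in different banks, destinations of opposite parity); they are not quoted from the ISA document, and `ValuOp` and `can_co_issue` are hypothetical names.

```python
from dataclasses import dataclass

NUM_BANKS = 4  # assumption: VGPRs interleaved across four banks


@dataclass(frozen=True)
class ValuOp:
    dst: int   # destination VGPR index
    src0: int  # first source VGPR index
    src1: int  # second source VGPR index


def bank(vgpr: int) -> int:
    return vgpr % NUM_BANKS


def can_co_issue(x: ValuOp, y: ValuOp) -> bool:
    """Return True if the two ops satisfy the assumed pairing rules."""
    # Matching source slots must not hit the same bank in the same cycle.
    if bank(x.src0) == bank(y.src0) or bank(x.src1) == bank(y.src1):
        return False
    # Assumed write-port rule: one even and one odd destination.
    if x.dst % 2 == y.dst % 2:
        return False
    # The two ops must be independent of each other.
    if x.dst in (y.src0, y.src1) or y.dst in (x.src0, x.src1):
        return False
    return True


ok = can_co_issue(ValuOp(dst=0, src0=1, src1=2), ValuOp(dst=5, src0=2, src1=3))
clash = can_co_issue(ValuOp(dst=0, src0=1, src1=2), ValuOp(dst=7, src0=5, src1=3))
```

Within one wavefront the compiler can arrange registers so a check like this passes ahead of time. Across wavefronts the hardware would have to evaluate something equivalent at issue time, which is the extra cost being described.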
 
Beyond the doubled scheduler throughput, issuing from two wavefronts at once means maintaining twice as many wavefronts in flight in the CU in order to keep up throughput. That also means doubling register size, you'll want to increase cache in line, and eventually you've just doubled the CU count.

Simply doubling up the ALUs certainly isn't going to bring anywhere near double performance, but it also comes with a much smaller increase in die size and power draw. Those are the variables you're trying to optimise your performance against; metrics like realised IPC/peak IPC are at best a means rather than an end.

This is why I like the variable wavefront idea. If register/cache occupancy is low and the wavefront can be split, then do so, in half or quarters or whatever. Then adapt to higher occupancy as you go. With all the passes and work variability these days it seems like an optimization waiting to happen, rather than defaulting to the worst case scenario every time.

There's hardware to build and problems to solve, but it seems like tuning the wavefront size to hit high utilization might be worth it.
 
If a retape is needed, yes, but if a metal spin is enough it would be A1 (which apparently still takes 3-6 months).
Yeah, that is where I'm confused.
A respin of N31 into A1 shouldn't take more than 2-3 months.
It would also mean nothing is fundamentally broken with the design.
 
The one thing that is just giving me cause for concern right now is – you guessed it! – pricing. While currently out of stock here in the UK, the card launched at about £1300 at OCUK, making it £100 more expensive than some RTX 4080s currently on the market.

This was a worry for me when the 7900 XTX launched. It's all well and good for the reference card to come in at £999, but custom cards still need to be cheaper than the RTX 4080 for the 7900 XTX to offer any meaningful value. As good as Sapphire’s design is, if you can get something with similar overall performance, but with DLSS support and superior ray tracing, for £100 less... well, the 7900 XTX becomes very hard to justify.
 
The 7900 XTX Sapphire Nitro+ ups the power limit to 440 W, yet that yields only a 2% to 4% performance increase across 5 games (Cyberpunk, Resident Evil, Spider-Man, Far Cry 6 and Horizon) at 4K.

Undervolting and overclocking the card to 3 GHz pushed power consumption to 490-510 W and yielded a 10% performance uplift over the base model. Without undervolting, I guess 530 W is not that far off.
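For a rough sense of what that does to efficiency, here is a quick sketch. It assumes the base model draws about 355 W (the reference board power, not a number from this post) and treats the 440 W limit as the actual draw, which may overstate it.

```python
def relative_efficiency(perf_gain: float, power_w: float,
                        base_power_w: float = 355.0) -> float:
    """Performance per watt relative to the base card (1.0 = same)."""
    return (1.0 + perf_gain) / (power_w / base_power_w)


# Nitro+ at its 440 W limit, taking the midpoint of the 2-4% gain.
nitro = relative_efficiency(perf_gain=0.03, power_w=440.0)

# Undervolted and overclocked to 3 GHz, midpoint of 490-510 W, 10% gain.
oc = relative_efficiency(perf_gain=0.10, power_w=500.0)

print(f"Nitro+ perf/W vs base: {nitro:.2f}")
print(f"3 GHz OC perf/W vs base: {oc:.2f}")
```

Under those assumptions both configurations land well below the base card in performance per watt, which matches the impression that the extra power buys very little.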

 
Something is wrong with the N31 frontend. Here, too, the Draw Call Overhead Test in 3DMark is much worse than expected.

So mesh shaders are maybe broken, draw calls are maybe broken, MultiDrawIndirect is maybe broken?

 
Yea, I recommend setting aside a budget to repaste and repad any used card you get.

Things are generally more expensive over here, though thermal paste and thermal pads are not among them.
Not all GPUs are used to mine either; sellers do state if cards have been used for mining.
 
Something is wrong with the N31 frontend. Here, too, the Draw Call Overhead Test in 3DMark is much worse than expected.

So mesh shaders are maybe broken, draw calls are maybe broken, MultiDrawIndirect is maybe broken?

Do you think that AMD could fix it with a driver update?
 
tbqh if you're buying RDNA2 or Ampere, it really doesn't matter if they've been mined on. The cards are new enough that the wear and tear is unlikely to mean anything at all. Linus demonstrated an even older card (I wanna say an RX 5x0 series?) that had been mined on for many years and still worked with no issues despite that age and usage level.

GPUs don't exactly die easily.
 