AMD RX 7900XTX and RX 7900XT Reviews


As usual, no shit. :)

If we were talking about products that are otherwise completely comparable and made by companies each holding roughly 50% of the market, then yes, this would make sense.
In the situation AMD GPUs are in right now, being just a little bit more attractive in price won't attract any Nv buyers to their products.
Even the +20% markup on the 4080 should look fine from Nv's perspective - and it's not really a flat +20% either; this is mostly a US case I think, with the gap being smaller in other regions.
So, as unfortunate as it sounds, it looks like Nvidia knew exactly where the N31 cards would land, and their pricing for both the 4080 and 4090 reflects that knowledge very well.

That's a different argument from "Does this card provide a decent enough price/performance/cost ratio to the competition to be a rational choice for some" though. I'm saying that someone who does enough research and knows what they value in games could justify it - perhaps - against a $200 more costly 4080.

You're arguing from the perspective of "Does this card provide a value proposition attractive enough where it appreciably moves the marketshare needle for AMD", and on that front, I completely agree - it doesn't. The very fact you have to parse out these wins and there's so many if/buts, even with the $200 premium of the 4080 currently, precludes that from happening.

It's arguable whether you're going to get any significant marketshare penetration from a ~$1k product in the first place, even with a vastly superior product, but when you're starting from the position AMD is in, if you really want to claw back the gamer mindshare it has to be a slam dunk.

Recent rumors of a 4070Ti "relaunch" happening at the exact same price as was planned for the "4080 12GB" also make sense in this view - w/o RT it should land a bit below the 7900XT - which has the exact same price.

Eh, I don't know about that. The specs indicate a far more significant regression vs the 4080 16GB, and we're also talking 12GB vs 20. I think the performance gap in rasterized titles will be more significant than the XTX vs 4080.

If they release it at $799, then that's maybe a different matter.
 
While true I believe that this is mostly about GCN's execution pipeline being very compute friendly
Yes
in contrast to its graphics pipeline having all sorts of issues
Turns out it was "designed for Mantle, D3D12, Vulkan" without all that DX7/OpenGL cruft dragging the hardware and drivers down...
and less about the suitability of the architecture to modern (gaming) compute tasks - which tend to be more "single threaded" these days (which is what RDNA has "fixed" in comparison to GCN by going from 64-wide, 4-tick execution to 32-wide, 1-tick).
Arguably the introduction of wave32 is the key here, rather than the cadence of the ALUs. On PC, developers tuned for NVidia's preferred workgroup size of 32.

During the history of GCN AMD kept changing the scheduling algorithm for hardware thread progress (and batching of memory operations - wall clock time versus cache coherent access patterns), proof that it was a bane, so RDNA's approach is just another iteration of that continual search... Cache sizes inside shader engines keep growing, as more evidence.

Presumably this means that compute generally runs better in wave32 mode than in wave64.
Yes, probably mostly because the code was written for NVidia. But finding direct evidence is extremely hard... If you target a workgroup size of 32 then it results in many subtleties in the code when optimised so a GPU that wants 64 work items (GCN) is out of luck. Of course it's silly to assume that 32 is the best size for a given task.
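To make the "GCN wants 64 work items" point concrete, here's a toy sketch (my own illustration with a naive one-workgroup-per-wave mapping, not anything taken from a real driver) of how a workgroup tuned for 32 fills a 32-wide versus a 64-wide hardware wave:

```python
# Toy illustration: mapping a 32-item workgroup onto 32-wide vs 64-wide waves.
import math

def waves_and_utilisation(workgroup_size, wave_width):
    waves = math.ceil(workgroup_size / wave_width)   # waves needed per workgroup
    active = workgroup_size / (waves * wave_width)   # fraction of lanes doing work
    return waves, active

for wave_width in (32, 64):
    waves, active = waves_and_utilisation(32, wave_width)
    print(f"wave{wave_width}: {waves} wave(s) per 32-item workgroup, "
          f"{active:.0%} of lanes active")
# wave32 fills the wave completely, wave64 runs it half empty
```

Real hardware and compilers can pack things more cleverly than this, but it shows why code tuned around groups of 32 leaves a 64-wide machine short-changed.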

I also vaguely remember seeing something about compute shaders being almost exclusively wave32 in RDNA compiler output?
With the compute APIs allowing any workgroup size, but with multiples of 32 being preferred, it makes sense. Remember, too, that RDNA's wave64 mode is a software kludge because there is no intrinsic hardware thread size of 64. There's only execution and condition code support for 64 work items. The compiler strings wave64 out on top of the intrinsic 32 work item hardware thread size and does some funny bitwise moves and math to fool the hardware, which is still working on 32 work items at any one time. You have to listen closely to AMD engineers' descriptions of the hardware execution model to discern this...
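A rough software model of that kludge, as I read the above (purely illustrative - the real compiler and exec-mask handling are obviously more involved than this):

```python
# Rough model of RDNA wave64 emulation: a 64-bit exec mask is split into two
# 32-lane halves and the 32-wide hardware executes each half in turn.
def run_wave64(exec_mask_64, op, values):
    results = list(values)
    for half in (0, 1):                       # low 32 lanes, then high 32 lanes
        half_mask = (exec_mask_64 >> (32 * half)) & 0xFFFFFFFF
        for lane in range(32):                # the hardware only ever sees 32 lanes
            if half_mask & (1 << lane):
                idx = 32 * half + lane
                results[idx] = op(values[idx])
    return results

# e.g. square the even-numbered lanes of a "64-wide" wave
mask = sum(1 << i for i in range(0, 64, 2))
print(run_wave64(mask, lambda x: x * x, list(range(64)))[:8])
# [0, 1, 4, 3, 16, 5, 36, 7]
```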

These probably hit CPU limitations without RT and the GPU downclocks in the absence of workload.
Far Cry 6 isn't CPU limited, nor is Spider-Man Remastered...

Less power and/or temperature limited maybe?
Well, we have to be careful about "die quality" - was a chip binned down because a block or an MCD failed, while otherwise being a high clocker? Chip lottery...
 
As usual, no shit. :)
Not about me but more about the market. You're looking at a market where ~80% of consumers have been buying Nvidia GPUs for at least the last ~10 years.
Being "just a bit cheaper" while also not being on par on features and performance in some key areas won't help persuading these consumers to switch their choice of manufacturer.
It will allow AMD to retain their ~20% of the market at best as those who bought AMD over these years will still opt to buy AMD - as the situation hasn't really changed, we're more or less in the same comparative scenario we were since the launch of Maxwell.

I'm saying that someone who does enough research and knows what they value in games could justify it - perhaps - against a $200 more costly 4080.
If someone plays Warzone only? Sure.
Such a person wouldn't do much research though, I feel.
These cards are for AMD die-hards who have been buying AMD GPUs all the way back to the 9700, probably. They will "do the research" and end up being fine with essentially the same issues and advantages which RDNA2 had.

You're arguing from the perspective of "Does this card provide a value proposition attractive enough where it appreciably moves the marketshare needle for AMD", and on that front, I completely agree - it doesn't. The very fact you have to parse out these wins and there's so many if/buts, even with the $200 premium of the 4080 currently, precludes that from happening.
Yep.

It's arguable whether you're going to get any significant marketshare penetration from a ~$1k product in the first place, even with a vastly superior product, but when you're starting from the position AMD is in, if you really want to claw back the gamer mindshare it has to be a slam dunk.
Well, maybe not a complete slam dunk, but at least a product without obvious flaws like slow RT. That alone, if it were solved, would make the 7900XTX a clear winner at this price, even despite it lacking some of the competitor's features. Then again, I don't think AMD would've priced it this way if it were actually comparable everywhere with the 4080.

Eh, I don't know about that. The specs indicate a far more significant regression vs the 4080 16GB, and we're also talking 12GB vs 20. I think the performance gap in rasterized titles will be more significant than the XTX vs 4080.
It will probably be more significant but not egregious. The XT seems to be roughly on par with the 3090Ti w/o RT, and whatever benchmarks were shown for the "4080 12GB" put it at about the same level minus some 5-10% or so. With a win in RT this will likely be enough for Nvidia to price the card directly against the 7900XT anyway. The memory size difference won't stop them from doing this, especially as 12GB is enough even for 4K+RT right now - which is arguably not a resolution people will be buying a "4070Ti" for.

If they release it at $799, then that's maybe a different matter.
That would essentially make the 7900XT obsolete half a month after launch. I doubt that they'll just cut their margins for no apparent reason like this, at least not at launch - maybe as a repositioning a year from now.
 
Well, we have to be careful about "die quality" - was a chip binned down because a block or an MCD failed, while otherwise being a high clocker? Chip lottery...

Die variability seems really high here. At least one reviewer has been able to clock to original spec clocks of 3+GHz while drawing just 405W. Others report power leaks and bad clock limits. That makes it a bit hard to pin down where exactly things have gone wrong and where they might be fixed for future cards.

Arguably the introduction of wave32 is the key here, rather than the cadence of the ALUs. On PC, developers tuned for NVidia's preferred workgroup size of 32.

During the history of GCN AMD kept changing the scheduling algorithm for hardware thread progress (and batching of memory operations - wall clock time versus cache coherent access patterns), proof that it was a bane, so RDNA's approach is just another iteration of that continual search... Cache sizes inside shader engines keep growing, as more evidence.


Yes, probably mostly because the code was written for NVidia. But finding direct evidence is extremely hard... If you target a workgroup size of 32 then it results in many subtleties in the code when optimised so a GPU that wants 64 work items (GCN) is out of luck. Of course it's silly to assume that 32 is the best size for a given task.


With the compute APIs allowing any workgroup size, but with multiples of 32 being preferred, it makes sense. Remember, too, that RDNA's wave64 mode is a software kludge because there is no intrinsic hardware thread size of 64. There's only execution and condition code support for 64 work items. The compiler strings wave64 out on top of the intrinsic 32 work item hardware thread size and does some funny bitwise moves and math to fool the hardware, which is still working on 32 work items at any one time. You have to listen closely to AMD engineers' descriptions of the hardware execution model to discern this...

I'd love to see someone adopt resizable hw wave/work item size. Getting one or two hits in an entire wave once you pass to the ALUs seems like a large waste of potential unless you're absolutely minimizing instructions and bandwidth on hit, which you're probably not doing on dynamic objects unless you somehow have a realtime cache/lighting scheme there too.

There are some old ideas on it. With the advent of RT, however you're doing it, divergence is back to the fore after years and years of just pushing for higher res, so divergence worries balance out any increase in branching/etc.
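To put a rough number on the "one or two hits in an entire wave" worry, here's a hedged toy model (made-up hit rate, random mask) of how many lanes end up doing useful work when any hit forces the whole wave down the hit path:

```python
# Toy divergence model: with a sparse "hit" mask, how many lanes actually do
# work in the waves that have to execute the hit path?
import random

random.seed(0)
LANES = 1 << 16            # total work items simulated
HIT_RATE = 0.05            # made-up probability that a lane takes the hit path
hits = [random.random() < HIT_RATE for _ in range(LANES)]

for wave in (16, 32, 64):
    groups = [hits[i:i + wave] for i in range(0, LANES, wave)]
    waves_executing = sum(any(g) for g in groups)    # waves with at least one hit
    useful = sum(sum(g) for g in groups)             # lanes that actually hit
    util = useful / (waves_executing * wave)
    print(f"wave{wave}: {util:.1%} of lanes useful in waves that take the hit path")
```

Smaller (or resizable) waves waste fewer idle lanes when hits are sparse, which is the appeal.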
 
I'd argue Vega was still good for compute
Released the same year as V100 so lol.
Lmao.
This is worse IMO.
Nah, it's a mess, but it's a tiny (in die area) mess.
At least one reviewer has been able to clock to original spec clocks of 3+GHz while drawing just 405W
Those aren't gaming clocks at all.
I'd love to see someone adopt resizable hw wave/work item size
Errrrr, Intel?
 
Those aren't gaming clocks at all.

Errrrr, Intel?
Sad to hear

? The docs I can find indicate that the size is 16 work items, while 32 work items can be combined somehow, but I can't find more specifics. I was thinking something like this, but I don't know if that's how Intel works.

Regardless, you could potentially go beyond the linked paper. They suggest an odd grouping strategy, but I'm imagining sharing local resources like registers/shared memory across wave sizes when in the large wave size, while dynamically splitting up work items and shared resources when in the small wave size.
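Just to sketch the resource-splitting idea with made-up numbers (hypothetical register file size and per-lane register budget, nothing vendor-specific):

```python
# Hypothetical numbers: splitting a wide wave into narrower sub-waves that share
# the same register-file budget lets more waves be resident per SIMD.
REGFILE_BYTES = 128 * 1024      # made-up register file size per SIMD
VGPRS_PER_LANE = 64             # made-up per-lane register budget
BYTES_PER_VGPR = 4              # 32-bit registers

for lanes in (64, 32, 16):
    bytes_per_wave = lanes * VGPRS_PER_LANE * BYTES_PER_VGPR
    resident = REGFILE_BYTES // bytes_per_wave
    print(f"{lanes:2d}-wide waves: {bytes_per_wave // 1024} KiB of registers each, "
          f"{resident} resident per SIMD")
```

Same total register budget either way, but narrower waves each need a smaller slice, so more of them can be in flight to cover divergence and latency.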
 
Vega matched or was faster than its competitors with no performance pitfalls and didn't lack any features of relevance. It was just less power efficient. This is worse IMO.
The 6800 XT launched for only $50 less than the 3080 and was slower in rasterisation at 4K and substantially slower in ray tracing. This card is no worse at 4K and the ray tracing differential seems to have shrunk slightly. It also benefits from having FSR 2.1 available while at the launch of the 6000 series AMD had nothing.
 
[attached chart: performance hit from h/w Lumen in Fortnite, RDNA3 vs Ada]


The performance hit from enabling h/w Lumen in Fortnite seems to be comparable between RDNA3 and Ada - or even a bit higher in the case of Ada. Wonder why.
 
The 6800 XT launched for only $50 less than the 3080 and was slower in rasterisation at 4K and substantially slower in ray tracing. This card is no worse at 4K and the ray tracing differential seems to have shrunk slightly. It also benefits from having FSR 2.1 available while at the launch of the 6000 series AMD had nothing.
Vega was the competitor for GTX 1080 and 1070. RDNA 2 is a different comparison but that was also a better situation for AMD than this. 6900XT matched a 3090 outside of RT and was closer with RT than a 7900XTX is to a 4090. RT titles were also less prevalent and had less enticing implementations.
 
We should also not forget that AD102 (608mm²) is ~86mm² bigger than N31 (522mm²). Also worth mentioning that the MCM design loses ~10% of die area on both sides for the interconnect. So AD102 is effectively ~30% bigger than N31; taken in that light, N31 is doing really well for a first chiplet design.
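For what it's worth, a quick back-of-envelope check of those figures (my own arithmetic, taking the quoted die sizes at face value and assuming the ~10% interconnect overhead comes out of N31's total area):

```python
# Back-of-envelope die size comparison using the figures quoted above.
ad102 = 608   # mm², as quoted above
n31 = 522     # mm², GCD + MCDs as quoted above

print(ad102 - n31)            # ~86 mm² raw difference
print(ad102 / n31)            # ~1.17x raw ratio
print(ad102 / (n31 * 0.9))    # ~1.29x if ~10% of N31's area is interconnect overhead
```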

Also mentioned at ComputerBase: N31 scales really well with clock, while with the 4080 you see not-so-good clock scaling.

 
That would essentially make the 7900XT obsolete half a month after launch. I doubt that they'll just cut their margins for no apparent reason like this, at least not at launch - maybe as a repositioning a year from now.

No, I meant Nvidia introducing the 4070Ti at $799. Unlike AMD, they haven't announced a price yet; we just have rumors at this point.

We should also not forget that AD102 (608mm²) is ~86mm² bigger than N31 (522mm²). Also worth mentioning that the MCM design loses ~10% of die area on both sides for the interconnect. So AD102 is effectively ~30% bigger than N31; taken in that light, N31 is doing really well for a first chiplet design.

Sure, but unless that die size reduction actually filters down into the MSRP, it's hard to really care much. It's also consuming significantly more power.
 
We should also not forget that AD102 (608mm²) is ~86mm² bigger than N31 (522mm²). Also worth mentioning that the MCM design loses ~10% of die area on both sides for the interconnect. So AD102 is effectively ~30% bigger than N31; taken in that light, N31 is doing really well for a first chiplet design.
But the RTX 4080 comes with the AD103 die, which is much smaller than the 7900XTX (380mm² vs 522mm²) and is also cut down (76 of 80 SMs), yet it provides similar levels of raster performance and higher ray tracing performance, at reduced power consumption too.

Moving on to another point... what's happening in this 3DMark Mesh Shader test?

[attached image: 3DMark Mesh Shader test results]

 
Maybe we should hold off on Mesh Shader conclusions until somebody can benchmark this here. If somebody gets their hands on this card, please run this benchmark:

Start the benchmark with these settings so that we can all compare:
main_vk1.exe -fov 60 -subdiv 6 -mode 2 -meshlet 0
https://tellusim.com/mesh-shader-emulation/

 