AMD Execution Thread [2023]

N32 is pretty competitive with AD104, which is its closest competition specs-wise. No reason to expect things will be different on 3nm. AMD's challenge, as usual, will be branding and market penetration.
N32 kind of sits between AD103 and AD104. Since the 4080 is a cut-down AD103, it makes a decent point of comparison. Of course, the 4080 gives the 7800XT an epic spanking. There are other factors at play (GDDR6X vs GDDR6, N32 being on 5nm and 6nm), but I'm just saying NVIDIA is further ahead than we tend to think. I think if we could see the BOMs for the 7800XT and 4080 they would be shockingly close.
 
N32 kind of sits between AD103 and AD104. Since the 4080 is a cut-down AD103, it makes a decent point of comparison. Of course, the 4080 gives the 7800XT an epic spanking. There are other factors at play (GDDR6X vs GDDR6, N32 being on 5nm and 6nm), but I'm just saying NVIDIA is further ahead than we tend to think. I think if we could see the BOMs for the 7800XT and 4080 they would be shockingly close.
AD103 has 63% more transistors than N32. AD104 is a much closer comparison and there the 7800 XT competes with the cut down version.
 
AD103 has 63% more transistors than N32. AD104 is a much closer comparison and there the 7800 XT competes with the cut down version.
These comparisons are hard to make now with AMD and NVIDIA pursuing different strategies in chip production. The fairest comparison to me is how much the cards cost to produce. Estimates I've seen put the 7800XT and RTX4080 pretty close to each other.
 
Btw, does RDNA3 on desktop have higher idle power consumption than RDNA 2?

It's a curious finding that on mobile, AMD Phoenix has higher idle power consumption than AMD Rembrandt.
Navi32 and Navi31 necessarily have some higher idle power draw due to the inter-chiplet communication links needing to be powered up.
Sending a bit of data off-die will always cost more power than sending a bit somewhere else on-die.
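As a back-of-the-envelope illustration of that cost, here's a minimal sketch using generic signaling-energy ballparks. The pJ/bit values and the bandwidth figure are assumptions for illustration only, not AMD's actual Infinity Fanout numbers:

```python
# Rough link-power estimate: power = bits/second * joules/bit.
# The energy-per-bit values are generic ballpark assumptions,
# not measured figures for any specific AMD interconnect.

def link_power_watts(bandwidth_gbps: float, energy_pj_per_bit: float) -> float:
    bits_per_second = bandwidth_gbps * 1e9
    return bits_per_second * energy_pj_per_bit * 1e-12

# Keeping a hypothetical 512 Gb/s of link traffic alive:
print(link_power_watts(512, 0.1))  # on-die wires, ~0.05 W
print(link_power_watts(512, 2.0))  # off-die over organic substrate, ~1 W
```

Even a watt or so per link adds up across four to six MCD links (N32 and N31 respectively) in a way a monolithic die never pays.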

However, comparing Navi 33 to 23 (monolithic to monolithic, so most representative of what you'd see in an APU anyway), they look the same, except for the odd outlier of video playback:

Not quite sure why an RX 7600 sucks down more power than a 4090 in that scenario, whereas a 6600XT hums along at 10w.
It's really difficult to compare apples to apples in scenarios like this, two really important things can distort the readings:

1) Fan curves - if one model doesn't have a fan-stop option at all, or the fan curves are so aggressive that its fans spin up in a certain test where another card's fans don't, power skyrockets, but that's not the GPU silicon's fault, strictly speaking. The card TPU tested definitely stops its fans at idle, but there's not enough data to say which cards' fans were spinning during the video playback test.

2) Power delivery - unsophisticated VRMs sometimes lack 'phase shedding', where the controller shuts down unneeded phases at low loads, which greatly increases efficiency. Or, kind of like the fan curve thing, it may just be that the set points of each card are different due to different VRM controllers and programming. So if card A is running 8 phases flat out at the low-but-not-zero load of playing back a video, but card B is only running one or two, that'll make a significant difference in the power consumption numbers as well, and that's not the GPU silicon's fault either.

[chart: TechPowerUp power consumption comparison]
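To put a toy model behind point (2): with a fixed per-phase overhead, running many phases at a light load wastes far more power than shedding down to one or two. All the constants here are illustrative assumptions, not real VRM controller figures:

```python
# Toy VRM model: each active phase carries a fixed overhead (gate drive,
# switching losses), plus conduction loss that shrinks as the load is
# split across phases. Constants are made up for illustration.

def vrm_loss_watts(load_w: float, active_phases: int,
                   fixed_w_per_phase: float = 1.5,
                   conduction_coeff: float = 0.002) -> float:
    load_per_phase = load_w / active_phases
    conduction = active_phases * conduction_coeff * load_per_phase ** 2
    return active_phases * fixed_w_per_phase + conduction

light_load = 20.0  # watts, hypothetical video-playback load
print(vrm_loss_watts(light_load, active_phases=8))  # ~12.1 W of loss
print(vrm_loss_watts(light_load, active_phases=2))  # ~3.4 W of loss
```

At that hypothetical 20w load, the 8-phases-flat-out card burns several extra watts purely in VRM overhead, which lands in the card's measured power without the GPU silicon being any hungrier.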
 
These comparisons are hard to make now with AMD and NVIDIA pursuing different strategies in chip production. The fairest comparison to me is how much the cards cost to produce. Estimates I've seen put the 7800XT and RTX4080 pretty close to each other.
Using a die-yield calculator, with a defect density of 0.07 defects/cm² and an edge loss of 4mm on a 300mm wafer, and ballpark costs of $17000/$10000 per wafer for 5nm/6nm as suggested by Ian Cutress, I make it ~$156 for the AD103 silicon vs ~$110 for the 7800 XT's, not taking packaging costs into account. GDDR6X memory is probably also more expensive than GDDR6.


While the AD104 silicon is ~$101, counting all partial dies as usable (the 4070 being a cut-down version). So maybe AMD is a bit further behind, but I still think AD103 is the wrong comparison point.
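For anyone who wants to reproduce the math, here's a minimal sketch of that kind of calculator: the classic gross-dies-per-wafer approximation plus a Poisson yield model. The die dimensions below are assumed rectangles matching the published die areas, and the outputs shift by a few percent with different aspect-ratio and edge-handling choices:

```python
import math

def dies_per_wafer(die_w_mm: float, die_h_mm: float,
                   wafer_d_mm: float = 300, edge_loss_mm: float = 4) -> int:
    """Gross dies: wafer area / die area, minus an edge-curvature correction."""
    d = wafer_d_mm - 2 * edge_loss_mm  # usable diameter
    area = die_w_mm * die_h_mm
    return int(math.pi * d * d / (4 * area) - math.pi * d / math.sqrt(2 * area))

def cost_per_good_die(die_w_mm: float, die_h_mm: float, wafer_cost: float,
                      defect_density: float = 0.07) -> float:
    """Poisson yield: Y = exp(-D0 * A), with D0 in defects/cm^2."""
    area_cm2 = die_w_mm * die_h_mm / 100
    yield_frac = math.exp(-defect_density * area_cm2)
    return wafer_cost / (dies_per_wafer(die_w_mm, die_h_mm) * yield_frac)

# ~379 mm² AD103 on an assumed $17k 5nm wafer:
print(f"AD103: ~${cost_per_good_die(24.5, 15.5, 17_000):.0f}")  # ~$155
# ~200 mm² N32 GCD ($17k 5nm) plus four ~37 mm² MCDs ($10k 6nm):
n32 = cost_per_good_die(16.5, 12.1, 17_000) + 4 * cost_per_good_die(6.1, 6.1, 10_000)
print(f"N32 silicon: ~${n32:.0f}")
```

This reproduces the ~$156 AD103 figure almost exactly; the N32 total lands somewhat below the ~$110 quoted above, which is the kind of spread you get from different partial-die and edge assumptions.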
 
It's still gonna suck due to low memory bandwidth, and they'll probably want >$350 US for it.

AMD has been giving me the sads lately. :(
 
Using a die-yield calculator, with a defect density of 0.07 defects/cm² and an edge loss of 4mm on a 300mm wafer, and ballpark costs of $17000/$10000 per wafer for 5nm/6nm as suggested by Ian Cutress, I make it ~$156 for the AD103 silicon vs ~$110 for the 7800 XT's, not taking packaging costs into account. GDDR6X memory is probably also more expensive than GDDR6.


While the AD104 silicon is ~$101, counting all partial dies as usable (the 4070 being a cut-down version). So maybe AMD is a bit further behind, but I still think AD103 is the wrong comparison point.
That's good to know. So it does sit between AD103 and AD104, closer to AD104. IDK why NVIDIA even bothers with GDDR6X, but I assume they got a good price on it. Seems not worth it if it's significantly more expensive.
 
It's still gonna suck due to low memory bandwidth, and they'll probably want >$350 US for it.
Oh yeah, to that end, one more interesting thing I noticed about the 7700XT from Eurogamer's review.

I'd posted earlier that, according to TechPowerUp's review, the 7700XT was incredibly overclockable, and in fact, once highly overclocked, it actually managed to be faster than the 7800XT in at least some tests, which is rare to see these days.

I'd been too eager to focus on that. People usually look at the CU or TFLOPs numbers to compare cards in the stack, but that really doesn't tell the whole story on this one.

7700XT -> 7800XT:
54 -> 60 CUs (11.1% increase, but realistically even less than that as the 7700XT is clocked higher out of the box and less power limited, so the actual TFLOPs increase is closer to 5% best case for the 7800XT)
48 -> 64MB of Infinity Cache (33.3% increase, both in size and overall bandwidth since the slices are striped together)
432 -> 624 GB/s of memory bandwidth (44.4% increase)
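Putting numbers on the parenthetical above (the boost clocks are ballpark rated figures I'm assuming, roughly 2.54 GHz for the 7700XT and 2.43 GHz for the 7800XT):

```python
# 7700XT -> 7800XT deltas. Clock values are assumed rated-boost
# ballparks, not measured in-game clocks.
cus_7700, clk_7700 = 54, 2544  # MHz
cus_7800, clk_7800 = 60, 2430  # MHz

cu_gain = cus_7800 / cus_7700 - 1                                 # ~11.1%
compute_gain = (cus_7800 * clk_7800) / (cus_7700 * clk_7700) - 1  # ~6.1%
bw_gain = 624 / 432 - 1                                           # ~44.4%
print(f"CUs +{cu_gain:.1%}, compute +{compute_gain:.1%}, bandwidth +{bw_gain:.1%}")
```

So on paper the compute gap is single digits while cache and memory bandwidth jump by a third or more, which is why the result below is interesting.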


However, in Starfield, the 7800XT has a ~30% lead, making the 7700XT a pretty awful decision at its MSRP.
Sometimes you really do need that 'real' memory bandwidth, and there's only so much the on-chip caches can do.
 
These comparisons are hard to make now with AMD and NVIDIA pursuing different strategies in chip production. The fairest comparison to me is how much the cards cost to produce. Estimates I've seen put the 7800XT and RTX4080 pretty close to each other.

It's almost impossible to accurately guess what anything actually costs to make. Even if we had the cost numbers, it wouldn't tell us anything about the architecture or expected performance. The best we can go on is high-level specs, unit counts, bandwidth, etc. for comparables.
 
However, in Starfield, the 7800XT has a ~30% lead, making the 7700XT a pretty awful decision at its MSRP.
Sometimes you really do need that 'real' memory bandwidth, and there's only so much the on-chip caches can do.
The 7800 XT has both more cache and more memory bandwidth, enough to explain the difference. Not quite sure how you can draw the conclusion that it's the memory bandwidth rather than the caches, or both combined, making the difference.
 
It's almost impossible to accurately guess what anything actually costs to make. Even if we had the cost numbers, it wouldn't tell us anything about the architecture or expected performance. The best we can go on is high-level specs, unit counts, bandwidth, etc. for comparables.
It shouldn't be that complicated to make a rough guess based on some basic generalities and specs.

A 4080 is a 379mm² die on the TSMC 5nm family, but it's also cut down by 10% or so.

The 7800XT is 349mm² of total silicon, but only 200mm² of that is on TSMC 5nm (and requires it to be fully enabled), while the rest is on the 7nm family.

I'd say it's still pretty safe to say the 7800XT is ultimately cheaper to make. TSMC 5nm has amazing yields, and a 200mm² die is small enough that they're gonna get great yields per wafer. All while the remaining 7nm-family MCDs are cheap and obviously incredibly plentiful.

That said, it's pretty obvious the performance of the 7800XT should be way closer to the 4080 than it is. It's almost outright shameful that it's only competing with the cut-down AD104 4070.

RDNA3 is clearly terrible in terms of performance per mm², but pricing is always a way out of this. And I have no doubt that if RDNA3 had performed like it was meant to, the 7800XT would not cost just $500.
 
The 7800 XT has both more cache and more memory bandwidth, enough to explain the difference. Not quite sure how you can draw the conclusion that it's the memory bandwidth rather than the caches, or both combined, making the difference.

That is a good point. I suspect the memory bandwidth is the more important factor here, but I don't have the empirical data to say for sure; it would require some heavy memory-underclock testing on a 7800XT to be sure. :)
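One way to see why the two factors are hard to untangle without that testing: in a simple effective-bandwidth model, cache hit rate, cache bandwidth, and DRAM bandwidth all enter the same expression, so the 7800XT improves every term at once. All numbers here are made-up placeholders, not measured hit rates or cache bandwidths:

```python
# Toy model: average time per byte is a hit-rate-weighted blend of
# cache and DRAM service times. All inputs are illustrative guesses.

def effective_bandwidth_gbs(hit_rate: float, cache_bw_gbs: float,
                            dram_bw_gbs: float) -> float:
    return 1 / (hit_rate / cache_bw_gbs + (1 - hit_rate) / dram_bw_gbs)

# Hypothetical 7700XT-like vs 7800XT-like configurations:
print(effective_bandwidth_gbs(0.50, cache_bw_gbs=2000, dram_bw_gbs=432))  # ~710
print(effective_bandwidth_gbs(0.55, cache_bw_gbs=2600, dram_bw_gbs=624))  # ~1072
```

Underclocking only the GDDR6 on a 7800XT would isolate the dram_bw_gbs term, which is exactly the experiment suggested above.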
 
It shouldn't be that complicated to make a rough guess based on some basic generalities and specs.

A 4080 is a 379mm² die on the TSMC 5nm family, but it's also cut down by 10% or so.

The 7800XT is 349mm² of total silicon, but only 200mm² of that is on TSMC 5nm (and requires it to be fully enabled), while the rest is on the 7nm family.

I'd say it's still pretty safe to say the 7800XT is ultimately cheaper to make. TSMC 5nm has amazing yields, and a 200mm² die is small enough that they're gonna get great yields per wafer. All while the remaining 7nm-family MCDs are cheap and obviously incredibly plentiful.

That said, it's pretty obvious the performance of the 7800XT should be way closer to the 4080 than it is. It's almost outright shameful that it's only competing with the cut-down AD104 4070.

RDNA3 is clearly terrible in terms of performance per mm², but pricing is always a way out of this. And I have no doubt that if RDNA3 had performed like it was meant to, the 7800XT would not cost just $500.

Die size isn’t that helpful especially when dealing with chiplets. The 7800XT is much closer to the 4070 Ti on paper.

Another interesting comparison is 6900 XT vs 7800 XT. Almost identical performance even though the balance of CUs, flops, bandwidth and cache is quite different.
 
Packaging costs (and/or severe yield and defect-recovery issues) would likely have to be astronomical for Navi 32 to be cost-comparable to AD103, even with the latter cut down as in the RTX 4080. I just don't feel that's likely.

Not that the reality from the business side is likely much better, however. Navi 31 is likely at best cost-comparable to AD103, and Navi 32 to AD104. The problem, though, is that in terms of market positioning they are competing against cut-down versions of each respective chip while also undercutting them.

My feeling this entire time is that many people might be placing (or did place) too much emphasis on the advantage of chiplets, especially in terms of cost, due to their success in the CPU space with AMD (chiplet) against Intel (monolithic). But there seem to be a lot of both technical and business factors that are completely different, and not just limited to possible execution issues for this generation.
 
Die size isn’t that helpful especially when dealing with chiplets. The 7800XT is much closer to the 4070 Ti on paper.

Another interesting comparison is 6900 XT vs 7800 XT. Almost identical performance even though the balance of CUs, flops, bandwidth and cache is quite different.
Die size is still very critical, even when talking about chiplets. I think I covered all the general stipulations well enough in my comment. And yes, N32 is probably closer to AD104 in cost than to AD103. That was kinda my whole point.

My feeling this entire time is that many people might be placing (or did place) too much emphasis on the advantage of chiplets, especially in terms of cost, due to their success in the CPU space with AMD (chiplet) against Intel (monolithic). But there seem to be a lot of both technical and business factors that are completely different, and not just limited to possible execution issues for this generation.
I don't think that's entirely fair. RDNA3 is not a good barometer for the benefits of chiplets, since it seems the core architecture itself is just kinda terrible. But lots of people won't really consider that, and will just point to the more overt and physical difference and say, "Oh, RDNA3 is bad cuz of chiplets," even though there's really no reason to think that.

I think had RDNA3 performed as AMD expected, it would have changed the whole perspective completely. AMD could have either A) raised prices and made a much stronger case for chiplets for their own financial benefit, or B) offered similar prices to now, but with way better value in terms of performance per dollar over Nvidia.

Nvidia is probably in no real hurry here cuz they think they can get away with whatever pricing they want anyway. Their strategy for improving margins isn't reducing costs, it's just charging more. Had RDNA3 turned out better, they probably couldn't get away with what they're doing now. Chiplets in GPUs will be disruptive; it's just gonna require Nvidia or AMD to do a good job on the fundamentals.
 
AD103 has 63% more transistors than N32. AD104 is a much closer comparison and there the 7800 XT competes with the cut down version.
The 4070Ti (full AD104) is the more suitable comparison if we want one based on absolute theoreticals, and here the 4070Ti is significantly ahead of the 7800XT. Compute resources (CUs/SMs, TMUs, ROPs, RT cores) are almost identical, the die size is comparable too, 294 mm² (4070Ti) vs 346 mm² (7800XT), and the TDP is also comparable, 285w (4070Ti) vs 263w (7800XT).

The cut-down AD104 (aka the 4070) is cut down too much. It loses nearly a quarter of the cores, loses ~200MHz of clocks, and consumes way less power. It's already at a huge rasterization and memory bandwidth disadvantage vs the 7800XT. It's only comparable to it based on price, which is subject to market forces, not technical merits.
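For concreteness on how deep that cut is, using published SM counts and memory specs:

```python
# RTX 4070 vs full AD104 (RTX 4070 Ti): 46 of 60 SMs enabled.
print(f"core cut: {1 - 46 / 60:.1%}")             # ~23.3%
# RTX 4070 (504 GB/s) vs RX 7800 XT (624 GB/s) memory bandwidth:
print(f"bandwidth deficit: {1 - 504 / 624:.1%}")  # ~19.2%
```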
 
Using a die-yield calculator, with a defect density of 0.07 defects/cm² and an edge loss of 4mm on a 300mm wafer, and ballpark costs of $17000/$10000 per wafer for 5nm/6nm as suggested by Ian Cutress, I make it ~$156 for the AD103 silicon vs ~$110 for the 7800 XT's, not taking packaging costs into account. GDDR6X memory is probably also more expensive than GDDR6.


While the AD104 silicon is ~$101, counting all partial dies as usable (the 4070 being a cut-down version). So maybe AMD is a bit further behind, but I still think AD103 is the wrong comparison point.
TSMC N6 is supposed to be a "cost effective" replacement for N7, which should have been well under $10k in 2022; I think we were throwing around ~$7-$8k numbers last year.
Do you have a quote from Ian about TSMC N6 cost? Even after all of TSMC's price hikes, I can't believe they moved a cost-effective node up close to 2018 introductory prices...
Also, Nvidia's custom N4 is likely on the high side of the 5nm wafer cost range (I think ~$16-$18k was mentioned last time), even with their volume discount.

I previously estimated N31 at ~$120-$130 for pure die costs: ~300mm² GCD, ~$90-$95 each (<$100); MCD, ~$4 each (<$5).
Using similar numbers, I would guess N32 is around $80-$85: ~200mm² GCD, ~$65 each; MCD, ~$4 each.
Edit: AMD might be selling an N32 package for ~$110-$120.

Apparently, AMD's whole focus for RDNA3 was cost, as they made quite a few comments about PPA (power, performance, area) during their presentation last year.
 
Also, Nvidia's custom N4 is likely on the high side of the 5nm wafer cost range (I think ~$16-$18k was mentioned last time), even with their volume discount.
N4 is probably the exact same price as the N5-family process it's based on. Every big customer gets to customize the process for their chips to a point, and so far there's no indication of N4 being any different, except that NVIDIA wants to call it something different.
 