AMD Execution Thread [2024]

Status
Not open for further replies.
Yet it wasn't expected to scale beyond 5nm either after 3nm didn't improve.

3nm's behaviour was known once TSMC published the technical details: TSMC pushed FinFETs about as far as they could go, with SRAM not scaling being the drawback. They went for maturity over rushing new transistor tech before they thought it was fully ready for mass production (we can see how that worked out for Samsung). But SRAM was never expected to stop scaling altogether, just to scale much less than logic going forward.

Either way, whether it scales or not, the higher cost of the new nodes will itself make using more SRAM challenging for all but the highest margin chips. So having tech like AMD's will be an advantage.
 

9800X3D-REVIEW-1.jpg
 
That's what the X3Ds were always good at -- not specifically the top-line performance, rather the stoic consistency in delivering frames even at the lowest edges of the statistical graphs.
 
not specifically the top-line performance, rather the stoic consistency in delivering frames even at the lowest edges
These are the same. You'll get similar improvements in both lows and highs. The reason why it's not happening in this leaked review is because the highs (averages) there are GPU limited.
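
To illustrate the GPU-limited point, here is a minimal sketch assuming a simplified model where each frame takes as long as the slower of the CPU and the GPU. All numbers and names below are hypothetical, not taken from the leaked review, and real pipelines overlap CPU and GPU work in ways this ignores.

```python
def delivered_frame_times(cpu_ms, gpu_ms):
    """A frame is ready only when both CPU and GPU work is done,
    so it takes as long as the slower of the two (simplified model)."""
    return [max(c, g) for c, g in zip(cpu_ms, gpu_ms)]


def avg_fps(frame_ms):
    """Average FPS = frames rendered / total seconds elapsed."""
    return 1000.0 * len(frame_ms) / sum(frame_ms)


# Hypothetical workload: the GPU needs a steady 10 ms per frame.
gpu = [10.0] * 100
# Hypothetical CPUs: both are usually faster than the GPU,
# but 5 frames in 100 hit a CPU-heavy spike.
slow_cpu = [6.0] * 95 + [20.0] * 5
fast_cpu = [5.0] * 95 + [12.0] * 5

slow = delivered_frame_times(slow_cpu, gpu)
fast = delivered_frame_times(fast_cpu, gpu)

print(f"average FPS: {avg_fps(slow):.1f} -> {avg_fps(fast):.1f}")
print(f"worst frame: {max(slow):.0f} ms -> {max(fast):.0f} ms")
```

In this made-up case the average moves by only about 4% because the GPU sets the pace on 95 of 100 frames, while the worst frames drop from 20 ms to 12 ms. Remove the GPU cap and the faster CPU would show up in the average as well.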
 
My statement around consistency in framerate was aimed at those who buy these procs for the ultimate gaming rig, combined with all the other ultimate settings of their chosen game. Sure, I understand and acknowledge there is a segment of the gaming population who are competitive twitch gamers and run at the lowest details because absolute-maximum framerates are king.

However, for those of us who aren't competitive gamers and are instead here for the max-rez, max-details playing experience, even my 4090 at 2865MHz will still become the bottleneck for my 3440x1440 120Hz display -- of course depending on the game. At which point, the far better 1% and 0.1% low frame times really matter more than the absolute peak of CPU output, IMO.
 
There is a caveat though, especially for 0.1% lows: they are much more affected by run-to-run and test variance, and a given test scenario may be more or less favourable to certain architectures.

This is something to keep in mind when judging how representative these small test slices are of longer, more varied gameplay.

Also, the number itself isn't fully illustrative, since it doesn't show whether it reflects, say, one prolonged stutter or a lot of microstutters.

I think there is a danger with a growing assumption that 1% and 0.1% lows are really representative as singular numbers with how they are used.
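
A rough sketch of why a single "lows" number can hide the difference between one prolonged stutter and scattered microstutters. It assumes one common definition of an N% low (FPS averaged over the slowest N% of frames); review tools differ on this, and the traces, function names, and 25 ms threshold are all hypothetical.

```python
def low_fps(frame_ms, fraction):
    """FPS averaged over the slowest `fraction` of frames.
    Assumed definition; some tools report a percentile frame time instead."""
    n = max(1, round(len(frame_ms) * fraction))
    worst = sorted(frame_ms, reverse=True)[:n]
    return 1000.0 * n / sum(worst)


def longest_slow_run(frame_ms, threshold_ms):
    """Length of the longest streak of consecutive frames slower than threshold_ms."""
    longest = current = 0
    for t in frame_ms:
        current = current + 1 if t > threshold_ms else 0
        longest = max(longest, current)
    return longest


# Hypothetical traces: 10,000 frames at 8 ms, ten of them taking 50 ms.
one_long_stutter = [8.0] * 10_000
one_long_stutter[5000:5010] = [50.0] * 10   # ten slow frames in a row

microstutters = [8.0] * 10_000
for i in range(10):
    microstutters[500 + i * 1000] = 50.0    # ten isolated slow frames

for name, trace in (("one long stutter", one_long_stutter),
                    ("microstutters", microstutters)):
    print(name,
          f"1% low: {low_fps(trace, 0.01):.1f} FPS,",
          f"0.1% low: {low_fps(trace, 0.001):.1f} FPS,",
          f"longest slow run: {longest_slow_run(trace, 25.0)} frames")
```

Both traces produce identical 1% and 0.1% lows because sorting discards the order of the frames; only the run-length check tells them apart. The sample-size concern is visible here too: the 0.1% low of a 10,000-frame capture rests on just ten frames, so a few extra hitches between runs can move it a lot.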
 
No singular metric should be presumed more important than any other singular metric. 0.1% frame times tell a different story than 99% frame times, the same as naively assuming average framerates tell the whole story too.

History has demonstrated significant increases in cache size and decreases in memory access latency pay dividends in more regular frame pacing, which tends to show up as both better lows and better highs.

If I'm "peak shaving" by using quality and resolution settings to cause my GPU to become the bottleneck, then my interest is no longer the absolute maximum speed, but a more consistent delivery.
 
Keep in mind that this review is with a 4070Ti at 1440p, which would be more GPU than CPU limited, although fairly representative of an upper mid-range/enthusiast gaming system these days.

If true, the numbers should be even better for say a 4090 at 1080p, which is what most reviews will test I imagine. We'll find out tomorrow.
 
Seems like the 9800X3D is better than I expected, seeing real reviews. The slightly higher clocks are OK, though not that big a deal, but the extra cache does seem to be alleviating some bottlenecks that Zen 5 has in gaming, at least better than it did for Zen 4. I know some theorized it might, but there was no way to know without some testing.

I still don't think the product as a whole is amazing value and it's not some huge leap from 7800X3D, but it's undeniably the fastest gaming CPU in existence by a moderate margin.
 
With this new stacking method, and now that the cache layer is the same size as the CPU layer, I wonder if we'll see a "new" version of the X3D lineup with dual vcache-stacked-CCDs? The only real challenge for prior dual CCD X3D parts was the heterogeneity of the cache, leaving some arbitrary combination of AMD software, Windows scheduler, and game logic to figure out which cores were the "right" ones. Giving both CCDs the vcache treatment solves so many dumb scheduling problems, and would bring a level of production-workload performance to the desktop which would outclass even many of Intel's Xeon class chips.
 
I wonder if we'll see a "new" version of the X3D lineup with dual vcache-stacked-CCDs?
It's one possibility and it would probably be the most interesting development for gaming in the Zen 5 lineup. AMD's reasoning for keeping the cache to just one CCD was clocks, which shouldn't be an issue now with the logic sitting above the cache.

However - and this is already the case with 9800X3D - I hope that they'll avoid pushing the clocks too much as this hurts overall perf/watt while providing very limited gains in gaming and not that impressive ones outside of it. 7950X3D consuming 65W in gaming is a good thing, and now this is already higher with 9800X3D thanks to its clock boost. 9950X3D hitting more than 125W would be quite disappointing in comparison to what 7950X3D is capable of.
 
Wendell from Level1Techs seemed to think the 65W ECO mode for the 9800X3D had no discernable impact on gaming performance, but did affect "production" workloads. I'll link him saying ECO mode "mostly did not impact gaming performance" at the 236s mark here:

The unfortunate part is, despite him saying it, he doesn't provide any benchmarks. I'd like to see someone out there actually run the numbers and show us how different the 65W operating mode is from the full 120W mode.
 
Neither is Zen 5 so this was to be expected.
Going by Zen 5's gaming performance, we might have expected only a lower single digit increase over 7800X3D, but it's more like 10%, which is definitely better than many would have expected and more than some small clockspeed improvements could explain.
 
Going by Zen 5's gaming performance, we might have expected only a lower single digit increase over 7800X3D, but it's more like 10%, which is definitely better than many would have expected and more than some small clockspeed improvements could explain.
Clocks are increased from 4.7 on 7800X3D to 5.2GHz which is exactly a 10% change. In contrast to Zen 4 non-3D parts the 3D ones had to use lower clocks so it's not surprising that the gains are higher here as the clocks increase is in fact also higher.
 
Clocks are increased from 4.7 on 7800X3D to 5.2GHz which is exactly a 10% change. In contrast to Zen 4 non-3D parts the 3D ones had to use lower clocks so it's not surprising that the gains are higher here as the clocks increase is in fact also higher.
Huh? 7800X3D's Boost clock is 5 GHz, not 4.7 GHz.
 
With this new stacking method, and now that the cache layer is the same size as the CPU layer, I wonder if we'll see a "new" version of the X3D lineup with dual vcache-stacked-CCDs? The only real challenge for prior dual CCD X3D parts was the heterogeneity of the cache, leaving some arbitrary combination of AMD software, Windows scheduler, and game logic to figure out which cores were the "right" ones. Giving both CCDs the vcache treatment solves so many dumb scheduling problems, and would bring a level of production-workload performance to the desktop which would outclass even many of Intel's Xeon class chips.
It opens up a lot of possibilities. The next iteration might be something similar to the CDNA3 Instinct setup.
The IOD gets turned into an active interposer, CCDs with Vcache stacked on top. Maybe add a small GPU chiplet with Vcache. Then you just need to move the LPDDR on package.
 
Specced clocks and actual sustained clocks are different things.
Handbrake might not be the best test for that.
For example, our 7800X3D sample did about 4.95 GHz in Cinebench and 5.05 GHz (yes, over the supposed max) in Cyberpunk (and 9800X3D did 5.225 GHz in Cyberpunk for us, also a tad over the supposed max).
 
Neither is Zen 5 so this was to be expected.
I honestly doubt that we'll see a lot of performance improvement in AM5, certainly not to the level of AM4.
This being said AMD certainly has a couple of aces up their sleeves which they may want to use in AM5 - but if Intel will continue as they do currently this may only happen in AM6.
Well Zen 6 seems to still be on AM5, and with a combination of IPC and frequency improvements due to N3P (rumoured for mainstream parts), should be a pretty reasonable improvement. Apparently Intel will have no competitive part for gaming until Nova Lake in 2026, but even then I don't expect AMD to stagnate. Bear in mind whatever Zen 6 is going to be is likely already set in stone, as these kinds of decisions are made years in advance, so it's not like they can suddenly change things just based on how Arrow Lake turned out.
With this new stacking method, and now that the cache layer is the same size as the CPU layer, I wonder if we'll see a "new" version of the X3D lineup with dual vcache-stacked-CCDs? The only real challenge for prior dual CCD X3D parts was the heterogeneity of the cache, leaving some arbitrary combination of AMD software, Windows scheduler, and game logic to figure out which cores were the "right" ones. Giving both CCDs the vcache treatment solves so many dumb scheduling problems, and would bring a level of production-workload performance to the desktop which would outclass even many of Intel's Xeon class chips.
Rumour is the dual CCD parts will have Vcache on both CCDs this time around. And there's also a rumour of Zen 5 Threadripper getting Vcache.
Going by Zen 5's gaming performance, we might have expected only a lower single digit increase over 7800X3D, but it's more like 10%, which is definitely better than many would have expected and more than some small clockspeed improvements could explain.
There were some games which saw 15-20% improvements as well, such as Hogwarts Legacy, Starfield, Homeworld 3, etc which are definitely more than expected. It does seem like the IOD might be limiting Zen 5 a bit and AMD really should have updated it. There were also improvements in some workloads such as Cinebench where there was no improvement for Zen 4 X3D over Zen 4. This bodes well for Turin-X and we could see it release as early as Q1'26.
It opens up a lot of possibilities. The next iteration might be something similar to the CDNA3 Instinct setup.
The IOD gets turned into an active interposer, CCDs with Vcache stacked on top. Maybe add a small GPU chiplet with Vcache. Then you just need to move the LPDDR on package.
This might not be feasible for all client parts, esp mid-range and lower. But I hope they do make some changes for the higher end parts. If they have a Strix Halo successor with Zen 6, that would certainly be interesting. LPDDR on package didn't really work out for Lunar Lake and Intel is scrapping it with the next gen Panther Lake already.
Handbrake might not be the best test for that.
For example, our 7800X3D sample did about 4.95 GHz in Cinebench and 5.05 GHz (yes, over the supposed max) in Cyberpunk (and 9800X3D did 5.225 GHz in Cyberpunk for us, also a tad over the supposed max).
Yeah, those are similar numbers to a few reviews I've seen as well: the 9800X3D gets about a 200 MHz boost at best over the 7800X3D for gaming. So that's about a 4% gain; the rest is due to the higher IPC of Zen 5.
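
For anyone who wants to check the clock arithmetic in this exchange, a quick sketch using only the frequencies quoted in the thread. The helper name is made up, and splitting a ~10% gain into "clock" and "everything else" assumes performance scales linearly with frequency, which is a simplification.

```python
def uplift_pct(old_ghz, new_ghz):
    """Percentage increase going from old_ghz to new_ghz."""
    return (new_ghz / old_ghz - 1.0) * 100.0


# Clock pairs quoted in the thread (7800X3D -> 9800X3D).
cases = {
    "4.7 -> 5.2 GHz": (4.7, 5.2),
    "5.0 -> 5.2 GHz": (5.0, 5.2),
    "5.05 -> 5.225 GHz (observed in Cyberpunk)": (5.05, 5.225),
}
for label, (old, new) in cases.items():
    print(f"{label}: {uplift_pct(old, new):.2f}%")

# Assumption: performance scales linearly with clock. Then a ~10% overall
# gain with a 4% clock gain leaves this much for IPC, cache, and the rest.
total_gain = 1.10
clock_gain = 5.2 / 5.0
residual_pct = (total_gain / clock_gain - 1.0) * 100.0
print(f"non-clock share of a 10% gain: {residual_pct:.2f}%")
```

The 4.7 GHz starting point gives about 10.6%, while the 5.0 GHz boost clock and the observed Cyberpunk clocks both land between 3.5% and 4%, leaving roughly 5.8% of a 10% gain to be explained by something other than frequency.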
 