AMD: Navi Speculation, Rumours and Discussion [2019-2020]

If Navi 2x GPUs have a monster last level cache, then console games will not be built to take advantage of it, because consoles don't have such a monster cache.

I don't think the cache would require games to be programmed a different way. If that were the case, Navi 2x would not perform well on older games at all, which doesn't seem to be the case from the leaks/benchmarks we have seen so far. Just the fact that console games would be optimized for the RDNA2 architecture should help performance in future games. The purpose of the cache is perhaps more to increase performance/W? It also allows them to reduce external memory bandwidth and costs. The die cost for the cache may have been too large on the 7nm process for the consoles, which are already pushing the envelope in terms of new tech and costs. RDNA2 and Zen 2 by themselves are a large enough leap in performance and performance/W compared to Jaguar and GCN.
 
If Navi 2x GPUs have a monster last level cache, then console games will not be built to take advantage of it, because consoles don't have such a monster cache.
Why wouldn't it be reliant on RDNA2's memory subsystem or an HBCC-like unit to realize that it has cache available and can keep the data closer?
I realize that modern APIs have given devs more low-level access, but after all the features AMD has tried to add that were never fully supported, I would expect them not to make that same mistake.
 
Higher IPC and clocks obviously don't come for free. If all these rumours are true, then they've exceeded their 50% perf/W target by a mile, and the numbers teased at the Zen event are almost certainly from a 72CU or even 64CU SKU if we're being really optimistic. I'm still a bit skeptical, though I'd love to be pleasantly surprised. 3 more days to go anyway, we'll find out soon enough.
 
I'm trying to make sense of this ... I think he's saying that he didn't want to post their numbers, but he confirmed their 3080 numbers were similar to his. So instead of posting the 6800XT number they sent him, he "estimated" the number from his own 3080 result based on the ratio of the 3080 to 6800XT numbers they sent him ... it's really dumb, and I have no idea why he'd do something so convoluted, but it's the only interpretation I can come up with.
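For example (completely made-up numbers, just to illustrate the arithmetic): if the AIB slide listed 3080 = 17,000 and 6800XT = 18,700, and his own 3080 run scored 17,500, then the published "estimate" would be 17,500 × (18,700 / 17,000) ≈ 19,250, rather than the 18,700 he was actually given.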


He also added another 10% to the 6800XT scores to account for a new driver. It is a bit odd that all the leaks are for 3DMark. Maybe AMD did like Nvidia and whitelisted only certain apps for AIB drivers. There’s also talk of AIBs having artificially crippled drivers. Wednesday can’t come soon enough.
 
He also added another 10% to the 6800XT scores to account for a new driver. It is a bit odd that all the leaks are for 3DMark. Maybe AMD did like Nvidia and whitelisted only certain apps for AIB drivers. There’s also talk of AIBs having artificially crippled drivers. Wednesday can’t come soon enough.

I have a stack of games waiting for a new video card, so this whole thing is making me very impatient. Lol
 
He also added another 10% to the 6800XT scores to account for a new driver. It is a bit odd that all the leaks are for 3DMark. Maybe AMD did like Nvidia and whitelisted only certain apps for AIB drivers. There’s also talk of AIBs having artificially crippled drivers. Wednesday can’t come soon enough.

This is not so clear; the way he words it seems to mean AMD supplied a new driver which brought a 10% increase in scores, and the scores he was showing were achieved with that driver.
 
So in RTG's latest video, he claims he was shown slides that compare gaming performance between the 6800XT and the 3080. The 6800XT wins 5, loses 3, and ties 2 out of 10 at 4K resolution, and wins 8/10 at 1440p.

I wonder if this is a controlled leak by AMD
 
It's interesting that the biggest change in both Zen3 and RDNA2 seems to be massively reworked cache.

Not totally surprised. Apple has been leading in mobile CPU performance for years now by concentrating on their cache hierarchy. Having compute resources sitting idle waiting for data means that adding more silicon, rather than getting instructions and data to the idling hardware faster, doesn't make a lot of sense; modern archs generally rip through work, while SRAM has utterly and completely failed to scale as well as logic.

Makes me wonder whether AMD, Nvidia, (and... possibly Intel?) are lining up for Samsung's 3nm. Not only will they be years ahead of others if they stick to their schedule, but their stacked SRAM tech could cut die space by a hell of a lot, as SRAM is ballooning on most dies versus logic. If they can get their yield numbers to anything reasonable they'll be in a great position. After all, TSMC's upcoming chiplet interconnects can technically connect dies from other foundries as well as their own.
 
This is not so clear; the way he words it seems to mean AMD supplied a new driver which brought a 10% increase in scores, and the scores he was showing were achieved with that driver.

Yeah I’m using a translator which says:

“In addition, it must also be added that AMD is said to have distributed another performance driver yesterday, which then brought the last 10% increase, which is already included in the above result.”

I took "the above result" to mean his calculated numbers, as that graph was right above the quote. But it could also mean it was already included in the original percentages he got from the AIB.
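To put rough numbers on it (completely made-up figures, just to illustrate the difference): say the ratio calculation gives 19,250. If the driver gain was already reflected in the AIB percentages, then 19,250 is the final figure; if he added the 10% himself on top of the calculated number, it becomes 19,250 × 1.1 ≈ 21,175.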
 
I don't really get Igor; why so much obfuscation? If he just named figures in the ballpark of tens of thousands with a granularity of 500, no one would guess his leak's source (unless AMD provides BIOSes with very specific power targets and V/F curves that could help them root out the leaking AIBs). Also, it seems most of the people doing this have no clue about the overall vs. graphics score, and the fact that the overall score basically says nothing if you don't know the specific CPU it was tested on. Anyway, only three days to go and then this madness will end (probably)
 
Not totally surprised.

I for one am very pleasantly surprised. Most of the GPU cache research I’ve seen seems to favor HPC. I always figured that if big caches are useful for HPC and Nvidia still hasn’t bothered to invest there on its high-margin HPC chips, then we probably won’t see them on gaming cards either.

My guess is if there is some kind of huge cache then AMD is pinning entire buffers in that cache to avoid thrashing. Probably via heuristics of which render targets & UAVs are accessed most frequently.

I haven’t given up on the option that they simply went with a 512-bit bus though.
 
My guess is if there is some kind of huge cache then AMD is pinning entire buffers in that cache to avoid thrashing. Probably via heuristics of which render targets & UAVs are accessed most frequently.
I've seen patent documents talk about the problem of dealing with the overheads of small cachelines (e.g. 64 bytes) while using monster caches. The solution is based upon regions (from/to address ranges), which would map nicely to things like render targets and UAVs.
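Purely to illustrate the idea (a hypothetical sketch, not anything lifted from the patents or from RDNA2 itself): instead of tagging individual 64-byte lines, the controller could keep a small table of address ranges, one per surface, count accesses per range, and pin the hottest surfaces wholesale. Something like:

#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch of region-based tracking: one entry per surface
// (render target / UAV) instead of per 64-byte cacheline.
struct Region {
    uint64_t base = 0;   // start address of the surface
    uint64_t size = 0;   // size in bytes
    uint64_t hits = 0;   // access counter feeding the pinning heuristic
    bool pinned = false;
};

struct RegionCache {
    std::vector<Region> regions;

    // Bump the access counter of whichever region covers this address.
    void onAccess(uint64_t addr) {
        for (auto& r : regions)
            if (addr >= r.base && addr < r.base + r.size) { ++r.hits; return; }
    }

    // Heuristic: pin the most frequently accessed surfaces, as many as fit.
    void repin(uint64_t cacheBytes) {
        std::sort(regions.begin(), regions.end(),
                  [](const Region& a, const Region& b) { return a.hits > b.hits; });
        uint64_t used = 0;
        for (auto& r : regions) {
            r.pinned = (used + r.size <= cacheBytes);
            if (r.pinned) used += r.size;
        }
    }
};

The appeal is that a handful of range compares replaces per-line bookkeeping for huge working sets, and "pin this whole render target" falls out of it almost for free.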

I haven’t given up on the option that they simply went with a 512-bit bus though.
256-bit MC + PHY is about 64mm².
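Back-of-the-envelope, assuming MC + PHY area scales roughly linearly with bus width: 64 mm² × 2 ≈ 128 mm² just for a 512-bit interface, which is a lot of die budget to spend purely on I/O.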
 