I think most of this sounds like nonsense, but duplicating all of that might not be that bad given the yield recovery benefit. And why would you duplicate the display, video, PCIe blocks, etc.? That's a non-trivial amount of silicon and certainly not cheap.
I could see something like an M1 Pro/Max physical-layout strategy making sense (google it and come to your own conclusions about what that means), but at that point it's literally just an engineering implementation detail and not something anyone outside the company has any reason to care about (outside of geeking out about these kinds of things, of course).
Also, no consumer/graphics GPU has had an A100/H100-style split L2, so if that were the case it would be noteworthy. I'm not sure why they'd want one either, given how superior the RTX 4090's L2 cache is to the H100's in the ways that matter for consumer GPUs (i.e. the H100 likely benefits from higher maximum cache bandwidth due to its massive HBM DRAM bandwidth, and AI matrix-multiplication access patterns likely fit the split slightly better, but besides that the AD102 L2 is lower latency and higher capacity while still not taking up a huge percentage of the die).
My guess is it's a very traditional brute-force monolithic 800mm²+ 512-bit GDDR7 chip on N4P with a single L2, and *maybe* GB203 is "just" a cut-down GB202, but in a way that mostly saves engineering/verification effort and might even still require a separate tape-out depending on their methodology. I'd be pleasantly surprised if it were more interesting than that, but I don't see what benefit they'd get from it at their volumes tbh.
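For what it's worth, a 512-bit GDDR7 bus implies serious bandwidth. A minimal back-of-envelope sketch (the 28 Gbps per-pin data rate is my assumption, not something from the post; announced GDDR7 targets cover a range of speeds):

```python
# Back-of-envelope memory bandwidth for a hypothetical 512-bit GDDR7 config.
# bandwidth (GB/s) = bus width (bits) / 8 * per-pin data rate (Gbit/s)
bus_width_bits = 512
pin_speed_gbps = 28  # assumed per-pin rate; actual products may differ

bandwidth_gbs = bus_width_bits / 8 * pin_speed_gbps
print(bandwidth_gbs)  # 1792.0 GB/s
```

At that assumed pin speed you'd land around 1.8 TB/s, which is roughly in the ballpark where a big single L2 plus GDDR7 can plausibly compete without HBM.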