And perhaps 180mm2 on a new process is greedy on my part. If and when Sony/MS do slim versions of their consoles w/ FinFET, they would probably end up around that size. If Nintendo want smaller/cheaper, they need to go w/ less. Between 130 and 150mm2, maybe.
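For a rough sanity check on that 180mm2 figure, some back-of-the-envelope shrink arithmetic; the ~0.5x area scaling from 28nm planar to 14/16nm FinFET is only a rule of thumb, and the die sizes are the commonly reported launch figures:

```python
# Ballpark die-shrink arithmetic behind the "around 180mm2" guess.
# ~0.5x area scaling from 28nm planar to 14/16nm FinFET is a rough
# rule of thumb, not a guaranteed foundry figure.

launch_dies_mm2 = {"PS4": 348, "XB1": 363}  # widely reported launch SoC sizes
AREA_SCALE = 0.5  # assumed 28nm -> 14/16nm area scaling

for name, area in launch_dies_mm2.items():
    print(f"{name} slim-style shrink: ~{area * AREA_SCALE:.0f} mm2")
```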
I'm still unconvinced that MSFT or Sony will make a FinFET version of the XB1 or the PS4, for the simple reason that if AMD were planning to have Puma+ cores available on a 14/16nm process, they would be only too happy to inform shareholders and investors. The writing (for me at least) is on the wall, or rather it isn't: the wall is blank.
Yeah, it's more business than technology at this point. But I would be interested in seeing
the PowerVR GT7900 in a "low power console" like they mention in the blog.
It is both technology and business: even Nvidia can't touch PowerVR's performance in the low-end, low-power segment. The configuration you are pointing to is a bit too high-end, as I don't think Nintendo will try to compete with Sony or MSFT, and, so that doesn't sound harsh, I would actually agree with such a move; it is not in their best interest. Now, where do consoles sit in the greater realm of graphics? AMD's GPUs, in consoles and as discrete cards, are the GPUs pushing the most FLOPS; they also happen to have the worst performance measured in FPS per FLOP.
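To make the FPS-per-FLOP point concrete, a minimal sketch; every number below is an invented placeholder, not benchmark data:

```python
# FPS per TFLOP as a crude efficiency metric. Specs and frame rates
# are made up purely for illustration.
gpus = {
    "GPU A": (2.0, 48.0),  # (raw TFLOPS, hypothetical average FPS)
    "GPU B": (1.2, 40.0),
}

for name, (tflops, fps) in gpus.items():
    print(f"{name}: {fps / tflops:.1f} FPS per TFLOP")
```

The card with more raw FLOPS can still come out behind once you divide frames by FLOPS, which is the point being made about AMD's GPUs.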
Ultimately I'm saying one thing: whether or not Nintendo wants to compete with, outdo, or undercut (in price) its competitors has nothing to do with FinFET or HBM; it has a lot more to do with the IPs you choose. I am convinced that you can outdo both the PS4 and the XB1 using a 28nm process and no fancy memory technology, all while spending less on silicon (for the IP licensing costs it is difficult to know).
True, UE4 runs on mobile, but I seem to recall reading about there being different rendering methods depending on GPU grunt. Don't mobile games use a forward rendering technique? Would this matter to Nintendo?
Well, hardware deferred renderers (the GPUs themselves) do their thing through hardware and drivers; if I'm not mistaken, it is transparent to software.
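For what it's worth, the forward-vs-deferred question above is about the software rendering path, while PowerVR-style tile-based deferred hardware sits below that, at the driver level. A toy cost model of the two software approaches; the scene numbers are arbitrary and this only shows how they scale:

```python
# Toy shading-work model: forward vs deferred. Arbitrary scene numbers,
# no culling/overdraw/shadow subtleties; it only illustrates the scaling.

num_objects = 200        # hypothetical scene
frags_per_object = 5000  # hypothetical rasterized fragments per object
screen_pixels = 1280 * 720
num_lights = 16

# Forward: every rasterized fragment is shaded against every light.
forward_ops = num_objects * frags_per_object * num_lights

# Deferred: geometry goes into a G-buffer once, then each screen pixel
# is lit per light, independent of object count (at a bandwidth cost,
# which is one reason mobile titles often stick to forward rendering).
deferred_ops = screen_pixels * num_lights

print(f"forward : {forward_ops:,} shading ops")
print(f"deferred: {deferred_ops:,} shading ops")
```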
big.LITTLE is something they could be looking at if they just want some more SIMD capability on the CPU side. I think they'll try to get the CPU comparable to Xbone/PS4, as CPU tasks don't scale as easily as GPU ones do.
Indeed, it seems it does not, by any public accounts available to us, though it could very well be that AMP is here to stay, and not only for power-saving purposes. When I read the ongoing research I linked, my understanding is that the Cell Broadband Engine's issue was not heterogeneity but the memory model. One thing is clear: eight middle-of-the-road CPU cores connected through a sucky interconnect is not efficient.

As for SIMD capability, I don't know, though devs would know, through profiling, more or less what they need. I would assert that low-IPC/low-power cores are good for significantly parallel tasks with a decent amount of dependencies: tasks where both the "mojo" of high-IPC cores (and their usually big caches) and the brute-force approach (GPUs) fail. That would require real profiling, and it's something AMD can't deliver, but it would be interesting to see how 2 clusters of 4x A72, each with 2MB of L2, compare to one cluster of 4x A72 with 2MB of L2 backed by 2 clusters of A35, each linked to 1MB of L2. The latter might come out slightly tinier: looking at the old Exynos 5433 (4x A57 and 4x A53) and matching Anandtech's data, you could almost fit 4 clusters of A35 with 512KB of L2 in the same space as another A72 cluster. Rough numbers below.
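Putting rough numbers on that layout comparison; the cluster areas are placeholder estimates loosely in line with published 20nm die analyses (e.g. Anandtech's Exynos 5433 piece), so treat the output as illustrative only:

```python
# Rough area comparison of the two CPU layouts discussed above.
# Cluster sizes are assumed, not measured.

A72_CLUSTER_MM2 = 15.0  # assumed: 4x A72 + 2MB L2
A35_CLUSTER_MM2 = 4.0   # assumed: 4x A35 + 1MB L2 (A35 being smaller than A53)

option_a = 2 * A72_CLUSTER_MM2                    # 2 clusters of 4x A72
option_b = A72_CLUSTER_MM2 + 2 * A35_CLUSTER_MM2  # 1x A72 + 2x A35 clusters

print(f"Option A (8x A72):          {option_a:.1f} mm2")
print(f"Option B (4x A72 + 8x A35): {option_b:.1f} mm2")
```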
Now, thinking of a system that does not operate within the razor-thin TDP constraints mobile chips have to live with, but with other cost-related constraints, one could make further trade-offs: pass on the benefits of the ARMv8 ISA and on some other (numerous, in fact) architectural improvements, but compensate through higher clock speed and the matching power cost. Looking at where AMD's Jaguar cores stand compared to mobile CPU IPs, both in perf per watt and per mm2, saving some square millimetres on the CPU at the expense of power, for the sake of your GPU, could make sense. A72 and Jaguar should be around the same size on the same process, the former being better in pretty much every way. Now, if you cut corners, A17/A7 are going to save you a lot of silicon; that might be enough to justify a bump of a couple hundred MHz (and the watts) to, if not make up for the loss, at least sweeten things up.
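And a toy model of that clock-for-area trade: dynamic power scales roughly with f * V^2, and chasing frequency usually also means raising voltage. The baseline figures here are invented:

```python
# Toy dynamic-power model: P ~ C * V^2 * f. All figures invented.

def dynamic_power(base_power, base_freq, base_volt, freq, volt):
    # Scale a baseline power figure by the frequency and voltage ratios.
    return base_power * (freq / base_freq) * (volt / base_volt) ** 2

# Hypothetical small-core cluster: 1.0 W at 1.4 GHz / 0.9 V.
base = dict(base_power=1.0, base_freq=1.4, base_volt=0.9)

# A couple hundred MHz more, with modest voltage bumps:
print(f"{dynamic_power(**base, freq=1.6, volt=0.95):.2f} W at 1.6 GHz")
print(f"{dynamic_power(**base, freq=1.8, volt=1.00):.2f} W at 1.8 GHz")
```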
Also, going by several of their comments, Nintendo have clearly heard the developer complaints about the Wuu CPU. Even Miyamoto called it out as a bottleneck in Star Fox. I think back to the paltry N64 texture cache and how they subsequently decided to include a whopping 1 Megabyte of the stuff in Flipper. Does anyone know if that is a true cache, btw? Or is it more a scratchpad?
What I understood from lots of conversations on that very topic is that the Wii U CPU cores are far from bad if you look at performance per cycle, per mm2, or per watt. The thing is, three low-power cores at 1.2GHz only get you so far, even if those cores are good at being low-power cores.
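A crude way to see it, ignoring IPC, SIMD width, and memory behaviour entirely and just counting the raw cycle budget:

```python
# Aggregate core-cycles per second (cores * clock). Deliberately ignores
# IPC and everything else; it only shows the raw cycle gap.

cpus = {
    "Wii U (Espresso)": (3, 1.24),  # 3 cores @ ~1.24 GHz
    "PS4 (Jaguar)":     (8, 1.60),  # 8 cores @ 1.6 GHz
    "XB1 (Jaguar)":     (8, 1.75),  # 8 cores @ 1.75 GHz
}

for name, (cores, ghz) in cpus.items():
    print(f"{name}: {cores * ghz:.1f} billion core-cycles/s")
```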
Back to die size: looking at Carrizo, that's 250mm2 for 8 GCN CUs, and that doesn't include any embedded memory. It has a nice TDP, but is it binned, or does it carry a laptop premium cost? If not, Carrizo, even at 28nm, would be a nice choice for them if they could get around the memory limitations. I mean a similar SoC w/ ARM CPU cores, though.
IMHO, sales tell the whole story about the overall chip's merits, CPU and GPU alike. It is harsh, and I wish AMD were doing better, but wishful thinking only gets you so far.
I don't think Nintendo would ever allow memory to be a bottleneck. Even the Wii U has 30 GB/s eDRAM, which is more than the 25.6 GB/s standard that AMD uses for similarly specced consumer GPUs. If we scratch any embedded memory or HBM, the most they could probably get for a decent cost is ~50 GB/s of DDR4 (128-bit bus). There are these 12 Gb LPDDR4 chips from Samsung that are faster; they would need 8 chips, however, in order to get the 72 GB/s which AMD seem to find appropriate for their ~1 TFLOP cards.
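The arithmetic behind those bandwidth figures, for what it's worth; the DDR4 speed grade and the per-chip LPDDR4 channel width are my assumptions:

```python
# Peak bandwidth = transfers/s * bus width in bytes.

def bandwidth_gbs(mt_per_s, bus_bits):
    return mt_per_s * (bus_bits / 8) / 1000

# 128-bit DDR4-3200 bus -> the "~50 GB/s" figure:
print(f"DDR4-3200, 128-bit:     {bandwidth_gbs(3200, 128):.1f} GB/s")

# Samsung's 12Gb LPDDR4 runs at 4266 MT/s per pin; assuming each chip
# sits on a 16-bit channel, eight chips make a 128-bit aggregate bus:
print(f"LPDDR4-4266, 8x 16-bit: {bandwidth_gbs(4266, 128):.1f} GB/s")
# ~68 GB/s, i.e. a bit under the 72 GB/s cited above; a faster bin or
# a wider per-chip interface would close the gap.
```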
A heavenly bottleneck, that is what they should aim for.

Now my hopes are low, if not nil. My bet is that if AMD is inside, the hardware won't cut it in any performance bracket: either not performant enough or too costly, etc.