Even if XB1 can achieve more by shifting work towards data access rather than raw computation, and utilise a higher overall BW, devs aren't going to bother making XB1-specific versions of titles. And ultimately the premise is IMO flawed because everything's moving towards computing solutions rather than fetching them. Tiled resources mean less requirement for storage and BW (rough numbers below), and produce better results. Compute-based shaders. Compute-based rendering. In terms of pixels pushed, XB1 is extremely unlikely to reach parity save in games where business concerns cap everything to the lowest common denominator, just as it's always been on consoles. The only gen that was really interesting in this regard was last gen, where the two machines were diverse in their implementation of similar power, and ultimately XB360 won out due to the better GPU. GPU == graphics, basically. But this gen the power envelope is very one-sided, more so than PS2 versus XB, where PS2 at least had massive advantages in a few areas.
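To put rough numbers on the tiled resources point, here's a toy residency calculation. All sizes are assumed for illustration: a 16K×16K texture at ~1 byte/texel after block compression, and the 64 KB tile size D3D11.2 tiled resources use:

```python
# Toy illustration of why tiled resources cut memory/BW needs:
# only the tiles actually sampled need to be resident, not the whole texture.
# Texture size, texel cost, and resident-tile count are all assumptions.

TILE_BYTES = 64 * 1024                   # D3D tiled-resource tile size
full_texture_bytes = 16384 * 16384 * 1   # ~1 byte/texel after BC7-style compression (assumed)
resident_tiles = 900                     # tiles the camera actually touches this frame (assumed)

print(f"full residency : {full_texture_bytes / 2**20:.0f} MiB")
print(f"tiled residency: {resident_tiles * TILE_BYTES / 2**20:.0f} MiB")
```

On those made-up numbers, residency drops from 256 MiB to ~56 MiB, and the BW spent streaming the texture shrinks with it, since you only move the tiles you touch.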
Agreed.
Though the actual assumption I'm relying on is that the CUs are not saturated enough: they sit waiting because the data isn't there for them to process. Ideally this would change as async-shader-based games continue to evolve (at least I'm hoping).
I'm not sure how accurate that statement is; it's likely a half-truth, or just wishful thinking. But I'm biased towards the idea that if CU saturation is still low, with the bottleneck being available bandwidth (moving completed work out of the CUs to memory, or the opposite, moving fresh work to the CUs), then saturation has room to grow as far as bandwidth allows, at least from the CU side of things. The idea is: what good are 100 CUs if you only have 32 GB/s of bandwidth? 12 CUs with 192 GB/s should be able to outperform that (sketch below).
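A minimal roofline-style sketch of that comparison. Everything here is an assumption for illustration: the per-CU rate (~102.4 GFLOPS, i.e. 64 lanes × 2 FLOPs/clock × 0.8 GHz, roughly GCN-class) and the arithmetic intensity of 4 FLOPs per byte; the CU/bandwidth pairs are the hypothetical ones from the paragraph above:

```python
# Roofline bound: attainable throughput = min(compute roof, bandwidth roof).
# All numbers are illustrative assumptions, not measured figures.

GFLOPS_PER_CU = 102.4  # assumed: 64 lanes * 2 FLOPs/clock * 0.8 GHz (GCN-class)

def attainable_gflops(num_cus, bw_gbs, flops_per_byte):
    compute_roof = num_cus * GFLOPS_PER_CU    # what the ALUs could do
    bandwidth_roof = bw_gbs * flops_per_byte  # what memory can feed them
    return min(compute_roof, bandwidth_roof)

# Assume a bandwidth-hungry workload at ~4 FLOPs per byte moved:
for cus, bw in [(100, 32), (12, 192)]:
    print(f"{cus:>3} CUs @ {bw:>3} GB/s -> ~{attainable_gflops(cus, bw, 4.0):.0f} GFLOPS attainable")
```

On that toy model the 100-CU/32 GB/s machine is capped at ~128 GFLOPS while the 12-CU/192 GB/s one reaches ~768: the smaller GPU wins whenever the workload is bandwidth-bound, which is exactly the saturation argument.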
If that's true, both consoles have room to grow, but I'm pretty sure they'd approach the same solution differently: PS4's solution should be roughly parallel to a PC method, while Xbox is on its own tangent.
Clearly both MS and Sony have very talented engineers who paired their hardware with appropriate bandwidth; suggesting otherwise would be a mistake. But I'm curious to see whether a maxed-out 192 GB/s on eSRAM + a full 60 GB/s on DDR3 (it's ugly, I know, because the 60 GB/s also has to feed the eSRAM, etc.) would result in enough efficiency to produce more work than a system with less bandwidth. A crude model of that split pool is sketched below.
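Here's that crude model as code. The `overlap` knob is entirely my assumption: it stands for the fraction of DDR3 traffic consumed refilling the eSRAM rather than feeding the GPU directly, which is the ugliness mentioned above:

```python
# Naive additive model of the split memory pools described above.
# Peak figures (192 GB/s eSRAM, 60 GB/s DDR3) are the post's hypotheticals;
# 'overlap' is an assumed knob for DDR3 traffic spent refilling eSRAM.

def effective_bw(esram_gbs=192.0, ddr3_gbs=60.0, overlap=0.5):
    # eSRAM peak, plus whatever DDR3 bandwidth isn't eaten by eSRAM fills
    return esram_gbs + ddr3_gbs * (1.0 - overlap)

for ov in (0.0, 0.5, 1.0):
    print(f"overlap = {ov:.0%} -> ~{effective_bw(overlap=ov):.0f} GB/s effective")
```

Even in the best case this naive sum tops out at ~252 GB/s and degrades towards 192 GB/s as more DDR3 traffic goes to refilling the eSRAM; whether real workloads sustain anything close to either figure is exactly the open question.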
Lastly, to address whether a company would invest the effort to explore this paradigm shift of working with embedded RAM on SoC APUs, my answer is: if embedded RAM is the future going forward, then it's not a terrible idea to get started on that type of R&D.