Do you have a link to that particular benchmark? I'd be interested to see what they were doing that was using so little bandwidth.
In actual games the X1X can be putting out anywhere from 40% to 100% more pixels [edit: in a given period of time], so I'm inclined to think that in most cases, in the real world, bandwidth is the real limiter.
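To put rough numbers on why pixel throughput translates so directly into bandwidth demand, here's a back-of-envelope sketch (my own illustrative figures, not from any benchmark). It only counts one colour + depth write per pixel and ignores overdraw, blending, texturing and compression, so real traffic is far higher, but the scaling with resolution is the point.

[code]
# Back-of-envelope only: raw frame-buffer write traffic at two illustrative
# resolutions. 60 fps and 8 bytes per pixel (32-bit colour + 32-bit depth) are
# assumptions; overdraw, blending, texturing and compression are ignored.

def framebuffer_gbps(width, height, fps=60, bytes_per_pixel=8):
    """GB/s needed just to touch every pixel once per frame."""
    return width * height * fps * bytes_per_pixel / 1e9

print(framebuffer_gbps(2560, 1440))  # ~1.8 GB/s
print(framebuffer_gbps(3840, 2160))  # ~4.0 GB/s: 2.25x the pixels, 2.25x the traffic
[/code]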
It may not be specific to the PS4, but some time ago there was discussion about getting improved performance for GPU particles by sizing the tiles to match the footprint of the ROP caches. The general workflow assumes the ROP caches are continuously servicing misses to memory, but in workloads that permit it, staying within those caches exploits their wider internal data paths while significantly reducing DRAM bandwidth consumption (rough sketch below).
Doubled ROPs could then scale performance in that subset of the work without needing as much memory bandwidth.
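For what it's worth, here's a very rough CPU-side sketch of the idea as I understand it, with made-up tile/cache sizes and placeholder scissor/draw hooks rather than anything from a real SDK: bin the particle sprites into screen-space tiles small enough that a tile's colour buffer fits in a ROP cache, then draw tile by tile so the blend read-modify-writes for each tile keep hitting the same cache lines instead of going out to DRAM.

[code]
# Illustrative sketch only: bin particles into screen-space tiles sized so each
# tile's colour footprint fits in a ROP cache, then render tile by tile so the
# blending stays on-chip. Tile/cache sizes and the API hooks are hypothetical.

TILE_W, TILE_H = 64, 32            # assumed: 64 x 32 px * 8 B/px (RGBA16F) = 16 KB per tile

def bin_particles(particles, screen_w, screen_h):
    """Assign each particle (with an inclusive screen-space AABB) to every tile it overlaps."""
    tiles_x = (screen_w + TILE_W - 1) // TILE_W
    tiles_y = (screen_h + TILE_H - 1) // TILE_H
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for p in particles:
        x0, y0, x1, y1 = p["bbox"]
        for ty in range(max(0, y0 // TILE_H), min(tiles_y, y1 // TILE_H + 1)):
            for tx in range(max(0, x0 // TILE_W), min(tiles_x, x1 // TILE_W + 1)):
                bins[(tx, ty)].append(p)
    return bins

def draw_tiled(particles, screen_w, screen_h, set_scissor, draw_batch):
    """Render one tile at a time; set_scissor/draw_batch stand in for real API calls."""
    for (tx, ty), batch in bin_particles(particles, screen_w, screen_h).items():
        if not batch:
            continue
        set_scissor(tx * TILE_W, ty * TILE_H, TILE_W, TILE_H)
        draw_batch(batch)          # blend read-modify-writes for this tile reuse the same cache lines
[/code]

The trade-off is extra binning work and draw calls, so it only pays off where overdraw is heavy enough (dense particle systems) that the saved DRAM traffic outweighs the overhead.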
Guys, your recent discussion on memory contention led me to do some googling to try and better understand what it was. While doing so I came across the following: http://pages.cs.wisc.edu/~basu/isca_iommu_tutorial/IOMMU_TUTORIAL_ASPLOS_2016.pdf
I think the bits that pertain to the next-gen consoles begin on page 122 and end at page 172. It's mainly graphs and large text, so it shouldn't make for a heavy read. Could some of the more technical members have a quick look through and see if this could have been designed as part of the two systems set to launch later this year? Thanks in advance.
AMD's had an IOMMU of some form going back at least as far as Trinity. There's an IOMMU in the current consoles, and Kaveri fell just short of being a full HSA device; hUMA, for example, was the marketing point that PS4 fans latched onto.
It's been present for years in standard hardware, so the next gen should be expected to continue to have it.
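If anyone wants to see that on an ordinary PC rather than take it on faith, a quick check from Linux is enough; this little sketch just lists /sys/class/iommu and greps the kernel log (AMD's IOMMU driver reports itself as "AMD-Vi"), nothing console-specific.

[code]
# Quick illustrative check for an IOMMU on a Linux desktop.
# Registered IOMMUs appear under /sys/class/iommu, and AMD's driver logs as "AMD-Vi".
import os, subprocess

iommus = os.listdir("/sys/class/iommu") if os.path.isdir("/sys/class/iommu") else []
print("IOMMUs registered:", iommus or "none")

try:
    log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    hits = [line for line in log.splitlines() if "AMD-Vi" in line]
    print("\n".join(hits) if hits else "no AMD-Vi lines (Intel box, disabled in BIOS, or dmesg needs root)")
except OSError:
    print("dmesg not available")
[/code]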