The A8X uses 128bit LPDDR3 (25GB/s) for an 8-cluster GPU capable of ~230GFLOPs at the estimated 450MHz.
If you assume that Apple/Intrinsity usually produce very balanced designs, you have 25GB/s for a 3-core CPU and a 230GFLOPs GPU.
An 8CU (512sp) GPU at ~400MHz would do ~410GFLOPs.
...at 450MHz it's 230.4 GFLOPs FP32 or 460.8 GFLOPs FP16, so pick your poison. Being a deferred renderer (DR), it can also save more bandwidth under certain conditions. That doesn't mean I believe that MTK has integrated anything with 8 clusters/CUs/whatever... I'll believe it when I see it. Anything with 4 clusters would already be "high end" by their "standards".
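For anyone who wants to check the throughput figures in this thread, here is a rough sketch of the usual peak-FLOPs arithmetic. It assumes one FMA per lane per clock counts as 2 FLOPs. The lane counts for the A8X GPU (256 FP32, 512 FP16) are not stated anywhere above; they are simply back-calculated from the quoted 230.4/460.8 GFLOPs at 450MHz, and `peak_gflops` is just an illustrative helper name.

```python
def peak_gflops(lanes: int, clock_mhz: float, flops_per_lane_per_clock: int = 2) -> float:
    """Theoretical peak in GFLOPs; an FMA is counted as 2 FLOPs (assumption)."""
    return lanes * flops_per_lane_per_clock * clock_mhz / 1000.0


# A8X GPU at the estimated 450MHz.
# Lane counts are inferred from the quoted figures, not taken from a spec sheet.
a8x_fp32 = peak_gflops(lanes=256, clock_mhz=450)
a8x_fp16 = peak_gflops(lanes=512, clock_mhz=450)

# Hypothetical 8CU (512sp) GPU at ~400MHz.
cu8_fp32 = peak_gflops(lanes=512, clock_mhz=400)

print(f"A8X FP32: {a8x_fp32:.1f} GFLOPs")
print(f"A8X FP16: {a8x_fp16:.1f} GFLOPs")
print(f"8CU FP32: {cu8_fp32:.1f} GFLOPs")
```

These are paper peaks only; sustained throughput depends on the workload and on how much bandwidth the rest of the SoC leaves over.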
If the Helio X20 uses a 128bit bus with LPDDR4, we'd have twice the memory bandwidth for an iGPU with almost twice the FP32 throughput. It doesn't sound unreasonable.
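A quick sanity check of that ratio, as a sketch only. The transfer rates below (1600 MT/s for LPDDR3, 3200 MT/s for LPDDR4) are my assumptions, picked because they line up with the ~25GB/s figure and the "twice the bandwidth" claim; the GFLOPs values are the rounded ones quoted in this thread, and `peak_bandwidth_gbs` is a made-up helper name.

```python
def peak_bandwidth_gbs(bus_bits: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfer rate."""
    return bus_bits / 8 * mega_transfers_per_s / 1000.0


# Assumed data rates; neither is stated in the thread.
lpddr3_gbs = peak_bandwidth_gbs(bus_bits=128, mega_transfers_per_s=1600)
lpddr4_gbs = peak_bandwidth_gbs(bus_bits=128, mega_transfers_per_s=3200)

# Bytes of bandwidth available per FP32 FLOP, using the quoted ~230 and ~410 GFLOPs.
a8x_bytes_per_flop = lpddr3_gbs / 230.0
cu8_bytes_per_flop = lpddr4_gbs / 410.0

print(f"128bit LPDDR3: {lpddr3_gbs:.1f} GB/s, {a8x_bytes_per_flop:.3f} B/FLOP")
print(f"128bit LPDDR4: {lpddr4_gbs:.1f} GB/s, {cu8_bytes_per_flop:.3f} B/FLOP")
```

Under those assumptions the bytes-per-FLOP ratio of the hypothetical part comes out about the same as the A8X's, or slightly higher, which is why the combination doesn't look out of balance on paper.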
There's never anything like "enough" bandwidth, especially in a power-constrained ULP SoC environment where a high number of units are competing for the very same memory bandwidth. It doesn't matter what it'll contain in the end, the bandwidth won't go to waste.