It is with some satisfaction I can get firmly on topic again: A9x die shot
12 cluster GPU, no L3
> There are no additional CPU resources, and the two extra GPU clusters can't account for the difference by themselves.

You mean 6 extra clusters. The 3 cluster pairs and the doubled memory interface seem to make up for most of the area. Also, a pity about being beaten to the L3 news.
> Does anybody know if the L3 in the A9 is power-gated? I imagine it must be.

During normal run-time? Very doubtful.
> You mean 6 extra clusters. The 3 cluster pairs and the doubled memory interface seem to make up for most of the area. Also, a pity about being beaten to the L3 news.

Sorry you lost your scoop; on the other hand, there is an interesting story hiding there. Why did Apple scrap the L3 on the A9X? The ratio of GPU resources to bandwidth is similar between the A9 and A9X. If the bandwidth to the L3 had stayed the same, the benefit versus going straight to main memory would have been reduced, favouring its removal.
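The ratio claim is easy to sanity-check with arithmetic. A minimal sketch, assuming LPDDR4-3200 on both parts (25.6 GB/s per 64-bit channel); the cluster counts and bus widths are the ones discussed above:

```python
# Back-of-envelope check of the "similar GPU-to-bandwidth ratio" claim.
# Assumes LPDDR4-3200 on both chips: 25.6 GB/s per 64-bit channel.
chips = {
    "A9":  {"gpu_clusters": 6,  "bus_bits": 64,  "peak_gbps": 25.6},
    "A9X": {"gpu_clusters": 12, "bus_bits": 128, "peak_gbps": 51.2},
}

for name, c in chips.items():
    per_cluster = c["peak_gbps"] / c["gpu_clusters"]
    print(f"{name}: {per_cluster:.2f} GB/s per GPU cluster")
# Both work out to ~4.27 GB/s per cluster, so a same-bandwidth L3 would
# have been relatively less useful behind A9X's doubled GPU.
```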
> In terms of power it draws a bit, but the L3 is not particularly fast, and it reduces off-die traffic, which saves a bit of power.

I think power is the sole argument here. Same rhetoric as to why vendors don't bother with PoP memory in tablets and bigger form factors.
> The issue I see with power being the reason for dropping the L3 from the A9X is that the phone SoC A9 has it, even though it has a tighter power envelope. Since the power draw of the L3 has to be relatively modest in the A9, its proportional contribution to the whole in the A9X should be quite a bit lower still with its twice-as-wide GPU. Also, if the memory bus is the same speed but twice as wide, energy savings due to reduced bus traffic should be worth more on the A9X.

Except if it is cheaper to include a slightly larger battery than to increase the die size. Maybe the die is so large that they needed to cut something, and this was easy to cut (easy to compensate for, costs a lot of die area).
Hmm.
> Except if it is cheaper to include a slightly larger battery than to increase the die size. Maybe the die is so large that they needed to cut something, and this was easy to cut (easy to compensate for, costs a lot of die area).

Yes, but again, this is a line of reasoning that is actually stronger for cutting the L3 on the A9 rather than the A9X. The L3 is a larger percentage of the A9 die area; ergo, removing it would represent a larger increase in usable dies per wafer, and that would also be particularly welcome on their higher-volume part!
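The dies-per-wafer point can be put in rough numbers with the standard approximation. The die and L3 areas below are hypothetical round figures chosen only to illustrate the argument, not measured A9/A9X values:

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """Classic dies-per-wafer approximation: wafer area over die area,
    minus an edge-loss term for partial dies around the rim."""
    r = wafer_diameter_mm / 2
    return (math.pi * r**2) / die_area_mm2 \
        - (math.pi * wafer_diameter_mm) / math.sqrt(2 * die_area_mm2)

# Hypothetical areas: ~100 mm^2 phone die, ~145 mm^2 tablet die,
# with the L3 block assumed to cost ~4 mm^2 in either design.
for label, area in [("phone", 100.0), ("phone, no L3", 96.0),
                    ("tablet", 145.0), ("tablet, no L3", 141.0)]:
    print(f"{label:>14}: {dies_per_wafer(area):4.0f} dies/wafer")
# Cutting the same ~4 mm^2 buys ~28 extra dies (+4.4%) on the small die
# but only ~13 (+3.0%) on the big one -- the poster's point about the A9.
```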
> The issue I see with power being the reason for dropping the L3 from the A9X is that the phone SoC A9 has it, even though it has a tighter power envelope.

I think you misunderstood my statement: the A9X dropped it because it has a higher power envelope. I.e., the L3 is there just for power savings and would only be worth it on the smaller devices.
> I think you misunderstood my statement: the A9X dropped it because it has a higher power envelope. I.e., the L3 is there just for power savings and would only be worth it on the smaller devices.

Oh, thanks for the clarification. That makes logical sense, although I can't really see that the hit rate of an L3 (as opposed to 3 MB of L1 or L2) would reduce bus traffic enough to more than offset the power draw of the cache system and in itself justify having the L3. A few percent off the memory bus traffic, counterbalanced by the cache's own power draw, just seems like a very marginal effect in the overall picture.
> That makes logical sense, although I can't really see that the hit rate of an L3 (as opposed to 3 MB of L1 or L2) would reduce bus traffic enough...

You're not thinking about the non-CPU blocks. The GPU, display pipeline, and a lot of other things in there would probably benefit greatly from reduced memory controller and DRAM activity. Main memory is just ridiculously more expensive in terms of power; I can see them min-maxing as much as possible out of the SoC architecture via such a cache. Of course, this is just my theory...
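A rough way to weigh these two positions is a back-of-envelope energy model. All figures here are assumed ballparks, not measured A9/A9X numbers: LPDDR4 access energy is commonly quoted in the tens of pJ/bit, large on-die SRAM in the low single digits, and the hit rate and leakage below are purely hypothetical:

```python
# Rough model of the "L3 as a DRAM-traffic filter" argument.
E_DRAM_PJ_PER_BIT = 25.0   # assumption: LPDDR4 read incl. PHY/controller
E_L3_PJ_PER_BIT   = 2.0    # assumption: large on-die SRAM access

def net_saving_watts(traffic_gbps: float, hit_rate: float,
                     l3_leakage_w: float) -> float:
    """Net power saved by an L3: hits avoid DRAM (but still pay the SRAM
    access), misses pay the SRAM lookup on top, leakage is constant."""
    bits_per_s = traffic_gbps * 8e9
    saved  = bits_per_s * hit_rate * (E_DRAM_PJ_PER_BIT - E_L3_PJ_PER_BIT) * 1e-12
    lookup = bits_per_s * (1 - hit_rate) * E_L3_PJ_PER_BIT * 1e-12
    return saved - lookup - l3_leakage_w

# e.g. 10 GB/s of filterable traffic, 30% hit rate, 100 mW of leakage:
print(f"{net_saving_watts(10.0, 0.30, 0.10):.2f} W net saving")  # ~0.34 W
```

Under these (admittedly hand-picked) numbers the saving is measurable rather than marginal, which is the gist of the "main memory is ridiculously more expensive" reply; at low hit rates or low traffic, the skeptical reply wins instead.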
> And closing on that note, we’ll be back a bit later this month with our full review of the iPad Pro and a deeper look at A9X’s performance, so be sure to stay tuned for that.

In other words, later today?
> In other words, later today?

In my head I'm already in December...
> We don’t know the clockspeed of the GPU – this being somewhat problematic to determine within the iOS sandbox – but based on our earlier performance results it’s likely that A9X’s GPU is only clocked slightly higher than A9’s. I say slightly higher because no GPU gets 100% performance scaling with additional cores, and with our GFXBench Manhattan scores being almost perfectly double that of A9’s, it stands to reason that Apple had to add a bit more to the GPU clockspeed to get there.
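The "slightly higher clock" reasoning is easy to check with arithmetic. A small sketch, assuming performance scales as clock × clusters × scaling efficiency; the efficiency values are hypothetical, since the article gives no exact figure:

```python
# If doubling the cluster count only scales performance by some efficiency
# factor, the clock has to make up the rest to exactly double the score.
def required_clock_ratio(target_speedup: float, core_ratio: float,
                         scaling_efficiency: float) -> float:
    # perf ~ clock * cores * efficiency  (simplifying assumption)
    return target_speedup / (core_ratio * scaling_efficiency)

# Hypothetical scaling efficiencies for a 6 -> 12 cluster jump:
for eff in (1.00, 0.95, 0.90):
    ratio = required_clock_ratio(2.0, 2.0, eff)
    print(f"{eff:.0%} scaling -> clock must rise {ratio - 1:+.1%}")
# At ~95% scaling, a ~5% clock bump doubles the Manhattan score.
```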