Apple A9 SoC

Here's a shot of both dies side by side (A8 vs. A9). Given a package estimated to be 10.5% larger, I've assumed the A9 die is also 10.5% larger and used that as a scaling factor. In the image, the A8 is 505 by 624 pixels, or 315,120 pixels total. The A9 is 595 by 602, or 358,190 pixels total. The A8's pixel count increased by 10.5% is 348,207 pixels, so we only need to scale the A9 down by about 2.9% for all comparisons.

[Image: ezK2Ada.png]

(credit to iMacmatician for partitions on A9)

With that said:
A8 SRAM: 121 x 136 = 16,456 pixels
A9 SRAM: 74 x 108 (x 0.971) = 7,760 pixels (x 2 blocks) = 15,520 pixels

A8 L2: 45 x 73 = 3,285 pixels (x 2 blocks) = 6,570 pixels
A9 L2: 88 x 165 (x 0.971) = 14,098 pixels

So it looks as though we're looking at 4MB of SRAM again on the A9 (with minimal scaling from the process change) and 2MB of L2 cache per CPU core, up from 1MB on the A8 and matching the 2MB per core on the A8X. Of course, there are several potential sources of error, including die cutouts that aren't positioned equally relative to the chip edges, and the possibility that the 10.5% growth isn't representative. There's also selection error in defining the edges of the L2 (less so for the L3 SRAM). I think these quantities are close enough, or far enough apart, to justify the above conclusions, though.
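If anyone wants to check the arithmetic, here's a quick Python sketch of the scaling math above. The pixel measurements are my own hand-taken figures from the die shot, and the 10.5% area growth is an assumption, not a confirmed number:

```python
# Die-shot pixel arithmetic from the A8 vs. A9 comparison above.
# Pixel measurements are hand-taken from the posted image; the 10.5%
# area growth is an assumption based on the estimated package size.

a8_die = 505 * 624            # 315120 px
a9_die = 595 * 602            # 358190 px
a8_scaled = a8_die * 1.105    # A8 grown by 10.5%: 348207.6 px
scale = a8_scaled / a9_die    # ~0.972 (rounded to 0.971 above)

a8_sram = 121 * 136             # 16456 px
a9_sram = 74 * 108 * scale * 2  # two blocks: ~15539 px (post: 15520)

a8_l2 = 45 * 73 * 2             # two blocks: 6570 px
a9_l2 = 88 * 165 * scale        # ~14115 px (post: 14098)

print(round(scale, 3), round(a9_sram), round(a9_l2))
```

The small differences from the figures above come from rounding the scale factor to 0.971.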
 
The benefit Apple depended on most from Samsung's 14nm process, compared to last year's 20nm TSMC process, seems to be its ability to hit higher frequencies at a given voltage. Still, I'm guessing the GT7600 GPU will be clocked somewhere around 533 MHz and will yield a Manhattan 3.0 offscreen result somewhere around 38 fps.
 
Somewhere close to A8X GPU performance is my guess too; that would put the frequency somewhere in the 470-533MHz region.
 
If the architecture upgrade from Series6XT to Series7XT improves performance by around 40%, multiplied by 50% more clusters, plus some enhancements to the GPU's memory access to realize most of those gains, then reaching or even surpassing Apple's 90% improvement target isn't a stretch.

So I could even conceive of Apple hitting their goal with a GPU clock as low as 450 MHz. 470 MHz could make a lot of sense too, especially once we see the exact clock on the CPU side.
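To make the compounding explicit, here's the rough arithmetic behind that 450 MHz floor. The 40% architectural gain, the 50% cluster increase, and a ~500 MHz A8 GPU clock are all my assumptions, not confirmed figures:

```python
# Compounding the assumed GPU gains (Series6XT -> Series7XT).
arch_gain = 1.40       # assumed per-cluster architectural improvement
cluster_gain = 1.50    # 4 clusters -> 6 clusters on the GT7600

same_clock = arch_gain * cluster_gain   # 2.1x, i.e. +110% at equal clocks

# Clock needed to just hit Apple's +90% claim, assuming the A8 GPU
# ran at roughly 500 MHz (my estimate, not a confirmed spec):
a8_gpu_clock = 500
needed = a8_gpu_clock * 1.90 / same_clock   # ~452 MHz

print(f"+{(same_clock - 1) * 100:.0f}% at equal clocks; ~{needed:.0f} MHz to hit +90%")
```

That lands right around 450 MHz; anything beyond that is headroom over the 90% target.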

What pushed me toward the higher end of the clock-speed range for my GPU guess is just the feeling that Apple might bump the clocks across all processors on the SoC if they're going this far, in relative terms, on the CPU. However, that very reason is probably a better argument for why they might hold back on raising the GPU clock, to balance out some of the extra heat from the CPU.

I'll be looking closely at the fill rate numbers when the benchmarks are run to get a grip on all this!
 
Memory bandwidth seems to increase by around 30%, probably via an increased memory clock.
 
If those scores are legit, it's a real pity the full results aren't available. Still, it seems significant gains have been made in both integer and FP, to roughly the same degree. Memory has also increased, with an aggregate score higher than the iPhone 6's and even the iPad Air 2's. LPDDR4?

This completeness across the board is a bit suspect, but the variation between individual test scores makes this a really elaborate ruse if it is one.
 
The memory aggregate score is better than my iPad Air 2's, and the STREAM Copy bandwidth is just as high; Scale and Add are higher. I'm not sure a clock hike on DDR3 would suffice.
 
I don't know if Geekbench scores are directly comparable between OSes, but if they are, then Intel should be worried. Both are fanless designs, and thus sub-5W, but logic dictates that the phone has the lower TDP and smaller thermal dissipation area, which makes the figures even more shocking. Oh yes, and the Core M-5Y31 has an RRP of $281, if you're buying by the tray!

iPhone 6s - A9 SoC
Single Core: 2293 / Multi Core 4293

12" MacBook (2015) - Core M-5Y31
Single Core: 2358 / Multi Core 4604
 
While this is a very interesting discussion, it is also a toxic one on the internet. At the very least, I'd suggest we stay away from it until we have the full benchmark data. Also, as you point out, this is a cell phone SoC. The A9X makes for a more valid comparison.

In this case, comparing across OSes is probably not as big a problem as comparing across architectures.
 
Another question mark raised by these screenshots: if the L2 has grown to 3MB (?), could that mean the die shot is actually showing 8MB of L3?
 

The A8 seems to be using 200MHz LPDDR3 (according to AnandTech it has 12.8GB/s of bandwidth). In Geekbench's STREAM Copy test, an iPhone 6 typically gives ~10GB/s; the new A9 result shows ~13GB/s. If the memory changed to LPDDR4, the bandwidth should be doubled, so 30% would be too low even considering some inefficiency caused by the longer prefetch length. On the other hand, if the A9 uses 266MHz LPDDR3, a 30-40% increase seems quite reasonable.

On the other hand, the iPad Air 2 has double the memory bandwidth, but its Geekbench STREAM Copy test only gives ~15GB/s. So it's entirely possible that the A9 uses 200MHz LPDDR4 (with double the theoretical memory bandwidth) but, like the iPad Air 2, is only able to deliver ~13GB/s in the test. That's a real possibility, especially considering the more powerful GPU the A9 has.
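As a sanity check on the theoretical numbers, here's the prefetch arithmetic, assuming a 64-bit memory interface (which is what AnandTech reports for the A8) and the standard 8n/16n prefetch lengths for LPDDR3/LPDDR4:

```python
# Peak theoretical bandwidth for the configurations discussed above.
BUS_BYTES = 8  # 64-bit interface, assumed for both chips

def bandwidth_gb_s(core_mhz, prefetch):
    """Core clock x prefetch length = transfer rate; x bus width = bandwidth."""
    return core_mhz * 1e6 * prefetch * BUS_BYTES / 1e9

lpddr3_200 = bandwidth_gb_s(200, 8)    # 12.8 GB/s (matches the A8 figure)
lpddr3_266 = bandwidth_gb_s(266, 8)    # ~17.0 GB/s
lpddr4_200 = bandwidth_gb_s(200, 16)   # 25.6 GB/s: doubled via 16n prefetch

print(lpddr3_200, lpddr3_266, lpddr4_200)
```

The 266MHz LPDDR3 case gives 17.0/12.8, roughly 33% more bandwidth, right in the 30-40% range observed.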
 
The A9X is going to make a mess of any Intel chip around 5-10W TDP, especially on graphics. I'm pretty shocked by the results posted for the 6s today; one of her runs got up to 4800 for multi-core!? On "2 cores"? Is this SMT we're looking at? :\
 
Best to keep in mind that (LP)DDR4 only doubles your memory bandwidth if you go straight to the highest clock speed the standard is meant to support. No one is shipping LPDDR4-3200 AFAIK. It will be just like LPDDR3: clock speeds will start out lower and ramp up over the years.

Even on the PC desktop, the most common speed grades are 2133 and 2400, well short of doubling DDR3-1600.
 
Both Qualcomm and Samsung are using LPDDR4-3200 and have been for half a year now; it's safe to assume Apple will at least deliver the same. LPDDR4-4266 (the 2133MHz speed) is the next upgrade coming up. We're already at double the memory bandwidth off the bat.

In any case, people need to remember that the memory bandwidth available to the SoC doesn't mean the CPU cores themselves have access to all of it, or can saturate the full bandwidth.
 
Oh, and by the way, here's a story on a woman in San Francisco who received her iPhone 6s early, proceeded to photograph the hell out of it, and also benchmarked it:

http://9to5mac.com/2015/09/21/iphone-6s-arrives-early/#more-399360

So, almost exactly 50% higher Geekbench scores than the A8.

If we assume LPDDR4-3200 is used, the 6s has twice the bandwidth of the A8 in the iPhone 6. That alone would yield (2^2)^(1/10) = 14.9% higher Geekbench scores (memory synthetics are weighted at 20% in the geometric mean).

That means the remaining benchmarks account for a 1.5/1.149 = 30.5% improvement. That would require an A8 clocked at 1.4GHz x (1.305^10)^(1/8) = 1.95GHz.

Since it is only clocked at 1.8GHz, we can infer an IPC improvement of 1.95GHz/1.8GHz = 8.5% (assuming limited sensitivity to raw bandwidth).
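Spelling out that weighted-geometric-mean arithmetic in full (the bandwidth doubling and the 20% memory weight are the stated assumptions):

```python
# Backing out the implied IPC gain from the overall Geekbench result.
overall = 1.50       # observed overall gain over the A8
mem_gain = 2.0       # assumed bandwidth doubling (LPDDR4-3200)
mem_weight = 0.2     # memory synthetics' weight in the geometric mean

mem_contrib = mem_gain ** mem_weight                # 2^0.2 ~ 1.149 (+14.9%)
rest_contrib = overall / mem_contrib                # ~1.305 for the other 80%
rest_gain = rest_contrib ** (1 / (1 - mem_weight))  # per-benchmark: ~1.396

clock_equiv = 1.4 * rest_gain    # A8 clock that would match: ~1.95 GHz
ipc_gain = clock_equiv / 1.8     # vs. the actual 1.8 GHz: ~8.5%

print(f"{clock_equiv:.2f} GHz equivalent, {(ipc_gain - 1) * 100:.1f}% IPC gain")
```

Note the (1.305^10)^(1/8) step above is the same operation as raising to 1/(1 - 0.2): it undoes the 80% weight to recover the per-benchmark gain.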

That's pretty close to what was predicted here.

Cheers
 