Apple A8 and A8X

GfxBench comparison with 5s here
"Up to +50% faster" was technically correct. I have the impression that Apple listened to me and agreed that current GPU performance is mostly good enough for all but the most demanding case. :D But they're leaving that flank open to attack from others.
 
GfxBench comparison with 5s here

12% increase in the OS ALU score.

If that test is valid and reasonably accurate, surely this indicates the ALU count hasn't increased, and we're dealing with a 4-cluster unit, or a much lower-clocked 6-cluster unit?

33% increase in OS fill rate
 
12% increase in the OS ALU score.

If that test is valid and reasonably accurate, surely this indicates the ALU count hasn't increased, and we're dealing with a 4-cluster unit, or a much lower-clocked 6-cluster unit?

33% increase in OS fill rate
Manhattan Offscreen increased 38% and T-Rex Offscreen increased 63%, which is very high compared to the ALU speed increase. If the only thing that changed was clock speed and not ALU count, shouldn't the improvements in scores for the various tests be more in sync? If the ALU test measures FP32 while Manhattan and T-Rex use a mix of FP16 and FP32, with T-Rex being more FP16-heavy, then one possibility is that they moved from a G6430-based design to a GX6450-based one with a small clock speed bump.
 
Manhattan Offscreen increased 38% and T-Rex Offscreen increased 63%, which is very high compared to the ALU speed increase. If the only thing that changed was clock speed and not ALU count, shouldn't the improvements in scores for the various tests be more in sync? If the ALU test measures FP32 while Manhattan and T-Rex use a mix of FP16 and FP32, with T-Rex being more FP16-heavy, then one possibility is that they moved from a G6430-based design to a GX6450-based one with a small clock speed bump.

According to this:
http://blog.imgtec.com/powervr/the-...ogue-gpus-specifications-features-api-support

The GX6450 has 33% more FP16 throughput than the G6430, which would be somewhat consistent with your narrative.
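One rough way to sanity-check this FP16/FP32 line of reasoning is to model each benchmark as a time-weighted mix of FP16 and FP32 work. The workload fractions below are illustrative guesses (not measured values), and the throughput gains just combine the ~12% ALU score uplift with the blog's +33% FP16 figure:

```python
# Sketch: how a higher FP16 throughput (GX6450 vs G6430, per the Imagination
# blog linked above) can lift FP16-heavy tests more than a pure-FP32 ALU test.
# The per-benchmark FP16 fractions are illustrative assumptions.

def speedup(fp16_fraction, fp16_gain, fp32_gain):
    """Overall speedup for a workload split between FP16 and FP32 work.

    Each precision's share of the runtime shrinks by its own throughput
    gain (1.0 = no change); the ratio of old to new runtime is the speedup.
    """
    old_time = 1.0
    new_time = fp16_fraction / fp16_gain + (1 - fp16_fraction) / fp32_gain
    return old_time / new_time

clock_bump = 1.12               # matches the ~12% FP32 ALU score increase
fp16_gain = 1.33 * clock_bump   # +33% FP16 ALUs on top of the clock bump

for name, frac in [("ALU test (pure FP32)", 0.0),
                   ("Manhattan (some FP16)", 0.5),
                   ("T-Rex (FP16-heavy)", 0.8)]:
    print(f"{name}: {speedup(frac, fp16_gain, clock_bump):.2f}x")
```

The qualitative ordering (T-Rex > Manhattan > ALU test) comes out as observed, even if the exact magnitudes depend heavily on the assumed mix fractions.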
 
The range of GFXBench 3.0 fill rates for four-cluster Rogues, from the iPad Air's G6430 to some higher-clocked G64x0s in Merrifield/Moorefield devices, runs from about 2984 to 3772, so the iPhone 6 with its score just above 3700 is definitely still an 8-TMU/4-cluster design. The uneven performance improvements in the other sub-tests don't suggest much in the way of direct scaling, so they probably owe more to architectural improvements than to frequency scaling.

So, the obvious core that fits the description among PowerVR's announced range would be the GX6450, at a clock speed I'd guess around 467 MHz.
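The ~467 MHz guess can be roughly reverse-engineered from the fill score, assuming the GFXBench 3.0 fill number is in MTexels/s and each TMU filters one bilinear texel per clock (both assumptions; real TMU throughput varies with texture format and filtering mode):

```python
# Sketch: backing a GPU clock estimate out of a GFXBench 3.0 fill score,
# assuming the score is in MTexels/s and 1 texel per TMU per clock.

def implied_clock_mhz(fill_score_mtexels, tmus):
    return fill_score_mtexels / tmus

iphone6_fill = 3700  # "just above 3700" per the post above

print(f" 8 TMUs: ~{implied_clock_mhz(iphone6_fill, 8):.0f} MHz")
print(f"12 TMUs: ~{implied_clock_mhz(iphone6_fill, 12):.0f} MHz")
```

At 8 TMUs this lands near 460 MHz, close to the 467 MHz guess; a 12-TMU/6-cluster part would imply only ~310 MHz, which doesn't square with the modest 12% ALU score increase.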
 
So if the CPU only received minor changes and is still dual-core, and the GPU is still 4-cluster, that makes the question of where all the extra transistors went even more pressing.

And with the performance-oriented changes being so small, and the A8's focus on sustained performance in the iPhone taking away the differentiating feature the iPads had with the A7, I wonder if there will actually be an A8X this year with the GX6650 and a bigger memory bus for the iPads. If there really is a larger, higher-resolution iPad Pro, the Pro and Air could use the A8X and the Mini the A8, possibly with a minor clock speed bump.
 
So if the CPU only received minor changes and is still dual-core, and the GPU is still 4-cluster, that makes the question of where all the extra transistors went even more pressing.

I won't be surprised at all if the eventual die shots show a huge SRAM pool, much larger than the 4MB one in A7.
 
According to this:
http://blog.imgtec.com/powervr/the-...ogue-gpus-specifications-features-api-support

The GX6450 has 33% more FP16 throughput than the G6430, which would be somewhat consistent with your narrative.

I'd be VERY surprised if it isn't a GX6450 after all with that kind of fillrates:

http://gfxbench.com/device.jsp?benchmark=gfx30&os=iOS&api=gl&D=Apple iPhone 6

Overall the score is fine for a smartphone; for an upcoming tablet, however, I'd still insist it would be extremely boring.
 
The results that first appeared in GFXBench's database this morning for iPhone 6 were actually for the Plus, apparently. So, something like 18.8 fps in Manhattan offscreen at first upload versus the 6's 17.8 and a bigger gap in fill (3700+ vs 3400+).

Yeah, I'm expecting the SRAM pool(s) to be expanded by quite a bit. Should make for highly sustainable performance in demanding apps if devs code for it.
 
So if the CPU only received minor changes and is still dual-core, and the GPU is still 4-cluster, that makes the question of where all the extra transistors went even more pressing.

And with the performance-oriented changes being so small, and the A8's focus on sustained performance in the iPhone taking away the differentiating feature the iPads had with the A7, I wonder if there will actually be an A8X this year with the GX6650 and a bigger memory bus for the iPads. If there really is a larger, higher-resolution iPad Pro, the Pro and Air could use the A8X and the Mini the A8, possibly with a minor clock speed bump.
Rumors so far seem to point to significantly thinner iPads this year so I have doubts that the iPad Air will use an A8X. New features for the iPads can include more RAM and split-screen multitasking.

For the rumored iPad "Pro" ("Air+"?), I can definitely see an A8X with a GX6650 for a higher-resolution display. Sure, the iPhone 6 and 6+ have the same chip (same with the iPhone 5S and iPad Air, etc.), but it's possible that Apple has decided that the iPad Pro is the cutoff between the A8 and the A8X.

As for what the extra transistors are for, I'm in the same boat as Exophase.
 
Rumors so far seem to point to significantly thinner iPads this year so I have doubts that the iPad Air will use an A8X. New features for the iPads can include more RAM and split-screen multitasking.

For the rumored iPad "Pro" ("Air+"?), I can definitely see an A8X with a GX6650 for a higher-resolution display. Sure, the iPhone 6 and 6+ have the same chip (same with the iPhone 5S and iPad Air, etc.), but it's possible that Apple has decided that the iPad Pro is the cutoff between the A8 and the A8X.

As for what the extra transistors are for, I'm in the same boat as Exophase.
My thinking was that if there is an A8X, it'd need to be shared between the iPad Air and Pro, because the Pro itself probably wouldn't have the volumes to justify bringing up a separate chip, especially a bigger, more complicated one.

I originally proposed a 16 MB L3 cache when the A8 transistor count was first announced, which seemed high at the time, but if the GPU is only 4-cluster it may actually be possible, although 8-12 MB is probably more realistic. I wonder how useful it would be if the L3 cache could be controlled by developers and used like the EDRAM/ESRAM in consoles? With Metal, Apple has a means to provide direct access if the hardware supports it.
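For a rough sense of scale, the transistor budget of such a cache can be sketched with standard 6T SRAM cells (ignoring tags, ECC, and peripheral logic, so these are lower bounds):

```python
# Sketch: rough transistor cost of a hypothetical enlarged L3/system cache,
# assuming conventional 6T SRAM cells and ignoring tag arrays and periphery.

def sram_transistors(megabytes, transistors_per_bit=6):
    bits = megabytes * 1024 * 1024 * 8
    return bits * transistors_per_bit

for mb in (4, 8, 12, 16):
    print(f"{mb:2d} MB ~ {sram_transistors(mb) / 1e9:.2f}B transistors")
```

A 16 MB pool alone would be on the order of 0.8 billion transistors, a sizeable slice of the A8's reported ~2 billion, which is why 8-12 MB feels more plausible.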
 
http://www.chipworks.com/en/technic...s/blog/inside-the-iphone-6-and-iphone-6-plus/

Chipworks is now analyzing the chips. No big revelations so far since testing is ongoing. They're pretty sure it's 20 nm TSMC and the die size Apple reported is accurate.

They have a (slightly) clearer shot of the die now. In terms of die proportions versus the A7, it looks like CPU is smaller, SRAM is the same, and GPU is larger. (Edit: scratch that, GPU is also about the same)

I haven't looked at the GFXBench numbers, but have we ruled out Apple using a 6-cluster design with lower clocks for power efficiency? Or is Series 6XT (plus Apple's tweaks, if any) simply that massive?
 
Depends how Apple lays it out, although they've been going with dense designs lately.

Fill rate doesn't suggest a 12 TMU/6 cluster part with reduced clocks.
 
The GPU block in the A8 looks like 4.5x4 mm or 18 mm2, the SRAM is 2.2x2.0 mm or 4.4 mm2, and the CPU (I'm including what look like two separate L2 caches on the left side) is 3.6x3.2 mm or 11.5 mm2.

I don't think Chipworks ever gave the dimensions of the A7, but based on the 102 mm2 area and the proportions of the chip I'm using 10.5x9.7 mm for my calculations. That would make the GPU 4.3x6 mm or 25.8 mm2, SRAM 2.8x2.6 mm or 7.3 mm2, and the CPU 5.3x3.7 mm or 19.6 mm2.

So relative to the A7, the A8 GPU is 70% the size, the SRAM is 60%, and the CPU is 59% the size. Someone else can work out the transistor count changes, but it looks like the SRAM and CPU were pretty much straight shrinks, and the GPU grew a bit, probably due to the change from G6430 to GX6450. So the question is still open where the new transistors went. Previously the CPU, GPU, and SRAM were 52% of the die in the A7, but now they are 38% of the A8.
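The block-area arithmetic can be re-run in a few lines; the A7 dimensions are the estimates from this post (scaled from its 102 mm2 area), and the A8 total uses Apple's reported ~89 mm2 die:

```python
# Re-running the die-area arithmetic: block areas from measured dimensions,
# ratios vs the A7 estimates, and each chip's CPU+GPU+SRAM share of the die.
# A7 dimensions are estimates scaled from its 102 mm^2 area; A8 die area is
# Apple's reported ~89 mm^2 figure.

a8_blocks = {"GPU": 4.5 * 4.0, "SRAM": 2.2 * 2.0, "CPU": 3.6 * 3.2}
a7_blocks = {"GPU": 4.3 * 6.0, "SRAM": 2.8 * 2.6, "CPU": 5.3 * 3.7}
a8_die, a7_die = 89.0, 102.0

for name in a8_blocks:
    print(f"{name}: {a8_blocks[name]:.1f} mm^2, "
          f"{a8_blocks[name] / a7_blocks[name]:.0%} of A7 size")

print(f"Blocks as share of die: A7 {sum(a7_blocks.values()) / a7_die:.0%}, "
      f"A8 {sum(a8_blocks.values()) / a8_die:.0%}")
```

This reproduces the 70%/60%/59% size ratios and the 52% vs 38% die shares quoted above.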

EDIT: Is the secure element on the CPU? Could Apple be setting aside a larger portion of the die for the secure element/enclave for fingerprint data, credit cards, and possibly anticipating other types of secure data like health?

Since there isn't a dramatic increase in L3 cache I guess I'm back to suggesting the possibility of data compression in the memory and possibly flash with a hardware accelerator in the A8.
 
My thinking was that if there is an A8X, it'd need to be shared between the iPad Air and Pro, because the Pro itself probably wouldn't have the volumes to justify bringing up a separate chip, especially a bigger, more complicated one.
Yeah that's a good point, I actually hadn't considered that.

EDIT: Is the secure element on the CPU? Could Apple be setting aside a larger portion of the die for the secure element/enclave for fingerprint data, credit cards, and possibly anticipating other types of secure data like health?
That wouldn't surprise me, but how many transistors do those types of portions use?
 
It appears that the iPhone 6 Plus is rendering at only 1136x640 (i.e. the iPhone 6's native resolution, rather than the 6 Plus's higher native resolution of 1920x1080) in GFXBench onscreen tests:

http://gfxbench.com/compare.jsp?ben...Plus&os1=iOS&api1=gl&D2=Apple+iPhone+6&cols=2

Yuck, if that's true and a trend that carries over to all mobile games too. Wherever I was able to reduce from native resolution, the aliasing side-effects were more than just nasty. Why the heck did they even bother to clock the GPU roughly 10% higher in the Plus, then? It's not as if they'll run out of texel fillrate with 8 TMUs any time soon either *argh* :rolleyes:
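The pixel counts make clear how much work the upscale saves, assuming the reported 1136x640 render target is accurate:

```python
# Sketch: pixel counts behind the onscreen-result discrepancy. If the 6 Plus
# renders at 1136x640 and scales up to its 1920x1080 panel, it shades far
# fewer pixels per frame than native rendering would.

def pixels(w, h):
    return w * h

rendered = pixels(1136, 640)
native = pixels(1920, 1080)
print(f"Rendered: {rendered:,} px, native: {native:,} px, "
      f"ratio: {native / rendered:.2f}x")
```

That's nearly a 3x reduction in shaded pixels, which would inflate onscreen scores considerably relative to true native-resolution rendering.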
 