Apple A8 and A8X

Nice work. Now we just need a die shot to confirm cluster count.

Agreed, thanks ltcommander. I wonder if cache sizes can be found in docs; cache timings probably have to be measured. Again, if the Geekbench leak is valid, the IPC is remarkable, and the gritty details of how it is achieved are quite interesting.
 
Didn't IMG describe XT as having "up to" a 50% performance improvement over 6? I wonder if it's still a 4-cluster XT unit with a clock increase.

Or looked at another way, if it went to a 6-cluster XT, wouldn't Apple be talking about a lot more than a 50% improvement, unless the clock has been reduced?
 
3x iPhone retina res would eat up more RAM for screen buffers, and with 1GB already getting a little tight it would make sense.
 
Right. The architectural improvements of 6XT over 6, at the same clock rate and number of clusters, were said to yield up to 50% better performance as measured by the established mobile benchmark software in use today.

And, going from a four to six cluster design just by itself should yield up to a 50% performance improvement.

A GX6650 at a similar clock to the A7's G6430 would represent both of the aforementioned enhancements, so totaling only a 50% performance improvement, if that is indeed the core being used, would be very underwhelming.
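
To spell the arithmetic out, here's a trivial sketch of how the two "up to 50%" figures would compound in the best case (purely illustrative, assuming both gains scale ideally):

```python
# Best-case compounding of the two "up to 50%" gains; illustrative only.
arch_gain = 1.5       # 6 -> 6XT architectural improvement, upper bound
cluster_gain = 6 / 4  # four -> six clusters, ideal scaling

combined = arch_gain * cluster_gain
print(f"combined upper bound: {combined:.2f}x, i.e. +{(combined - 1) * 100:.0f}%")
# -> combined upper bound: 2.25x, i.e. +125%
```

Against a theoretical ceiling of roughly +125%, a quoted +50% would indeed look modest.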
 
3x iPhone retina res would eat up more RAM for screen buffers, and with 1GB already getting a little tight it would make sense.

I have the feeling that the downsampling to 1080p is just because Apple can't find a decent (and decently priced) 1242x2208 5.5" panel in time. I'd expect the next generation iPhone to ditch the downsampling completely.
 
But 1080p panels have to be more common and cheaper?

I thought that going forward from iOS 7, apps would not be as resolution dependent. So maybe as apps support the 6 Plus directly (with layouts differentiated from the smaller iPhones and iPads), this kind of downsampling won't be necessary once apps are updated for the Plus?
 
But 1080p panels have to be more common and cheaper?

I thought that going forward from iOS 7, apps would not be as resolution dependent. So maybe as apps support the 6 Plus directly (with layouts differentiated from the smaller iPhones and iPads), this kind of downsampling won't be necessary once apps are updated for the Plus?

One problem is that the button sizes are already fixed in many apps. Basically you don't want a button to be too small to tap. Without using a 3x resolution, a 401 DPI screen would render a "normal size" button too small.

In theory, you can put everything into logical resolution (e.g. I want my button to be 7mm x 7mm, instead of 46x46 "points"), but without using full vector rendering, there are always going to be problems (though to be fair such problems are much less visible on high DPI screens).

Interestingly, even a 1242x2208 5.5" iPhone does not actually have 3x resolution (it only has 461 DPI, compared to the 489 DPI that would be 3x the original 163 DPI), so it's not entirely clear where the 1242x2208 numbers come from.

[EDIT] One possible reason behind the slightly lower DPI of the 5.5" iPhone is that Apple probably thinks buttons should look a little larger on a larger phone.
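
For reference, the pixel-density numbers above fall straight out of the panel geometry. A quick sketch; the 1080x1920 figure for the physical 5.5" panel is the commonly quoted one:

```python
from math import hypot

def dpi(width_px, height_px, diagonal_inches):
    # pixels along the diagonal divided by the diagonal length
    return hypot(width_px, height_px) / diagonal_inches

print(f'5.5" 1080x1920 panel:  {dpi(1080, 1920, 5.5):.0f} DPI')  # ~401 (physical)
print(f'5.5" 1242x2208 render: {dpi(1242, 2208, 5.5):.0f} DPI')  # ~461 (3x point grid)
print(f'true 3x of 163 DPI:    {3 * 163} DPI')                   # 489
```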
 
Right. The architectural improvements of 6XT over 6, at the same clock rate and number of clusters, were said to yield up to 50% better performance as measured by the established mobile benchmark software in use today.

And, going from a four to six cluster design just by itself should yield up to a 50% performance improvement.

A GX6650 at a similar clock to the A7's G6430 would represent both of the aforementioned enhancements, so totaling only a 50% performance improvement, if that is indeed the core being used, would be very underwhelming.

Why would it be underwhelming? I seriously don't understand the reasoning. If there is a GX6650 there, then either Apple is low-balling the performance improvement in terms of FLOPS, possibly because of bandwidth constraints, or because they want to make a device that runs cooler (doesn't need to drop clocks) and does so for longer.

Either scenario, or a combination, is good engineering, at least in my view.
 
I'd expect the next generation iPhone to ditch the downsampling completely.
Interesting! Yes, on second thought, that would seem logical. Or rather, removing the scaling seems logical, not so much having a 5.5" screen with a 2.2k vertical resolution. ;) That's way, way overkill for the human eye really.
 
It's not obvious what GPU configuration they've chosen. I'd lean towards the GX6650, partly because of its 16-bit capabilities, which seem ideal for these devices, and because it could bring a bit more grunt when given freer rein in an iPad, assuming the next iPad uses the same GPU.

It is not a given that clocks are identical between the 6 and 6 Plus.

I'd estimate that a 6650@400MHz could deliver somewhere in the 18-21 fps ballpark in Manhattan offscreen. More than enough for a smartphone, but for a tablet I'd call it completely boring. Sustainable or not sustainable doesn't make a difference to me.

For the FLOP story I have a layman's estimate, though I'm not exactly sure it actually reflects reality:

FP16 ALUs seem to help the G6430 add roughly 25% or more in performance compared to FP32-only cases from the competition; here I'd figure that they cannot run FP32 and FP16 at full tilt simultaneously, but always a portion of each. The 6430@450MHz has 115 GFLOPS FP32 and/or 173 GFLOPS FP16, with a 13.0 fps score in Manhattan.

A hypothetical 6650 clocked at 400MHz would have 154 GFLOPS FP32 and/or 307 GFLOPS FP16. If you relate it to FP32 GFLOPS only, you get 17 fps, but I figure that the additional FP16 ALUs would also contribute a bit more to the final score.
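
As a sanity check, here's that scaling written out, assuming the usual Rogue figure of 32 FP32 ALUs per cluster and 2 FLOPs per ALU per clock (FMA), and deliberately ignoring whatever the FP16 ALUs add on top:

```python
def fp32_gflops(clusters, mhz, alus_per_cluster=32, flops_per_alu_clock=2):
    # FLOPs per clock = clusters * ALUs per cluster * 2 (fused multiply-add)
    return clusters * alus_per_cluster * flops_per_alu_clock * mhz / 1000.0

g6430 = fp32_gflops(4, 450)    # ~115 GFLOPS
gx6650 = fp32_gflops(6, 400)   # ~154 GFLOPS

manhattan_g6430 = 13.0         # fps, GFXBench Manhattan offscreen
print(f"FP32-only GX6650 estimate: {manhattan_g6430 * gx6650 / g6430:.1f} fps")
# -> ~17.3 fps; the extra FP16 ALUs would contribute on top of this
```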
 
The performance improvements seem anemic next to the increased transistor count, especially as the die-shrink's frequency boost alone could account for most of the CPU's gains. Can't wait for a die shot.

Well, it really depends on what transistor counts they are using. Are they using schematic transistor counts or drawn transistor counts?
 
I'd estimate that a 6650@400MHz could deliver somewhere in the 18-21 fps ballpark in Manhattan offscreen. More than enough for a smartphone, but for a tablet I'd call it completely boring. Sustainable or not sustainable doesn't make a difference to me.
As a user, sustained performance is the only metric that matters for gaming. Performance that only lasts for 50s won't get me far in a game. There are other use cases where race to sleep makes sense, but gaming graphics isn't one of them.

For the FLOP story I have a layman's estimate, though I'm not exactly sure it actually reflects reality:

FP16 ALUs seem to help the G6430 add roughly 25% or more in performance compared to FP32-only cases from the competition; here I'd figure that they cannot run FP32 and FP16 at full tilt simultaneously, but always a portion of each. The 6430@450MHz has 115 GFLOPS FP32 and/or 173 GFLOPS FP16, with a 13.0 fps score in Manhattan.

A hypothetical 6650 clocked at 400MHz would have 154 GFLOPS FP32 and/or 307 GFLOPS FP16. If you relate it to FP32 GFLOPS only, you get 17 fps, but I figure that the additional FP16 ALUs would also contribute a bit more to the final score.
The topic of which numerical format sees the most use has come up a couple of times, but hasn't been properly discussed. For me, using fp16 as the default fp format for games seems like a given: half the internal and external bandwidth requirements, more FLOPS (with better die area/power efficiency), and retina screens make single-pixel errors irrelevant. To be detectable, precision errors must be large, correlated with neighboring pixels, and sustained over several frames. That seems pretty damn unlikely, and even if your game could in particular circumstances produce such artifacts, they would still have to be significant enough to offset the other benefits. I just can't see anything other than fp16 being the typical format.
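
To put a rough number on "single-pixel errors": fp16 has a 10-bit mantissa, so for colour-like values in [0, 1] a lone rounding error is already several times smaller than one 8-bit output step. A quick check, using numpy's float16 as a stand-in for GPU fp16:

```python
import numpy as np

# Worst-case fp16 rounding error across [0, 1] versus one 8-bit display step.
values = np.linspace(0.0, 1.0, 100001)
err = np.abs(values - values.astype(np.float16).astype(np.float64))

print(f"max fp16 rounding error in [0,1]: {err.max():.1e}")  # ~4.9e-04
print(f"one 8-bit display step (1/255):   {1 / 255:.1e}")    # ~3.9e-03
```

So isolated rounding noise disappears in the output quantisation; visible artifacts need errors to accumulate, which is exactly the large/correlated/sustained case described above.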
 
Well, it really depends on what transistor counts they are using. Are they using schematic transistor counts or drawn transistor counts?

Regardless of metric, Apple is probably internally consistent in their comparisons. I suspect an increased L3: it improves performance, and the decrease in off-chip communication is a win in and of itself. Ltcommander speculated 16 MB of L3, and if that speculation is correct, that's just over half the increased transistor budget right there.
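
A rough check on that, assuming classic 6T SRAM cells, that the A7 already carried a 4 MB L3 (so 16 MB would add 12 MB), and roughly a billion extra transistors going from A7 to A8; tag, ECC and control overhead ignored:

```python
BITS_PER_MB = 8 * 1024 * 1024
TRANSISTORS_PER_BIT = 6            # standard 6T SRAM cell

added_mb = 16 - 4                  # speculated 16 MB L3 minus the A7's 4 MB
added = added_mb * BITS_PER_MB * TRANSISTORS_PER_BIT

extra_budget = 1e9                 # ~1B -> ~2B transistors, A7 -> A8 (approx.)
print(f"~{added / 1e6:.0f}M transistors, ~{added / extra_budget:.0%} of the extra budget")
# -> ~604M transistors, ~60% of the extra budget
```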
 
Yes I know, iUgly, what were they thinking?
Looks like they did follow my suggestion (that got laughed at here a couple of years ago) and released 2 different-sized phone models. I knew they wouldn't, but they should have gotten rid of the home button.

Clearly, if they had released it 2 years ago, it might have passed, but now that everyone has done 2-3 smartwatches (going from rectangular to round, etc.)... well, it's terrible how common and ugly it looks. (The gold-plated one looks like my grandma's watch.)
 
Entropy,

I've heard that power consumption for an FP16 ALU is less than half that of an FP32 ALU, but I can't vouch for it either.
 
Entropy,

I've heard that power consumption for an FP16 ALU is less than half that of an FP32 ALU, but I can't vouch for it either.

Considering what needs to be done to perform an FP multiply, for instance, that makes perfect sense. And I don't think the FP16 capabilities of the GX6650 are a fluke either; the changes were presumably made because licensees find them functional and desirable. I would assume that FP16 is primarily used for graphics, and FP32 for "other code", whatever the heck it may be that uses the FP32 capabilities of the GPU on these systems.
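
A crude way to see why: the significand multiplier dominates an FP multiply, and its area/energy grows roughly with the square of the significand width (implicit bit included), so on that term alone an fp16 multiply comes in at around a fifth of an fp32 one. Exponent logic, rounding and wiring narrow the gap, which is presumably why "less than half" is the figure that gets quoted:

```python
# Multiplier cost scaling with significand width; a rough model only.
fp16_significand = 10 + 1   # 10 stored bits + implicit leading 1
fp32_significand = 23 + 1

ratio = (fp16_significand / fp32_significand) ** 2
print(f"fp16 multiplier ~{ratio:.0%} of an fp32 multiplier")  # ~21%
```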

App developers want their products to look good and perform well, and they will use the available resources accordingly. Benchmarks, however, aren't under the same pressure, so this is an example where benchmarks conceivably don't give a good prediction of real-world app performance. The same goes for other comparative material. One occasion where the issue of FP formats was raised here was when it was noted that Anandtech was only quoting FP32 FLOPS in their comparison tables; when the question was raised as to why, it was simply because that was what they used in their desktop GPU tables. No attempt to corroborate with actual use had been made.
 
Given Apple's historic use of PowerVR for video encode/decode as well as graphics, I'm guessing it's a reasonable bet that IMG IP is behind the H.265 support.

I can't recall exactly where it was mentioned; was it in relation to FaceTime? In which case they'd need both H.265 decode AND encode?

UPDATE: unless the IMG website is out of date, it appears they don't have an H.265-capable encoder IP, just a decoder.

They announced details of the H.265 decoder over a year ago here:
http://www.imgtec.com/news/detail.asp?ID=780

Which contained the single line:
"For information about PowerVR Series5 video encoder availability, please contact your local sales office."

And there has been no mention of an H.265 encoder since?
 
tangey,

There's also VoLTE in the A8; smells like former Hellosoft (now IMG) IP. http://www.imgtec.com/ensigma/hellosoft.asp

If I had trusted the happy-go-merry news parrots across the internet, Apple should have dumped IMG by now, for the second generation already. Ironically, the amount of IMG IP Apple employs isn't decreasing, but rather the exact opposite.
 