Next-Gen iPhone & iPhone Nano Speculation

Here's a link to next generation decode and encode IP from IMG: http://imgtec.com/News/Release/index.asp?NewsID=597

It doesn't mention die area, but if you log in for the VXD392 it decodes H.264 HP@4.1/30Hz at a frequency under 100MHz. I personally wouldn't want to torture either a GPU at higher frequencies or a CPU at even higher ones for such functions. I'm not even sure there's any SoC available at the moment that doesn't have dedicated encoding/decoding hw blocks.

Depends on what you mean by dedicated. Some -- QCOM and I believe TI -- use DSP blocks instead of purely fixed-function hardware. Obviously these are fairly customized DSPs for video encode/decode.
 
It was poorly worded; if I had said dedicated hw blocks it might have been clearer.
 
VXD isn't that small if you need to support a lot of video standards, despite being able to share quite a bit between them. That's the disadvantage versus a DSP-centric (but I wouldn't necessarily say DSP-based?) approach like Qualcomm's. So very roughly, VXD scales based on the number of standards supported (selected by the customer based on market requirements) and VXE scales by the number of pipelines (selected by the customer to achieve their target performance at a given clock speed - iirc the clock rates for given standards are stated for 2-pipeline configurations). VXD cannot scale the number of pipelines and VXE always supports the same standards. The required clocks are very low for VXD, but those are typical clocks and very complex streams require more, so combined with 1080p 3D requirements it's actually fairly reasonable.
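If it helps, here's a toy model of those two scaling axes. Every number in it is invented purely for illustration (IMG doesn't publish per-standard or per-pipeline area figures):

```python
# Toy model of the VXD/VXE scaling described above. Base and per-unit
# area figures are made up for illustration only.

VXD_BASE, VXD_PER_STANDARD = 1.0, 0.4   # decode: area grows with codec count
VXE_BASE, VXE_PER_PIPELINE = 1.5, 0.8   # encode: area grows with pipelines

def vxd_area(standards: list) -> float:
    """VXD: the customer picks codecs; the pipeline count is fixed."""
    return VXD_BASE + VXD_PER_STANDARD * len(standards)

def vxe_area(pipelines: int) -> float:
    """VXE: the codec set is fixed; the customer picks pipelines to hit
    a performance target at a given clock."""
    return VXE_BASE + VXE_PER_PIPELINE * pipelines

print(vxd_area(["h264", "vc1", "mpeg4"]))   # more standards -> bigger VXD
print(vxe_area(2))   # quoted clock rates assume a 2-pipeline config
```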

I could go on rambling about a whole bunch of other video IPs (I just realised I didn't really answer anything!) but I need to go to sleep so I'll stop now... :)
 
Mostly agreed, but there are two important points worth noting:
1) Quad-core is cheap in terms of transistors. On 40nm a Cortex-A9 is about 2mm². The Cortex-A15 is about twice as big, which means it's very roughly the same die size on 28nm as an A9 on 40nm. As we move to 20nm it won't be much more than 1mm² per additional core (even if we pessimistically assume 50% higher effective wafer prices, it's still quite a bit cheaper than an A9 on 40nm).
2) While handheld CPUs don't take much area, they have extremely high power consumption compared to most other subsystems. And while doubling the number of cores will double your theoretical TDP, it will never increase your power consumption for the same performance target (thanks to power gating) and will actually significantly reduce power consumption for workloads with more than two threads (undervolting: 4 cores at half clock are more efficient than 2 cores at full clock). You can probably keep the same TDP by limiting clock rates, with 4 cores at ~70%(?) of your maximum dual-core clock rate; see the sketch below.
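For a rough sanity check on both points, a minimal sketch, assuming dynamic power goes as N·V²·f and voltage scales roughly linearly with clock in the usable DVFS range (my simplifications, not ARM figures):

```python
# Rough numbers for the two points above. Assumptions (mine): dynamic
# power P ~ N * C * V^2 * f, and V scales roughly linearly with f, so
# P ~ N * f^3 for a fixed core design. Real chips add leakage and a
# voltage floor, so treat these as illustrative only.

def rel_power(n_cores, rel_clock):
    """Relative dynamic power, normalized to one core at full clock."""
    return n_cores * rel_clock ** 3   # V ~ f  =>  P ~ N * f^3

# Point 1: ~2 mm^2 per A9 at 40nm; assuming ~0.7x area per node step:
a9_at_20nm = 2.0 * 0.7 * 0.7          # ~1 mm^2 per additional core

# Point 2: same throughput, a quarter of the power on parallel workloads:
print(rel_power(2, 1.0))              # 2 cores @ full clock -> 2.00
print(rel_power(4, 0.5))              # 4 cores @ half clock -> 0.50

# Equal-TDP quad clock: solve 4 * x^3 = 2 * 1^3 -> x ~ 0.79 of the
# dual-core clock; a real voltage floor pushes this down toward ~70%.
print((2 / 4) ** (1 / 3))
```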

This isn't like the PC market where CPU cores have historically been very big. So when you combine these two facts you get a clear conclusion: even if the performance benefit isn't very large, it's potentially still worth going quad-core on technical merits alone. And when marketing enters the picture, it should be an obvious decision, especially in the 20nm generation. However that does NOT mean it's the single most important factor - obviously a very strong dual-core solution like the MSM8960 is preferable to an average quad-core one like Tegra 3.
Quad-core I deem very likely as well. Beyond that - not so likely.

So if you compare the A9600 and a similar 20nm SoC, I'd expect the number of both CPU and GPU cores to double. However, GPU performance might still increase more than CPU performance, because CPU clock rates will be much more limited when all cores are enabled. Obviously everyone else's CPU:GPU ratio is eventually going to increase up to that point in the ultra-high-end, so for everyone else there's a lot of catching up to do. I agree that the amount of die area dedicated to the CPU will go down on 14nm, but how much depends on what ARM does with their first (likely incremental) and second (SMT at the very least) generation 64-bit cores...
There is no doubt that smartphones/tablets will migrate to the 64-bit ISA.

I don't think I disagree with much of anything in your post. My response was to Ailuros' claim that GPU performance would outpace CPU performance, and that this somehow constituted him going out on a limb, when I felt that, if so, it was a remarkably short, stubby and solid branch. :) Not much of a risk.

I do feel we're in the "gold rush" days of mobile computing. Devices with new and comparatively sophisticated capabilities are ballooning in market presence and value, there are a number of underlying hardware players jockeying for position, and there is still room for huge shifts in market share and even for completely new segments to develop. We should enjoy it while it lasts, because it won't last forever. My approximation was that until 20nm we'd see some fairly aggressive moves in order to differentiate products, but that after this we'd be at a point where additional computing resources would do very little to impress the average buyer, or even a buyer a couple of sigmas out from the average. (Joining Ailuros on his limb here.) Not to mention that such increases will become progressively harder to come by, just as in higher-power segments of computing.
 
Was it ever confirmed that Apple will switch to TSMC? Judging from the rumors I could believe that Apple sent some sort of test-run chip to TSMC for 28nm, but judging from TSMC's general 28nm yield/capacity problems it sounds like bad timing right now. I'm not even sure that Apple wants to switch entirely to another foundry and isn't just considering some sort of dual-sourcing scheme in case they want even higher SoC volumes than they've had so far.
 
No, it was never confirmed. It's been rumored that they were switching, and it's also been rumored that they are sticking with Samsung (as in actively sourced rather than just guessing that they weren't switching). As long as Samsung keeps their process nodes up with TSMC and/or GF, I don't see a huge reason for them to switch if they like their current partnership. There's a lot of demand for TSMC's 28nm processes, but demand for Samsung's is thinner and steadier as a result, it would seem.
 
At least their next SoC should IMO be manufactured at Samsung on its 32nm process. There's plenty of time for 2013.
 
I agree. Samsung's 32nm process is much more proven than TSMC's 28nm process. The last thing Apple needs is poor yields or congested fabs driving up prices and/or limiting availability.

Now Intel just needs to open their fab to everyone to make it interesting :devilish:
 
Apple's strategy with the GPU is to prioritize putting in the most that's reasonable at the time/node, while still factoring in the desire not to disrupt the software ecosystem with new processors every single year.

Even without OpenCL, Apple use the GPU to assist in various not-strictly-graphics tasks in the UI, browser, and imaging, and they plan for a lot more on the imaging side in the future. Plenty of iOS games feature visual enhancements for the A5 SoC, and the majority of games play noticeably smoother on iPad 2/iPhone 4S (quite a few projects are actually making use of UE3 recently).

Apple know that the standards of software, including games, are escalating quickly. They were the driving force behind OpenCL, so they'll be letting developers in on the additional general purpose resources sometime down the line, as well.
 
Including the Epic Citadel demo, I think we're up to 10 UE3 games on the App Store.

Epic Citadel
Infinity Blade
Dungeon Defenders
Gyro13
Warm Gun
Epoch
Dark Meadow
Infinity Blade II
Desert Zombies: Last Stand
Batman Arkham City Lockdown

Afterlife: Ground Zero has also been announced, as well as 4 UE3 games from Gameloft, one of which, March of Heroes, was cancelled.
 
If we take a close look at Apple's previous 5 SoCs, we see a pattern of designing a chip for one process, then bumping up clocks on a smaller process. It's a nice, consistent way to "guarantee" performance increases with every generation of phone Apple makes. Apple should at least have a very good idea of what kind of CPU they'll need 1-2 years out.

As nifty as quad-core sounds, I don't think it's really all that great, especially on a mobile device. Heck, even on a desktop you have to think really hard to find consumer-level apps that take advantage of a quad-core chip. You have to ask whether developers would really take advantage of having 4 cores instead of 2 faster ones (most would not). And a faster dual-core CPU means 0 rewrites to take advantage of performance improvements. Plus, are we really expecting Apple to pull a quad-core A15 out of their hat in late-2012 to 2013? Otherwise, a dual-core A15 would mean giving a quad-core CPU to developers for one generation, only to take it away in the next.

Lastly, I strongly suspect there's only one major SoC development team at Apple, which is expected to produce a brand new SoC with a new architecture every two years. In the interim years, Apple can rely on die shrinks and clock speed increases to get more performance. When you take that into account, Apple knows that they are building the A5 not only for the iPad 2 and iPhone 4S, but for a potential high-resolution iPad 3 as well.
 
Going from multi-core to "more multi-core" doesn't fall outside the refresh ballpark, since you're not really changing architectures. In such a case it's more a design decision whether, with a smaller process, you go only for higher frequencies, only for more cores, or a fine balance between the two.
 
The problem with your analysis is that it makes sense.
A bit more seriously - and this one I mean - the scheme is too predictable.
Regrettable as it may be, part of the design of any consumer device is marketability. And if Apple believes that superficial tech websites are in any way a significant part of what shapes market perception of their devices, then that might be reflected in SoC designs. Given that multi-threaded, cache-resident benchmarks are the order of the day at such sites, and given the insignificant (for Apple) die cost of adding cores, particularly at 32/28nm, going quad A9 could simply be an issue of playing it safe, marketing-wise. Quad has to be better than dual, right? Going to quad A15 would be the next probable step. It would cost a bit, but not too much, and Apple is in a good position to put the squeeze on their competition in that regard.

The big question is how aggressive Apple wants to be. The tablet market is theirs to lose, and they have been aware for some time that Android will be joined by a Wintel attack come Windows 8. Increasing screen resolution and quality is but one way to stay a step ahead of the competition; they are bound to use any other hardware means at their disposal as well, and SoC cost/complexity is one area where they are capable of being very competitive. Caches, data paths, GPU width, core count et cetera are all examples of where they could strengthen their offering and distance themselves from the also-rans. If the cost in terms of battery life is small enough, I wouldn't be surprised to see Apple going above and beyond the scheme you outlined, for market positioning reasons rather than purely technical ones - basically, ensuring that technological predictability isn't taken advantage of. I'd say they already did this with their A4->A5 move on the same lithographic process.

PS. Unfortunately the Cortex A9 doesn't really support going beyond four cores. Otherwise it would have been wonderfully ironic to see Apple put 16 A9s on their SoC, typically "turned off" obviously, but there when called upon to play the numbers game. :)
 
The projection that Apple would use 543MP2 + dual A9s came well before the rumors and driver discoveries started to surface simply because that configuration best matched the die budget trends the semiconductor companies appeared to be targeting.

I don't think Apple went beyond expectations, assuming one already accepted the fact that their concurrent development of end product and SoC and also their strategic partnership with IMG gave them an advantage in time-to-market for the performance crown.

I expect mobile CPUs to settle on quad-core configs soon enough, not because of the meager gain in performance but because of the improved opportunity to manage power consumption: balancing workloads across independently controlled cores, getting to use a lower voltage thanks to a lesser need for clock speed, etc.
 
One quick note: Apple controls their OS. Therefore it should be relatively simple for them to implement ARM big.LITTLE in a true heterogeneous multiprocessing mode (A7 & A15 cores doing useful work simultaneously). That makes the 'problem' of switching from 4xA9 to 2xA15 irrelevant, since it would really be 4xA9 -> 2xA15+XxA7 where X >= 2.
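As a rough illustration of what OS-level heterogeneous scheduling could look like, a toy placement policy; the core names, loads and the 0.5 threshold are all invented, not Apple's or ARM's actual scheduler:

```python
# Toy big.LITTLE heterogeneous-MP scheduler: both core types do useful
# work at once instead of switching clusters wholesale. Policy, names
# and thresholds are all invented for illustration.

BIG_CORES    = ["A15-0", "A15-1"]   # high performance, high power
LITTLE_CORES = ["A7-0", "A7-1"]     # low performance, low power

def schedule(threads):
    """Map {thread: load} to cores: demanding threads go to big cores
    while any remain; light threads land on little cores."""
    big, little = list(BIG_CORES), list(LITTLE_CORES)
    placement = {}
    for name, load in sorted(threads.items(), key=lambda t: -t[1]):
        if load > 0.5 and big:
            placement[name] = big.pop(0)      # heavy -> A15
        elif little:
            placement[name] = little.pop(0)   # light -> A7
        elif big:
            placement[name] = big.pop(0)      # spill onto spare A15s
    return placement

# UI and codec threads hog the A15s; background work stays on the A7s.
print(schedule({"ui": 0.9, "codec": 0.7, "mail": 0.1, "sync": 0.05}))
```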

Mind you, I don't think Apple will go for 4xA9. But I don't think it would be a bad thing either if it doesn't slow down their move to A15+A7 (which is a big if - that's clearly a much more important upgrade). As I've said time and again, there is *no* power cost to going quad-core. It either results in the exact same level of performance at the same power (power gating for lightly multithreaded workloads) or higher performance at the same power (undervolting all cores for highly multithreaded workloads). It's purely an area cost - and that's something Apple can clearly afford.

I agree that their next-gen SoC will probably be very similar but merely increasing the number of cores is much easier than changing the CPU or GPU was in the old days. So given their area budget I don't think a straight shrink is very likely either.
 