Tegra 3 officially announced; in tablets by August, smartphones by Christmas

metafor · Aug 3, 2011

Exophase said:
There's no quad SP MAC, just dual FPMUL and FPADD pipelines which can be chained. Double precision is chained as well, of course.

It's a MAC in that it's a single mult-acc operation. It's not fused, but fused MAC isn't the only definition of MAC.
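For readers following the chained-versus-fused distinction, here is a minimal sketch of the numerical difference in single precision. It only models the rounding behaviour and makes no claim about how the A9 datapath is actually built. The helper names (`to_f32`, `chained_mac`, `fused_mac`) are made up for illustration.

```python
import struct
from fractions import Fraction

def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

def chained_mac(a, b, c):
    """Multiply, round to float32, then add and round again (two roundings)."""
    product = to_f32(to_f32(a) * to_f32(b))  # first rounding
    return to_f32(product + to_f32(c))       # second rounding

def fused_mac(a, b, c):
    """Compute a*b+c exactly, then round once at the end.

    Assumption: the final conversion goes Fraction -> double -> float32,
    which can differ from a true single rounding in rare cases. It is exact
    for the example below.
    """
    exact = Fraction(to_f32(a)) * Fraction(to_f32(b)) + Fraction(to_f32(c))
    return to_f32(float(exact))

# (1 + 2^-12)^2 = 1 + 2^-11 + 2^-24; the last term is lost when the
# intermediate product is rounded to float32.
a = b = 1.0 + 2.0 ** -12
c = -(1.0 + 2.0 ** -11)

print(chained_mac(a, b, c))  # 0.0
print(fused_mac(a, b, c))    # 5.960464477539063e-08, i.e. 2^-24
```

Both count as a single multiply-accumulate operation from the instruction stream's point of view; they just differ in whether the intermediate product is rounded.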

For integer there are 8 8x16bit MACs, and I don't think that they're shared with the floating point multipliers, or at least I can't think of a very good way to do this since you'd need to use 6 of those just to get one requisite 24x32 out of it.

It can be done. Parts of the Wallace tree and the Booth encoders can be re-used. The M3 multiplier as well if it's radix 8.
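The count of six 8x16 multipliers for one 24x32 product quoted above follows from simple chunking: three 8-bit slices of one operand times two 16-bit slices of the other. A small sketch of that arithmetic, assuming unsigned operands; the function name is hypothetical and this says nothing about how real hardware would share the multiplier array.

```python
def mul_24x32_from_8x16(a, b):
    """Build an unsigned 24x32-bit product from 8x16-bit partial products.

    Returns (product, number_of_8x16_multiplies_used).
    """
    assert 0 <= a < (1 << 24) and 0 <= b < (1 << 32)
    total = 0
    multiplies = 0
    for i in range(3):                       # three 8-bit slices of a
        a_chunk = (a >> (8 * i)) & 0xFF
        for j in range(2):                   # two 16-bit slices of b
            b_chunk = (b >> (16 * j)) & 0xFFFF
            # Each partial product is one 8x16 multiply, shifted into place.
            total += (a_chunk * b_chunk) << (8 * i + 16 * j)
            multiplies += 1
    return total, multiplies

product, used = mul_24x32_from_8x16(0xABCDEF, 0x12345678)
print(product == 0xABCDEF * 0x12345678, used)  # True 6
```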

Double precision VMUL actually has a throughput of two cycles on Cortex-A9 so I'd imagine the multiply part is split as two trips through two single-precision multipliers which have been extended from 23x23 to 27x27. It seemed to me that the modular approach would discourage this as well, but at the same time NEON always comes with VFP, while VFP itself is not required in any form for A9. You'd think if it were 100% separate they'd offer them separately, with maybe an A8-like VFP-lite option as well. Instead my guess is that if you get NEON you get VFP practically for free, while the opposite isn't nearly as true due to all of the integer stuff NEON needs.
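The "two trips through two multipliers" guess above amounts to four partial products. A 53-bit double-precision significand splits into a 27-bit low half and a 26-bit high half, so every partial product fits a 27x27 multiplier. The sketch below only checks that arithmetic; it is an illustration of the guess, not a description of the real A9 pipeline, and the function names are invented for the example.

```python
import struct

def significand53(x):
    """Return the 53-bit significand (with hidden bit) of a normal double."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    exponent = (bits >> 52) & 0x7FF
    assert 0 < exponent < 0x7FF, "normal numbers only in this sketch"
    return (bits & ((1 << 52) - 1)) | (1 << 52)

def mul53_via_27x27(ma, mb):
    """Multiply two 53-bit significands using only 27x27-bit multiplies.

    Returns (full_product, number_of_27x27_multiplies_used).
    """
    mask = (1 << 27) - 1
    a_lo, a_hi = ma & mask, ma >> 27         # 27 bits and 26 bits
    b_lo, b_hi = mb & mask, mb >> 27
    partials = [
        (a_lo * b_lo, 0),
        (a_lo * b_hi, 27),
        (a_hi * b_lo, 27),
        (a_hi * b_hi, 54),
    ]
    return sum(p << shift for p, shift in partials), len(partials)

ma = significand53(3.141592653589793)
mb = significand53(2.718281828459045)
product, used = mul53_via_27x27(ma, mb)
print(product == ma * mb, used)  # True 4
```

Four partial products over two multipliers is two passes, which is consistent with the two-cycle throughput mentioned above.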

Architecturally, NEON cannot come without VFP. That decision wasn't made, IMO, because of how the A9 is implemented. I believe A15 has an entirely separate VFP and NEON pipeline.

As for how "free" you can get VFP after having a NEON implementation, that depends on the NEON implementation. Without having to support the rounding and denormal handling of VFP as well as DP, a NEON implementation can be made to be very small and efficient. I'd wager that's why A15 separates its VFP from its NEON pipes.
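To make the denormal point concrete: ARMv7 NEON arithmetic flushes denormals to zero, while VFP can be run in full IEEE mode and keep them. A rough model of the difference for a single-precision multiply is sketched below. The helper names are hypothetical, and flushing the result after rounding is a simplification of what the hardware does.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126  # smallest normal single-precision magnitude

def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

def flush(x):
    """Replace a denormal single-precision value with a signed zero."""
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x

def mul_f32_ieee(a, b):
    """Multiply with gradual underflow, as full IEEE handling would."""
    return to_f32(to_f32(a) * to_f32(b))

def mul_f32_ftz(a, b):
    """Multiply with denormal inputs and results flushed to zero."""
    a, b = flush(to_f32(a)), flush(to_f32(b))
    return flush(to_f32(a * b))

x, y = 2.0 ** -100, 2.0 ** -30
print(mul_f32_ieee(x, y) == 2.0 ** -130)  # True: result is a denormal
print(mul_f32_ftz(x, y))                  # 0.0
```

Skipping the gradual-underflow path is part of what lets a NEON-only datapath be smaller, as argued above.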

It's true that much of the NEON and VFP multiply pipeline can be shared. But from a power/perf standpoint, having them separate -- and having separate fused and chained/mul/add pipelines -- is the best implementation. Of course, there's an area trade-off for that.

metafor · Aug 3, 2011

Exophase said:
Thanks, that's good to know. Makes me wonder why ARM only sells the NEON unit with VFP. Can they really be 100% separate, with the shared register file?

I guess this would make it stand to reason that the NEON unit is over twice the size of VFP, since it includes VFP and the vector part is surely larger.

Architecturally, an ARM implementation can't have NEON without VFP.

Exophase · Aug 3, 2011

metafor said:
Architecturally, an ARM implementation can't have NEON without VFP.

ARM didn't have to make that decision either ;p Is there a technical/design based reason for this? ... never mind, guess you hinted at that with the earlier reply.

metafor · Aug 3, 2011

Exophase said:
ARM didn't have to make that decision either ;p Is there a technical/design based reason for this?

I can't see one. But it makes sense for legacy reasons. VFP existed before NEON and even had SIMD-like features (vector operations) that were deprecated when NEON came about.

I agree that requiring a VFP implementation does hinder how small and efficient the implementation can be, but I'm guessing they don't want to have people go back and rewrite software (what little there is) that uses VFP instructions.

Exophase · Aug 3, 2011

That all makes sense, but they still allow a Cortex-A9 core to be licensed that includes neither VFP nor NEON. If they wanted VFP everywhere and if nothing is shared with the NEON unit it might have made sense to make VFP a non-optional part of the core instead of being included in NEON.

metafor · Aug 3, 2011

Exophase said:
That all makes sense, but they still allow a Cortex-A9 core to be licensed that includes neither VFP nor NEON. If they wanted VFP everywhere and if nothing is shared with the NEON unit it might have made sense to make VFP a non-optional part of the core instead of being included in NEON.

I suppose it's possible to have a NEON-only configuration. I suppose they assumed anything that used NEON could use at least some of VFP (load multiples, for instance). And there are certain applications that do require the rounding/denormal handling capabilities of VFP. Although I guess none of that matters in a cell phone.

Laurent06 · Aug 3, 2011

metafor said:
I suppose it's possible to have a NEON-only configuration.

Technically it's possible, but I don't think the option is offered to customers.

metafor · Aug 4, 2011

Laurent06 said:
Technically it's possible, but I don't think the option is offered to customers.

No, but IMO, there should be. A NEON-only implementation can allow smaller and more efficient designs. And more importantly, force compilers/programmers to stop using DP/VFP instructions when they don't need the precision/rounding/denormal.

Rys · Aug 4, 2011

I think it's more that no customer wanted to do it, rather than ARM not offering it.

Ailuros · Sep 7, 2011

http://news.cnet.com/8301-1035_3-20102167-94/nvidia-ceo-sees-tenfold-growth-in-mobile-chip-biz/

Mariner · Sep 7, 2011

Erm...

Huang estimates that Tegra chips are found on half of all high-end Android smartphones and 70 percent of Android tablets

Really?

tangey · Sep 7, 2011

Mariner said:
Erm...

Quote:
"Huang estimates that Tegra chips are found on half of all high-end Android smartphones and 70 percent of Android tablets "

Really?

...and for your next exercise, please define "high-end"

Exophase · Sep 7, 2011

Mariner said:
Erm...

Really?

Maybe if you only count phones and tablets using nVidia and Qualcomm SoCs, because no one else is making them apparently

tangey said:
...and for your next exercise, please define "high-end"

Unfortunately for nVidia there's no definition that'll make their preposterous claim reasonable, everyone knows Apple's A5 SoC is superior to Tegra 2 and has been sold in way more tablets. Maybe he means percent by product, not by sales...

Mariner · Sep 7, 2011

He must be classifying "high end" as dual-core phones only. NV did steal a jump on the market with the early(ish) Tegra 2 phones from LG and Motorola so I suppose his claims may be correct in this context.

The claim initially seemed a bit incongruous to me as I'm a cheapskate when it comes to phones and considered the many single-core 1 GHz+ large-screened phones as "high end"! These must have sold many, many more than the small range of Tegra 2 phones.

I wonder how long it will be before we see Tegra 3 in tablets? I don't think any of the tablets announced at IFA included a Tegra 3, did they?

Mariner · Sep 7, 2011

Exophase said:
Unfortunately for nVidia there's no definition that'll make their preposterous claim reasonable, everyone knows Apple's A5 SoC is superior to Tegra 2 and has been sold in way more tablets. Maybe he means percent by product, not by sales...

To be fair, he did say Android tablets. You can find the Tegra 2 in all types of Android tablet, from the budget ones to the more expensive so I can believe that they have 70% of that market. This still assumes that the various brands of really cheap tablets containing ARMv6 chips, resistive screens and running Eclair or Froyo haven't sold very many!

Exophase · Sep 7, 2011

Yeah, I missed the Android part, that does seem more plausible. I'm going to counter that with "who cares." But cheap Chinese tablets should definitely not qualify as high end, when they're lower end than the first real mainstream ARM tablet.

The more ridiculous bit of the article is that apparently only nVidia and Qualcomm are serious about selling SoCs, as if TI and Samsung don't have products out.

Mariner said:
I wonder how long it will be before we see Tegra 3 in tablets? I don't think any of the tablets announced at IFA included a Tegra 3, did they?

Tegra 3 tablets will be out in August

Arun · Sep 7, 2011

Mariner said:
He must be classifying "high end" as dual-core phones only. NV did steal a jump on the market with the early(ish) Tegra 2 phones from LG and Motorola so I suppose his claims may be correct in this context.

I think that's likely correct. Obviously their share will go down in the second half of the year as OMAP4 and MSM8x60 ramp, but their volumes will still be increasing (e.g. Samsung Galaxy R).

Exophase said:
Tegra 3 tablets will be out in August

Hah! I think it's fair to say that when management is exceedingly confident they won't need any complex respin because they 'prototype a lot' then Murphy's Law will often prove them wrong. BTW, Mike Rayfield asked me if I was coming to Computex (which I wasn't/didn't), implying they'd have Tegra 3 tablets on show there. Obviously the delay happened early on and they just didn't talk about it.

Exophase said:
Maybe if you only count phones and tablets using nVidia and Qualcomm SoCs, because no one else is making them apparently

Jen-Hsun has a long history of statements like that about the handheld industry. If you actually talk to NVIDIA handheld execs, they also take ST-Ericsson and TI seriously, with the usual insistence that "we still move a lot faster than them" ([strike]despite all evidence to the contrary, e.g. 2 years between T20 and T30 tape-out[/strike] T20 to T30 is actually 18 months). They don't take Samsung/Broadcom/Intel seriously and I can't blame them too much.

I can't really blame Jen-Hsun for singling out Qualcomm though (even if he shouldn't ignore all others). I have yet to talk to or hear about anyone in the industry who doesn't do so. ST-Ericsson? Check. Broadcom? Check. Icera? Check. And the list goes on. And that reputation is well deserved, not just because of their size, but because of their speed. It's very rare to find a company that big that can adapt their roadmap so fast based on changing market conditions. Every single 45nm chip they've released is completely different from what they were talking about three years ago, and very much in a good way strategically. I've never heard of something like that happening at TI or Intel.

metafor · Sep 7, 2011

Exophase said:
Unfortunately for nVidia there's no definition that'll make their preposterous claim reasonable, everyone knows Apple's A5 SoC is superior to Tegra 2 and has been sold in way more tablets. Maybe he means percent by product, not by sales...

In smartphones, having "high-end" be defined as dual-core devices may indeed give Tegra 2 a 50% design share, albeit probably not volume share. There are simply quite a few devices that use Tegra 2, though none that sold remotely as well as the Galaxy S2, I'd say.

Ailuros · Sep 7, 2011

metafor said:
In smartphones, having "high-end" be defined as dual-core devices may indeed give Tegra 2 a 50% design share, albeit probably not volume share. There are simply quite a few devices that use Tegra 2, though none that sold remotely as well as the Galaxy S2, I'd say.

A 50% design share of dual-core Android smartphones?

http://www.nvidia.com/object/tegra-superphones.html

Let's see in random order:

Samsung Galaxy S2
LG Optimus 3D
T-Mobile myTouch 4G Slide
HTC Evo 4G+ Rider
Motorola Droid 3
Motorola Droid Bionic

If any of those shouldn't have dual-core CPUs I apologize up front. In terms of sales the Galaxy S2 should definitely be at the top of that list.

It's definitely quite an awkward way to present things, but on the other hand if he had quoted volume shares it wouldn't have made much of an impression in the end.

Exophase · Sep 7, 2011

Some more phones with dual-core Scorpions:

HTC Sensation
HTC Evo 3D
Pantech Vega Racer
Xiaomi Phone
myTouch 4G

Seems to me like nVidia doesn't have close to 50% of design wins either. Maybe if you count canceled ones.
