NVIDIA Tegra Architecture

But I would presume their unified shader architecture will be very efficient, as they have loads of experience. Isn't an IMR architecture better than TBDR for more powerful GPUs?
There's nothing that ties more computational power to the rendering approach. TBDRs are as happy with lots of raw power as IMRs; the two things are completely orthogonal.

Ailuros: Why is 80mm^2 approaching the upper limit for NV because of cost? It's a totally arbitrary soft cap to pull out of thin air, since you have no idea about their chip costs.
 
Apparently it's the limit due to package sizes in a mobile form factor.

The fellow from ARM also said this during his AFDS keynote.
 
And Samsung too then.
I certainly agree with your overall point (custom PoP packages are possible for very big companies), but surprisingly enough Samsung uses a standard 14x14 PoP package on the Exynos 4210 (unlike the A5, which uses a 16.7 x 14.3 package). It's also noteworthy that both companies are using custom power management chips (from Dialog Semi for Apple and from Maxim for Samsung).
 
Ailuros: Why is 80mm^2 approaching the upper limit for NV because of cost? It's a totally arbitrary soft cap to pull out of thin air, since you have no idea about their chip costs.

No, I truly don't, but then again how many tablet/smartphone SoCs are as large as the A5, i.e. >120mm^2 at 45nm, after all? Is it that irrational to assume that someone like Apple will mind a higher die area far less, considering the high volumes they're dealing with?
 
Other vendors are making 100mm^2+ smartphone SoCs. Overall, I'm much more inclined to believe NV coming in under it this time around is more because of well-judged positioning than a manufacturing limit.
 
The less cost is a consideration, the higher any hypothetical cost-related manufacturing threshold sits, at least IMHO.
 
Cost is always a consideration, I just don't think it's the primary consideration for NV with Tegra.
 
Kepler in upcoming Tegra SOC?


http://www.anandtech.com/show/5703/jenhsuns-email-to-nvidia-employees-on-a-successful-kepler-launch

One Kepler SMX could be in Series 600 territory.

Well, that is pleasing to see. It's unlikely he would have slipped up like that in a pre-prepared email that was probably checked before it was sent and leaked to the media.

Certainly Kepler is a good base to start from: unlike GTX480, it has little die space dedicated to computational workloads, and it has the best perf/watt of today's top-line chips.
If Adreno 205 is supposed to be a very cut-down, simplified Xenos, then it could be possible in future... not this year though. :p
 
Questionable. An SMX has 192 ALUs, 32 L/S units, and 16 TMUs. It's 1/8th of a GTX680, and at a rough cut one of them may use ~25-30W at ~1GHz (assuming memory bandwidth scales down somewhat, probably not to 1/8th). Cut back the clocks drastically and fall back on integrated memory and I'm sure that drops a lot, but I doubt all the way down to the 2W or so that'd be on the real high end of a phone SoC, because eventually you get diminishing returns from pushing down clocks once leakage dominates. And you'd need more fine-grained gating than this. Not to mention, it'd take something like 30mm^2 of die space on TSMC 28nm (assuming the LP process is the same density; I don't really know, but I thought they moved to bigger transistors in some areas for less leakage). That's on the high end for typical Tegra GPU die utilization. Very reasonable if you're Apple, though.

Maybe other Kepler variants will have smaller SMX units. I have no idea how much work it is to rebalance them.
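
For what it's worth, the back-of-envelope reasoning above can be put into a quick Python sketch. The 195W board TDP, the 8 SMXs and the ~1GHz base clock are the public GTX680 figures; everything else (the even power split across SMXs, the 20% leakage share, the relative voltages, and the helper names `per_smx_power_w` and `scaled_power_w`) is an assumption made up purely for illustration, not NVIDIA data.

```python
# Back-of-envelope estimate only. Anything marked ASSUMPTION is illustrative.

GTX680_TDP_W = 195.0     # published board TDP for GTX680
SMX_COUNT = 8            # SMXs enabled on GK104 in GTX680
BASE_CLOCK_MHZ = 1006.0  # GTX680 base clock


def per_smx_power_w(board_power_w=GTX680_TDP_W, smx_count=SMX_COUNT):
    """Naive even split of board power across SMXs.

    ASSUMPTION: ignores memory, I/O and fixed-function power entirely.
    """
    return board_power_w / smx_count


def scaled_power_w(p_ref_w, f_ref_mhz, f_mhz, v_ref, v, leakage_fraction):
    """Scale a reference power figure to a new clock and voltage.

    The dynamic part follows the usual P ~ f * V^2 rule of thumb.
    ASSUMPTION: the leakage part scales linearly with voltage and does
    not shrink with clock at all, which is a deliberately crude model.
    """
    dynamic_ref = p_ref_w * (1.0 - leakage_fraction)
    leakage_ref = p_ref_w * leakage_fraction
    dynamic = dynamic_ref * (f_mhz / f_ref_mhz) * (v / v_ref) ** 2
    leakage = leakage_ref * (v / v_ref)
    return dynamic + leakage


if __name__ == "__main__":
    p_smx = per_smx_power_w()
    print(f"one SMX at ~1GHz: {p_smx:.1f} W")
    # ASSUMPTION: clock / relative-voltage pairs are invented examples.
    for f_mhz, v_rel in [(1006.0, 1.0), (500.0, 0.85), (300.0, 0.8)]:
        p = scaled_power_w(p_smx, BASE_CLOCK_MHZ, f_mhz, 1.0, v_rel,
                           leakage_fraction=0.2)
        print(f"{f_mhz:6.0f} MHz @ {v_rel:.2f}x voltage: {p:.1f} W")
```

With these made-up inputs the estimate stays well above 2W even at 300MHz, and by that point the leakage term is larger than the dynamic term, which is the diminishing-returns effect described above. Different leakage or voltage assumptions will move the numbers a lot, so treat it as a sanity check and nothing more.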
 
..Yea you took the words right out of my mouth..:p
 
As I said, further down the road it isn't all that absurd, and it doesn't necessarily have to be a GK104 SMX as-is either. I'd be very surprised if it isn't possible to scale down to, say, 3*32/8 TMUs per cluster or even further. There's also nothing in Jensen's statement that necessarily indicates 28nm.
 
Look, I'm going specifically on AnarchX's comment of "one Kepler SMX", where I'm sure he's talking about what is in GK104. If you want to change to a different layout for an SMX then of course you can make it work, but I'm clearly not talking about that. I do still contend that we don't really know how much of the partitioning is built into the design so I don't think it's fair to automatically assume it scales down to what you suggest. That is, without modification.

The other part of the comment was "upcoming Tegra SoC" which I took to mean the NEXT upcoming one (especially since it's being compared to Series 6), ie something that would be 28nm. Of course it should be eventually possible to fit even a GK104 SMX into a phone SoC. I'm not disputing JHH's comment here.
 
I wouldn't even dare to suggest that any IHV like NVIDIA would use part of a desktop design with only small changes for a small form factor SoC. The ULP GeForce GPU blocks in the Tegra SoCs so far (well, it's more or less the same enchilada spread over 3 SoC generations with minor changes, more units and higher clocks) don't even directly resemble any of their desktop designs. It sounds rather like some sort of "mixed bag" with aspects from a number of DX9 architectures NV had.

AnarchX was a wee bit too enthusiastic with his comment here and at 3DC and I think my original answer shows that I don't agree with it at all.
 
One has to keep in mind that with the coming of Windows 8, the target platform for Tegra won't be restricted to the sub-2W range anymore. Clamshells fall in the 5-10W range.
 