NVIDIA Tegra Architecture

Discussion in 'Mobile Graphics Architectures and IP' started by french toast, Jan 17, 2012.

  1. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Exactly; however, it could still be that Erista doesn't have as many raw GFLOPs after all and still bitchslaps every SoC it succeeds, by varying degrees, exactly as the slide states.
     
  2. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    783
    Location:
    EU-China
    ^^ This

    If (and it's a big IF) I were Nvidia, I would go with 2 SMMs and lower the GPU frequency a bit (to around 750MHz) to offer a good perf increase (50%) and keep TDP lower than TK1's.
    2 SMMs at 750MHz would still have no competition in GPU benchmarks throughout 2015 and beyond...
     
  3. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Well, GX6650 has 12 TMUs, doesn't it..so 16 TMUs for a 2015 part doesn't sound like overkill. Besides..if they're targeting high-end, high-res tablets, it wouldn't hurt.

    And what is the area cost of TMUs these days? GM204 has 128 of them, so it can't be too much, I would think. Compared to the TMU area on 28nm TK1, the percentage of die size would probably remain the same on 20SoC anyway.
    I disagree. If they are aiming for higher perf/W..a 2-cluster design makes more sense, as two lower-clocked clusters could consume less power than one higher-clocked one.
    True..and the graph shows an increase of "only" ~1.6x anyway. I think the increased L2 cache on GK208 played a part, and Maxwell has built upon that. Erista would probably go to 2MB L2 as well.
    Why? It taped out months after the first GM2x0 part..why would they go with first gen?
    Yep..LPDDR4 memory and 2 MB L2 seem likely.
    Exactly..seems more plausible than ~1 GHz clocks on 20SoC anyway.
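The argument above for two lower-clocked clusters follows from the usual first-order dynamic power model, P ≈ C·V²·f. The sketch below is only a toy: the capacitance, voltages and clocks are made-up illustrative values, not measurements of any Tegra part, and leakage is ignored entirely.

```python
def dynamic_power(cap, volts, freq_mhz, clusters=1):
    """First-order CMOS dynamic power: clusters * C * V^2 * f (arbitrary units)."""
    return clusters * cap * volts ** 2 * freq_mhz

# Hypothetical operating points, chosen only for illustration.
# A lower clock is assumed to allow a lower supply voltage.
one_fast = dynamic_power(cap=1.0, volts=1.0, freq_mhz=1000, clusters=1)
two_slow = dynamic_power(cap=1.0, volts=0.8, freq_mhz=500, clusters=2)

# Throughput is modeled naively as clusters * frequency,
# so both configurations deliver the same nominal throughput.
throughput_one = 1 * 1000
throughput_two = 2 * 500

print(f"one fast cluster : {one_fast:.0f} units")
print(f"two slow clusters: {two_slow:.0f} units")
print(f"power ratio      : {two_slow / one_fast:.2f}")
```

With these assumed numbers the two-cluster setup draws about two thirds of the power for the same nominal throughput; the real benefit depends on how much voltage headroom the lower clock actually buys, and it costs extra die area.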
     
  4. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Good point; but unless Apple truly has an A8X planned for the rumored >12" tablet, I don't see it in consumer products all that soon.

    Look at the Apple A8 die shot; there's one where they've marked the quad TMUs that are shared between two clusters at a time. It's my understanding that those are 16*16k TMUs, with the difference that Apple's drivers support only max 4k*4k textures. In any case, given that Anandtech estimates the A8 GPU at 19.1mm2@20SoC, it shouldn't be too hard to extrapolate from the die shot roughly how much 2 quad TMUs cost in hw. They're definitely not small.

    Agreed and good point.
    Do they really need >DX11.0 already for the ULP SoC space? I could of course see the possible funky "oh look it's DX12" marketing trip, but that doesn't come for free in hw either. I was estimating through dumb layman's math that GM204 would be around 370mm2; what possibly threw me off was the added >DX11.0 feature budget.

    Though 750MHz already sounds aggressive for consumer products; but yes, two clusters at a way more reasonable peak frequency than GK20A's make sense.
     
  5. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Kepler Tegra already supports DX12 whether we care about it or not, of course with a lowish feature level if that's what you refer to (and that needs DX12 included in Windows 10 ARM)

    To compare with desktop:
    Perhaps low-end desktop GPUs have always had too many features, but they're copy-pasted from the higher-end ones, and the general public appreciated cards compatible with everything even if slow (now Intel GPUs can run everything).

    Khronos announced "OpenGL Next", which will replace both OpenGL and OpenGL ES. The existence of a separate API was an anomaly, even if practical. OpenGL Next is also said to be a clean break (get rid of all the cruft from the 90s) and will be a DX12 competitor so to speak.
     
  6. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    Yep; GM20x goes beyond DX11.0 feature level.

    The question is whether GL Next will require as many additional functionalities as DX12 in the end.
     
  7. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    I would hope so.
     
  8. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    OK and when is GL_Next estimated to get finalized? A simple estimate would do.
     
  9. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Well we will find out soon enough..there is an iPad event scheduled for the 16th of October apparently. Quad core CPU + GX6650 anyone? :lol:

    You also have to remember that Erista will be around for some time. Depending on how the 16/14nm ramp goes, it may well have to hold down the fort till Q1 or Q2'16, so higher performance wouldn't be a bad thing. And it gives them flexibility to play around with power/clocks.
    I don't have Photoshop so I did a very rough analysis. At max, each quad TMU seems to be about 1/10th of the size of the GPU, i.e. ~1.9 mm2. So at max, two quad TMUs would be ~3.8 mm2. I agree it's not chump change, but in a ~100 mm2 SoC it's not too much either, wouldn't you say?
    Well, as I have alluded to earlier, marketing is a big factor. Even if we take ~10% as the extra die area required for DX12 support, and take 30mm2 as the area of the GPU block (both of which are high estimates IMHO), the extra die area required is just ~3 mm2. Secondly, the improved efficiency of 2nd gen Maxwell alone could have been enough of a selling point. Another point to consider is that if it does indeed have DX12 support, it could be a good launch vehicle for Windows 10 based tablets (or possibly phones?). And as I've already stated above, Erista could be around for a while, and towards the middle or end of its life cycle other IHVs could release a SoC with a DX12 compliant GPU.
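The back-of-the-envelope area figures in this post can be reproduced as follows. All inputs are the thread's own rough estimates (19.1 mm² for the A8 GPU per Anandtech, a quad TMU at no more than a tenth of that, a 30 mm² GPU block, 10% DX12 overhead, a ~100 mm² SoC); the variable names are only for illustration.

```python
a8_gpu_mm2 = 19.1          # Anandtech's A8 GPU estimate quoted in the thread
quad_tmu_fraction = 0.10   # upper-bound eyeball estimate from the die shot
soc_mm2 = 100.0            # rough SoC size assumed in the post

# Cost of one quad TMU and of the two additional quad TMUs under discussion.
quad_tmu_mm2 = a8_gpu_mm2 * quad_tmu_fraction
two_quad_tmus_mm2 = 2 * quad_tmu_mm2

gpu_block_mm2 = 30.0       # deliberately high GPU block estimate
dx12_overhead = 0.10       # deliberately high feature-cost estimate
dx12_mm2 = gpu_block_mm2 * dx12_overhead

print(f"one quad TMU   : ~{quad_tmu_mm2:.1f} mm2")
print(f"two quad TMUs  : ~{two_quad_tmus_mm2:.1f} mm2 "
      f"({100 * two_quad_tmus_mm2 / soc_mm2:.1f}% of the SoC)")
print(f"DX12 overhead  : ~{dx12_mm2:.1f} mm2 "
      f"({100 * dx12_mm2 / soc_mm2:.1f}% of the SoC)")
```

Both items land in the low single-digit percent range of a ~100 mm² die, which is the point being argued; whether that is cheap or expensive for a ULP SoC is exactly what the thread disagrees about.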

    All this is my own personal speculation mind you..gotta love being an armchair CEO right? :lol:
     
  10. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I'd rather bet for dual core at a rounded up frequency.

    Unless I'm missing something it doesn't sound like much more than a year from SoC to SoC either.

    I'm awful with die shots, but luckily the boys at Anandtech marked the TMU blocks, so I estimated roughly 2mm2 for each quad TMU. I don't know how big they're planning to make Erista. Historically, after Tegra2 they almost religiously stayed at around 80mm2, and K1 broke every one of their SoC area records so far. Back to "base", or does it simply not matter?

    Besides, 3.5-4mm2 isn't exactly small for the ULP SoC world; given the A8 die area and possible transistor density, those 2 extra quad TMUs could be well over 80 million transistors.
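A rough way to sanity-check that transistor figure is to scale the area by an average density. The A8 numbers below (about 2 billion transistors on a roughly 89 mm² die) are commonly cited figures brought in from outside this thread and are assumptions here; GPU logic may also be denser or sparser than the chip-wide average.

```python
# Assumed, commonly cited A8 figures (not from this thread).
a8_transistors_millions = 2000.0
a8_die_mm2 = 89.0

# Chip-wide average density in millions of transistors per mm2.
density = a8_transistors_millions / a8_die_mm2

# Area range for two quad TMUs as estimated in the post.
low_mm2, high_mm2 = 3.5, 4.0
low_millions = low_mm2 * density
high_millions = high_mm2 * density

print(f"average density: ~{density:.1f} M transistors/mm2")
print(f"two quad TMUs  : ~{low_millions:.0f} to ~{high_millions:.0f} M transistors")
```

Under these assumptions the two quad TMUs come out in the ballpark of 80 to 90 million transistors.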

    10% here and 10% there, and before you know it you're way over budget in the end. As for DX12 in smartphones, it'll take eons before you see anything even OGL_ES3.x related, let alone anything close to DX11. Tablets are in a better power realm, but there too the functionality is still far from being anything but mostly decorative.

    I've nothing against progress, au contraire but under such extreme power constraints I'd rather have transistors invested in higher efficiency. Besides Microsoft's penetration in the ULP market is still extremely small.

    Do you see many around here that aren't speculating with these things? :razz:
     
  11. Turbotab

    Newcomer

    Joined:
    Feb 19, 2013
    Messages:
    214
    Likes Received:
    3
    Possibly, but with 1GB of RAM :lol:
     
  12. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    We're way OT but nope. If such a thing is even planned for now, native resolution isn't likely to end up below 3072*2304.
     
  13. tangey

    Veteran

    Joined:
    Jul 28, 2006
    Messages:
    1,537
    Likes Received:
    282
    Location:
    0x5FF6BC
    It's possible that the A8 will remain, and with the larger form factor, the CPU will be clocked up a little, the GPU clocked up more.

    Is there a consensus guesstimate of the A8 GPU clock in the iPhone? If it's around 450MHz, and the A8 has been designed for 600MHz in a larger form factor, that provides a theoretical 33% improvement, assuming there aren't major bandwidth limitations.
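The 33% figure is simply the clock ratio. Both clocks are guesses from the post above, and the sketch assumes performance scales linearly with GPU clock, i.e. no bandwidth or thermal limits.

```python
iphone_clock_mhz = 450.0   # guessed A8 GPU clock in the iPhone
tablet_clock_mhz = 600.0   # hypothetical clock in a larger form factor

# Ideal speedup if performance tracks GPU clock one-to-one.
gain = tablet_clock_mhz / iphone_clock_mhz - 1.0

print(f"theoretical improvement: {gain:.0%}")
```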
     
  14. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    My hope would be next year.
     
  15. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I don't think there's anyone who doesn't share that hope :smile: If it ends up being late '15 or later, though, it won't help Tegra marketing much if the Erista GPU exceeds feature level DX11.0.
     
  16. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    That's very true.
    The favorable case would be that middleware (e.g. Unreal Engine, Unity, etc.) easily lets you have a DX12 path, an OpenGL 4.5 path, an OpenGL Next path etc., even if the assets and engine otherwise target old OpenGL ES.
    If that allows reduced CPU usage, that's mildly good and useful.
    Ideally the app store sends you a package that only includes the code path you need. Can Android or Windows RT/Phone do that? It would be useful for ARMv7 vs ARMv8 vs x86 too.
     
  17. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
    I'd actually love to know how many Watts K1 needs for very demanding engines like the latest Unreal one or something as complicated as DX11 tessellation.

    Measuring power or battery life in comparatively boring stuff like Kishonti's T-Rex, which mostly overloads on alpha tests, is fine and dandy, but when you market product N with capabilities X, I'd also like to know what things look like at full tilt.
     
  18. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,511
    Likes Received:
    224
    Location:
    Chania
  19. Erinyes

    Regular

    Joined:
    Mar 25, 2010
    Messages:
    808
    Likes Received:
    276
    Probably..I was just joking. Though there are some cases where a quad core would be useful, especially in the rumoured "Pro" version of the iPad.
    Well, if we take Q1'15 and Q2'16 as the probable release dates, it would be 5 quarters, which is a bit more than the gap between the previous Tegra chips afaik.
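For reference, the quarter count works out like this. The helper name is made up for illustration, and the two dates are the speculative release windows from the post.

```python
def quarters_between(start_year, start_q, end_year, end_q):
    """Number of quarters from (start_year, start_q) to (end_year, end_q)."""
    return (end_year - start_year) * 4 + (end_q - start_q)

# Q1'15 (speculated Erista launch) to Q2'16 (speculated successor).
gap = quarters_between(2015, 1, 2016, 2)
print(f"gap: {gap} quarters")
```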

    Another thing, though, is that I see Erista sticking around for quite a while even after its 16nm successor is released. 16nm capacity will be expensive and availability will be limited initially. There is no significant increase in density from 20nm to 16nm, so the configurations would largely be similar. There would be some gains in power and/or performance, but at considerably higher production costs. So Erista may suffice for a lot of customers, as it would be priced significantly lower and would not trail behind in performance all that much, and it could have a reasonably long life cycle. So if it's a part which is intended to be sold through 2016, or even later..16 TMUs and DX12 compatibility don't seem to be overkill. (It's a similar situation to how 28nm will remain the dominant process node in 2015 for anything mid-range or lower.)
    Well, I got 1.9mm2, so we aren't too far off. Also note that Apple has traditionally been more concerned with power and has traded area for lower power consumption. Nvidia should have higher density, and the actual area for them should be even less.. And see my post above for reasons why the extra TMUs may be justified.
    Agreed..but in this case it would be ~5-6 mm2 in total, so it's not all that high in the context of a ~100 mm2 SoC. And again I'll point you to my post above for the reasons why I think it won't be completely useless. Given the timeframe, the features could very well be utilised towards the later part of its lifecycle.
     
  20. ams

    ams
    Regular

    Joined:
    Jul 14, 2012
    Messages:
    914
    Likes Received:
    0
    If first silicon for Tegra M1 Erista was available in July 2014, then that would be ~ 1 year after first silicon for Tegra K1 Logan was available. So I would realistically expect to see Tegra M1 Erista in devices by Q2 2015. And if that first Tegra M1 variant has Cortex A57 + A53 cores, then I would expect to see a second Tegra M1 variant with Denver cores in devices by Q4 2015. And then the cycle would repeat in 2016 with Tegra M2 [Maxwell GPU], and in 2017 with Tegra P1 [Pascal GPU].
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.