NVIDIA Tegra Architecture

Discussion in 'Mobile Graphics Architectures and IP' started by french toast, Jan 17, 2012.

Tags:
  1. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles
    Has anybody figured out why Volantis shows as 2.5 GHz and not 2.3 GHz in Geekbench? It looks like TK1 can match Enhanced Cyclone in single-thread performance, but it's not even close in per-core IPC. Still, impressive for 28nm.
     
  2. Lazy8s

    Veteran

    Joined:
    Oct 3, 2002
    Messages:
    3,100
    Likes Received:
    19
    Then, of course, there's the question of how the differing approaches to CPU architecture affect the performance in a range of real world workloads.

    It'll be interesting to see if nVidia's approach with Denver leads to a new way forward for other design teams, too.
     
  3. ams

    ams
    Regular

    Joined:
    Jul 14, 2012
    Messages:
    914
    Likes Received:
    0
    Performance per watt is what actually matters in mobile, not performance per MHz. And higher clocks don't necessarily mean lower power efficiency (see Maxwell vs. Kepler and Denver vs. Cortex cores as examples).

    As I have mentioned elsewhere, Denver eschews power-hungry OoO logic for a totally different approach where code is optimized in software and then executed in order. The idea is to improve efficiency by optimizing code once and reusing it many times, rather than spending power in hardware to re-optimize it every time it runs.
     
  4. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles
    Judging from battery size and battery life, it looks like Cyclone is way ahead in perf/watt too. The iPad has a much smaller battery, yet Apple advertises 10 hours to the Nexus' 9.


    I am very skeptical about this Nexus; every Tegra has benched well, then gone on to be a dog of a chip, including the 32-bit K1.
     
  5. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles
    I'd like to correct my previous post. The Nexus has the smaller battery: 6700 mAh vs. 7340 mAh.

    It's hard to say which is better per watt. The iPad has a larger screen and longer battery life, but the Nexus has a smaller battery.
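One rough way to frame the perf/watt question in the two posts above is average power draw: battery energy (capacity times nominal cell voltage) divided by the advertised runtime. A minimal sketch with the figures quoted here; the 3.8 V nominal voltage is an assumption for illustration, not a published spec for either device:

```python
# Estimate average platform power draw from battery capacity and runtime.
# Capacities (mAh) and runtimes (h) come from the posts above; the 3.8 V
# nominal cell voltage is an assumed figure, not a published spec.

def avg_power_w(capacity_mah: float, voltage_v: float, runtime_h: float) -> float:
    """Average draw in watts = stored energy (Wh) / runtime (h)."""
    energy_wh = capacity_mah / 1000.0 * voltage_v
    return energy_wh / runtime_h

ipad_w = avg_power_w(7340, 3.8, 10)   # iPad: larger battery, 10 h advertised
nexus_w = avg_power_w(6700, 3.8, 9)   # Nexus 9: smaller battery, 9 h advertised

print(f"iPad  ~{ipad_w:.2f} W average")
print(f"Nexus ~{nexus_w:.2f} W average")
```

On these numbers the two average draws land within a few percent of each other, which supports the "hard to say" conclusion; it also says nothing about CPU-only efficiency, since screen size and the rest of the platform dominate.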
     
  6. ams

    ams
    Regular

    Joined:
    Jul 14, 2012
    Messages:
    914
    Likes Received:
    0
    The CPU perf/watt of both devices is completely unknown at this time. Ideally one would run a CPU-intensive application and isolate CPU power consumption by measuring at the voltage rails, or by measuring the difference between idle and sustained power draw under that load. To accurately measure the power consumption of either CPU or GPU, the platform's power consumption needs to be isolated and accounted for. And in this particular comparison there is also a difference in fabrication process node.
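The idle-subtraction method described above can be sketched in a few lines. All the wattage figures below are hypothetical, purely to illustrate the arithmetic; the method's validity rests on the stated assumption that the rest of the platform draws roughly the same in both states:

```python
# Sketch of the idle-subtraction method: approximate CPU power as
# (sustained power under a CPU-bound load) - (idle platform power).
# The measurement numbers below are hypothetical illustrations.

def cpu_power_estimate(load_w: float, idle_w: float) -> float:
    """Crude CPU power estimate. Assumes the rest of the platform draws
    roughly the same in both states (screen brightness fixed, radios off)."""
    return load_w - idle_w

# Hypothetical rail measurements:
sustained_w = 6.2   # total draw while a CPU-bound benchmark runs
idle_w = 2.1        # total draw at idle, same screen state

print(f"~{cpu_power_estimate(sustained_w, idle_w):.1f} W attributed to the CPU")
```

Measuring at the voltage rails directly, as the post notes, avoids that assumption but requires hardware access most reviewers don't have.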
     
  7. JohnH

    Regular

    Joined:
    Mar 18, 2002
    Messages:
    595
    Likes Received:
    18
    Location:
    UK
    You can't use different architectural generations as examples of higher clocks != lower power efficiency. It's a completely meaningless comparison; it just shows that newer architectures may be designed to be more power efficient.
     
  8. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles

    True, but they both offer roughly the same battery capacity, so we can infer something about their power efficiency, even if only in the context of that OS or form factor.

    The Tegra is at a node disadvantage, but should we judge its performance as somehow more impressive based on what it might do?
     
  9. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,722
    Likes Received:
    141
    Indeed, isn't A8X 20nm?


    I don't see that it matters much anyway, Nvidia and Apple aren't really competing against each other.
     
    #3069 ninelven, Oct 23, 2014
    Last edited by a moderator: Oct 23, 2014
  10. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,552
    Likes Received:
    4,713
    Location:
    Well within 3d
    It is, but 20nm is also not a strong improvement in power efficiency. It will take a transition to FinFETs to deliver what many traditionally expect from a node transition.
    There is still some improvement, and the increased density allows more transistors to be spent in pursuit of power savings, so where the tipping point lies could depend on more detailed performance profiling and power testing.
    Should they all transition to a similar FinFET node, the comparison's error bars should be much smaller.
     
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    20SoC, according to insider indications, isn't exactly what you'd expect it to be. Yes, it comes with improvements of course, but to put it into a more realistic Tegra Logan vs. Erista picture: if the latter's GPU is, for example, 80% more efficient (a freely invented figure), do you think it would be fair if I said that the majority of that percentage comes from architectural refinements (Kepler -->> Maxwell) and only a modest percentage from the 28HPm to 20SoC transition?

    It's always nice to have a smaller and more advanced manufacturing process, but if there aren't any leaps in architectural refinement it stays in the hardware-refresh realm.
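The split proposed above can be checked with simple arithmetic: process and architecture gains compound multiplicatively, so they don't just add up to the overall figure. A sketch using the freely invented 80% overall gain from the post and an assumed (equally hypothetical) process contribution:

```python
# Decompose a hypothetical 1.8x overall efficiency gain (the invented 80%
# figure above) into process and architecture factors. Gains multiply:
# overall = process_gain * arch_gain, so arch_gain = overall / process_gain.

overall = 1.80          # hypothetical Erista vs. Logan perf/W ratio
process_gain = 1.15     # assumed modest 28HPm -> 20SoC contribution

arch_gain = overall / process_gain
print(f"implied architectural gain: {arch_gain:.2f}x "
      f"(+{(arch_gain - 1) * 100:.0f}%)")
```

Under these made-up numbers the architecture would have to supply a ~57% gain on its own, i.e. the clear majority of the total, which is the shape of the argument being made.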
     
  12. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles
    From what I understand, the benefits of Maxwell are mostly due to power gating and clock-speed management. Under heavy load the differences between Kepler and Maxwell efficiency are negligible.

    I also understand that mobile GPUs such as the PowerVR 6XT render frames much more efficiently than desktop (read: NVIDIA) GPUs. Who is to say whether NVIDIA's current lead in desktop GPUs would even translate?
     
  13. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    A GTX 980 is almost 60% faster than a GTX 770 at comparable real-time power consumption, while GM204 is on the same process as GK104 and has about 33% more die area. The same goes for GM107 vs. GK107. You don't get those kinds of differences just with power gating and frequency tricks. If you don't see any of these increases, you're most likely comparing apples to oranges, i.e. making the same mistake as many do when comparing GM204 to GK110: the first is a performance chip, the latter a high-end chip.

    This doesn't come for free either; GM204 has roughly 48% more transistors than GK104, a portion of which goes to GM204's higher feature compliance.

    Which has what to do with the topic at hand, exactly? Yes, PowerVR is a TBDR, and I've been following them since I was a teenager. As with all approaches, there are both advantages and disadvantages.

    That still doesn't change one bit that the ULP Maxwell GPU in upcoming Erista will raise the efficiency bar significantly as will most future architectures in that market.
     
  14. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    From the CUDA thread, it's clear that they added some kind of register reuse cache that reduces fetches from the register banks. The banks are pretty large, so that should be quite an optimization. And if the register reuse cache sits much closer to the ALUs, less power is lost moving operands around as well. Then there's the reduced hardware scheduling and the slimmed-down operand crossbar, which no longer routes every operand to every execution unit.

    The whole SM architecture has changed significantly, and the changes intuitively seem to benefit perf/W.

    You don't get this kind of improvement with just clock gating (as if that were a new thing) and clock-speed management (whatever that means).
     
  15. RecessionCone

    Regular Subscriber

    Joined:
    Feb 27, 2010
    Messages:
    504
    Likes Received:
    186
    I'm skeptical about Denver perf/watt too. But why do you think 32bit K1 is a "Dog of a chip"?
     
  16. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles

    I don't mean it's not very fast on paper, but there are tons of bugs, and every person I've seen use the 32-bit K1 returned the thing within hours. Lots of incompatibility, random crashes, etc. It may have had nothing to do with the K1 itself and been down to the rest of the hardware and software...

    But I doubt it. Look at the sales figures; customer response has been negligible.
     
  17. Florin

    Florin Merrily dodgy
    Veteran Subscriber

    Joined:
    Aug 27, 2003
    Messages:
    1,682
    Likes Received:
    297
    Location:
    The colonies
    Tons of bugs, such as..?

    Where are the sales figures for K1 devices?
     
  18. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles

    It's not just "clock gating", and how do you explain the fact that Kepler and Maxwell have the EXACT same power consumption on compute loads?

    Your explanation is that "the architecture is different"... OK!
     
  19. extrajudicial

    Banned

    Joined:
    Oct 23, 2014
    Messages:
    20
    Likes Received:
    0
    Location:
    Los Angeles

    I'm on mobile and am having a hard time linking all the reviews. The consensus is that, yes, it's very fast but very buggy. Lots of crashing and updates.

    I'd love to see the sales figures; I'll look for them in the next hour.
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.