NVIDIA Tegra Architecture

Discussion in 'Mobile Graphics Architectures and IP' started by french toast, Jan 17, 2012.

  1. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
    I'm not saying GL ES 2.0 is holding things back... I'm only saying that next-gen parts are arriving and I would have preferred it if all new SoCs shipped with them. Halti is one of the reasons, as I expect it has features that would improve graphics over today's games.

    Exophase... I'm now going to have a good look into the Halti specifics to see just what the features are. I'm not usually into software so much, but it would be a good idea to learn more about it if I'm going to be discussing it.

    However, your and John's points seem to focus more on what details I know about the API rather than explaining just why you think keeping current-gen GPUs is such a good idea.

    Moving to next gen will likely bring other parameters forward with mobile GPUs besides the API, such as performance per watt. Plus, on a personal note, if I'm going to be buying a new smartphone at the end of summer, then I will be expecting to keep it for two years this time, and in that time I want the very latest technology and performance (at least because we are hitting a technology foothold and these things aren't cheap!).

    Ailuros and co.: when you compare it to PC stuff, you're right that those advanced features don't work on big, complex PC games without a moderately powerful GPU. But for some reason mobile gaming has managed to produce graphics at resolutions never thought possible a couple of years ago. Mobile GPUs aren't even close to being maxed out at 720p or even 1080p, so right there we already have a scenario where more IQ can be introduced while still keeping steady frame rates. Next-gen, more powerful and more efficient GPUs will only increase that.

    If studios do implement what I have suggested (and let's face it, they already have to an extent), then we will be able to buy a game off the marketplace and it will have graphics/IQ settings that can be raised or lowered according to the device's performance, to a greater degree than now. That might not be Halti as yet, but there is room for that to happen.
     
  2. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    Good idea.

    Higher functionality increases transistor counts, which increases die area and indirectly decreases perf/W. That's obviously a vast oversimplification that could have pitfalls, but here's an example for the case at hand:

    Assume NV wanted to dedicate, say, 20mm2 out of the 80mm2 total on T4 to the GPU block. They could have chosen between 72 ALU lanes at DX9L3/OGL_ES2.0, or a smaller number of ALU lanes at ~DX10/OGL_ES3.0. Wouldn't you agree that the second scenario would end up quite a bit slower than the first? The presupposition, of course, is the same die area and the same power target for both scenarios.
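
    (To put rough numbers on that trade-off, a minimal Python sketch. The ~35% per-lane area overhead for ES3.0-class features, the clock and the FLOPs-per-lane figure are illustrative assumptions for the sake of the example, not Tegra 4 data.)

    Code:
        # Hypothetical die-area trade-off: same 20mm2 GPU budget, same clock and
        # power target; slim ES2.0-class ALU lanes vs. fatter ES3.0-class lanes.
        GPU_AREA_MM2 = 20.0
        AREA_PER_ES2_LANE = GPU_AREA_MM2 / 72     # back-solved so 72 ES2.0 lanes fit
        ES3_AREA_OVERHEAD = 1.35                  # assumed ~35% extra area per ES3.0-class lane
        CLOCK_MHZ = 520.0                         # assumed identical in both scenarios
        FLOPS_PER_LANE_PER_CLOCK = 2              # one MADD per lane per clock

        def peak_gflops(lanes):
            return lanes * FLOPS_PER_LANE_PER_CLOCK * CLOCK_MHZ / 1000.0

        es2_lanes = int(GPU_AREA_MM2 / AREA_PER_ES2_LANE)                        # 72
        es3_lanes = int(GPU_AREA_MM2 / (AREA_PER_ES2_LANE * ES3_AREA_OVERHEAD))  # ~53

        print(f"ES2.0-class GPU: {es2_lanes} lanes, ~{peak_gflops(es2_lanes):.0f} GFLOPs peak")
        print(f"ES3.0-class GPU: {es3_lanes} lanes, ~{peak_gflops(es3_lanes):.0f} GFLOPs peak")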

    Excuse me, I don't have an entourage.

    The majority of games aren't being rendered at native resolution on recent iPads, to take just one example. And since resolutions are still quite high even with that caveat, that's one more reason to be far more careful with mobile strategies when weighing performance in today's applications against very high functionality, as in the case at hand. If resolutions had stagnated at, say, <720p for all devices forever, things would be far easier.

    As a close second, today's SFF GPUs aren't exactly slouches compared to the desktop GPUs of the era when 1600*1200 was still an exotic, ultra-high-end resolution for PC gaming.

    As I said, the majority of games aren't rendered at 2048 on recent iPads for a reason, nor do I see high-sample anisotropic filtering being widespread on mobile devices. On top of that, mobile games are specifically adapted to the underlying hardware of SFF devices, and despite a fair amount of shader calls here and there, the result is still light-years away from what even a mainstream standalone GPU of today could process at high resolutions with 4xMSAA/16xAF.

    Would you try to torture a high-end tablet today with something like Epic's Samaritan demo, for example?

    As another example, have a look at 3DMark's newest benchmark suite; isn't there a huge difference in requirements between the DX11/DX10 demos for PC/notebook GPUs and the DX9L3 Ice Storm test crafted for SFF mobile devices?

    SW and HW engineers should know what they're doing, why, and when. That includes those participating in this debate.
     
  3. JohnH

    Regular

    Joined:
    Mar 18, 2002
    Messages:
    595
    Likes Received:
    18
    Location:
    UK
    So if you're not saying ES2.0 is holding things back, why do you think ES3.0 is going to result in an improvement?

    I think it's a common misconception that adding features automatically improves quality. Given a basic level of functionality, more often than not it is actually greater performance that leads to increased IQ, particularly where overall performance is still relatively low (mobile is still an order of magnitude or more slower than the desktop). The other problem is that adding features costs power; when you're in a heavily thermally limited environment (such as mobile), those features can and will detract from overall performance.

    This isn't to say features shouldn't be added going forward, however there is a balance that needs to be struck when doing so.
     
  4. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
    OK, I get that, John, but you're generalising. Can you give an example? I'm talking about the fact that even the current generation has room to spare, including under-utilised CPU power. So whilst that's no doubt a correct general statement, can you be more specific?

    Are you saying a game couldn't be built for smartphones that uses the Snapdragon S4 Pro's Halti support and hidden performance to produce better games than we are seeing (using whatever method, but for the sake of argument, the in-game settings I described)?

    Are you saying in-game settings that use Halti features couldn't be implemented on the Exynos 5250 at 1080p, despite the T604 being able to run current games flawlessly at the Nexus 10's resolution?

    What about Snapdragon 800 in a 1080p smartphone? You think the same thing is not achievable there?

    Same for the upcoming Rogue GPUs. It is indeed very doable and financially viable; I predict we will see the above scenario happen in Android gaming sooner than you think, and if not Halti, then more than we are seeing now.

    You keep focusing on what I think Halti is going to give us over ES 2.0. That's not my point. My point is that next-generation GPUs will offer gaming experiences over and above what we have now; the API is just one part of that.

    You, John, and Exophase are obviously very knowledgeable in this area, and Ailuros has better knowledge than I do, so that's why I'm surprised you are against putting higher-tech GPUs into next-gen smartphones.

    You also seem to be suggesting that Halti offers no real benefits over GL ES 2.0. If not, then what exactly is your point? That you're just pointing out my limited knowledge of Halti? Because I have conceded that. Perhaps a more constructive approach would be to explain just why it is a good idea NOT to move to Halti in games.

    I have respect for you lot, however your points are puzzling, TBH.
     
  5. Jubei

    Regular

    Joined:
    Dec 10, 2011
    Messages:
    555
    Likes Received:
    195
    You are confusing the Nexus 10 with the Nexus 4; the tablet doesn't have ultra-low thermal limits. The performance is very close to what Samsung themselves claimed in their Exynos 5250 whitepaper. And I'm not sure why you think only ARM can improve drivers; IMG confirmed at CES a new driver update that will give a performance boost to the existing lineup as well.

    Except Apple doesn't like the PC model for iOS; they prefer more of a closed-box model, which is why you don't see graphics settings in iOS games. So you won't see Halti being targeted until the majority of iOS devices support OGL ES 3.0, just as you don't see games targeting the SGX543MP4 right now, because it's a waste of money to create a game for a small target audience.

    Android doesn't make a lot of money in gaming; companies like Gameloft can barely be bothered to release optimised versions of their games, so I doubt they will start focusing on Halti APIs strictly for Android titles. If anything, Samsung going with the SGX544MP3 will mean a lot of games run better, since almost all games come from iOS, which also uses IMG GPUs.
     
  6. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
    Well, I'll eat my hat if Samsung introduces their own tablet with the Exynos 5250 and it doesn't outperform the Nexus 10.
    How can it be that the T604, with much more power, unified shaders, likely more efficient resources (cache snooping?) and nearly twice the bandwidth, is only 3x the GS3 in GLBenchmark 2.5?

    Don't forget the massive single-thread advantage of A15s over moderately clocked A9s; that has to play some part.
    Anyway, I was expecting a higher-clocked T604, or a low-clocked T628/T658 or 8-core equivalent.

    Nonsense regarding Apple. Are there not optimised games for the new iPad and its mammoth SGX554MP4? Of course Apple would enable an iOS 7 with Halti support if they included Rogue, and that would give them bragging rights over the competition. Don't say Apple doesn't care about hardware specs and performance, because they are continually the GPU leader; don't say Apple isn't into optimising games for its SoCs' features, because they basically kicked off OpenGL ES 2.0 in smartphones. Also, games on Apple platforms are normally the best versions of those games at launch... buttery smooth.

    Just to note, I'm not an Apple fan and have never owned a single product from them. Just saying.
     
  7. JohnH

    Regular

    Joined:
    Mar 18, 2002
    Messages:
    595
    Likes Received:
    18
    Location:
    UK
    Hmm, not sure how you want me to be more specific.

    Sorry, but you're wrong: current mobile GPUs don't have "room" to spare when running _demanding_ content. Just look at GLBenchmark 2.5; I would hardly call it demanding by today's standards, yet current mobile GPUs are still struggling with it. There is still huge scope for gains from increased performance.

    What's hidden about it? The S4 struggles on GLBenchmark 2.5; Halti features are irrelevant to the basic shortage of horsepower.

    Which _demanding_ games would you be talking about exactly? Titles are written for the lowest common denominator, which at the current time comes down to performance. Uplift performance across the board and content will become more demanding and IQ will improve. Adding Halti features doesn't aid this.

    You're kidding me, right? You're the one who was saying how disappointed you are with a vendor's possible choice of GPU in their next SoC, based on what seems to be a combination of Halti and a bunch of ARM marketing bullshit.

    Hmm, I'm pretty sure I'm not arguing against higher tech or claiming that ES3.0 won't offer benefits; perhaps you missed the comment about supporting useful ES3.0 features via extensions (something NV and now IMG are both doing). I'm primarily pointing out that you need a balanced approach: throwing features at the platform alone will not increase IQ; you need performance to go with them. I'm pretty certain someone has already answered why it isn't a good idea to build Halti games right now, i.e. why limit your market to a few select high-end devices, particularly when more IQ gains can be had just by exploiting greater performance.
     
  8. Jubei

    Regular

    Joined:
    Dec 10, 2011
    Messages:
    555
    Likes Received:
    195
    And your excuse for the Samsung Arndale board? Is that one also under an "ultra-low thermal limit"? Feel free to produce a source saying the Nexus 10 has a low thermal limit; as I said, you are confusing it with the Nexus 4, and no source has so far claimed this about the Nexus 10.

    If you think the T604 is thermally limited in a tablet, which has much more space for heat distribution, why would a higher-clocked T604 do well in a phone?

    There are no games optimised specifically for the SGX554MP4; the only difference between it and the iPad 3 is that some games run at native resolution where they are upscaled on the iPad 3. The iPad 3 also lacks some particle effects (which are present on the iPad 2) in a few games, but that is not optimisation. Like I said, you can't change graphics settings in iOS games; I can't switch between low-res and high-res textures or dial effects up or down. That is already taken care of when I buy the game.

    And I have never said Apple doesn't care about specs or performance. That is a strawman on your part.
     
  9. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    I'm a layman compared to the other two gentlemen, but following the debate here, none of us is actually against more advanced GPUs in next-generation smartphones. I personally am merely saying that there are quite a few dilemmas to face on platforms as power-restricted as smartphones.

    Make sure the hat is made of chocolate then; riddle me this: if it had been possible, why would Samsung be so insane as to license GPU IP elsewhere and spend additional millions on licensing fees and royalties? Also see Jubei's points above about Samsung's own reference board.

    Samsung has chosen twice as many CPU cores for the octa-core Exynos as in the 5250, and you also think that twice as many T6xx GPU clusters would have been possible at the same time within a smartphone power budget?

    For the second sentence, riddle me also this: what's the theoretical difference in performance if I take the very same architecture, with no changes involved, at 4 clusters @ 600MHz vs. 8 clusters @ 300MHz? The second scenario is most likely going to win in terms of power consumption, yet not by a huge margin either, since 8 clusters don't come for free in terms of die area. Performance will be the same, with no hardware changes involved.
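
    (A minimal Python sketch of that riddle; the lanes-per-cluster and FLOPs-per-lane values are placeholders, not real T6xx figures. The point is only that peak throughput scales with clusters * clock, so the two configurations tie.)

    Code:
        LANES_PER_CLUSTER = 16          # placeholder value
        FLOPS_PER_LANE_PER_CLOCK = 2    # placeholder value

        def peak_gflops(clusters, clock_mhz):
            return clusters * LANES_PER_CLUSTER * FLOPS_PER_LANE_PER_CLOCK * clock_mhz / 1000.0

        print(peak_gflops(4, 600))  # 76.8 GFLOPs
        print(peak_gflops(8, 300))  # 76.8 GFLOPs -- identical peak, but roughly twice the cluster area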

    On top of Jubei's accurate point in answer to the above, assume Apple integrates Rogue into its next-generation SoCs. Let's call the smartphone SoC "A7" and the tablet SoC "A7X": how many GPU clusters would you expect for an "A7" and how many for an "A7X"?

    By the way, IMHO (and I'd like to stand corrected), IMG has so far scaled entire cores with Series5XT, while they'll move to scaling clusters in Series6/Rogue. While the first has the disadvantage of hardware redundancy, since entire cores are scaled, I have a damn hard time believing that, whatever efficiency advantages Rogue comes with, the MPs don't have a single advantage against it. As a simple example, on an MP4 you have 4 raster/tri-setup units and 4*16 z/stencil units. If in theory you go for the same number of units in a 4-cluster Rogue, the cluster advantage of avoiding hardware redundancy goes poof.

    Have a look at the 1080p offscreen triangle results here:

    http://www.glbenchmark.com/compare....+Pad+TF700T+Infinity&D4=Apple+iPhone+5&cols=4

    I haven't the vaguest idea what ARM has done with the T604 exactly, but assuming it has 64 USC ALU lanes clocked at 533MHz, compared to the 4 VS ALU lanes at 520MHz in the Tegra 3 GPU, the measured triangle rates for the T604 leave quite a big question mark. The SGX543MP3@325MHz in the iPhone 5 tells a totally different story.

    To make it simpler: if the T604 at 533MHz yielded, say, ~7000 frames in GL2.5, Samsung might have opted for a T604@350MHz if that had been within their power budget.
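
    (In Python terms, assuming the GLBenchmark score scales roughly linearly with GPU clock -- a simplification that ignores bandwidth and CPU limits; the 7000-frame figure is the hypothetical from the sentence above.)

    Code:
        BASE_CLOCK_MHZ = 533.0
        BASE_FRAMES = 7000                # hypothetical GLBenchmark 2.5 score at 533MHz

        def scaled_frames(target_mhz):
            return BASE_FRAMES * target_mhz / BASE_CLOCK_MHZ

        print(round(scaled_frames(350)))  # ~4597 frames, at a considerably lower power point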
     
  10. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    2,078
    Likes Received:
    1,534
    I think it's safe to assume A7 will be the G6200/G6230 and A7X will be the G6400/G6430.
     
  11. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    That's 2 clusters for a smartphone design, then, and 4 clusters for a tablet design (all on a purely hypothetical basis), and that coming from Apple, which doesn't mind gigantic batteries in its tablets, nor minds die area (especially for the GPU) as much as other SoC designers.
     
  12. ltcommander.data

    Regular

    Joined:
    Apr 4, 2010
    Messages:
    616
    Likes Received:
    15
    For what it's worth, the requirements for Ice Storm are actually DX9L1 + shadow filtering support.

    The delay between the G6x00 and G6x30 announcements was 7 months. Is that close enough that they could have similar time-to-market in shipping devices, or is the G6x00 a 2013 part and the G6x30 more likely a 2014-timeframe part?
     
  13. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    I stand corrected. I recalled DX9 L3:

    http://www.futuremark.com/benchmarks/3dmark

     
  14. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    2,078
    Likes Received:
    1,534
    I'm kind of wondering whether the xx30 implementations aren't a custom implementation created at Apple's request (a la the 543MP4+ for the Vita) that IMG is offering to other partners either delayed at Apple's request, in a modified form, or with some other caveat to protect their relationship with Apple (a 9% stakeholder).

    I've not seen any indication of what the xx30 series adds over the xx00, however.
     
  15. ltcommander.data

    Regular

    Joined:
    Apr 4, 2010
    Messages:
    616
    Likes Received:
    15
    http://www.4gamer.net/games/144/G014402/20121118001/

    According to a Google translation, the "3" stands for extra "frame buffer compression logic".

    Apple designing custom CPUs does raise the possibility of Apple doing the same for GPUs. Making a GPU from scratch is a bit much, but modifying a PowerVR design, as Sony did, may make sense, as you say. The G6x30 seems to trade die area for more performance and/or reduced power consumption, which is in line with Apple's philosophy, but it's probably more a case of IMGTech offering a design that recognises those principles on their own merits, which may be of interest to many customers, rather than one specifically targeting Apple.

    I'm trying to think of a way to relate this back to nVidia/Tegra, but I can't, so I'll just stop here.
     
  16. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    Think of Project Denver.
     
  17. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,065
    Likes Received:
    7,026
  18. Helmore

    Regular

    Joined:
    Apr 5, 2010
    Messages:
    466
    Likes Received:
    0
    Seems like a smart move to me to do so. Should keep the price reasonable. I do wonder what kind of effect a yearly release will have on games.
     
  19. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    17,696
    Likes Received:
    7,702
    I'd be hugely surprised, as I don't think even the iPad 4's GPU comes close to Intel's integrated desktop GPUs (Core i3/i5/i7 integrated). And the distance from there to 7770/7850/7870-level performance (depending on which rumor you believe) is huge.

    Add to that that it'll likely still be using ARM-based parts designed for, at best, tablet duties (which means low power, though not as low as smartphones), and the likelihood becomes even lower that it'll come close to matching the next-gen consoles.

    Or think of it another way: the Ouya isn't even going to be able to match the X360, and that's a console using pre-2005 tech.

    Regards,
    SB
     
  20. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    There's also a huge difference in power envelopes between the iPad 4 GPU and Intel's current desktop SoC GPUs. Upcoming SoCs from Intel for tablets will use GenX GPUs, but I suspect that the core frequencies won't be anywhere near as high as in current desktops. The next-generation consoles from Microsoft and Sony sound like roughly 7700-class GPUs, with a projected range of 1.2-1.5 TFLOPs, give or take.

    The iPad 4 GPU is currently at roughly 72 GFLOPs, and in roughly 1-1.5 years that will most likely rise by a factor of about 3. The rough 5-year timeframe he sets might be aggressive, but over a bigger timespan it doesn't sound impossible.
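
    (As back-of-the-envelope Python, using only the figures from this and the previous paragraph; whether the roughly 3x-per-generation pace can be sustained is of course the big assumption.)

    Code:
        IPAD4_GFLOPS = 72.0
        GROWTH_FACTOR = 3.0                  # projected rise over ~1-1.5 years, as above
        CONSOLE_GFLOPS = (1200.0, 1500.0)    # ~1.2-1.5 TFLOPs range from the previous paragraph

        projected = IPAD4_GFLOPS * GROWTH_FACTOR   # ~216 GFLOPs
        print(f"projected mobile GPU: ~{projected:.0f} GFLOPs")
        print(f"remaining gap to consoles: ~{CONSOLE_GFLOPS[0]/projected:.1f}x to "
              f"{CONSOLE_GFLOPS[1]/projected:.1f}x")   # ~5.6x to 6.9x still to cover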

    The upcoming OGL_ES3.0 crop of GPUs will most likely last 4-5 years from their appearance until the succeeding generation of GPUs appears. Tablets will always have low power envelopes compared to whatever PC/notebook is available at any given time; however, it wouldn't surprise me if the GPU generation after this one even arrives with some ray-tracing capabilities.

    How about a comparison, then, between the original Xbox and the Tegra 4 GPU?
     