NVidia Tegra ULP GeForce Speculation

Discussion in 'Mobile Graphics Architectures and IP' started by TimothyFarrar, Feb 13, 2009.

  1. TimothyFarrar

    Regular

    Joined:
    Nov 7, 2007
    Messages:
    427
    Likes Received:
    0
    Location:
    Santa Clara, CA
    Does anyone have any speculation about the NVidia Tegra ULP GeForce architecture and how it will compare in power and performance to PowerVR SGX?
     
  2. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Ask me that on Wednesday/Thursday and hopefully I'll be able to answer a bit better than I can right now... :) I don't think it's possible in practice to get truly objective *and* comparable power consumption data from handheld companies, so for the sake of objectivity I don't think I'd ever want to comment on that.

    Here's what one of their presentations says architecture-wise:
    And another page for performance:
    i.e. it's 2 TMUs @ 120MHz with single-cycle 5xCSAA (which is 2xMSAA with 3 extra coverage samples).
     
  3. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,445
    Likes Received:
    181
    Location:
    Chania
    Is that the highest-end variant or will there be another one?

    Anyway 5xCSAA is just fine for the screen sizes it's aimed for. The most interesting part is the 8x AF bit.

    Before anyone says it, games on mobile devices hardly ever enable AA or AF (I think the latest Q3A mobile version has an option for enabling AA, though). I wonder if Tegra's TMUs are strong enough to handle AF or if it's just a bandwidth-constrained scenario, as Q3A on mobile devices typically is.

    As for the critical comments in the first quote: I've heard better excuses in my lifetime than those LOL :D
     
  4. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,934
    Likes Received:
    2,264
    Location:
    Germany
    Would 1x MSAA +4 Coverage-Samples save any significant portion of the ROPs - apart from being some kind of "lame"?
     
  5. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    That's the APX 2500, but it looks like the SKUs for the chip are being shuffled around a little bit (type 'APX 2600' in Google, look at the cached entry, and go to 'Specifications' to see what I mean) and I have no idea what the clock speeds for all of them will turn out to be, especially not for the 3D part.

    Chip-wise, this will likely be the highest-end 65nm chip before they go to 40nm. There was a lower-end chip in the pipeline according to what I heard several times, but I don't know what's happening to that. If it still exists, you'd expect it to have started sampling some time ago.

    As for 1xMSAA + 4xCSAA, I don't think that makes theoretical sense given how CSAA works... :) The CSAA samples need to be able to choose between at least two colours, unless you're thinking of doing some weird semi-but-not-really-Quincunx stuff that uses the data of adjacent pixels? That'd seem very complex to me, and I hope nobody ever gets the idea of doing something so crazy...
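
    To illustrate why coverage samples need at least two colours to choose from, here is a deliberately simplified toy model of a coverage-sample resolve. It is only a sketch: the function name, the slot-index representation and the weighting scheme are assumptions made for illustration, not a description of how NVIDIA's hardware actually resolves CSAA.

```python
from collections import Counter

def resolve_csaa(stored_colours, coverage_refs):
    """Toy resolve (hypothetical model, not the real hardware path).

    stored_colours: list of RGB tuples, one per stored colour sample.
    coverage_refs:  one slot index per coverage sample, saying which
                    stored colour that sample belongs to.
    The pixel is the average of the stored colours, weighted by how many
    coverage samples reference each slot.
    """
    counts = Counter(coverage_refs)
    total = len(coverage_refs)
    return tuple(
        sum(stored_colours[slot][ch] * n for slot, n in counts.items()) / total
        for ch in range(3)
    )

red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# Two stored colours: five coverage references can express a 3:2 blend.
edge_pixel = resolve_csaa([red, blue], [0, 0, 0, 1, 1])

# One stored colour: every coverage sample can only point at slot 0,
# so the extra coverage information cannot change the result.
single_colour = resolve_csaa([red], [0, 0, 0, 0, 0])

print(edge_pixel, single_colour)
```

    In this toy model the single-colour case always returns the stored colour unchanged, which is the point being made above: without a second colour, or data borrowed from neighbouring pixels, the coverage samples have nothing to blend between.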
     
  6. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,445
    Likes Received:
    181
    Location:
    Chania
  7. TimothyFarrar

    Regular

    Joined:
    Nov 7, 2007
    Messages:
    427
    Likes Received:
    0
    Location:
    Santa Clara, CA
    The AA and AF seem a little insane to me given the tiny pixels of these portable screens, unless that feature set is for those looking to run regular desktop-sized displays from portable devices.
     
  8. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,445
    Likes Received:
    181
    Location:
    Chania
    The main issue with current mobile phone games is that textures are usually of abysmally low quality due to memory footprint constraints. Anything over bilinear will improve things slightly, but it won't make a huge difference either.

    I've played Q3A mobile on a 320*240 screen and I can't make out what the texture sizes are in that one, yet they're definitely higher than in most other mobile games. I still noticed the lack of AF, though.

    With AA it gets a wee bit trickier. One might say that it's not absolutely necessary for games on a small form factor device (which is highly debatable IMO), but you will need it for things like advanced UIs, and of course for OpenVG, where you'd need something around 16x sample AA to get the content to a decently anti-aliased level. A menu with rolling icons is enough to show you that AA can make a significant difference even on a < 3" screen.
     
  9. iwod

    Newcomer

    Joined:
    Jun 3, 2004
    Messages:
    179
    Likes Received:
    1
    So any idea how it will perform against PowerVR SGX?

    Looking at that feature table, it looks like a stripped-down version of old-tech GeForce...
     
  10. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,445
    Likes Received:
    181
    Location:
    Chania
    If their performance rates are close to reality, and since Arun says it's an APX2500 and their largest core on 65nm, you could eventually compare it against SGX540.
     
  11. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    SGX 530 in the OMAP3530/Pandora is a 110MHz 1xTMU GPU, GoForce in the 65nm Tegra is a 120MHz 2xTMU GPU. The 45nm OMAP36xx seems to have a 200MHz SGX 530, FWIW, while the OGL ES 2.0 Imageon in the Qualcomm & Freescale platforms seems to be a 133MHz 1xTMU GPU.
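
    As a rough sketch, the clock and TMU figures above translate into theoretical peak texel rates as follows. The helper name is made up for this example, and it assumes each TMU delivers one filtered texel per clock, which the thread does not confirm for every part.

```python
def peak_texel_rate_mtexels(clock_mhz, tmus):
    """Theoretical peak in MTexels/s, assuming 1 texel per TMU per clock."""
    return clock_mhz * tmus

# (clock in MHz, TMU count) as quoted in the post above.
gpus = {
    "SGX 530 (OMAP3530)": (110, 1),
    "Tegra (65nm)": (120, 2),
    "SGX 530 (OMAP36xx)": (200, 1),
    "Imageon (Qualcomm/Freescale)": (133, 1),
}

rates = {name: peak_texel_rate_mtexels(*spec) for name, spec in gpus.items()}

# Print from highest to lowest theoretical rate.
for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} {rate:4d} MTexels/s")
```

    These are paper numbers only; they say nothing about how much of that rate each architecture sustains in practice.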

    Of course, in the lot, the SGX 530 should be the one with the highest real-world efficiency out of that fillrate since it's a TBDR, doesn't need a Z-Pass, etc. - in theory it's plausible that it might handle very small triangles a bit better too, but that's not a given.

    Things get more complicated when you consider AA & AF, since you'd assume that a TBDR could also be more efficient at AA. But then again, NV has 5xCSAA (i.e. 2xMSAA + 3 coverage samples); I asked for a demo of Quake3 at MWC, and at WVGA on NV's prototype I really couldn't see much if any aliasing with 5xCSAA. The demo moved a tad too fast to have a very clear idea of the AF, although it seemed OK and didn't shimmer. Talking of AF, obviously that's one of the cases where Tegra should theoretically do best against an SGX 530. It isn't clear to me whether they'll increase the number of TMUs in the 40nm chip; if not, obviously SGX 540 is pretty much superior in every way.

    As always for handheld architectures, we know basically nothing about ALU performance (besides the fact that Tegra obviously isn't unified and is scalable further downwards in both VS & PS)... So in fact talking about performance like I did above doesn't make a whole lot of sense, but you have to make do with what you have. And of course it'd be interesting if we could have a clear die size number, but there isn't a single '3D block' in the Tegra die shot; it's at least 3 blocks (maybe more), and the display pipeline might be included in one of the three. So it's pretty damn hard to get any comparable figure, although I did estimate it in a pretty credible way once.

    BTW, semi-OT: I know we had some discussions in the past about the performance of the Imageon core in the STMicro STn8820, and well, I'm still not 100% sure but I can safely claim it doesn't matter because it will probably never sell a single unit... ST-Ericsson didn't even show it as part of their line-up/roadmap at MWC despite including the STn8815 in it. I was pretty happy about the little that ST-Ericsson did show though, FWIW. Definitely a big player in the handheld world and worth watching for...
     
  12. Wishmaster

    Newcomer

    Joined:
    Nov 16, 2008
    Messages:
    238
    Likes Received:
    0
    Location:
    Warsaw, Poland
    So in the end we have to wait for some real life testing on devices based on those h/w platforms and running the same OS. It will be the only fair way of telling how they stand against each other.

    For now we only know about the upcoming Toshiba TG01 (Snapdragon's Imageon Z430).
    The Palm Pre (SGX530) is running WebOS, so comparing it against WinMo will probably be impossible (lack of benchmarking tools).
    And unfortunately there isn't even one announced Tegra-based smartphone. It's not encouraging...

    I think the models that were supposed to be based on WinMo 7 weren't moved to WinMo 6.5 because of its older WinCE version (WinCE 5 underlies WM 6/6.1/6.5, while the Tegra prototypes and the future WM7 use WinCE 6).
    But I'd be happy to be proven wrong.
     
  13. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Don't hold your breath...

    Hmm? These are three ODM phones coming out in 2H09: http://www.engadget.com/photos/nvidias-tegra-in-the-flesh/1365027/
    One slide mentioning those: http://www.engadget.com/2009/02/16/nvidias-tegra-jumps-on-the-android-bandwagon/

    OEM phones are still slated for 2H09, although I guess at this point we're talking mid/late Q4 for availability for most (all?) of those. No idea if we'll see announcements at CTIA. MID/Netbooks are slated for Q3, should see announcements at Computex. WinMob7 delays probably hurt them, although they claim what hurt them the most was just the economic crisis in general reducing expenditures at their customers and project delays throughout the industry. I've heard Wolfson Micro (audio codecs & analogue etc.) basically say in a CC that the number of delays in the industry is substantially above average right now, so at least part of the problem isn't NV's fault.
     
  14. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    I remember Tegra being said to be based on the GeForce 6 architecture; filtering is cheap in transistors and fast on GeForce 6/7, though not 100% clean, which annoys me in old games on my desktop PC. That should look better on a mobile screen, though.

    The texture rate disappoints me: 240 MTexels/s is only 33% better than a Voodoo2 :razz:
    I'm not interested in smartphones though; presumably they would clock it higher in a netbook?
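
    For reference, the 33% figure can be reproduced with a quick calculation. The Voodoo2 numbers used here (90 MHz, 2 TMUs) are an assumption brought in for the example and are not stated in the thread.

```python
# Tegra figures as quoted earlier in the thread: 2 TMUs at 120 MHz.
tegra_mtexels = 120 * 2

# Assumption (not from the thread): Voodoo2 at 90 MHz with 2 TMUs.
voodoo2_mtexels = 90 * 2

advantage = tegra_mtexels / voodoo2_mtexels - 1
print(f"Tegra peak texel rate advantage over Voodoo2: {advantage:.0%}")
```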
     
  15. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,439
    Likes Received:
    280
    Why do you think it might have an advantage with small triangles?
     
  16. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,445
    Likes Received:
    181
    Location:
    Chania
    'Based' is quite a vague term; your die area is as limited as it can be on chips for the handheld/mobile market, and each IHV has to set its priorities. Filtering is anything but "cheap" for such minuscule cores, and if you look at any mobile chip's floorplan it's quite obvious how much die area a single TMU alone can take up.

    That said when you're bandwidth limited AF can come eventually for free.

    They can scale frequencies as well as unit counts for a higher-end core. That said, considering Ion already has a 9400 in it, I'd rather think that a next-generation Ion 2 would come from the future IGP corner rather than the handheld chip area.
     
  17. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Yup, as Ailuros said though, 'based' doesn't mean all that much in the handheld world.

    But that's with Early-Z, and I'm not sure how much bigger than a Voodoo2 the die actually is. Let's take a random number and say it's 8mm² (I estimated it once, but that's not even the right number because I've forgotten it; at least this way it also applies to SGX) - that's on 65nm, and on 350nm that'd become 256mm². Voodoo2 was implemented as 3 chips on 350nm, each with a 64-bit memory bus IIRC. Surely that couldn't be less than 128mm², and probably more. Given the difference in programmability, I don't think it's all that surprising, sadly.
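
    A small sketch of the area scaling behind that figure, under two stated assumptions: ideal scaling with the square of the feature size, and a rule of thumb in which each full node step doubles the area. The node list and function names are illustrative, and reading the 256mm² figure as five doublings is an interpretation rather than something the post spells out.

```python
def scale_area_ideal(area_mm2, from_nm, to_nm):
    """Ideal scaling: area grows with the square of the feature size."""
    return area_mm2 * (to_nm / from_nm) ** 2

def scale_area_by_node_steps(area_mm2, steps):
    """Rule of thumb: every full node step back doubles the area."""
    return area_mm2 * 2 ** steps

# Hypothetical 8 mm^2 block on 65nm, as in the post above.
ideal = scale_area_ideal(8.0, from_nm=65, to_nm=350)

# Assumed sequence of full nodes between 350nm and 65nm (five steps).
nodes = [350, 250, 180, 130, 90, 65]
stepped = scale_area_by_node_steps(8.0, steps=len(nodes) - 1)

print(f"ideal square-law scaling: {ideal:.0f} mm^2")
print(f"five node doublings:      {stepped:.0f} mm^2")
```

    The two estimates land in the same ballpark (roughly 232 versus 256), so the conclusion does not depend much on which convention is used.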

    In ARM-based netbooks? Theoretically they could. I'm not sure it really matters though; 240MPixels/s is enough for a pretty 3D user interface even at 1280x1024, but overclocking it by 20% isn't magically going to give you enough performance to do anything useful gaming-wise at that resolution.
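
    To put that in numbers, here is a rough sketch of how many full-screen layers of pixels a given fill rate can draw per frame. The 60 fps target is an assumption for the example, and the function name is made up.

```python
def layers_per_frame(fill_mpixels_per_s, width, height, fps):
    """Full-screen layers (overdraw budget) that fit in one frame."""
    return fill_mpixels_per_s * 1e6 / (width * height * fps)

# 240 MPixels/s at 1280x1024, assuming a 60 fps target.
base = layers_per_frame(240, 1280, 1024, fps=60)

# The same chip with a 20% higher clock.
overclocked = layers_per_frame(240 * 1.2, 1280, 1024, fps=60)

print(f"stock:       {base:.2f} layers per frame")
print(f"overclocked: {overclocked:.2f} layers per frame")
```

    Roughly three layers per frame is plenty for compositing a 3D user interface, while a 20% overclock adds well under one extra layer, which is far from what a game with heavy overdraw or multi-texturing would need at that resolution.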

    Well, TBH I was only thinking about pixel-shading-intensive cases (where you're probably going to be the most GPU-limited anyway) because of the shader's MIMD nature, but I couldn't get any info on that, so I have no idea whether they do anything interesting there. They keep most of their MIMD marketing centered around branching, obviously.
     
  18. Wishmaster

    Newcomer

    Joined:
    Nov 16, 2008
    Messages:
    238
    Likes Received:
    0
    Location:
    Warsaw, Poland
  19. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    I think the Compal is probably WM, but I'm not sure. Ironically, I know which basebands are in them but I don't know the OS - oops? :D Either way you'll have WM/WinCE devices coming out in 2H09...
     
  20. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,439
    Likes Received:
    280
    I'm not sure a MIMD advantage becomes any greater with small triangles. I guess it depends on the shader. A chip like the one Qualcomm bought from AMD probably branches on a quad granularity anyway so the point is probably moot. Though I don't know for sure what the branch width is.

    If someone packs pixels from different quads they're likely to have an advantage.
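
    A toy cost model can make the branch granularity point concrete. This is only a sketch under simplifying assumptions: pixels are grouped in fixed-size batches, a divergent batch executes both sides of the branch across all of its lanes, and the per-path costs are arbitrary. None of the names or numbers describe a real GPU.

```python
def shading_cost(branch_taken, group_size, cost_taken, cost_not_taken):
    """Lane-cycles spent when pixels branch in lockstep groups.

    branch_taken: one bool per pixel (True = takes the expensive path).
    group_size:   1 models per-pixel (MIMD-style) branching,
                  4 models branching at quad granularity.
    """
    total = 0
    for start in range(0, len(branch_taken), group_size):
        group = branch_taken[start:start + group_size]
        if any(group):        # at least one lane needs the taken path
            total += cost_taken * len(group)
        if not all(group):    # at least one lane needs the other path
            total += cost_not_taken * len(group)
    return total

# Hypothetical branch outcomes for 8 pixels and arbitrary path costs.
pixels = [True, False, False, False, True, True, True, True]

per_pixel = shading_cost(pixels, group_size=1, cost_taken=10, cost_not_taken=2)
per_quad = shading_cost(pixels, group_size=4, cost_taken=10, cost_not_taken=2)

print(f"per-pixel branching: {per_pixel} lane-cycles")
print(f"quad branching:      {per_quad} lane-cycles")
```

    The model ignores partially filled quads, which is where small triangles hurt most; packing pixels from different quads into the same batch, as suggested above, is a separate optimisation that this sketch does not capture.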
     