TimothyFarrar
Anyone have any speculation as to NVidia Tegra ULP GeForce architecture and how this will compare in power and performance to PowerVR SGX?
TimothyFarrar said:
Anyone have any speculation as to NVidia Tegra ULP GeForce architecture and how this will compare in power and performance to PowerVR SGX?

Ask me that on Wednesday/Thursday and hopefully I'll be able to answer a bit better than I can right now... I don't think it's possible to get truly objective *and* comparable power consumption data from handheld companies in practice, so for the sake of objectivity I don't think I'd ever want to comment on that.
And another page for performance:

Early-Z and fragment caching
- These are big computation and bandwidth savers
Ultra Efficient 5x Coverage Sampling Anti-Aliasing Scheme
- Mobile version of CSAA technology from GeForce
Not a tiling architecture
- Tiling works reasonably well for DX7-style content
- For DX9-style content the increased vertex and state traffic was a net loss
Not a unified architecture
- Unified hardware is a win for DX10 and compute
- For DX9-style graphics, however, non-unified is more efficient
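On the Early-Z bullet: a toy software-rasterizer sketch (purely illustrative, not NVIDIA's implementation) of why testing depth before shading saves computation — occluded fragments get rejected before the expensive fragment shader ever runs:

```python
# Toy model: early-Z rejects occluded fragments *before* shading them.
def render(fragments, shade, early_z=True):
    """fragments: list of (x, y, z); returns (framebuffer, shader invocations)."""
    zbuffer = {}      # (x, y) -> nearest depth seen so far
    color = {}
    invocations = 0
    for x, y, z in fragments:
        if early_z and z >= zbuffer.get((x, y), float("inf")):
            continue                  # occluded: skipped before shading
        c = shade(x, y, z)            # the "expensive" fragment shader
        invocations += 1
        if z < zbuffer.get((x, y), float("inf")):   # late depth test
            zbuffer[(x, y)] = z
            color[(x, y)] = c
    return color, invocations

frags = [(0, 0, 0.2), (0, 0, 0.8)]    # same pixel, drawn front-to-back
_, with_ez = render(frags, lambda x, y, z: 0xFF0000, early_z=True)
_, without = render(frags, lambda x, y, z: 0xFF0000, early_z=False)
print(with_ez, without)               # fewer shader runs with early-Z
```

The framebuffer ends up identical either way; only the amount of shading work differs, which is why the slide calls it a computation and bandwidth saver.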
Tegra APX can achieve:
- Over 40M triangles/sec
- Up to 600M pixels/sec
- Texture 240M pixels/sec

Run Quake 3 Arena:
- 45+ fps WVGA (800 x 480)
- 8x Aniso Texture Filtering
- 5x Coverage Sampling AA

i.e. it's 2 TMUs @ 120MHz with single-cycle 5xCSAA (which is 2xMSAA with 3 extra coverage samples).
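A quick sanity check of the "2 TMUs @ 120MHz" reading against the quoted numbers, assuming the usual rule of thumb of one bilinear texel per TMU per clock:

```python
# Back-of-the-envelope check of the quoted Tegra APX figures,
# assuming 1 texel per TMU per clock (a common rule of thumb).
clock_hz = 120e6
tmus = 2
texel_rate = tmus * clock_hz          # texels/s; matches "Texture 240M"

# Q3A claim: 45+ fps at WVGA. Texel budget per pixel per frame at that rate:
pixels = 800 * 480
fps = 45
texels_per_pixel_per_frame = texel_rate / (pixels * fps)
print(texel_rate / 1e6, round(texels_per_pixel_per_frame, 1))
```

So at 45 fps WVGA there are roughly 14 texture fetches available per pixel per frame, which has to cover both multitexturing and overdraw — plausible for Q3A-class content.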
Is that highest end variant or will there be another one?

That's the APX 2500, but it looks like the SKUs for the chip are being shuffled around a little bit (type 'APX 2600' into Google, look at the cached entry, and go to 'Specifications' to see what I mean), and I have no idea what the clock speeds for all of them will turn out to be, especially not for the 3D part.
The AA and AF seem a little insane to me given the ultra-tiny pixels of these portable screens, unless that feature set is for those looking to drive regular desktop-sized displays from portable devices.
So any idea how it will perform against PowerVR SGX?
So in the end we have to wait for some real life testing on devices based on those h/w platforms and running the same OS. It will be the only fair way of telling how they stand against each other.

Don't hold your breath...
And unfortunately not even 1 announced Tegra-based smartphone. It's not encouraging...

Hmm? These are three ODM phones coming out in 2H09: http://www.engadget.com/photos/nvidias-tegra-in-the-flesh/1365027/
Is that highest end variant or will there be another one?
Anyway, 5xCSAA is just fine for the screen sizes it's aimed at. The most interesting part is the 8x AF bit.
Before anyone says it, games on mobile devices hardly ever enable AA or AF (I think the latest Q3A mobile version has an option for enabling AA, though). I wonder if Tegra's TMUs are strong enough to handle AF, or if it's just a bandwidth-constrained scenario, as Q3A on mobile devices typically is.
As for the critical comments in the first quote: I've heard better excuses in my lifetime than those LOL
Of course, in the lot, the SGX 530 should be the one with the highest real-world efficiency out of that fillrate since it's a TBDR, doesn't need a Z-pass, etc. - in theory it's plausible that it might handle very small triangles a bit better too, but that's not a given.

Why do you think it might have an advantage with small triangles?
I remember the Tegra said to be based on the GeForce 6 architecture; filtering is cheap in transistors and fast on GeForce 6/7... though not 100% clean, which annoys me for old games on my desktop PC; but that should look better on a mobile screen.
the texture rate disappoints me, 240Mtexels.. only 33% better than a voodoo2
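Checking that "33% better" remark — assuming (from memory, so treat as an assumption) that Voodoo2's two TMUs ran at ~90 MHz for ~180 Mtexels/s single-pass:

```python
# "240 Mtexels.. only 33% better than a voodoo2"
# Assumption: Voodoo2 = 2 TMUs @ ~90 MHz => ~180 Mtexels/s.
voodoo2_texel_rate = 2 * 90e6
tegra_texel_rate = 240e6
improvement_pct = (tegra_texel_rate / voodoo2_texel_rate - 1) * 100
print(round(improvement_pct))  # -> 33
```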
I'm not interested in smartphones though, obviously they would clock it higher on a netbook?
I remember the Tegra said to be based on geforce 6 architecture

Yup, as Ailuros said though, 'based' doesn't mean all that much in the handheld world.
the texture rate disappoints me, 240Mtexels.. only 33% better than a voodoo2

But with Early-Z, and I'm not sure how much bigger than a Voodoo2 the die size actually is. Let's take a random number and say it's 8mm² (I estimated it once but have since forgotten the figure; at least this way it also applies to SGX) - that's on 65nm, and on 350nm that'd become roughly 256mm². Voodoo2 was implemented as 3 chips on 350nm, each with a 64-bit memory bus IIRC. Surely that couldn't be less than 128mm², and probably more. Given the difference in programmability, I sadly don't think it's all that surprising.
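The process-scaling step above is easy to reproduce. A naive sketch, assuming area scales with the square of the feature size (optimistic, since real layouts never shrink perfectly linearly) and taking the post's guessed 8 mm²:

```python
# Naive area scaling of a guessed 8 mm^2 die @ 65nm up to a 350nm process.
# Assumes area ~ (feature size)^2 -- a rough approximation only.
area_65nm_mm2 = 8.0               # the post's guessed number
scale = (350 / 65) ** 2           # ~29x
area_350nm_mm2 = area_65nm_mm2 * scale
print(round(area_350nm_mm2))      # ~232 mm^2, same ballpark as the quoted ~256
```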
I'm not interested in smartphones though, obviously they would clock it higher on a netbook?

In ARM-based netbooks? Theoretically they could. I'm not sure it really matters though; 240MPixels/s is enough for a pretty 3D user interface even at 1280x1024, but overclocking it by 20% isn't magically going to give you enough performance to do anything useful gaming-wise at that resolution.
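The "enough for a pretty 3D UI" claim checks out on a napkin, assuming a single full-screen layer per frame (no overdraw or blending passes, which real UIs would add):

```python
# How far 240 MPixels/s goes at netbook resolution, assuming one
# full-screen pass per frame (no overdraw -- a best-case assumption).
fill_rate = 240e6                  # pixels/s
w, h = 1280, 1024
single_pass_fps = fill_rate / (w * h)
print(round(single_pass_fps))      # ample for a UI, thin for a game
```

~180 single-pass fills per second leaves headroom for a compositor, but a game with a few layers of overdraw eats that budget very quickly, which is the point being made.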
3dcgi said:
Why do you think it might have an advantage with small triangles?

Well, TBH I was only thinking about pixel-shading-intensive cases (where you're probably going to be the most GPU-limited anyway) because of the shader's MIMD nature, but I couldn't get any info on whether they actually do anything interesting there. They keep most of their MIMD marketing centered around branching, obviously.
But none of them seems to be running WM. Instead they will probably use android and that means that we won't be able to test them thoroughly due to lack of benchmarking tools.

One slide mentioning those: http://www.engadget.com/2009/02/16/nvidias-tegra-jumps-on-the-android-bandwagon/

I think the Compal is probably WM, but I'm not sure. Ironically, I know which basebands are in them but I don't know the OS - oops? Either way you'll have WM/WinCE devices coming out in 2H09...
Well, TBH I was only thinking about pixel-shading-intensive cases (where you're probably going to be the most GPU-limited anyway) because of the shader's MIMD nature, but I couldn't get any info on whether they actually do anything interesting there. They keep most of their MIMD marketing centered around branching, obviously.

I'm not sure a MIMD advantage becomes any greater with small triangles. I guess it depends on the shader. A chip like the one Qualcomm bought from AMD probably branches at a quad granularity anyway, so the point is probably moot. Though I don't know for sure what the branch width is.
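To make the granularity point concrete, here's a toy cost model (pure illustration, not any specific chip's behaviour): if pixels must branch together in groups, a group that straddles a branch pays for both paths, and small triangles create lots of such edges.

```python
# Toy model of branch-granularity cost: if any pixel in a group takes a
# different path than its neighbours, the whole group executes both paths.
def shading_cost(mask, granularity, cost_a=10, cost_b=10):
    """mask: flat list of booleans (True = path A) for a row of pixels.
    granularity: how many adjacent pixels must branch together."""
    total = 0
    for i in range(0, len(mask), granularity):
        group = mask[i:i + granularity]
        paths = set(group)
        # a divergent group executes every path any of its pixels takes
        total += len(group) * (cost_a * (True in paths) + cost_b * (False in paths))
    return total

edge = [True, False] * 8           # worst case: 1-pixel-wide silhouette edge
print(shading_cost(edge, 1))       # per-pixel (MIMD-style) branching
print(shading_cost(edge, 4))       # quad-granularity branching: both paths
```

With per-pixel granularity each pixel pays for one path; at quad granularity every group in this worst case diverges and pays for both, doubling the shading cost along the edge — which is why branch width matters more as triangles (and thus coherent regions) shrink.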