NVIDIA Tegra Architecture

How did you come to such a conclusion?
Maybe you know something we don't?
These 72 FPUs should be clocked at 650-700 MHz, considering the 20x FP performance over Tegra 2 from the leaked slide; that's roughly 100 GFLOPS, which should be enough to keep up with something like the SGX554MP4 in the iPad 4 in GLB2.5.
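That estimate is easy to sanity-check. Assuming 72 FPUs each doing one multiply-add (2 FLOPs) per clock, which is the usual basis for peak-FLOPS numbers but not confirmed for Tegra 4, a minimal sketch:

```python
# Rough peak-FLOPS estimate for the rumored Tegra 4 GPU.
# Assumptions (not confirmed by NVIDIA): 72 FPUs, 2 FLOPs per
# FPU per clock (one MAD), 650-700 MHz clock from the leaked slide.
def peak_gflops(fpus: int, clock_mhz: float, flops_per_fpu: int = 2) -> float:
    """Peak throughput in GFLOPS: units * FLOPs/clock * clock (GHz)."""
    return fpus * flops_per_fpu * clock_mhz / 1000.0

print(peak_gflops(72, 650))  # -> 93.6
print(peak_gflops(72, 700))  # -> 100.8
```

So the "roughly 100 GFLOPS" figure only works out at the top of that 650-700 MHz range.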


What? 100GFlops?

quoting Anandtech:
NEW: the shaders aren't unified, the majority are 20-bit pixel shader cores though. No idea on the ratio yet.

Disappointing!

No wonder they showed no GLB2.5 scores.
 
What? 100GFlops?

quoting Anandtech:


Disappointing!

No wonder they showed no GLB2.5 scores.

Theoretical peak FLOPS have what exactly to do with pixel shader precision? I would have bet on at least USC ALUs; as it stands I'm betting on 4*Vec4 FP20 PS ALUs and 2*Vec4 VS ALUs, 1 TMU per Vec4 ALU, and of course coverage sampling AA :LOL:

Peak theoretical GFLOPS might even be that high IF the frequency is as high as he claims; if performance then doesn't at least break even with the iPad 4, I for one won't be the one who expected too much ;)
 
Theoretical peak FLOPS have what exactly to do with pixel shader precision? I would have bet on at least USC ALUs; as it stands I'm betting on 4*Vec4 FP20 PS ALUs and 2*Vec4 VS ALUs, 1 TMU per Vec4 ALU, and of course coverage sampling AA :LOL:

16-bit FLOPS then :cry:
 
16-bit FLOPS then :cry:

A large portion FP20 and a smaller one FP32 for the VS ALUs; however, since applications don't strictly require FP32 everywhere: floating point is still floating point, irrespective of precision.

OGL_ES is quite specific about where lowp, mediump and highp should be used; since their vertex shaders are inevitably FP32, they've got the majority of highp recommendations covered. The question now is what the OGL_ES3.0 requirements exactly will be; by the sound of Tegra 4 I'm willing to bet that the minimum won't be FP32 :p
 
The question now is what the OGL_ES3.0 requirements exactly will be; by the sound of Tegra4 I'm willing to bet that the minimum won't be FP32 :p
The spec is public. In GLSL ES 3.00 highp support is required in both vertex and fragment shaders and must be FP32, while mediump must be at least FP16.
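The practical gap between those two minimums is easy to illustrate. FP16 carries an 11-bit significand (10 stored + 1 implicit), so it can only represent integers exactly up to 2048, while FP32's 24-bit significand reaches 16,777,216. A quick sketch using NumPy's float16/float32 types (an illustration of the precision tiers, not of Tegra's actual FP20 format, which NumPy has no type for):

```python
import numpy as np

# mediump must be at least FP16; highp must be FP32 (GLSL ES 3.00).
# FP16: 11-bit significand -> ulp at 2048 is 2, so 2049 rounds away:
assert float(np.float16(2049.0)) == 2048.0
# FP32: 24-bit significand holds 2049 exactly:
assert float(np.float32(2049.0)) == 2049.0

# Exact-integer ranges implied by the significand widths:
print(2 ** 11)  # 2048      (FP16)
print(2 ** 24)  # 16777216  (FP32)
```

Tegra's FP20 pixel pipeline would sit somewhere between these two, which is exactly why precision qualifiers matter for conformance.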
 
The spec is public. In GLSL ES 3.00 highp support is required in both vertex and fragment shaders and must be FP32, while mediump must be at least FP16.

Meaning that the ULP GF in Tegra4 won't reach OGL_ES3.0?
 
That already seems out of the question due to the lack of unified shaders.

Great :rolleyes: Well, is there anything really interesting besides the i500 in the whole Tegra 4 enchilada, or am I the only one here who's bored to death? Oh wait, there's Project Shield... :oops: arggghhh :rolleyes:
 
What? 100GFlops?

quoting Anandtech:


Disappointing!

No wonder they showed no GLB2.5 scores.

Yes, this is surprising (and one has to wonder whether NVIDIA is saving its more modern, more powerful GPU architecture to go up against next-gen Rogue/Mali/Adreno/Radeon, etc.), but at the end of the day it now makes sense where the 6x GPU performance improvement over Tegra 3 comes from. It appears (though it has not yet been confirmed) that Tegra 4 has 6x the pixel shader execution units (48 vs. 8) and 6x the vertex shader execution units (24 vs. 4), for a grand total of 72 pixel/vertex shader execution units in Tegra 4 vs. 12 in Tegra 3.
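The unit-count arithmetic above (rumored figures, not confirmed by NVIDIA) checks out as a clean 6x across both shader types:

```python
# Rumored (unconfirmed) execution-unit counts from the post above.
tegra3 = {"pixel": 8, "vertex": 4}
tegra4 = {"pixel": 48, "vertex": 24}

for kind in ("pixel", "vertex"):
    print(f"{kind}: {tegra4[kind]} vs {tegra3[kind]} = "
          f"{tegra4[kind] // tegra3[kind]}x")   # 6x for both

print(sum(tegra4.values()), "vs", sum(tegra3.values()))  # 72 vs 12
```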

Up to 6x the GPU performance of Tegra 3 is quite significant, and should be enough to put NVIDIA at or near the top of the heap in GPU performance among the highest-performing mobile/handheld SoCs on the market today. Top that off with what is arguably the highest CPU performance in a mobile/handheld SoC, plus some of the new features introduced in Tegra 4, and NVIDIA can legitimately claim that Tegra 4 is the world's fastest mobile processor.
 
Is this going to work power-wise? I thought Anandtech's last article had the A15 in the Exynos sucking up a lot of juice.
It's also a question of what clock you're going to run this thing at, right? Is it a given that an A15 does worse in perf/W and perf/mm²? I imagine that, with very high maximum clock speeds, you can save quite a bit of power by clocking down and lowering VDD as well.
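The clock-down-and-lower-VDD argument follows from the usual dynamic-power relation for CMOS, P ~ C·f·V². A toy sketch with purely illustrative ratios (not measured A15 figures):

```python
# Dynamic CMOS power scales roughly as P ~ f * V^2, so dropping
# frequency AND supply voltage together saves power super-linearly.
# Ratios below are illustrative, not measured A15 numbers.
def rel_dynamic_power(f_ratio: float, v_ratio: float) -> float:
    """Dynamic power relative to nominal, given clock and VDD ratios."""
    return f_ratio * v_ratio ** 2

# e.g. 70% clock at 85% VDD:
print(rel_dynamic_power(0.7, 0.85))  # ~0.51, roughly half the power
```

That's the trade-off the post is getting at: a core binned for very high peak clocks can be run well inside its voltage/frequency curve when thermals demand it.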

As long as the possibility is there to reasonably trade off performance vs. power consumption, it doesn't have to be a problem.
 
nVidia boasted that Tegra 3 enabled new-era PCs, equipped with Windows RT, that would never use a fan/heatsink like conventional PCs.

Although Project Shield is a different product with different priorities and even though the fan/heatsink is mostly a non-issue, it's at least a little amusing that they'd be the company to release a mobile device needing one.
 
Late to the discussion, but Tegra 4 doesn't have an integrated modem and it doesn't have an SM4 GPU. What the fuck are they playing at?!? What were they doing for the past year? How hard can it be to stuff 80 or so Kepler ALUs into this and integrate the baseband?

Now comes the long wait for Tegra 5, where we just have to hope Nvidia becomes a real competitor to Qualcomm, because right now they aren't even turning up to the smartphone race with a quad A15 and no integrated baseband.
 