POWERVR SGX520 is less than 2.6 mm² in TSMC's 65LP process.
That's a pretty small size; you could describe it as being "nano"-centric...
ARM's smallest Mali core that is OpenGL ES 2.0 compliant is the Mali200, and at 65nm it's 5 mm², which makes it just about twice the size of SGX520, and yet it is only slightly higher performing (9M triangles and 275M pixels vs. 7M triangles and 250M pixels).
The one thing missing from both spec sheets, of course, is power consumption.
Power is roughly proportional to area, so assuming Mali200 is as aggressive with its clock gating as SGX, Mali200 will consume ~2x the power. This does, however, ignore the fact that Mali isn't a deferred rendering device, so in the presence of overdraw its power consumption is likely to be higher per unit area for both core and IO.
John.
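The area argument can be put into numbers. This is only a minimal sketch, assuming the figures quoted in this thread (under 2.6 mm² for SGX520, 5 mm² for Mali200, both at 65nm) and the rough rule that power scales linearly with area when clock gating is equally effective. The function names are made up for illustration, and overdraw is not modelled at all.

```python
# Figures as quoted in the thread. The SGX520 area is an upper bound
# ("less than 2.6 mm2"), so the real ratios may favour it slightly more.
CORES = {
    "SGX520": {"area_mm2": 2.6, "mtris": 7.0, "mpixels": 250.0},
    "Mali200": {"area_mm2": 5.0, "mtris": 9.0, "mpixels": 275.0},
}


def area_ratio(a, b):
    """Area of core `a` relative to core `b`."""
    return CORES[a]["area_mm2"] / CORES[b]["area_mm2"]


def relative_power(a, b):
    """ASSUMPTION: power is proportional to area, so this is just the area ratio."""
    return area_ratio(a, b)


def per_mm2(name, key):
    """Quoted throughput figure divided by core area."""
    return CORES[name][key] / CORES[name]["area_mm2"]


print(f"Mali200 / SGX520 area:  {area_ratio('Mali200', 'SGX520'):.2f}x")
print(f"Estimated power ratio:  {relative_power('Mali200', 'SGX520'):.2f}x")
for name in CORES:
    print(f"{name}: {per_mm2(name, 'mtris'):.2f} Mtris/mm2, "
          f"{per_mm2(name, 'mpixels'):.1f} Mpixels/mm2")
```

With these inputs the area ratio comes out a little under 2, which is where the "~2x the power" estimate comes from, while the quoted throughput per mm² is clearly higher for SGX520.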
According to the feature list in this PDF, Mali is both tile-based and deferred rendering... but they also mention immediate rendering?
http://www.arm.com/miscPDFs/21863.pdf
Not only is clock gating important, i.e. turning off the bits that are not needed at any one time, but it may well be that one solution or the other inherently allows more of the chip to be turned off at any one time.
"They are a tile-based renderer; if anything, this offers less opportunity for clock gating than a deferred tile-based renderer, due to how the latter's pipeline fits together."
Yeah, but then again they're not unified, so that means more opportunities for clock gating. That doesn't necessarily mean lower *overall* power consumption, but it does have an effect on average power consumption per mm². Given all of these factors, the latter seems like a very problematic metric to use here!
"LOL, interesting point: make your core bigger with loads of logic that sits around idle so that you can claim lower power per mm², although even in an LP process leakage does need to be factored into this..."
A certain bright green company seems to have followed that train of thought to an extreme too!
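The point about padding a core with idle logic can be illustrated with toy numbers. This is a minimal sketch, assuming a hypothetical core where only the active logic burns dynamic power while every mm² of silicon leaks a small fixed amount. All values are invented for illustration and do not describe any real GPU.

```python
def power_metrics(active_area_mm2, idle_area_mm2,
                  dynamic_mw_per_mm2=20.0, leakage_mw_per_mm2=1.0):
    """Return (total power in mW, power per mm2) for a hypothetical core.

    ASSUMPTIONS: idle logic is perfectly clock gated (no dynamic power),
    and leakage is uniform across all silicon, gated or not.
    """
    total_area = active_area_mm2 + idle_area_mm2
    dynamic = active_area_mm2 * dynamic_mw_per_mm2  # only active logic switches
    leakage = total_area * leakage_mw_per_mm2       # all silicon leaks
    total = dynamic + leakage
    return total, total / total_area


lean = power_metrics(active_area_mm2=3.0, idle_area_mm2=0.0)
padded = power_metrics(active_area_mm2=3.0, idle_area_mm2=3.0)

print("lean:   total %.1f mW, %.1f mW/mm2" % lean)
print("padded: total %.1f mW, %.1f mW/mm2" % padded)
```

In this toy model the padded core looks almost twice as good per mm² even though its total power is higher, which is why power per unit area is a shaky metric on its own.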
"Perhaps I shouldn't have used the term 'unit area'..."
Heheh, indeed you probably shouldn't have!
However I would argue the main advantage of non-unified hardware from a power POV is to be able to use FP24 instead of FP32 in the pixel shader. If you don't care about having MIMD everywhere, it also allows you to have higher branching granularity in the PS than in the VS; how much that helps you depends a lot on how naive your architecture is though, ofc... (and what real-world handheld applications are & will be like)
"Going forward FP24 won't cut it in the PS. Part of the reason for this is that these devices are already being applied to UIs running at HD resolutions, making FP24 marginal for texture calculations."
Obviously FP24 per se shouldn't be a problem for high target resolutions, so I presume you're thinking of very large textures? I have some difficulty believing this is a big problem even at 1080p, but heh!
"Throw GP-GPU into the mix and FP24 doesn't cut it at all. Coarser branch granularity also falls foul of GP-GPU, and it is possible to architect for the finest level of branch granularity without adding significant area to the design, imo."
I definitely agree with both points, but it's not my fault PowerVR's competitors don't seem able to figure out how to implement efficient MIMD to save their lives! Obviously Imagination's processor/DSP heritage with META helps a lot there.
FP24 gives you 15 bits of mantissa. 1:1 HD textures are 1920 wide, so you need 11 bits to achieve texel-level addressing, which leaves you with 4 bits of sub-texel accuracy. I consider this borderline, but artifacts do depend on the application.
Incidentally, I could be wrong, but I thought ARM's fragment shaders were restricted to FP16, or maybe that was the old Bitboys part <shrugs>...
John.
Yes, but if it's applied 1:1 I'm not sure I see how it could go wrong? Surely you don't need to rotate it or anything like that... (or even if you did for a special effect, it'd go fast enough that nobody would ever notice there's a 0.25 pixel error)
I think that was Bitboys, yeah...
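The FP24 precision estimate in this exchange is simple bit counting. This is a minimal sketch, taking the 15 mantissa bits quoted in the thread as given (other FP24 layouts split the bits differently) and assuming, as a simplification, that a texture coordinate must resolve individual texels across the full texture width. The function names are illustrative only.

```python
def address_bits(texture_width):
    """Bits needed to give every texel column a distinct integer index."""
    return (texture_width - 1).bit_length()


def subtexel_bits(mantissa_bits, texture_width):
    """Mantissa bits left over for sub-texel positioning.

    A negative result means the format cannot even address every texel.
    """
    return mantissa_bits - address_bits(texture_width)


def subtexel_step(mantissa_bits, texture_width):
    """Smallest representable step, as a fraction of one texel."""
    return 2.0 ** -subtexel_bits(mantissa_bits, texture_width)


# 15 mantissa bits is the figure quoted in the thread for FP24.
print(address_bits(1920))        # bits for a 1920-wide texture
print(subtexel_bits(15, 1920))   # bits left for sub-texel accuracy
print(subtexel_step(15, 1920))   # step size in texels

# For comparison: IEEE 754 single precision stores 23 explicit mantissa bits.
print(subtexel_bits(23, 1920))
```

Under these assumptions a 1920-wide texture uses 11 bits for addressing, leaving 4 bits, or steps of 1/16 of a texel. Whether that is visible depends on the application, as noted in the thread.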