ARM Mali-400 MP

Discussion in 'Mobile Graphics Architectures and IP' started by Rob Evans, Jun 2, 2008.

  1. tangey

    Veteran

    Joined:
    Jul 28, 2006
    Messages:
    1,527
    Likes Received:
    278
    Location:
    0x5FF6BC
    Are the current SE Omap3 phones (Satio etc.) using the U380 platform, i.e. have they integrated their HSPA along with an Omap3430 into a one-chip solution, or did they drop this platform and go with separate Omap3 + HSPA?

    http://www.ericsson.com/ericsson/press/releases/20080208-1189711.shtml

    Also, whatever happened to the U500 platform, which was ARM11 + Mali200? Did that just get completely dropped?

    http://www.ericsson.com/ericsson/press/releases/20080206-1188885.shtml

    Both platforms were announced Feb '08.
     
  2. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    U380 was a lie; it was two chips in a package (I'd guess OMAP3430 and M340). I do believe it's used in the Satio, though I can't seem to find the presentation where they indicated that anymore. U500 probably got canned in favour of the U8500 - it was a very weird architecture with 3 ARM11 cores (1 for apps, 1 for modem, 1 for multimedia iirc) and I honestly doubt it would have been very impressive.
     
  3. roninja

    Regular

    Joined:
    Feb 9, 2002
    Messages:
    268
    Likes Received:
    0
    Thought U8500 was due to be announced at MWC, but it did not make an appearance. It will be interesting to see whether this part heads Nokia's way or OMAP4 beats them to it.
     
  4. Grumpy

    Newcomer

    Joined:
    May 4, 2006
    Messages:
    18
    Likes Received:
    0
    Location:
    A cold place
  5. Lazy8s

    Veteran

    Joined:
    Oct 3, 2002
    Messages:
    3,100
    Likes Received:
    19
    A fun looking "UI playground" concept is presented, controlled by multi-touch where icons can be 3D objects and can interact under a physics model, running on a Mali 200 platform.

    http://www.youtube.com/watch?v=g0bwMCe6IaA
     
  6. rbaker

    Newcomer

    Joined:
    Feb 15, 2010
    Messages:
    11
    Likes Received:
    0
    Location:
    United Kingdom
    Nice demo!

    Got a chance to see this at GDC in ARM's booth. Very nice demo!

    Behind the curtain the panel was connected to a notebook PC.
     
  7. Lazy8s

    Veteran

    Joined:
    Oct 3, 2002
    Messages:
    3,100
    Likes Received:
    19
    Was the Mali200 clocked around 300 MHz?

    That's higher than a mobile phone implementation would use, I'd expect.

    More footage of the Canvas demo is shown. Performance on the high resolution display is nice: the video-mapped-to-the-cube getting thrown around as a 3D object and colliding with the pins was a good touch.

    http://www.youtube.com/watch?v=rAiK8jrI8CI&sns=em
     
  8. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
  9. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    Where are the ES 2.0 results? I can't find them on a quick look. Or are they MIA?
     
  10. Rys

    Rys Graphics @ AMD
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,174
    Likes Received:
    1,545
    Location:
    Beyond3D HQ
    Looks like a bug in their glReadPixels().
     
  11. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Now that Nufront's chip is announced as having Mali 400MP and Samsung Orion is rumored to have it, I wanted to open this thread up again.

    I had a bunch of stuff here, but I'm seeing more that it's basically superseded by this document: http://infocenter.arm.com/help/topic/com.arm.doc.dui0363d/DUI0363D_opengl_es_app_dev_guide.pdf

    Apparently the ALUs are VLIW, SIMD, and 32-bit float. It also has hardware support for 16-bit float: it appears to suggest using these to save bandwidth post-geometry, not to improve computational throughput. One Mali400 ALU looks way more flexible/powerful than a USSE ALU (dunno as much about USSE2), more comparable to that of z430. So a Mali400MP should have pretty comparable computational performance.
     
    #131 Exophase, Sep 15, 2010
    Last edited by a moderator: Sep 15, 2010
  12. JohnH

    Regular

    Joined:
    Mar 18, 2002
    Messages:
    595
    Likes Received:
    18
    Location:
    UK
    That's the geometry processor not the fragment shader. Frag shader is apparently FP24.

    John.
     
  13. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Yeah okay, I missed that it just said "geometry processor." Do you have a source on the fragment shader being FP24?

    By the way, earlier in this thread you mentioned that 24-bit gives you 15 bits of fractional data. If this is the usual FP24 implementation, which is like IEEE-754 float, i.e. 1.8.15 for sign, exponent, and fractional portion, then the effective absolute resolution is really minimally 16 bits. Normalized floats have an implicit higher-order 1 bit; effectively, this is encoded by the exponent. So with 11-bit texture addressing you should have an effective 5 fractional bits at the full index magnitude.
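
    A quick way to check that arithmetic, as a sketch only: the helper `ulp` below is hypothetical, assumes a plain binary float with an implicit leading 1, and ignores exponent range limits. It computes the spacing between neighbouring values near the largest 11-bit texel index.

```python
import math

def ulp(x: float, significand_bits: int) -> float:
    """Spacing between neighbouring representable values near x for a binary
    float with `significand_bits` bits of precision (implicit leading 1
    included). Exponent range limits are ignored (assumption)."""
    _, exponent = math.frexp(x)          # x = m * 2**exponent, 0.5 <= m < 1
    return math.ldexp(1.0, exponent - significand_bits)

STORED_BITS = 15                         # the ".15" in 1.8.15
EFFECTIVE_BITS = STORED_BITS + 1         # stored bits plus the implicit 1

# Texel indices 1024..2047 need all 11 integer bits.
step = ulp(2047.0, EFFECTIVE_BITS)
fractional_bits = round(-math.log2(step))
```

    Counting only the 15 stored bits, i.e. `ulp(2047.0, STORED_BITS)`, would leave one fractional bit fewer.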
     
  14. Xmas

    Xmas Porous
    Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    3,331
    Likes Received:
    158
    Location:
    On the path to wisdom
    That document actually says that the fragment shader uses FP16.
     
  15. darkblu

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,642
    Likes Received:
    22
    I don't have a link to a source handy, but I do remember Mali as having fp24 fragment shader ALUs from earlier presentations/workshop sessions.

    Generally, both non-unified shader model, SoC-class GPUs I'm aware of follow the fp32-vertex / fp24-fragment scheme.

    edit: I just noticed Xmas' remark. It appears I have a case of faulty memory cells.. Rats. *reports to factory for repair*

    Correct.

    fp24 (15-bit mantissa) gives you 2^-16 relative precision (mandated by the GLSL ES specs for highp, btw);
    fp16 (10-bit mantissa) gives you 2^-11 relative precision, etc.
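
    Those bounds can be checked numerically. The following is a minimal sketch, not any particular GPU's rounding behaviour: `quantize` and `worst_relative_error` are hypothetical helpers that assume round-to-nearest and ignore exponent range and denormals.

```python
import math

def quantize(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest value with `mantissa_bits` stored bits plus an
    implicit leading 1. Exponent range and denormals are ignored."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x = m * 2**e, 0.5 <= |m| < 1
    scale = 1 << (mantissa_bits + 1)     # one extra bit for the implicit 1
    return math.ldexp(round(m * scale) / scale, e)

def worst_relative_error(mantissa_bits: int, samples: int = 10000) -> float:
    """Largest relative rounding error seen while sweeping one binade, [1, 2)."""
    worst = 0.0
    for i in range(samples):
        x = 1.0 + i / samples
        worst = max(worst, abs(quantize(x, mantissa_bits) - x) / x)
    return worst
```

    Under these assumptions `worst_relative_error(15)` stays within 2^-16 and `worst_relative_error(10)` within 2^-11.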
     
    #135 darkblu, Sep 15, 2010
    Last edited by a moderator: Sep 15, 2010
  16. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Wow, you're right, I read the section and somehow completely misinterpreted it as meaning that FP16 is supported and should be used to save bandwidth between vertex shading and fragment shading.

    That only gives an effective 11 bits of guaranteed precision, not very good for HD texture coordinates...

    I wonder if maybe the ALUs are FP16, but it uses FP24 internally and can access it for some purposes. Like if the TMUs can be addressed with it and varyings produce it. I guess even FP16 isn't that bad for texture coordinates (and on SGX you'd probably usually opt for it), since it still gives you 0.25 sub-texel precision at up to 512x512. 1024x1024 if somehow the texture coordinates can range from -1 to 1 (pretty sure it can't work that way but I don't really know for sure)
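
    The sub-texel figure can be reproduced with real IEEE half-precision values. This is only a sketch: it uses Python's `struct` format `"e"` (binary16) as a stand-in, which is an assumption about how a shader FP16 would behave, and `half_spacing` / `texel_step` are hypothetical helper names.

```python
import struct

def half_spacing(x: float) -> float:
    """Gap between x (rounded to IEEE half precision) and the next larger
    half-precision value. Valid for positive, finite, non-maximal x."""
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    lo = struct.unpack("<e", struct.pack("<H", bits))[0]
    hi = struct.unpack("<e", struct.pack("<H", bits + 1))[0]
    return hi - lo

def texel_step(coord: float, texture_size: int) -> float:
    """Smallest coordinate step near `coord` (normalized, 0..1), in texels."""
    return half_spacing(coord) * texture_size

worst_512 = texel_step(0.75, 512)    # upper half of the range is the worst case
worst_1024 = texel_step(0.75, 1024)
```

    Coordinates in the upper half of the 0..1 range have the coarsest spacing, so that is where the step is measured.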

    So ARM promotes a lot that Mali has very efficient bandwidth utilization, even compared to "traditional tile renderers." Are they claiming that they have better post-transform geometry data compression than IMG does?
     
    #136 Exophase, Sep 15, 2010
    Last edited by a moderator: Sep 15, 2010
  17. Lazy8s

    Veteran

    Joined:
    Oct 3, 2002
    Messages:
    3,100
    Likes Received:
    19
    Exception has been taken to that characterization of PowerVR as the "traditional tile renderer".
     
  18. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,468
    Likes Received:
    187
    Location:
    Chania
    Their webpage states as of recently:

    http://www.arm.com/products/multimedia/mali-graphics-hardware/mali-400-mp.php

    While in their dev_guide I read the following:

    http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0363d/index.html

    Since I'm having a hard time making out a difference can I call the Mali a TBIDR (tile based immediately deferred renderer)?
     
  19. JohnH

    Regular

    Joined:
    Mar 18, 2002
    Messages:
    595
    Likes Received:
    18
    Location:
    UK
    Imo the correct description of Mali is "Early Z Tile based rendering", this differs from the PowerVR method in that they don't do deferred shading/texturing.
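
    To make the difference concrete, here is a toy model of a single pixel with opaque fragments only. It is a sketch of the two ideas, not of either vendor's hardware, and both function names are made up for illustration.

```python
def shaded_early_z(depths):
    """Fragments shaded when each opaque fragment is depth tested as it
    arrives and shaded if it passes (smaller depth = nearer)."""
    nearest = float("inf")
    shaded = 0
    for z in depths:
        if z < nearest:      # passes the early depth test, so it gets shaded
            nearest = z
            shaded += 1
    return shaded

def shaded_deferred(depths):
    """Fragments shaded when visibility is resolved for the whole tile first:
    only the nearest opaque fragment is shaded."""
    return 1 if depths else 0

front_to_back = [0.1, 0.5, 0.9]
back_to_front = [0.9, 0.5, 0.1]
```

    With early Z the cost depends on submission order (best case front to back, worst case back to front); with deferred shading it does not.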

    Exophase - yes 15 bit mantissa is actually 16 bits of precision when you take into account the implied 1, assuming that's the representation used.

    Note that FP16 is not generally sufficient precision to manipulate texture coordinates with the shader (think accumulated error). I still believe they're FP24, but I'm not sure why they wouldn't expose as that though if that really was the case.
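
    A small illustration of the accumulated-error point, as a sketch: it emulates FP16 by rounding through Python's `struct` format `"e"` (binary16) after every add, which is an assumption about how a shader would round, and the helper names are hypothetical. Think of it as scrolling a texture coordinate by a small step many times.

```python
import struct

def to_half(x: float) -> float:
    """Round a double to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def accumulate(step: float, count: int, rounder=lambda v: v) -> float:
    """Add `step` to a running total `count` times, rounding after each add."""
    total = rounder(0.0)
    step = rounder(step)
    for _ in range(count):
        total = rounder(total + step)
    return total

half_sum = accumulate(0.001, 1000, to_half)   # every intermediate kept in FP16
double_sum = accumulate(0.001, 1000)          # reference in double precision
```

    The double-precision total lands essentially on 1.0, while the half-precision total falls visibly short of it because each add is rounded to the coarser spacing near the running sum.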

    John.
     
  20. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Had this one out with aaronspink/darkblu/et al once.

    It's deferred and not immediate in the sense that it's scene-grabbing instead of rendering primitives as they're issued.

    It's not deferred and is immediate in the sense that it performs full rendering (minus early-Z elimination) within a tile as opposed to having a fast internal Z-path and then index based rendering.

    An important distinction between it and z430 is that it appears to have more explicit tiling and a fixed small (16x16) tile size, and therefore probably has a hardware binning pass prior to geometry as opposed to binning with geometry with skips for already binned polygons. This probably saves a lot of bandwidth in comparison, but like IMG I imagine they employ some kind of compression on their post-transform data. PowerVR should be as "traditional" as you get, having been the only tile based renderers for years, but yeah, there are clearly holes in their statements.
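
    For reference, the basic idea of binning into fixed 16x16 tiles can be sketched in software as below. This is a plain bounding-box binner written for illustration; it is an assumption, not a description of how Mali, z430 or PowerVR actually bin, and `bin_triangles` is a made-up name.

```python
import math
from collections import defaultdict

TILE = 16  # tile edge in pixels (the fixed 16x16 size mentioned above)

def bin_triangles(triangles, width, height):
    """Map (tile_x, tile_y) -> list of triangle indices whose screen-space
    bounding box overlaps that tile. Conservative: the box test can list a
    triangle in a tile it does not actually cover."""
    bins = defaultdict(list)
    max_tx = (width - 1) // TILE
    max_ty = (height - 1) // TILE
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Clamp the bounding box to the screen, in tile units.
        tx0 = max(math.floor(min(xs)) // TILE, 0)
        tx1 = min(math.floor(max(xs)) // TILE, max_tx)
        ty0 = max(math.floor(min(ys)) // TILE, 0)
        ty1 = min(math.floor(max(ys)) // TILE, max_ty)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return bins

triangles = [
    [(2, 2), (10, 2), (2, 10)],       # fits inside tile (0, 0)
    [(10, 10), (40, 10), (10, 20)],   # spans tiles 0..2 in x, 0..1 in y
]
bins = bin_triangles(triangles, 64, 64)
```

    Each tile can then be rendered from its own list without touching primitives that fall elsewhere on screen.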

    What I've always wanted to see was a tile-based early-Z renderer with hardware binning that also performed some level of depth sorting. If it's going to bin per-tile anyway this can't be that much more expensive (or maybe it can, I don't really know the binning algorithms and haven't thought this through that thoroughly). That'd make early-Z much more effective, and if you also add in binning of opaque vs alpha primitives you'd get order-independent translucency too... seems like it'd bring the per-pixel savings much closer to TBDR (especially w/faster than fill early-Z) while not having highly costly alpha test and having order-independent translucency.
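
    As a sketch of what that per-tile ordering might look like (purely hypothetical, with one depth value per primitive, which cannot order intersecting or mutually overlapping primitives correctly): opaque primitives are drawn nearest first so early-Z rejects as much as possible, then translucent ones farthest first for blending.

```python
def order_tile(primitives):
    """Return primitive ids for one tile's bin in a proposed draw order.

    primitives: list of dicts with 'id', 'depth' (smaller = nearer) and
    'opaque' (bool). The dict layout is an assumption for this example.
    """
    opaque = sorted(
        (p for p in primitives if p["opaque"]),
        key=lambda p: p["depth"],                 # front to back
    )
    translucent = sorted(
        (p for p in primitives if not p["opaque"]),
        key=lambda p: p["depth"],
        reverse=True,                             # back to front
    )
    return [p["id"] for p in opaque + translucent]

tile_bin = [
    {"id": "a", "depth": 0.9, "opaque": True},
    {"id": "b", "depth": 0.2, "opaque": True},
    {"id": "c", "depth": 0.4, "opaque": False},
    {"id": "d", "depth": 0.7, "opaque": False},
]
draw_order = order_tile(tile_bin)
```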
     