ARM Mali-400 MP

Rob Evans

A competitor for Imagination Technologies SGX?

From http://www.prnewswire.co.uk/cgi/news/release?id=228914

ARM Mali-400 MP Technology Brings High-End Graphics Performance to All Consumer Devices


CAMBRIDGE, England, June 2 /PRNewswire/ --

- Multicore Graphics Solution Revolutionizes User Experiences With Pioneering Scalability


ARM ((LSE:ARM); (Nasdaq:ARMH)) today announced the ARM(R) Mali(TM)-400 MP scalable multiprocessor graphics solution, capable of delivering performance of up to 1G pixels per second and enabling licensees to serve multiple product markets with the same architecture, whilst retaining the flexibility to choose the optimum power, performance and area configuration for their application. The pioneering Mali-400 MP architecture offers breakthrough scalability, also reducing costs for developers and OEMs associated with platform fragmentation, as no changes are required to support one to four processors.

"Architectural reuse of software and hardware components is of increasing importance to SoC developers," said Frank Dickson, co-founder and chief research officer, MultiMedia Intelligence. "The scalability of the ARM Mali-400 MP GPU, from 300 million to over 1 billion pixels per second, will enable OEMs to deliver a wide range of market-leading products on the same underlying architecture, reducing their total cost of ownership and maximizing ROI."

The ARM family of Mali GPUs is opening up new product markets that will benefit from graphics acceleration, from mobile feature phones through to 1080p-based iDTVs. Recent Mali GPU licensees in the set-top box market mean that the consumer experience in the home is set for a radical change.

"We see an increasing need for pixel processing of up to 1G pixels in the home as HD screens become ubiquitous," said Ola Larsén, vice president of marketing at TAT, an ARM Mali Developer Relations Program Partner. "The set-top box and digital TV user interface will never be the same once graphics acceleration such as Mali technology becomes the standard."

The Mali-400 MP solution builds on ARM's experience and knowledge gained in the widely-adopted ARM MPCore(TM) technology, implemented in the ARM11(TM) MPCore and Cortex(TM)-A9 MPCore multicore processors, designed to reduce system bandwidth, optimize performance and reduce power consumption.

"The battle for consumers' attention across all consumer electronics product markets means that the graphics acceleration capability of these devices is rapidly becoming a must-have feature. Consumers expect an equally compelling user experience when accessing content from their mobile and digital home entertainment devices," said Michael Dimelow, director of marketing, Media Processing Division, ARM. "The ARM Mali-400 MP graphics solution enables our customers to bring scalable graphics processing performance to a wider range of product markets and more economically than before."

Power consumption and area efficiency are key aspects of the Mali-400 MP GPU design. The ability to scale performance to meet different price points and power budgets allows the Mali-400 MP GPU to address the widest possible market while bringing significant cost benefits through utilizing the same single software stack across multiple devices.

The ability to capture the attention of consumers through the display on their electronic devices, as demonstrated by a range of high-end designs today, is driving the need for hardware graphics acceleration in a wide range of consumer devices. The scalable multicore graphics processing design of the Mali-400 MP solution will help bring graphics acceleration to almost any device with a screen.

Availability

The ARM Mali-400 MP GPU is available for licensing today. For more information about the ARM Mali graphics stack, please visit: http://www.arm.com/products/esd/multimediagraphics_home.html.
 
I know, I thought I would just be to the point, but in all seriousness the competitive environment is heating up. I just think that the PowerVR folks have a lead in this area, but the likes of both Nvidia and ARM can still catch up....
 
Credit to Rob Evans for unearthing this article as well....

Multi-core GPU in ARM strategy

ARM advances its graphics processing strategy with multi-core versions of Mali 3D graphics processor cores

EW 11-17 JUNE 2008 ElectronicsWeekly.com
DAVID MANNERS

ARM has laid out its graphics processor strategy with scalable multi-processor versions of its Mali 3D graphics processing cores.

The family, called Mali-400, has four variants: single core, dual core, triple core and four core. The four core can deliver a graphics processing performance of 1 billion pixels a second or 30 million triangles a second.

“It has the lowest memory bandwidth of any GPU (graphics processing unit) which directly translates into lower power consumption,” Chris Porthouse, senior product manager for media hardware at ARM, told Electronics Weekly.

Low memory bandwidth translates into lower power usage because the GPU is not writing and reading from memory all the time.

The single core version delivers 275 million pixels per second, the dual core delivers 550 million pixels per second, and the triple core delivers 825 million pixels per second. The 275 million to 1 billion pixels per second spread allows licensees to make a range of products using the same architecture, and the same software stack, which cuts down their costs, while ARM, as an IP vendor, can spread its cost of developing Mali across many users.

The Mali-400 is designed for 65nm process technologies. The dual core version occupies nine square millimetres of silicon. “We have customers who are asking us for 32nm and 22nm versions,” said Porthouse, “it scales to 32nm quite easily.”

ARM has been talking to customers and expects to sign the initial licenses for Mali-400 shortly. “We are very close to signing licences,” said Porthouse.

Clearly the big market is smartphones, where every phone will need a GPU, and ARM expects 600 million smartphones to be made every year by 2012.

The less highly featured smartphones will not all have GPUs but will still represent a 400 million annual unit market for GPUs by 2012, reckons ARM.

Other products which may in part use GPUs are portable media players, which ARM reckons could be a 200 million unit market by 2012 and satnav, a 65 million unit market by 2012.

High definition TVs are driving a rapidly growing market for high-end GPUs in set-top boxes, representing a potential 231 million unit market by 2012, and digital TV, which could be a market of 113 million units by 2012.

So ARM is looking at a substantial target market for the multi-processor graphics core in the next four years.
-End-
 
“It has the lowest memory bandwidth of any GPU (graphics processing unit) which directly translates into lower power consumption,” Chris Porthouse, senior product manager for media hardware at ARM, told Electronics Weekly.

Interesting that they believe they have a lower bandwidth solution than IMG's PowerVR series.

Rob.
 
http://www.arm.com/news/23608.html

ARM signed 13 processor licenses in Q3. The quarter was characterised by licensing of ARM® technologies across the portfolio, with licenses being signed for the ARM7™, ARM9™, ARM11™ and Cortex™ processor families, as well as for the Mali™ graphics processor, including with STMicroelectronics who licensed ARM’s latest graphics processor, the Mali 400MP GPU.
 
Hah. I wonder what'll happen with Qualcomm then - unless they'll actually purchase AMD's handheld division? I would actually expect them to given how much of their IP is sourced from them, FWIW... And congratulations to the ex-Falanx guys/ARM! :)
 
Of course as is the nature of press releases, the most interesting things are what they don't say.

For example, what clock frequencies are used to get the stated performance figures, and at what power cost? No die size is stated for the quad-core part.

The PR also says
"The four core can deliver a graphics processing performance of 1 billion pixels a second or 30 million triangles a second."

which seems strange because the docs on the ARM site
http://www.arm.com/miscPDFs/21863.pdf
indicate that the dual core can do 30M tri/s and 550M pixels. If the data is true, it would seem that there is no extra triangle capability with quad over dual ????

The latest SGX datasheet states that at the high end, its IP can do 100M tri/s and 4B pixels per sec. This is at a size of 20mm² and at 200MHz. So that's the high-end crown taken.

The smallest SGX part is 1.5mm² and still does 200M pixels per sec. The smallest Mali-400 is 5.5mm². To get down to 1mm², you have to go to Mali-55, which doesn't have any vertex engine and thus isn't OpenGL ES 2.0 compliant, and has half the triangle performance of the 1.5mm² SGX part, which is OpenGL ES 2.0 compliant. So that's the low end taken too.
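
For a rough feel of the area argument, the figures quoted in this thread can be turned into pixel rate per unit area. This is only a sketch: it assumes all the sizes above are die areas in mm², it takes the 9mm² dual-core Mali-400 figure from the Electronics Weekly article, and it ignores clock speed and process node, which are not stated for every part. The names `parts` and `pixels_per_mm2` are made up for the illustration.

```python
# Pixel rates (pixels/s) and areas (assumed to be mm^2) as quoted in this thread.
# Clock speeds and process nodes are not given for every part, so the
# comparison is indicative only.
parts = {
    "SGX (smallest)": (200e6, 1.5),
    "SGX (high end, 200MHz)": (4e9, 20.0),
    "Mali-400 single core": (275e6, 5.5),
    "Mali-400 dual core": (550e6, 9.0),
}


def pixels_per_mm2(pixels_per_sec, area_mm2):
    """Return the pixel fill rate per square millimetre of silicon."""
    return pixels_per_sec / area_mm2


for name, (rate, area) in parts.items():
    density = pixels_per_mm2(rate, area) / 1e6
    print(f"{name}: {density:.0f} Mpixels/s per mm^2")
```

On these numbers the SGX parts come out ahead per mm², but the result is only as good as the datasheet figures going in.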
 
...
The PR also says
"The four core can deliver a graphics processing performance of 1 billion pixels a second or 30 million triangles a second."

which seems strange because the docs on the ARM site
http://www.arm.com/miscPDFs/21863.pdf
indicate that the dual core can do 30M tri/s and 550M pixels. If the data is true, it would seem that there is no extra triangle capability with quad over dual ????
...

Beware of marketing FUD: they appear to use only a single MaliGP irrespective of the number of cores, so the geometry throughput does not scale with the number of cores laid down...

John.
 
SGX 530 is 2 TMUs, SGX 520 is 1 TMU, SGX 510 is canned. Clock speeds for SGX, Mali, Imageon and GoForce are all similar on 65nm, which is to say low to mid 100MHz (sigh...)

Anyway a single-core Mali 400MP is most similar to a SGX 530 in terms of theoretical performance excluding potential boosts related to TBDR (avoiding a Z-Pass and/or shading fewer useless pixels). In this context, their perf/mm² don't seem to be fundamentally different.
 
SGX 530 is 2 TMUs, SGX 520 is 1 TMU, SGX 510 is canned. Clock speeds for SGX, Mali, Imageon and GoForce are all similar on 65nm, which is to say low to mid 100MHz (sigh...)

Anyway a single-core Mali 400MP is most similar to a SGX 530 in terms of theoretical performance excluding potential boosts related to TBDR (avoiding a Z-Pass and/or shading fewer useless pixels). In this context, their perf/mm² don't seem to be fundamentally different.

Well, at the low end IMG offer a core which is much smaller yet still ES2.0 compliant, and at the higher end the equivalent perf Mali MP solution is going to be quite a bit bigger than the single core SGX solution (which may also have higher poly throughput).

To be clear, I don't think there's anything wrong with multi-core as a concept (IMG have already done this in the past after all); however, I just don't think ARM have got it right in terms of the base core that they are scaling from, in terms of hitting good area efficiency/performance per unit area.

John.
 
John,

Point taken regarding the "cores"; however, it's their term, and the quad-core one and the dual-core one appear to have exactly the same tri/s rate (but the pixel rate is different).
 
SGX 530 is 2 TMUs, SGX 520 is 1 TMU, SGX 510 is canned. Clock speeds for SGX, Mali, Imageon and GoForce are all similar on 65nm, which is to say low to mid 100MHz (sigh...)

What you say is correct for the mobile phone space.

However, SGX530 is used in the System Controller Hub that is the chipset for the Z series Atom processor. There are 3 SKUs and 2 of them clock the graphics at 200MHz. Depending on which spec sheet you look at, the SCH chip is either fabbed at 90nm or 130nm.

There is a rumoured refresh of Menlow coming out that suggests a different SCH SKU. It will be interesting to see if the graphics core is changed; they may also take the opportunity to fab it differently.

http://www.digitimes.com/NewsShow/NewsSearch.asp?DocID=PD000000000000000000000000007764&query=MENLOW
 
What you say is correct for the mobile phone space.

However, SGX530 is used in the System Controller Hub that is the chipset for the Z series Atom processor. There are 3 SKUs and 2 of them clock the graphics at 200MHz. Depending on which spec sheet you look at, the SCH chip is either fabbed at 90nm or 130nm.

That's an SGX 535, but yes, it's available in two SKUs: one at 100MHz and without the PowerVR HD video decoder, and one at 200MHz with the latter.

What's much more interesting is what will happen with Moorestown, where instead of being manufactured on 130nm (definitely not 90nm), it will be on 45nm (!!!) so clocks might increase quite a bit if they don't screw up! :) AFAIK, it's still a SGX 535 though...
 
If the data is true, it would seem that there is no extra triangle capability with quad over dual ????

This is indeed correct. Mali-400MP is not a unified shader (it has separate processing units for vertex processing and fragment processing), and we decided against making the vertex processing section scalable this time around. The 30Mtris/s is basically the upper limit of what our current vertex/binning processing unit can do; the fragment processors can do ~18Mtris/s per core.
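
A minimal sketch of that bottleneck, assuming the triangle rate is simply the lower of the single vertex/binning unit's limit and the combined fragment-processor limit, and that pixel rate scales linearly at the 275M pixels/s per core quoted in the Electronics Weekly article. The constants and function names are illustrative only, not anything from ARM's documentation.

```python
# Figures quoted in this thread (millions per second); the min() bottleneck
# model itself is an assumption made for illustration.
VERTEX_LIMIT_MTRIS = 30.0        # single vertex/binning unit, does not scale
FRAGMENT_MTRIS_PER_CORE = 18.0   # approximate per-fragment-core triangle limit
PIXELS_PER_CORE_M = 275.0        # per-core pixel rate from the article


def triangle_rate(cores):
    """Triangle throughput (Mtris/s), capped by the shared vertex unit."""
    return min(VERTEX_LIMIT_MTRIS, cores * FRAGMENT_MTRIS_PER_CORE)


def pixel_rate(cores):
    """Pixel throughput (Mpixels/s), assumed to scale linearly with cores."""
    return cores * PIXELS_PER_CORE_M


for cores in range(1, 5):
    print(f"{cores} core(s): {triangle_rate(cores):.0f} Mtris/s, "
          f"{pixel_rate(cores):.0f} Mpixels/s")
```

Under this model a single core is fragment-limited, while two or more cores hit the vertex/binning ceiling, which is why the dual and quad configurations quote the same triangle rate.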
 
Moorestown was described as having 50% more graphics performance than the SCH, so a difference in clock speed and not SGX variant seems likely.
 
This is indeed correct. Mali-400MP is not a unified shader (it has separate processing units for vertex processing and fragment processing), and we decided against making the vertex processing section scalable this time around. The 30Mtris/s is basically the upper limit of what our current vertex/binning processing unit can do; the fragment processors can do ~18Mtris/s per core.

Of course you also won't be able to handle higher VS load, so it's not only poly throughput that doesn't scale....

John.
 
Well, at the low end IMG offer a core which is much smaller yet still ES2.0 compliant, and at the higher end the equivalent perf Mali MP solution is going to be quite a bit bigger than the single core SGX solution (which may also have higher poly throughput).

To be clear, I don't think there's anything wrong with multi-core as a concept (IMG have already done this in the past after all); however, I just don't think ARM have got it right in terms of the base core that they are scaling from, in terms of hitting good area efficiency/performance per unit area.

John.

You're still talking about the cancelled SGX510, right?
 