Creative ZMS-20 and ZMS-40

Just as they claimed an "aggregated Cortex A9 clock speed of 6GHz" and added the 4 ARM cores to the 96 "stemcell" cores to get the "100 cores" in the press release, it's also possible they are adding the GFLOPS from the dual Cortex A9 @ 1.5GHz to the GFLOPS from the 48 stemcell cores in order to get the 26GFLOPS.
Yes, but it's obvious from the phrasing that they're adding the ARM and stemcell cores, or aggregating the ARM clock cycles. "48 x 32-bit floating point media processing cores for 26GFlops of compute" seems clear to me, too.
 
Dumb question: isn't ZiiLabs actually the leftovers from the former UK based 3DLabs (RIP)?
 
Dumb question: isn't ZiiLabs actually the leftovers from the former UK based 3DLabs (RIP)?
Correct, although I think most of the GPU engineers were fired rather than moved to the handheld division. BTW, it's interesting this architecture is most similar to Broadcom's Videocore which came from Alphamosaic... which was also a UK start-up.

I suspect a good majority of the exotic parallel architectures in the world were probably developed in the UK, as a heritage of Inmos and the Transputer. Therefore it's pretty ironic that the Transputer project was seen as a failure by the government, when it likely created an unbelievable number of jobs over the years. Too bad so many of these start-ups folded or were acquired relatively cheaply... a successful Icera IPO would have made up for it in my mind (Simon Knowles was part of Inmos in the later years and even led the team after it was acquired by STMicro), but of course that won't happen now.
 
Correct, although I think most of the GPU engineers were fired rather than moved to the handheld division.

BTW, it's interesting this architecture is most similar to Broadcom's Videocore which came from Alphamosaic... which was also a UK start-up.

Well if the majority of GPU engineers were fired then the result doesn't surprise me one bit.

I suspect a good majority of the exotic parallel architectures in the world were probably developed in the UK, as a heritage of Inmos and the Transputer. Therefore it's pretty ironic that the Transputer project was seen as a failure by the government, when it likely created an unbelievable number of jobs over the years. Too bad so many of these start-ups folded or were acquired relatively cheaply... a successful Icera IPO would have made up for it in my mind (Simon Knowles was part of Inmos in the later years and even led the team after it was acquired by STMicro), but of course that won't happen now.

That's a very long OT debate; unfortunately true. The UK governments should know what they're losing.
 
Still, I do think the connection here is weak.. I mean, I don't think "pixel clock" really refers to the stemcell array clock. Can't see why they wouldn't have put it that way. Pixel clock can mean RAMDAC; this gives just enough for 1080p past 60Hz.

RAMDAC is a pretty standard term. Why call it "200MHz Pixel clock image processing", and why would it be represented as a sub-spec within the "ZiiLABS flexible Stemcell media processing capabilities"?
(I couldn't get the formatting right in the first post, but in the original website, the specs that start with "|" are sub-specs of the specs that start with the "-")
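
As a rough sanity check on the "just enough for 1080p past 60Hz" reading, here is a small sketch of the pixel clock arithmetic. It assumes the standard 1080p60 raster of 2200 x 1125 total pixels per frame (blanking included); the helper names are made up for illustration, and a display using reduced blanking would get a somewhat higher refresh rate out of the same clock.

```python
# Assumption: standard 1080p60 timing, 2200 x 1125 total pixels per frame
# including blanking (1920 x 1080 of those are visible).
H_TOTAL, V_TOTAL = 2200, 1125

def pixel_clock_hz(refresh_hz, h_total=H_TOTAL, v_total=V_TOTAL):
    """Pixel clock needed to scan out one full raster at the given refresh rate."""
    return h_total * v_total * refresh_hz

def max_refresh_hz(pixel_clock, h_total=H_TOTAL, v_total=V_TOTAL):
    """Highest refresh rate a given pixel clock can sustain for this raster."""
    return pixel_clock / (h_total * v_total)

needed_mhz = pixel_clock_hz(60) / 1e6   # clock needed for 1080p at 60Hz
ceiling_hz = max_refresh_hz(200e6)      # what a 200MHz pixel clock allows

print(f"1080p60 needs {needed_mhz:.1f} MHz")
print(f"200 MHz allows about {ceiling_hz:.1f} Hz at 1080p")
```

Under those assumptions a 200MHz pixel clock clears 1080p60 with some headroom, which fits either interpretation in this thread and so doesn't settle what the clock is attached to.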




Yes, but it's obvious from the phrasing that they're adding the ARM and stemcell cores, or aggregating the ARM clock cycles. "48 x 32-bit floating point media processing cores for 26GFlops of compute" seems clear to me, too.

Eeerm.. is there a "not" missing in that sentence? I can't get its meaning...



Correct, although I think most of the GPU engineers were fired rather than moved to the handheld division. BTW, it's interesting this architecture is most similar to Broadcom's Videocore which came from Alphamosaic... which was also a UK start-up.

That's interesting. Do you know where to find info on the current Videocore architecture? I can't find a single decent whitepaper on Broadcom's website about it.
 
Yes, but it's obvious from the phrasing that they're adding the ARM and stemcell cores, or aggregating the ARM clock cycles. "48 x 32-bit floating point media processing cores for 26GFlops of compute" seems clear to me, too.

Yes, and in the block diagram the "26GFLOPS" is clearly tied to the stemcell array, so I think it'd be very strange if it included the ARM cores. Likewise, I don't see why they wouldn't clearly label the stemcells as 200MHz instead of attaching that figure to something else.

I think that the 200MHz just means that the ALUs can keep up with a RAMDAC at that speed.
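
For what it's worth, the two quoted figures can be cross-checked against each other. The sketch below only uses numbers from the spec sheet quoted above (26 GFLOPS, 48 cores, 200MHz); the "one multiply-add per core per cycle = 2 flops" case is purely an assumption for comparison, and the function names are made up.

```python
def flops_per_core_per_cycle(total_gflops, cores, clock_mhz):
    """Flops each core must retire per cycle to hit the quoted total."""
    return (total_gflops * 1e9) / (cores * clock_mhz * 1e6)

def implied_clock_mhz(total_gflops, cores, flops_per_cycle):
    """Clock needed to hit the quoted total at a given per-core throughput."""
    return (total_gflops * 1e9) / (cores * flops_per_cycle) / 1e6

# If the 48 stemcell cores really ran at 200MHz:
per_cycle = flops_per_core_per_cycle(26, 48, 200)

# Assumption (not from the spec sheet): one multiply-add per core per cycle.
mac_clock = implied_clock_mhz(26, 48, 2)

print(f"{per_cycle:.2f} flops/core/cycle at 200 MHz")
print(f"{mac_clock:.1f} MHz needed at 2 flops/core/cycle")
```

At 200MHz each core would need roughly 2.7 flops per cycle, which is not a natural number for a simple multiply-add unit, while a plain multiply-add design would need to run at roughly 271MHz. Neither proves anything, but it is at least consistent with the 200MHz figure not being the stemcell array clock.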
 
That's interesting. Do you know where to find info on the current Videocore architecture? I can't find a single decent whitepaper on Broadcom's website about it.
Heh, you won't find anything recent, but the best info to compare various generations can be found here: http://www.curiouscat.org/Steve/
In terms of how it evolved over the years:
  • VC01 (2002/2003): 130nm, 85MHz (originally 125MHz?), in-house 32-bit RISC scalar processor + Vec16 8/16-bit Processor. 600K gates excluding memory but based on the die shot on Steve's page, vast majority of the chip is the 1MB of SRAM (no need for external DRAM, ala NVIDIA GoForce 4800 and earlier, although it does support optional SDRAM). MPEG-4 CIF (352x288) + Audio Encode @ 30 fps at 54mW. Used mostly in a few Samsung phones. And I found the full product brief!
  • VC02 (2004): 130nm, 150MHz, RISC upgraded to superscalar dual-issue and extra video-centric instructions for the Vec16 8/16-bit Processor. Marketing: 3xVC01 performance, so it's possible the Vec16 unit has also doubled performance per clock although unlikely. Increased instruction/data caches from 32+16KB to 128KB total, and increased SRAM to 1.25MB. Video decode/encode up to MPEG-4 VGA @ 30 fps. AKA BCM2702. Used in the original iPod with Video.
  • VC03 (2007): 65nm, ???MHz, Dual Vec16 processors, and most importantly a lot of dedicated HW accelerators for video and 3D Graphics (up to 32MTri/s) to improve performance and power consumption. The VideoCore engines are still used as ALUs for OpenGL ES 2.0 shaders, which seems to imply FP16/FP32 support for PS/VS. Significantly less SRAM, uses 32MB Stacked LPDDR1 instead. Video decode/encode up to 720p H.264 High Profile @ 30fps (encode at 450mW full-system). AKA BCM2727.
  • VC04 (2010): 40nm, ???MHz, very probably Quad Vec16. 3x GPU performance via more fixed-function units especially for fill rate and texturing (up to 1 Gigapixel/s). Video decode/encode up to 1080p H.264 High Profile @ 30fps (encode at 490mW full-system, decode at 160mW). AKA BCM2763 (renamed as BCM11182 in tablets), also used in the BCM11311 application processor and the BCM28150 baseband.
  • VC04 Light (2011): 40nm, Single Vec16, fewer HW accelerators. Video encode/decode up to VGA H.264 @ 30fps (unknown profile). 20MTri/s 3D. Used in mainstream BCM21654 baseband.

What makes the architecture unique? Mostly the register file; here's a direct quote from the VC01 product brief: "64x64 pixel 2D register file". That's very large, and the 2D organisation makes pretty good sense for video processing. And of course, while the basic ALUs are probably just 16-bit Multiply-Add, there are also special instructions for a variety of algorithms (and more added each generation). Interestingly enough, the VC01 product brief claims 6GOps @ 125MHz, which is more than the 0.125x16x2 = 4GOps you would get if it was just MAC. Presumably some instructions are more than just clever ways of moving data around, and they're actually capable of more computation power than MAC for these specific algorithms.
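
To spell out that GOps comparison, here is a tiny sketch. It takes the 125MHz clock, the 16 lanes of the Vec16 unit and the 6GOps claim from the product brief as quoted; counting a multiply-add as 2 ops per lane per cycle is the usual convention and is an assumption about how Broadcom counted.

```python
CLOCK_HZ = 125e6          # VC01 clock quoted in the product brief
LANES = 16                # Vec16 unit: 16 lanes
OPS_PER_MAC = 2           # assumption: multiply + add counted as 2 ops
CLAIMED_OPS = 6e9         # "6GOps" from the product brief

# Peak throughput if every lane issued one plain multiply-add per cycle.
mac_only_ops = CLOCK_HZ * LANES * OPS_PER_MAC

# Ops per lane per cycle the claimed figure would actually require.
ops_per_lane_per_cycle = CLAIMED_OPS / (CLOCK_HZ * LANES)

print(f"MAC-only peak: {mac_only_ops / 1e9:.1f} GOps")
print(f"Claimed figure implies {ops_per_lane_per_cycle:.1f} ops/lane/cycle")
```

So the claim works out to 3 ops per lane per cycle rather than 2, i.e. 1.5x what plain MAC would give, which is what suggests the special instructions count for more than one multiply-add.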

This is horribly off-topic, but it is very much related somehow: this is a fairly similar architecture that started as being 100% programmable, and then they added more specialised instructions, then lots of hardware accelerators for both video and 3D, and now even more accelerators for 3D. It certainly contradicts the notion that a fully programmable architecture is going to be competitive. However, some level of programmability may be useful to share more silicon between standards, or even between decode and encode. That might result in higher power consumption, but in some markets lower cost is more valuable than saving a few milliwatts.
 
I suspect a good majority of the exotic parallel architectures in the world were probably developed in the UK, as a heritage of Inmos and the Transputer. Therefore it's pretty ironic that the Transputer project was seen as a failure by the government, when it likely created an unbelievable number of jobs over the years. Too bad so many of these start-ups folded or were acquired relatively cheaply... a successful Icera IPO would have made up for it in my mind (Simon Knowles was part of Inmos in the later years and even led the team after it was acquired by STMicro), but of course that won't happen now.

I'm sure you are aware of this, but just in case: Hossein Yassaie, the CEO of Imagination, previously held senior positions with Inmos.
 
I'm sure you are aware of this, but just in case: Hossein Yassaie, the CEO of Imagination, previously held senior positions with Inmos.
Yeah, although he wasn't a founder nor an engineer. But it's a good example of how ex-Inmos people are scattered in important positions throughout the industry, and Imagination is a good example of a successful parallel computing-centric company in the UK (even if their architecture is very different from most others since it's MIMD and more focused on latency tolerance).

I think at this point a lot of ex-Inmos engineers are hidden deep inside the UK design centers of large international companies. Ziilabs is one example of that (their main HW group is in Bristol iirc).
 
I think at this point a lot of ex-Inmos engineers are hidden deep inside the UK design centers of large international companies. Ziilabs is one example of that (their main HW group is in Bristol iirc).

Hmm, I thought the Bristol office had pretty much gone and all dev was now done in the US?
 
Heh, you won't find anything recent, but the best info to compare various generations can be found here: http://www.curiouscat.org/Steve/
In terms of how it evolved over the years:
(...)

Thanks! That info is really hard to get. I had no idea the VideoCore III was using the dual vector processors as pixel and vertex shaders. I thought they were just sitting idle during 3D games, for example. It makes more sense that way.

Allow me just a small addition: I can't find the source right now but I'm 99% sure the BCM2763 has 128MB of stacked RAM.
It seems the versions without an embedded application processor have a hard time getting sufficient memory bandwidth from the system's main memory (as previously stated in that "how to program for BCM2727" video), hence the need for a dedicated memory pool (and a "redundant" memory controller, I might say).
 
Hmm, I thought the Bristol office had pretty much gone and all dev was now done in the US?
Many of the 3D Labs US employees (former Intergraph/Intense 3D) were hired by Nvidia so I was under the impression those that remained were from the UK office. Ironic that the impression is different depending on which side of the pond you live on.
 
Lol, well we interviewed a couple of guys from the UK office and they said they were leaving as most work was being done in the US now iirc. Confused or what!
 
http://www.ziilabs.com/products/platforms/androidreferencetablets.aspx

Reference Honeycomb tablets for ZMS-20:

7-inch:
jaguartabletb.jpg


10-inch:
jaguartabletc.jpg



They claim it's a modular design, so OEMs can choose I/O boards, touchscreen controllers (there's the choice of going either resistive or capacitive), and camera modules:
ziilabsjaguarblockdiagr.png


Sounds interesting, but if Creative themselves aren't building the tablets, who is?

BTW, ZMS-40 is still left out. I'd say they're waiting for 28nm to make the higher-end SoC.
 
1GPixel fillrate confirmed.
Doesn't this mean there are fixed-function units for 3D rendering?

I'm still curious as to what the 3D performance might be in this.
He makes a claim about 3D performance being good, but all he gives out is the fillrate.
 
PowerVR's pixel fillrate numbers != others' fillrate numbers

Besides, imo, wikipedia is kinda dodgy for embedded specs.
 