iPhone/Zune/iPod & More Prediction Thread

the most power-demanding task i could run on the device since its first full battery charge last night was video playback. for the purpose i used one of the two demo flicks, 'elephant's dream' (a really nice open-project short-film production), at 1280x720 and with a duration of 10:53. i played that back 6 times, to a battery meter drop of 50% (2/4 bars). i believe it's safe to extrapolate that the device can play back 720p video for at least 2h straight. in the course of the playback, the device's back got slightly warm, but nowhere near what an ipod touch 2g gets under similar tasks.
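the extrapolation can be sanity-checked with quick arithmetic (a rough sketch, assuming the battery meter drop is roughly linear with energy used):

```python
# rough battery extrapolation from the playback test
play_min = 10 + 53 / 60      # one run of 'elephant's dream': 10:53
plays = 6                    # back-to-back playbacks
drained = 0.5                # battery meter dropped by 50% (2/4 bars)

# total minutes of 720p playback a full charge should allow
total_runtime_h = plays * play_min / drained / 60
print(round(total_runtime_h, 2))  # → 2.18
```

so a bit under 2h12m of 720p playback on a full charge, consistent with "at least 2h straight".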

so these are the dry facts.

Bitrate and video codec used? Show me 2h+ of 40Mb/s 1080p H.264 HP, then I might be impressed ;)

John.
 
so these are the dry facts. the more i thought of those, though, the more my respect for the 3dlabs guys grew.

That isn't too impressive. The 2nd Gen iPod Touch turns in much better battery life than that (most reviewers report 5-6 hours w/ 50% screen brightness), and with a much smaller battery (750 mAh vs. 1200 mAh). Now, it does depend on what video you're decoding, but that is a fairly massive gap.

As far as performance benchmarking goes, I am more interested in the mundane, conventional 3D benchmarks (vertex processing rate, blended fill, texture sample rate, etc).

For example, the wikipedia page states: 42M textured pixels/sec, 21M vertices/sec. Vertex rate is pretty good for a device in this category, but efficient software vertex processing is fairly easy. You would expect it to scale fairly well with the GFLOP rate, which it appears to do.

However, texture sampling is one place where a pure-software solution has never been shown to perform well, and their advertised texture rate is very sub-par. Does anyone know if they have dedicated texture samplers like Larrabee, or if they're trying to do it all in software?
 
Two hours plus of decent HD is just good enough to be useful, so achieving usefulness on a fairly demanding, highly specialized task with general purpose hardware is impressive in its own right.

I suspect its relative performance, however, compared with some alternate architectural approaches to the same GP streaming end would be underwhelming.
 
Bitrate and video codec used? Show me 2h+ of 40Mb/s 1080p H.264 HP, then I might be impressed ;)

John.
you know of a < 1W stream processor/dsp that does that? ; )

to answer your question, though: AVC1, VBR (peak 4.2Mb/s, avg across the stream 2.66Mb/s), 1280x720p @24fps.
 
The question is not one of programmability, but rather of specialization. Would you consider a Tensilica Xtensa-based DSP with plenty of specialized video instructions to be a "DSP"? If so, there's one in Qualcomm's Snapdragon that does 720p at well under 1W (moving to 1080p in Snapdragon 2), much less than the Zii, even. And of course, at a bit more than 1W because it's optimized more for area than power, it's also what you have in both NVIDIA and AMD graphics cards.

There's also Broadcom's VideoCore as I said earlier, which was also designed by a UK company (Alphamosaic) with a similar design philosophy to the Zii; they do 720p H.264 High Profile decode & encode at well under 1W. Like the Zii, it's not very specialized and can even do OpenGL ES 2.0. They're moving to 1080p now, and the original version of this very same core, aimed at VGA video decode, shipped in millions of devices: specifically, the original iPod with Video! Once again, zero hardware acceleration. If they keep going, I'm sure they could easily do <1W for 1080p 40Mbps H.264 HP on 28nm (or on 40nm with more cores at a much lower voltage/frequency, maybe).

There are plenty of weird little startups with unclear architectures I won't get into here, but perhaps most interesting are the wireless baseband startups (I know, I always get back to them!) that prove it's all about specialization, not programmability. Icera takes a fully programmable but very exotic approach (the PHY runs a single instruction stream, although there are multiple 'programmable' units that each have multiple execution units) and is the unambiguous HSPA+ leader; Altair targets WiMAX/LTE with small, super-specialized processors working on small parts of the overall problem one step after the other, each seemingly more efficient in both area and power than any commercially available fixed-function alternative implementing all the possibilities.

In the end, specialized solutions that are also programmable or at least highly configurable (e.g. PowerVR VXD) continue to have a massive advantage over the general-purpose approaches of architectures like the Zii or even GPGPU. The question is always whether the target market (and the silicon's average utilization when it is available) is large enough to warrant dedicated hardware; for things like video encode/decode, it seems very clear that this is the case, and the Zii's approach feels like a historical oddity.

For other problems, exotic architectures (including GPGPU to a certain extent) hold a lot of promise, although they are certain to be too complex or too limited in some ways for certain consumer apps. The serial-centric CPU is also a historical oddity, and as you imply it's unclear how long it will prosper in the most performance- or power-centric applications where special-purpose hardware either makes no sense or cannot be afforded. Intel's best shot at fighting this, besides playing the same game to a certain extent with Larrabee, is the 'many mini-cores (plus a few big OoOE ones)' strategy they've talked about in the past; it's still unclear to me when, or even whether, that will ever come to fruition.

3DLabs should be applauded for actually getting a seemingly very solid solution to market (even if I don't really approve of the marketing tactics) and it's good to see them attract attention to exotic architectures; however, there is nothing fundamentally unique about their design philosophy.
 
you know of a < 1W stream processor/dsp that does that? ; )
That's exactly my point: fixed-function video decoders do exactly that for a lot less power.

to answer your question, though: AVC1, VBR (peak 4.2Mb/s, avg across the stream 2.66Mb/s), 1280x720p @24fps.

Doesn't sound like a particularly demanding clip to decode, which profile?
 
Also worth remembering that just because a stream is marked as a particular profile does not mean it necessarily uses all the features of that profile; the encoder could be using just one fairly simple feature from that profile in order to qualify as a high (or main) profile encoder.
It is the decoder that is more tightly specified, a decoder that supports a particular profile must support all the features of that profile.
It is what makes developing encoders/decoders interesting. Decoders are more complicated in that they have to support everything, whereas encoders can give you freedom to be innovative.

CC
 
That's exactly my point: fixed-function video decoders do exactly that for a lot less power.
whereas mine was that i have little use for video decoder silicon per se ; )

Doesn't sound like a particularly demanding clip to decode, which profile?
i dunno, i'm not a video codec guy - couldn't tell mjpeg from mng. i thought the fourcc was sufficient to tell the profile.


ps: Arun, your post was informative, and has been appreciated (the alphamosaic guys blipped once on my radar not long ago, and now they're duly tracked), i just don't feel like delving into industry-vs-market-vs-efficiency-vs-utilization discussions atm; i'm yet to carry out my gl tests on the zii, and i hope to be able to come up with more informative numbers next time i post. so far, though, the battery drain of the device has been nothing out of the ordinary for the class, considering i've been doing little but demoing it around.
 
Didn't want to start another thread; this one's about as close as I can find.

NXP has launched a range of 45nm SoCs, all of which apparently contain SGX. They are for various STB implementations. Interesting in that they integrate ARM's A9 core with SGX. That's only the 2nd A9+SGX implementation I'm aware of (the other being OMAP4). NXP has played some catch-up, given that they are sampling Q4 '09, which is, I think, around when OMAP4 is sampling.

http://www.nxp.com/news/content/file_1609.html
 
The duration between sampling and volume production for different manufacturers can vary quite a bit, and I'd expect Texas Instruments to be decidedly faster here.

Still, as was noted, NXP does seem to have accelerated their development impressively.
 
"
In a Q and A session with the New York Times, the Apple CEO said that although Apple weren't "exactly sure how to market the Touch" at first, it soon became apparent that games were the way forward:

"We started to market it that way, and it just took off. And now what we really see is it’s the lowest-cost way to the App Store, and that’s the big draw. So what we were focused on is just reducing the price to $199. We don’t need to add new stuff — we need to get the price down where everyone can afford it."
"
the 8GB one is the $199 gaming console

The 32GB and 64GB (16 has been dropped) use the same processor etc. as the iPhone 3GS

At the presentation yesterday SJ stated that the iPod touch had sold 20M since launch, and the iPhone 30M. They now have 50M gaming devices out there; that's quite a userbase, and its growth is accelerating. Not bad for a company that wasn't known as a console maker.
 
Excuse the OT, but it would be ridiculous to open a separate thread for this one: where's the virtual demo room IMG has promised for its home site? It was supposed to appear in July 2009 :(
 
so, after some time tinkering with my zii EGG (that is, the developer's edition), i can give you a power-efficiency comparison against an ipod touch 1G, for a typical GPU task (rasterizer/fragment-bound). take it with a grain of salt, as some of the measurements were not taken particularly accurately. still, they should give a general idea.

all times are from a full-battery, just-off-the-charger condition, to system auto-shutdown due to low power. the batteries of the two devices are as follows:

zii egg: 3.7V, LiPo, 1200 mAh
ipod touch 1G: 3.7V, LiPo, 900 mAh

note that the latter figures are not the manufacturer's (apple do not give any), but the best i could find searching for replacement parts, and as such they might not be correct.

test run was a rudimentary object-space normal-mapping frame loop, which can be seen and obtained from here. the sleep functions on both devices were suppressed (auto lock: off) and both screens' back-lighting was set to minimal (ipod: manual, slider to min; zii: off, effectively producing a minimal backlight, visually comparable to the ipod's minimal).

so, the resulting times are:

ipod touch: 6h
zii egg: 5h 40min

also, it's worth noting that whereas the fps on the ipod was a stable 60hz, the fps on the zii fluctuated between 30hz and 60hz, sitting at 30hz most of the time (about 75% of it), due to the beta-version nature of the device's power-management software.

in terms of work done per supplied energy, the zii is about (900mAh / 1200mAh) * (5h40m / 6h) = 0.71 times as efficient as the ipod touch at this particular task, framerate fluctuations notwithstanding. if we take the fluctuations into account (using the truly rough estimate of fps time percentages, i.e. 25% @ 60hz, 75% @ 30hz), the resulting index becomes 0.44. but because of the rough nature of the measurements, i believe we could assume the more general index of 0.5.
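for anyone who wants to check the arithmetic, here's the index worked out step by step (a sketch; the 25%/75% fps split is the rough estimate from above):

```python
# battery capacities (mAh) and measured runtimes to auto-shutdown (hours)
ipod_mAh, zii_mAh = 900, 1200
ipod_h, zii_h = 6.0, 5 + 40 / 60

# naive index: assumes both devices did the same work per hour
naive = (ipod_mAh / zii_mAh) * (zii_h / ipod_h)
print(round(naive, 2))  # → 0.71

# fold in the frame-rate fluctuation: ~25% of the time at 60 fps, ~75% at 30,
# versus the ipod's steady 60 fps
zii_avg_fps = 0.25 * 60 + 0.75 * 30   # 37.5 fps
adjusted = naive * (zii_avg_fps / 60)
print(round(adjusted, 2))  # → 0.44
```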

i don't know about you, but i find that comparative index against a gpu as power-efficient as the MBX Lite, particularly coming from a non-gpu, quite satisfactory.
 
I hope you guys don't mind my question.
How does the Tegra chipset perform compared to, e.g., the iPhone 3GS one?
 
I hope you guys don't mind my question.
How does the Tegra chipset perform compared to, e.g., the iPhone 3GS one?

The iPhone 3GS should win this one, especially if you consider that the Tegra APX variants are meant for smartphones.
 
Seriously? Is it that bad? I was looking at the whole picture here: cpu + gpu + video processing and so on. I know that the Cortex-A8 (the 3GS type) crushes the ARM11 core when it comes to the cellphone variants clocked at 550MHz. The Tegra types are clocked higher, it seems.

Is there any information, benches, or comparisons available?
 