View Full Version : Intel Atom Z600


liolio
06-May-2010, 15:30
I realized there was no proper thread on the matter, so I started a new one.
Previews can be read here:
Techreport (http://techreport.com/articles.x/18866)
Anandtech (http://anandtech.com/show/3696/intel-unveils-moorestown-and-the-atom-z600-series-the-fastest-smartphone-processor)
tomshardware (http://www.tomshardware.com/reviews/intel-atom-moorestown-smartphone,2624.html)

I may be overly optimistic, but I see this platform as the game changer in this market segment.
I don't expect it to take over the market in one or two years, but it's clearly Intel's first tough blow in the direction of ARM. Put otherwise:
To paraphrase vice president Biden, this is a big f—ing deal. After attending this Moorestown briefing, I walked away fairly convinced that I’d just seen the future of mainstream computing. No, I’m not saying that I think 40% of the market will be toting around Moorestown-based devices next year. I mean that, if certain requisite elements are in place, I see no reason why the median form factor used for computing shouldn’t continue its march from the desktop to the pocket.

I see a lot of nitpicking and criticism in the comments on the aforementioned sites, but honestly I fail to see how one can dismiss Intel's achievement. I didn't expect them to "get there" that fast.
OK, it's still not perfect for smartphones, most likely due first to cost and then to form factor, but it's imho already almost a death blow to ARM-supporting companies in the netbook segment.
I also read concerns about the OS and the software, and I don't see this as a problem at all. Honestly it wouldn't surprise me if Apple is actually the first to embrace Intel's solution (say sometime in 2011).

I'm overly enthusiastic, but when I read Anand's article (the first one up, it seems) I couldn't help but think "that's huge and it spells no good for competitors..."

MfA
06-May-2010, 21:10
Apple is married to ARM for the iPhone OS based devices ... it's a binary platform, a switch would be painful. The only potential OS X Atom platform would be a Netbook, and I don't think Apple would make a 360 on that. Besides, as long as they don't push gaming what the hell do they care if a processor is faster and/or has a process advantage?

Apple doesn't need Atom ... but their competitors need some way to make Atom more relevant. I personally think the best way to do that is by pushing gaming (real gaming, PSP style gaming or at the very least DS style gaming ... not mobile phone style gaming).

Pressure
07-May-2010, 01:22
I also don't see Apple throwing away the $350 million in investments they made this last year in companies working on ARM designs (P.A. Semi and Intrinsity), even though it is pocket change for them.

Intel will become an option when the power management is better than what ARM designs can squeeze out at comparable performance.

Lazy8s
07-May-2010, 07:08
ARM designs still seem to be getting quite a bit more CPU performance per cost, so Apple should continue to push the hardware envelope in performance and integration with ARM for the foreseeable future.

Intel's aggressive push in graphics and also Medfield's process advantage will give them strong selling points, though, to some other market players.

rpg.314
07-May-2010, 07:16
I also don't see Apple throwing away the $350 million in investments they made this last year in companies working on ARM designs (P.A. Semi and Intrinsity), even though it is pocket change for them.

Intel will become an option when the power management is better than what ARM designs can squeeze out at comparable performance.
Apple is free to choose any ISA as long as it is ARM.

aaronspink
07-May-2010, 08:43
Apple is free to choose any ISA as long as it is ARM.

You mean the ISA they helped architect from the company they helped found!

Kurt
07-May-2010, 09:14
The only potential OS X Atom platform would be a Netbook, and I don't think Apple would make a 360 on that.

What difference would it make?

tangey
07-May-2010, 11:49
Reading some articles on UMPCportal, and re-reading the AnandTech article, it appears that there may be an as-yet-unannounced variant of Moorestown to come, called Moorestown-W. This *WILL* add the PCI support necessary to run full Windows 7 (i.e. not Windows Mobile). We heard some mutterings about Moorestown-W quite a while back.

We have also heard about "Oak Trail", which is apparently the Pine Trail replacement for netbooks.

It could well be that Moorestown-W and Oak Trail are one and the same. With a 400MHz SGX535, it would have twice the graphics performance of the existing Z-series chipset being used in MIDs and some netbooks (and probably an increase on Pine Trail's graphics) AND full video decode capability without needing the separate Broadcom chip. And the massive decrease in power usage would give brilliant battery life.

Of course we've been here before with Pine Trail, but it is a possibility.

rpg.314
07-May-2010, 13:01
You mean the ISA they helped architect from the company they helped found!

Yeah, the same ISA on which 2e5 apps for their platform run. :wink:

liolio
07-May-2010, 13:23
Yeah, the same ISA on which 2e5 apps for their platform run. :wink:
In which languages are those apps coded?

MfA
07-May-2010, 13:44
What difference would it make?
Well they'd look silly ... which is not a big deal, but regardless I don't think they will do a 360 on the netbook issue.

rpg.314
07-May-2010, 15:01
In which languages are those apps coded?

In a family of languages famed for combining the power of an assembly with the portability of an assembly. :wink:

liolio
10-May-2010, 09:12
In a family of languages famed for combining the power of an assembly with the portability of an assembly. :wink:
:lol: I guess I didn't properly get what MfA meant by binary platform.

So I'll agree that Apple may not shift in the near future, as the touted difference in performance is not worth the software expense. Still, as others have pointed out, companies other than Apple may find the platform interesting for tablets.
I'm not fond of cell/smart phones at all; I see them as some sort of enslavement and I personally don't need them for professional reasons. Netbooks, on the other hand... :)
I'm looking forward to seeing Intel's variant of the platform intended for netbooks, say a two-core variation supporting PCI. As I don't play games on my computer, I'm getting more and more bothered by the bulk of a standard PC. It's huge, and then come the cables and wires. I got a used laptop for free from my job and it's still too big, battery life is too low, etc. A netbook based on Atom could be perfect for my daily use. Intel has a winner here imho. Putting the Apple talk aside, we can pretty much say that Intel has already claimed a victim: Nvidia's Ion 2 platform is set to disappear soon, and VIA will have a tough time competing too.

Overall, after reading quite a few comments on various forums, I have the feeling that a lot of the talk, mostly against Intel, is a manifestation of the old RISC vs CISC war, or rather "Intel vs the world". A lot of people want to see in ARM the RISC stronghold that may threaten Intel's dominance. Intel's solution is not perfect, that's clear, but the talk is for the most part not fair. A lot of people assume that Intel is doomed due to X86 overhead, while on the other side they assume it will be a piece of cake for ARM to catch up on things like floating-point performance while still consuming less, being cheaper, etc.
I feel like once again people are underestimating Intel. Next year with Medfield, Intel may very well dominate the high end with chips powerful enough to fit anything from high-end smartphones to netbooks and tablets. Maybe I underestimate ARM and the companies supporting its architectures, but I think that as far as the high end is concerned the gap between Intel and ARM will only grow to Intel's advantage. I think manufacturers will have an incentive to move to Intel's offerings sooner than a lot of people expect. With regard to software, how many of those ten thousand apps available on the iPhone are worth it?
Back to Apple: they invested money in P.A. Semi and Intrinsity. That's a lot of money, but for Apple I read it's "only" a quarter of income, and shifting to Intel (at least for tablets in the near future) would not mean that their investments would be lost. Intel may secure the high end sooner rather than later, but it's clear that they are years away from being able to compete in the lower segments. I know a lot of people will disagree, but I don't think Apple will accept being at a performance disadvantage for too long for the sake of the 99.9% of shovelware among the 50 000 apps in the App Store.

That's the end of my rant. I know most people will disagree, but clearly ARM engineers must have a lot of pressure on their shoulders right now, and their "eagles" had better turn out to be an astounding piece of tech.

Exophase
11-May-2010, 04:54
You mean the ISA they helped architect from the company they helped found!

Apple's involvement may have spun off ARM Ltd, but the ISA itself far predates this. Apple certainly did not help architect it.

darkblu
11-May-2010, 05:58
liolio, I admire your optimism about intel's next year foray into ARM territories ; )

Intel surely have the litho advantage, but that cannot be kept indefinitely. Intel may not be at the end of the x86 rope just yet, but they're steadily reaching it. Architectures that were designed for human friendliness (versus computational efficiency, etc) are a bit past their prime time, and Intel's effort to keep said architectures en vogue is costing them more and more (it already cost them the smartphone market).

So I'm not so sure which hypothetical scenario could occur sooner - Apple going Intel on the idevices, or Intel dropping x86 from the handheld segment.

Kurt
11-May-2010, 09:01
Well they'd look silly ... which is not a big deal, but regardless I don't think they will do a 360 on the netbook issue.
But I thought 360 meant no change in direction, so what you are saying is, they ARE going to do it? Or is this one of those 'I turned 360 and walked away' jokes? :) Sorry for being thick.

liolio
11-May-2010, 10:16
liolio, I admire your optimism about intel's next year foray into ARM territories ; )
I would put it the other way around I'm pessimistic for ARM ;)

Intel surely have the litho advantage, but that cannot be kept indefinitely. Intel may not be at the end of the x86 rope just yet, but they're steadily reaching it. Architectures that were designed for human friendliness (versus computational efficiency, etc) are a bit past their prime time, and Intel's effort to keep said architectures en vogue is costing them more and more (it already cost them the smartphone market).

So I'm not so sure which hypothetical scenario could occur sooner - Apple going Intel on the idevices, or Intel dropping x86 from the handheld segment.
I think you're putting too much weight on the touted "X86 overhead of doom". ARM is the leader, so Intel needs a competitive edge; being a match would have gotten them nowhere. They decided to go for greater performance at the cost of higher power consumption, and they gave themselves time to meet the power requirements of handheld devices. If Moorestown is not just a paper dragon, it looks like they are already (or almost) there. Power consumption would be acceptable for a high-end smartphone and performance is better than anything else. Still, they don't have enough of a competitive edge: cost is against them, and form factor is another issue. The thing that impresses me is that they got there using their 45nm process, not their newest. It really spells trouble for ARM derivatives, as "Medfield" will be there next year. I have a tough time believing that we will see 28nm ARM parts for most of 2011, no matter whether one goes with TSMC or GF.

One other thing with regard to the X86 overhead: I'm close to thinking that it's irrelevant/insignificant. Say Intel didn't need a competitive edge; do you really think that the gap in power consumption between Atom and ARM CPUs would have been that big? Atom's TDP was 2.5 W if my memory serves me right, which is more than its ARM counterparts, but the performance is nothing close either. If Intel had only wanted to match ARM performance, the first Atom's consumption could have been lower:
*remove the "high performance" FP unit, put in a sucky but low-power FP unit
*remove the "high performance" SIMD unit, same as above
*use a narrower memory channel (as in the new ones)
*remove hyperthreading
*aim at a lower clock speed
*move to a shorter pipeline
Do you think it would still consume 2.5 W for the sake of X86 overhead? I don't think so. Power consumption would be really close to the ARM counterparts, and one would realize that the alleged overhead is not really the problem. The problem is that Intel can't go for parity; imho ARM is king of the hill, and that's the overhead they are facing now.

MfA
11-May-2010, 12:59
But I thought 360 meant no change in direction, so what you are saying is, they ARE going to do it? Or is this one of those 'I turned 360 and walked away' jokes? :) Sorry for being thick.
Ugh, thanks for ruining my day ... nothing like feeling like a moron early in the morning :/ Yeah, I meant 180.

Simon F
11-May-2010, 13:10
One possible benefit of the x86 ISA over the ARM (ignoring the Thumb ISA for the moment) is that it is probably a bit denser (www.csl.cornell.edu/~vince/papers/iccd09/iccd09_density.pdf). The savings in bus bandwidth and cache efficiency might cancel out the increased decode complexity <shrug>

darkblu
11-May-2010, 17:22
I think you're putting too much weight on the touted "X86 overhead of doom".
That overhead is not just transistor-related, it's present on various levels - from architecture licensing (yes, you can finally design your own atoms now, but who'd want to?), to compiler backends (the reason some x86 compilers generate good code is the enormous amount of human effort poured into catering to that architecture), to a meaningless organization of the opcode space (i.e. without much, if any, relation to opcodes' use statistics).

ARM is the leader, so Intel needs a competitive edge; being a match would have gotten them nowhere. They decided to go for greater performance at the cost of higher power consumption, and they gave themselves time to meet the power requirements of handheld devices. If Moorestown is not just a paper dragon, it looks like they are already (or almost) there. Power consumption would be acceptable for a high-end smartphone and performance is better than anything else. Still, they don't have enough of a competitive edge: cost is against them, and form factor is another issue. The thing that impresses me is that they got there using their 45nm process, not their newest. It really spells trouble for ARM derivatives, as "Medfield" will be there next year. I have a tough time believing that we will see 28nm ARM parts for most of 2011, no matter whether one goes with TSMC or GF.
Well, as we already discussed, lithography is Intel's main weapon in this battle.

One other thing with regard to the X86 overhead: I'm close to thinking that it's irrelevant/insignificant. Say Intel didn't need a competitive edge; do you really think that the gap in power consumption between Atom and ARM CPUs would have been that big? Atom's TDP was 2.5 W if my memory serves me right, which is more than its ARM counterparts, but the performance is nothing close either. If Intel had only wanted to match ARM performance, the first Atom's consumption could have been lower:
*remove the "high performance" FP unit, put in a sucky but low-power FP unit
*remove the "high performance" SIMD unit, same as above
*use a narrower memory channel (as in the new ones)
*remove hyperthreading
*aim at a lower clock speed
*move to a shorter pipeline
Do you think it would still consume 2.5 W for the sake of X86 overhead? I don't think so. Power consumption would be really close to the ARM counterparts, and one would realize that the alleged overhead is not really the problem. The problem is that Intel can't go for parity; imho ARM is king of the hill, and that's the overhead they are facing now.
Intel spent those 2.5W for x86 binary compatibility - for the benefit of running code that was once compiled for *that* SIMD and *that* FPU. And the result was as expected - Intel secured themselves the market for little Windows machines (AKA netbooks) - nothing more, nothing less. Unfortunately, that gave them nothing in the handheld markets.

Btw, your logic goes either way. Do you believe that if ARM did not focus so much on power draw they could not have come up with a 2.5W core of equal-or-better performance to atom? Hint: check out the new A9's ; )

IMHO, until Intel start producing equally watt-efficient designs at the same litho nodes as ARM licensees, ARM are safe and sound. And I can't see Intel achieving that while sticking to the venerable x86. I'm not saying that from the position of a CPU designer (I'm not), but from the position of somebody who's been closely following the history of the architecture (my first asm code was for 8080).

One possible benefit of the x86 ISA over the ARM (ignoring the Thumb ISA for the moment) is that it is probably a bit denser. The savings in bus bandwidth and cache efficiency might cancel out the increased decode complexity <shrug>
That would have likely been the case *if* x86 was a modern CISC ISA designed from scratch with op use statistics in mind. We know that was not the case - the ISA was made following the model of a historical tarball. When your opspace has a bunch of 1-byte BCD arithmetic instructions (such as AAA), while a CMOV for something as rudimentary as r32->r32 (for a measly 8-reg file!) is 3 bytes, the density benefits from the ISA become questionable. And this is before we consider things like the architecture's encoding expressiveness, e.g. 2-operand vs 3-operand encoding, etc. At the end of the day, IA-32 yields just a tad higher density than your vanilla RISC of choice, and gets easily beaten by narrow RISCs like Thumb, compiler's stupidity notwithstanding.
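
To make the encoding point above a bit more concrete, here is a rough sketch in Python that tallies the byte counts of a few hand-assembled register-to-register instructions. The encodings are the usual ones for these specific forms (AAA, CMOVE and MOV/ADD on IA-32; ADD and MOVEQ on 32-bit ARM; ADDS on 16-bit Thumb), but the choice of examples and the helper name `sequence_size` are assumptions made purely for illustration. It says nothing about real compiled code, where density depends heavily on the compiler and the workload.

```python
# Hand-assembled encodings for a few register-to-register instructions.
# Illustrative only: a handful of tiny cases, not a code-density benchmark.

X86 = {
    "aaa":            bytes.fromhex("37"),      # 1-byte BCD adjust after addition
    "cmove eax, ebx": bytes.fromhex("0f44c3"),  # conditional move, r32 -> r32
    "mov eax, ebx":   bytes.fromhex("89d8"),
    "add eax, ecx":   bytes.fromhex("01c8"),
}

ARM = {
    "add r0, r1, r2": bytes.fromhex("020081e0"),  # 0xE0810002, little-endian
    "moveq r0, r1":   bytes.fromhex("0100a001"),  # 0x01A00001, little-endian
}

THUMB = {
    "adds r0, r1, r2": bytes.fromhex("8818"),     # 0x1888, little-endian
}


def sequence_size(table, mnemonics):
    """Total encoded size, in bytes, of a sequence of instructions.

    `sequence_size` is a hypothetical helper for this sketch.
    """
    return sum(len(table[m]) for m in mnemonics)


if __name__ == "__main__":
    # d = a + b with both sources preserved.
    # Two-operand x86 needs a copy first; three-operand ISAs do not.
    three_operand_add = {
        "x86":   sequence_size(X86, ["mov eax, ebx", "add eax, ecx"]),
        "ARM":   sequence_size(ARM, ["add r0, r1, r2"]),
        "Thumb": sequence_size(THUMB, ["adds r0, r1, r2"]),
    }
    # Register-to-register conditional move.
    conditional_move = {
        "x86": sequence_size(X86, ["cmove eax, ebx"]),
        "ARM": sequence_size(ARM, ["moveq r0, r1"]),
    }
    print("d = a + b:", three_operand_add)
    print("conditional move:", conditional_move)
    print("aaa:", sequence_size(X86, ["aaa"]), "byte")
```

In these particular cases the two-operand x86 sequence for d = a + b comes to 4 bytes, the same as the single ARM instruction, while Thumb needs 2 (an LEA would bring x86 down to 3). The conditional move is 3 bytes on x86 against 4 on ARM, which fits the "just a tad denser than a vanilla RISC" picture rather than a large win.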

aaronspink
11-May-2010, 17:49
Apple's involvement may have spun off ARM Ltd, but the ISA itself far predates this. Apple certainly did not help architect it.

Apple had people at ARM in Cambridge working on the architecture.

aaronspink
11-May-2010, 17:52
liolio, I admire your optimism about intel's next year foray into ARM territories ; )

Intel surely have the litho advantage, but that cannot be kept indefinitely. Intel may not be at the end of the x86 rope just yet, but they're steadily reaching it. Architectures that were designed for human friendliness (versus computational efficiency, etc) are a bit past their prime time, and Intel's effort to keep said architectures en vogue is costing them more and more (it already cost them the smartphone market).

Recent history is that of the embedded architectures retreating ever lower as x86 has pushed down and pushed them out of traditional markets. So far, it's the embedded architectures' inability to adapt and offer compelling performance levels that has been the story.

MfA
11-May-2010, 20:01
I like throughput computing, but ARM is only marginally better suited to it than x86 ... and hell, no one is even using ARM for this whereas x86 at least "has" Larrabee. ARM is actually going away from throughput efficiency with diminishing return speed ups such as superscalar OoO execution and large caches ... once cores get that fat where is the big advantage of ARM?

darkblu
11-May-2010, 20:02
Recent history is that of the embedded architectures retreating ever lower as x86 has pushed down and pushed them out of traditional markets. So far, it's the embedded architectures' inability to adapt and offer compelling performance levels that has been the story.
The fact that Intel are the proverbial 800-pound gorilla (fabbing power, etc) has nothing to do with it, i assume. E.g. I'm sure you know why Apple switched to Intel from PPC, right? Actually, I see the opposite trend these days - ARM are pushing x86 out of its traditional low-end stronghold (the netbooks) through the new generation of tablets.

Apropos, is it 'the embedded architectures inability to offer compelling performance' that keeps those three PPC SoC-based machines (the 3 Blue Genes) in the current top10 of Top500 list of supercomputers (http://en.wikipedia.org/wiki/Top500)?

Come on, let's not fool ourselves. Performance/power has little to do with Intel's current position in the middle segment of the computational spectrum. Intel have been pushing x86 for generations in two directions alone - absolute performance, and windows compatibility. The moment you introduce (a) power efficiency, and (b) don't care about windows (or windows does not care about Intel - same effect), x86's advantages shrink and vanish. Intel have been desperately trying to address (a) lately, but I don't see how they could fight (b). For reference, see what's happening with Windows Phone (http://www.mobiletechworld.com/2010/05/05/intels-moorestown-chipset-will-not-never-support-windows-phone-7/) (i find Intel's quoted response hilarious, for what it's worth).

rpg.314
11-May-2010, 20:04
Actually, I see the opposite trend these days - ARM are pushing x86 out of its traditional low-end stronghold (the netbooks) through the new generation of tablets.
I am looking fwd to ARM netbooks too. Where can I find a decent collection of them to choose from? :lol:

rpg.314
11-May-2010, 20:06
Apropos, is it 'the embedded architectures inability to offer compelling performance' that keeps those three PPC SoC-based machines (the 3 Blue Genes) in the current top10 of Top500 list of supercomputers (http://en.wikipedia.org/wiki/Top500)?
Doubtful. Very doubtful. :???:

aaronspink
11-May-2010, 23:53
E.g. I'm sure you know why Apple switched to Intel from PPC, right? Actually, I see the opposite trend these days - ARM are pushing x86 out of its traditional low-end stronghold (the netbooks) through the new generation of tablets.

PPC was a dead end? And about those ARM netbooks. Um, yeah, those are selling, I think. It is kinda hard to tell.

Apropos, is it 'the embedded architectures inability to offer compelling performance' that keeps those three PPC SoC-based machines (the 3 Blue Genes) in the current top10 of Top500 list of supercomputers (http://en.wikipedia.org/wiki/Top500)?

BlueGene isn't about the cpu isa but about the network architecture like all ultra scale machines.

Come on, let's not fool ourselves. Performance/power has little to do with Intel's current position in the middle segment of the computational spectrum.

Actually, unless you don't care about performance, there really hasn't been anyone who has competed with x86 in performance/power.

thop
12-May-2010, 01:17
PowerVR graphics? No Linux support then.

Helmore
12-May-2010, 01:21
PowerVR graphics? No Linux support then.

Pretty much the only platform they're supporting is Linux.

thop
12-May-2010, 01:24
Where are proper GMA 500 Linux drivers then?

Exophase
12-May-2010, 02:03
One possible benefit of the x86 ISA over the ARM (ignoring the Thumb ISA for the moment) is that it is probably a bit denser (http://www.csl.cornell.edu/%7Evince/papers/iccd09/iccd09_density.pdf). The savings in bus bandwidth and cache efficiency might cancel out the increased decode complexity <shrug>

That paper doesn't seriously include Thumb-2 aside from a fleeting mention. Thumb-2 is much denser than ARM with only a very small average loss in instruction efficiency (plus some additional small losses in other areas in current implementations). Despite being much denser, it doesn't actually help much with icache pressure, and I doubt it does for x86 either, which is probably not nearly as dense.

Apple had people at ARM in Cambridge working on the architecture.

In 1984, when Acorn was Apple's competitor and Apple were themselves not doing that well (until Macintosh)? Oh well, let's see your source, I guess.

aaronspink
12-May-2010, 03:18
In 1984, when Acorn was Apple's competitor and Apple were themselves not doing that well (until Macintosh)? Oh well, let's see your source, I guess.

The whole reason it is called ARM is because of the joint development with Apple which caused them to spin out the ARM team from Acorn to allow neutral development. Apple started working on ARM basically before the first version(ARM2/3) shipped and used the second generation (ARM6 and later ARM7) in the newton. This is somewhat known in the tech industry and confirmed from multiple sources including wikipedia.

darkblu
12-May-2010, 06:09
PPC was a dead end?
Far from that. Jobs grew tired of trying to spin the negative clock difference with x86. That and IBM screwed up with a bunch of deadlines for the G5 (and Moto had not been particularly stellar with their G4 deliverables either). As you say, well-known things in the industry.

And about those ARM netbooks. Um, yeah, those are selling, I think. It is kinda hard to tell.

I said 'tablets'. Not hard to tell how they sell (given there's one brand that's actually selling). Also, not hard to tell how netbooks have been affected by the foray of said tablets:

http://i13.photobucket.com/albums/a278/Rubxqub/netbook-sales-3.jpg
Some say a market's viability is measured by its growth. But i have no knowledge of such things.

BlueGene isn't about the cpu isa but about the network architecture like all ultra scale machines.
Fine. Are game consoles also about the network architecture?

Actually, unless you don't care about performance, there really hasn't been anyone who has competed with x86 in performance/power.
Oh wow. I wouldn't even try to comment on that.

Laurent06
12-May-2010, 09:35
The whole reason it is called ARM is because of the joint development with Apple which caused them to spin out the ARM team from Acorn to allow neutral development. Apple started working on ARM basically before the first version(ARM2/3) shipped and used the second generation (ARM6 and later ARM7) in the newton. This is somewhat known in the tech industry and confirmed from multiple sources including wikipedia.

Sorry but Wikipedia on its own can't be considered as a reliable source of information.

ARM indeed was founded as a joint venture between Acorn, VLSI and Apple, by end of 1990. (ref (http://groups.google.co.uk/group/eunet.micro.acorn/browse_thread/thread/c3a21c6fffda982c?pli=1)).

ARM1 taped out in early 85, while ARM2 taped out in 86 and was used in Acorn Archimedes, and I'd be extremely surprised if you could find any reference of Apple involvement that early (though it's highly probable they were involved before the founding of ARM Ltd).

aaronspink
12-May-2010, 10:59
Sorry but Wikipedia on its own can't be considered as a reliable source of information.

ARM indeed was founded as a joint venture between Acorn, VLSI and Apple, by end of 1990. (ref (http://groups.google.co.uk/group/eunet.micro.acorn/browse_thread/thread/c3a21c6fffda982c?pli=1)).

ARM1 taped out in early 85, while ARM2 taped out in 86 and was used in Acorn Archimedes, and I'd be extremely surprised if you could find any reference of Apple involvement that early (though it's highly probable they were involved before the founding of ARM Ltd).

Let's put it this way, I personally know several sources that worked for Apple AT Acorn/ARM. You can either believe or disbelieve.

Helmore
12-May-2010, 12:26
Where are proper GMA 500 Linux drivers then?
What I meant was that the only platforms that Intel is currently supporting are Linux based. The Intel Atom Z600 chipsets don't support Windows (XP, Vista and 7), and they don't support Windows Mobile, nor is there support for Windows Phone 7. The platforms that Intel is supporting are Moblin, MeeGo and Android, and they are all based on Linux.

Laurent06
12-May-2010, 12:38
Well I also have access to first hand sources, but it looks like both you and me can't talk too much :)

Rys
12-May-2010, 12:51
Aaron is completely correct about Apple and ARM.

thop, what would you consider a 'proper' driver here? It was Intel's choice to write their own GMA 500 driver, too.

tangey
12-May-2010, 14:53
What I meant was that the only platforms that Intel is currently supporting are Linux based. The Intel Atom Z600 chipsets don't support Windows (XP, Vista and 7), and they don't support Windows Mobile, nor is there support for Windows Phone 7. The platforms that Intel is supporting are Moblin, MeeGo and Android, and they are all based on Linux.

I think Intel will quickly follow up with the rumoured Moorestown-W, which does have PCI support and is thus suitable for Windows.

Remember that their recently discussed Tunnel Creek SoC, which is targeting IVI and digital signage, is very similar to the Lincroft chip (SGX graphics, video encode and decode) but does have a PCI interface. It probably doesn't have all the various extreme power saving features.

http://download.intel.com/pressroom/kits/events/idfspr_2010/pdfs/Tech_Insight-Tunnel_Creek.pdf

thop
12-May-2010, 21:25
thop, what would you consider a 'proper' driver here? It was Intel's choice to write their own GMA 500 driver, too.
A driver that works and is maintained, is all. Not even asking to open source it. I'm still sceptical about the GMA 600 driver, even if Linux is their only supported platform.

Exophase
13-May-2010, 01:43
The whole reason it is called ARM is because of the joint development with Apple which caused them to spin out the ARM team from Acorn to allow neutral development. Apple started working on ARM basically before the first version(ARM2/3) shipped and used the second generation (ARM6 and later ARM7) in the newton. This is somewhat known in the tech industry and confirmed from multiple sources including wikipedia

Or the whole reason it's called ARM is because it stood for "Acorn RISC Machine", which has nothing to do with Apple. Yes, Apple started showing interest in ARM in the late 80's as a mobile platform. But ARM development began 4 years before Newton development.

Saying it's well known and on Wikipedia is not actually giving a source. I think you're confused about what Apple's contribution actually was. The Wikipedia node on ARM shows nothing to corroborate your claims - only that Apple started working with Acorn long after ARM was first developed. That's not "helping create the ISA." The ISA was developed by Sophie Wilson, not Apple. The spin-off company ARM Ltd is completely separate from the ISA's development.

Simon F
13-May-2010, 09:45
This talk by Steve Furber (http://www.computinghistory.org.uk/det/5633/Steve%20Furber%20Talk%20-%20Acorn%20World%20-%2013-09-2009) should shed some light on the history. I think the discussion of ARM starts around the 38 minute mark.

darkblu
13-May-2010, 19:12
This talk by Steve Furber (http://www.computinghistory.org.uk/det/5633/Steve%20Furber%20Talk%20-%20Acorn%20World%20-%2013-09-2009) should shed some light on the history. I think the discussion of ARM starts around the 38 minute mark.
Thank you, Simon. A jolly good talk altogether. Btw, the ARM part starts from the 32min mark, but not listening to the whole talk would be a loss to any of the participants in this thread.

Exophase
14-May-2010, 01:43
Thanks Simon, very interesting watch. I hope this clears up any doubts regarding Apple's involvement - Furber clearly says Apple "came knocking on the door" right after he left in 1990 (around 46:10).

Simon F
14-May-2010, 06:48
I liked the bit about the full CPU simulator being only 800 lines of code. :-)

JohnH
14-May-2010, 14:10
We've clearly been using the wrong approach, it's better to have no money and no people available :ROFL:

Great trip down memory lane!

John.

eastmen
16-May-2010, 04:22
Far from that. Jobs grew tired of trying to spin the negative clock difference with x86. That and IBM screwed up with a bunch of deadlines for the G5 (and Moto had not been particularly stellar with their G4 deliverables either). As you say, well-known things in the industry.



I said 'tablets'. Not hard to tell how they sell (given there's one brand that's actually selling). Also, not hard to tell how netbooks have been affected by the foray of said tablets:

http://i13.photobucket.com/albums/a278/Rubxqub/netbook-sales-3.jpg
Some say a market's viability is measured by its growth. But i have no knowledge of such things.

Fine. Are game consoles also about the network architecture?


Oh wow. I wouldn't even try to comment on that.

Netbooks are dying out because of their high prices vs notebooks and their lack of performance. It also doesn't help that there haven't been many refreshes in over a year. A 1.6GHz Atom is going to perform just as shitty as a 1.2GHz Atom.

I would think Intel's next refresh, which should bring dual-core Atom chips, will help.


I mean, a Dell Mini 10 with a 1.2GHz Atom CPU, 1 gig of RAM and a 3-year-old integrated Intel IGP isn't a very good deal for $300. Paying $400 for the same thing with a 1.6GHz CPU instead isn't great either.

Jump up to $350, right between those, and you get the Inspiron 11z. It comes with a bigger screen, 2 gigs of RAM, a 1.3GHz Celeron and a much newer G45 Intel IGP.


Netbooks just aren't a good value and many see that. The same might be said about the iPad and tablets in general 10 months from now as people learn that they aren't powerful enough to replace their laptops.

darkblu
16-May-2010, 05:24
Netbooks just aren't a good value and many see that. The same might be said about the iPad and tablets in general 10 months from now as people learn that they aren't powerful enough to replace their laptops.
While I agree that netbooks are not particularly good value, I suspect you haven't actually tried an ipad yet. They are freakishly fast for what they do, and quite possibly the best web browsing device on the planet.

ps: Flash can go die in a barn fire, for all I care. Adobe had all their chances and blew them, like the little 'oh-look-we-have-the-windows-desktop-by-the-balls-why-bother-about-embedded' snobs that they were.

Fox5
16-May-2010, 06:52
I like throughput computing, but ARM is only marginally better suited to it than x86 ... and hell, no one is even using ARM for this whereas x86 at least "has" Larrabee. ARM is actually going away from throughput efficiency with diminishing return speed ups such as superscalar OoO execution and large caches ... once cores get that fat where is the big advantage of ARM?


Creative's ARM-based Zii architecture has a throughput focus, doesn't it? We'll probably see the first mobile device supporting OpenCL before anything uses Zii, though.

Exophase
16-May-2010, 06:57
Creative's ARM-based Zii architecture has a throughput focus, doesn't it? We'll probably see the first mobile device supporting OpenCL before anything uses Zii, though.

While Zii has an ARM as its general purpose control core (like the PPC on Cell, and it's a very old and slow ARM at that) the big throughput compute array is anything but ARM.

Actually, I say that but I have no idea what it really is, and I think no one else really does either, unless more info has surfaced.

On the other hand, Furber did say (in the video Simon F just posted in this thread, at that) that he has a research project going with some utterly obscene number of ARM9 cores. Again, not comparable to modern ARM, but in some sense an ARM's an ARM. I do wonder if you could gain more with tinier/simpler cores. Furber has said a lot about ARM being super small and simple because they couldn't afford to make it complex, but it really does a lot of things that were quite extravagant for its time, even if much of it was just generalized solutions to things they needed to have on die anyway.

It seems to me that if you want high data throughput going for really wide SIMD makes the most sense, which would be accomplished either by having a bunch of cores with a shared instruction fetch/decode frontend (GPU shaders approach) or really wide vector instructions (Larrabee approach). If you wanted something with really high control throughput, like AI might be, and I think this is what Furber is doing, you might want the opposite extreme - a bunch of extremely small cores with tiny register files and really small/simple instructions (and not a lot of them, with what you have being specialized for the application). ARM as it exists in any incarnation doesn't seem to cater fantastically to either extreme, but I do think it does better than vanilla x86. And no, I don't consider Larrabee vanilla x86, the x86 part is barely more than a casual point of interest.

Main point is, for many-core you probably want something more specialized, but for right now we still need to run our existing general purpose code.

Exophase
16-May-2010, 07:25
Real OT, but here's the snippet on Zii's compute array:

Media Processing Array - Architecture
High compute density SIMD architecture
24 Processing Elements (PE) in 3 clusters
Each cluster runs the same or independent code
Multiple High bandwidth memory paths
Advanced hierarchical cache structure
Random access to memory per PE
Shared access to ARM memory
Independent DMA controller per cluster
Integer, IEEE 32-bit and 16-bit floating point

Sounds a lot like shaders on various GPUs, right? But I'm far from the expert on these things that many here are. I would guess those 8-per-cluster units are single-issue scalar units, but that still gives Zii grossly more shader ALU power than any portable 3D solution on the market (the units are said to run at 166MHz). DMA is nice too. Too bad that when you're doing 3D so much compute time is spent on texturing and other traditionally fixed-function tasks.

Ailuros
16-May-2010, 08:01
Don't know if you guys have read that one: http://www.imgtec.com/factsheets/SDK/POWERVR%20SGX.OpenGL%20ES%202.0%20Application%20Development%20Recommendations.1.8f.External.pdf

I'm not sure how the Zii exactly handles integers, fp16 and fp32, but I'd say that it's safe to assume that at least for fp32 the ALUs act as scalar units.

SGX ALUs, on the other hand, as described above, can operate as scalar fp32 (highp), Vec2 fp16 (mediump) or Vec4 int8 (lowp). The developer recommendations for those precision levels from the document above are:



Use highp for vertex position and transformation matrices
Use highp or mediump for texture coordinates
Use lowp for normals and colours as long as the range is sufficient



That's of course mostly for SGX520-545 (USSE), SGX543 (USSE2) not included.

It would be interesting to know how much die area the Zii captures.

Exophase
16-May-2010, 09:19
Don't know if you guys have read that one: http://www.imgtec.com/factsheets/SDK/POWERVR%20SGX.OpenGL%20ES%202.0%20Application%20Development%20Recommendations.1.8f.External.pdf

I'm not sure how the Zii exactly handles integers, fp16 and fp32, but I'd say that it's safe to assume that at least for fp32 the ALUs act as scalar units.

SGX ALUs, on the other hand, as described above, can operate as scalar fp32 (highp), Vec2 fp16 (mediump) or Vec4 int8 (lowp). The developer recommendations for those precision levels from the document above are:



That's of course mostly for SGX520-545 (USSE), SGX543 (USSE2) not included.

It would be interesting to know how much die area the Zii captures.

Even more OT, but the 4-way integer SIMD on USSE is actually 10bit 1.1.8 rather than 8bit, as can be seen in the description of lowp in the document you've linked. This does conflict with TI's description but I take IMG more at their word, and I believe I've received direct confirmation on this before.
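To make the 1.1.8 layout concrete, here is a minimal Python sketch. It assumes 1 sign bit, 1 integer bit and 8 fraction bits stored as 10-bit two's complement, which lines up with the lowp range of (-2, 2) at 2^-8 precision in the GLSL ES spec. The actual USSE encoding isn't described in this thread, and the function names are made up for illustration.

```python
FRACTION_BITS = 8
SCALE = 1 << FRACTION_BITS        # 256 steps per unit
MIN_CODE, MAX_CODE = -512, 511    # 10-bit two's complement range (assumed)


def to_lowp(x: float) -> int:
    """Quantize a float to the nearest 10-bit code, saturating at both ends."""
    code = round(x * SCALE)
    return max(MIN_CODE, min(MAX_CODE, code))


def from_lowp(code: int) -> float:
    """Convert a 10-bit code back to the value it represents."""
    return code / SCALE
```

Under these assumptions colours in [0, 1] and unit normals in [-1, 1] fit comfortably, while anything at or beyond +/-2 saturates, which is presumably what "as long as the range is sufficient" is getting at.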

What's the highest end SGX we can consider really on the market right now, 535 still right? And that's 2x USSE1, no? So still only 8x int10 per clock at comparable clock speeds, which still pales in comparison to 24x on Zii.
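The per-clock comparison is just multiplication, sketched below in Python. It is a peak-rate estimate only: it assumes each Zii PE is a single-issue scalar unit (the guess made earlier in the thread), takes the 166MHz figure as reported, and ignores co-issue, texturing and everything else that decides real performance. The helper names are hypothetical.

```python
def lanes_per_clock(units: int, lanes_per_unit: int) -> int:
    """Peak ALU lanes issued per clock across all units."""
    return units * lanes_per_unit


def peak_ops_per_second(lanes: int, clock_hz: int) -> int:
    """Peak lane-operations per second at a given clock."""
    return lanes * clock_hz


sgx535_lanes = lanes_per_clock(units=2, lanes_per_unit=4)   # 2x USSE1, 4-way int10
zii_lanes = lanes_per_clock(units=24, lanes_per_unit=1)     # 24 PEs, assumed scalar

zii_peak = peak_ops_per_second(zii_lanes, 166_000_000)      # reported 166MHz clock
```

So at equal clocks that is 8 vs 24 lanes, a 3x gap on paper, and roughly 4 billion lane-ops per second for Zii at the reported clock.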

MfA
16-May-2010, 16:47
It seems to me that if you want high data throughput going for really wide SIMD makes the most sense
I disagree, I think wide vectors are used exactly because people are using fat cores and this is the only way to make it work with them. The 5 wide VLIW cores on AMD GPUs are a better example of a core well suited for throughput floating point computing IMO (in practice still used in a SPMD setup, but I think that has to do with the history of GPU computing where having low branch granularity didn't impact efficiency much).

Compared to that both x86 and ARM are fat.

aaronspink
16-May-2010, 17:07
I disagree, I think wide vectors are used exactly because people are using fat cores and this is the only way to make it work with them. The 5 wide VLIW cores on AMD GPUs are a better example of a core well suited for throughput floating point computing IMO (in practice still used in a SPMD setup, but I think that has to do with the history of GPU computing where having low branch granularity didn't impact efficiency much).

Compared to that both x86 and ARM are fat.

But it's not 5-wide VLIW, it's 5x16-wide VLIW. As soon as they drop the x16, they'll have less density than NVIDIA!

MfA
16-May-2010, 17:16
Still more density than NVIDIA would have if it tried to drop the x16 ... let alone Larrabee if it tried to drop the x16.

Exophase
16-May-2010, 19:04
I disagree, I think wide vectors are used exactly because people are using fat cores and this is the only way to make it work with them. The 5 wide VLIW cores on AMD GPUs are a better example of a core well suited for throughput floating point computing IMO (in practice still used in a SPMD setup, but I think that has to do with the history of GPU computing where having low branch granularity didn't impact efficiency much).

Compared to that both x86 and ARM are fat.

I think it's apples and oranges; the wide SIMD on modern x86 and ARM doesn't make it fat. Larrabee, for instance, is much leaner and much wider, and there I think it's the x86 part that's more tacked on than the wide SIMD part.

I do consider SPMD still an example of the SIMD I'm getting at, and it's certainly making the cores much leaner (and not just out of heritage). VLIW obviously has its merits too, I didn't mean to exclude that, I was merely referring to a large number of operations per instruction fetch/decode ratio.

MfA
16-May-2010, 19:23
VLIW obviously has its merits too, I didn't mean to exclude that, I was merely referring to a large number of operations per instruction fetch/decode ratio.
A lot of them will be wasted though in divergent kernels. With both x86 and ARM you have the choice between scalar (lot of overhead) SIMD (low branch granularity) and superscalar (fat). VLIW expands the design space, because for each VLIW you can decide at compile time to use either superscalar or SIMD execution ... and when not bogged down with forward compatibility VLIW can do superscalar much leaner (VLIW combined with forward compatibility is really the worst of all worlds in the end, ie. Itanium).
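To put a rough number on the divergence cost, here is a toy model in Python. It isn't anyone's real hardware: the function name and the lockstep-with-masking behaviour are assumptions for illustration. Every lane of a SIMD group steps through each branch path that at least one lane takes, and masked-off lanes count as wasted issue slots.

```python
from typing import List


def simd_utilization(taken: List[bool], cost_then: int, cost_else: int) -> float:
    """Fraction of issued lane-slots doing useful work for one if/else.

    taken[i] says whether lane i takes the 'then' path; costs are in
    instructions per path.
    """
    if not taken:
        raise ValueError("need at least one lane")
    width = len(taken)
    n_then = sum(taken)
    n_else = width - n_then
    useful = n_then * cost_then + n_else * cost_else
    issued = 0
    if n_then:                      # whole group steps through 'then'
        issued += width * cost_then
    if n_else:                      # whole group steps through 'else'
        issued += width * cost_else
    return useful / issued
```

With 16 lanes and equal-cost paths, a single lane going the other way already drops utilization to 50%, whereas a one-lane (scalar) group is always at 100% in this model. That's the branch-granularity trade-off in a nutshell.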

Exophase
16-May-2010, 20:12
A lot of them will be wasted though in divergent kernels. With both x86 and ARM you have the choice between scalar (lot of overhead) SIMD (low branch granularity) and superscalar (fat). VLIW expands the design space, because for each VLIW you can decide at compile time to use either superscalar or SIMD execution ... and when not bogged down with forward compatibility VLIW can do superscalar much leaner (VLIW combined with forward compatibility is really the worst of all worlds in the end, ie. Itanium).

Interesting discussion, I hope no one minds our hijacking too much ;)

I don't disagree with this, for the most part.

I do think that especially "lean" VLIW many-cores will have a lot of specialization per execution-unit, and therefore shouldn't present that much opportunity for SIMD.

One of the downsides of leaner VLIW is that you end up with much wider instructions that will inevitably have a bunch of execution unit NOPs in them. You can stitch them out like TI's C6x does, but then you end up with more complex variable length fetches (although nothing like x86, of course) and execution unit scheduling. If amortized over many cores this might not matter much.
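A toy Python sketch of the NOP "stitching" idea follows. It is loosely modelled on the notion of a per-instruction "chains to next" bit and is not the actual C6x encoding; all names are hypothetical. Each bundle has one entry per execution slot, and the packed form stores only the real ops, each tagged with its slot and a flag saying whether the next op belongs to the same bundle.

```python
from typing import List, Optional, Tuple

Bundle = List[Optional[str]]       # one entry per execution slot, None = NOP
PackedOp = Tuple[int, str, bool]   # (slot, op, chains_to_next)


def pack(bundles: List[Bundle]) -> List[PackedOp]:
    """Drop per-slot NOPs, keeping bundle boundaries via the chain flag."""
    packed: List[PackedOp] = []
    for bundle in bundles:
        ops = [(slot, op) for slot, op in enumerate(bundle) if op is not None]
        if not ops:
            ops = [(0, "nop")]     # an all-NOP bundle still costs one word
        for i, (slot, op) in enumerate(ops):
            packed.append((slot, op, i < len(ops) - 1))
    return packed


def unpack(packed: List[PackedOp], width: int) -> List[Bundle]:
    """Rebuild fixed-width bundles from the packed stream."""
    bundles: List[Bundle] = []
    current: Bundle = [None] * width
    for slot, op, chains in packed:
        if op != "nop":
            current[slot] = op
        if not chains:             # last op of this bundle
            bundles.append(current)
            current = [None] * width
    return bundles


program: List[Bundle] = [
    ["fmul", None, None, "load"],
    [None, None, None, None],
    ["fadd", "fadd", None, None],
]
packed_program = pack(program)
```

Here the fixed-width form needs 3 x 4 = 12 slots while the packed stream holds 5 ops, at the price of a fetch unit that has to find bundle boundaries at run time. That's the fetch/scheduling complexity mentioned above.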

From here the main thing separating VLIW from conventional superscalar is interlocking. Superscalar doesn't necessarily need it to be superscalar, but of course superscalar on x86 and ARM do to be backwards compatible. Stuff like this bites you, and out of order execution then bites you a lot more. SMT is a leaner solution to hiding latencies than OoE, along with large enough register files for software scheduling and perhaps features like the software loop pipelining capabilities in C6x (although those are probably overkill).

I guess it's worth comparing just how big N scalar cores vs 1 N-wide VLIW is when fetch/decode is amortized out. The VLIW has to be compressed to really have comparable code densities, if that's determined to even matter.

I agree that forward compatibility is awful for VLIWs.

A little more on topic: when regarding current x86 and ARM and which is "fatter", the real question to me is if there's anything about ARM that lends itself to leaner OoE than x86. Maybe someone else can comment, I don't have anything on this yet.

eastmen
16-May-2010, 20:47
While I agree that netbooks are not particularly good value, I suspect you haven't actually tried an ipad yet. They are freakishly fast for what they do, and quite possibly the best web browsing device on the planet.

ps: Flash can go die in a barn fire, for all I care. Adobe had all their chances and blew them, like the little 'oh-look-we-have-the-windows-desktop-by-the-balls-why-bother-about-embedded' snobs that they were.

I've used one and I like it, but I won't buy it because it's Apple.

However, some users may want to replace their laptop, buy an iPad and find that it just doesn't cut it, and at $500 for the average price of an iPad you can get a decent laptop.

I've seen this trend. People had desktops and had to replace them, so they bought laptops. However, laptops weren't powerful enough to replace desktops unless you dropped a huge chunk of money. So OEMs made big laptops to put bigger, faster hardware in, and people bought those, but they were no longer very viable to take with you and battery life was really bad. Now we have netbooks that have great battery life, so you'd want to keep them with you almost all the time. However, the processor sucks for anything outside of Word, some internet sites and 8-year-old games.


The same trend may happen to tablets. In fact, a friend of mine talked about wanting WoW on his iPad (before the iPad came out) and I had to sit and explain to him why he'd be waiting a long time and why he shouldn't get his hopes up.

The iPad / tablets may work a bit better than netbooks because they are closer to cell phones with a really big screen, while netbooks are closer to PCs with a really small screen. But they will still be limited and still face the same challenges that netbooks face.

MfA
16-May-2010, 23:46
I do think that especially "lean" VLIW many-cores will have a lot of specialization per execution-unit
Not that much, if you already have a floating point multiplier per slot you don't need to sweat the small stuff.
One of the downsides of leaner VLIW is that you end up with much wider instructions that will inevitably have a bunch of execution unit NOPs in them. You can stitch them out like TI's C6x does, but then you end up with more complex variable length fetches (although nothing like x86, of course) and execution unit scheduling. If amortized over many cores this might not matter much.
On something like Evergreen's VLIW cores (4 general slots, one special purpose slot) you'd only have to shift one instruction word. Not a big deal.
From here the main thing separating VLIW from conventional superscalar is interlocking. Superscalar doesn't necessarily need it to be superscalar
To be able to handle the hazards in software you need to know the exact instructions being executed on a given cycle, just ditching out of order processing is not enough ... if you don't know whether neighbouring instructions execute on the same or on subsequent cycles beforehand you'd need inefficient safety margins during compilation. Tricks like AMD plays with the register file also would not work without VLIW (they only use 3 ports per register bank).

tangey
17-May-2010, 01:20
What's the highest end SGX we can consider really on the market right now, 535 still right?

SGX540 is used in the Samsung S5PC110, which is the AP inside the Samsung Galaxy and Wave handsets.

Exophase
17-May-2010, 03:17
Not that much, if you already have a floating point multiplier per slot you don't need to sweat the small stuff.

But what do you gain by having N FMACs that can also do M load/stores (where M <= N) instead of having N FMACs and M LSUs independent that can be co-issued? I thought maximizing execution width was the driving point of VLIW.

On something like Evergreen's VLIW cores (4 general slots, one special purpose slot) you'd only have to shift one instruction word. Not a big deal.

Just because four units can do the same thing doesn't mean that there will be no dependencies. Making the CPU resolve dependencies instead of the compiler again seems to be defeating the purpose. As I mentioned, I see the main reason for this is compatibility. C6x does quite well w/o interlocks.

To be able to handle the hazards in software you need to know the exact instructions being executed on a given cycle, just ditching out of order processing is not enough ... if you don't know whether neighbouring instructions execute on the same or on subsequent cycles beforehand you'd need inefficient safety margins during compilation. Tricks like AMD plays with the register file also would not work without VLIW (they only use 3 ports per register bank).

Why wouldn't you be able to track what's executing on subsequent instructions? In a good non-interlocked VLIW design (mainly citing C6x again) most instructions are single cycle issue and the only non-deterministic stalls are from cache misses. Branches are fully delayed instead of predicted and there's predication, which works well for high data throughput code. Cache misses stall rather than interlock, which is where SMT would work well.

I wouldn't call limited register file ports a "trick", it's just an additional expense with VLIW, but really that's the case with any wide design. A segmented register file (again, C6x) can help keep this under control if things get especially wide.

SGX540 is used in the Samsung S5PC110, which is the AP inside the Samsung Galaxy and Wave handsets.

Oh yeah, I forgot about that. Probably because Samsung is being surprisingly quiet about it. I wonder when other companies can ship devices with S5PC110, and how much they'll cost. S5PC100 is a pretty great value, gimped 3D aside.

MfA
17-May-2010, 04:54
In a good non-interlocked VLIW design (mainly citing C6x again) most instructions are single cycle issue and the only non-deterministic stalls are from cache misses.
Never mind, I thought you were arguing you could do superscalar without interlocking without going for VLIW.
A segmented register file (again, C6x) can help keep this under control if things get especially wide.
Pretty sure the most common way to refer to such a scheme is banking.

Exophase
17-May-2010, 05:31
Never mind, I thought you were arguing you could do superscalar without interlocking without going for VLIW.

Yes, or more to the point, that the scheduled/"compressed" VLIW (no need for nops in unused EUs) in platforms like C6x is pretty much exactly that, in-order superscalar without interlocking.

Pretty sure the most common way to refer to such a scheme is banking.

Yes I'm pretty sure you're right ;) I think people got the idea.

Ailuros
17-May-2010, 06:58
What's the highest end SGX we can consider really on the market right now, 535 still right? And that's 2x USSE1, no?

The Samsung S5PC110 (Wave, Meizu, Galaxy) contains an SGX540, so that's 4 ALUs.

So still only 8x int10 per clock at comparable clock speeds, which still pales in comparison to 24x on Zii.

16x in the case above, but that's beside the point. Zii strikes me more as a multimedia-oriented GPP than what I'd call a GPU, and you wouldn't typically use an SGX for stuff like video decoding either. If a semi doesn't have its own video decoder and the SoC contains a VXD, there's as much a META GPP processor at its heart as the SGX USSEx is.

There's a reason why I'm asking about die area or, to expand on it a bit, die area times its (expected) utilization vs. performance and, inevitably, power consumption. I don't know what the exact die area for 535 (SM3+) looks like; all they're giving under 65LP@200MHz is 2.6mm2 for SGX520 (SM3.0+) up to 12.5mm2 for SGX545 (DX10.1). The SGX543 (>Vec4 fp16) is at 8mm2/core.

In Zii's case, as a (mostly) GPP core, having 24 PEs doesn't strike me as something that will come for free in terms of area consumed.


Oh yeah, I forgot about that. Probably because Samsung is being surprisingly quiet about it. I wonder when other companies can ship devices with S5PC110, and how much they'll cost. S5PC100 is a pretty great value, gimped 3D aside.

Must be the reason why Samsung has been waving (no pun intended) in the media that the Wave is capable of 89M Tris. Note that IMG typically gives very conservative triangle throughputs and the SGX540/5 are rated at 40M Tris at 200MHz with <50% shader load. Regardless of that, Samsung has been anything but quiet about it.

Exophase
17-May-2010, 07:19
Must be the reason why Samsung has been waving (no pun intended) in the media that the Wave is capable of 89M Tris. Note that IMG typically gives very conservative triangle throughputs and the SGX540/5 are rated at 40M Tris at 200MHz with <50% shader load. Irrelevant to that, Samsung has been all but quiet about it.

What I meant is that I didn't think Samsung has identified those phones as having S5PC110 and thus SGX540+, correct me if I'm wrong though (not that it matters, it's obvious what it is)

Ailuros
17-May-2010, 15:08
What I meant is that I didn't think Samsung has identified those phones as having S5PC110 and thus SGX540+, correct me if I'm wrong though (not that it matters, it's obvious what it is)

http://www.anandtech.com/show/2916/2 If Samsung didn't want to reveal anything in that regard (think of Apple's secrecy tic, for example), IMG wouldn't have placed the prototype on display.

MfA
17-May-2010, 15:24
Yes, or more to the point, that the scheduled/"compressed" VLIW (no need for nops in unused EUs) in platforms like C6x is pretty much exactly that, in-order superscalar without interlocking.
That's not exactly a case in point of superscalar without interlocking without VLIW ... regardless of encoding if the grouping is determined at compile time it's VLIW, which is what happens here and what will happen with any superscalar core without interlocking (even the superscalar MIPS processors had interlocking despite the name).

You can't do this without VLIW ... so really, VLIW is the only option for lean superscalar/SIMD hybrid cores.

Exophase
17-May-2010, 16:14
http://www.anandtech.com/show/2916/2 if Samsung didn't want to reveal anything in that regard (think of Apple's secrecy tick for example) IMG wouldn't had placed the prototype on display.

I don't think Samsung is being deliberately secretive, I was just expecting them to directly promote that their latest phones are using their latest SoC. But I guess it's common practice for phone manufacturers to not talk much about the hardware inside the phones.

That's not exactly a case in point of superscalar without interlocking without VLIW ... regardless of encoding if the grouping is determined at compile time it's VLIW, which is what happens here and what will happen with any superscalar core without interlocking (even the superscalar MIPS processors had interlocking despite the name).

You can't do this without VLIW ... so really, VLIW is the only option for lean superscalar/SIMD hybrid cores.

Note that I didn't say the C6x example wasn't VLIW, I even referred to it as such. We're not actually disagreeing on anything, in fact just both repeating that non-interlocking superscalar == compact encoded VLIW.

tangey
24-Aug-2010, 16:54
Some strong rhetoric coming from Intel at the announcement of the Intel/Nokia joint research facility.

"With (our) Moorestown processor we equal them on standby power, in the next generation Medfield we will equal them on active power," Justin Rattner, Intel's Chief Technology Officer, said.

"I expect us to just pull away after that because we have a fundamental technology advantage, which they don't have," he said in an interview on the sidelines of a news conference opening Intel's joint research centre with Nokia in Northern Finland.

Very aggressive stance. To be fair, they were aggressive in their predictions for Moorestown idle and they've exceeded those, if early website analysis is to be believed. The usage power thing is going to be much harder to address; if they can hit even close to the targets, and get the overall board space improved, it'll get really interesting by the end of next year.

http://economictimes.indiatimes.com/infotech/hardware/Intel-says-to-beat-ARM-in-power-usage/articleshow/6426215.cms

metafor
24-Aug-2010, 17:03
Some strong Rhetoric coming from Intel at the announcement of the Intel/Nokia joint research facility.

"With (our) Moorestown processor we equal them on standby power, in the next generation Medfield we will equal them on active power," Justin Rattner, Intel's Chief Technology Officer, said.

"I expect us to just pull away after that because we have a fundamental technology advantage, which they don't have," he said in an interview on the sidelines of a news conference opening Intel's joint research centre with Nokia in Northern Finland.

Very aggressive stance. To be fair, they were aggressive in their predictions for Moorestown idle and they've exceeded those, if early website analysis is to be believed. The usage power thing is going to be much harder to address; if they can hit even close to the targets, and get the overall board space improved, it'll get really interesting by the end of next year.

http://economictimes.indiatimes.com/infotech/hardware/Intel-says-to-beat-ARM-in-power-usage/articleshow/6426215.cms

Considering Moorestown idle is pretty much all 3rd party IP (the processor being shut down), I don't know how much there is there to boast about.

Exophase
24-Aug-2010, 18:12
As usual, Intel is gloating about idle power and, at best, consumption while the CPU is far from being heavily utilized. When Intel uses terms like "active power" it takes me back to their claims of Moorestown's battery life while browsing the web and playing videos (of course offloaded to decoder blocks), not actually using most or all of the CPU time.

Incidentally, this article is basically admitting that right now they are inferior to ARM in "active power", which I doubt means anything less than perf/watt (nothing else is really meaningful). I also don't know why the title indicates x86 beating ARM, when Intel is only claiming they'll match them.

Yeah, x86 has superior technology, they also have an inferior architecture. Guess which one is easier to change?

3dcgi
25-Aug-2010, 03:31
Yeah, x86 has superior technology, they also have an inferior architecture. Guess which one is easier to change?
I can't tell which you're insinuating, but the answer is architecture. No one's been able to beat Intel in process technology for years.

tangey
25-Aug-2010, 10:25
I also don't know why the title indicates x86 beating ARM, when Intel is only claiming they'll match them.
The "expect us to just pull away after that (medfield) " springs to mind

Exophase
25-Aug-2010, 13:37
I can't tell which you're insinuating, but the answer is architecture. No one's been able to beat Intel in process technology for years.

And Intel has ultimately not changed architecture, so that point is a little moot. Intel has been leading in process technology, but the gap has been narrowing and the main point is that it's something that can be improved. The architectural drawbacks can be mitigated (sometimes entirely) in some contexts, but here they're still carrying baggage and as they go lower power that doesn't go away, it gets worse.

Hopefully it's clear what I mean by "architecture"

Lazy8s
25-Aug-2010, 19:05
Intel also has a tough fight on the business model end for the mobile market.

The TSMC deal helps somewhat with that, and the Nokia partnership is a good foot in the door. What they really need to do, though, is to play up where they can actually have a performance advantage, in high clocks and die areas for the graphics and video cores.

Exophase
26-Aug-2010, 06:29
What they really need to do, though, is to play up where they can actually have a performance advantage, in high clocks and die areas for the graphics and video cores.

I agree, although I'm not sure I see potential for competition on the video cores front; with everyone doing H.264 at 1080p, any advantage seems pretty moot.

The advantage in clocks with their SGX implementations is definitely there (if/when actually clocked like that), the problem is with the sorry state of their SGX drivers for Linux (someone please correct me if they've improved this). This might not apply to their Windows deployments, but if we're talking about Moorestown (ie this thread) then Linux is what's applicable. And despite Intel's tendency to hype further and further down their Atom roadmap Moorestown is really where it should be at for a while into the foreseeable future. Does anyone know if it's seen any deployment yet?

I do wonder what the actual clock advantage is - I know for the historically typical 1.6GHz Atom parts that clock is achieved at the expense of a less power efficient process that translates into lower perf/watt (exact same trade-off is available for Cortex-A9). I imagine the clock advantages for the 3D are tied to this too, although they might be more aggressive even for the more power efficient parts.

Does anyone know what generation of SGX is on Moorestown?

Lazy8s
26-Aug-2010, 08:02
535

tangey
26-Aug-2010, 09:49
Does anyone know what generation of SGX is on Moorestown?

535, with some SKUs clocking @400MHz

Although it was formally announced a few months ago, there's no product yet and, unusually for Intel, there is no tech data available on the Intel website. A significant % of next month's IDF is dedicated to Moorestown product and ecosystem, so expect a big push there.

Ailuros
27-Aug-2010, 12:12
I know that video captures aren't particularly telling but for Intel's Moorestown there's this:

http://www.youtube.com/watch?v=1mo5fg_hePs&feature=related

and IMG's own showcase of the demo in question:

http://www.imgtec.com/demo_room/viewdemo.asp?DemoID=54&DemoTech=POWERVR%20Graphics&DemoDev=Imagination&#ViewPort

and there's of course a showcase of Kwaak3 on a Moorestown smartphone:

http://www.youtube.com/watch?v=UzWGQaPEF9Y

I doubt the smartphone in question has its GPU clocked as high as 400MHz, since that sounds more like tablet/netbook material.

Lazy8s
27-Aug-2010, 14:02
Another video there shows off some displays of the multimedia performance of Moorestown such as multi-point HD streaming plus playback of Avatar in HD, a video teaser for World of Warcraft, and some other backgrounds to IMG's shader view demo like an outer space scene and a rooftop cityscape scene.

http://www.youtube.com/watch?v=bAB-Cqe0yz8

Exophase
27-Aug-2010, 14:23
Hm, okay, so non-useless drivers exist for at least something that isn't Windows. Hopefully we're not talking Android only or something, and hopefully this can propagate back to GMA500 netbooks running Linux.

Ailuros
27-Aug-2010, 14:25
Another video there shows off some displays of the multimedia performance of Moorestown such as multi-point HD streaming plus playback of Avatar in HD, a video teaser for World of Warcraft, and some other backgrounds to IMG's shader view demo like an outer space scene and a rooftop cityscape scene.

http://www.youtube.com/watch?v=bAB-Cqe0yz8

I don't know what the smart-phone frequency could be, but I'd dare to speculate not below 300MHz since that Q3 demo seems a lot faster than what I could see on a Hummingbird. As for the shader views demo here it is in their demo room:

http://www.imgtec.com/demo_room/viewdemo.asp?DemoID=49&DemoTech=POWERVR%20Graphics&DemoDev=Imagination&#ViewPort

The ShaderViews demonstration illustrates the image processing capabilities of the programmable shaders available on POWERVR SGX enabled platforms. It includes a variety of different post-processing effects applied to floating windows and frames zooming across the background. A variety of backgrounds are shown, including still photos, 3D scenes rendered-to-texture (Urban, Asia and Space scenes), and dynamically tessellated terrain.

More than 25 different post-processing effects are illustrated, including edge detection, procedural deformations & distortion effects (punch, swirl, mirror, twirl, etc.), color matrix operations, blurring, frosted glass, greyscale and sepia filters. This demonstration includes a rolling demo with scrollable windows and a touch menu to select scenes and effects.

Post-processing is obviously one of their strengths ;)

DavidC
27-Aug-2010, 22:49
I doubt the smartphone in question has its GPU clocked as high as 400MHz, since that sounds more like tablet/netbook material.

Correct. On earlier demos Intel has done, they say the "faster" version got 100 fps while the "regular" version got ~60 fps. I'm guessing the "regular" version is "smartphone" while the "faster" version is "tablet".

Clock speed of the graphics in Tunnel Creek is 333MHz. I doubt even the Tablet Moorestown parts are clocked faster than that.

tangey
15-Sep-2010, 10:19
Full datasheet is up for the TunnelCreek (now E600) series embedded processors.
http://download.intel.com/embedded/processor/datasheet/324208.pdf

In terms of graphics/video the highlights are:-
SKUs run the graphics @320MHz or @400MHz
fill rate:- 2 pixels per clock
vertex rate, 1 triangle per 15 clocks
x4 MSAA

video encode
720P30 H.264 MP encode
From the table, the max bit rate the encoder achieves is 16M.

Video decode
1080p H.264 MP&HP decode

display output supports 1080p
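
As a rough sanity check on those per-clock figures, here is a minimal sketch (my own arithmetic, not taken from the datasheet) that turns them into peak rates at the two quoted clocks. The function name `peak_rates` is purely illustrative, and the results are theoretical peaks, not measured throughput.

```python
# Per-clock figures as quoted in the post above.
PIXELS_PER_CLOCK = 2        # fill rate: 2 pixels per clock
CLOCKS_PER_TRIANGLE = 15    # vertex rate: 1 triangle per 15 clocks

def peak_rates(clock_mhz):
    """Return (Mpixels/s, Mtriangles/s) at the given graphics clock in MHz."""
    fill = clock_mhz * PIXELS_PER_CLOCK
    tris = clock_mhz / CLOCKS_PER_TRIANGLE
    return fill, tris

for mhz in (320, 400):
    fill, tris = peak_rates(mhz)
    print(f"{mhz} MHz: {fill} Mpixels/s, {tris:.1f} Mtriangles/s")
```

That works out to 640 Mpixels/s and about 21.3 Mtriangles/s at 320MHz, and 800 Mpixels/s and about 26.7 Mtriangles/s at 400MHz.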

Lazy8s
15-Sep-2010, 14:20
The SGX535 still gets 320 even with the 600 MHz Atom. Nice.

Also, the E600 doc reveals a 267 MHz clock for the VXD.

Ailuros
15-Sep-2010, 15:04
Also, the E600 doc reveals a 267 MHz clock for the VXD.

Uhmm if you look into the VXD whitepaper you'll see in which cases it ranges from 50 to 266MHz; the peak frequency of the VXD was never really a secret, it just rarely runs actually as high.

tangey
15-Sep-2010, 15:33
Some of the Intel tech sessions from yesterday's IDF confirmed that Groveland (CE4200) is still using SGX535, so Intel has now reused that core (at various clock speeds and process sizes) in CE3100, Z500, Z600, E600, CE4100, CE4200 and Oaktrail.

I've never seen any proof thus far that Sodaville uses IMG I/P for video decode or encode. Interestingly, PDFs from IMG relating to their results mention that in the demo rooms there is Moorestown with SGX, VXD and VXE, and also Sodaville demoing SGX (no mention of VXD or VXE).

http://www.imgtec.com/corporate/presentations/prelims10/IMG-Prelim-2010-Jun10.pdf
(see page 54).

I mentioned the above earlier, but it may have got lost.

Anyhow, the reason for bringing it up again is that two features that Intel highlighted as being in Groveland (CE4200) are 1) HP H.264 1080p video encode and 2) S3D support.

Interesting, because in the same IMG presentation above, they state on page 43 that S3D is "already supported in chips shipping with SGX and VXD".

And towards the end of last year, IMG announced VXE380 which now includes:-
"H.264 High Profile (HP), at HD resolutions."

http://www.imgtec.com/corporate/newsdetail.asp?NewsID=493

So I wonder does Sodaville currently have any IMG video I/P, and whether Groveland will have the new VXE380 ?

tangey
10-Dec-2010, 01:44
Moorestown is late, there is no doubt about it. It was talked about at the start of 2009 and formally announced in May of this year, but a look at the Intel website will show that, other than the initial press releases, you'll find absolutely NOTHING about it, no technical data at all. This is in contrast to the Tunnel Creek (now E6xx) series, which was announced many months after it and has loads of tech stuff on there. Additionally, Intel repeatedly said that Moorestown would ship in products by the end of this year, but other than a few prototypes, it hasn't shown up.

why ?

Well, two reasons I'd say. First, I assume they got a lukewarm reaction from handset people. Second, the importance of having a Windows-compatible version has quickly become apparent, and they see more sales in that direction than in trying to get 3rd tier handset providers to take Moorestown. Hence Oaktrail overnight took precedence, Moorestown is now being talked about again more for MID/UMPC type devices, and for handsets the emphasis seems to be on Medfield, which they are now saying will appear in the 2nd half of 2011.

but what graphics will medfield have ?

Intel has indicated that it'll be x2 Moorestown, which is x2 Menlow. We know the x2 in Moorestown is being achieved by going from 200MHz to 400MHz.

Clearly the next x2 isn't going to come from running it at 800MHz.

So SGX545 or a multi-core ?

My bet is 545. Why? Well, timing is key here. 543mp has been out for a while, but it doesn't have any DX compliance. 544mp, although having DX9 compliance, was only recently announced. Yes, it could have been available to Intel long before the announcement, but if the next core we see being used by Intel is 544mp, then you'd have to assume that 545 has missed its Intel chance. Additionally, IMG already said in January of this year that 545 was available in test silicon.

But why does Intel need DX compliance in Medfield? Probably they don't. However, what Intel has done up to now is take one SGX graphics core and use it in no fewer than 6 chips: Menlow, Canmore, Sodaville, Groveland, Oaktrail and Tunnel Creek. It is in line with their recent ethos of reusing building blocks as much as possible to amortise costs and reduce development time. So if they follow form, they will step up to 545 and use that across another range of products, which implies ones that might be required to run Windows (and hence need the DX9/10.1 support that 545 provides and 543mp does not).

If by some chance we *don't* see 545 shortly from Intel (i.e. in whatever is announced as a next gen to any of the above, and I think medfield is the 1st one up), then I might have to consider that it is Apple and not Intel that is taking 545 as a step up from their 535.

DavidC
10-Dec-2010, 07:07
There's a lot of delay on the software side (both Android and MeeGo), so we can't conclude it's the hardware alone.

First Moorestown-based phone: http://www.ubergizmo.com/15/archives/2010/12/hilo_vibrant_meego_phone_in_russia_runs_on_intel_atom_processor.html

It does look like it's a rebranded Aava device, but it confirms there will be some handsets based on Moorestown. The release date is now set as early 2011. I'm assuming that, due to multiple factors (customer reaction, delays in software and hardware, technical issues), they might put most of them into tablets and go all out on handsets with Medfield.

Power-wise Moorestown is a significant improvement over any previous Atom platform.

I don't know what the smart-phone frequency could be, but I'd dare to speculate not below 300MHz since that Q3 demo seems a lot faster than what I could see on a Hummingbird. As for the shader views demo here it is in their demo room:

About this, Tomshardware mentioned that Lincroft's communication between the GPU and internal blocks was way better than anything done in this space previously. It might not be just clocks that determine performance.

Lazy8s
10-Dec-2010, 09:30
I had already run myself in circles trying to figure out when and with whom all these DirectX focused SGX cores belong.

Intel got good use out of the 535, but they now potentially have a lot of other SGX cores to deploy into a range of markets within a short time considering they're likely an early licensee of "Rogue".

roninja
10-Dec-2010, 14:03
TWatcher, sorry Tangey - you forgot Moorestown...so make that 7 SGX 535 families....
I think 545 is logical for Intel as sole licensee for it? I would expect some use of XT at some point in the near future, given its 2011-4 lifespan, before we see Rogue inside a shipping SoC, in 2013 I would guess....

Exophase
10-Dec-2010, 16:20
Afaik Oaktrail and Moorestown are the same primary chip with CPU + GPU + memory controller (Lincroft), it's the supplementary input/platform controller hub chip that's different. Langwell in the case of Moorestown, Whitney Point in Oaktrail.

So counting both Oaktrail and Moorestown is redundant, better to just cite Lincroft. I haven't actually heard of Canmore, can I get a reference on that?

This is a useful thread for existing Atom cores:

http://forums.anandtech.com/showthread.php?t=2090553

tangey
10-Dec-2010, 16:52
Actually I did leave out Moorestown by accident. I think it's a distinct SoC from either Oak Trail or Tunnel Creek, both of which have a PCI Express bus (which Moorestown definitely doesn't have).

Canmore was Intel's first SoC targeting CE; it was however not Atom, but an XScale based processor. Its formal name was the CE2110.

http://www.intel.com/design/celect/2110/ce2110_brief.pdf?wapkw=(CE2110)

However I made a mistake calling it as SGX, it was in fact MBXlite.

Exophase
10-Dec-2010, 18:16
It is as I described it. PCI express and other secondary interlinks (and audio) are supplied by a support "hub" chip. In that sense you could say that it's not a complete SoC style chip like Sodaville and its successors, and like many other non-x86 SoCs.

Tunnel Creek, unlike the Lincroft on Oaktrail, does actually have PCI Express as part of the CPU, and is in fact its main interconnect (as opposed to an FSB interface). You could argue that it's closer to a real SoC, although it'll probably still usually be paired with a Topcliff hub (except in Stellarton) in order to have interfaces like USB, SD, SATA, etc.

While I'm on the topic, I actually think Sodaville is a disappointing showing for being the only real Atom SoC currently available. The CPU core is only 1.2GHz, and moreover, they've disabled hyperthreading which is Atom's ace in the hole and really the only thing making it remotely competitive with out of order CPUs. I haven't looked at the successors so I don't know what they're fixing, but it's obvious that Sodaville is selling strictly as a set-top chip and has that market firmly grasped because of its strong video decode capabilities. I could see something like Ontario taking over here, since its on-chip decode can probably handle most tasks and its GPU compute capabilities could pick up the slack where necessary.

roninja
10-Dec-2010, 22:30
Wasn't Canmore the CE3100? I'm fast losing track of all these SoC names and code names.

Exophase
10-Dec-2010, 22:39
Yeah apparently. I hadn't even heard of the CE line until Sodaville. I imagine it's the same for a lot of others >_>

Lazy8s
10-Dec-2010, 22:48
I've always found them the most interesting of Intel's line because of the consumer electronics focus (which gets them into the platforms of companies like Toshiba and Sony and brings them into competition with a whole other set of technologies) and because of their harder push for integration.

I consider the 2700G co-processor as the forerunner to the line.

CE2100 was the Xscale integrated platform.

CE3100 aka Canmore was the first with x86.

And CE4100 aka Sodaville was the first with Atom.

One of these days they should be really competitive little SoCs.

ToTTenTranz
17-Mar-2011, 14:40
I'm digging up this thread just to ask a little question:


"We" still have no idea on what GPU will be in Medfield, right?

Are there any press releases hinting it will be a SGX chip?
Is there even any confirmation that Medfield will be in the Z6xx family? Given it's using a new process node, wouldn't it make sense to name it Z7xx?

Laurent06
17-Mar-2011, 15:10
Given it's using a new process node, wouldn't it make sense to name it Z7xx?
Isn't Oak Trail first product codenamed Z670 and already at 32nm as Medfield is supposed to be?
I am completely lost :smile:

Ailuros
17-Mar-2011, 15:30
Isn't Oak Trail first product codenamed Z670 and already at 32nm as Medfield is supposed to be?
I am completely lost :smile:

Z6x0 is SGX535 at up to 400MHz and should be on 45nm. Medfield is on 32nm and it's still a question mark if it'll contain a MP2 or something rather boring like a SGX545.

Laurent06
17-Mar-2011, 16:15
It looks like you're right: IIUC Oak Trail = Lincroft + Whitney Point (according to Wikipedia (http://en.wikipedia.org/wiki/Intel_Atom)) and Lincroft is 45nm (according to Intel (http://download.intel.com/pressroom/kits/atom/z6xx/pdf/Fact_Sheet_Intel_Atom_Processor_Platform.pdf)). Is that correct?

I find Atom related nomenclature even harder to follow than Intel desktop/server :)

ToTTenTranz
17-Mar-2011, 16:37
Z6x0 is SGX535 at up to 400MHz and should be on 45nm. Medfield is on 32nm and it's still a question mark if it'll contain a MP2 or something rather boring like a SGX545.

So Medfield is sure to have a PowerVR GPU?


SGX545 @ >=400MHz wouldn't be all that boring. It would probably surpass iPad 2's performance (triangle-rate wise, at least).
Intel got the SGX535 working @400MHz when they moved the GPU to the CPU side, at 45nm in Moorestown (from the original 200MHz @ 65nm when it was inside Poulsbo).
Using a now mature 32nm process, and keeping the same CPU block, I believe their target would be to achieve much higher clocks than the competition. 500-600MHz in the GPU, 2GHz in the CPU may not be that far-fetched.
(EDIT: according to wikipedia, if the SGX545 goes for 12.5mm^2 @65nm, @32nm would go for.. ~3mm^2?).

Plus, DX10.1 compatibility could be a requirement for Windows 8, making the SoC a lot more future-proof than any ARM solution currently in the pipeline (except maybe for A9600?).


I mean, why would the SGX 545 even exist, if not as a "custom order" from Intel for windows-compatible devices?

mczak
17-Mar-2011, 19:28
Intel got the SGX535 working @400MHz when they moved the GPU to the CPU side, at 45nm in Moorestown (from the original 200MHz @ 65nm when it was inside Poulsbo).

Poulsbo was at 130nm (!).


Using a now mature 32nm process, and keeping the same CPU block, I believe their target would be to achieve much higher clocks than the competition. 500-600MHz in the GPU, 2GHz in the CPU may not be that far-fetched.

I don't think so. Clock increases just for shrinks are minimal nowadays, and I don't think they'd want to sacrifice perf/power for higher clock.


(EDIT: according to wikipedia, if the SGX545 goes for 12.5mm^2 @65nm, @32nm would go for.. ~3mm^2?).

Yes, assuming perfect shrink. Since that's just the gpu core without i/o seems like a fair assumption.
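
For reference, the "perfect shrink" estimate is just area scaling with the square of the feature size. A minimal sketch, assuming ideal scaling (real designs rarely shrink this cleanly) and using the 12.5mm^2 @65nm figure quoted above; the function name `ideal_shrink_area` is only for illustration:

```python
def ideal_shrink_area(area_mm2, from_nm, to_nm):
    """Scale a die area assuming both dimensions shrink linearly with the node."""
    return area_mm2 * (to_nm / from_nm) ** 2

# SGX545 figure quoted in the thread: 12.5 mm^2 at 65nm, moved to 32nm.
print(round(ideal_shrink_area(12.5, 65, 32), 2))  # 3.03
```

So the ~3mm^2 guess is consistent with an ideal shrink, before accounting for any extra area spent on higher clocks.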

ToTTenTranz
17-Mar-2011, 20:03
I don't think so. Clock increases just for shrinks are minimal nowadays,
That's true for desktop solutions, but not really for mobile SoCs. Look at OMAP3:

OMAP34xx: 65nm -> Cortex A8 @ ~600MHz, SGX530 @ 110MHz
OMAP36xx: 45nm -> Cortex A8 @ ~1000MHz, SGX530 @ 200MHz

OMAP4 is basically retaining approximately the same clocks; it's using the same 45nm process.
But OMAP5 is scaling its dual A15 up to 2GHz, on 28nm.

It seems to me that at least the ARM CPUs have been steadily increasing their clocks according to die shrinks.
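
To put numbers on that, here is a quick sketch of the clock gains implied by the two OMAP3 rows above. The clocks are the approximate figures quoted in this post, and the helper name `clock_gain` is hypothetical.

```python
# Approximate clocks in MHz from the post: (65nm OMAP34xx, 45nm OMAP36xx).
OMAP3_CLOCKS = {
    "Cortex A8": (600, 1000),
    "SGX530": (110, 200),
}

def clock_gain(old_mhz, new_mhz):
    """Return the clock ratio between the newer and the older part."""
    return new_mhz / old_mhz

for block, (old, new) in OMAP3_CLOCKS.items():
    print(f"{block}: {old} -> {new} MHz, x{clock_gain(old, new):.2f}")
```

That is roughly 1.67x on the CPU and 1.82x on the GPU for a single node step, going by these approximate figures.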


and I don't think they'd want to sacrifice perf/power for higher clock.

Well at this time, only Intel knows if they're giving up perf/power if they scale to higher clocks in 32nm.
Given the lack of any news stating otherwise, it seems that Medfield will be a single, dual-threaded Atom core, and they'll have to compete with all the dual A9s somehow.
Intel's statements during MWC showed us they're pretty sure to be beating ARM at their own game with Medfield.

All this secrecy around Medfield makes us doubt that, though.

Ailuros
17-Mar-2011, 20:08
So Medfield is sure to have a PowerVR GPU?

Pretty much yes.

SGX545 @ >=400MHz wouldn't be all that boring. It would probably surpass iPad 2's performance (triangle-rate wise, at least).

Why? At 400MHz it would barely break even in terms of triangle rate with the MP2 in the iPad2.

Intel got the SGX535 working @400MHz when they moved the GPU to the CPU side, at 45nm in Moorestown (from the original 200MHz @ 65nm when it was inside Poulsbo).

At the cost of ~twice the die area. Nothing comes for free.

Using a now mature 32nm process, and keeping the same CPU block, I believe their target would be to achieve much higher clocks than the competition. 500-600MHz in the GPU, 2GHz in the CPU may not be that far-fetched.
(EDIT: according to wikipedia, if the SGX545 goes for 12.5mm^2 @65nm, @32nm would go for.. ~3mm^2?).

12.5mm2@65nm but at 200MHz. What makes you think that a higher frequency won't need far more die area especially when you go for a power gating and not die area optimized core? By the way smaller manufacturing processes don't give advantages to just one core but all of them. Under 65nm and at 200MHz a 545 weighs 12.5mm2 and a MP 16mm2. Since there's a vast performance difference for the latter in almost anything you don't even need to pump up the frequency as much on a MP2, get rid of the real die area difference and still end by N% faster.

Plus, DX10.1 compatibility could be a requirement for Windows 8, making the SoC a lot more future-proof than any ARM solution currently in the pipeline (except maybe for A9600?).

You get DX11 DX9.0/L3 certification with a SGX544 too, and no, I don't see any mobile/embedded game developer going to any lengths to use anything close to DX10 yet. That is if Intel truly intends to enter the smart-phone/tablet market. What do you need Windows for on a smart-phone anyway?

I mean, why would the SGX 545 even exist, if not as a "custom order" from Intel for windows-compatible devices?

I think you should have a careful look at how the iPad2's performance will evolve in the near future when its full power gets unleashed. Intel or anyone else who might have licensed it can then stand small against that one and yell "but it's got DX10.1, get impressed dammit...."

Rys
18-Mar-2011, 12:24
OMAP4 is basically retaining approximately the same clocks; it's using the same 45nm process.
Nope, there's a big gap between OMAP3 and OMAP4, clocks wise.

tangey
18-Mar-2011, 12:40
Otellini said a year or so ago that PowerVr would be their graphics of choice in handheld for at least the next few iterations.

That alone probably guarantees medfield is PowerVr.

Intel said that moorestown/oaktrail would be x2 menlow
and that medfield would be x2 moorestown.

With Oaktrail having options of 535@400MHz, one is left with a choice of SGX545@400MHz or an SGX543/544 for Medfield.

As I said previously on this or perhaps another intel related thread in handheld, first time around Intel worked with SGX535, and used that single core in 6-7 Socs. If their intention again is to do the same, then 545 is the one they will use as it gives them options for windows tablets/netbooks.

If they are being aggressive on power/die area, then they might not go that way, but rather go with 543 or 543mp2 specifically for Medfield.

I am beginning to wonder who is looking at SGX554, which seems to have been launched at an "in-between" time, out of sync for Apple's refresh, not used by ST, unlikely to be used by TI. Unless Intel is jumping straight to 554?

ToTTenTranz
18-Mar-2011, 16:40
Nope, there's a big gap between OMAP3 and OMAP4, clocks wise.

In the GPU, yes. But not in the CPU, according to the released specs. Both OMAP36xx and OMAP4xxx have their CPUs clocked at ~1GHz.

Nonetheless, that further proves my point that clocks have been going up quite fast.



Intel said that moorestown/oaktrail would be x2 menlow
and that medfield would be x2 moorestown.

With Oaktrail having options of 535@400MHz, one is left with a choice of SGX545@400MHz or an SGX543/544 for Medfield.

As I said previously on this or perhaps another intel related thread in handheld, first time around Intel worked with SGX535, and used that single core in 6-7 Socs. If their intention again is to do the same, then 545 is the one they will use as it gives them options for windows tablets/netbooks.

^ My point exactly. Intel would save a lot of money if they develop only one "32nm Atom" SoC and then just add a "controller hub" with PCI, SATA, etc for Windows-compatible systems.
Plus, given the "timing" of the SGX545 announcement, it really seems like something that was specifically ordered for future windows devices.
Why else would IMG create a DX10.1 compliant GPU, if not to be coupled with a x86 CPU?
And I'm pretty sure AMD isn't interested.


I am beginning to wonder who is looking at SGX554, which seems to have been launched at an "in-between" time, out of sync for Apple's refresh, not used by ST, unlikely to be used by TI. Unless Intel is jumping straight to 554?
Samsung's follow-up to Orion?

Laurent06
18-Mar-2011, 17:10
Why else would IMG create a DX10.1 compliant GPU, if not to be coupled with a x86 CPU?
Isn't Windows Mobile (or whatever it's called) using DirectX?

ToTTenTranz
18-Mar-2011, 17:24
Isn't Windows Mobile (or whatever it's called) using DirectX?

WP7 requires DirectX 9, which is already supported by pretty much every modern GPU for mobile devices.
Besides, all WP7 have a 1st-gen Snapdragon with an Adreno 200, which is probably the weakest GPU of the bunch.

Laurent06
18-Mar-2011, 17:28
WP7 requires DirectX 9, which is already supported by pretty much every modern GPU for mobile devices.
Besides, all WP7 have a 1st-gen Snapdragon with an Adreno 200, which is probably the weakest GPU of the bunch.
I still fail to see how this proves that DX10.1 was only done for x86 needs.

ToTTenTranz
18-Mar-2011, 17:41
I still fail to see how this proves that DX10.1 was only done for x86 needs.

Do you know of any hardware+software system using DX10.1 that doesn't use a x86?

Lazy8s
18-Mar-2011, 18:03
Wait too much longer for a 545 based SoC and a hypothetical SGX548 (USSE2 MPcore version of a 545) starts to make more sense.

I think licensees would instead jump to Series6, and I don't envision IMG announcing a 548 out of the blue to be ready in time anyway. I think Intel's roadmap simply got disrupted in more ways than one along the way, but they were among the first with MBX (Series4 if you want to consider it that) and Series5 and could no doubt catch up with a Series6 implementation.

ST-Ericsson's A9600 is something really special, though. It's not just the first announced Series6 product; those clock speeds/implementation show they can end up being a very competitive player.

Laurent06
18-Mar-2011, 18:04
Do you know of any hardware+software system using DX10.1 that doesn't use a x86?
I don't, but don't you think MS has been discussing for some time now with IP companies about the upcoming Windows 8 which will run on ARM? Of course this is all speculative :)

ToTTenTranz
18-Mar-2011, 18:41
I don't, but don't you think MS has been discussing for some time now with IP companies about the upcoming Windows 8 which will run on ARM? Of course this is all speculative :)

Yes, but both Vista (launched in 2006) and Win7 have the same DX9 requirement for their "enhanced UI (Aero)".

5 years later, Microsoft could be "forcing" a DX10 requirement for a supposed "Aero 2" in Windows 8, ARM or not.
Furthermore, I can see a scenario where Windows 8 thoroughly uses DirectX Compute Shader for many tasks, for example. You can't get that with DirectX 9L, afaik.

Ailuros
18-Mar-2011, 20:22
Furthermore, I can see a scenario where Windows 8 thoroughly uses DirectX Compute Shader for many tasks, for example. You can't get that with DirectX 9L, afaik.

Still my question remains: for a smart-phone or tablet? If you'd say netbook or higher I could eventually understand it, but a smart-phone?

On a less related note, SGX535 passed OpenCL conformance at Khronos excluding fp64 and atomic operations: http://www.khronos.org/adopters/conformant-products/

What's weird is that IMG filed the 535 for conformance (probably due to its having the widest deployment amongst all SGX cores) and not SGX545.

ToTTenTranz
19-Mar-2011, 01:51
Still my question remains: for a smart-phone or tablet? If you'd say netbook or higher I could eventually understand it, but a smart-phone?

As I said: Intel would make the same next-gen Atom CPU for both smartphones, tablets and netbooks.
Netbooks and some tablets would get an additional controller hub for PCI-Ex and SATA, for example.

Erinyes
19-Mar-2011, 04:10
In the GPU, yes. But not in the CPU, according to the released specs. Both OMAP36xx and OMAP4xxx have their CPUs clocked at ~1GHz.

Nonetheless, that further proves my point that clocks have been going up quite fast.


OMAP 4440 has a clock of 1.5 GHz, but I think it's more of a tablet than a smartphone part.

Otellini said a year or so ago that PowerVr would be their graphics of choice in handheld for at least the next few iterations.

That alone probably guarantees medfield is PowerVr.

Intel said that moorestown/oaktrail would be x2 menlow
and that medfield would be x2 moorestown.

With Oaktrail having options of 535@400MHz, one is left with a choice of SGX545@400MHz or an SGX543/544 for Medfield.

As I said previously on this or perhaps another intel related thread in handheld, first time around Intel worked with SGX535, and used that single core in 6-7 Socs. If their intention again is to do the same, then 545 is the one they will use as it gives them options for windows tablets/netbooks.

If they are being aggressive on power/die area, then they might not go that way, but rather go with 543 or 543mp2 specifically for Medfield.

I am beginning to wonder who is looking at SGX554, which seems to have been launched at an "in-between" time, out of sync for Apple's refresh, not used by ST, unlikely to be used by TI. Unless Intel is jumping straight to 554?

TI's OMAP 5 has been announced and it has a SGX 544-MPx

Lazy8s
19-Mar-2011, 07:44
A 543 doesn't make sense for Intel's API support. A 554 or 544, maybe.

argor
19-Mar-2011, 11:20
Yes, but both Vista (launched in 2006) and Win7 have the same DX9 requirement for their "enhanced UI (Aero)".

5 years later, Microsoft could be "forcing" a DX10 requirement for a supposed "Aero 2" in Windows 8, ARM or not.
Furthermore, I can see a scenario where Windows 8 thoroughly uses DirectX Compute Shader for many tasks, for example. You can't get that with DirectX 9L, afaik.

Remember that when Microsoft showed off Windows 8 on ARM, only SoCs with DX9.0/L3 support were used, like OMAP5.

Blazkowicz
19-Mar-2011, 14:57
requiring pixel shaders 2.0 was a bold enough measure for what is basically a 2D interface.
more shockingly, if it is to run on Cortex A15 or more current SoCs, then Windows 8 won't be a 64bit-only operating system :lol2:

now, will Windows 8 on the PC will be available as 32bit x86, or 64bit only?
if the latter then Intel should stop with their crappy habit of disabling 64bit on the Atom.

Erinyes
21-Mar-2011, 04:03
Remember that when Microsoft showed off Windows 8 on ARM, only SoCs with DX9.0/L3 support were used, like OMAP5.

OMAP 5? Afaik that isn't even sampling yet! (Scheduled for H2 2011)

Ailuros
21-Mar-2011, 07:33
As I said: Intel would make the same next-gen Atom CPU for both smartphones, tablets and netbooks.
Netbooks and some tablets would get an additional controller hub for PCI-Ex and SATA, for example.

Netbooks could theoretically also get served with notebook SoCs. As for smart-phones and tablets specifically DX9.0/L3 (or as Vivante calls it DX11 certified DX9.0) sounds sufficient which is present in SGX544 and the likes.

argor
21-Mar-2011, 08:33
OMAP 5? Afaik that isn't even sampling yet! (Scheduled for H2 2011)

Yes, that is true, it's not sampling yet,
but it was one of the SoCs that were used during the Microsoft presentation.
TI most likely sent a few copies of OMAP5 to Microsoft to be included in the presentation.

Laurent06
21-Mar-2011, 11:31
but it was one of the SoCs that were used during the Microsoft presentation.
TI most likely sent a few copies of OMAP5 to Microsoft to be included in the presentation.
Any link sustaining that claim? I highly suspect this is completely wrong.

Ailuros
22-Mar-2011, 18:26
http://www.xbitlabs.com/news/mobile/display/20110321213911_General_Manager_of_Intel_s_Ultra_Mobility_Group_Leaves_Company.html

ToTTenTranz
09-May-2011, 20:16
http://vr-zone.com/articles/exclusive-intel-s-cedarview-atom-to-sport-powervr-graphics/12117.html

It looks like Atom Cedarview for netbooks and nettops will ditch Intel's graphics for the PowerVR SGX545.
Netbooks will clock the GPU @ 400MHz while nettops will have it a bit higher at 640MHz.

Given that Intel ditched its own GPU architecture for higher-powered Atoms, it seems a clear indication to me that Medfield will also have a PowerVR SGX545.

Nonetheless, this leaves the next-gen Atoms with dirt-low 3D performance compared to low-power Fusions like C-30 and C-50 for netbooks, even if they do bump the CPU performance somehow.

Ailuros
09-May-2011, 20:21
I'll have a damn hard time justifying that design decision.

Blazkowicz
09-May-2011, 21:09
I didn't know some of the current Atoms had Intel graphics, so it's a lottery based on an unreadable model name (who knows what's the difference between a D and an N?).
good to have that cleared away.
notice the 2x performance target over previous PowerVR.

I don't know what to think of the performance but I hope there's a clear focus on drivers. stable, full featured and fast enough (using multithreading).
The Atom is CPU limited anyway in Warcraft III custom maps when there's many, many monsters and things around. It also needs to just work under linux!

Exophase
09-May-2011, 21:24
I'll have a damn hard time justifying that design decision.

And here I (and others) thought it'd probably be based off of Sandy Bridge's GPU, at a lower clock/possibly lower EU count.

It really begs the question, if Medfield at 32nm is supposed to deliver substantially lower power consumption than Moorestown why exactly is Cedar Trail pegged for 10W with presumably the same GPU? Sure it has two cores, but I'm wondering if Medfield won't also. In fact, I'd say Medfield absolutely needs to have two cores in order to look at all competitive in 2011.

notice the 2x performance target over previous PowerVR.

Actually, the claim is a 2x performance improvement over the graphics in Pinetrail, which uses Intel's GMA 3150 GPU. Considering it's a mobile GPU compared against one meant for IGPs on desktops and laptops it's pretty impressive, although not that impressive considering how old GMA 3150 has become. At any rate, it shows that Intel has no real confidence in its ability to scale down and compete in perf/W at all. I guess that's not that surprising.

ToTTenTranz
09-May-2011, 21:45
I'll have a damn hard time justifying that design decision.
It's not that hard IMO.
http://img217.imageshack.us/img217/5143/intelcedartrailfeatures.png

- By going PowerVR with a VXD, all Atoms will now support video acceleration for FullHD High-Profile decoding, which was unprecedented in the low-cost versions and is actually good (it felt ridiculous that netbooks had worse video performance than most mid-to-high end smartphones).
- They can brag about supporting DX10.1 with hardware vertex shading this time (whooohoo), which means it'll support some more games (horribly) and maybe they'll even come up with an OpenCL driver for it, just for the lulz.
- They can also brag about having twice the 3D performance of the previous generation, which means 2x the performance of a GMA 3150. The truth is, the 3150 (2 pixel shaders) had about half the performance of the 945G (4 pixel shaders) in the 1st-gen Atoms, so we're basically back to the Atom's original 3D performance back in 2008, maybe just a little bit higher and with more functionality.
- With the Atom now on 32nm and with SGX+VXD cores, I think Intel might be able to reach a huge battery life even on the low-cost, 3-cell battery netbooks, which might hurt the tablet rising somehow.


I didn't know some of the current atoms had Intel graphics, so it's a lottery based on an unreadable model name
Only the low-power Z500 and Z600 had PowerVR graphics. The first-gen netbook Atoms paired with the 945G northbridge and the 2nd-gen (current) netbook Atoms had the horrible GMA 3150 (something that struggled with Windows Aero) within the CPU.


(who knows what's the difference between a D and an N?).
D is for nettops (or Small Desktops) and N is for Netbooks.

notice the 2x performance target over previous PowerVR.
It's not 2x over previous PowerVR (although it's probably true compared to the 400MHz SGX535 in Oak Trail), it's 2x more powerful over GMA 3150. And that's not really a hard feat, not even for some ARM SoCs.



I don't know what to think of the performance but I hope there's a clear focus on drivers. stable, full featured and fast enough (using multithreading).
The Atom is CPU limited anyway in Warcraft III custom maps when there's many, many monsters and things around.

Nonetheless, every Atom system we've seen so far (except for Ions and mini-itx boards with PCI-Ex cards, of course) is clearly GPU limited, regarding game performance at least.

Otto Dafe
09-May-2011, 21:46
Is there a credible estimate of the TDP of the GPU part of Sandy on its own?

ToTTenTranz
09-May-2011, 21:51
Is there a credible estimate of the TDP of the GPU part of Sandy on its own?

Nope, and it'll be mighty hard to find one, since it shares the L3 cache with the CPU.

Otto Dafe
09-May-2011, 22:12
Nope, and it'll be mighty hard to find one, since it shares the L3 cache with the CPU.

Yeah, so we don't really know if it would be reasonable to mash it up with Atom I guess.

Blazkowicz
09-May-2011, 23:01
thanks a lot for the replies, I also googled about the GMA 3150 to clear things out.
crucially there was a GMA 3000 before, which is an updated GMA 950 and is nothing like a GMA X3000.

I didn't know Intel played that little naming game, that's why a 3150 can have a worse architecture than a "3000". I was also amused to learn the G31 chipset has the wrong kind of 3100. who knew an "X" was such a vital feature :lol:

tangey
10-May-2011, 00:08
FINALLY !

it was clear for a while now that intel had to be the licensee for SGX545, as IMG announced JAN '10 that they had a lead licensee for it.

Intel has stated for ages now that cedartrail was DX10.1 and OpenGL3 compliant, the only core from IMG that fits that bill is the 545. So in some way its unsurprising that its turned up in CedarTrail.

However in many ways the fact that Intel has chosen it is very surprising. As someone above pointed out, it's a very clear signal that Intel can't get their own inhouse graphics down enough on power to hit the required spec (or at least couldn't do it in time). To have won a design seat for which intel's own inhouse design department would have been competing strongly (after all it was Intel's Graphics dept that got the seat for the previous generation, pinetrail), is a strong endorsement of img's technology and may be seen as IMG going up in form factor designs within Intel. Yes Z500 did appear in some netbooks (dell's 12 inch, and some smaller Sony's) but the Z500 was never really designed with that form factor in mind.

Looking at performance within the PowerVR range, this SGX545@640MHz will certainly be the highest performing single core solution by around a factor of 2, and 2nd only to the dual core 543 in the A5. So why go 545 instead of a dual core? Well, it would appear to me that DX compliance was needed (for windows obviously), and given a combination of SGX545 being already well down the design road, and perhaps only DX9-compliant multi-core being available in the required time frame (IMG could of course design a DX10 multi-core if someone wanted it, but only DX9 was on the roadmap and DX9 might look poor from a marketing point of view), it ended up having to be SGX545.

With regard to video decode, intel have stated that cedartrail will have full blu-ray decode. So that looks like IMG's VXD390/1 block, which can decode multiple blu-ray streams.

I'm not sure that seeing this 545 in cedartrail totally means that this will be the core in medfield. medfield won't be running windows, so no DX requirement, and the space saved by getting rid of that might be better used by putting in a dual core.

ToTTenTranz
10-May-2011, 00:39
I'm not sure that seeing this 545 in cedartrail totally means that this will be the core in medfield. medfield won't be running windows, so no DX requirement, and the space saved by getting rid of that might be better used by putting in a dual core.

If Microsoft is supporting ARM in Windows 8, it means they'll have loosened the requirements for the legacy buses and SATA that Medfield doesn't support.
That said, at some point Medfield may be able to run Windows 8, with the advantage of actually being backwards-compatible with all previous windows software (unlike the ARM builds).

At that point, Medfield could become the one chip that does all.
And I think Intel seems to be more concerned with the feature check-list than raw gaming performance.

tangey
10-May-2011, 00:45
And I think Intel seems to be more concerned with the feature check-list than raw gaming performance.

Agreed about the feature list, but given their biggest issue, which is trying to compete on power consumption, getting rid of superfluous circuitry is very important. The DX10.1 and full profile OpenCL compliance of SGX545 probably adds to the area significantly too.

And insertion of dual-core does more than just increase top end performance; for many tasks, a single core would suffice, which provides its own significant power saving.

Otto Dafe
10-May-2011, 01:10
Is the GUI in Vista / 7 crippled in anyway by DX9L? That to me would be very significant in a netbook, moreso than raw performance I think.

Blazkowicz
10-May-2011, 01:50
requirements may be more loose now actually, with dx11 introducing the direct3D 9 feature level as well as the 10.x ones, so more forward compatibility for the d3d9 feature set.

Exophase
10-May-2011, 04:00
At that point, Medfield could become the one chip that does all.

Then what purpose does Cedar Trail serve exactly?

This is not as much of a counter-argument as a serious question.

If it comes down to nothing more than Cedar Trail going to 2GHz instead of 1.5GHz that's going to be pretty lame. Because that's not much differentiation, but also because Lincroft series CPUs already go to 1.5GHz and it'll seem ridiculous if Medfield can't go higher.

The only real differentiator I see available is Medfield having a single core option (we know Cedar Trail has two). But then it won't survive against current high end ARM SoCs. Seems to me like either Medfield will make Cedar Trail look ridiculous at its TDP or Medfield will be too underpowered to compete... I just don't see how Intel can move forward in this.

(but we all know Wichita is going to destroy Cedar Trail anyway so I guess its market position vs Medfield is sort of moot)

Blazkowicz
10-May-2011, 04:36
the differentiation would be at least package size and inputs/outputs (you really need PCIe lanes, more USB, a SATA controller etc. on a laptop, and why not thunderbolt)
then the speculative part about the GPU, I would side with the smaller GPU idea.

I've just read though that current NM10 chipset will be used. that's lame : still stuck with 100Mb ethernet and USB 2.0, as on an old PC or on a 486-based SoC.

I wonder about virtualization extensions. still crippled or not? all core i-something have it now. yes, a computer with a slow 4-thread CPU, 4GB memory and a big hard drive should allow you to toy with linux and windows VMs.

Erinyes
10-May-2011, 05:57
I'll have a damn hard time justifying that design decision.

I agree with you there, it makes sense if it was a smartphone or tablet part, but for a chip intended for netbooks this is a major fail. It's going to make a netbook useless for anything but surfing and word processing, etc. Ontario/Brazos is going to kick its butt hard, and with Krishna/Wichita scheduled for Q1 2012, Atom is going to get whooped pretty hard till they get a new arch on 22 nm (2013 according to the roadmap). The small plus side is, battery life should go up tremendously. (At least 20-30% I'm estimating)

Exophase
10-May-2011, 06:44
Cedar Trail is 10W TDP for 2GHz, compared to Atom D525 at 1.8GHz which has a TDP of 13W. Both have IGPs, and I'm operating under the assumption that SGX545 is a lot more efficient even at the same process node. So I think the move to 32nm didn't actually win very much, unless Cedar Trail is rated very conservatively.

Which is not a good sign given Intel's hype that Medfield would be as power efficient as current ARM SoCs.

Squilliam
10-May-2011, 08:03
Which is not a good sign given Intel's hype that Medfield would be as power efficient as current ARM SoCs.

Is the TDP given for the netbook or tablet variant?

tangey
10-May-2011, 10:28
Cedar Trail is 10W TDP for 2GHz, compared to Atom D525 at 1.8GHz which has a TDP of 13W. Both have IGPs, and I'm operating under the assumption that SGX545 is a lot more efficient even at the same process node. So I think the move to 32nm didn't actually win very much, unless Cedar Trail is rated very conservatively.

Which is not a good sign given Intel's hype that Medfield would be as power efficient as current ARM SoCs.

In an awful calculation 45/32=1.4. 13/1.4=9.3. So an awfully rough calculation would suggest that shrinking pinetrail down and doing nothing else would get you around 9-10W.
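The rough estimate above can be written out as a small sketch. It assumes, as the post does, that TDP falls in direct proportion to feature size, which is a simplification; the function name is hypothetical and only for illustration.

```python
def scaled_tdp(tdp_watts, old_node_nm, new_node_nm):
    """Naive estimate: assume TDP falls in proportion to feature size."""
    shrink_factor = old_node_nm / new_node_nm
    return tdp_watts / shrink_factor


# 13 W is the Atom D525 (45nm) TDP quoted in the thread.
estimate = scaled_tdp(13.0, old_node_nm=45, new_node_nm=32)
print(f"shrink factor: {45 / 32:.2f}")
print(f"naive 32nm TDP estimate: {estimate:.1f} W")
```

Using the unrounded shrink factor gives a figure slightly below the 9.3W in the post (which rounds the factor to 1.4), but it still lands in the same 9-10W range.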

Cedar trail has full hardware video decode built in, which was a separate chip (broadcom ?) on pinetrail, and not included in pinetrail's power figures. Additionally it also handles full blu-ray decode. Cedartrail (according to Intel) also has double the graphics performance, and Dx10.1 compatibility. So you might say they've added a lot of extra functionality/performance/compliance without dipping into the power saving they got from the process shrink and also got rid of a chip.

ToTTenTranz
10-May-2011, 10:45
Then what purpose does Cedar Trail serve exactly?

This is not as much of a counter-argument as a serious question.
Maybe Medfield and Cedarview will be the same chip, with Cedartrail using NM10 for added connections and functionality for netbooks\tops.

As far as power consumption goes, they could both be dual core, with Cedarview aiming at >2GHz (maybe even 2.5GHz) and Medfield going to sub-1GHz and being higher-binned, like the ULV versions of higher performance CPUs.
As far as Javascript performance goes, benchmarks indicate that a single-core 1.6GHz Atom easily beats a dual-core 1GHz Cortex A9, so I think a ~900MHz dual-core, quad-threaded Atom would beat higher-clocked A9s.
And Medfield will probably use slower LPDDR2, as opposed to Cedartrail's DDR3.

Furthermore, as Blazkowicz said, Medfield could save a lot of power on reducing I/O ports, as we've seen with that 5W version of C-50.

Is the TDP given for the netbook or tablet variant?

I'd also like to know where those 10W for Cedarview are coming from. Is it for the CPU only? Does it include the NM10? The DDR3 RAM?

tangey
10-May-2011, 10:56
Maybe Medfield and Cedartrail will be the same chip, with Cedartrail using NM10 for added connections and functionality for netbooks\tops.

medfield will be a different chip.
when Intel produced pineview from the menlow platform, they dropped a lot of the power saving techniques that were in menlow. So expect medfield to have those. Also medfield will include video encode I/P from IMG, which cedartrail does not. Also I expect medfield to have a smaller package.


I'd also like to know where those 10W for Cedartrail are coming from. Is it for the CPU only? Does it include the NM10? The DDR3 RAM?

Cedarview is the Soc, Cedartrail is the chipset.

ToTTenTranz
10-May-2011, 14:31
Cedarview is the Soc, Cedartrail is the chipset.

I'm sorry, what exactly is Cedartrail? The platform for Cedarview? As in Cedarview + NM10?


Here's the last known roadmap (http://www.anandtech.com/show/4295/intel-cedar-trail-platform) for Cedartrail:
http://img534.imageshack.us/img534/3523/imageviewd.png

In this roadmap, there's a 1.86GHz D2500 without HT and a 2.13GHz D2700 with HT. These are the nettop parts so both should have the SGX545 running @ 640MHz.



The latest rumours from fudzilla (http://www.fudzilla.com/notebooks/item/22658-new-netbook-platform-allows-4gb-ram-121-inch-screens) mention a N2600 single-core and a N2800 dual-core (I'd bet it's a mistake and it's actually both dual cores with the latter supporting HT).
There's a couple of other interesting things in there, like Intel limiting the batteries to 4-cell minimum and raising the amount of out-of-the-box RAM to 4GB DDR3.

I'd say the N2800, if clocked at 2.13GHz, will be a bit faster than the 1.6GHz Bobcats in CPU intensive tasks, especially with multitasking in mind.

That said, with the 2.13GHz system (Cedarview + NM10) consuming ~10W, a medfield @ <900MHz may hover around 1W with higher-binned parts, no NM10 and low-power memory controller.

tangey
10-May-2011, 15:47
I'm sorry, what exactly is Cedartrail? The platform for Cedarview? As in Cedarview + NM10?

Yeah, my understanding is the cedartrail is a cedarview + NM10, regardless of what its clocked at, which is the same naming convention that was used for pineview/pinetrail. To be accurate with regard to quoting TDP, a particular N or D part number needs to be referenced.

Exophase
10-May-2011, 16:20
Maybe Medfield and Cedarview will be the same chip, with Cedartrail using NM10 for added connections and functionality for netbooks\tops.

Yeah it's possible, although there'd be different SKUs to prevent Medfields from actually interfacing with NM10.

As far as power consumption goes, they could both be dual core, with Cedarview aiming at >2GHz (maybe even 2.5GHz) and Medfield going to sub-1GHz and being higher-binned, like the ULV versions of higher performance CPUs.
As far as Javascript performance goes, benchmarks indicate that a single-core 1.6GHz Atom easily beats a dual-core 1GHz Cortex A9, so I think a ~900MHz dual-core, quad-threaded Atom would beat higher-clocked A9s.
And Medfield will probably use slower LPDDR2, as opposed to Cedartrail's DDR3.

Judging CPU performance based solely on Sunspider is foolhardy, there are all sorts of variables that have nothing to do with CPU. I have good confidence that clock for clock and with similar memory subsystems Cortex-A9 out-does Atom, and Atom will be up against dual-core A9s at much higher than 900MHz in 2011. Sure, if it's dual core it'll have an advantage with stuff that's well threaded (until the quad-core A9s come out anyway) but how much on phones and tablets do you seriously expect to be well distributed among four threads?

At 1.6GHz Atom would have an advantage over 1GHz Cortex-A9s in typical scenarios, but Atom won't be 1.6GHz on phones and Cortex-A9 won't stop at 1GHz.

Furthermore, as Blazkowicz said, Medfield could save a lot of power on reducing I/O ports, as we've seen with that 5W version of C-50.

That isn't where the power saving comes from, those reductions are all in the Hudson chipset which the TDP doesn't include.

I'd also like to know where those 10W for Cedarview are coming from. Is it for the CPU only? Does it include the NM10? The DDR3 RAM?

Regardless of what Anand's calling the preview those numbers are attached to Dxxx chips, not the platform, and won't include NM10 (which has a TDP of 2.1W for what it's worth).

By the way, Oak Trail's SM35 PCH is 0.7W, so I'm sure there'll be a tablet variant paired with something like this.

mczak
10-May-2011, 17:38
- By going PowerVR with a VXD, all Atoms will now support video acceleration for FullHD High-Profile decoding, which was unprecedented in the low-cost versions and is actually good (it felt ridiculous that netbooks had inferior video performance than most mid-to-high end smartphones).
- They can brag about supporting DX10.1 with hardware vertex shading this time (whooohoo), which means it'll support some more games (horribly) and maybe they'll even come up with an OpenCL driver for it, just for the lulz.

Well I think the expectation was that intel would use some HD graphics derivative, which would do all that too. Though HD2000 (on 32nm) is about 30mm² I think, which might be too big (and power hungry) and I don't know if it downscales further well. Not to mention it might lose some of its appeal if there's no L3 cache it could use, coupled with the pathetic memory bandwidth these platforms have.

DavidC
10-May-2011, 20:12
I'd say the N2800, if clocked at 2.13GHz, will be a bit faster than the 1.6GHz Bobcats in CPU intensive tasks, especially with multitasking in mind.

CPU-world has the full specs on N2600/N2800 Netbook parts.

http://www.cpu-world.com/news_2011/2011050401_Intel_32nm_Cedar_Trail_Atom_CPUs_to_launch_in_Q4_2011.html

N2600: 1.6GHz CPU/1MB cache/2 cores/Hyperthreading/400MHz GPU/3.5W
N2800: 1.86GHz CPU/1MB cache/2 cores/Hyperthreading/640MHz GPU/6.5W

Exophase
10-May-2011, 21:32
Looks like D2500 is going to have a big price advantage to make anyone want it over N2800, yet the cpu-world page suggests it'll actually cost more o_O

I also like that they're calling SGX 545 "GMA 3650", making the previous Intel GPU feel even more replaced.

tangey
10-May-2011, 23:54
CPU-world has the full specs on N2600/N2800 Netbook parts.

http://www.cpu-world.com/news_2011/2011050401_Intel_32nm_Cedar_Trail_Atom_CPUs_to_launch_in_Q4_2011.html

N2600: 1.6GHz CPU/1MB cache/2 cores/Hyperthreading/400MHz GPU/3.5W
N2800: 1.86GHz CPU/1MB cache/2 cores/Hyperthreading/640MHz GPU/6.5W

So if that is to be believed, the two drops in clock save 3W of power?
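To put numbers on that question, here is a minimal sketch comparing the two listed parts. The figures are the ones from the cpu-world listing quoted above; the helper name is hypothetical. Keep in mind that TDP is a rated ceiling assigned per bin rather than measured power, so the comparison is only indicative.

```python
# Specs as listed on the cpu-world page quoted above.
n2600 = {"cpu_ghz": 1.6, "gpu_mhz": 400, "tdp_w": 3.5}
n2800 = {"cpu_ghz": 1.86, "gpu_mhz": 640, "tdp_w": 6.5}


def percent_lower(low, high):
    """How much lower `low` is than `high`, as a percentage of `high`."""
    return (1 - low / high) * 100


cpu_drop = percent_lower(n2600["cpu_ghz"], n2800["cpu_ghz"])
gpu_drop = percent_lower(n2600["gpu_mhz"], n2800["gpu_mhz"])
tdp_drop = percent_lower(n2600["tdp_w"], n2800["tdp_w"])
tdp_saved = n2800["tdp_w"] - n2600["tdp_w"]

print(f"CPU clock lower by {cpu_drop:.1f}%")
print(f"GPU clock lower by {gpu_drop:.1f}%")
print(f"TDP lower by {tdp_drop:.1f}% ({tdp_saved:.1f} W)")
```

On these listed figures the GPU clock drop is much larger than the CPU clock drop, and the rated TDP falls by a bigger fraction than either clock does.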

DavidC
11-May-2011, 00:00
Looks like D2500 is going to have a big price advantage to make anyone want it over N2800, yet the cpu-world page suggests it'll actually cost more o_O

I also like that they're calling SGX 545 "GMA 3650", making the previous Intel GPU feel even more replaced.

"All four microprocessors are expected to launch in the 4th quarter 2011, priced lower than $55 for D2xxx chips, and less than $50 for N2xxx."

Not really, it would suggest N2800 costs less than D2700, and the latter is faster so...

How does the 545 compare against the 543?

BTW, someone mentioned that GMA 3150 is a half clocked version of the 945G. That doesn't matter too much because GMA 900/950 architecture wasn't limited by fillrate, but rather by vertex shaders, which is then powered by the CPU. Going from 200MHz to 400MHz showed a gain of 10-20% for a lot of games and applications.

Cedar Trail is coming in the fall, which would give 5-6 months lead time over Wichita.

tangey
11-May-2011, 00:23
How does the 545 compare against the 543?

543 has no formal Dx compliance, has lower OpenGL compliance, and only has embedded OpenCL, not full profile.

according to IMG marketing, 543 has slightly lower poly rate 35M, compared to 545 40M, both @200Mhz. Both have the same fill rate.

Of course 543 in ipad2 is kept company by another 543. no one knows what ipad2 is clocking 543 at, a total guess would be 200MHz. If that is true, then one might see the 400Mhz cedartrail have similar performance to ipad2, and the 640Mhz version have 50%-ish more.
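The comparison above can be sketched with simple linear scaling. This assumes throughput scales perfectly with clock and with core count, which is optimistic, and it uses the post's guessed 200MHz iPad 2 clock; the names below are hypothetical.

```python
# Polygon rates at 200 MHz as quoted above from IMG marketing (millions/s).
POLY_RATE_AT_200MHZ = {"SGX543": 35.0, "SGX545": 40.0}


def poly_rate(core, clock_mhz, num_cores=1):
    """Assume the rate scales linearly with clock and with core count."""
    return POLY_RATE_AT_200MHZ[core] * (clock_mhz / 200.0) * num_cores


def relative_fill(clock_mhz, num_cores=1):
    """Both cores are said to have the same fill rate per clock,
    so cores x clock serves as a relative fill-rate figure."""
    return clock_mhz * num_cores


ipad2_poly = poly_rate("SGX543", 200, num_cores=2)  # 200 MHz is a guess
cedar_400_poly = poly_rate("SGX545", 400)
cedar_640_poly = poly_rate("SGX545", 640)

ipad2_fill = relative_fill(200, num_cores=2)
fill_ratio_400 = relative_fill(400) / ipad2_fill
fill_ratio_640 = relative_fill(640) / ipad2_fill

print(f"poly rate (M/s): iPad 2 guess {ipad2_poly:.0f}, "
      f"545@400 {cedar_400_poly:.0f}, 545@640 {cedar_640_poly:.0f}")
print(f"fill rate vs iPad 2 guess: 545@400 {fill_ratio_400:.1f}x, "
      f"545@640 {fill_ratio_640:.1f}x")
```

Under these assumptions the 400MHz part matches the guessed iPad 2 configuration on fill rate, and the 640MHz part is about 1.6x, which is in line with the "50%-ish more" estimate.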

For me, it would make much more sense for medfield to have 543MP2

Exophase
11-May-2011, 00:40
Cedar Trail is slated for Q4 2011, Wichita for Q1 2012. That could mean 6 months, but it could also mean under 1 month.

Unless something has changed in IMG's numbering SGX545 is Series5 while SGX543 is Series5XT, meaning that the 543 has wider ALUs. According to IMG SGX545 still has 2 TMUs and probably 4 ALUs, but they list a triangle rate of 40M at 200MHz (as opposed to 20M given for 540), so maybe they doubled triangle setup rate somehow. I would still expect it to usually perform like a much higher clocked SGX 540, though. I expect A5's SGX543MP2 to beat the 400MHz SGX545 most of the time and the 640MHz one occasionally.

DavidC
11-May-2011, 02:52
Cedar Trail is coming in early fall, mid-August to early-September.

Also, clock speeds benefit everything, while USSE2 is just on the shaders.

rpg.314
11-May-2011, 04:47
Cedar Trail is slated for Q4 2011, Wichita for Q1 2012. That could mean 6 months, but it could also mean under 1 month.
Damn it AMD. Why do you have to keep missing the holiday season?

eastmen
11-May-2011, 06:15
Damn it AMD. Why do you have to keep missing the holiday season?

seems that amd has 32nm zacate for this fall. So we will see some performance increases / power savings there from the 40nm ones on the market

Exophase
11-May-2011, 06:34
Cedar Trail is coming in early fall, mid-August to early-September.

I'm afraid not, Cedar Trail got moved to Q4.

http://www.anandtech.com/show/4295/intel-cedar-trail-platform

Also, clock speeds benefit everything, while USSE2 is just on the shaders.

Multiple cores also benefit everything. I expect iPad 2 is clocking its SGX 543MP2 at 250+MHz, and based my comments on that (ie, it'd be almost as fast as a 500MHz SGX543MP1)

mczak
11-May-2011, 13:04
seems that amd has 32nm zacate for this fall. So we will see some performance increases / power savings there from the 40nm ones on the market
There is no 32nm zacate on any roadmap. Krishna/Wichita (28nm enhanced bobcats) are due 2012.

ToTTenTranz
11-May-2011, 15:29
Is the GUI in Vista / 7 crippled in anyway by DX9L? That to me would be very significant in a netbook, moreso than raw performance I think.

No problems there. Aero runs on DX9.0, SM2.0 GPUs.
But AFAIK hardware acceleration in the most popular Windows web-browsers (IE9, FF4, Chrome10) can only be enabled in >DX10 GPUs, so that may be the main reason for Intel ordering a "custom-made" SGX solution with DX10.1 support.
Same goes for Adobe Flash >10.1 acceleration using pixel shaders in windows.


Judging CPU performance based solely on Sunspider is foolhardy, there are all sorts of variables that have nothing to do with CPU. I have good confidence that clock for clock and with similar memory subsystems Cortex-A9 out-does Atom, and Atom will be up against dual-core A9s at much higher than 900MHz in 2011.
What about the differences in Linpack and Whetstone (http://www.slideshare.net/napoleaninlondon/arm-cortex-a8-vs-intel-atomarchitectural-and-benchmark-comparisons)?
Don't get me wrong, but I have the feeling that everything "ARM" is constantly overestimated and everything "x86" is underestimated, particularly in this sub-forum..



There is no 32nm zacate on any roadmap. Krishna/Wichita (28nm enhanced bobcats) are due 2012.

Yap.
But there'll be updated Zacates and Ontarios with Turbo enabled (for both CPU and GPU) this year, FWIW.

Exophase
11-May-2011, 16:39
What about the differences in Linpack and Whetstone (http://www.slideshare.net/napoleaninlondon/arm-cortex-a8-vs-intel-atomarchitectural-and-benchmark-comparisons)?
Don't get me wrong, but I have the feeling that everything "ARM" is constantly overestimated and everything "x86" is underestimated, particularly in this sub-forum..

That's Cortex-A8, not A9. Note that you're digging very specifically into floating point performance here, which is only going to be a factor in a subset of mobile applications. Nonetheless, even though the "tree vectorize" option was used and allegedly offered a large benefit the compiler was probably not able to do a very thorough job of vectorization or otherwise isn't good at scheduling NEON. These numbers are just too bad to be taken seriously (even for Atom).

People do overestimate ARM and underestimate Atom a lot, but I see it going the other way very commonly too. Typical approaches include using Intel numbers with little or no explanation behind them, using code run behind a very platform specific JIT (like Javascript), exploiting the weakness of Cortex-A8's scalar FPU (not a weakness in A9), or making the claim that since ARM doesn't publish SPEC scores they have something to hide.

I just think that at a microarchitectural level Cortex-A9 is more sophisticated than Atom in several ways. Atom does have some advantages, most prominently SMT, but it has several disadvantages too.

Lazy8s
11-May-2011, 18:41
The 545 still makes for an exceptionally small and cool running Direct X 10.1 compliant GPU which might've been needed to help the rest of the chip meet the cost constraints.

Arun
11-May-2011, 19:21
The 545 still makes for an exceptionally small and cool running Direct X 10.1 compliant GPU which might've been needed to help the rest of the chip meet the cost constraints.
It is noteworthy, however, that Imagination only bragged about their DX9 drivers in recent PRs and that they dropped DX10 support in SGX544/554. I'm somewhat skeptical that DX10.1 will even be exposed... oh well!

Ailuros
11-May-2011, 19:29
I agree with you there, it makes sense if it was a smartphone or tablet part, but for a chip intended for netbooks this is a major fail. It's going to make a netbook useless for anything but surfing and word processing, etc. Ontario/Brazos is going to kick its butt hard, and with Krishna/Wichita scheduled for Q1 2012, Atom is going to get whooped pretty hard till they get a new arch on 22 nm (2013 according to the roadmap). The small plus side is, battery life should go up tremendously. (At least 20-30% I'm estimating)

You wouldn't need DX10.1 for a smart-phone/tablet. If we're talking about 2012 (and until 2013) you don't even need to go as far as Fusion APUs; just take a 2nd look at Rogue@28nm.

mczak
11-May-2011, 22:57
But there'll be updated Zacates and Ontarios with Turbo enabled (for both CPU and GPU) this year, FWIW.
The rumors for these didn't look too promising imho - not sure these are really new silicon even? Though I agree in theory Turbo has potential. For the graphics part though I suppose the official support of DDR3-1333 will do more for performance than any turbo stuff...

tangey
11-May-2011, 23:45
It is noteworthy, however, that Imagination only bragged about their DX9 drivers in recent PRs and that they dropped DX10 support in SGX544/554. I'm somewhat skeptical that DX10.1 will even be exposed... oh well!

Really ?, Intel have let the DX10.1, OpenGL3.X specs be known for Cedartrail for many many months now.

In fact a quick search showed that DX10.1 was part of a just about spot on accurate cedartrail spec list in an article from Nov '09 which cited fudzilla as its source (the fudzilla link is now defunct).

http://www.netbookreviews.net/news/intel-cedartrail/

Point being that DX10.1 was rumoured from over 18 months ago, and yet from then until the VR-zone article, just about everyone was assuming the graphics would be a derivation of Intel's in-house, which would suggest that the rumour was not based on the thought that a DX or otherwise PowerVR core would be used.

I had suggested a few times in various places that SGX545 matched the rumoured spec, and IMG had said that they had a lead licensee for it, and the timing of that announcement (jan '10) would fit in, but having seen pinetrail correctly rumoured early-on to be In-house Intel, I concluded that all the rumours of cedartrail being In-house would also be correct.

swaaye
12-May-2011, 01:21
Unfortunately the last thing these low power IGPs need is more transistors dedicated to features (dx10.1) instead of performance. It doesn't really benefit the user experience.

ToTTenTranz
12-May-2011, 02:13
The rumors for these didn't look too promising imho - not sure these are really new silicon even? Though I agree in theory Turbo has potential. For the graphics part though I suppose the official support of DDR3-1333 will do more for performance than any turbo stuff...

http://macles.blogspot.com/2011/04/amd-to-release-c-60-with-turbo-core-in.html
Turbo in Ontario will make a big difference for the GPU, it'll clock it from the original 280MHz all the way to 400MHz, and the CPU will get a nice 33% boost.
I'd say it's the same silicon but now with Turbo not laser-cut.
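As a quick check on the size of those boosts, here is a small sketch. The GPU clocks are the ones given above; the 1.0GHz CPU base is an assumption taken from the "around 1GHz" figure mentioned for Ontario, and the helper name is hypothetical.

```python
def uplift_percent(base, boosted):
    """Percentage increase going from `base` to `boosted`."""
    return (boosted / base - 1) * 100


gpu_uplift = uplift_percent(280, 400)  # GPU clocks in MHz, from the post

cpu_base_ghz = 1.0                     # assumed base clock ("around 1GHz")
cpu_boosted_ghz = cpu_base_ghz * 1.33  # the quoted 33% boost

print(f"GPU turbo uplift: {gpu_uplift:.1f}%")
print(f"CPU turbo clock: about {cpu_boosted_ghz:.2f} GHz")
```

So the GPU gains proportionally more clock than the CPU, assuming the turbo clocks can actually be sustained within the power envelope.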


Faster memory is definitely a must for Zacate, but not necessarily for Ontario, as the CPU cores are clocked around 1GHz and the GPU is @ 280MHz.


Unfortunately the last thing these low power IGPs need is more transistors dedicated to features (dx10.1) instead of performance. It doesn't really benefit the user experience.

As I said before, DX10 is the absolute minimum if you want to enable, for example, Flash 10.2 hardware acceleration for vector graphics drawing out-of-the-box.

Intel is actually making sure users will get a decent (almost)full-PC experience with the next-gen Atom, and 3D gaming will be left behind as it's always been with the CPU line.

Don't forget that Atom is the rock-bottom (price-wise) for PC computing. Makes sense that they prefer to enable a better web-browsing experience than a gaming experience that even with a SGX543MP4 would be mediocre at best, in a Windows PC.

Maybe they're doing the best they can right now, by choosing not to compete with AMD in netbook processors with fairly decent 3D performance and simply going the other way (better battery life, equally good web browsing experience).

mczak
12-May-2011, 02:52
http://macles.blogspot.com/2011/04/amd-to-release-c-60-with-turbo-core-in.html
Turbo in Ontario will make a big difference for the GPU, it'll clock it from the original 280MHz all the way to 400MHz, and the CPU will get a nice 33% boost.
I'd say it's the same silicon but now with Turbo not laser-cut.

Oh you're right I forgot about Ontario - I thought Zacate was supposed to get some Turbo too.
Almost all netbooks use E-350 anyway, but C60 could make Ontario really viable there possibly.


Faster memory is definitely a must for Zacate, but not necessarily for Ontario, as the CPU cores are clocked around 1GHz and the GPU is @ 280MHz.

Agreed. For the cpu cores I don't think it really matters even at 1.6Ghz. I'd say though it could make a difference for the C60, but given the tight power envelope it's understandable all of Ontario are ddr3-1066.

swaaye
12-May-2011, 03:04
As I said before, DX10 is the absolute minimum if you want to enable, for example, Flash 10.2 hardware acceleration for vector graphics drawing out-of-the-box.

I certainly agree that more GPU acceleration is definitely a plus. HD video and flash acceleration make a huge difference on these wimpy CPUs.

I was referring to DX10.1 3D features being meaningless, not that stuff. You know, nobody's gonna be playing Metro 2033 DX10 mode on this hardware regardless of the feature set being there. ;) With every generation of DirectX, new features pump up transistor counts and that's just not the right direction for 3D performance on a micro GPU. It's nuts that Flash requires DX10 for acceleration! I used to run hardware accelerated Shockwave games on an original Radeon!

eastmen
12-May-2011, 03:16
There is no 32nm zacate on any roadmap. Krishna/Wichita (28nm enhanced bobcats) are due 2012.

???
http://i.imgur.com/zl3hd.jpg
????

ToTTenTranz
12-May-2011, 09:16
That's a very old roadmap (from pre-2009?).
Ontario and Zacate are out now and use 40nm.

Ailuros
12-May-2011, 15:24
As I said before, DX10 is the absolute minimum if you want to enable, for example, Flash 10.2 hardware acceleration for vector graphics drawing out-of-the-box.

Honest question: do you have a link handy that states the requirements for Flash 10.2? Are the vector graphics requirements exceeding those of OpenVG1.0?


Don't forget that Atom is the rock-bottom (price-wise) for PC computing. Makes sense that they prefer to enable a better web-browsing experience than a gaming experience that even with a SGX543MP4 would be mediocre at best, in a Windows PC.
I specifically never mentioned a SGX543 but a 544 which is DX9.0 Level3 compliant (DX11 certified). If there aren't any necessary requirements absent in a 544 compared to a 545 (for web browsing or flash 10.2 video acceleration), I don't see why a MP4 would be mediocre against a 545, rather the contrary.

Theoretically@400MHz:

SGX545
6.4 GFLOPs
800 MTexels
6.4 GPixels
80 MTris

SGX544MP4
51.4 GFLOPs
1600 MTexels
25.6 GPixels
266 MTris

The downside would be that the latter would capture quite a bit more in die area than the first; still a MP2@400MHz would be an entirely different chapter.
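
To put rough numbers on the gap, here is a minimal sketch that just divides the theoretical figures listed above. The MP2 row is only an estimate and assumes peak throughput scales linearly with core count, which is an assumption and not a published spec; the function names are made up for illustration.

```python
# Theoretical peak figures at 400MHz, copied from the lists above.
SGX545 = {"GFLOPs": 6.4, "MTexels": 800, "GPixels": 6.4, "MTris": 80}
SGX544MP4 = {"GFLOPs": 51.4, "MTexels": 1600, "GPixels": 25.6, "MTris": 266}


def ratio(a, b):
    """Per-metric ratio a/b, rounded to two decimals."""
    return {k: round(a[k] / b[k], 2) for k in a}


def scale_cores(figures, factor):
    """Assumption: peak throughput scales linearly with core count."""
    return {k: v * factor for k, v in figures.items()}


# Hypothetical MP2 estimate: half of the quoted MP4 figures.
SGX544MP2_EST = scale_cores(SGX544MP4, 0.5)

if __name__ == "__main__":
    print("MP4 vs 545:", ratio(SGX544MP4, SGX545))
    print("MP2 (est.) vs 545:", ratio(SGX544MP2_EST, SGX545))
```

On those figures the MP4 has roughly 8x the ALU throughput of a single 545 but only 2x the texel rate, so how big the gap looks in practice would depend on what the workload is limited by.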

ToTTenTranz
12-May-2011, 16:56
Honest question: do you have a link handy that states the requirements for Flash 10.2? Are the vector graphics requirements exceeding those of OpenVG1.0?

I can't find where I read the DX10 requirement, but I can give you the support list (http://www.nvidia.com/object/gpus_supporting_adobeflash.html) from nVidia. I think it has something to do with being able to process vector scaling on top of a video window (http://blogs.adobe.com/penguinswf/2010/01/solving_different_problems.html).




I specifically never mentioned a SGX543 but a 544 which is DX9.0 Level3 compliant (DX11 certified). If there aren't any necessary requirements absent in a 544 compared to a 545 (for web browsing or flash 10.2 video acceleration), I don't see why a MP4 would be mediocre against a 545, rather the contrary.

Oh I never said the MP4 (543 or 544) wouldn't be a lot faster than the single 545.
I said its performance for windows games would be mediocre at best, as is the (probably) similar-performing HD6250 in C-50 and C-30 (45GFLOPS, 2.2GTexels/s).
And then Intel would have to fight AMD not only in the hardware performance front (in which they could, for sure), but also in the driver development front in order to keep the performance competition, where AMD has a clear advantage.

Not only that, but Intel would have to either purchase the driver support from a 3rd party (see how well that went last time) or split their graphics driver development team.

I bet it was Intel's conscious decision to not get into a 3D performance battle against Fusion. Maybe when they start using 22nm for Atoms, they'll scale back the current HD line, but for now they just don't want to fight that battle.
It's a decision made on a level well above the performance/transistor considerations, and a pretty wise one IMO.

rpg.314
12-May-2011, 17:23
I can't find where I read the DX10 requirement, but I can give you the support list (http://www.nvidia.com/object/gpus_supporting_adobeflash.html) from nVidia. I think it has something to do with being able to process vector scaling on top of a video window (http://blogs.adobe.com/penguinswf/2010/01/solving_different_problems.html).

That may be due to use of cuda to do video post proc.

Exophase
12-May-2011, 18:38
I hadn't noticed this before, but CPU-World says that the Cedar Trail Atoms will have shared L2 cache (instead of the split 2x512KB in previous dual-core Atoms). Seems like this should improve performance.

Ailuros
12-May-2011, 19:51
I can't find where I read the DX10 requirement, but I can give you the support list (http://www.nvidia.com/object/gpus_supporting_adobeflash.html) from nVidia. I think it has something to do with being able to process vector scaling on top of a video window (http://blogs.adobe.com/penguinswf/2010/01/solving_different_problems.html).

Obviously not a good enough explanation but rather guesswork; http://www.adobe.com/products/flashplayer/systemreqs/

Systems using Broadcom video decoding should use a Windows Aero theme for optimal full-screen playback performance.
Systems using GMA 500 video decoding should use a Windows Aero theme for optimal full-screen playback performance.

Sounds more like a graphics RAM problem than anything else, since 128MB is the minimum graphics memory required for 10.2. Still, can anyone enlighten me if that stuff is actually a GPU or rather a fixed function video decoding hw affair?



Oh I never said the MP4 (543 or 544) wouldn't be a lot faster than the single 545.
I said its performance for windows games would be mediocre at best, as is the (probably) similar-performing HD6250 in C-50 and C-30 (45GFLOPS, 2.2GTexels/s).

And here I thought the MP4@200MHz in the NGP is occasionally giving a C50@280MHz a run for its money.

And then Intel would have to fight AMD not only in the hardware performance front (in which they could, for sure), but also in the driver development front in order to keep the performance competition, where AMD has a clear advantage.

Drivers should be an IMG affair. Alas if Intel either orders drivers again from a 3rd party like Tungsten in the past (GMA500) or, even worse, tries to develop them itself. Drivers should be written and optimized by the hardware developers and no-one else.

Not only that, but Intel would have to either purchase the driver support from a 3rd party (see how well that went last time) or split their graphics driver development team.

I've better chances to write a decent driver compared to that sw rendering crap that came from Tungsten.

I bet it was Intel's conscious decision to not get into a 3D performance battle against Fusion. Maybe when they start using 22nm for Atoms, they'll scale back the current HD line, but for now they just don't want to fight that battle.
It's a decision made on a level well above the performance/transistor considerations, and a pretty wise one IMO.

I fail to see any wisdom in any of Intel's graphics related decisions, but then again there are folks out there that still believe that Larrabee could have given, or slightly modified could give, AMD/NVIDIA GPUs a run for their money; or else pigs can fly. Intel simply lacks the correct philosophy for graphics.

They might sell the result well (or they might not; look how GMA600 captured the "world") but they definitely won't gain a significant foothold in the smart-phone market they have desperately been trying to enter for years now. At least not considering what the real players of the smart-phone market are working on. Yes in that regard Intel has beaten ARM in the embedded market already senseless.....on paper of course.

Laurent06
16-May-2011, 12:10
I hadn't noticed this before, but CPU-World says that the Cedar Trail Atoms will have shared L2 cache (instead of the split 2x512KB in previous dual-core Atoms). Seems like this should improve performance.
Probably a typo; going from split L2 caches to a shared L2 would require non-negligible micro architectural changes.

tangey
17-May-2011, 23:57
Good detailed die shot of what looks like lincroft on one of the intel investors presentations (you may need to register to access it).

http://intelstudios.edgesuite.net/im/2011/pdf/2011_Intel_Investor_Meeting_Perlmutter.pdf
Page 16, it magnifies really well.

Looks exactly the same layout as the much poorer lincroft one available here:-
http://imageshack.us/photo/my-images/205/lincroft.jpg/

Also in that presentation is slide 4 showing that graphics in Intel tablets will go x10 in performance within 4 years, as will cpu performance.
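
As a rough sanity check on what x10 in 4 years would mean if it were spread evenly (the slide does not say it is, so that is purely an assumption), the implied yearly multiplier can be sketched like this:

```python
def annual_factor(total_gain: float, years: int) -> float:
    """Constant per-year multiplier that compounds to total_gain over `years`."""
    return total_gain ** (1.0 / years)


if __name__ == "__main__":
    factor = annual_factor(10.0, 4)
    print(f"10x over 4 years is about {factor:.2f}x per year if spread evenly")
```

That works out to a bit under 1.8x per year, i.e. close to doubling every generation.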

Exophase
18-May-2011, 00:40
http://www.engadget.com/2011/05/17/intel-promises-smartphones-in-first-part-of-next-year-we-put/#disqus_thread

Looks like the theory that Intel killed Moorestown to accelerate Medfield's release has turned into simply Intel killed Moorestown.

Thanks for the slides tangey, lots of fascinating information:

The CPU performance roadmap is staggering; most of that 10x is happening all at once in 2013. I imagine it'll play out like this:

2012: Dual-core Cedar Trail tablets replace single-core Oak Trail and minor clock speed bump.
2013: Massive revision to Atom uarch (probably OoO, might not even resemble current Atom), 22nm + FinFET, move to quad core and higher clocks all of a sudden.. this is going to be pretty earth-shattering and only here will we finally see if Intel has what it takes to play in the mobile space
2014: Which then slides into a refinement of the uarch using a little more space and clock speed headroom available since the process has matured. But look how tiny this is, not exactly a strong endorsement of "tick/tock" on Atom.
2015: And 15nm hits, garnering a more noticeable clock speed increase but staying at quad-core.

Like usual Intel doesn't say a word on what its power consumption is like doing things that keep CPU utilization high. Nothing on oh, gaming for instance. Same old story as for Moorestown, and while ultimately we can't say it did a bad job, it not showing up to the party doesn't instill a lot of faith.

Finally, note that Intel has actually changed its uarch name for the 32nm Atoms, referring to it as Saltwell instead of Bonnell, meaning tick/tock has already started for them. Nobody really expected 32nm Atoms to actually change anything at the core design level, but at least this means that a move to shared L2 cache is less unrealistic. It would also appear that the speculation that Medfield and Cedar Trail use the same core SoC and merely have different I/O chips is right on target, according to page 17.

EDIT: I missed this quote..

"The chip sports just a single core at a time when competitors are rolling out dual-core chips, but the Atom core will deliver better performance than the competition, he said."

http://www.eetimes.com/electronics-news/4216089/Intel-rewrites-Atom-road-map

The chip there is Medfield, in case the context wasn't clear. Fail, big time. Single core Medfields delivering better performance than (let's be extremely generous here) January 2012 ARM CPUs, including quad-core Cortex-A9s from nVidia? Yeah right...

Arun
18-May-2011, 11:45
That performance graph does make me curious about the 2013 architecture. Remember that you can't just "make Atom OoOE" - the current core is inherently true CISC, which is fundamentally incompatible with OoOE. All the clever things they're doing based on being CISC go away, and a whole lot of other tricks will have to replace them. It has to be completely different.

SPEC2000int_rate already benefits maximally from SMT on Atom, so presumably OoOE wouldn't buy you as much in that specific benchmark. But for any workload that doesn't use more threads than cores (i.e. a big majority of them with 4 cores!) it's essential to have good performance without SMT, so I still suspect we're going to see OoOE. And then 3-issue becomes the only way to explain that performance level... I define that as 'just too much' in my article, but if implemented with more engineering prowess than Atom, it might not be so bad.

As for Medfield, here's what I wrote in my Handheld CPU article: "Even if Intel executes properly on the hardware front, an immature software ecosystem (e.g. x86 Android although MeeGo should be fine) might delay their partners' projects until everyone decides to pull the plug in favour of unambiguously superior competitors" (stopped the quote there for obvious reasons :p)

Laurent06
18-May-2011, 15:58
I think too it has to be a completely different micro-architecture. I hope (for them, not me) that they don't go too far or even their process advantage won't be able to keep power low enough for their intended market; too aggressive OoOE has the terrible property of wasting power doing too much useless speculative work unless you add a lot of hardware to improve the accuracy of your speculation, which will burn power too :grin:

Lazy8s
02-Jul-2011, 18:30
I just bought that overpriced HDMI adapter for my iPad 2 and iPhone 4 to straight away watch the videos I take primarily with my phone, and I've really been appreciating the quality of the video.

The decent camcorder (and camera with the genuinely useful HDR post-process) helps, yet VXE and VXD do a good job of not dropping frames or breaking up the subtle moving elements of scenes I like to capture. I do a lot of hiking and tend to photograph the running water of little mountain waterfalls or the trickling water from streams or my dogs at times moving through the view as I'm panning around.

The VXE and VXD included in many of the Intel Atom platforms could be used as a competitive differentiator to some of the multitude of little known products packing them, considering few semis selling to the open market have yet adopted PowerVR video. Complaints about dropped frames were fairly common with some of the competing mobile HD solutions.

tangey
25-Jul-2011, 00:16
Just in case there was any lingering doubt about GMA3600 being IMG, i've just seen an install package for GMA3600 drivers for 32&64 bit Win7. Clearly i can't install as i don't have a cedartrail platform, but unzipping and looking at some of the .inf files reveals copious references to IMG/PVR including:-

GfxKey="System\CurrentControlSet\Services\PowerVR\PowerVREurasia\HWSettings"
GfxOGLKey="Software\PowerVR\OpenGL\Common"

Didn't think "Eurasia" was ever going to be used on the public side

Some further reading of the .inf file says:-
;8.14.xx.xxxx Win7 & later DX9
;8.15.xx.xxxx Win7 & DX10

As this version is v.8.15.8.1033, it would suggest that these are DX10 drivers.
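
The reasoning above can be sketched as a small lookup. This is only an illustration: the mapping is taken from the two .inf comment lines quoted above, and the function name and the handling of the "v." prefix are my own assumptions, not anything from the driver package.

```python
# Branch-to-API mapping, taken from the two .inf comment lines quoted above.
BRANCH_TO_DX = {"8.14": "DX9", "8.15": "DX10"}


def dx_level(version: str) -> str:
    """Return the DirectX level implied by the first two version fields."""
    # Drop a leading "v." or "v" as used in "v.8.15.8.1033".
    parts = version.lstrip("vV.").split(".")
    branch = ".".join(parts[:2])
    return BRANCH_TO_DX.get(branch, "unknown")


if __name__ == "__main__":
    print(dx_level("v.8.15.8.1033"))
```

Anything outside the two listed branches comes back as "unknown" rather than being guessed at.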

http://translate.google.co.uk/translate?hl=en&sl=ru&u=http://www.devdrivers.ru/load/intel_gma_drivers/intel_graphics_media_accelerator_3600_series_drivers_v_8_15_8_1033_drajvera_dlja_integrirovannykh_videokart_intel_gma_3600_pod_windows/42-1-0-294&ei=MaYsTuSmC9SKhQfK88GqCw&sa=X&oi=translate&ct=result&resnum=1&ved=0CB8Q7gEwAA&prev=/search%3Fq%3DGMA3600%2Bdrivers%26hl%3Den%26sa%3DN%26biw%3D1382%26bih%3D985%26prmd%3Divns

Erinyes
25-Jul-2011, 05:15
Just in case there was any lingering doubt about GMA3600 being IMG, i've just seen an install package for GMA3600 drivers for 32&64 bit Win7. Clearly i can't install as i don't have a cedartrail platform, but unzipping and looking at some of the .inf files reveals copious references to IMG/PVR including:-

GfxKey="System\CurrentControlSet\Services\PowerVR\PowerVREurasia\HWSettings"
GfxOGLKey="Software\PowerVR\OpenGL\Common"

Didn't think "Eurasia" was ever going to be used on the public side

Some further reading of the .inf file says:-
;8.14.xx.xxxx Win7 & later DX9
;8.15.xx.xxxx Win7 & DX10

As this version is v.8.15.8.1033, it would suggest that these are DX10 drivers.

http://translate.google.co.uk/translate?hl=en&sl=ru&u=http://www.devdrivers.ru/load/intel_gma_drivers/intel_graphics_media_accelerator_3600_series_drivers_v_8_15_8_1033_drajvera_dlja_integrirovannykh_videokart_intel_gma_3600_pod_windows/42-1-0-294&ei=MaYsTuSmC9SKhQfK88GqCw&sa=X&oi=translate&ct=result&resnum=1&ved=0CB8Q7gEwAA&prev=/search%3Fq%3DGMA3600%2Bdrivers%26hl%3Den%26sa%3DN%26biw%3D1382%26bih%3D985%26prmd%3Divns

A bit OT but what would even be the need for DX9 drivers? If it is indeed a SGX 545 then it is 10.1 compliant. And it seems to be for Win7 only so its not like the DX9 driver is for XP

tangey
25-Jul-2011, 09:02
A bit OT but what would even be the need for DX9 drivers? If it is indeed a SGX 545 then it is 10.1 compliant. And it seems to be for Win7 only so its not like the DX9 driver is for XP

It's probably more a case of development history, having most likely started with their dx9 drivers from sgx535 and then added the dx10 compliance stuff.

Lazy8s
26-Jul-2011, 06:29
credit IPPaws...

It's finally become a reality: http://www.fujitsu.com/global/news/pr/archives/month/2011/20110721-01.html

A full blown PC phone. Install anything, run any kind of Flash or web standard, etc.

rpg.314
26-Jul-2011, 07:44
Is it medfield?

hoho
26-Jul-2011, 08:17
A full blown PC phone. Install anything, run any kind of Flash or web standard, etc.

... just make sure you can get to a charger in under 2 hours :P

Laurent06
26-Jul-2011, 08:40
That phone most probably has an ARM to run Symbian and get acceptable battery life.

hoho
26-Jul-2011, 09:20
That phone most probably has an ARM to run Symbian and get acceptable battery life.

No it doesn't have ARM but it does get decent battery life out of it with Symbian

From the URL:
______

Fully equipped with basic PC features
With Windows® 7, an Intel® Atom™ processor, memory and a solid-state drive, the F-07C features functionality identical to that of a PC.


With Symbian:
Continuous Standby Time: ~600 hours in FOMA 3G
Continuous Talk Time:
~370 minutes in FOMA 3G voice mode
~170 minutes in videophone mode


With W7:

Windows® 7 battery life: ~2 hours in Windows® 7 mode

_________


I wonder how there can be such a huge difference. I know people have hacked their N900's to run regular desktop apps over the video out and with BT kb/mouse and I haven't heard of their phones running dry 20x faster just for running an OS capable of doing it. Is regular Windows really that bad on batteries? If so then I guess it kind of makes sense why tablets with Windows often have 2x shorter battery life.
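
For reference, a quick sketch of the ratios using only the figures quoted above. Which Symbian mode is the fair comparison for "Windows 7 mode" isn't stated in the press release, so all three are shown; the names are just for illustration.

```python
# Battery figures quoted above for the F-07C, converted to hours.
WIN7_HOURS = 2.0
SYMBIAN_HOURS = {
    "standby": 600.0,
    "voice talk": 370 / 60,
    "videophone": 170 / 60,
}


def symbian_vs_win7(symbian_hours, win7_hours):
    """How many times longer each Symbian mode lasts than Windows 7 mode."""
    return {mode: hours / win7_hours for mode, hours in symbian_hours.items()}


if __name__ == "__main__":
    for mode, factor in symbian_vs_win7(SYMBIAN_HOURS, WIN7_HOURS).items():
        print(f"{mode}: {factor:.1f}x")
```

That gives about 300x against standby, about 3x against voice talk time and about 1.4x against videophone time, so the size of the gap depends heavily on which mode is taken as the baseline.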

Laurent06
26-Jul-2011, 09:48
No it doesn't have ARM but it does get decent battery life out of it with Symbian
Oh, you have seen Symbian on x86, really? Breaking news :wink:

I wonder how there can be such a huge difference. I know people have hacked their N900's to run regular desktop apps over the video out and with BT kb/mouse and I haven't heard of their phones running dry 20x faster just for running an OS capable of doing it. Is regular Windows really that bad on batteries? If so then I guess it kind of makes sense why tablets with Windows often have 2x shorter battery life.
I gave my guess about that. It's not because their marketing spec only mentions Atom that it doesn't have an ARM.

hoho
26-Jul-2011, 09:57
Oh, you have seen Symbian on x86, really? Breaking news :wink:


I gave my guess about that. It's not because their marketing spec only mentions Atom that it doesn't have an ARM.

Yeah, my bad about that. There indeed have been rumors about it having ARM in there somewhere as well. No idea what kind though.

Blazkowicz
26-Jul-2011, 18:02
.. run any kind of antivirus software. but sure it's interesting.

ToTTenTranz
27-Jul-2011, 10:45
Razer Switchblade (http://vr-zone.com/articles/razer-switchblade-concept-will-be-powered-by-1.7ghz-intel-z690-atom-processor/13106.html) "Windows Handheld Gaming Device" will have an Oak Trail Z690 @ 1.7GHz:

http://img812.imageshack.us/img812/9063/razerjpg.png


So it's a gaming device. With an Oak Trail. :nope:

Bound for failure?
Unless it has a discrete GPU somehow, I'd fire the people responsible for not getting a C-50 (or even Z-01) in there.

Lazy8s
27-Jul-2011, 16:34
The form factor they're targeting there means aiming for lower power consumption and size/heat/cost, and going with an Atom CPU like that leaves only enough headroom for a small/efficient GPU core.

As an aside, what I believe they're building is a reference platform for other manufacturers to make into end products.

Blazkowicz
27-Jul-2011, 20:10
I thought it's more a proof of concept, there's a video of it on youtube and it's bad at running warcraft III.

I.S.T.
29-Jul-2011, 07:21
Fudzilla ran a rumor that the 2015 Atoms will be as fast as Athlon IIs... I doubt this will happen.

rpg.314
29-Jul-2011, 07:58
Fudo speaking of products 4 years out, what do you expect?

Laurent06
29-Jul-2011, 09:08
That number comes from an Intel claim: see the slide here (http://www.itproportal.com/2011/07/26/2015-version-intel-atom-faster-amd-phenom-ii-cpu/) where they say by 2015 Atom will be 10 times faster than today.

ToTTenTranz
29-Jul-2011, 11:07
http://img202.imageshack.us/img202/1016/clipboard08pw.jpg

If these charts are true, then I think Intel is bound for failure.
10x the graphics power of SGX535 @ 400MHz is what? Equivalent to the 500MHz GPU in the Fusion E-350?

By 2015, that'll be way too low compared to any high-end smartphone/tablet SoC.

hoho
29-Jul-2011, 11:35
From that same article:
Notebookcheck says that one of the newest games that actually runs on that GPU is Starcraft 2010 which reaches 3fps on low settings; in comparison Nvidia's integrated ION2 GPU - which is two years old already - reaches 25.3 frames per second on that same game

So in about 4 years intel will catch up with GPU performance that NV will have had for 6 years :)

Ailuros
29-Jul-2011, 13:05
If these charts are true, then I think Intel is bound for failure.
10x the graphics power of SGX535 @ 400MHz is what? Equivalent to the 500MHz GPU in the Fusion E-350?

By 2015, that'll be way too low compared to any high-end smartphone/tablet SoC.

Depends what they're planning to integrate by then. ST's A9600, which might appear in 2013, will be several times faster than 10x a 535@400MHz.

From that same article:
So in about 4 years intel will catch up with GPU performance that NV will have had for 6 years :smile:

Intel's tortoise-speed graphics driver development is a feature and not a bug. You should be grateful they don't charge extra for it :P

ToTTenTranz
29-Jul-2011, 14:10
From that same article:
So in about 4 years intel will catch up with GPU performance that NV will have had for 6 years :)

Yes, but I think they're using the wrong comparison. They mention the GMA3150, and that's for 2010 Atoms in netbooks, whereas the chart shows performance numbers for tablets.
For 2011 that performance number should correspond to Oak Trail's SGX535 @ 400MHz, which should be faster than the GMA3150 (2 pixel shaders, vertex shaders on the CPU, 200MHz). I don't know how much it'd do in Starcraft 2 low settings, but it should do a lot more than the 3fps.



Depends what they're planning to integrate by then. ST's A9600, which might appear in 2013, will be several times faster than 10x a 535@400MHz.

Well the chart is clear about the performance target..
In 2013, smartphone/tablet Atoms are supposed to bring 3-4x the performance of a SGX535@400MHz..
So in 2013, Intel will bring a GPU performance for tablets that's (maybe) equivalent to Fusion's Z-01 in 2011.

OTOH, CPU performance will be a monster..

Lazy8s
29-Jul-2011, 16:22
Case in point why performance claims from different marketing departments (or at least different marketing purposes) are not comparable.

rpg.314
29-Jul-2011, 16:51
http://img202.imageshack.us/img202/1016/clipboard08pw.jpg

If these charts are true, then I think Intel is bound for failure.
10x the graphics power of SGX535 @ 400MHz is what? Equivalent to the 500MHz GPU in the Fusion E-350?

By 2015, that'll be way too low compared to any high-end smartphone/tablet SoC.

See sig.

I'll believe Intel in any segment except PC when I see products.

ToTTenTranz
29-Jul-2011, 16:52
Case in point why performance claims from different marketing departments (or at least different marketing purposes) are not comparable.

Are you saying the performance target for Atom's graphics throughout 2012-2015 could be higher?

Lazy8s
29-Jul-2011, 17:09
On the scale some other companies are using, yes.

tangey
29-Jul-2011, 17:18
The small print in that chart says "starting with 2011 (2nd generation tablets) as baseline".

Isn't oaktrail Intel's first gen chip aimed directly at tablets ?

ToTTenTranz
29-Jul-2011, 17:38
The small print in that chart says "starting with 2011 (2nd generation tablets) as baseline".

Isn't oaktrail Intel's first gen chip aimed directly at tablets ?

I think the "first gen" would be the first iteration of Lincroft - Atom Z600 - with a 200MHz SGX535.

It couldn't be anything after Oaktrail, since medfield and clovertrail are only scheduled for 2012 and cedartrail is for netbooks (10W TDP).

Exophase
29-Jul-2011, 18:04
Second generation as baseline would mean that's what the 1x is on the graph. Lincroft is what's in Oaktrail, and includes GMA600 with a 400MHz SGX535. The 200MHz SGX535 was included in Poulsbo which was part of the Menlow platform and accompanied the Z5xx Silverthorne CPUs. I doubt Menlow was in anything resembling a tablet but it was at least in some UMPCs and netbooks. So I could see calling Oaktrail second gen for tablets. Or maybe they just mean appearing in a second generation of tablets from an overall market viewpoint, whatever that means.

I agree that we won't see Medfield tablets in 2011, and even if someone wanted to make a Cedartrail tablet I doubt we'd see it in 2011 either.

Laurent06
29-Jul-2011, 18:36
even if someone wanted to make a Cedartrail tablet I doubt we'd see it in 2011 either.
Isn't MSI WindPad 120W (http://www.netbooknews.com/27175/msi-windpad-120w-cedar-trail-tablet-hands-on/) based on Cedar Trail? Anyway I don't think it's available.

I'm with rpg.314 on this: until Intel executes on Atom for smartphone/tablets, I won't trust any of their claims.

Exophase
29-Jul-2011, 19:27
Isn't MSI WindPad 120W (http://www.netbooknews.com/27175/msi-windpad-120w-cedar-trail-tablet-hands-on/) based on Cedar Trail? Anyway I don't think it's available.

Hm, guess the TDP isn't nearly as high as ToTTenTranz said. Then yes, it'll compete with the Ontario tablets with relatively low battery life. Then I could see this coming out in late 2011.

Wonder what the performance delta will be vs GMA600, if that's the baseline. Probably at least 4x for anything ALU limited rather than TMU limited.

ToTTenTranz
29-Jul-2011, 20:25
Hm, guess the TDP isn't nearly as high as ToTTenTranz said.

Here it is:
http://www.anandtech.com/show/4295/intel-cedar-trail-platform

And Cedar Trail is formally announced for netbooks and nettops:

- "Cedar Trail," Intel's upcoming netbook and entry-level desktop platform, will deliver features including Intel® Wireless Music, Intel® Wireless Display, PC Synch and Fast Boot, as well as improvements in media, graphics and power consumption.

- Innovation beyond the PC: Embedded Intel® Atom™ Z670 creates smaller, thinner, fanless devices for mobile clinical assistants, industrial tablets and portable point-of-sales devices.

Straight from the horse's mouth. (http://newsroom.intel.com/community/en_za/blog/2011/04/12/new-intel-atom-processor-for-tablets-spurs-companion-computing-device-innovation)


Of course, there are actively-cooled tablets with C-50 and C-30, so a lower power version of Cedar Trail wouldn't be impossible to achieve.

Nonetheless, it's not the "baseline" for tablet CPU performance.

Exophase
29-Jul-2011, 20:48
http://www.cpu-world.com/news_2011/2011050401_Intel_32nm_Cedar_Trail_Atom_CPUs_to_launch_in_Q4_2011.html

If correct, Cedar Trail goes all the way down to 3.5W. Regardless of what Intel promotes it for that's much more suitable for tablets than C-50 and C-30 are today, they wouldn't need "lower power" versions (nor active cooling) and I wouldn't put it past Intel to be counting this as the baseline.

zed
30-Jul-2011, 04:59
So in about 4 years intel will catch up with GPU performance that NV will have had for 6 years
Reminds me of the PR from intel about 'larrabee'
something like "in 2 or 3 years larrabee will launch and have twice the performance of today's nvidia top GPU"
That's when I knew 'larrabee was in trouble', though I did expect it to launch, not bomb completely.
Why can't intel do GPUs?

Ailuros
01-Aug-2011, 13:33
Good question; however eventually they could come up in the distant future with a more viable GPU design than anything they've designed so far; and then the burning question might be "why can't Intel write GPU drivers?"

DavidC
04-Aug-2011, 04:51
Yes, but I think they're using the wrong comparison. They mention the GMA3150, and that's for 2010 Atoms in netbooks, whereas the chart shows performance numbers for tablets.
For 2011 that performance number should correspond to Oak Trail's SGX535 @ 400MHz, which should be faster than the GMA3150 (2 pixel shaders, vertex shaders on the CPU, 200MHz). I don't know how much it'd do in Starcraft 2 low settings, but it should do a lot more than the 3fps.

Oak Trail's presentations are putting the graphics performance at roughly equivalent to the GMA 3150 (actually slightly less than GMA 3150). Are you sure about the GMA 3150 having 2 pixel shaders? The older GMA 950 had 4. The GMA 950 definitely wasn't fill rate limited.

Intel's claiming 10x+ here, not 10x. You can actually compare that claim with the PC. For PC graphics, they were aiming for 10x improvement by the 32nm (Sandy Bridge) generation compared to the G33, and they said at Sandy Bridge release that they actually achieved 25x.

tangey
25-Aug-2011, 14:50
Some testing using 3DMARK2006 is suggesting that Cedartrail's SGX545 graphics is getting close to x3 graphics over the GMA3150.

http://www.umpcportal.com/2011/08/amd-e-450-vs-intel-n2800-vs-intel-n570-at-blogeee-net/

Intel in the leaked document was aiming for x2. The testing above was using the Cedartrail N2800, which clocks the SGX545 at 640MHz, so the 400MHz part would be about x2, although the drivers are bound to be at a very early stage at this point.

http://forum.beyond3d.com/showpost.php?p=1549558&postcount=133
(link to earlier post with the intel x2 target).
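
The "about x2" estimate can be reproduced with a one-liner, assuming performance scales linearly with GPU clock (a simplification; real scaling also depends on memory bandwidth and the CPU). The function name is just for illustration.

```python
def scale_by_clock(speedup: float, from_mhz: float, to_mhz: float) -> float:
    """Rescale a measured speedup to another clock.

    Assumption: performance scales linearly with GPU clock.
    """
    return speedup * to_mhz / from_mhz


if __name__ == "__main__":
    # x3 over GMA3150 measured at 640MHz, rescaled to the 400MHz part.
    print(f"{scale_by_clock(3.0, 640, 400):.3f}x")
```

With the x3 figure at 640MHz this comes out just under x1.9 at 400MHz, which lines up with the x2 target.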

Ailuros
25-Aug-2011, 16:38
Intel is clearly saving a lot of die area and power consumption from their GPU design decision (contrary to what I imagined in the past); I honestly hope they'll sell it at very attractive prices which is usually unlikely for Intel.

Blazkowicz
25-Aug-2011, 16:49
Good question; however eventually they could come up in the distant future with a more viable GPU design than anything they've designed so far; and then the burning question might be "why can't Intel write GPU drivers?"

it always seemed that Intel doesn't want to do GPUs.
the other question I have is, why can't Intel make an Atom with better CPU performance?
they seem to make them reluctantly as well. but nonetheless Intel GPUs and Atoms are market leaders.

Ailuros
25-Aug-2011, 19:10
it always seemed that Intel doesn't want to do GPUs.

They are making graphics processing units for years now; they just don't seem to give a dime about graphics performance probably because they're too self assured that they'll sell their stuff anyway.

the other question I have is, why can't Intel make an Atom with better CPU performance?
they seem to make them reluctantly as well. but nonetheless Intel GPUs and Atoms are market leaders.

My knowledge and understanding of CPUs in general is relatively poor; however IF they haven't designed Atom from ground up for the embedded space it might be one of the explanations.

On the other side AMD isn't yet diving its entire foot into the embedded market with its APUs exactly because they haven't been developed from ground up for ultra low power consumption.

ToTTenTranz
28-Sep-2011, 11:51
First Cedar Trail CPUs have been formally announced by Intel (http://www.anandtech.com/show/4884/intel-releases-atom-d2500-and-d2700-processors).

These are desktop CPUs, but the same platform should scale down to netbooks and tablets.

There's D2500 and D2700, supposedly both have a SGX545 @ 400MHz and 640MHz of "base clock", respectively. I don't get if these clocks can be subject to automatic overclock based on temperature and power consumption.

Nonetheless, both models are light-years away from the C-60 in GPU power, but will probably beat the 1GHz Bobcats pretty hard.

CarstenS
03-Oct-2011, 11:44
There's a data-sheet available, describing more details of the processors and also their mobile SKUs N2800 and N2600:
- No DX10.1 (as rumored as of late)
- Almost no power management for the IGP (Full-on & off)
- No Blu-ray support for the two little SKUs

Here's (http://www.gpu-tech.org/content.php/176-Intel-unveils-more-details-about-Cedar-Trail-s-Atom-D2x00-N2x00) more.

ToTTenTranz
03-Oct-2011, 13:43
Any possible explanation to that "200MHz Render clock"? Some parts of the GPU are clocked lower than the 400/640MHz base frequency?

Exophase
03-Oct-2011, 15:56
Any possible explanation to that "200MHz Render clock"? Some parts of the GPU are clocked lower than the 400/640MHz base frequency?

The 200MHz display clock has nothing to do with the SGX545 (the article is incorrect in saying that it drives the two displays). SGX renders to a framebuffer that's somewhere in memory; beyond that it plays no role in getting images to a display. SoCs need some controller for this, and often include additional 2D acceleration. You can see on the datasheet that this includes compositing and output to several different analog and digital display types. Intel made this a little confusing by labeling all of these features under "GPU"; other SoC datasheets tend to separate things more.

It's interesting that Intel went with citing "vertex rate" (with the stipulation "transform only") instead of triangle rate. In the past Intel has listed its SGX535 based parts as having a triangle setup rate of 15 cycles. It's known that Series5-XT improved this to 10 cycles - from what I know about SGX545 it should still be Series 5 and hence 15 cycles. The claim of 4 pixel/cycle fill rate might be misleading too, since that usually refers to textured pixels and I don't think SGX545 has more than two TMUs. Finally with only USSE (instead of USSE2) it's at a further disadvantage vs SGX543. It'll be interesting to see how it performs against iPad 2.. even the 640MHz part might lose some of the time.

tangey
03-Oct-2011, 16:47

(in that regard I think the article is incorrect in saying that the SGX is driving the two independent displays)

...unless the 2D / display is also being supplied by IMG and they consider the 2D/3D to be all one (SGX is not specifically mentioned anywhere)

Exophase
03-Oct-2011, 17:42
...unless the 2D / display is also being supplied by IMG and they consider the 2D/3D to be all one (SGX is not specifically mentioned anywhere)

The article. Not the datasheet.

"The integrated PowerVR SGX 545 core has two display pipelines available, allowing to drive two fully independet displays on your device."

This is wrong, regardless of who does the display IP (but I doubt it's IMG)

Something else interesting that the datasheet reveals, that I don't think I had seen divulged for any other SGX part thus far, is some information on the register layout:

— 4096 32-bit registers and 48 40-bit registers
— 3-way 10 bit integer and 4-way 10 bit integer operations

I had seen before some conflicting information on the lowp data-format: it is called out as 10-bit (1.1.8 format, or 1 sign 1 whole 8 fractional) in the OpenGL ES 2 optimization guideline documents but some of the datasheets only mention 32-bit operations. I got a confirmation before that it does indeed have both 40-bit registers and can do 4-way 10-bit operations on them, but here we can see that they're in limited quantity. This is probably for the entire core, so it'd mean 12 per USSE. Of course, since you'd only be using them for vec4 color components you don't need nearly as many of them vs needing 4 for vec4 FP32 operations, but it's still a little tight. I wonder if you're expected to keep the operations in 4x8-bit as much as possible.

tangey
03-Oct-2011, 18:33
The article. Not the datasheet.
ahh

Ailuros
04-Oct-2011, 07:09
Something else interesting that the datasheet reveals, that I don't think I had seen divulged for any other SGX part thus far, is some information on the register layout:



I had seen before some conflicting information on the lowp data-format: it is called out as 10-bit (1.1.8 format, or 1 sign 1 whole 8 fractional) in the OpenGL ES 2 optimization guideline documents but some of the datasheets only mention 32-bit operations. I got a confirmation before that it does indeed have both 40-bit registers and can do 4-way 10-bit operations on them, but here we can see that they're in limited quantity. This is probably for the entire core, so it'd mean 12 per USSE. Of course, since you'd only be using them for vec4 color components you don't need nearly as many of them vs needing 4 for vec4 FP32 operations, but it's still a little tight. I wonder if you're expected to keep the operations in 4x8-bit as much as possible.

GMA500 aka SGX535 has half those registers:

http://www.mitrax.de/?cont=artikel&aid=36&page=2

You'll also find differences in pixels/clock (4 vs. 2), vertex rate (1 Tri/8 clocks vs. 15) and thread amount (16 vs. 4).

By the way something that I don't see mentioned anywhere SGX525-540 should be 8 z/stencil, while 545,543,544,554 should all be 16 z/stencil.

Finally with only USSE (instead of USSE2) it's at a further disadvantage vs SGX543. It'll be interesting to see how it performs against iPad 2.. even the 640MHz part might lose some of the time.

I guess it would depend on the workload; I think but am not sure anymore that 545@200MHz was rated in past documents at 40M Tris/s. No idea if that was too optimistic but considering SGX540@200MHz is supposed to be at 20M, anything between 30 and 40M for the 545 doesn't sound too absurd.

Assuming the iPad2 is clocked at the estimated 250MHz, it would be at around 83M Tris/s with ideal peak scaling, with the 545 ranging (I wish I could be sure about the accurate peak throughput) between 96 and 128M Tris/s. If you tailored a case with a high enough geometry load and few shader instructions, the 545@640MHz could be at an advantage; I just personally doubt that it'd be a realistic showcase for any recent OGL_ES2.0 application.
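For anyone who wants to check the 545 range, here is a minimal sketch of the scaling arithmetic in Python. It assumes ideal linear scaling of peak triangle rate with clock, which real hardware rarely achieves, and takes the 30M to 40M Tris/s @ 200MHz guess above as input; the function name is made up for illustration.

```python
def scale_rate(rate_at_ref, ref_mhz, target_mhz):
    """Scale a peak rate linearly with clock frequency (idealised assumption)."""
    return rate_at_ref * target_mhz / ref_mhz


# Guessed SGX545 range at 200MHz, taken from the post above (not a measured figure).
low = scale_rate(30e6, 200, 640)
high = scale_rate(40e6, 200, 640)

print(f"SGX545 @ 640MHz: {low / 1e6:.0f}M to {high / 1e6:.0f}M Tris/s")
```

These are paper peaks only; setup limits, bandwidth and shader load would all pull real numbers down.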

On another note does anyone have a clue why besides the marketing tickbox Intel chose to go for a DX10.1 core? I bet that even if Intel is asked they have a half way reasonable answer. Even worse regarding its possible appearance in final devices and since you're comparing it to iPad2, I'd rather ask how it'll compare against iPad3.

ToTTenTranz
04-Oct-2011, 14:00
On another note does anyone have a clue why besides the marketing tickbox Intel chose to go for a DX10.1 core? I bet that even if Intel is asked they have a half way reasonable answer. Even worse regarding its possible appearance in final devices and since you're comparing it to iPad2, I'd rather ask how it'll compare against iPad3.

Direct Compute 4.1 and flash acceleration for windows (restricted for DX10+ GPUs AFAIK). Both may become important for Windows 8.


Apple's A6 difference to A5 may be small and evolutional. Looking at the performance evolution of their phones/MIDs since iphone's launch, you get:

iphone 1 -> 3G: evolutional (both ARM11 + MBX Lite)
3G -> 3GS: revolutional (ARM11 -> Cortex A8 | MBX -> SGX535)
3GS -> 4/ipad 1: evolutional (both Cortex A8 + SGX535)
4 -> 4S/ipad2: revolutional (Cortex A8 -> 2*Cortex A9 | SGX535 -> SGX543MP2 (rumoured))
4S -> 5/ipad3: ??evolutional??

That said, we may very well be looking at the very same GPU in iPad 3, maybe just changing for higher-clocked Cortex A9 or going quad-core @ 1GHz.


So Intel's 2012 Atom for tablets could still be competitive to iPad 3 in both CPU and GPU power.

Exophase
04-Oct-2011, 16:14
GMA500 aka SGX535 has half those registers:

http://www.mitrax.de/?cont=artikel&aid=36&page=2

You'll also find differences in pixels/clock (4 vs. 2), vertex rate (1 Tri/8 clocks vs. 15) and thread amount (16 vs. 4).

By the way something that I don't see mentioned anywhere SGX525-540 should be 8 z/stencil, while 545,543,544,554 should all be 16 z/stencil.

The register scaling and pixels/clock scales with the number of USSEs as you'd expect. From what I understand each USSE has a ROP and can submit a next pixel to the tile memory every clock, even if they don't necessarily have enough TMUs to sustain this rate with texturing. You'd think as USSE count scaled this would eventually break down, but the tiling interface could make this cheaper than conventional ROPs would be.

I've always taken 15 clocks/triangle to be the setup rate on SGX535. Here Intel is calling the 8 clocks/triangle rate "transform only", and they're qualifying it with the number of vertexes per triangle. This would mean 16 cycles to transform a vertex - unfortunately, we really don't know what operations it includes. It'd probably at least include a 3x3 matrix transformation against FP32 input vectors, if not 4x3 or 4x4. It could also include lighting.

But I definitely think that the "transform only" qualifier is there for a reason (nVidia has pulled the same thing with Tegra), that should mean that the number of visible triangles is lower. So I don't see how it could enjoy such an improved peak triangle rate over SGX535.
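To put the clocks/triangle figures next to the M Tris/s figures, here is a rough Python sketch of the conversion. It assumes the per-triangle cost is the only limit (nothing else bottlenecks), and the function names are hypothetical.

```python
def tris_per_second(clock_mhz, clocks_per_triangle):
    """Peak triangles per second if each triangle costs a fixed number of clocks."""
    return clock_mhz * 1e6 / clocks_per_triangle


def clocks_per_triangle(clock_mhz, tris_per_sec):
    """Inverse: clocks per triangle implied by a quoted triangle rate."""
    return clock_mhz * 1e6 / tris_per_sec


sgx535_200 = tris_per_second(200, 15)   # 15 clocks/triangle at 200MHz
sgx545_200 = tris_per_second(200, 8)    # 8 clocks/triangle at 200MHz
sgx545_640 = tris_per_second(640, 8)    # 8 clocks/triangle at 640MHz
implied = clocks_per_triangle(200, 40e6)  # what 40M Tris/s @ 200MHz would imply

print(f"15 clocks/tri @ 200MHz: {sgx535_200 / 1e6:.1f}M Tris/s")
print(f" 8 clocks/tri @ 200MHz: {sgx545_200 / 1e6:.1f}M Tris/s")
print(f" 8 clocks/tri @ 640MHz: {sgx545_640 / 1e6:.1f}M Tris/s")
print(f"40M Tris/s @ 200MHz implies {implied:.1f} clocks/tri")
```

Under that assumption a 40M Tris/s rating at 200MHz works out to 5 clocks per triangle, not 8, so the two figures are presumably measuring different things.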

Do you have any documentation that SGX545 has the 16 Z/stencil units? I'm looking for any indication that it's more than an SGX540 with DX10.1 support. If we follow the SGX530-535 analogy the difference would be DX support plus some additional capacity (2x TMUs on 535).

The 40MT/sec number at least comes from IMG:

http://www.imgtec.com/news/Release/index.asp?NewsID=516

Under improvements they list:

New features in POWERVR SGX545 include:

DirectX10.1 API support
Enhanced support for DirectX10 Geometry Shaders
DirectX10 Data assembler support (Vertex, primitive and instance ID generation)
Render target resource array support
Full arbitrary non power of two texture support
Full filtering support for F16 texture types
Support for all DirectX10 mandated texture formats
Sampling from unresolved MSAA surfaces
Support for Gamma on output pixels
Order dependent coverage based AA (anti-aliased lines)
Enhanced line rasterisation



All features, no mention of performance enhancements. It's interesting that full NPOT wasn't already a feature. I know on OMAP3 based SoCs (SGX530) it came with a huge performance penalty.

The release was after SGX543, so it'd be strange for it to not be Series 5-XT.. but it still refers to the pipelines as USSE and not USSE2.

On another note does anyone have a clue why besides the marketing tickbox Intel chose to go for a DX10.1 core? I bet that even if Intel is asked they have a half way reasonable answer. Even worse regarding its possible appearance in final devices and since you're comparing it to iPad2, I'd rather ask how it'll compare against iPad3.

Could be OpenCL related as well, although that may be even stranger.

Lazy8s
04-Oct-2011, 18:45
The 545 was showing up on roadmaps long before the Series5XT refresh was announced.

When IMG goes through the work of making a custom (numbered) core variant, they customize some of the performance attributes as opposed to just scaling (which MP could do). They made more-than-proportional increases to the triangle rate on both the 545 and cancelled 555 (which had a rate over 100M tri/s).

CarstenS
04-Oct-2011, 19:57
The article. Not the datasheet.

"The integrated PowerVR SGX 545 core has two display pipelines available, allowing to drive two fully independet displays on your device."

This is wrong, regardless of who does the display IP (but I doubt it's IMG)

Thx for pointing that out. It was indeed not clear to me upon reading the data sheet. It's a little more precise now.

Ailuros
05-Oct-2011, 13:11
Direct Compute 4.1 and flash acceleration for windows (restricted for DX10+ GPUs AFAIK). Both may become important for Windows 8.

There's still enough time to deliver DX10.x drivers before win8. So far I live under the impression that for win8 the minimum requirement is DX9 L3 (DX11 certified DX9).

Apple's A6 difference to A5 may be small and evolutional. Looking at the performance evolution of their phones/MIDs since iphone's launch, you get:

iphone 1 -> 3G: evolutional (both ARM11 + MBX Lite)
3G -> 3GS: revolutional (ARM11 -> Cortex A8 | MBX -> SGX535)
3GS -> 4/ipad 1: evolutional (both Cortex A8 + SGX535)
4 -> 4S/ipad2: revolutional (Cortex A8 -> 2*Cortex A9 | SGX535 -> SGX543MP2 (rumoured))
4S -> 5/ipad3: ??evolutional??

That said, we may very well be looking at the very same GPU in iPad 3, maybe just changing for higher-clocked Cortex A9 or going quad-core @ 1GHz.

You may freely call the iPad3 SoC evolutional and be still on the right track. If however A6 is truly a 28nm SoC and will appear only in "iPad4" in 2013 would it be time for an evolution or a revolution?

So Intel's 2012 Atom for tablets could still be competitive to iPad 3 in both CPU and GPU power.

No ;) (more below)

Exophase,

I don't think there's a public document stating the amount of z/stencil units on 545. By the way that detailed 545 announcement mentions OGL3.2 while IMG lists it in its whitepapers as 3.1 and Intel 3.0 LOL. Now try to figure out what is what; frankly I don't even know what the differences between 3.0, 3.1 and 3.2 are but it's still strange. Could be that Cedartrail will be shipping with OGL3.0 drivers.

In any case SGX545 is IMHO (at 640MHz) :

4 Vec2 ALUs = 10.24 GFLOPs/s
2 TMUs = 1.28 GTexels/s
16 z/stencil = 10.24 GPixels/s

SGX54xMP3 at just 400MHz (and yes it's just a theoretical case example for anything <40/45nm):

12 Vec4+1 ALUs = 43.2 GFLOPs/s
6 TMUs = 2.4 GTexels/s
48 z/stencil = 19.2 GPixels/s
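
The figures above can be reproduced with a short Python sketch. The FLOPs-per-clock values are assumptions inferred from the numbers themselves (a Vec2 ALU counted as 2 lanes x multiply-add = 4 FLOPs, a Vec4+1 ALU as 4 lanes x multiply-add plus one extra op = 9 FLOPs), not taken from any IMG document, and the helper name is made up.

```python
def peak_giga(units, per_unit_per_clock, clock_mhz):
    """Theoretical peak in giga-operations per second."""
    return units * per_unit_per_clock * clock_mhz / 1000.0


FLOPS_VEC2 = 4         # assumption: 2 lanes x multiply-add
FLOPS_VEC4_PLUS_1 = 9  # assumption: 4 lanes x multiply-add + 1 extra op

sgx545_640 = {
    "GFLOPs": peak_giga(4, FLOPS_VEC2, 640),
    "GTexels": peak_giga(2, 1, 640),
    "GPixels": peak_giga(16, 1, 640),
}

mp3_400 = {
    "GFLOPs": peak_giga(12, FLOPS_VEC4_PLUS_1, 400),
    "GTexels": peak_giga(6, 1, 400),
    "GPixels": peak_giga(48, 1, 400),
}

for name, part in (("SGX545 @ 640MHz", sgx545_640), ("SGX54xMP3 @ 400MHz", mp3_400)):
    print(name, ", ".join(f"{v:.2f} {k}/s" for k, v in part.items()))
```

Everything here is a paper peak; unit counts for the 545 are the estimates from this post, not confirmed specs.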

darkblu
05-Oct-2011, 13:16
I've always taken 15 clocks/triangle to be the setup rate on SGX535. Here Intel is calling the 8 clocks/triangle rate "transform only", and they're qualifying it with the number of vertexes per triangle. This would mean 16 cycles to transform a vertex - unfortunately, we really don't know what operations it includes. It'd probably at least include a 3x3 matrix transformation against FP32 input vectors, if not 4x3 or 4x4. It could also include lighting.
'Transform only' normally includes no lighting, and is against a 4x3 or a 4x4 matrix. Basically, 3 madds (3- or 4-wide each), with an optional 3- or 4-wide mul (depending on whether the vertex w component is sourced as a constant or not). For an architecture where the cost of 3-wide does not differ from 4-wide, that distinction would be irrelevant, but for the SGX5, i'd guess it's all meant to imply 3-wide fp32 madds, and some clever sourcing of clipping-space w.
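
To make the op count concrete, a small Python sketch of a position transform against a 4x3 matrix, written as 3-wide madds. It only illustrates the arithmetic described above and says nothing about how SGX actually schedules it; the matrix is passed as four 3-component columns and all names are made up.

```python
def madd(s, vec, acc):
    """3-wide multiply-add: s * vec + acc, component by component."""
    return tuple(s * v + a for v, a in zip(vec, acc))


def transform_w_constant(pos, cols):
    """w is the constant 1, so the translation column is the first addend: 3 madds."""
    x, y, z = pos
    c0, c1, c2, c3 = cols
    r = madd(x, c0, c3)
    r = madd(y, c1, r)
    return madd(z, c2, r)


def transform_w_variable(pos, cols):
    """w comes from the vertex, costing one extra 3-wide mul before the 3 madds."""
    x, y, z, w = pos
    c0, c1, c2, c3 = cols
    r = tuple(w * c for c in c3)  # the optional mul
    r = madd(x, c0, r)
    r = madd(y, c1, r)
    return madd(z, c2, r)


# Identity rotation plus a translation of (10, 20, 30).
cols = ((1, 0, 0), (0, 1, 0), (0, 0, 1), (10, 20, 30))
print(transform_w_constant((1, 2, 3), cols))
print(transform_w_variable((1, 2, 3, 1), cols))
```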

Ailuros
05-Oct-2011, 13:31
The 545 was showing up on roadmaps long before the Series5XT refresh was announced.

When IMG goes through the work of making a custom (numbered) core variant, they customize some of the performance attributes as opposed to just scaling (which MP could do). They made more-than-proportional increases to the triangle rate on both the 545 and cancelled 555 (which had a rate over 100M tri/s).

The cancelled 555 (in favour of Series5XT) should have been an 8 Vec2 ALU, 4 TMU core IMHO. No idea if its compliance would have been as high as 545's, but in terms of performance and overall flexibility the MPs that replaced it sound today like a far better idea. Especially since a partner can license an 8 Vec4 SGX554 for demanding floating point cases, which is also multi-core capable.

ToTTenTranz
05-Oct-2011, 18:15
You may freely call the iPad3 SoC evolutional and be still on the right track. If however A6 is truly a 28nm SoC and will appear only in "iPad4" in 2013 would it be time for an evolution or a revolution?

No ;) (more below)

(...)

In any case SGX545 is IMHO (at 640MHz) :

4 Vec2 ALUs = 10.24 GFLOPs/s
2 TMUs = 1.28 GTexels/s
16 z/stencil = 10.24 GPixels/s

SGX54xMP3 at just 400MHz (and yes it's just a theoretical case example for anything <40/45nm):

12 Vec4+1 ALUs = 43.2 GFLOPs/s
6 TMUs = 2.4 GTexels/s
48 z/stencil = 19.2 GPixels/s


I don't understand why you're assuming almost a ~2x clock increase for the GPU in A5's successor.
Even if it's made in 28nm and uses the same GPU, it could very well have the same clocks.
As we've seen from GLBenchmark's offscreen test, the SGX535 in 3GS' S5PC100 @ 65nm has the same clocks as the SGX535 in A4 @ 45nm.

And AFAIK, the SGX543MP2 in iPad2 could be @200MHz, so we could be looking at 21.5 GFLOPs/s, 1.2 GTexels/s and 9.6 GPixels/s in iPad2/3 against 10.24 GFLOPs/s, 1.28 GTexels/s and 10.24 GPixels/s in an Atom 32nm tablet.

Furthermore, CPU performance should be a lot higher in an Atom dual-core, four-threaded @ ~1.5GHz (?) against 2*1GHz A9 in iPad2 and maybe also faster against whatever Cortex A9 multi-core may come in A6.

Exophase
05-Oct-2011, 19:49
With 1.5GHz+ dual-core and quad-core Cortex-A9s just around the corner while still at 40nm, I doubt Apple is going to release an A6 chip that features something inferior at 28nm. It remains to be seen just what clock speed the N2600 will support; I imagine this is the only one that will really gain traction in tablets. But if we end up seeing 2GHz dual core > 1.5GHz quad core A9s I doubt Atom will have any performance advantage. And I really doubt the N2600 will support 640MHz for its SGX545 instead of 400MHz.

By only supporting 64-bit DDR3, Cedar Trail is less attractive in tablets, as is the 2.1W TDP IO hub (NM10 Express).

Ailuros
05-Oct-2011, 20:11
I don't understand why you're assuming almost a ~2x clock increase for the GPU in A5's successor.

Because smaller manufacturing processes usually allow that weird tendency maybe? 545 is clocked at 640MHz under 32nm; now what exactly is so impossible in clocking a theoretical MP3 by a tad more than 50% under 32nm? SGX545 is several times more complex than 535, yet Intel clocked the former at 640MHz@32nm while the latter is in GMA600 at 45nm clocked up to 400MHz. Guess which of the two captures the higher die area and by how much.

Just as a reminder, according to past IMG whitepapers the SGX545@200MHz was rated at 12.5mm2@60nm. SGX535 wasn't ever specifically mentioned under 60nm, but under 90nm/200MHz it was, if memory serves well, in the 7mm2 region, meaning that under 60nm/200MHz it could have been in the +/-5mm2 region. That's over twice the die area for 545 under the same process, and 60% more frequency going from SGX535/400MHz/45nm to SGX545/640MHz/32nm.

Now I'm not even suggesting twice the die area; if my estimate hadn't been as conservative I would have gone for an MP4@400MHz. Besides, Samsung will clock its new 32nm Exynos 50% higher than the 45nm one, which will bring its Mali400MP4 from 267MHz today up to 400MHz.

Even if it's made in 28nm and uses the same GPU, it could very well have the same clocks.

28nm is more like TSMC for 2013 volumes IMHO.

As we've seen from GLBenchmark's offscreen test, the SGX535 in 3GS' S5PC100 @ 65nm has the same clocks as the SGX535 in A4 @ 45nm.

No it hasn't; go back to the thread and re-read the replies.

And AFAIK, the SGX543MP2 in iPad2 could be @200MHz, so we could be looking at 21.5 GFLOPs/s, 1.2 GTexels/s and 9.6 GPixels/s in iPad2/3 against 10.24 GFLOPs/s, 1.28 GTexels/s and 10.24 GPixels/s in an Atom 32nm tablet.

No, it's at 250MHz as a minimum, and that's exactly because of the GLBenchmark2.1 fillrate tests.

By the way, at 200MHz the MP2 would have 14.4 GFLOPs, 800 MTexels, 6.4 GPixels.

Furthermore, CPU performance should be a lot higher in an Atom dual-core, four-threaded @ ~1.5GHz (?) against 2*1GHz A9 in iPad2 and maybe also faster against whatever Cortex A9 multi-core may come in A6.

I wasn't and am not talking about iPad2; if you can find any device with a GMA600 in it that would be the real indirect competitor, but for lack thereof it's more like competing against itself.

I am talking about Apple's next SoC and the possibilities there with which Cedartrail will actually indirectly compete if history doesn't repeat itself.

rpg.314
05-Oct-2011, 20:15
So Intel's 2012 Atom for tablets could still be competitive to iPad 3 in both CPU and GPU power.

A15 is going to squash Atom, any which way you slice it.

rpg.314
05-Oct-2011, 20:16
I don't understand why you're assuming almost a ~2x clock increase for the GPU in A5's successor.
Even if it's made in 28nm and uses the same GPU, it could very well have the same clocks.
Not increasing the clocks even with a full node shrink would be just plain dumb.

mczak
05-Oct-2011, 20:53
Not increasing the clocks even with a full node shrink would be just plain dumb.
The free clock increases with shrinks are long over (since 90nm or so).
What you could do is redesign the logic (e.g. more transistors and hence die area) so it's able to run at a higher clock with lower voltage.
I have no idea what the plans are there, but I think either basically the same clock (possibly with MP3/4 if more performance is "needed") or a higher clock is a possible solution for future chips.
I don't think the 640MHz SGX545 clock has much to do with the shrink; it comes at a quite "massive" power cost ("massive" here in the context of low power devices). If you look at the datasheet, the 640MHz SGX545 in the N2800 has an I_TDC (I think that means sustained current, but I'm not sure) of exactly twice that of the 400MHz N2600 (2.6A vs. 1.3A). Together with the no doubt higher voltage it will have (the datasheet doesn't list it, as it's multiple-VID, just saying between 0.75V and 1.05V), you're most likely looking at something like 1W for the 400MHz version vs. 2.5W for 640MHz.
So the 640MHz SGX clock instead of 400MHz looks more like the result of having a higher power envelope (since there's no separate netbook Atom anymore) than anything else.
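
A rough sanity check of those wattage guesses in Python: simply P = I x V over the VID window. It assumes I_TDC really is the sustained current on the graphics rail and that the rail sits somewhere between 0.75V and 1.05V; the actual operating voltages aren't in the datasheet, and the function name is made up.

```python
def power_range_w(current_a, v_min=0.75, v_max=1.05):
    """Return (min, max) power in watts for a current over a voltage window."""
    return current_a * v_min, current_a * v_max


n2600 = power_range_w(1.3)  # 400MHz SGX545, I_TDC 1.3A
n2800 = power_range_w(2.6)  # 640MHz SGX545, I_TDC 2.6A

for name, (p_min, p_max) in (("N2600", n2600), ("N2800", n2800)):
    print(f"{name}: {p_min:.2f}W to {p_max:.2f}W")
```

That gives roughly 1.0W to 1.4W for the 400MHz part and 2.0W to 2.7W for the 640MHz part, so ~1W vs. ~2.5W fits if the slower part runs near the bottom of the window and the faster one near the top.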

edit: the numbers are probably not comparable (I think the numbers for the old chips include more than just graphics), but the E600 series has VNN (which is used for graphics) in the range from 0.75 to 0.9875V - hence the max is actually slightly higher for Cedar Trail, and I bet that's for the higher possible clock. The old E600 Atoms are also rated for 1.6A max there - so the 640MHz SGX545 on 32nm definitely needs more power than the 400MHz SGX535 on 45nm did. The 400MHz SGX545 looks like quite an improvement though, considering the expected performance difference.

I wonder what the TDP of that A5 SGX543MP2 at 200MHz is - anyone got a datasheet :-). I assume though it needs to be in the same ~1W range max?

Ailuros
05-Oct-2011, 21:05
So the 640MHz SGX clock instead of 400MHz looks more like the result of having a higher power envelope (since there's no separate netbook Atom anymore) than anything else.

True yet if comparing aspects with a SGX545@400MHz in mind, the picture in terms of performance comparisons against any MPx/32nm alternative doesn't look any better, rather the contrary.