
View Full Version : Llano IGP vs SNB IGP vs IVB IGP



AnarchX
29-Oct-2010, 14:40
How do you think they will compare?

Llano:
- 32nm
- 400SPs (5D VLIW) @ up to 600MHz
- dual-channel DDR3 @ ~ 1.6Gbps
- mid 2011

Intel Graphics HD 200:
- 32nm
- 12 EUs (4D MADDs?) with doubled throughput over the last generation, 4 TMUs, clocks up to 1.35GHz
- Direct3D 10.1 support, OpenCL, DirectCompute
- connected to 8MiB LL-cache
- dual-channel DDR3 @ ~ 1.6Gbps
- early 2011

Ivy Bridge Graphics:
- 22nm
- 16 EUs according to Intel
- Direct3D 11 support
- stacked DRAM?
- early 2012

Chabi
29-Oct-2010, 16:43
SNB IGP OpenCL compatible?

AnarchX
29-Oct-2010, 16:44
This graphics core does, however, add antialiasing support so that it can move to DirectX 10.1. It also supports OpenGL 3.1 and, more interestingly, OpenCL. DirectCompute version 4.1 is also on the menu.
http://www.hardware.fr/articles/803-5/idf-2010-atom-sandy-bridge-honneur.html

mczak
29-Oct-2010, 18:01
Intel Graphics HD 200:
- 12 EUs (4D MADDs?) with doubled throughput over the last generation, 4 TMUs, clocks up to 1.35GHz

The EUs still can't do MAD. They can, however, do MAC (with a special accumulator reg), and, in contrast to the last generation, enable/disable accumulator update per instruction, which might make it easier to exploit this. Earlier EUs were 4D physical, 8D logical (well they had 4D mode but such a 4D instruction still took 2 cycles), so it's possible (but I don't know) they are 8D physical now (which would explain the "double throughput" but maybe that quote was meant to describe something else).
I'm quite sure there were 8 TMUs even for i965 already (though not sure what they could do per clock), and I certainly wouldn't expect SNB to have fewer (in theory, it could have more: since it appears some versions will have 6 EUs and others 12 EUs, it's possible at least on paper that the tmu block isn't shared).
In any case, texture fillrate should be quite good even with 8 TMUs (possibly approaching Llano levels), with the caveat I've no idea about FP16 etc. For flops, if that's 4D units, you're looking at ~120GFlops if you count that MAC as 2 ops. If that's 8D units, well then that's twice that, which would begin to look nearly comparable to Llano.
So for Ivy Bridge, if that basically doubles SNB graphics performance, that could be quite a challenge for Llano. Though of course there's a lot more to graphics performance than just alus/tmus - one area where intel was very weak was what AMD initially named HyperZ, things like early-z (though intel can do this now), z buffer compression etc to save bandwidth. I think though SNB improves this quite a bit, and the 8MB cache could give it a huge advantage in some situations since these chips are quite a bit bandwidth-challenged.
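
A rough sanity check of the flops estimate in the post above, as a minimal sketch. It assumes the 1.35GHz peak clock from the first post and counts a MAC as 2 ops, as the post does; the function name is made up for illustration.

```python
def peak_gflops(units: int, lanes: int, clock_ghz: float, ops_per_lane: int = 2) -> float:
    """Theoretical peak GFLOPS = units * lanes * ops per lane per clock * clock (GHz)."""
    return units * lanes * ops_per_lane * clock_ghz


# 12 EUs at the rumoured 1.35GHz peak clock (assumption taken from the first post).
snb_4d = peak_gflops(units=12, lanes=4, clock_ghz=1.35)  # if the EUs are 4D physical
snb_8d = peak_gflops(units=12, lanes=8, clock_ghz=1.35)  # if the EUs are 8D physical

print(f"4D EUs: {snb_4d:.1f} GFLOPS")
print(f"8D EUs: {snb_8d:.1f} GFLOPS")
```

At the full 1.35GHz the 4D case lands near 130 GFLOPS; the ~120 figure in the post would correspond to a clock of about 1.25GHz.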

AnarchX
09-Nov-2010, 18:54
Next-Gen Fusions Trinity and Komodo: http://www.abload.de/img/amddesktop126q72.jpg

- still 32nm
- probably L3-Cache connection for IGP
- probably increased die-size (Thuban level ~300mm²) which should allow to increase SIMDs from 6 to 10 (800SPs @ 5D, 640SPs @4D)
- probably mid 2012 release
- Komodo probably with 3 memory channels or GDDR5 sideport

chavvdarrr
09-Nov-2010, 20:05
Next-Gen Fusions Trinity and Komodo: http://www.abload.de/img/amddesktop126q72.jpg

- still 32nm
- probably L3-Cache connection for IGP
- probably increased die-size (Thuban level ~300mm²) which should allow to increase SIMDs from 6 to 10 (800SPs @ 5D, 640SPs @4D)
- probably mid 2012 release
- Komodo probably with 3 memory channels or GDDR5 sideport
I had a feeling that Zacate has 2 SIMDs with 80SPs total

AnarchX
10-Nov-2010, 07:24
I had a feeling that Zacate has 2 SIMDs with 80SPs total
The topic is about higher performance APUs/CPU-IGP-chips: Llano IGP vs SNB IGP vs IVB IGP.

hkultala
10-Nov-2010, 07:41
Next-Gen Fusions Trinity and Komodo: http://www.abload.de/img/amddesktop126q72.jpg

- still 32nm


Yes, of course.


- probably L3-Cache connection for IGP


I see nothing suggesting this.


- probably increased die-size (Thuban level ~300mm²) which should allow to increase SIMDs from 6 to 10 (800SPs @ 5D, 640SPs @4D)


.. except that Llano will not have 6 but 3 SIMD cores (240 ALUs).
And I don't expect them to increase die size much; it would be too costly to manufacture.

My estimate is increase from 3(*80) to 4(*64)



- probably mid 2012 release
- Komodo probably with 3 memory channels or GDDR5 sideport

AMD has never used non-power-of-2 memory buses before. I don't expect them to do it with Komodo either.

hkultala
10-Nov-2010, 12:30
- Komodo probably with 3 memory channels or GDDR5 sideport

And there won't be a sideport in a chip which does not contain a GPU.

AMD's PDF document for the investor day:

http://phx.corporate-ir.net/External.File?item=UGFyZW50SUQ9Njk3NDJ8Q2hpbGRJRD0tMXxUeXBlPTM=&t=1


“Komodo”
Market: Server and Performance Desktops
What is it? “Komodo” is AMD’s next generation CPU and is primarily intended for
servers and high-performance desktops. “Komodo” will feature next-generation
“Bulldozer” CPU cores and, in desktop PC platforms, is designed to couple with
DirectX® 11 GPUs to provide enthusiast-level system performance.
Planned for introduction: 2012

caveman-jim
10-Nov-2010, 20:00
"designed to couple with" doesn't prove the existence of sideport.

keritto
11-Nov-2010, 05:14
Next-Gen Fusions Trinity and Komodo: http://www.abload.de/img/amddesktop126q72.jpg


Komodo is listed as a CPU, and you should differentiate it from Llano and NG-Trinity, as can be seen in the slides :wink:

Komodo is a CPU, and my guesstimate is that it will probably be augmented with a GPU similar to the one used in the Ontario/Zacate APUs: up to 80SPs (5D-VLIW), but more probably 64SPs of "3rd Gen DX11" 4D-VLIW, with the TMU:ROP setup unchanged from O/Z. My guess is that Komodo will address the lack of IGPs in new chipsets and also make it more comparable to Intel's SB. It will also be socket compatible with Zambezi (AM3r2).

As for the Trinity APU, the slides show 2-4 BD cores; I am in fact hoping for 4-6 BD cores and "3rd Gen DX11" (SI), with maybe a minor upgrade from 480SPs 5D (EG/"NI" shaders) in Llano to 640SPs 4D (SI shaders). But then maybe AMD will stay with 2-4 BD cores just so they can add the necessary 4MB of L3 cache instead of 2 extra BD cores.

Trinity
2-4BD cores (4MB L2 cache)
4MB L3 cache
640SP (4D DX11 gen3)
sFM1/sFS1

or better (?)
4-6BD cores (6MB L2 cache)
no L3 cache
640SP (4D DX11 gen3)
sFM1/sFS1

The second solution would certainly need less work to adapt the Llano-style APU design to the Trinity design.

And does the GPU really benefit from an additional 4MB of L3 instead of the already large 6MB of L2 (total for six BDv1 cores) available in the HPC case? For most 3D/gaming work, Llano and probably Trinity will rely on cheap 128-bit DDR3 1866MHz memory giving 30GB/s of bandwidth in total (shared with the CPU), which is probably good enough even for budget dual-display 1080p noAA/noAF gaming (considering the touted 640SPs), or single 1080p 2AA/16AF?
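
The 30GB/s figure in the post above follows from bus width times transfer rate. A minimal sketch, assuming the "1866MHz" refers to DDR3-1866, i.e. 1866 million transfers per second; the function name is hypothetical.

```python
def ddr_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Peak bandwidth in decimal GB/s: bytes per transfer * mega-transfers per second / 1000."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# Dual-channel (2 x 64-bit = 128-bit) DDR3-1866, shared between CPU and IGP.
bandwidth = ddr_bandwidth_gb_s(bus_width_bits=128, transfer_rate_mt_s=1866)
print(f"{bandwidth:.1f} GB/s peak")
```

This is a theoretical peak; the usable share for the IGP would be lower since the CPU cores draw from the same bus.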

hkultala
11-Nov-2010, 06:45
As for the Trinity APU, the slides show 2-4 BD cores; I am in fact hoping for 4-6 BD cores and "3rd Gen DX11" (SI), with maybe a minor upgrade from 480SPs 5D (EG/"NI" shaders) in Llano to 640SPs 4D (SI shaders). But then maybe AMD will stay with 2-4 BD cores just so they can add the necessary 4MB of L3 cache instead of 2 extra BD cores.


More than 4 bulldozer cores/2 bulldozer modules would make it too big.
It's still manufactured at 32nm, and it's not a high-end product, so it must not be too big/too expensive to manufacture.

And I don't see L3 cache as "necessary thing" for this market segment. With 2*2 MB L2 cache there is already plenty of cache.

mczak
11-Nov-2010, 12:46
As for the Trinity APU, the slides show 2-4 BD cores; I am in fact hoping for 4-6 BD cores and "3rd Gen DX11" (SI), with maybe a minor upgrade from 480SPs 5D (EG/"NI" shaders) in Llano to 640SPs 4D (SI shaders).

I really don't see the 480SPs in Llano - not with the flop numbers AMD quoted. More like 240SP IMHO.

And I don't see L3 cache as "necessary thing" for this market segment. With 2*2 MB L2 cache there is already plenty of cache.
Well, the advantage of L3 is that you can use it for graphics too - L2 being exclusive to the cpu cores. This also probably means you can make the L2 cache attached to the ROPs smaller if you've got shared L3 and it's still faster (as the gpu l2 cache wasn't that large). Clearly, for Phenom II / Athlon II the L3 cache did not really help THAT much - but that balance should shift towards the solution with L3 cache in terms of performance benefits / area if you can also use it for the graphic core. It might require some changes to the MC/graphic core though, which might be something AMD isn't willing to do (as they couldn't just use basically unchanged discrete gpu cores).

hkultala
11-Nov-2010, 21:43
Well, the advantage of L3 is that you can use it for graphics too - L2 being exclusive to the cpu cores.


What makes this an advantage?

hkultala
11-Nov-2010, 21:45
I really don't see the 480SPs in Llano - not with the flop numbers AMD quoted. More like 240SP IMHO.


Yep.

And the size of the GPU part of the chip also seems to indicate it has 240 shader ALU's, not 480.

Alexko
11-Nov-2010, 21:47
I really don't see the 480SPs in Llano - not with the flop numbers AMD quoted. More like 240SP IMHO.

They said 500+ GFLOPS. That sounds to me like 480SPs @ ~550MHz or maybe 400SPs @ ~630MHz.

240SPs at ~1040MHz just doesn't seem realistic, power-wise.

http://tof.canardpc.com/preview/c98294ef-2e81-49fc-a71c-e3ce075f3e12.jpg (http://tof.canardpc.com/view/c98294ef-2e81-49fc-a71c-e3ce075f3e12.jpg)

That GPU-part looks to be around 100mm², which is close to Redwood's size, but on 32nm.
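
The three SP/clock combinations in the post above can be checked with a short sketch. It assumes each SP retires one multiply-add (2 flops) per clock, which is the usual way these peak numbers are counted; the names are made up for illustration.

```python
FLOPS_PER_SP_PER_CLOCK = 2  # assumption: one multiply-add per SP per clock


def gpu_gflops(sps: int, clock_mhz: float) -> float:
    """Theoretical peak GFLOPS for a given SP count and shader clock in MHz."""
    return sps * FLOPS_PER_SP_PER_CLOCK * clock_mhz / 1000


# The candidate configurations discussed for a "500+ GFLOPS" Llano GPU.
for sps, mhz in [(480, 550), (400, 630), (240, 1040)]:
    print(f"{sps} SPs @ {mhz} MHz: {gpu_gflops(sps, mhz):.1f} GFLOPS")
```

All three land at roughly 500 GFLOPS, so the argument is really about which clock is plausible, not about the arithmetic.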

mczak
12-Nov-2010, 03:25
They said 500+ GFLOPS. That sounds to me like 480SPs @ ~550MHz or maybe 400SPs @ ~630MHz.

240SPs at ~1040MHz just doesn't seem realistic, power-wise.

The quote was 400-500 GFlops. And from how it was worded, it was for the whole chip. Which leaves 300-400Gflops for the GPU. With 240SPs that gives you 625-830Mhz. Sounds doable to me.

That GPU-part looks to be around 100mm², which is close to Redwood's size, but on 32nm.
You are right it looks quite big.
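
The 625-830MHz range mczak derives above from a 300-400 GFlops GPU share can be reproduced by inverting the peak-flops formula. A minimal sketch, again assuming 2 flops per SP per clock; the function name is hypothetical.

```python
def required_clock_mhz(target_gflops: float, sps: int, flops_per_sp_per_clock: int = 2) -> float:
    """Shader clock in MHz needed to reach target_gflops with the given SP count."""
    return target_gflops * 1000 / (sps * flops_per_sp_per_clock)


# GPU share of the quoted whole-chip figure, per the post: 300 to 400 GFlops.
low = required_clock_mhz(300, sps=240)
high = required_clock_mhz(400, sps=240)
print(f"240 SPs need {low:.0f} to {high:.0f} MHz")

# For comparison, the "500 GFlops for the GPU alone" reading with 400 SPs.
print(f"400 SPs need {required_clock_mhz(500, sps=400):.0f} MHz for 500 GFlops")
```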

Alexko
12-Nov-2010, 06:57
The quote was 400-500 GFlops. And from how it was worded, it was for the whole chip. Which leaves 300-400Gflops for the GPU. With 240SPs that gives you 625-830Mhz. Sounds doable to me.

You are right it looks quite big.

There was another comment during analyst day, where the guy said 500+ GFLOPS, worded in a way that makes me think it was just for the GPU. I don't have time right now but I'll try to find a link to it later today.

mczak
12-Nov-2010, 13:16
There was another comment during analyst day, where the guy said 500+ GFLOPS, worded in a way that makes me think it was just for the GPU. I don't have time right now but I'll try to find a link to it later today.
Even with 500+ gflops for the gpu, shouldn't 400 SPs be more than sufficient? That would only need 625Mhz. Shouldn't the 32nm SOI process actually allow clock increases over 40nm bulk? Granted the structure doesn't really look like that. But it would be strange imho if there would be so many simds (hence increasing cost) but then they'd be clocked so low.

Alexko
12-Nov-2010, 13:45
Even with 500+ gflops for the gpu, shouldn't 400 SPs be more than sufficient? That would only need 625Mhz. Shouldn't the 32nm SOI process actually allow clock increases over 40nm bulk? Granted the structure doesn't really look like that. But it would be strange imho if there would be so many simds (hence increasing cost) but then they'd be clocked so low.

400 SPs seems plausible, but 240 doesn't, IMO.

I can't find a free transcript for Tuesday's analyst day, but I think the quote in question was during the Client platforms breakout session, for which the webcast is still available.

Karoshi
12-Nov-2010, 18:41
Why are there no APU's GPUs running at 2+ GHz?

DavidC
15-Nov-2010, 09:00
Why are there no APU's GPUs running at 2+ GHz?

It's all about balance. Remember, we aren't talking about the 1980s, when components drew 3W and got by with passive cooling. We are already limited by cooling and power consumption.

It's probably better to get 400SPs at 650MHz than 200SPs at 1300MHz. GPU code has extremely high parallelism, so adding more SPs is easier than clocking them higher.

Nvidia does have high clock speeds for its SPs, but again, it's just for the SPs. All other blocks clock much lower. ATI's design calls for having everything run at the base clock. I guess they can change it, but it's not something that'll happen overnight.

Even if the process technology, thermal and power limits, and costs of development allow clocking the GPU at 2GHz, does the design allow it?

hkultala
15-Nov-2010, 19:48
It seems intel is finally at least developing an OpenCL implementation for their integrated GPUs:

They just sent an email to the llvm-developers list, recruiting people to develop their llvm-based OpenCL implementation:



LLVM Software engineer at Intel, CA (Santa Clara or Folsom)

In this position, you will be responsible for designing and developing highly competitive OpenCL (Open Compute Language, a new industry standard for heterogeneous data and task parallel computing across GPU's and CPU's). You will be supporting on integrated graphics processors. This includes a JIT compiler, a library of built-in functions and OpenCL runtime driver support. Responsibilities (depending on your skill set) will include applying state of the art compilation/JIT technology, knowledge of high performance math algorithms and system architecture skills to allow applications to tap into the computation power of GPUs previously only available to graphics applications ....

rpg.314
30-Dec-2010, 06:28
http://www.semiaccurate.com/2010/12/29/intel-puts-gpu-memory-ivy-bridge/

Interesting.

GZ007
30-Dec-2010, 10:42
http://www.semiaccurate.com/2010/12/29/intel-puts-gpu-memory-ivy-bridge/

Interesting.

The size and the bandwidth don't sound too realistic to me.
Wouldn't they already be using it in server CPUs if they could get 1 GB of memory at 5770 speeds in the Ivy Bridge design?

rpg.314
30-Dec-2010, 12:06
Not quite. Graphics can live with memory under 1 gig. Server workloads can't. Besides, it is not immediately clear that it will have lower latency than normal dram though it is likely. Besides, you would want to start with a lower risk product.

Arun
30-Dec-2010, 12:13
512-bit LPDDR2 stacks? That's unlikely although not strictly impossible. Also describing LPDDR2 as "old" when there's barely any smartphone using it today proves only that Charlie doesn't know enough about that part of the market to speculate intelligently about it. It's interesting that nobody is thinking of doing that kind of bus width before the JEDEC Wide I/O standard with TSV (Through Silicon Vias i.e. 3D packaging) but Intel is hardly using a traditional approach here so standards are not very relevant.

LPDDR2 chips are always 32-bit and the only official JEDEC packages are for Package-on-Package configurations. The maximum is a 64-bit PoP package with 2 or more chips. But Intel could certainly buy the raw chips and stack it themselves (something they couldn't do with GDDR5 perhaps) - they'd need one chip for every 32-bit. That would mean 16 chips for 512-bit (each 512Mbit for 1GB). That's a HUGE stack - this isn't going to be a thin package if true! Obviously that would be the top SKU and not aimed at ultraportables or netbooks - but the problem is that if you've got only 256MB then you can't have more than a 128-bit memory bus (there are no 256Mbit chips and 512Mbit isn't the most cost efficient standard already). They'd also be wasting a fairly huge 384-bit worth of their memory controller! It doesn't matter that the memory chips are closer and that the CPU's pitch is smaller, the memory controller (and probably PHYs) are still going to cost the same - that is to say... quite a lot!

Could Ivy Bridge be doing something fancy with in-package memory resulting in a substantial performance boost? Yes. But I'm not convinced it's what Charlie is describing if so. Either way I'd like one of these - if he's wrong, deeply stacked enough to be smoked.
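
The chip-count arithmetic in the post above can be sketched as follows. It takes the post's premises as given (every LPDDR2 chip is 32 bits wide, and 512Mbit is the smallest practical density); the function names are made up for illustration.

```python
CHIP_WIDTH_BITS = 32  # per the post: LPDDR2 chips are always 32-bit


def stack_config(bus_width_bits: int, chip_density_mbit: int) -> tuple[int, int]:
    """Return (number of chips, total capacity in MB) for a given bus width."""
    chips = bus_width_bits // CHIP_WIDTH_BITS
    capacity_mb = chips * chip_density_mbit // 8
    return chips, capacity_mb


def max_bus_width_bits(capacity_mb: int, chip_density_mbit: int) -> int:
    """Widest bus reachable when the capacity is built from chips of one density."""
    chips = capacity_mb * 8 // chip_density_mbit
    return chips * CHIP_WIDTH_BITS


print(stack_config(512, 512))        # 512-bit bus from 512Mbit chips
print(max_bus_width_bits(256, 512))  # bus width available with only 256MB
```

With these premises a 512-bit bus needs 16 chips for 1GB, while a 256MB part can only populate 128 bits, leaving the other 384 bits of the controller unused.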

3dilettante
30-Dec-2010, 14:54
Silicon interposers have been part of an FPGA vendor's product plans already, so that is doable.
I think Altera was the one. (edit: Xilinx)
I hope the drawn diagram isn't too accurate, since that would require one massive glob of thermal grease to reach from the CPU to the bottom of the heatspreader.

It's a slight step backwards from the progressively more unified GPU/CPU memory hierarchy introduced by Sandy Bridge.
The on-die memory hierarchy on the CPU could still be unified, but there would be a secondary memory controller that would be primarily useful to the GPU.
Perhaps at some point it would just be a DRAM L4 cache? It seems like a waste to have it idling if someone opts out of the on-board graphics.

rpg.314
30-Dec-2010, 15:01
Silicon interposers have been part of an FPGA vendor's product plans already, so that is doable.
I think Altera was the one.
I hope the drawn diagram isn't too accurate, since that would require one massive glob of thermal grease to reach from the CPU to the bottom of the heatspreader.
Sounds like AMD *could* do it reasonably cheaply.

It's a slight step backwards from the progressively more unified GPU/CPU memory hierarchy introduced by Sandy Bridge.
The on-die memory hierarchy on the CPU could still be unified, but there would be a secondary memory controller that would be primarily useful to the GPU.
Perhaps at some point it would just be a DRAM L4 cache? It seems like a waste to have it idling if someone opts out of the on-board graphics.

I assume that the driver will preferentially allocate memory for graphics objects from this pool.

Teaching libc malloc to not touch this would be a different matter though. Or can drivers lock down a segment of physical memory to themselves?

rpg.314
30-Dec-2010, 15:28
Also describing LPDDR2 as "old" when there's barely any smartphone using it today proves only that Charlie doesn't know enough about that part of the market to speculate intelligently about it.
I remember reading on these forums (not sure who said it) that LPDDR2 was expensive and hence wasn't being used. It COULD be that charlie meant that the standard had been finalized a while ago.

fehu
30-Dec-2010, 15:40
http://www.semiaccurate.com/static/uploads/2010/12_december/Ivy_Bridge_diagram.png

He even added the watermark in the easiest spot to delete it :grin:

Arun
30-Dec-2010, 15:42
I remember reading on these forums (not sure who said it) that LPDDR2 was expensive and hence wasn't being used. It COULD be that charlie meant that the standard had been finalized a while ago.
Pretty sure I'm the one who said that ;) Specifically that the Apple A4 used 64-bit LPDDR1 instead of 32-bit LPDDR2 because 512MB of the latter would be a lot more expensive in that timeframe (and might not even have been available in the volumes Apple needs). Hopefully in early 2012 there wouldn't be a huge price difference versus LPDDR1 anymore, but there would still be a big price difference versus DDR3. No idea how it would compare per megabyte versus GDDR5. Expect plenty of LPDDR2 devices in 1H11 (starting with the LG Optimus 2X using Tegra2).

3dilettante
30-Dec-2010, 15:49
Sounds like AMD *could* do it reasonably cheaply.

Could, if that is the direction they choose. GlobalFoundries may have some input on this.
The additional question is "when".

It's not only process technology Intel has historically beat AMD on by a wide margin.
Packaging technology has also been a strong suit for Intel, with AMD usually lagging by a fair amount.

Also, I checked and it was Xilinx that had the silicon interposer tech.

rpg.314
30-Dec-2010, 16:17
Packaging technology has also been a strong suit for Intel, with AMD usually lagging by a fair amount.
What is there in packaging tech to beat your competitor with? PPro's L2 cache comes to mind but doesn't seem like that big a deal.

3dilettante
30-Dec-2010, 16:35
Intel transitioned faster to organic substrates when that first came into use, and faster to use LGA packages.
It was faster to eliminate lead from its packaging, and one of the first to get a handle on the reliability issues that arose because of it.

Intel was also able to mass-produce dual-die packages much earlier than AMD. This was perhaps due to necessity, but this predates AMD's MCM by years.
As a result, it beat AMD's single-chip multicores to market, both for the dual and quad-core transitions.

TKK
30-Dec-2010, 19:59
Intel transitioned faster to organic substrates when that first came into use, and faster to use LGA packages.
It was faster to eliminate lead from its packaging, and one of the first to get a handle on the reliability issues that arose because of it.

Intel was also able to mass-produce dual-die packages much earlier than AMD. This was perhaps due to necessity, but this predates AMD's MCM by years.
As a result, it beat AMD's single-chip multicores to market, both for the dual and quad-core transitions.

These are simply situations where having tremendous resources to throw at certain issues pays off.
Their resources sure allow them to react very quickly.


Anyway, while you can never know with Intel, I wouldn't be surprised if the IB incarnation comes with something moderate like 128-bit/256MB, basically Intel's own answer to 'sideport memory'. Too much of this would drive up cost and make cooling rather challenging, I think.

I also wouldn't be surprised if development of this started around the time AMD showcased sideport memory.

DavidC
02-Jan-2011, 05:40
Anyway, while you can never know with Intel, I wouldn't be surprised if the IB incarnation comes with something moderate like 128-bit/256MB, basically Intel's own answer to 'sideport memory'. Too much of this would drive up cost and make cooling rather challenging, I think.

I also wouldn't be surprised if development of this started around the time AMD showcased sideport memory.

Doesn't ANYONE remember the leaked slide with Gesher and Larrabee stating similar things?

0-512MB 64GB/s bandwidth

DarthShader
02-Jan-2011, 10:15
http://www.xtremesystems.org/forums/showpost.php?p=4686529&postcount=45

Some info from Dr.Who? - ca. 60% hit rate of the L3 cache for the IGP. Concludes SB graphics will be faster than Llano - because the latter will be bandwidth starved and have shaders idling.

itsmydamnation
02-Jan-2011, 13:56
Some info from Dr.Who? - ca. 60% hit rate of the L3 cache for the IGP. Concludes SB graphics will be faster than Llano - because the latter will be bandwidth starved and have shaders idling.

Current high-end AMD GPUs have something like 7-8MB of RAM on them; do they actually need to hit cache at all? To me, and I'm a layman, thinking about this logically, it doesn't make much sense.

GPUs are:
high memory latency
high memory throughput

I would have thought that GPUs don't get that much duplicate data pulled over the memory bus, which is where a cache would help reduce memory bandwidth. Also, if it's getting a 60% hit rate, what about cache thrashing in CPU-intensive games?

If that tiny amount of cache actually helped a GPU, you'd think we would have seen it by now.

The other thing a cache does is reduce latency, but who cares about that for a GPU? So to me that conclusion makes little sense, and until we know how Llano's memory controller works, how can anything be gauged?

edit: also, is SB's cache structure still inclusive? Can SB prefetch from memory straight into L3, or does it have to go straight to L1 like AMD?

ShaidarHaran
02-Jan-2011, 16:51
http://www.xtremesystems.org/forums/showpost.php?p=4686529&postcount=45

Some info from Dr.Who? - ca. 60% hit rate of the L3 cache for the IGP. Concludes SB graphics will be faster than Llano - because the latter will be bandwidth starved and have shaders idling.

It confirms no such thing. It means exactly what it says. SNB graphics will potentially have access to a greater subset of data at a lower latency than Llano. Latency is practically irrelevant with such a parallel workload as graphics processing. Throughput is key.

compres
02-Jan-2011, 17:49
It confirms no such thing. It means exactly what it says. SNB graphics will potentially have access to a greater subset of data at a lower latency than Llano. Latency is practically irrelevant with such a parallel workload as graphics processing. Throughput is key.

I stopped reading when he mentioned Amdahl's law in this context...

TKK
03-Jan-2011, 16:27
Now that SB reviews are all over the net, anyone else here who fails to see why the IGP is praised so much in some of the reviews?

The way I see it, the only reason why Intel was able to catch up in performance is not that SB's IGP is so great, it's simply that both low-end discrete and AMD IGP performance have stagnated for a long, long time.
The 790GX chipset was released in August 2008, the Radeon 4550 September 2008. That's 2 1/4 years.

Furthermore, right now Intel still lags behind in [driver] stability and image quality. Performance hit from enabling AA/AF is much heavier than on a 5450 as well.

So Intel is getting closer, but I think they're not quite there yet.

Chabi
03-Jan-2011, 16:59
What is the GFlop/s performance of the SNB IGP?

How could it be calculated?

mczak
03-Jan-2011, 17:42
Now that SB reviews are all over the net, anyone else here who fails to see why the IGP is praised so much in some of the reviews?

The way I see it, the only reason why Intel was able to catch up in performance is not that SB's IGP is so great, it's simply that both low-end discrete and AMD IGP performance have stagnated for a long, long time.
The 790GX chipset was released in August 2008, the Radeon 4550 September 2008. That's 2 1/4 years.

This is certainly a reason. But you can't fault intel for that.

So Intel is getting closer, but I think they're not quite there yet.
Haven't seen AA results yet, but from the looks of it intel isn't interested in having somewhat fast graphics for the desktop at all. Why else would they only enable all 12 EUs in the K editions, which have unlocked cpu multipliers you can't use in the H67? Also, why only support DDR3-1333 even though the iGPU could likely benefit from faster ram (which you can only use with P67) and even though mobile cpus actually support DDR3-1600?
I agree though the 6 EU version isn't really that exciting. Well it's not surprising with half the cores of the old Ironlake (no matter how improved they are and that they are clocked higher - clearly the architecture improvements are there).
What is the GFlop/s performance of the SNB IGP?

How could it be calculated?
EUs are still four-wide (128bit, as confirmed by anandtech), so 12 EUs are good for 48 flops per cycle. Or you can say twice that if you accept counting multiply-accumulate using the accumulator reg as two flops (unlike AMD/NVIDIA, these intel igps can't do MAD, only MAC).

(So the gflops rating per clock did not actually change since last gen for the 12 EU version - though it should be much more easily possible to use the accumulator with Sandy Bridge, as you can now enable/disable it per instruction.)
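
The per-clock counting described above, as a minimal sketch. The four-lane EU width and the MAC-as-two-flops convention come from the post; the function name and the flag are made up for illustration.

```python
def eu_flops_per_clock(eus: int, lanes: int = 4, count_mac_as_two: bool = False) -> int:
    """Flops per clock for a block of EUs, optionally counting each MAC as two flops."""
    per_lane = 2 if count_mac_as_two else 1
    return eus * lanes * per_lane


for eus in (6, 12):
    plain = eu_flops_per_clock(eus)
    with_mac = eu_flops_per_clock(eus, count_mac_as_two=True)
    print(f"{eus} EUs: {plain} flops/clock, {with_mac} with MAC counted as 2")
```

Multiplying the per-clock number by the graphics clock in GHz gives the GFlop/s rating, so the answer depends on which clock the particular SKU actually reaches.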

Gipsel
03-Jan-2011, 19:11
http://www.xtremesystems.org/forums/showpost.php?p=4686529&postcount=45

Some info from Dr.Who? - ca. 60% hit rate of the L3 cache for the IGP.
To be honest, I would consider a 60% hit rate for an 8 MB cache as piss poor. Intel usually claims much higher hit rates for their CPUs (90+%).

In some twisted sense, you could already claim a 50% hit rate for a tiny buffer storing just a burst read from memory when you are doing sequential accesses. With a 128-bit interface this would be 128 bytes, and when working with a cache line size of 64 bytes you could claim that every second line comes from this buffer and is therefore a hit. But does it help anything?

Sharing the last level cache may have the benefit of lower-latency draw calls, or it can serve as an extended buffer for operations which write an intermediate amount of data to memory and reuse it later. But for most traditional graphics workloads it's probably hardly worth the added effort compared to the old-fashioned texture and color caches of GPUs (which serve mainly as "bandwidth amplifiers").
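
The "50% hit rate from a burst buffer" argument above can be illustrated with a toy simulation. This is only a sketch of the thought experiment: a single 128-byte buffer holding the last burst, 64-byte lines read strictly in sequence, and a made-up function name.

```python
def sequential_hit_rate(num_lines: int, line_bytes: int = 64, buffer_bytes: int = 128) -> float:
    """Hit rate of a one-entry burst buffer under purely sequential line reads."""
    buffered_start = None  # start address of the burst currently held in the buffer
    hits = 0
    for i in range(num_lines):
        addr = i * line_bytes
        burst_start = addr - addr % buffer_bytes  # burst-aligned address of this line
        if burst_start == buffered_start:
            hits += 1  # line was already fetched as part of the previous burst
        else:
            buffered_start = burst_start  # miss: fetch the whole burst from memory
    return hits / num_lines


print(sequential_hit_rate(1000))
```

Every second line hits, yet the same number of bytes still crosses the memory bus, which is the point of the post: a hit-rate figure alone says little about saved bandwidth.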

ltcommander.data
03-Jan-2011, 21:28
Now that SB reviews are all over the net, anyone else here who fails to see why the IGP is praised so much in some of the reviews?

The way I see it, the only reason why Intel was able to catch up in performance is not that SB's IGP is so great, it's simply that both low-end discrete and AMD IGP performance have stagnated for a long, long time.
The 790GX chipset was released in August 2008, the Radeon 4550 September 2008. That's 2 1/4 years.

Furthermore, right now Intel still lags behind in [driver] stability and image quality. Performance hit from enabling AA/AF is much heavier than on a 5450 as well.

So Intel is getting closer, but I think they're not quite there yet.
Well Tech Report showed the SB's IGP is twice as fast as AMD's 890GX, Anandtech showed the SB's IGP is comparable to the current fastest IGP which was introduced in April 2010, nVidia's 320M, and SB's IGP is faster than the HD 5450 so it is as superlative as you can get for an IGP. Anand's review did mention that Intel is now putting real levels of funding behind game and driver testing, which as you say is where we'll have to keep watching.

mczak
03-Jan-2011, 22:19
Well Tech Report showed the SB's IGP is twice as fast as AMD's 890GX, Anandtech showed the SB's IGP is comparable to the current fastest IGP which was introduced in April 2010, nVidia's 320M, and SB's IGP is faster than the HD 5450 so it is as superlative as you can get for an IGP.
Unfortunately it's only really true for the 12 EU version, but the one you'd get on the desktop (for anything else than the K parts which are unlikely to be paired with H67) is the 6 EU version which isn't really a whole lot faster than AMD's IGP (which clearly shows its age). The situation seems to be much better with the mobile chips though indeed (not only do they have 12 EUs but the quad cores even support faster memory which is probably going to help a bit).
In any case, for the people saying SNB could challenge Llano in 3d performance, they need to think again. While I don't think Llano will be THAT fast, it should imho perform about the same as a HD5550 (ddr3 version - don't get confused by the ddr2 and gddr5 versions of this card...). From the looks of it, not even Ivy Bridge will even try to get there (with the rumors saying 16 EUs - hopefully for a bit more than only the "K" parts for desktop chips then...).

Pressure
03-Jan-2011, 22:23
Well Tech Report showed the SB's IGP is twice as fast as AMD's 890GX, Anandtech showed the SB's IGP is comparable to the current fastest IGP which was introduced in April 2010, nVidia's 320M, and SB's IGP is faster than the HD 5450 so it is as superlative as you can get for an IGP. Anand's review did mention that Intel is now putting real levels of funding behind game and driver testing, which as you say is where we'll have to keep watching.

Yeah, at low settings in every game. Crank it up to medium settings and it becomes apparent that Sandy Bridge is still lackluster.

Imagine how nVidia's 320M would do with Sandy Bridge, compared to the built-in IGP.

TKK
04-Jan-2011, 04:51
and SB's IGP is faster than the HD 5450 so it is as superlative as you can get for an IGP.
On average maybe (although not by much), certainly in some games, but definitely not everywhere. In many games the 5450 is either on par or a bit faster.

We also shouldn't forget that AMD has a Mobility 5470, clocked 100 MHz higher than the desktop 5450.

Haven't seen AA results yet
Unfortunately I don't remember which review and game it was, but it showed HD 3000 dropping from 55ish frames to 22-23 with 4xAA/16AF, whereas the 5450 only dropped from 43 to 30 fps (game was tested in 1024x768 with low settings). Depending on the type of game, 30 fps can be quite playable, while 22-23 is a borderline case (although admittedly, even that can be enough for some games).

but from the looks of it intel isn't interested in having somewhat fast graphics for the desktop at all. Why else would they only enable all 12 EUs in the K editions, which have unlocked cpu multipliers you can't use in the H67? Also, only support DDR3-1333 despite the iGPU could likely benefit from faster ram (which you can only use with P67) and despite mobile cpus actually supporting DDR3-1600?
Point taken, it really doesn't look like Intel cares much for the desktop graphics market right now.
Well, I guess they simply don't need to, all but the absolute cheapest PCs usually come with discrete solutions anyway, and most consumers who want the best CPU on the market will go for Intel, at least until BD hits (maybe even then).
It's a bit different in mobile space, where Intel's IGPs used to be - or, if we ignore SB for now since notebooks with SB are yet to reach the market in significant quantities, still are - a reason to avoid any notebook without a GPU from AMD or Nvidia.

mboeller
04-Jan-2011, 17:51
Unfortunately I don't remember which review and game it was, but it showed HD 3000 dropping from 55ish frames to 22-23 with 4xAA/16AF, whereas the 5450 only dropped from 43 to 30 fps (game was tested in 1024x768 with low settings). Depending on the type of game, 30 fps can be quite playable, while 22-23 is a borderline case (although admittedly, even that can be enough for some games).

Maybe here:

http://www.computerbase.de/artikel/prozessoren/2011/test-intel-sandy-bridge/53/#abschnitt_grafikleistung_preview

mczak
04-Jan-2011, 18:10
Maybe here:

http://www.computerbase.de/artikel/prozessoren/2011/test-intel-sandy-bridge/53/#abschnitt_grafikleistung_preview

Based on that (assuming the perf hit is mostly from AA, not AF) it looks like indeed MSAA is more of a checkbox feature (needed for DX10.1) instead of a serious implementation. I guess either the chip lacks buffer compression, and/or the chip can't do enough z tests/updates (either due to lack of capabilities in the rops or even limitations in message passing for the threads). Changing memory clock could give some hints.
In any case, interesting to see the performance difference vanishes between HD2000 and HD3000 - not unexpected, since the HD2000 simply lacks shader units so its performance drop isn't that large with MSAA. For HD3000 the drop is just huge, nearly approaching SSAA levels... Last chip I've seen with such large MSAA performance hit was probably GF4, though it remains to be seen if it fares any better with other games...

mczak
06-Jan-2011, 14:32
Strange, these AA results http://ht4u.net/reviews/2011/intel_sandy_bridge_sockel_1155_quadcore/index27.php show a very different picture. Performance drop quite comparable to HD 5450.

Andrew Lauritzen
07-Jan-2011, 00:27
Going to 4x MSAA and 16x AF requires a large increase in memory bandwidth. These things are already BW starved as it is so it's not surprising to see discrete cards - even slow ones - pull away when you simply jack up the memory bandwidth requirements.

swaaye
07-Jan-2011, 00:31
Now that SB reviews are all over the net, anyone else here who fails to see why the IGP is praised so much in some of the reviews?

Pretty much every new IGP gets people excited at launch but then they are rapidly forgotten.

Blazkowicz
07-Jan-2011, 00:49
Strange, these AA results http://ht4u.net/reviews/2011/intel_sandy_bridge_sockel_1155_quadcore/index27.php show a very different picture. Performance drop quite comparable to HD 5450.

Maybe the reviewer used a 5450 with gddr3 in the first link, and a 5450 with ddr2 was used in yours.

GZ007
07-Jan-2011, 09:00
Pretty much every new IGP gets people excited at launch but then they are rapidly forgotten.

And when AMD and Nvidia finally arrive at the next process node (SB is 32nm), decide to put more than 1-2 SIMDs on those entry chips and use GDDR5 only (not shitty DDR3), then it's pretty much done.

But I have a feeling that the entry chips were designed to be so slow just because that was still light-years ahead of Intel's GMA. And with pretty much no effort the same situation could happen again if AMD and Nvidia decide to do so, and SB's IGP will be forgotten. (Not like they didn't kill it themselves with the H and P mobos, and mainly 6 EU GPUs on desktop :lol:)

swaaye
07-Jan-2011, 18:41
Well I mean they are forgotten because the reality of their 3D gaming non-potential sinks in. IGPs are largely used to run Aero and maybe video. They are there to make a cheap but "complete" budget PC package. The 3D hardware isn't completely worthless but it might as well be.

They will never be competitive with discrete cards because doing so requires increasing their cost considerably. They need much more bandwidth than they have and more GPU hardware. The vast majority of IGP users would not care to pay for that stuff because they won't use it. Most IGP users wouldn't notice a GMA 950 compared to the latest greatest. Personally I think that all they need to be able to do is suffice for HD video playback because then you can make a HTPC very cheaply. 780G, GF8200 and even GMA4500 already do that.

Also, the sideport-style secondary RAM is there to save power in mobile applications, not to improve performance tangibly (which it doesn't).

I could only see a considerably faster "gamer IGP" happening on some sort of premium motherboard and why on earth would anyone want to buy that instead of a discrete, replaceable gaming card? It would probably end up more like a discrete card integrated onto the mobo because it would need its own fast RAM supply.

Andrew Lauritzen
07-Jan-2011, 19:18
(Not like they didn't kill it themselves with the H and P mobos, and mainly 6 EU GPUs on desktop :lol:)
The "smaller" version is fine for desktop work and anyone who wants to do 3D/gaming needs a discrete GPU still anyways. Gaming on any IGP is a neat little party trick but not a reasonable experience for the time being.

And "throwing GDDR5" on an integrated part is not so obvious. If you're not sharing the memory bus with the CPU then you're basically a chipset-integrated/discrete part anyways with none of the advantages of being on-die. No, I think it's far more likely that we have to shift how we do rendering to more bandwidth-friendly techniques like binning/tiling, similar to what the phones and soon tablets are doing.

AnarchX
07-Jan-2011, 21:41
OCW oced HD Graphics 3000 to 1.9GHz: http://en.ocworkbench.com/tech/asrock-h67m-geht-reviewed-with-core-i5-2500k-with-gpu-overclocked/

So Intel may counter Llano with a clock upgrade?

Alexko
07-Jan-2011, 22:18
OCW oced HD Graphics 3000 to 1.9GHz: http://en.ocworkbench.com/tech/asrock-h67m-geht-reviewed-with-core-i5-2500k-with-gpu-overclocked/

So Intel may counter Llano with a clock upgrade?

A 0.2V voltage bump is pretty huge. I'm not sure what could be achieved at the standard voltage, or at least with a more reasonable increase. Honestly, even at 1.9GHz, I doubt "HD Graphics 3000" could match Llano's GPU.

TKK
08-Jan-2011, 02:44
So Intel may counter Llano with a clock upgrade?

I agree with Alexko, highly unlikely.
Not really feasible in mobile space where power efficiency is so important, and I expect the desktop version of Llano to be clocked quite a bit higher than the mobile version, so it wouldn't help much in desktop space, either.

It will simply be like this:
- People who know a bit and want the fastest CPU with a somewhat acceptable IGP (or don't care about the IGP at all) will buy SB.
- People who know a bit and just want a fast enough CPU with decent IGP will buy Llano.
- And those who don't know anything will base their decision either on brand recognition or price.


Conclusion: Intel wouldn't gain anything by brute-forcing their IGPs to significantly higher clocks, so they won't do it.

AnarchX
08-Jan-2011, 08:38
According to Intel HD Graphics 3000 delivers 105-125 GFLOPs SP MAD: http://software.intel.com/file/33239 page 11

mczak
08-Jan-2011, 14:00
According to Intel HD Graphics 3000 delivers 105-125 GFLOPs SP MAD: http://software.intel.com/file/33239 page 11
12 (EU) * 4 (physical width of EU) * 2 (mul + accumulate) * 1.3GHz.
Strange, I don't see where the quoted doubling (per clock) from the previous generation comes from. Maybe Intel counts differently for the new gen.
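
The product above can be checked with a few lines of Python. This is only a sketch of the peak-rate arithmetic quoted in the post; the function names are made up for illustration, and the 8-wide case is the hypothetical "8D physical" EU speculated about earlier in the thread, not a confirmed configuration.

```python
def peak_gflops(eus, lanes_per_eu, flops_per_lane_per_clock, clock_ghz):
    """Theoretical peak GFLOPs: every lane issues its op(s) every clock."""
    return eus * lanes_per_eu * flops_per_lane_per_clock * clock_ghz


def implied_clock_ghz(gflops, eus, lanes_per_eu, flops_per_lane_per_clock):
    """Clock (GHz) that a quoted peak GFLOPs figure would correspond to."""
    return gflops / (eus * lanes_per_eu * flops_per_lane_per_clock)


if __name__ == "__main__":
    # Figures from the post: 12 EUs, 4 physical lanes, MAC counted as 2 flops.
    print("4-wide @ 1.3 GHz:", peak_gflops(12, 4, 2, 1.3))
    # Hypothetical 8-wide EU (assumption, only to show what "doubled" would mean).
    print("8-wide @ 1.3 GHz:", peak_gflops(12, 8, 2, 1.3))
    # Clocks implied by the quoted 105-125 GFLOPs range with 4-wide EUs.
    print("clock for 105 GFLOPs:", implied_clock_ghz(105, 12, 4, 2))
    print("clock for 125 GFLOPs:", implied_clock_ghz(125, 12, 4, 2))
```

With 4-wide EUs the quoted 105-125 GFLOPs range works out to clocks of roughly 1.1 to 1.3 GHz, so the slide's number is consistent with 4 lanes and no per-clock doubling.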

EduardoS
08-Jan-2011, 20:08
Or is counting fps instead of flops...

mczak
13-Jan-2011, 18:44
Maybe the reviewer used a 5450 with gddr3 in the first link, and a 5450 with ddr2 was used in yours.
No, actually it looks like it was just specific to COD:MW2. Computerbase has their full review up for SB graphics (http://www.computerbase.de/artikel/grafikkarten/2011/test-sandy-bridge-grafik/12/) and indeed in every other title the AA hit is roughly comparable to HD 5450. Could be either driver or something the game is doing SB doesn't like, but generally the AA implementation seems fine.

Tridam
14-Jan-2011, 01:50
12 (EU) * 4 (physical width of EU) * 2 (mul + accumulate) * 1.3Ghz.
Strange, I don't see where the quoted doubling (per clock) from previous generation comes. Maybe intel counts different for new gen.

AFAIK previous Intel IGP can't do single cycle MAD.

mczak
14-Jan-2011, 03:41
AFAIK previous Intel IGP can't do single cycle MAD.
Neither can the current one...
Unless you count the accumulator (whose usefulness has improved, but nothing which would change the theoretical peak flop rate as far as I can tell).
IIRC docs said you couldn't do back-to-back accumulator write followed by accumulator read without stalling (which might no longer be the case dunno), but for a constant stream of MACs this should not affect peak flop rate.

I.S.T.
18-Jan-2011, 14:39
http://www.lostcircuits.com/mambo//index.php?option=com_content&task=view&id=99&Itemid=1

rpg.314
22-Jan-2011, 06:10
Regarding Ivy Bridge GPU, we know it will be DX11, so,

a) How would intel implement local memory? Dedicated ram (a la cayman/fermi) or unified with the rest of the cache hierarchy like lrb? 64 KB of L1 seems doable on somewhat low freq (~1.1 GHz for full turbo).

b) Considering the jumps Intel has made with SB, I would expect unified cpu/gpu address space.

ltcommander.data
25-Jan-2011, 06:48
According to Intel HD Graphics 3000 delivers 105-125 GFLOPs SP MAD: http://software.intel.com/file/33239 page 11
So Intel's pretty clear on Sandy Bridge's IGP supporting Compute Shader 4.x, presumably meaning both CS4.0 and CS4.1, mentioning it on pg 14 and pg 16. Has anyone actually tested this capability yet, since I don't remember seeing it in any reviews?

http://arstechnica.com/apple/news/2010/12/apple-may-drop-nvidia-for-sandy-bridges-igp-next-year.ars
http://www.realworldtech.com/page.cfm?ArticleID=RWT120710035639

And what does this say about Sandy Bridge IGP OpenCL support? Chris Foreman and David Kanter were pretty adamant that Sandy Bridge's IGP couldn't/wouldn't support OpenCL, with Foreman going so far as saying Sandy Bridge couldn't do any GPGPU at all, which now appears incorrect with CS4.x support. If I'm not mistaken all GPUs that have Compute Shader support, CS4.0 in the G80 and up and CS4.1 in the RV770 and up, have also supported OpenCL. Is this generally true, that CS4.x and OpenCL have enough overlap that Intel should be able to support OpenCL as well?

Axel
27-Jan-2011, 16:00
Neither can the current one...

http://intellinuxgraphics.org/IHD_OS_Vol4_Part2_July_28_10.pdf
lists mad as opcode 0x5b (page 154), and has full description in 8.3.25 (p 211)

AnarchX
28-Jan-2011, 10:37
Fudo says that IVB features "only" 16 EUs: http://www.fudzilla.com/graphics/item/21658-16-graphics-eus-in-ivy-bridge

Could it be possible that Intel goes from MACs to MADDs?

HD Graphics 3000 @ 1,35GHz : ~130 GFLOPs
IVB Graphics @ ~1,5GHz: ~380 GFLOPs?

mczak
28-Jan-2011, 13:37
Fudo says that IVB features "only" 16 EUs: http://www.fudzilla.com/graphics/item/21658-16-graphics-eus-in-ivy-bridge

Could it be possible that Intel goes from MACs to MADDs?

HD Graphics 3000 @ 1,35GHz : ~130 GFLOPs
IVB Graphics @ ~1,5GHz: ~380 GFLOPs?
I've got some doubts they will support normal MADD but either way obviously intel counts it the same so it wouldn't improve flops.
I was wondering though with the earlier rumors about 24 EUs and now 16 EUs, maybe both rumors are true? Some chips with 24 and some with 16, instead of the current 12 or 6?

rpg.314
28-Jan-2011, 14:41
Fudo says that IVB features "only" 16 EUs: http://www.fudzilla.com/graphics/item/21658-16-graphics-eus-in-ivy-bridge

Could it be possible that Intel goes from MACs to MADDs?

HD Graphics 3000 @ 1,35GHz : ~130 GFLOPs
IVB Graphics @ ~1,5GHz: ~380 GFLOPs?

I don't know how accurate this is, but as a general rule of thumb - and paraphrasing Arun - people leak stuff to Fudzilla only when they want to deny it. :wink:

DavidC
29-Jan-2011, 01:54
I've got some doubts they will support normal MADD but either way obviously intel counts it the same so it wouldn't improve flops.
I was wondering though with the earlier rumors about 24 EUs and now 16 EUs, maybe both rumors are true? Some chips with 24 and some with 16, instead of the current 12 or 6?

Maybe not. Earlier rumors of G35 had EUs at 16, but it came with 8. They only increased EU count by 20-30% each generation so 16 for IVB isn't surprising.

BTW, about throughput, I don't know how they got the flops number, but the claim of doubled throughput might have been about geometry processing (VS/T&L) performance. Read the developer guide for HD graphics Sandy Bridge.

3dcgi
29-Jan-2011, 04:59
Fudo says that IVB features "only" 16 EUs: http://www.fudzilla.com/graphics/item/21658-16-graphics-eus-in-ivy-bridge

Could it be possible that Intel goes from MACs to MADDs?

HD Graphics 3000 @ 1,35GHz : ~130 GFLOPs
IVB Graphics @ ~1,5GHz: ~380 GFLOPs?
What is your distinction between MAC and MADD? To me they are two names for the same instruction.

rpg.314
29-Jan-2011, 06:02
What is your distinction between MAC and MADD? To me they are two names for the same instruction.

Sorta like FMA3 and FMA4, IMO.

mczak
29-Jan-2011, 14:01
Sorta like FMA3 and FMA4, IMO.
Yes, though not quite the same - IIRC FMA3 has several versions so you can choose which of the source regs is also used as destination (as you can specify 3 operands in total). intel igp however will always use the accumulator reg for last source operand.
The manual (from Ironlake) says:
"The mac instruction takes component-wise multiplication of <src0> and <src1>, adds the results with the corresponding accumulator values, and then stores the final results in <dst>."

rpg.314
29-Jan-2011, 15:47
Yes, though not quite the same - IIRC FMA3 has several versions so you can choose which of the source regs is also used as destination (as you can specify 3 operands in total). intel igp however will always use the accumulator reg for last source operand.
That doesn't seem like much of a distinction. You can always change the order of your operands.

mczak
29-Jan-2011, 16:00
That doesn't seem like much of a distinction. You can always change the order of your operands.
How so? In a * b + c there's only one operand which gets added to the multiplication result.
Also, you need to get the operand into the accumulator reg first - luckily you can do that "for free", that is instructions can implicitly update the accumulator reg (or they can do it explicitly). You don't need to do that with FMA3 (so for FMA3, the hw still has to be able to fetch 3 normal regs, not so for igp mac).

rpg.314
29-Jan-2011, 16:58
How so? In a * b + c there's only one operand which gets added to the multiplication result.
Eg.

Using the inst dest, src notation

just do fma3 c,a,b

mczak
30-Jan-2011, 02:36
Eg.

Using the inst dest, src notation

just do fma3 c,a,b
Yes, certainly. I was thinking more of the other direction (igp mac has specific requirements for the last operand). FMA3 should be easier to work with (even though it will always overwrite one reg).

rpg.314
30-Jan-2011, 04:05
Yes, certainly. I was thinking more of the other direction (igp mac has specific requirements for the last operand). FMA3 should be easier to work with (even though it will always overwrite one reg).

OOps. I should have said,

just do fmac c,a,b

mczak
31-Jan-2011, 02:30
OOps. I should have said,

just do fmac c,a,b
I don't understand what you mean. That will do c = a*b + (accumreg).

rpg.314
31-Jan-2011, 03:53
I don't understand what you mean. That will do c = a*b + (accumreg).
the dest register is supposed to be the accumulator register in an fmac operation, right?

mczak
31-Jan-2011, 12:01
the dest register is supposed to be the accumulator register in an fmac operation, right?
No, that's not necessary, but possible. dst reg can be any ordinary reg. (The accum reg cannot, however, be used as another explicit source operand - only instructions not using implicit accum reg can use accum reg as explicit first source operand).
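
To make the distinction in this exchange concrete, here is a toy Python model of the two conventions. It is a sketch based only on the manual sentence quoted above and the description in these posts, not on real hardware behaviour: `ToyEU`, its `mov`/`mac` methods, the `update_acc` flag and `fma3_231` are hypothetical names, and the FMA3 function models just one of the operand orderings (the destination doubling as the addend).

```python
from dataclasses import dataclass, field


@dataclass
class ToyEU:
    """Toy register file plus one implicit accumulator (illustrative only)."""
    regs: dict = field(default_factory=dict)
    acc: float = 0.0

    def mov(self, dst, src, update_acc=False):
        # Ordinary instruction; may optionally also write the accumulator.
        value = self.regs[src]
        self.regs[dst] = value
        if update_acc:
            self.acc = value

    def mac(self, dst, src0, src1, update_acc=False):
        # dst = src0 * src1 + acc; the addend is always the accumulator,
        # while dst can be any ordinary register.
        value = self.regs[src0] * self.regs[src1] + self.acc
        self.regs[dst] = value
        if update_acc:
            self.acc = value


def fma3_231(regs, dst, src0, src1):
    # One FMA3-style form: the destination register is also the addend,
    # so its old value is overwritten.
    regs[dst] = regs[src0] * regs[src1] + regs[dst]


if __name__ == "__main__":
    # a * b + c on the toy EU: c must reach the accumulator first.
    eu = ToyEU(regs={"a": 2.0, "b": 3.0, "c": 4.0})
    eu.mov("tmp", "c", update_acc=True)
    eu.mac("d", "a", "b")
    print("EU mac:", eu.regs["d"], "c kept:", eu.regs["c"])

    # Same expression with the FMA3-style form: c is consumed and replaced.
    regs = {"a": 2.0, "b": 3.0, "c": 4.0}
    fma3_231(regs, "c", "a", "b")
    print("FMA3:", regs["c"])
```

In the toy EU the addend has to be placed in the accumulator by an earlier instruction but all named registers survive, whereas the FMA3-style form reads three named registers and overwrites one of them.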

rpg.314
31-Jan-2011, 15:04
No, that's not necessary, but possible. dst reg can be any ordinary reg. (The accum reg cannot, however, be used as another explicit source operand - only instructions not using implicit accum reg can use accum reg as explicit first source operand).
You seem to be assuming that the accumulator is an implicit operand. For a "normal" isa, the result will be accumulated in destination register.

mczak
31-Jan-2011, 15:40
You seem to be assuming that the accumulator is an implicit operand. For a "normal" isa, the result will be accumulated in destination register.
I don't assume that, I'm reading it from the manual :-).
http://intellinuxgraphics.org/documentation.html
Granted the newest manual is for Ironlake (it is in this part, http://intellinuxgraphics.org/IHD_OS_Vol4_Part2_July_28_10.pdf) but it should be the same for SNB graphics.

rpg.314
26-Feb-2011, 18:27
http://www.xbitlabs.com/news/memory/display/20110222201121_Samsung_Develops_Mobile_Memory_with_Wide_I_O_Interface_Extreme_Bandwidth.html

My $0.02,

a) mobile RAM

b) Too wide IO to be useful for typical LPDRAM applications,

who do you think could use it? :)

fehu
26-Feb-2011, 18:52
1200 pins compared to what for a 32-bit LPDDR2?
And all that for "only" a 4x speed increase? Why not use 4 channels of 32-bit?
Even that could be less expensive :S


i've read another rumor about Llano
the gpu part will come in different configurations up to 400sp and different speed bin, with the faster reaching around 5700

Lightman
02-Mar-2011, 00:00
AMD Llano A8-3510MX vs Intel Core i7 2630QM

http://www.youtube.com/watch?v=mdPi4GPEI74&feature=youtu.be&hd=1

Looks good, but let's wait for independent reviews!

Andrew Lauritzen
02-Mar-2011, 00:39
AMD Llano A8-3510MX vs Intel Core i7 2630QM
Interesting if true, but a little suspect that they start a 3D app then *leave it running* during all of the other tests and thus at least implicitly claim equal-or-better performance and lower power in those other workloads as well. That does not follow from their test, even if we accept superior 3D graphics performance (which doesn't really come as a surprise).

Lightman
03-Mar-2011, 19:03
Interesting if true, but a little suspect that they start a 3D app then *leave it running* during all of the other tests and thus at least implicitly claim equal-or-better performance and lower power in those other workloads as well. That does not follow from their test, even if we accept superior 3D graphics performance (which doesn't really come as a surprise).

Obviously this is marketing material so it needs to show the best/worst case scenario.
I suspect the 3D app is thrashing the L3 cache of SB and lowering CPU performance for Excel, but on the other hand Llano has no L3 cache and all the CPU/GPU traffic needs to go through the memory controller, fighting for resources.
For sure WOW or EVE players will want to jump on Llano ASAP. They will be able to enjoy very playable FPS in games and work on e-mails / spreadsheets at the same time while at work :twisted:.

GZ007
04-Mar-2011, 15:24
I suspect the 3D app is thrashing the L3 cache of SB and lowering CPU performance for Excel, but on the other hand Llano has no L3 cache and all the CPU/GPU traffic needs to go through the memory controller, fighting for resources.


The Core-i architecture has just a tiny 256KB L2 per core. So you could also interpret the benchmark as showing a stupid design choice from Intel: letting the GPU use the L3 cache when the cores have such a small L2 cache.
Bulldozer could be quite interesting with 2MB L2 cache per module plus 8MB L3 cache.

Andrew Lauritzen
04-Mar-2011, 17:52
I'm actually less concerned about cache thrashing and more concerned about them measuring software things like 3D driver overhead, which is a moving target.

Lightman
04-Mar-2011, 18:20
Here is another video of Llano vs SB:
http://www.hexus.net/content/item.php?item=29360

This time from CeBIT and with some more details.

fehu
16-Mar-2011, 16:59
http://www.xbitlabs.com/news/cpu/display/20110315134650_Initial_Desktop_AMD_Llano_Lineup_Will_Include_Five_APUs_Documents.html

some info about Llano

11 SKUs, with the top model at 400 stream cores at more than 600MHz

Silent_Buddha
16-Mar-2011, 21:04
Hmmm, I wonder what the realworld power consumption of these will be? I may just hold off upgrading my WHS/HTPC machine until these launch.

Regards,
SB

Erinyes
17-Mar-2011, 05:57
http://www.xbitlabs.com/news/cpu/display/20110315134650_Initial_Desktop_AMD_Llano_Lineup_Will_Include_Five_APUs_Documents.html

some info about Llano

11 SKUs, with the top model at 400 stream cores at more than 600MHz

That would be a wicked chip for a gaming laptop. Hopefully a quad core Llano with the top spec graphics will be available for <$700 (Im personally hoping for $600 but even for $700 it would be fair). Add in a Redwood/Turks graphics card and with hybrid crossfire you'll have Juniper level graphics for cheap

Silent_Buddha
17-Mar-2011, 14:50
That would be a wicked chip for a gaming laptop. Hopefully a quad core Llano with the top spec graphics will be available for <$700 (Im personally hoping for $600 but even for $700 it would be fair). Add in a Redwood/Turks graphics card and with hybrid crossfire you'll have Juniper level graphics for cheap

Well, it "looks" nice, but I'm still extremely skeptical of them reaching even 55xx levels of performance with the 65xx nomenclature. Just look at 68xx compared to 58xx. And this being on the CPU die, with the bandwidth constraints to memory that it implies, just makes me even more skeptical.

IMO, at least 56xx performance is what I'd need before I consider it adequate for gaming. Still it might be good for light gaming. More importantly for me is that it will be fantastic for HTPC duties if its real world power consumption is significantly lower than an equivalent CPU + 54xx or 55xx with at least the same audio and video playback capabilities.

Regards,
SB

Kaotik
17-Mar-2011, 21:35
Well, it "looks" nice, but I'm still extremely skeptical of them reaching even 55xx levels of performance with the 65xx nomenclature. Just look at 68xx compared to 58xx. And this being on the CPU die, with the bandwidth constraints to memory that it implies, just makes me even more skeptical.

IMO, at least 56xx performance is what I'd need before I consider it adequate for gaming. Still it might be good for light gaming. More importantly for me is that it will be fantastic for HTPC duties if its real world power consumption is significantly lower than an equivalent CPU + 54xx or 55xx with at least the same audio and video playback capabilities.

Regards,
SB

The naming of HD6800 tells nothing of the rest
HD5550 = 320 SP / 16 TU / 8/32 ROP @ 550MHz / 800-900MHz DDR3 or 900-1000MHz GDDR5 @ 128bit
HD5570 = 400 SP / 20 TU / 8/32 ROP @ 650MHz / 800-900MHz DDR3 or 900-1000MHz GDDR5 @ 128bit
HD5670 = 400 SP / 20 TU / 8/32 ROP @ 775MHz / 1000MHz GDDR5 @ 128bit
HD6670 = 480 SP / 24 TU / 8/32 ROP @ 800MHz / 1000MHz GDDR5 @ 128bit
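
For a rough sense of how these parts line up on paper, the shader counts and clocks listed above can be turned into peak shader throughput. This is only a sketch: it uses the usual convention of counting one multiply-add (2 flops) per SP per clock, the Llano entry takes the rumored 400 SPs at about 600MHz mentioned earlier in the thread, and peak flops say nothing about bandwidth limits.

```python
# (stream processors, core clock in MHz); Radeon rows come from the list above,
# the Llano row is the rumored configuration from earlier in the thread.
PARTS = {
    "HD5550": (320, 550),
    "HD5570": (400, 650),
    "HD5670": (400, 775),
    "HD6670": (480, 800),
    "Llano (rumored)": (400, 600),
}


def peak_shader_gflops(stream_processors, clock_mhz, flops_per_sp_per_clock=2):
    """Peak GFLOPs assuming every SP issues one multiply-add per clock."""
    return stream_processors * flops_per_sp_per_clock * clock_mhz / 1000


if __name__ == "__main__":
    for name, (sps, mhz) in PARTS.items():
        print(f"{name:16s} {peak_shader_gflops(sps, mhz):6.1f} GFLOPs")
```

On these assumptions the rumored Llano configuration lands between the HD5550 and the HD5570 in raw shader rate.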

Erinyes
19-Mar-2011, 04:18
Well, it "looks" nice, but I'm still extremely skeptical of them reaching even 55xx levels of performance with the 65xx nomenclature. Just look at 68xx compared to 58xx. And this being on the CPU die, with the bandwidth constraints to memory that it implies, just makes me even more skeptical.

IMO, at least 56xx performance is what I'd need before I consider it adequate for gaming. Still it might be good for light gaming. More importantly for me is that it will be fantastic for HTPC duties if its real world power consumption is significantly lower than an equivalent CPU + 54xx or 55xx with at least the same audio and video playback capabilities.

Regards,
SB

Come on now, the 58xx vs. 68xx naming snafu has been discussed to death already :wink: But in the case of Llano, we know that it has 400 SPs, which is the same as Redwood, which is named Radeon 5670/5570. Now considering it's going to have almost the same memory bandwidth as the 5570 (if one uses dual-channel DDR3-1600), performance shouldn't be too far off.

Note that Zacate supports just single-channel DDR3-1066. Yet Zacate with its GPU at 500 MHz performs anywhere between 50-70% of a discrete Radeon 5450 (which is clocked at 650 MHz, so Zacate has a 23% clock disadvantage anyway). So I think Llano could perform quite well. I'm expecting 3-4X Sandy Bridge graphics performance. And as I said, in Hybrid CrossFire with a Redwood/Turks, we could get close to Radeon 5850M (or Juniper) level performance.
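
The bandwidth comparison in this post can be sketched as below. The assumptions are mine, not from the post: each DDR3 channel is taken as 64 bits wide, and the "800-900MHz DDR3" listed for the HD5570 is read as a memory clock, i.e. 1600-1800 MT/s. Note also that an on-die GPU shares its figure with the CPU cores, while the discrete card has its bandwidth to itself.

```python
def bandwidth_gb_s(bus_width_bits, mega_transfers_per_s):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfer rate."""
    return bus_width_bits / 8 * mega_transfers_per_s / 1000


def clock_deficit(clock_mhz, reference_mhz):
    """Fractional clock disadvantage relative to a reference part."""
    return 1 - clock_mhz / reference_mhz


if __name__ == "__main__":
    # Llano with dual-channel DDR3-1600 (2 x 64-bit channels, shared with the CPU).
    print("Llano, 2ch DDR3-1600 :", bandwidth_gb_s(128, 1600), "GB/s")
    # HD5570 DDR3 version, 128-bit bus at an assumed 1600-1800 MT/s.
    print("HD5570 DDR3 (low)    :", bandwidth_gb_s(128, 1600), "GB/s")
    print("HD5570 DDR3 (high)   :", bandwidth_gb_s(128, 1800), "GB/s")
    # Zacate with single-channel DDR3-1066.
    print("Zacate, 1ch DDR3-1066:", bandwidth_gb_s(64, 1066), "GB/s")
    # Zacate GPU at 500 MHz versus a 650 MHz Radeon 5450.
    print("Zacate clock deficit :", round(clock_deficit(500, 650) * 100), "%")
```

Under these assumptions dual-channel DDR3-1600 gives about 25.6 GB/s, against 25.6 to 28.8 GB/s for the DDR3 HD5570, which matches the "almost the same memory bandwidth" argument as long as the CPU's share of the bus is ignored.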

eastmen
19-Mar-2011, 05:32
Well, it "looks" nice, but I'm still extremely skeptical of them reaching even 55xx levels of performance with the 65xx nomenclature. Just look at 68xx compared to 58xx. And this being on the CPU die, with the bandwidth constraints to memory that it implies, just makes me even more skeptical.

IMO, at least 56xx performance is what I'd need before I consider it adequate for gaming. Still it might be good for light gaming. More importantly for me is that it will be fantastic for HTPC duties if its real world power consumption is significantly lower than an equivalent CPU + 54xx or 55xx with at least the same audio and video playback capabilities.

Regards,
SB

Last December I bought an HP dm1; I believe that was a dual-core Neo + HD 3200 integrated. I added in an HD 4200 and it more than doubled my frame rates in many games, and the rest were CPU limited. The upgrade cost me $150, putting my laptop in the near-$800 price range. It was an ultraportable, but I would expect any form of Llano to absolutely crush that chip configuration, using less power and costing a lot less.

The situation wasn't much better this year

The only ultraportable I can configure is a dm4t, where it's $50 to add a Radeon 6370 512MB or $100 for the 1GB version. So let's assume $100; that puts me at $800 for a laptop with 6GB of RAM, a Radeon 6370 1GB and an i5 450M dual core.

I'm sure the graphics processor in Llano will at least match or surpass that in performance, and I'm expecting the laptops will be much less than $800.

ToTTenTranz
24-Mar-2011, 14:34
Last December I bought an HP dm1; I believe that was a dual-core Neo + HD 3200 integrated. I added in an HD 4200 and it more than doubled my frame rates in many games, and the rest were CPU limited. The upgrade cost me $150, putting my laptop in the near-$800 price range. It was an ultraportable, but I would expect any form of Llano to absolutely crush that chip configuration, using less power and costing a lot less.

You paid someone to change the soldered northbridge in your subnotebook?!


How is that even possible? Maybe they simply changed the motherboard?

Plus, the only way I see the HD4200 doubling the performance of the HD3200 is if it is
a) clocked higher (by a lot)
b) using the system memory bus instead of that limited 32bit sideport bus in some HD3200 systems (like mine, for instance)
c) a combination of the above.


Well, I bought an Acer Ferrari One with an XGP port over a year and a half ago and I've been waiting for that external graphics card to come. It's not coming, of course.
Might as well look for the mobile HD3870 they made for the Fujitsu model on eBay.

Then again, I'll gain even more if I just wait for a 12" model with a Llano.

OpenGL guy
24-Mar-2011, 20:48
You paid someone to change the soldered northbridge in your subnotebook?!


How is that even possible? Maybe they simply changed the motherboard?

Plus, the only way I see the HD4200 doubling the performance of the HD3200 is if it is
a) clocked higher (by a lot)
b) using the system memory bus instead of that limited 32bit sideport bus in some HD3200 systems (like mine, for instance)
c) a combination of the above.
Many notebooks have an expansion slot for discrete graphics. They usually have dedicated memory on them.

Pete
24-Mar-2011, 22:23
Right, but aren't x2xx parts typically IGPs, not discrete MXM modules?

eastmen
25-Mar-2011, 19:43
Right, but aren't x2xx parts typically IGPs, not discrete MXM modules?

It was an option on the platform; maybe it was a 4330 then. I dunno, I don't pay much attention.

ToTTenTranz
29-Mar-2011, 16:46
It was an option on the platform; maybe it was a 4330 then. I dunno, I don't pay much attention.

Oh then yes. Going from a 3430 to a 4330 would give you twice the performance in many cases. (80shaders+8TMUs vs. 40shaders+4TMUs).

Mintmaster
02-Apr-2011, 20:50
Last December I bought an HP dm1; I believe that was a dual-core Neo + HD 3200 integrated. I added in an HD 4200 and it more than doubled my frame rates in many games, and the rest were CPU limited. The upgrade cost me $150, putting my laptop in the near-$800 price range. It was an ultraportable, but I would expect any form of Llano to absolutely crush that chip configuration, using less power and costing a lot less.

The situation wasn't much better this year
By last December, do you mean Dec 2010?

This past November I bought an Acer 3820TG, which has a 2.4GHz i3 and an HD 5650, and it was under $800. Only 3.9lbs , almost 6h battery life when the discrete is switched off. Faster, lighter, thinner and better looking than the Alienware M11x, plus it has a bigger screen.

I think it's a damn fine laptop. Llano has a lot of potential, but AMD has to execute with price, performance, and power consumption, or else discrete+SB(or even Arrandale) will still be the better option.

Kaotik
04-Apr-2011, 23:53
Slightly related, AMD has started shipping Llanos
http://www.xbitlabs.com/news/cpu/display/20110404130917_AMD_Begins_to_Ship_A_Series_Fusion_Llano_APUs.html

eastmen
05-Apr-2011, 00:34
By last December, do you mean Dec 2010?

This past November I bought an Acer 3820TG, which has a 2.4GHz i3 and an HD 5650, and it was under $800. Only 3.9lbs , almost 6h battery life when the discrete is switched off. Faster, lighter, thinner and better looking than the Alienware M11x, plus it has a bigger screen.

I think it's a damn fine laptop. Llano has a lot of potential, but AMD has to execute with price, performance, and power consumption, or else discrete+SB(or even Arrandale) will still be the better option.

Yeah, sorry about the vagueness of that. I mean Nov 2010, which to me was last Nov, and Nov 2011 is this Nov, meh.

I agree they need to hit all the notes and not just a few , i just hope it happens

AlphaWolf
05-Apr-2011, 00:58
Slightly related, AMD has started shipping Llanos
http://www.xbitlabs.com/news/cpu/display/20110404130917_AMD_Begins_to_Ship_A_Series_Fusion_Llano_APUs.html

now we just need to find out who's going to be first out of the gate with a device running one.

Mintmaster
05-Apr-2011, 09:40
Yeah, sorry about the vagueness of that. I mean Nov 2010, which to me was last Nov, and Nov 2011 is this Nov, meh.
I was actually wondering if you meant Nov 2009. I was just showing you that the $800 benchmark for a thin-and-light 4-5 months ago was not Neo+4330, but rather i3+5650. That's a colossal difference.

I doubt I would like even a 4-core Llano over my setup, and we don't even know if it's suitable for thin and lights, so I don't expect any impact in this class of notebook. I was really excited about Llano back when you couldn't get decent discrete graphics in notebooks under 4lbs, but it's going to have to hit some pretty lofty performance and power consumption targets to be anything but a niche product now.

entity279
05-Apr-2011, 10:38
It's just that you're seeing it from a consumer POV. I'm sure OEMs will jump at things like Llano even if it's just for the reason that they might get away without discrete GPUs for a larger number of configurations.

eastmen
05-Apr-2011, 19:43
I was actually wondering if you meant Nov 2009. I was just showing you that the $800 benchmark for a thin-and-light 4-5 months ago was not Neo+4330, but rather i3+5650. That's a colossal difference.

I doubt I would like even a 4-core Llano over my setup, and we don't even know if it's suitable for thin and lights, so I don't expect any impact in this class of notebook. I was really excited about Llano back when you couldn't get decent discrete graphics in notebooks under 4lbs, but it's going to have to hit some pretty lofty performance and power consumption targets to be anything but a niche product now.

yea i know its gotten better .

Still though, you're looking at $800 for your laptop, that's a lot of money for some. i3 + 5650 isn't bad at that price point, but will a 4 core llano with a 6x00 part and a 6x00 add in for crossfire at $800 or less be better ?

Isn't the i3 a dual core cpu ?

AnarchX
14-Apr-2011, 13:19
Ivy Bridge is going to be an exciting product. Not only does it continue with the improvements AVX processor SIMD vector capabilities D3D11 and DX Compute Shader, 30 percent more EUs ( execution units ) and supports up to 3 displays and HDMI 1.4a, and an overall bandwidth boost from PCI 3.

http://software.intel.com/en-us/blogs/2011/04/12/visualize-this-game-samples-from-intel/

Nothing about Charlie's BW-boosting stacked DRAM.

Man from Atlantis
15-Apr-2011, 08:54
Intel: Ivy Bridge graphics to feature DX11, triple-display support (http://techreport.com/discussions.x/20769)

colinisation
15-Apr-2011, 17:22
@AnarchX:

Charlie's speculation looked like something that would be for Haswell not Ivybridge. Everything listed above in addition to clock speed gains and buffer fettling is all I would expect.

Having said that there is absolutely no info on what Haswell brings to the table?

ToTTenTranz
15-Apr-2011, 17:50
Meh.. 16 EUs won't do wonders.

I can only see the ULV versions trying to compete with a "Neo" version of the dual-core, 160 shader Llano, but still losing terribly in 3D performance.

iwod
21-Apr-2011, 14:06
Meh.. 16 EUs won't do wonders.

I can only see the ULV versions trying to compete with a "Neo" version of the dual-core, 160 shader Llano, but still losing terribly in 3D performance.

If those EUs perform better than the previous EUs, then having 30% more EUs, 30% higher clock speed and 30% more per-EU performance should bring up to a 100% increase in performance.
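As a rough check of that arithmetic, here is a minimal sketch assuming the three 30% gains are independent and multiply perfectly, which real workloads limited by memory bandwidth will not do. The function name is only illustrative.

```python
def compounded_speedup(*gains: float) -> float:
    """Multiply independent fractional gains, e.g. 0.30 for +30%."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


# Assumed inputs from the post: +30% EUs, +30% clock, +30% per-EU throughput.
speedup = compounded_speedup(0.30, 0.30, 0.30)
print(f"{speedup:.3f}x overall, i.e. +{(speedup - 1) * 100:.0f}%")
```

With ideal scaling that lands a little above the 100% figure, so "up to 100%" leaves some room for less-than-perfect scaling.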

AnarchX
02-May-2011, 14:38
http://www.computerbase.de/bildstrecke/34348/5/

No information about boosted EUs. 6 EUs for the lower-end IGP looks a bit strange, why not 8 EUs?

Lightman
11-May-2011, 20:25
Another 'leak'

http://www.techpowerup.com/img/11-05-11/55a.jpg

http://www.techpowerup.com/img/11-05-11/55b.jpg

TechPowerUp (http://www.techpowerup.com/145597/AMD-A-Series-APUs-Tested-Against-Sandy-Bridge-CPUs-at-Gaming-on-IGP.html)


If that's accurate then I'm impressed!

Blazkowicz
11-May-2011, 20:54
they cherry-picked games and settings quite a bit, as an i3 2100 can totally destroy a E-350 sometimes. but that's pretty reasonable.

it's a no-brainer laptop chip!

mczak
11-May-2011, 23:32
Another 'leak'

If that's accurate then I'm impressed!
Still looks like HD6450-ish performance. Well maybe a bit more for the faster parts. That's what everybody is expecting anyways, so I'm not really impressed. Still, great for an IGP, should make for quite nice not too expensive notebooks.
I'd be more impressed though if we didn't know the cpu part is unlikely to be competitive.

SimBy
12-May-2011, 00:24
Is this supposed to be real?! First time I ever see those model numbers.

AlphaWolf
12-May-2011, 04:37
It's gotta be fake, those graphs extend to zero.

mboeller
12-May-2011, 07:09
Another 'leak'

http://www.techpowerup.com/img/11-05-11/55a.jpg

http://www.techpowerup.com/img/11-05-11/55b.jpg

TechPowerUp (http://www.techpowerup.com/145597/AMD-A-Series-APUs-Tested-Against-Sandy-Bridge-CPUs-at-Gaming-on-IGP.html)


If that's accurate then I'm impressed!

Could that mean that the A4-3400 is a 25W part, the A6-3650 is a 35W part and the A8-3850 is a 45W part? If so then these APUs would really be perfect for notebooks.

Lightman
12-May-2011, 10:20
Is this supposed to be real?! First time I ever see those model numbers.

Yes, model numbers were known for some time now. Just search youtube for official AMD material regarding Llano :wink:


@mczak

Bear in mind these are laptop chips with memory clock probably limited to just 1333MHz. I wonder how well it will perform with proper memory speeds e.g. 1866MHz+.
Besides extending that TDP range from 45W to 95W or maybe even 125W on desktop will give quite a big breathing room clock wise to this APU. Obviously GPU performance will be limited in most cases by memory bandwidth, but still it should be a lot faster than shown on these slides.

SimBy
12-May-2011, 10:25
Could that mean that the A4-3400 is a 25W part, the A6-3650 is a 35W part and the A8-3850 is a 45W part? If so then these APUs would really be perfect for notebooks.

Again, these product codes and power consumption figures make no sense.

AMD demoed the mobile A8-3510MX, which is probably the fastest mobile part (45W TDP), against the i7 2630QM.

So these are either desktop versions, but then the power consumption is not correct, or simply fake. The previously leaked model numbers for desktop Llanos were completely different.

Desktop:
Model Number | CPU cores | CPU Freq. | L2 Cache | Turbo Core | GPU Model | GPU Config | GPU Freq. | TDP | Release Date
E2-3250 | 2 | N/A | 1 MB | TBD | HD 6370 | 160:??:? | 443 MHz | 65 W | Q3 2011
A4-3350 |  | N/A | 2 MB | N/A | HD 6410 | 160:??:? | 594 MHz | 65 W | July 20, 2011
A4-3360 |  | N/A | 2 MB | N/A | HD 6410 | 160:??:? | N/A | 65 W | Q4 2011
A6-3450 | 4 | N/A | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 65 W | June 20, 2011
A6-3450P |  | N/A | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 100 W | June 20, 2011
A6-3460 |  | N/A | 4 MB | N/A | HD 6530 | 320:??:? | N/A | 65 W | Q4 2011
A6-3460P |  | N/A | 4 MB | N/A | HD 6530 | 320:??:? | N/A | 100 W | Q4 2011
A6-3550 |  | N/A | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 65 W | June 20, 2011
A8-3550P |  | N/A | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 100 W | June 20, 2011
A8-3560 |  | N/A | 4 MB | N/A | HD 6550 | 400:??:? | N/A | 65 W | Q4 2011
A8-3560P |  | N/A | 4 MB | N/A | HD 6550 | 400:??:? | N/A | 100 W | Q4 2011

Mobile:
Model Number | CPU cores | CPU Freq. | L2 Cache | Turbo Core | Turbo Speed | GPU Model | GPU Config | GPU Freq. | MEM Freq. | TDP | Release Date
A8-3510MX | 4 | 1.8 GHz | 4 MB | Yes | 2.5GHz | HD 6620M | 480:24:8 | 500 - 725 MHz | ? | 45 W | June 2011
A4-3330M | 2 | 2.2 GHz | 2 MB | Yes | ? | HD 6480M | 160:8:4 | ? | ? | ? | June 2011
? | ? | ? | ? | ? | ? | HD 6620Ga | ? | 400 - 500 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6520Gb | ? | 400 - 440 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6480Gc | ? | 400 - 500 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6380Gd | ? | 400 - 500 MHz | 667 - 667 MHz | ? | ?

fehu
12-May-2011, 12:46
maybe typical amd/ati strategy to confuse everyone?

Kaotik
12-May-2011, 14:21
maybe typical amd/ati strategy to confuse everyone?

How on earth is that "typical"? They've had perhaps the most sensible naming strategy for years, only 6750/6770 and 6800-series are exceptions to the rule

Blazkowicz
12-May-2011, 17:34
I thought the picture said, "there are 25W-35W-45W series A APU. feel free to believe we're implying things, but technically we didn't tell you they are those from the other picture"

fehu
12-May-2011, 22:06
How on earth is that "typical"? They've had perhaps the most sensible naming strategy for years, only 6750/6770 and 6800-series are exceptions to the rule

before a product launch amd/ati usually does what it can to confuse intel/nvidia with abstruse codenames and changes to the product names

Kaotik
12-May-2011, 22:45
before a product launch amd/ati usually does what it can to confuse intel/nvidia with abstruse codenames and changes to the product names

Could you actually show some examples of this, too?
The Rxxx > Names change was known long before the release, and the SI/NI mess was just as confusing inside AMD from what I've heard (and wasn't planned in the first place)

fehu
13-May-2011, 09:25
for example the simple fact that Rxx names don't exist any more, instead we have a lot of unrelated names so that nvidia has a hard time figuring out what is what
then the SI/NI confusion, from an interview it was a contingency plan but they used it to further confuse nvidia by recycling codenames
then the famous wrong sp count
amd has a similar policy in which they show a set of info under nda with one detail wrong or altered, so that they can almost tell who broke the nda
this produces a number of rumors that are almost identical except for some insignificant detail

Kaotik
13-May-2011, 15:16
for example the simple fact that Rxx names don't exist any more, instead we have a lot of unrelated names so that nvidia has a hard time figuring out what is what
then the SI/NI confusion, from an interview it was a contingency plan but they used it to further confuse nvidia by recycling codenames
then the famous wrong sp count
amd has a similar policy in which they show a set of info under nda with one detail wrong or altered, so that they can almost tell who broke the nda
this produces a number of rumors that are almost identical except for some insignificant detail

The new naming policy was known long before launch of HD5-series, true that they don't follow the old bigger number = better scheme.
The interview I've read said that the NI/SI confusion was never planned in the first place and caused too much in-house confusion too, NI was going to come first, but 32nm got cancelled > 40nm versions became SI, 28nm = new NI, but for them NI first makes more sense so 40nm = new NI, 28nm = SI.
Also, in case of NI, if my memory serves me right, the names are self explanatory - smallest island(s) name = slowest chip, biggest = fastest chip (or even card in case of Antilles), just the same logic as old Rxxx-naming.

swaaye
13-May-2011, 20:16
It will be interesting to see how Llano boxes come equipped. Obviously this IGP is going to be very bandwidth limited. I can see now the notebooks with a single channel DDR3 interface completely crippling it.

fehu
20-May-2011, 15:39
llano board
http://www.pcgameshardware.de/aid,825078/AMD-Llano-Elitegroup-stellt-das-erste-FM1-Mainboard-vor/Mainboard/News/

http://www.pcgameshardware.de/screenshots/medium/2011/05/A75F-A_V10_02.jpg

fehu
20-May-2011, 15:40
why 3 pci slots?

LordEC911
21-May-2011, 00:17
So does AMD consider the entire chip an "APU?" Or just the GPU part?

Edit- It seems like the entire chip is an "APU." Which would mean the A6-3450, A4-3360 and A4-3350 are the 45w, 35w and 25w APUs respectively.

LordEC911
21-May-2011, 00:27
Desktop:
Model Number | CPU cores | CPU Freq. | L2 Cache | Turbo Core | GPU Model | GPU Config | GPU Freq. | TDP | Release Date
E2-3250 | 2 |  | 1 MB | TBD | HD 6370 | 160:??:? | 443 MHz | 65 W | Q3 2011
A4-3350 | 2 |  | 2 MB | N/A | HD 6410 | 160:??:? | 594 MHz | 65 W | July 20, 2011
A4-3360 | 2 |  | 2 MB | N/A | HD 6410 | 160:??:? | 594 MHz | 65 W | Q4 2011
A6-3450 | 4 |  | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 65 W | June 20, 2011
A6-3450P | 4 |  | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 100 W | June 20, 2011
A6-3460 | 4 |  | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 65 W | Q4 2011
A6-3460P | 4 |  | 4 MB | N/A | HD 6530 | 320:??:? | 443 MHz | 100 W | Q4 2011
A6-3550 | 4 |  | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 65 W | June 20, 2011
A8-3550P | 4 |  | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 100 W | June 20, 2011
A8-3560 | 4 |  | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 65 W | Q4 2011
A8-3560P | 4 |  | 4 MB | N/A | HD 6550 | 400:??:? | 594 MHz | 100 W | Q4 2011

Mobile:
Model Number | CPU cores | CPU Freq. | L2 Cache | Turbo Core | Turbo Speed | GPU Model | GPU Config | GPU Freq. | MEM Freq. | TDP | Release Date
A8-3510MX | 4 | 1.8 GHz | 4 MB | Yes | 2.5GHz | HD 6620M | 480:24:8 | 500 - 725 MHz | ? | 45 W | June 2011
A4-3330M | 2 | 2.2 GHz | 2 MB | Yes | ? | HD 6480M | 160:8:4 | ? | ? | ? | June 2011
? | ? | ? | ? | ? | ? | HD 6620Ga | ? | 400 - 500 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6520Gb | ? | 400 - 440 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6480Gc | ? | 400 - 500 MHz | 667 - 800 MHz | ? | ?
? | ? | ? | ? | ? | ? | HD 6380Gd | ? | 400 - 500 MHz | 667 - 667 MHz | ? | ?

Some of that is confusing but we can fill in some of it,
since they use a different GPU model whenever the shader count or the MHz changes.
The number of CPU cores and the L2 cache size line up as well...

Blazkowicz
21-May-2011, 00:45
why 3 pci slots?

for people needing a parallel port, tuner, sound card, add an old 100Mb NIC, lab/industrial equipment?
all the same old reasons.

I run a sound card and an Ati Rage in my PCI slots for instance. if noise wasn't a concern I could plug it in an old SCSI controller and have a raid 0 of low latency old disks for the operating system.

I've been paying attention to PCIe slots now, I run a cheap PCIe gigabit NIC, and there are some great PCIe sound cards.
the PCIe 16x slots, even if working at 4x, are very versatile. low power graphics card for adding outputs, PCIe 1x card, or high bandwidth stuff that already exists but is yet to come to the consumer market (10GBase-T, Thunderbolt)

if the board has IOMMU I'm sold;

so this board looks very nice. lack of IDE means, well I'm giving up optical drives.
sure they could have made it so you lose a PCI slot when you use a dual slot GPU.

Psycho
21-May-2011, 02:02
a bit more detail - and prices:
http://img.donanimhaber.com//images/haber/26510/amdbulldozerlianofiyat_dh_fx57.jpg
http://www.donanimhaber.com/islemci/haberleri/AMDnin-Bulldozer-ve-Fusion-Liano-islemcileri-icin-ilk-fiyat-bilgileri.htm

Erinyes
21-May-2011, 11:51
a bit more detail - and prices:
http://www.donanimhaber.com/islemci/haberleri/AMDnin-Bulldozer-ve-Fusion-Liano-islemcileri-icin-ilk-fiyat-bilgileri.htm

Leaked pricing supposedly but if its true then its not bad at all. Looking forward to a Llano notebook for gaming :D

fehu
21-May-2011, 17:04
The price looks so low on the highest models that i'm starting to ask if bulldozer is a new "agena core"

Blazkowicz
21-May-2011, 17:24
it's about the price of Intel's new "high end" i.e. the core i7 2600k.

sure there is the hexacore on socket 1366, and the future processors on socket 2011 but those are rather marginal for consumers. AMD's socket C32 somewhat competes with them.

trinibwoy
21-May-2011, 17:31
Still looks like HD6450-ish performance. Well maybe a bit more for the faster parts. That's what everybody is expecting anyways, so I'm not really impressed. Still, great for an IGP, should make for quite nice not too expensive notebooks.
I'd be more impressed though if we didn't know the cpu part is unlikely to be competitive.

It's going to come down to price. Power consumption is important too but in the end Llano based notebooks will have to come up with decent overall performance to avoid being associated with "cheap" relative to Intel. AMD has worn that badge for too long and it continues to hurt them.

Entropy
21-May-2011, 20:25
It's going to come down to price. Power consumption is important too but in the end Llano based notebooks will have to come up with decent overall performance to avoid being associated with "cheap" relative to Intel. AMD has worn that badge for too long and it continues to hurt them.

There is nothing in that list that has a TDP lower than 65W.
That's not impossible in a notebook, but it's not terribly desirable either, particularly given that the graphics inevitably will be rather lack-luster compared to dedicated gfx-solutions with much better memory subsystems.

Entropy
21-May-2011, 20:30
The price looks so low on the highest models that i'm starting to ask if bulldozer is a new "agena core"

By architectural analysis and leaks alike, it's probably going to be beaten by Sandy Bridge in terms of single thread performance, but may make up for it in multithreaded performance, provided you can use 8 threads load balanced, and without memory hierarchy limitations. I'd say the pricing is higher than I fear is justified in the real world, but hope to get a positive surprise. Very curious about the performance of their new memory controller.

Erinyes
21-May-2011, 20:37
There is nothing in that list that has a TDP lower than 65W.
That's not impossible in a notebook, but it's not terribly desirable either, particularly given that the graphics inevitably will be rather lack-luster compared to dedicated gfx-solutions with much better memory subsystems.

Those are only for the desktop parts. The Laptop parts have a TDP of 45W or less.

With DDR3-1600 they'll have almost as much memory bandwidth as the desktop equivalent (in this case the Radeon 5570). And with the clocks being lower than the 5570 it shouldnt be too starved of memory b/w. We're still talking roughly 3-4X Sandy Bridge performance

Blazkowicz
21-May-2011, 21:07
almost yesterday most laptop graphics used to have 64bit ddr2 or be an Intel GMA X3000, now it's sandy bridge or a llano with 128bit ddr3.

an interesting test would be running duke nukem forever, or rage for that matter. I hereby predict they will run great on a 45W llano :)

LordEC911
21-May-2011, 22:05
Leaked pricing supposedly but if its true then its not bad at all. Looking forward to a Llano notebook for gaming :D

I believe that is distributor's price, when purchasing 1k unit lots.

swaaye
21-May-2011, 22:08
With DDR3-1600 they'll have almost as much memory bandwidth as the desktop equivalent (in this case the Radeon 5570). And with the clocks being lower than the 5570 it shouldnt be too starved of memory b/w. We're still talking roughly 3-4X Sandy Bridge performance
Except that there's also a CPU and some other subsystems using that RAM bandwidth. It will be interesting to see its performance. I think it will probably be beaten by Radeon 3850/4670/5570 or GeForce 9600 level hardware. Which is still ok I guess as it establishes a new budget gaming performance level.

pjbliverpool
22-May-2011, 00:41
Except that there's also a CPU and some other subsystems using that RAM bandwidth. It will be interesting to see its performance. I think it will probably be beaten by Radeon 3850/4670/5570 or GeForce 9600 level hardware. Which is still ok I guess as it establishes a new budget gaming performance level.

>360/PS3 power in integrated chips can only be a good thing. But by the time these chips are in the majority of PC's we'll be looking forward to the next generation of consoles.

That said, now that AMD are in on the game and Intel look like they are starting to get serious (especially in light of the new competition from AMD) we might see the integrated chips equaling the power of next gen consoles much sooner next generation. If your basic off the shelf laptop is able to graphically compete with next gen consoles by only midway through their life then it could bode very nicely for PC gaming.

I think most people underestimate the truly huge advantage of PC gaming that is cost of games. I don't think I've paid more than £12 for a game in 2 years and most of those are AAA's from the same period. Take that alongside a platform that everyone has already for doing homework and you have a pretty potent combination.

denev2004
22-May-2011, 04:42
From my point of view there is nothing i can say to Intel's IGP, except Larrabee System

DavidC
22-May-2011, 08:14
We're still talking roughly 3-4X Sandy Bridge performance

I don't think so. Here's a benchmark for Nforce 420: http://www.anandtech.com/show/828/12

Sharing memory takes quite a bit off performance. Nforce 420 with 50% more memory bandwidth gets outperformed by the Geforce 2 by 20-30%.

Benches comparing 5570 with 5450 gives a difference of 2.5-3.5x depending on the application. Also, the leaked benches suggest 1.5-2x difference, unless you include the 6EU chip. There's even the laptop comparison slide which shows gains as little as 30%.

Entropy
22-May-2011, 09:41
I don't think so. Here's a benchmark for Nforce 420: http://www.anandtech.com/show/828/12

Sharing memory takes quite a bit off performance. Nforce 420 with 50% more memory bandwidth gets outperformed by the Geforce 2 by 20-30%.

Benches comparing 5570 with 5450 gives a difference of 2.5-3.5x depending on the application. Also, the leaked benches suggest 1.5-2x difference, unless you include the 6EU chip. There's even the laptop comparison slide which shows gains as little as 30%.

I agree with your critique of Erinyes statement, but I think this link from Anandtech, directly comparing the desktop HD5570 with Sandy Bridge HD3000 drives home the point that there is no way that Llano can offer 3-4 times SB performance. Direct comparison link (http://www.anandtech.com/show/4083/the-sandy-bridge-review-intel-core-i7-2600k-i5-2500k-core-i3-2100-tested/11)
Also, while Bulldozer will offer some improvements over the previous generation in terms of main memory handling, I have heard nothing of the sort for Llano (not that I've had my ear to the rail for that processor. Not my segment professionally.).

denev2004
22-May-2011, 10:52
I agree with your critique of Erinyes statement, but I think this link from Anandtech, directly comparing the desktop HD5570 with Sandy Bridge HD3000 drives home the point that there is no way that Llano can offer 3-4 times SB performance. Direct comparison link (http://www.anandtech.com/show/4083/the-sandy-bridge-review-intel-core-i7-2600k-i5-2500k-core-i3-2100-tested/11)
Also, while Bulldozer will offer some improvements over the previous generation in terms of main memory handling, I have heard nothing of the sort for Llano (not that I've had my ear to the rail for that processor. Not my segment professionally.).
It is said that some changes were made going from K10.5 to Llano; most are not about the memory system, but maybe some are.

Erinyes
23-May-2011, 05:57
Except that there's also a CPU and some other subsystems using that RAM bandwidth. It will be interesting to see its performance. I think it will probably be beaten by Radeon 3850/4670/5570 or GeForce 9600 level hardware. Which is still ok I guess as it establishes a new budget gaming performance level.

True the CPU will definitely have an impact but i dont expect it to be extremely significant (looking at Zacate performance). Here is my own post from earlier in the thread regarding that - "Zacate with its GPU at 500 mhz performs anywhere between 50-70% of a discrete Radeon 5450 (which is clocked at 650 mhz, so its got a 23% clock disadvantage anyway)"

Edit: And Zacate has a 64 bit DDR3-1066 memory controller. Radeon 5450 is mostly using 64 bit DDR3-1600
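To put numbers on the memory side of that comparison, here is a minimal sketch of theoretical peak bandwidth (bus width in bytes times transfer rate). The Zacate and 5450 figures are the ones quoted above; the Llano entry assumes the 128-bit DDR3-1600 setup mentioned elsewhere in the thread, and the function name is just for illustration. Real sustained bandwidth will be lower, and on an APU it is shared with the CPU.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * mega_transfers_per_s / 1000


# (bus width in bits, data rate in MT/s); the Llano row is an assumption.
configs = {
    "Zacate (64-bit DDR3-1066)": (64, 1066),
    "Radeon 5450 (64-bit DDR3-1600)": (64, 1600),
    "Llano (128-bit DDR3-1600, assumed)": (128, 1600),
}

bandwidths = {name: peak_bandwidth_gbs(*cfg) for name, cfg in configs.items()}
for name, gbs in bandwidths.items():
    print(f"{name}: {gbs:.1f} GB/s")

# GPU clock gap quoted in the post: 500 MHz Zacate vs 650 MHz Radeon 5450.
clock_deficit = 1 - 500 / 650
print(f"Zacate GPU clock deficit vs 5450: {clock_deficit:.0%}")
```

On these assumptions Zacate has about two thirds of the 5450's peak bandwidth as well as the lower clock, so the 50-70% result is not only down to sharing memory with the CPU.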

I agree with your critique of Erinyes statement, but I think this link from Anandtech, directly comparing the desktop HD5570 with Sandy Bridge HD3000 drives home the point that there is no way that Llano can offer 3-4 times SB performance. Direct comparison link (http://www.anandtech.com/show/4083/the-sandy-bridge-review-intel-core-i7-2600k-i5-2500k-core-i3-2100-tested/11)
Also, while Bulldozer will offer some improvements over the previous generation in terms of main memory handling, I have heard nothing of the sort for Llano (not that I've had my ear to the rail for that processor. Not my segment professionally.).

Hmmm maybe i was being too optimistic regarding the performance of Llano, i think i should revise my estimate to 2X-3X of Sandy Bridge performance. The link you've posted shows that the 5570 regularly performs about 2.5 - 3X of Sandy Bridge, but that isnt a true comparison IMHO. The comparison needs to be performed at medium detail and/or higher resolution. If you have the GPU performance of Llano, you wont need to run at low detail except for the most demanding games. And moving to medium details and/or higher resolution can change the picture quite a bit.

And Llano will have a distinct CPU disadvantage which will definitely be felt in some games (eg. Battlefield BC2, Starcraft II to name a few). 32nm should move the clocks up a bit and with the addition of Turbo mode the CPU performance should take a step up from current Athlon II/Phenom II levels. So i think overall a figure of 2X-3X is quite possible. (Also Llano + hybrid Crossfire with Redwood/Turks will be a formidable mobile gaming platform)

Oh and Llano will feel the heat from Ivy bridge six months after its release. If IB manages to double SB performance, its not going to be a pretty sight for Llano.

fehu
23-May-2011, 11:37
http://img.donanimhaber.com//images/haber/26533/amdsabineplatformdetails2a_dh_fx57.jpg

the big meh of the mobile llano :|
the turbo in the husky revision works on some core or on all?

mczak
23-May-2011, 12:38
Oh and Llano will feel the heat from Ivy bridge six months after its release. If IB manages to double SB performance, its not going to be a pretty sight for Llano.
Yes, a two times increase over HD3000 would get it quite close to Llano GPU. So far I'm not quite sure how intel will manage that, after all there aren't that many units more at presumably still similar clocks. One thing though I noticed is that the MRF (message register file) is gone in IB it's all GRF now - might eliminate some message passing overhead. One advantage of SB/IB is of course the use of the L3 - with things like hierarchical-Z (which intel also supports nowadays) this can potentially make the chip quite a bit less bandwidth-starved, so even if the gpu core would be quite a bit slower than Llano it could make up for this due to superior cache/memory hierarchy. I guess AMD will follow suit with GPU-enabled BD but that's still quite far ahead.

LordEC911
24-May-2011, 04:13
http://img.donanimhaber.com//images/haber/26533/amdsabineplatformdetails2a_dh_fx57.jpg

the big meh of the mobile llano :|
the turbo in the husky revision works on some core or on all?

And does it work when the GPU is being used?
Someone mentioned that the TDP is for the base CPU speed + GPU or the max CPU speed w/ no GPU. I can't wait for reviews.

Erinyes
24-May-2011, 05:25
And does it work when the GPU is being used?
Someone mentioned that the TDP is for the base CPU speed + GPU or the max CPU speed w/ no GPU. I can't wait for reviews.

I would think that is max TDP when all cores(at base speed) + GPU are being fully loaded. I dont see why it couldnt work when the GPU isnt being used. If the TDP is below what the chip is rated for, Turbo should activate regardless of what is being used. Another point of note in that slide is that nothing is mentioned about graphics turbo.

Those clocks and TDPs are quite disappointing really. The current Phenom II P920/930 run at 1.6/1.7 ghz and are rated at 25 watts. And the Radeon HD 5650M (the equivalent of Llano's graphics) is rated at 15-19 watts. These are at 45nm and 40nm for the CPU and GPU respectively. GF's 32nm HKMG process was supposed to be far superior in power and performance. Given that both CPU and GPU are now built under 32nm i was expecting a lot better TBH

eastmen
24-May-2011, 06:17
I don't get the problem ? The TDP is there because of the high turbo clock speed. you're looking at 45w TDP for a 2.6ghz turbo with a 444mhz gpu with 400 shaders

fehu
24-May-2011, 08:08
Actually one of my friends is buying a notebook with a Turion II P560 (2500 MHz 2MB 25W 45nm) and RS880 for a ridiculous 350€
A llano that runs at 1900MHz all the time and overclocks one core (?) to 2500MHz has a tdp of 35W, and even worse, the one with a 2100MHz base goes up to 45W
I know that there's the gpu, but it's only 240 cores :|
- from 32nm i was expecting something way better
- they are completely missing the 25W product range

and i was extremely interested in this cpu

Alexko
24-May-2011, 09:54
I don't get the problem ? The TDP is there because of the high turbo clock speed. you're looking at 45w TDP for a 2.6ghz turbo with a 444mhz gpu with 400 shaders

The point of Turbo is that it's not supposed to increase the TDP, but make better use of it when not all cores are active. So this is a little strange.

Erinyes
24-May-2011, 11:52
I don't get the problem ? The TDP is there because of the high turbo clock speed. you're looking at 45w TDP for a 2.6ghz turbo with a 444mhz gpu with 400 shaders

The point of Turbo is that it's not supposed to increase the TDP, but make better use of it when not all cores are active. So this is a little strange.

Like i said, afaik the TDP is the max power consumption when all cores are running at full load (at base speed) + GPU at full load. Turbo would up the core clocks when the GPU isnt being stressed and/or some cores are idle/lightly loaded, as long as power consumption stays within the design TDP.

So in this case 45W would be for 4 cores at 1.9 ghz and one redwood gpu at 444 mhz. Now considering current 4 core phenom's at 1.7 gigahertz consume 25 watts( on 45 nm) and Mobile redwood at 450 mhz consumes about 15-19W(on 40 nm). Add them up and you're looking at 40-44 watts. Now its disappointing that combining the two and shrinking to 32nm still consumes 45 watts, without much of an increase in clocks. (Llano also has a built in PCIE controller i think but that shouldnt have too much of an effect)
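The reasoning above can be sketched roughly as follows. This is only an illustration of the "TDP = all cores at base clock + GPU at full load, Turbo uses whatever is left" reading, not AMD's actual power management; the function names are hypothetical and the 5 W light GPU load is a made-up figure.

```python
TDP_W = 45.0  # rated TDP of the leaked 4-core mobile Llano


def combined_draw_range(cpu_w, gpu_range_w):
    """Add a CPU rating to a (low, high) GPU rating range."""
    gpu_lo, gpu_hi = gpu_range_w
    return cpu_w + gpu_lo, cpu_w + gpu_hi


def turbo_headroom_w(tdp_w, cpu_draw_w, gpu_draw_w):
    """Power left under the TDP that Turbo could spend on higher CPU clocks."""
    return max(0.0, tdp_w - (cpu_draw_w + gpu_draw_w))


# 45nm Phenom II at 25 W plus 40nm Mobility 5650 at 15-19 W, as quoted above.
lo, hi = combined_draw_range(25, (15, 19))
print(f"Separate CPU + GPU: {lo}-{hi} W against a {TDP_W:.0f} W Llano TDP")

# GPU fully loaded: almost nothing left for Turbo.
print(turbo_headroom_w(TDP_W, 25, 19), "W headroom with the GPU at full load")
# GPU lightly loaded (hypothetical 5 W): plenty of room to raise CPU clocks.
print(turbo_headroom_w(TDP_W, 25, 5), "W headroom with the GPU mostly idle")
```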

fehu
24-May-2011, 14:24
redwood has pcie too, plus its own memory controller, and runs at 770MHz
glofo may have some serious problems with 32nm, or we are missing something fundamental, or those specs are wrong

Erinyes
24-May-2011, 16:26
redwood has pcie too, plus its own memory controller, and runs at 770MHz
glofo may have some serious problems with 32nm, or we are missing something fundamental, or those specs are wrong

Thats desktop Redwood you're talking about, i.e. Radeon 5670 with a TDP of 60-70W. I was talking about mobile Redwood or Mobility Radeon 5650.

But yes could indicate a problem with the 32nm process. Not like its unprecedented, even when they transitioned from 90nm->65nm, the clocks were actually lower and they couldnt match the 90nm clocks for about a year and a half.

swaaye
24-May-2011, 19:38
The main thing I recall about AMD going 65nm was their nasty L2 latency increase. Surely they did that to improve yield. It caused around a 10% perf per clock deficit though.

eastmen
24-May-2011, 20:30
Like i said, afaik the TDP is the max power consumption when all cores are running at full load (at base speed) + GPU at full load. Turbo would up the core clocks when the GPU isnt being stressed and/or some cores are idle/lightly loaded, as long as power consumption stays within the design TDP.

So in this case 45W would be for 4 cores at 1.9 ghz and one redwood gpu at 444 mhz. Now considering current 4 core phenom's at 1.7 gigahertz consume 25 watts( on 45 nm) and Mobile redwood at 450 mhz consumes about 15-19W(on 40 nm). Add them up and you're looking at 40-44 watts. Now its disappointing that combining the two and shrinking to 32nm still consumes 45 watts, without much of an increase in clocks. (Llano also has a built in PCIE controller i think but that shouldnt have too much of an effect)

I think you are going to see power savings just from the chips being combined, there will be a lot of complexity moved off the motherboard.

Also I can't find whether the 5650 part's TDP (all i can find is people saying the rumor points to those TDPs) includes the ram.

Anyway hopefully we get more info at e3 so we can form a more informed picture of Llano.

Kyle at hardocp is saying bulldozer won't come in june, which has pissed me off, i might buy a sandy bridge later today

Alexko
24-May-2011, 23:03
Like i said, afaik the TDP is the max power consumption when all cores are running at full load (at base speed) + GPU at full load. Turbo would up the core clocks when the GPU isnt being stressed and/or some cores are idle/lightly loaded, as long as power consumption stays within the design TDP.

So in this case 45W would be for 4 cores at 1.9 ghz and one redwood gpu at 444 mhz. Now considering current 4 core phenom's at 1.7 gigahertz consume 25 watts( on 45 nm) and Mobile redwood at 450 mhz consumes about 15-19W(on 40 nm). Add them up and you're looking at 40-44 watts. Now its disappointing that combining the two and shrinking to 32nm still consumes 45 watts, without much of an increase in clocks. (Llano also has a built in PCIE controller i think but that shouldnt have too much of an effect)

Which is why I find these specs strange. Hopefully, they're not true.

DavidC
24-May-2011, 23:55
So in this case 45 W would be for 4 cores at 1.9 GHz and one Redwood GPU at 444 MHz. Now consider that current 4-core Phenoms at 1.7 GHz consume 25 W (on 45 nm) and mobile Redwood at 450 MHz consumes about 15-19 W (on 40 nm). Add them up and you're looking at 40-44 W. It's disappointing that combining the two and shrinking to 32 nm still consumes 45 W, without much of an increase in clocks. (Llano also has a built-in PCIe controller, I think, but that shouldn't have too much of an effect.)

I think you guys are expecting too much. New process tech allows ~20% increase in performance or 30% reduction in power.

1.7GHz = 25W
GPU = 20W

20% increase in clocks = 2-2.1GHz CPU

+Moving from 40nm TSMC which might be more optimized for power usage than 45nm for CPUs
+PCI Express integration
+Greater average utilization on the memory controller due to integrating the GPU
+Not accounting for TurboCore

You can of course look at the 3500M, with a 1.5GHz base clock and a 2.4GHz Turbo clock: with TurboCore it probably clocks similarly to the 1.7GHz Phenom II in real-world usage, and saves 10W for the CPU+GPU combo.
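The back-of-the-envelope budget in this exchange can be checked with a few lines of Python. This is only a sketch: the wattages and clocks are the estimates quoted in the thread, and the "~20% more clock or ~30% less power" figure is the rule of thumb from the post above, not measured data. The variable names are made up for illustration.

```python
# Rough power/clock budget using figures quoted in the thread.
# All inputs are posters' estimates, not official specifications.

cpu_power_w = 25.0            # quad-core Phenom II at 1.7 GHz (45 nm)
gpu_power_w = (15.0, 19.0)    # mobile Redwood at 450 MHz (40 nm), low/high estimate

# Sum of the two discrete parts, low and high end.
combined_w = tuple(cpu_power_w + g for g in gpu_power_w)

# Rule of thumb from the post: a new node buys ~20% more clock
# OR ~30% less power, not both at once.
clock_gain = 0.20
power_cut = 0.30

base_clock_ghz = 1.7
scaled_clock_ghz = base_clock_ghz * (1 + clock_gain)
scaled_power_w = tuple(w * (1 - power_cut) for w in combined_w)

print(f"discrete CPU + GPU:      {combined_w[0]:.0f}-{combined_w[1]:.0f} W")
print(f"same power, +20% clock:  {scaled_clock_ghz:.2f} GHz")
print(f"same clock, -30% power:  {scaled_power_w[0]:.1f}-{scaled_power_w[1]:.1f} W")
```

Under these assumptions the rumoured 1.9 GHz at 45 W sits a little below the "same power, higher clock" estimate, which is roughly the gap the posters are arguing about.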

mczak
25-May-2011, 01:14
The main thing I recall about AMD going 65nm was their nasty L2 latency increase. Surely they did that to improve yield. It caused around a 10% perf per clock deficit though.
That's vastly exaggerated. More like 2-3% on average, only exceeding 5% with L2-testing synthetics. And later revisions brought about half of that back (not for the synthetics, as it was due to memory controller tweaks, not because of L2 changes). It was indeed surprising however - even more so cause AMD at some point said this would enable larger L2 caches in the future, yet they had 1MB (per core) L2 90nm parts and we never saw more than 512KB (per core) on 65nm...

Blazkowicz
25-May-2011, 12:18
I had the AMD 65nm single core die and loved it. huge upgrade over an athlon XP, and the regulated fan would often stop.
that was the 1.9GHz 256K sempron, which I had slightly undervolted and 25% overclocked. bought it for 27 euros I believe! with an upgrade path to the athlon II X2 which is a great behaving 45nm core as well.

maybe AMD should have done a dual core llano as well, but we'll see how disabled models fare. the quad core has a cost with mobile, but it's the same as mobile core i7, with the need for underclocking when all cores are used.

base clock is with all cores loaded 100% so in typical gaming you will be not so much downclocked.

Lightman
25-May-2011, 12:56
Official data:

http://sites.amd.com/us/vision/promo/disclaimer/Pages/ad-disclaimer.aspx

2011 VISION A4-based PC deliver up to 143% better visual performance than a 2011 VISION E2-based PC.

Tests conducted by AMD Performance Labs using FutureMark 3DMark Vantage Performance as a metric for visual performance. The 2011 VISION A4-based PC scored 1625 while the 2011 VISION E2-based PC scored 670. All scores rounded to the nearest whole number. The 2011 VISION A4-based PC consisted of the reference platform "Torpedo" with the AMD Dual-Core A4-3300M APU, with AMD Radeon™ HD 6480G graphics, 4 GB (2x2GB) DDR3-1333Mhz system memory, and Windows 7 Home Premium 64-bit. The 2011 VISION E2-based PC consisted of the reference design "Inagua" with the AMD Dual-Core E-350 APU, AMD Radeon™ HD 6310 graphics, 4GB (2x2GB), DDR3-1066Mhz system memory, Windows 7 Ultimate 64-bit. SBNB-I23

2011 VISION A6-based PC deliver up to 16% better visual performance than a 2011 VISION A4-based PC.

Tests conducted by AMD Performance Labs using FutureMark 3DMark Vantage Performance as a metric for visual performance. The 2011 VISION A6-based PC scored 1882 while the 2011 VISION A4-based PC scored 1625. All scores rounded to the nearest whole number. The 2011 VISION A6-based PC consisted of the reference platform "Torpedo" with the AMD Quad-Core A6-3410MX APU, with AMD Radeon™ HD 6520G graphics, 4 GB (2x2GB) DDR3-1333Mhz system memory, and Windows 7 Home Premium 64-bit. The 2011 VISION A4-based PC consisted of the reference platform "Torpedo" with the AMD Dual-Core A4-3300M APU, with AMD Radeon™ HD 6480G graphics, 4 GB (2x2GB) DDR3-1333Mhz system memory, and Windows 7 Home Premium 64-bit. SBNB-I24

2011 VISION A8-based PC deliver up to 51% better visual performance than a 2011 VISION A6-based PC.

Tests conducted by AMD Performance Labs using FutureMark 3DMark Vantage Performance as a metric for visual performance. The 2011 VISION A8-based PC scored 2842 while the 2011 VISION A6-based PC scored 1882. All scores rounded to the nearest whole number. The 2011 VISION A8-based PC consisted of the reference platform "Torpedo" with the AMD Quad-Core A8-3510MX APU, with AMD Radeon™ HD 6620G graphics, 4 GB (2x2GB) DDR3-1333Mhz system memory, and Windows 7 Home Premium 64-bit. The 2011 VISION A6-based PC consisted of the reference platform "Torpedo" with the AMD Quad-Core A6-3410MX APU, with AMD Radeon™ HD 6520G graphics, 4 GB (2x2GB) DDR3-1333Mhz system memory, and Windows 7 Home Premium 64-bit. SBNB-I26
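The three "up to N% better" claims follow directly from the quoted scores. A minimal sketch that recomputes them (the dictionary labels and the helper name `percent_better` are just for illustration):

```python
# 3DMark Vantage Performance scores as quoted in AMD's disclaimer text.
scores = {
    "E2 (E-350)": 670,
    "A4-3300M": 1625,
    "A6-3410MX": 1882,
    "A8-3510MX": 2842,
}

def percent_better(new, old):
    """Return how much higher `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0

# Compare each tier with the one directly below it.
names = list(scores)
gains = {}
for slower, faster in zip(names, names[1:]):
    gains[faster] = percent_better(scores[faster], scores[slower])
    print(f"{faster} vs {slower}: {gains[faster]:.1f}% better")
```

Rounded to whole numbers, the results match the 143%, 16% and 51% figures in the disclaimer.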

eastmen
25-May-2011, 19:16
Meh, the A4-3300 has over twice the TDP of the E-350; of course it's going to perform better!

mczak
25-May-2011, 21:24
Meh, the A4-3300 has over twice the TDP of the E-350; of course it's going to perform better!
Yes, comparing against the low-end platform isn't very impressive. Being better in the midrange than your own lowend isn't going to be enough... I don't see these numbers on that page though?
Oh and btw a Vantage P score (assuming that's overall) of 1625 isn't all that much, right in HD3000 territory, with some HD3000 notebooks easily outscoring it (though at least the 2842 score would be outside of reach of HD3000, and presumably could be boosted some more with ddr3-1600).

Kaotik
27-May-2011, 00:25
Meh, the A4-3300 has over twice the TDP of the E-350; of course it's going to perform better!

E2, not E, meaning E2-3250, not E-350.
E2-3250 is Llano based with 2 K10.5-x64 cores and 160 SPs and has same 65W TDP as A4-3350

Mintmaster
27-May-2011, 00:59
E2, not E, meaning E2-3250, not E-350.
E2-3250 is Llano based with 2 K10.5-x64 cores and 160 SPs and has same 65W TDP as A4-3350
Look harder:
The 2011 VISION E2-based PC consisted of the reference design "Inagua" with the AMD Dual-Core E-350 APU, AMD Radeon™ HD 6310 graphics, 4GB (2x2GB), DDR3-1066Mhz system memory, Windows 7 Ultimate 64-bit. SBNB-I23

Pete
27-May-2011, 01:31
Heh, Kaotik, your oversight reminds me of this gem from AMD's Bulldozer blog (http://blogs.amd.com/work/2011/05/16/stop-the-clocks/):

Our design engineers estimate that this will drop the power consumption by up to 95% in idle over the previous generation of processor cores[...].
95% drop in idle state over previous generation is quite impressive. --commenter

The 95% number is a comparison of a core in idle and a core in C6 state, this is not a comparison of 2 different generations. --blog author

I'm still trying to figure out if this is PEBKAC on my part.

Kaotik
27-May-2011, 08:52
Whoops, should have read more than the subject line :oops:
It sounds weird, though, that they'd call it E2, since E2 is a Llano-based CPU too.

fehu
27-May-2011, 09:53
there's another "incongruence"
the comparison is only about visual performance: the E350 has 80 radeon cores and the A4 has 240, but it is only 143% faster, which is equivalent to a bit less than 200 cores

AnarchX
27-May-2011, 12:10
Maybe both have only 4 ROPs?

mczak
27-May-2011, 13:02
there's another "incongruence"
the comparison is only about visual performance: the E350 has 80 radeon cores and the A4 has 240, but it is only 143% faster, which is equivalent to a bit less than 200 cores
Quite the contrary, the scaling is ok. 80->240 cores, but the E350 has a 12% higher gpu clock. Getting 2.43 times the performance with only 2.7 times the shader power is quite good scaling. It also only has twice the rops (again, a bit less effectively due to clock), and memory bandwidth is also scaled according to the performance. There's also the cpu part of the benchmark to consider, for which I'm too lazy to look up any numbers.
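For anyone who wants to redo the arithmetic, here is a small sketch of that scaling argument. The core counts, the 12% clock advantage and the two scores come from the thread; treating "shader power" as cores times clock is a simplification, and the variable names are made up.

```python
# Shader-scaling check for the E-350 vs A4-3300M comparison.

e350_cores, a4_cores = 80, 240
e350_clock_advantage = 1.12      # E-350 GPU clock relative to the A4's, per the post
e350_score, a4_score = 670, 1625

# Relative shader throughput of the A4 (cores x clock).
shader_ratio = (a4_cores / e350_cores) / e350_clock_advantage
# Relative benchmark result.
perf_ratio = a4_score / e350_score
# Fraction of the extra shader power that shows up in the score.
efficiency = perf_ratio / shader_ratio

# The "less than 200 cores" view: how many E-350-style cores the A4 score
# would correspond to if the clock difference is ignored.
equivalent_cores = e350_cores * perf_ratio

print(f"shader power ratio: {shader_ratio:.2f}x")
print(f"performance ratio:  {perf_ratio:.2f}x")
print(f"scaling efficiency: {efficiency:.0%}")
print(f"equivalent cores:   {equivalent_cores:.0f}")
```

Both readings use the same 2.43x ratio; they differ only in whether the clock deficit is taken into account.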

btw the gpu clocks are imho amazingly low. Consider mobility redwood (madison), which had 450-600Mhz clock for the HD 5650 mobility and up to 650Mhz for the other redwood based mobility 57xx parts.
This is why I was (wrongly) thinking Llano would have less simds than redwood initially, because 32nm SOI should enable at least slightly higher clocks, no? But instead we get lower clocks. I can only guess it's more power efficient than a cut-down gpu with higher clocks would have been, but the clocks are so low it's almost hard to believe.

ToTTenTranz
27-May-2011, 13:58
btw the gpu clocks are imho amazingly low. Consider mobility redwood (madison), which had 450-600Mhz clock for the HD 5650 mobility and up to 650Mhz for the other redwood based mobility 57xx parts.
This is why I was (wrongly) thinking Llano would have less simds than redwood initially, because 32nm SOI should enable at least slightly higher clocks, no? But instead we get lower clocks. I can only guess it's more power efficient than a cut-down gpu with higher clocks would have been, but the clocks are so low it's almost hard to believe.

These are numbers for the mobile parts, most of which will have Turbo enabled on the CPU and GPU.
As an example the C-60's GPU has a 280MHz base clock, but it turboes up to 400MHz.

That's a 43% overclock. Apply that same ~40% overclock to Llano GPUs and you get the ~550MHz clocks you were looking for.

But don't forget the TDP for mobile Llanos is 45W at most. HD5650M has a 20W TDP and quad-core Champlains have a 45W TDP. There's definitely a gain in power efficiency here.
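A quick sketch of the two calculations above. The C-60 clocks and the TDP figures are the ones in this post, and 444 MHz is the rumoured Llano GPU clock quoted earlier in the thread; whether Llano's GPU turbo behaves like the C-60's is purely an assumption here.

```python
# Turbo headroom, using the C-60 figures from the post.
c60_base_mhz, c60_turbo_mhz = 280, 400
turbo_ratio = c60_turbo_mhz / c60_base_mhz

# Assumption: Llano's GPU gets a similar ~40% uplift.
assumed_uplift = 1.40
target_mhz = 550
implied_base_mhz = target_mhz / assumed_uplift   # base clock needed to reach ~550 MHz
rumoured_base_mhz = 444                          # figure quoted earlier in the thread
rumoured_turbo_mhz = rumoured_base_mhz * assumed_uplift

# TDP comparison: discrete HD5650M + quad-core Champlain vs. a 45 W Llano.
discrete_tdp_w = 20 + 45
llano_tdp_w = 45
tdp_saving = 1 - llano_tdp_w / discrete_tdp_w

print(f"C-60 turbo uplift:         {turbo_ratio - 1:.0%}")
print(f"base clock for ~550 MHz:   {implied_base_mhz:.0f} MHz")
print(f"444 MHz with +40%:         {rumoured_turbo_mhz:.0f} MHz")
print(f"TDP saving vs. discrete:   {tdp_saving:.0%}")
```

Note that TDP is a design limit, not measured draw, so the last figure is only a loose indication of the efficiency gain.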




Furthermore, it seems most Llano laptops will bundle discrete cards (http://www.passiontec.de/_fenster.php?art_id=600693537&ref_id=TradeDoubler&tduid=79092fa085f497c44728ec3129eb98d2&affId=1376965) for Crossfire, so the largest game changer might turn out to be the price/performance ratio for budget systems.

Entropy
27-May-2011, 14:12
Quite the contrary, the scaling is ok. 80->240 cores, but the E350 has a 12% higher gpu clock. Getting 2.43 times the performance with only 2.7 times the shader power is quite good scaling. It also only has twice the rops (again, a bit less effectively due to clock), and memory bandwidth is also scaled according to the performance. There's also the cpu part of the benchmark to consider, for which I'm too lazy to look up any numbers.

This is Vantage you are talking about here, and I dare say that looking at that to judge scaling for these ICs is dubious either way. I can't imagine that they will ever be able to produce decent playability with that kind of graphics code. To get reasonable playability with these parts, you will have to adjust settings to something sympathetic. Which, in a sense, is OK. Judging them on the basis of extremely heavy benchmarks constructed to justify sales of new hardware isn't meaningful. Can they do a decent job of WoW? Diablo 3?

These aren't e-penis parts, and shouldn't be measured using e-penis measurement tools.

mczak
27-May-2011, 16:24
These are numbers for the mobile parts, most of which will have Turbo enabled on the CPU and GPU.
As an example the C-60's GPU has a 280MHz base clock, but it turboes up to 400MHz.

That's a 43% overclock. Apply that same ~40% overclock to Llano GPUs and you get the ~550MHz clocks you were looking for.

Yes that's true. I haven't seen any turbo numbers for the GPU part anywhere, so if they can indeed turbo up 40% it makes more sense.


But don't forget the TDP for mobile Llanos is 45W at most. HD5650M has a 20W TDP and quad-core Champlains have a 45W TDP. There's definitely a gain in power efficiency here.

It just seemed like they sacrificed quite a lot of frequency for not much more power efficiency. But if it can clock to 550Mhz with the cpu still active I guess it's ok.


Furthermore, it seems most Llano laptops will bundle discrete cards (http://www.passiontec.de/_fenster.php?art_id=600693537&ref_id=TradeDoubler&tduid=79092fa085f497c44728ec3129eb98d2&affId=1376965) for Crossfire, so the largest game changer might turn out to be the price/performance ratio for budget systems.
If that really is a regular HD6750M that notebook is shipping with I don't think it would be terribly useful for CF (quite asymmetric already).

This is Vantage you are talking about here, and I dare say that looking at that to judge scaling for these ICs is dubious either way. I can't imagine that they will ever be able to produce decent playability with that kind of graphics code. To get reasonable playability with these parts, you will have to adjust settings to something sympathetic. Which, in a sense, is OK. Judging them on the basis of extremely heavy benchmarks constructed to justify sales of new hardware isn't meaningful. Can they do a decent job of WoW? Diablo 3?

These aren't e-penis parts, and shouldn't be measured using e-penis measurement tools.
Oh, agreed - but they are the only numbers so...
That said, scaling could be similar in games, even with lower settings. The cpu is a lot faster than the E350, so is the GPU, and memory bandwidth too.

Kaotik
27-May-2011, 16:54
http://www.ngohq.com/home.php?page=articles&go=read&arc_id=154

Was this slide deck posted already?

eastmen
27-May-2011, 19:44
http://forum.beyond3d.com/showthread.php?p=1554321#post1554321
posted it here .

swaaye
27-May-2011, 20:49
Cayman's Powertune limiter is essentially GPU Turbo, huh? It just spends most of its time at max clock because it's almost unrestricted in its power usage. Llano will be the second implementation of this then but with much more strict limits.

mczak
27-May-2011, 22:28
Cayman's Powertune limiter is essentially GPU Turbo, huh? It just spends most of its time at max clock because it's almost unrestricted in its power usage. Llano will be the second implementation of this then but with much more strict limits.
Cayman's Powertune isn't terribly efficient though since it can't (I believe) change voltage.
Though since Llano's gpu clock is a lot lower maybe it wouldn't matter there (the different clock states might have the same specified voltage, much closer to idle voltage than Cayman). Or it's improved to change gpu voltage too, much like the cpu.
Also I certainly would expect this to be coupled with cpu turbo, so I'm not sure it is really that similar to Powertune.

DavidC
28-May-2011, 00:33
There's also the cpu part of the benchmark to consider, for which I'm too lazy to look up any numbers.

3DMark Vantage Formula

http://www.extremetech.com/article2/0,2845,2289656,00.asp

3DMark Vantage on Brazos

http://www.tomshardware.com/reviews/amd-fusion-brazos-performance,2790-3.html

548 GPU
2038 CPU

= 670 3DMark Vantage score (Interesting, so they took the Tom's Hardware results for comparison)

Phenom II-based Turion II Neo K625 gets 2459 in 3DMark Vantage CPU at 1.5GHz frequency.

Score=1625
CPU=2459
GPU=??
Using formula = 1460 = 2.66x

Assuming linear scaling to 1.9GHz

Score=1625
CPU=3115
GPU=1402 = 2.56x
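The steps above can be reproduced with a short script. As a loudly stated assumption, it models the Vantage Performance score as a weighted harmonic mean with 0.75 weight on the GPU score and 0.25 on the CPU score; those weights are not stated in the post, but they reproduce its numbers to within a point. The function names are made up for illustration.

```python
# Assumed model: P-score = 1 / (w_gpu / GPU + w_cpu / CPU), with 0.75 / 0.25 weights.
W_GPU, W_CPU = 0.75, 0.25

def vantage_p_score(gpu, cpu):
    """Overall score from GPU and CPU sub-scores (weighted harmonic mean)."""
    return 1.0 / (W_GPU / gpu + W_CPU / cpu)

def gpu_from_total(total, cpu):
    """Invert the formula: GPU sub-score implied by a total and a CPU score."""
    return W_GPU / (1.0 / total - W_CPU / cpu)

# Brazos (E-350) sub-scores from the linked review.
brazos_gpu, brazos_cpu = 548, 2038
brazos_total = vantage_p_score(brazos_gpu, brazos_cpu)

# A4-3300M: total from AMD, CPU score borrowed from a 1.5 GHz Turion II Neo K625.
a4_total = 1625
cpu_1p5ghz = 2459
gpu_a = gpu_from_total(a4_total, cpu_1p5ghz)

# Same, with the CPU score scaled linearly to 1.9 GHz.
cpu_1p9ghz = cpu_1p5ghz * 1.9 / 1.5
gpu_b = gpu_from_total(a4_total, cpu_1p9ghz)

print(f"Brazos total:            {brazos_total:.1f}")
print(f"A4 GPU score (1.5 GHz):  {gpu_a:.0f}  ({gpu_a / brazos_gpu:.2f}x Brazos)")
print(f"A4 GPU score (1.9 GHz):  {gpu_b:.0f}  ({gpu_b / brazos_gpu:.2f}x Brazos)")
```

Because the GPU term carries three times the weight of the CPU term, even a 27% change in the assumed CPU score only moves the inferred GPU score by about 4%.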

AnarchX
31-May-2011, 19:14
http://www.anandtech.com/show/4386/a-quick-look-at-a-22nm-ivy-bridge-wafer

SNB 1/6 for GPU
IVB ~2/6 for GPU

swaaye
31-May-2011, 20:08
I guess that means Intel has an answer for Llano.

ToTTenTranz
31-May-2011, 20:09
http://www.anandtech.com/show/4386/a-quick-look-at-a-22nm-ivy-bridge-wafer

SNB 1/6 for GPU
IVB ~2/6 for GPU

I guess that means Intel has an answer for Llano.


Llano: 2/3 for GPU. Still a mile away, no?

fehu
31-May-2011, 20:30
I guess that means Intel has an answer for Llano.

"please don't hurt too much"

mczak
31-May-2011, 23:35
http://www.anandtech.com/show/4386/a-quick-look-at-a-22nm-ivy-bridge-wafer

SNB 1/6 for GPU
IVB ~2/6 for GPU

SNB looked more like 1/5 to me.
Also, that figure obviously counts L3 as non-GPU - you could just as well argue it belongs to the GPU in which case the gpu would be bigger than all cpu cores :-).
Really though I think if you want to look at cpu/gpu balance you should look at the parts which are exclusive to these parts - so don't count L3, MC/NB. In this case the GPU part looks like about 60% of the cores which isn't too shabby actually (that's just a gross estimate from a quad core SNB).
Also, I'm not too sure that 1/3 part there on IVB really is gpu anyway.

3dilettante
31-May-2011, 23:42
The architecture really needs to have a sufficient L3/core ratio. Each core needs a tile of L3 if it's to be worthwhile.
The GPU uses the L3 as well, though I have not seen an analysis of how much.

It is interesting that Intel is maintaining the very rectangular die shapes for these smaller dies. Perhaps it is the ever-expanding IO capability or some quirk of Intel's modular design.

Kaotik
31-May-2011, 23:54
According to this:
http://en.ocworkbench.com/tech/computex-taipei-2011-coverage-exclusive-performance-preview-of-amd-liano-vs-intel-sandy-bridge/

3DMark Vantage: Core i7 2600K can barely hold its own against A4-Llano
PCMark Vantage: A6- and A8-Llanos barely beat Core i3 2100

mczak
01-Jun-2011, 01:34
The architecture really needs to have a sufficient L3/core ratio. Each core needs a tile of L3 if it's to be worthwhile.

Oh, not disputing that. Just pointing out it's not quite correct if you are looking at cpu/gpu ratio and solely attributing the L3 cache to the cpu.


The GPU uses the L3 as well, though I have not seen an analysis of how much.

Not sure either, but I'm quite sure it's one of the reasons the SNB IGP is faster than the Arrandale IGP.


It is interesting that Intel is maintaining the very rectangular die shapes for these smaller dies. Perhaps it is the ever-expanding IO capability or some quirk of Intel's modular design.
I think modular design plays a role (just like with SNB). Makes it easy to add/remove cores without thinking too much and without wasting too much die area.

DavidC
02-Jun-2011, 14:20
Mczak, it looks like the recent Sandy Bridge graphics driver has improved throughput in the 3DMark06 batch size test at the smaller sizes. The improvement is at the 8/32/128 triangle sizes, and overall the gains are roughly 20%. What's the significance of this? There are quite a few DX games that gained performance with the new driver, and I'm thinking it's related to the batch performance improvement.

iwod
05-Jun-2011, 12:08
Mczak, it looks like the recent Sandy Bridge graphics driver has improved throughput in the 3DMark06 batch size test at the smaller sizes. The improvement is at the 8/32/128 triangle sizes, and overall the gains are roughly 20%. What's the significance of this? There are quite a few DX games that gained performance with the new driver, and I'm thinking it's related to the batch performance improvement.

A good thing Intel is finally improving their drivers. Hopefully within 12 months their drivers should be decent enough for some games.

DavidC
07-Jun-2011, 06:46
A good thing Intel is finally improving their drivers. Hopefully within 12 months their drivers should be decent enough for some games.

You are certainly exaggerating the playability problems of the Sandy Bridge IGP. That's also in no way related to the question.

Does anyone know what low triangle batch size performance has to do with overall performance? From the little info I could find, I couldn't get anything more than that it has to do with "optimized performance with larger sizes".

fellix
09-Jun-2011, 12:16
First AMD Llano "A8-3800" Pictures and Benchmarks Exposed (http://wccftech.com/2011/06/09/amd-llano-a83800-pictures-benchmarks-exposed-overclocked-54ghz-12v/)

CPU performance is quite sub-par. Both CB11 (multi-threaded) and SuperPI (single-threaded) tests are running slow on this setup, below what a typical Phenom II system can achieve.

Alexko
09-Jun-2011, 12:58
The benchmarks come from Coolaler's forum, but there's apparently a bug in the BIOS, explained in this post, taken from the thread from which the benchmarks originate: http://forum.coolaler.com/showpost.php?p=2875503&postcount=70

Dave Glue
09-Jun-2011, 19:51
Need more detail but I was actually quite impressed with nearly getting 30fps on RE5 at 1080p, and 50fps with SFIV at 1080p. That's fantastic for an integrated part, nothing comes close to it at the moment if those numbers are accurate with respect to integrated GPU performance.

ShaidarHaran
09-Jun-2011, 23:40
I'll echo the sentiments expressed thus far - the game benchmarks are quite impressive while the CPU-bound benchmarks are not. Guess we'll have to wait for the next generation hardware to see some real powerhouse results all around.

itsmydamnation
11-Jun-2011, 18:32
http://forums.tweaktown.com/overclocking/44694-apu-smashed-new-igp-world-records-gigabyte-a75m-ud2h-3dvantage-p6160-igp.html#post396358

Lightman
11-Jun-2011, 19:56
Excellent numbers!
To think this is an integrated GPU is just :shock:!

P4400 in Vantage is more than HD4670 on average :grin:

Harison
11-Jun-2011, 22:06
The first generation of APUs is more impressive than I expected: not only have they made low-end cards obsolete, but mid-range ones too. A few generations later only the most hardcore gamers will need discrete cards; everyone else will be gaming on APUs.

Unfortunately for Nvidia, the only mainstream gaming APUs in town are AMD's; Intel hardly counts. Of course NV will have ARM+GPU in Maxwell, but lacking x86 compatibility will make it a niche product. Perfect for tablets, hardly suitable for current PC gamers.

Alexko
12-Jun-2011, 00:19
The first generation of APUs is more impressive than I expected: not only have they made low-end cards obsolete, but mid-range ones too. A few generations later only the most hardcore gamers will need discrete cards; everyone else will be gaming on APUs.

Unfortunately for Nvidia, the only mainstream gaming APUs in town are AMD's; Intel hardly counts. Of course NV will have ARM+GPU in Maxwell, but lacking x86 compatibility will make it a niche product. Perfect for tablets, hardly suitable for current PC gamers.

I somehow doubt Maxwell will be anywhere near suitable for the tablet market: we're talking about an architecture that's bound to be very HPC-oriented, here.

But actually, it's fortunate for NVIDIA that Intel's integrated graphics blows: it means they can still sell mid-range graphics cards to Intel users.

Harison
12-Jun-2011, 01:28
I somehow doubt Maxwell will be anywhere near suitable for the tablet market: we're talking about an architecture that's bound to be very HPC-oriented, here.
It would make sense if Maxwell would be top to bottom, like AMD is doing (from 5W to 100W APUs). Although NV mentioned Maxwell with HPC mainly, so they may keep Tegra line separate, who knows.

rpg.314
12-Jun-2011, 08:11
It would make sense if Maxwell would be top to bottom, like AMD is doing (from 5W to 100W APUs). Although NV mentioned Maxwell with HPC mainly, so they may keep Tegra line separate, who knows.

Denver would certainly show up in tablets and smartphones.

Maxwell's derivative might show up in phones/tablets 2-3 years after its desktop cousin.

Alexko
12-Jun-2011, 11:37
It would make sense if Maxwell would be top to bottom, like AMD is doing (from 5W to 100W APUs). Although NV mentioned Maxwell with HPC mainly, so they may keep Tegra line separate, who knows.

That's what I expect, yes. Over time, Tegra may start to integrate a few features from NVIDIA's current (or recent) GPUs, but I wouldn't expect convergence any time soon.

Harison
12-Jun-2011, 11:47
Denver would certainly show up in tablets and smartphones.

Maxwell's derivative might show up in phones/tablets 2-3 years after its desktop cousin.
I agree, though it still depends on how efficient the new-gen NV GPU will be. They should have learned from the Fermi debacle, and if the new chips are very efficient they'll make it to tablets sooner rather than later. A unified approach would cut down R&D and time to market, and the recent focus on a "lego blocks" design structure should make scaling easier.

That said, is there a mainstream market for desktop Maxwell? AMD and Intel APUs have a vast x86 market; who really wants ARM on their PC? Win8 should help, but the lack of native apps (outside the mobile market) will make NV ARM adoption really slow. MS is promising emulation of x86 on WARM, but these CPUs are slow for anything more serious than office and internet (even speaking of A15), and emulation won't help with speed either. Sure, there will be a few enthusiasts, but no mass adoption, probably ever.

rpg.314
12-Jun-2011, 11:54
MS is promising emulation of x86 on WARM,

Actually, MS has promised otherwise.

Harison
12-Jun-2011, 12:29
Actually, MS has promised otherwise.
You're right. I remembered how Intel claimed Win8 won't run legacy software on ARM devices, and how Microsoft responded that Intel was wrong and misleading, but I missed the fine print where MS said legacy apps will run on x86, but not ARM :lol:

Scratch my point above about how slow NV ARM adoption will be in the mainstream PC market; it simply won't happen.

Arun
12-Jun-2011, 13:41
Sigh, I really post too much off-topic stuff these days but...
I somehow doubt Maxwell will be anywhere near suitable for the tablet market: we're talking about an architecture that's bound to be very HPC-oriented, here.
First, it is very important to dissociate the Maxwell GPU core and the Denver CPU core, which are both used in the first Maxwell-based chip. Not every Maxwell chip will necessarily use Denver. I'm not sure that's the way NVIDIA think about their naming scheme (which in the G9x/GT2xx generation one of their top engineers told someone I know he couldn't keep up with anyway), but I can't find any other way to dissociate the two clearly.

Maxwell's GPU and the system architecture of that first chip are very HPC-oriented, but the Project Denver CPU itself is nearly certainly not. Remember the idea is to run the FP-heavy stuff on the GPU, not the CPU. I'd be very surprised if we had more than a single 128-bit FMA here - which Cortex-A15 already has!

As for the GPU, AFAIK the next-generation Tegra GPU is only coming in Logan which is likely slated for late 2012/early 2013 tape-out on 28HPM with 2H13 end-product availability. That will also be the first Tegra with Cortex-A15, as the 2012 Wayne is much more incremental. So the timeframe for next-gen Tegra GPU and the Maxwell GPU is surprisingly not that different, but the former comes up earlier than the latter and is one process node behind.

So I think architectural convergence is very very unlikely, unless it is the Maxwell GPU itself that is a next-gen Tegra GPU derivative, which would be completely crazy but rather in line with Jen-Hsun's insistence that Tegra is the future of the company and that performance will be much more limited by perf/watt than perf/mm² (and already is).

As for ARM CPU adoption on PCs... I think there's a strong possibility that many notebooks will evolve towards also having a touchscreen over time. That makes Metro UI and the like more attractive, and significantly reduces the relative appeal of legacy application compatibility. But yeah, desktops? No way. Maybe hell has already frozen over now that Duke Nukem Forever is released, but there's no way desktops are ever switching to ARM. Maybe some niche 'desktop' functions like Windows HTPCs, but those are more likely to migrate towards ARM by moving away from Windows anyway.

Back on topic:
But actually, it's fortunate for NVIDIA that Intel's integrated graphics blows: it means they can still sell mid-range graphics cards to Intel users.
Yeah, 64-bit discrete GPUs are clearly a thing of the past though.

Llano is very impressive, but I wonder how bandwidth limited it really is, I really wish someone benchmarked it with different DDR3 module speeds. If it's very limited, then there may not be much room to grow before DDR4 becomes mainstream, or some other clever trick is used (silicon interposers as rumoured for Intel Haswell?)

ninelven
12-Jun-2011, 15:45
Scratch my point above about how slow NV ARM adoption will be in the mainstream PC market; it simply won't happen.

Oh I don't know about that, the mainstream (consumer) PC market is rarely concerned with legacy apps AFAICS. If ARM adoption gains traction in the notebook market, there isn't really a reason it can't be successful in the desktop market as well. Of course, there will always be users who simply must run legacy software...

Harison
12-Jun-2011, 16:46
Oh I don't know about that, the mainstream (consumer) PC market is rarely concerned with legacy apps AFAICS. If ARM adoption gains traction in the notebook market, there isn't really a reason it can't be successful in the desktop market as well. Of course, there will always be users who simply must run legacy software...
Netbooks, low-end notebooks, terminal PCs - sure, that's pretty much all of the PC market ARM can penetrate IMO. Users who use anything more serious than office and browsing simply won't bother. No apps (like 99.99% fewer than x86 has), no speed (ARM barely rivals Atom, which is terrible in itself), no decent games.

Even if consumer PC needs are simplistic and don't include running legacy apps, would you buy NV ARM if you can get faster and cheaper(?) Fusion chips with x86 compatibility? Most likely not. The corporate market won't bother either, as long as it is satisfied with Intel's solutions. Migrating to ARM is costly and time consuming, while AMD/Intel have strong products, therefore desktop PCs won't migrate to ARM, except for niche markets.

Florin
12-Jun-2011, 17:25
Between consoles, portable PCs, MIDs, cloud and VDI, desktop PCs themselves seem to be doing a pretty good job of becoming a niche market.

ninelven
12-Jun-2011, 17:27
Users who use anything more serious than office and browsing
Like what?

No apps.
Well duh, Windows 8 isn't even out yet.

Migrating to ARM is costly and time consuming
It is? Putting all of your files on a usb stick and copying them over, or installing dropbox on one's ARM PC is going to be costly and time consuming? Or if your files reside on a server to begin with, there really isn't any migrating issue at all (provided the software exists).

would you buy NV ARM if you can get faster and cheaper(?) Fusion chips with x86 compatibility?
Why would you think Fusion chips will be faster and cheaper? AMD's x86 performance certainly isn't Intel's...

except for niche markets
Umm... the desktop PC is quickly becoming a niche market itself, and I wouldn't exactly call the masses a niche but I suppose technically you could. So yes, ARM could be quite viable for the largest niche of the niche consumer desktop PC market.

Laptop sales are already greater than desktop sales and are continuing to extend their lead. In the mobile world, battery life is king... Ultimately, developers go where the money is, and the money is where the market share is.... Certainly applications that run on ARM notebooks will also run on ARM desktops just fine.

Blazkowicz
12-Jun-2011, 18:32
the ARM desktop can be useful for lowest cost, lowest power and resistance to harsh environments.
that's a small niche, taken by a small vendor using a 486-based SoC :

http://www.norhtec.com/products/index.html

that niche may grow, possibly a lot if 3rd world countries invest heavily in it, and with the increase in performance.

the biggest issue, and that's true for laptops as well, is if you have the option of getting an x86 version instead - if you make an ARM device similar enough to Atom, AMD bobcat, VIA in terms of cost and performances, why not go for x86.

15 years of compatibility with the odd or mainstream windows game, 5-year-old commercial software for creating gift cards, practising genealogy etc.
the customer does value those things.

Sinistar
12-Jun-2011, 19:17
I think you'll find that the people calling X86 a niche market did not do so until Nvidia had an ARM licence.
If you can find a post to the contrary, please do so.

ninelven
12-Jun-2011, 19:37
I hate being pedantic, but the constant misuse of this word is annoying.

niche - a distinct segment of a market.

Of course x86 is a niche of the market... niche != small.

Npl
12-Jun-2011, 20:01
First (?) AMD Llano A8-3800 Review (http://www.coolaler.com/showthread.php?t=266878), H.A.W.K and ResEvil 5 playable at 1080p, seems the GPU should suffice for a broad range of gaming if you can accept some compromises in "Bling". This is a very important step IMHO, certainly what I'm gonna recommend for whoever doesn't care about bleeding edge graphics (pretty much all my relatives and most friends)!

Only thing I would like to know is how it stacks up in power draw compared to Sandy Bridge and current Athlons.

Alexko
12-Jun-2011, 20:02
Sigh, I really post too much off-topic stuff these days but...
First, it is very important to dissociate the Maxwell GPU core and the Denver CPU core, which are both used in the first Maxwell-based chip. Not every Maxwell chip will necessarily use Denver. I'm not sure that's the way NVIDIA think about their naming scheme (which in the G9x/GT2xx generation one of their top engineers told someone I know he couldn't keep up with anyway), but I can't find any other way to dissociate the two clearly.

Maxwell's GPU and the system architecture of that first chip are very HPC-oriented, but the Project Denver CPU itself is nearly certainly not. Remember the idea is to run the FP-heavy stuff on the GPU, not the CPU. I'd be very surprised if we had more than a single 128-bit FMA here - which Cortex-A15 already has!

Agreed. I'd expect Denver to be slightly more HPC-oriented than Cortex-A15—otherwise, what's the point?—but not a computing beast by any means. Besides, going after Intel's Rockwell or AMD's Bulldozer 2 in a first attempt at a CPU would be foolish.

As for the GPU, AFAIK the next-generation Tegra GPU is only coming in Logan which is likely slated for late 2012/early 2013 tape-out on 28HPM with 2H13 end-product availability. That will also be the first Tegra with Cortex-A15, as the 2012 Wayne is much more incremental. So the timeframe for next-gen Tegra GPU and the Maxwell GPU is surprisingly not that different, but the former comes up earlier than the latter and is one process node behind.

So I think architectural convergence is very very unlikely, unless it is the Maxwell GPU itself that is a next-gen Tegra GPU derivative, which would be completely crazy but rather in line with Jen-Hsun's insistence that Tegra is the future of the company and that performance will be much more limited by perf/watt than perf/mm² (and already is).

Agreed again. Plus, Maxwell is pretty much bound to move further towards HPC, which really doesn't make sense for the embedded world where every mW counts and you don't really see any need for high-performance floating point computing at all. Putting a Maxwell derivative in Tegra would doom it, considering the power consumption penalty it would generate, vs. the leaner designs that TI, Qualcomm and Samsung will be offering at that time.

As for ARM CPU adoption on PCs... I think there's a strong possibility that many notebooks will evolve towards also having a touchscreen over time. That makes Metro UI and the like more attractive, and significantly reduces the relative appeal of legacy application compatibility. But yeah, desktops? No way. Maybe hell has already frozen over now that Duke Nukem Forever is released, but there's no way desktops are ever switching to ARM. Maybe some niche 'desktop' functions like Windows HTPCs, but that's more likely to migrate towards ARM by moving away from Windows anyway.


Having fingerprints all over my notebook screen would drive me insane, but maybe that's just me. I'll admit to being slightly neurotic about such things.


Back on topic:
Yeah, 64-bit discrete GPUs are clearly a thing of the past though.

And I shall shed no tears over their demise.


Llano is very impressive, but I wonder how bandwidth limited it really is, I really wish someone benchmarked it with different DDR3 module speeds. If it's very limited, then there may not be much room to grow before DDR4 becomes mainstream, or some other clever trick is used (silicon interposers as rumoured for Intel Haswell?)

According to the first leaked benchmarks, it seems to scale pretty well with overclocking—though maybe they overclocked the RAM as well?—so I'd say there's still some margin for improvement.
Further, AMD has yet to give the GPU access to any kind of shared, last-level cache the way Intel does in Sandy-Bridge. Sure, Llano seems to be doing fine without it, but in the future it's one possible way to mitigate the need for higher memory bandwidth. At this point I would like to mention eDRAM and T-RAM, acknowledging that we've been talking about those (and the now defunct Z-RAM) for a while and that so far, only IBM has used any of them. Still, it might happen.

Finally, as AMD integrates Bulldozer cores into their APUs, it will become possible for them to offer relatively high-end APUs with powerful CPU and GPU cores, possibly justifying the addition of a third memory channel, perhaps on a second platform that would be shared with very high-end CPUs lacking any kind of integrated graphics, leaving the low-end and mid-range APUs to use a more standard and cheaper 2-channel platform. After all, if Intel did it with Nehalem, it's not entirely unreasonable. And obviously, none of these options are mutually exclusive.
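
For a rough sense of what a third channel would buy, here is a minimal sketch of theoretical peak DDR3 bandwidth. It assumes standard 64-bit DDR3 channels and the ~1.6 Gbps per-pin rate quoted at the top of the thread; the function name and the list of speed grades are just illustrative choices, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s (hypothetical helper).

    Assumes each channel moves bus_width_bits per transfer, which is
    64 bits for a standard DDR3 channel.
    """
    bytes_per_transfer = bus_width_bits / 8
    return channels * bytes_per_transfer * mega_transfers_per_s / 1000


# Compare a 2-channel platform against a hypothetical 3-channel one
# across a few common DDR3 speed grades (illustrative selection).
for speed in (1333, 1600, 1866):
    dual = peak_bandwidth_gbs(2, speed)
    triple = peak_bandwidth_gbs(3, speed)
    print(f"DDR3-{speed}: 2ch {dual:.1f} GB/s, 3ch {triple:.1f} GB/s")
```

At DDR3-1600 this works out to 25.6 GB/s for two channels and 38.4 GB/s for three, so a third channel is a flat 50% increase in peak bandwidth at the same module speed.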

Harison
12-Jun-2011, 20:44
Like what?
Gaming, flash, image/video/audio editing (current GPU acceleration isn't enough, nor are there decent apps on ARM), even for office needs some have way bigger demands than ARMs can meet anytime soon, like complex Excel jobs, etc.


It is? Putting all of your files on a usb stick and copying them over, or installing dropbox on one's ARM PC is going to be costly and time consuming? Or if your files reside on a server to begin with, there really isn't any migrating issue at all (provided the software exists).
No, I mean corporations have complex business apps tailored to their needs; migrating to a new platform costs a lot, and it's time consuming. Can ARMs even meet the performance they demand? In a lot of cases, no.


Why would you think Fusion chips will be faster and cheaper? AMD's x86 performance certainly isn't Intels...
x86 CPUs in general are faster than ARMs, which are something like 7 years behind. ARMs can hope to beat the slowest APUs, but there will always be faster APUs to choose from, and knowing NV's tendency to charge for its chips, we'll see how competitive they'll be. NV might just focus with Maxwell on HPC market.


Umm... the desktop PC is quickly becoming a niche market itself, and I wouldn't exactly call the masses a niche but I suppose technically you could. So yes, ARM could be quite viable for the largest niche of the niche consumer desktop PC market.
We'll see, I don't think we'll see any serious inroads of NV ARMs to mainstream desktops, ever.

ninelven
12-Jun-2011, 21:19
Gaming, flash, image/video/audio editing (current GPU acceleration isn't enough, nor are there decent apps on ARM), even for office needs some have way bigger demands than ARMs can meet anytime soon, like complex Excel jobs, etc.

Gaming is a very complex subject in this regard, and I will only say it is very unclear (at least to me) where exactly it is going. Still, I wouldn't be terribly surprised to see Denver making an appearance in one of the next-gen consoles.

Flash - I don't really see what reasonable usage scenario could possibly overtax a Denver+GPU combination here.

Image/video/audio editing - you certainly have a point here with these niche markets; however, if Nvidia is good at one thing, it is dev-rel and helping with custom solutions.

Office needs - can't we at least wait until Denver is out and benchmarked before declaring it unsuitable for such needs?

NV might just focus with Maxwell on HPC market.
I'd say it is fairly clear HPC is Nvidia's primary target with Maxwell, but I doubt they will pass up any opportunity for additional sales when the added investment is relatively minimal.

x86 CPUs in general are faster than ARMs
And ARM chips have historically focused primarily on performance/watt with low power consumption; I don't think anyone is unaware that a CPU designed to consume 1W is going to be slower than one designed to consume 65W.

We'll see, I don't think we'll see any serious inroads of NV ARMs to mainstream desktops, ever.
I don't see any reason we won't. If HP or Dell offer Maxwell derivatives as affordable Windows 8 boxes, I seriously doubt the masses will even be aware that they have purchased an ARM machine.

swaaye
12-Jun-2011, 22:42
This is a very important step IMHO, certainly what I'm gonna recommend for whoever doesn't care about bleeding edge graphics (pretty much all my relatives and most friends)!
I'd only do so if the products it comes in have other advantages. The GPU is going to have non-existent added benefit for a lot of people because they only need a GPU for the GUI and anything can handle that (along with HD video). I would still seriously consider Core ix products because the CPU is faster and the power usage is likely to end up lower.

I think Llano might make for some interesting ultra budget gaming notebooks. The catch is that there are already nice notebooks with discrete GPUs for $600-700 and they are faster than Llano. You can find machines with Mobility 66x0/56x0 and a Core i5 for example.

entity279
12-Jun-2011, 22:45
First (?) AMD Llano A8-3800 Review (http://www.coolaler.com/showthread.php?t=266878), H.A.W.K and ResEvil 5 playable at 1080p, seems the GPU should suffice for a broad range of gaming if you can accept some compromises in "Bling".

I suppose there is a sane explanation for that 5.499.2 Mhz clock there :roll:

Npl
12-Jun-2011, 23:05
I'd only do so if the products it comes in have other advantages. The GPU is going to have non-existent added benefit for a lot of people because they only need a GPU for the GUI and anything can handle that (along with HD video). I would still seriously consider Core ix products because the CPU is faster and the power usage is likely to end up lower.

I think Llano might make for some interesting ultra budget gaming notebooks. The catch is that there are already nice notebooks with discrete GPUs for $600-700 and they are faster than Llano. You can find machines with Mobility 66x0/56x0 and a Core i5 for example.
Thing is, I don't think many people would need a faster CPU (most of the persons I'm talking about won't notice a difference whether they sit in front of a Core i7 or an Athlon X2), but the IGPs certainly hinder them for light gaming or apps like Google Earth. Hell, I play a couple of games and most would be fine with a modest dual-core...

entity279: BIOS and/or CPU issues; multipliers don't work as expected. The 5.5GHz results are supposedly from ~3GHz.

Dave Baumann
13-Jun-2011, 03:12
I'd only do so if the products it comes in have other advantages. The GPU is going to have non-existent added benefit for a lot of people because they only need a GPU for the GUI and anything can handle that (along with HD video). I would still seriously consider Core ix products because the CPU is faster and the power usage is likely to end up lower.
So for a notebook type usage you'd be looking at its idle power characteristics?

Plus, for things like web browsing and the like, the efficiency of the GPU is becoming increasingly important.

liem107
13-Jun-2011, 07:05
For the vast majority of notebook users, GPU gaming power would come far down the list of needs. Of course it is still a bonus but generally not a necessity. What comes first is general usage speed, and of course a good GPU can provide a smoother experience for anything related to video. What is a bit annoying with the K10.5 generation is that it is generally significantly slower than SB in most tasks. If AMD really wants to close the gap with Llano, they really have to compensate for the weakness of the CPU by pushing hard on the APU concept. EVERYTHING should leverage the APU concept... of course photo and video editing come first to mind, but why not push harder and use APU acceleration for more common office work? An APU-accelerated LibreOffice, for example, with accelerated presentation/spreadsheet/database could make Llano significantly faster than SB if such acceleration were feasible... this would push Llano forward in enterprise and professional laptops.

Alexko
13-Jun-2011, 12:21
For the vast majority of notebook users, GPU gaming power would come far down the list of needs. Of course it is still a bonus but generally not a necessity. What comes first is general usage speed, and of course a good GPU can provide a smoother experience for anything related to video. What is a bit annoying with the K10.5 generation is that it is generally significantly slower than SB in most tasks. If AMD really wants to close the gap with Llano, they really have to compensate for the weakness of the CPU by pushing hard on the APU concept. EVERYTHING should leverage the APU concept... of course photo and video editing come first to mind, but why not push harder and use APU acceleration for more common office work? An APU-accelerated LibreOffice, for example, with accelerated presentation/spreadsheet/database could make Llano significantly faster than SB if such acceleration were feasible... this would push Llano forward in enterprise and professional laptops.

Perhaps there's a bit of a "If we build it, they will come" argument to be made, here. Gaming wasn't a priority for most people buying notebooks because gaming came with a steep price premium, and often a significant impact on battery life. If Llano removes those two issues, then maybe people will be glad that their notebooks allow them to play recent games with decent settings.

As for the CPU, I use a Core 2 Duo on a daily basis (granted, at 3 GHz). I don't expect Llano to be significantly slower clock for clock, and I can't say that I ever find myself lamenting my CPU's "poor" performance.

Lightman
13-Jun-2011, 12:58
From the numbers I've seen so far AMD improved IPC of Llano cores to match or exceed Phenom II ones. This is pretty nice considering Llano cores lost 6MB of L3 cache and gained 512KB of L2 cache per core.

When I compared 3DMark CPU scores with ORB, Llano clocked @2.9GHz was by some margin quicker than Q8200 @2.9GHz.

Quick list:
Phenom 9950 at 3.0GHz/NB2.27GHz/RAM DDR2 1100MHz CL5 - CPU 9783
Llano A3850 stock - CPU 10038
Phenom II 940 at 3655MHz/NB2365MHz/RAM DDR2 1146MHz CL5 - CPU 12389 (assuming linear scaling this Phenom II with tweaked NB and RAM would score 9829 at 2.9GHz)
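
The linear-scaling estimate above can be reproduced with a minimal sketch. It assumes, as the post does, that the 3DMark CPU score scales in direct proportion to core clock and ignores the tweaked NB and RAM clocks; the function name is a hypothetical helper, and the scores are the ones listed above.

```python
def scale_score_linearly(score, measured_mhz, target_mhz):
    """Estimate a benchmark score at target_mhz.

    Assumption: the score is directly proportional to core clock,
    which is only a rough approximation for real CPUs.
    """
    return score * target_mhz / measured_mhz


# Phenom II 940 measured at 3655MHz, rescaled to Llano's 2.9GHz.
phenom_ii_at_2900 = scale_score_linearly(12389, 3655, 2900)

# Llano A3850 stock score from the list above.
llano_stock = 10038
llano_lead_percent = (llano_stock / phenom_ii_at_2900 - 1) * 100

print(f"Phenom II estimate at 2.9GHz: {phenom_ii_at_2900:.0f}")
print(f"Llano lead at the same clock: {llano_lead_percent:.1f}%")
```

Under that assumption the Phenom II lands just under 9830, which puts stock Llano roughly 2% ahead clock for clock.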

Blazkowicz
13-Jun-2011, 13:10
for most people, even those who use their computer for a living, gaming is the main application where you need a CPU as fast as possible.
calculating a few hundred or a few million numbers in a spreadsheet is small change.

ToTTenTranz
13-Jun-2011, 14:21
I'd only do so if the products it comes in have other advantages. The GPU is going to have non-existent added benefit for a lot of people because they only need a GPU for the GUI and anything can handle that (along with HD video). I would still seriously consider Core ix products because the CPU is faster and the power usage is likely to end up lower.

I think Llano might make for some interesting ultra budget gaming notebooks. The catch is that there are already nice notebooks with discrete GPUs for $600-700 and they are faster than Llano. You can find machines with Mobility 66x0/56x0 and a Core i5 for example.


As seen by that leaked HP DV6, $600-700 Llano notebooks are also coming with discrete graphics cards within the Mobility 66xx\56xx range, allowing crossfire between the IGP and the discrete GPU.

Despite the differences in core frequency, ALU count and memory bandwidth, we may be talking about a gaming performance increase of 50-70% between an i5 + HD6650 and an A8 + HD6650.

And as AMD doesn't have anything to compete with nVidia's Optimus for Intel yet, I'd say a Llano with the discrete GPU turned off should be more power efficient than an i5+HD66xx.




I'm betting Llano for notebooks will be quite disruptive in gaming performance for that budget market. It remains to be seen if AMD can make the APP program justify the purchase for other computing tasks, though.

CarstenS
13-Jun-2011, 16:01
Haven't seen this link posted in this thread:
http://blogs.amd.com/fusion/2011/06/09/amd-platform-innovations/

There's some perf numbers in there, too.

Lightman
13-Jun-2011, 16:47
Haven't seen this link posted in this thread:
http://blogs.amd.com/fusion/2011/06/09/amd-platform-innovations/

There's some perf numbers in there, too.

Saw that the other day.
Interesting that battery life is compared to an Intel system with the i3-2310M, which is just a dual-core CPU with 35W TDP and a 2.1GHz clock.
A8-3500M is 4 core 1.5GHz/2.4GHz and same 35W TDP.

3DMark numbers are very good for an integrated laptop solution, and that is with a low-clocked 4 core APU where we all know 3DMark 06 loves fast cores.

I'm afraid when A8 laptop models hit the shops I will break and buy one for me! Dual graphics models welcome :wink:

ToTTenTranz
13-Jun-2011, 17:33
Wow, AMD is really pumping up the expectations in there.

Isn't this kind of "we're outdoing ourselves" propaganda a bit dangerous, in case all the expectations aren't met?

All it needs is an OEM deciding to cut costs on the batteries' capacity (as many have done between AMD and Intel models so far) and this may go really wrong..

Alexko
13-Jun-2011, 17:38
Wow, AMD is really pumping up the expectations in there.

Isn't this kind of "we're outdoing ourselves" propaganda a bit dangerous, in case all the expectations aren't met?

All it needs is an OEM deciding to cut costs on the batteries' capacity (as many have done between AMD and Intel models so far) and this may go really wrong..

Maybe there are clauses in the contracts to prevent such things? In any case, it's supposed to be released tomorrow, so we'll know just about everything soon enough.

ToTTenTranz
13-Jun-2011, 17:51
Maybe there are clauses in the contracts to prevent such things? In any case, it's supposed to be released tomorrow, so we'll know just about everything soon enough.

Let's hope AMD made that kind of pre-arrangement, or they might get their success heavily sabotaged by the OEMs' greediness (as we've seen with the Congo\Nile vs. C2D ULV competition).

swaaye
13-Jun-2011, 19:11
I had a 12" HP dv2z awhile back that had a Turion Neo X2 1.6 and a discrete 3450. It was slow, hot when running games and could only manage about 3 hours battery. A 3450 has a hard time with some games from 2002. I do wonder if Llano could be used in something like that with better results. Brazos is slower than a Turion X2 so don't want that...

Anyway yeah tomorrow should be interesting. I am most curious about power usage of this new 32nm CPU + GPU.

ToTTenTranz
13-Jun-2011, 19:23
I had a 12" HP dv2z awhile back that had a Turion Neo X2 1.6 and a discrete 3450. It was slow, hot when running games and could only manage about 3 hours battery. A 3450 has a hard time with some games from 2002. I do wonder if Llano could be used in something like that with better results. Brazos is slower than a Turion X2 so don't want that...

Anyway yeah tomorrow should be interesting. I am most curious about power usage of this new 32nm CPU + GPU.


Newer Zacate UMPCs with Turbocore enabled and 1333MHz DDR3 (E-450) may come awfully close to that Turion X2 @ 1.6GHz. Plus, the integrated 80sp Cedar should be a lot faster than a 40sp RV620.

Mobile Llano's bottom power threshold is 35W, so I doubt they'll be coming to <13" models, unless it's something as thick\loud as an Alienware M11.

Alexko
13-Jun-2011, 19:32
Newer Zacate UMPCs with Turbocore enabled and 1333MHz DDR3 (E-450) may come awfully close to that Turion X2 @ 1.6GHz. Plus, the integrated 80sp Cedar should be a lot faster than a 40sp RV620.

Mobile Llano's bottom power threshold is 35W, so I doubt they'll be coming to <13" models, unless it's something as thick\loud as an Alienware M11.

Yeah but I honestly don't see why Llano couldn't come down to 25W at some point in the nearish future. Of course, that would be in dual-core, 160-SP form, but that's enough for this kind of market.

swaaye
13-Jun-2011, 20:35
M3410 is ~8W and Turion Neo X2 is ~18W. So you have ~26W there. There was also a 690G in that subnote but with the IGP disabled.

BTW, 34x0 was awful. Can't accelerate flash. 3D is like a Radeon 9600. Bleh. ;)

Blazkowicz
13-Jun-2011, 21:03
maybe you were trying to run the wrong things, rtcw and warcraft III should run like a charm.

swaaye
13-Jun-2011, 21:13
3450 can run some games ok. It will run UT3 alright at reduced settings. But then you try KOTOR and it's not entirely smooth. It really reminded me of a Radeon 9600 with support for more shader features.

The inability to accelerate flash is really a problem though when you are on a system with a wimpy CPU. It can accelerate H.264 and VC-1 but its UVD lacks something to allow flash accel.