Llano IGP vs SNB IGP vs IVB IGP

I personally couldn't care less about crossfiring gimpy GPUs (or CrossFire in general, for that matter). There are plenty of such results at AnandTech, though.

"Gimpy"?!
We're talking about gaming performance on a 700€ 14" 2 kg laptop that comes awfully close to 1600€ 17" 4.5 kg laptops!


Thank god you're not even marginally representative of budget-conscious gamers..
 
Yup, the Tom's Hardware review shows the Phenom II laying some smackdown; Llano doesn't always keep up with the 1.5 GHz Phenom II. The lack of shared L3 is probably hurting it in highly threaded apps, à la Athlon II. Turbo looks very weak. A gimpy CPU with a decent GPU; ATI saves the day.
Doesn't look to me like the lack of L3 makes that much difference (except in ABBYY FineReader); it's just that Turbo seems to fail to clock up even a tiny bit under multithreaded loads, and never reaches anything close to the maximum even with single-threaded loads.
I think skipping the L3 cache makes sense for now. Just look at the Athlon II and Phenom II parts on the desktop: the L3 usually doesn't do all that much except in games that are CPU-bound with high-end graphics cards, which is a non-issue for notebooks and not much of an issue for lower-end desktops either.
Even with an L3 cache (I guess you'd really need at least 4 MB, allowing the L2 to be reduced to 512 KB, for it to make any positive impact), the CPU would still be nowhere close to Sandy Bridge, and the die is already huge compared to dual-core Sandy Bridge parts.
An L3 cache makes a LOT of sense if you make it available to the IGP too, but I think AMD tried to stick with the "old" graphics (and CPU) cores without a serious redesign (after all, it's their first 32nm chip). Trinity should show what this can do, at least I'd hope so, given that Intel already does it (and Ivy Bridge could potentially come close to Llano in IGP performance).
 
Thank god you're not even marginally representative of budget-conscious gamers..

Lol, yeah, I guess it's a good thing I'm not. This CPU is a dog compared to a lot of budget Intel notebooks with discrete graphics. Strategy gamers beware, probably.
 
Doesn't look to me like the lack of L3 makes that much difference (except in ABBYY FineReader); it's just that Turbo seems to fail to clock up even a tiny bit under multithreaded loads, and never reaches anything close to the maximum even with single-threaded loads.
Yeah, you're right that L3 would have cost a lot for little gain. Turbo seems to be either very limited or bugged.
 
Regarding DDR3 speed, 1066 and 1333 are common in notebooks, and that's where this chip is most interesting, I think.
Still a poor excuse for using DDR3-1333 even on the desktop. Maybe we'll see DDR3-1600 a bit more in notebooks now, though? Let's face it, up to now there just wasn't much incentive to use faster memory in notebooks: Intel didn't support it at all, and I'm not sure about AMD, but it would have been totally pointless anyway. The (presumably pretty small) power-draw disadvantage should be well worth the faster graphics you get from DDR3-1600, I think. Though I'm wondering why it only supports low-voltage DDR3-1333 and not the low-voltage DDR3-1600 variant; then again, few notebooks use low-voltage DIMMs today.
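For reference, here's the raw dual-channel arithmetic (standard peak rates; my own back-of-the-envelope numbers, not from the thread):

```python
# Peak bandwidth of a dual-channel (128-bit) DDR3 interface:
# transfer rate (MT/s) * 8 bytes per transfer * 2 channels.
for rate in (1066, 1333, 1600):
    print(f"DDR3-{rate}: {rate * 8 * 2 / 1000:.1f} GB/s")
# DDR3-1066: 17.1 GB/s, DDR3-1333: 21.3 GB/s, DDR3-1600: 25.6 GB/s,
# so going from 1333 to 1600 is a ~20% bandwidth bump for the IGP.
```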
 
An L3 cache makes a LOT of sense if you make it available to the IGP too, but I think AMD tried to stick with the "old" graphics (and CPU) cores without a serious redesign (after all, it's their first 32nm chip). Trinity should show what this can do, at least I'd hope so, given that Intel already does it (and Ivy Bridge could potentially come close to Llano in IGP performance).

And that classic shared-memory architecture worked better.
Sandy Bridge has this little problem where performance tanks if all CPU threads are fully loaded, while AMD keeps the GPU usable.
Once again, Intel is the high-revving, high-tech small gasoline engine, and AMD is the clunky turbodiesel that falls short on power but is steady when hauling big loads, and is quite advanced in its own right too. :devilish:
 
And that classic shared-memory architecture worked better.
Sandy Bridge has this little problem where performance tanks if all CPU threads are fully loaded, while AMD keeps the GPU usable.
Once again, Intel is the high-revving, high-tech small gasoline engine, and AMD is the clunky turbodiesel that falls short on power but is steady when hauling big loads, and is quite advanced in its own right too. :devilish:
I've seen those benches, but there was no proof it was due to the L3 sharing. Even if it was, there are ways to solve it (reserving portions of the L3 cache for either the CPU cores or the GPU, for instance). In any case, I expect sharing the L3 between the CPU and GPU to be inevitable for the next generation; the advantages are just too big.

BTW, Anand has updated his desktop benches with DDR3-1866:
http://www.anandtech.com/show/4448/amd-llano-desktop-performance-preview/3
That gets things up to roughly GT 430 performance...
So for 40% more memory bandwidth, performance improved by 20-30% (though mostly closer to 20%). Definitely better than the GDDR5 HD 6450 now (which is unavailable anyway, but that's a different topic).
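That 40% figure checks out against the standard rates (my arithmetic, assuming the same dual-channel setup as above):

```python
# DDR3-1333 -> DDR3-1866 at the same bus width scales purely with transfer rate.
bw_gain = 1866 / 1333 - 1
print(f"Bandwidth gain: {bw_gain:.0%}")  # ~40%, vs. the observed 20-30% perf gain
```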
 
Yeah, faster RAM gives it a nice boost. Still, being below an HD 5570 is somewhat sobering in the grand scheme of things. I want to see the 35W Llano put to good use in a smaller form factor where a discrete setup is less than ideal.
 
Lol, yeah, I guess it's a good thing I'm not. This CPU is a dog compared to a lot of budget Intel notebooks with discrete graphics. Strategy gamers beware, probably.
If we're talking gaming performance, I think the 4-core Llano is actually quite well balanced for pairing with a graphics system whose processing power is equivalent to a downclocked desktop HD 5750.
Just look at the 3DMark 11 results in the AnandTech review. The measly 1.5 GHz quad-core + HD 6690G2 (fGPU + HD 6630M) is nipping at the heels of Intel's mightiest laptop Sandy Bridge + a dedicated 675 MHz GF106. The latter are top-end, big-sized gaming laptops with ~2.5 hours of battery life, not mid-sized budget notebooks with 6+ hours.




BTW, Anand has updated his desktop benches with DDR3-1866:
http://www.anandtech.com/show/4448/amd-llano-desktop-performance-preview/3
That gets things up to roughly GT 430 performance...
So for 40% more memory bandwidth, performance improved by 20-30% (though mostly closer to 20%). Definitely better than the GDDR5 HD 6450 now (which is unavailable anyway, but that's a different topic).

Wow, that's by far the largest jump in overall performance per unit of memory bandwidth I've ever seen!

I guess the memory controller tweaks work no miracles, and the Fusion APUs are, in fact, starved for bandwidth.
The jump to DDR3-1333 for Brazos UMPCs should give them a large performance bump too. I've seen benchmarks where the CPU gets much higher scores when the fGPU is turned off.
 
If we're talking gaming performance, I think the 4-core Llano is actually quite well balanced for pairing with a graphics system whose processing power is equivalent to a downclocked desktop HD 5750.
Just look at the 3DMark 11 results in the AnandTech review. The measly 1.5 GHz quad-core + HD 6690G2 (fGPU + HD 6630M) is nipping at the heels of Intel's mightiest laptop Sandy Bridge + a dedicated 675 MHz GF106. The latter are top-end, big-sized gaming laptops with ~2.5 hours of battery life, not mid-sized budget notebooks with 6+ hours.
Of course, you have to remember that multi-GPU setups come with issues in some games and require profiles to perform well. Sometimes it simply doesn't benefit a particular game at all. That's why I tend to ignore CrossFire and SLI.
 
Just look at the 3DMark 11 results in the AnandTech review. The measly 1.5 GHz quad-core + HD 6690G2 (fGPU + HD 6630M) is nipping at the heels of Intel's mightiest laptop Sandy Bridge + a dedicated 675 MHz GF106. The latter are top-end, big-sized gaming laptops with ~2.5 hours of battery life, not mid-sized budget notebooks with 6+ hours.
I tend to think that's about the best scaling you'll ever see, though (and it's still 20% off that GTX 460M, btw).
I think something like the 3DMark Vantage result is probably more representative of what you'll get with asymmetric CF. That means it will beat the GT 540M, but it's probably not even enough to beat the top-end mobile Turks alone (the HD 6770M; still the same graphics chip...). Of course you could use CF on that too, but I doubt it's worth it. Now, if you could run OpenCL physics on the IGP and graphics on the discrete GPU, that would probably be more interesting (see the sketch below).
And really, asymmetric CF is useless right now; so far it hasn't worked in anything but 3DMark. I'm sure that will get fixed, but I'm not even sure AMD wants to extend support to D3D9.
I think for that level of performance you'd be much better off with something like a discrete HD 6770M (switchable, of course, for battery runtime), without the headaches of asymmetric CF.
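Splitting work that way is already expressible in OpenCL today. Here's a minimal sketch using pyopencl; the idea of using the host-unified-memory flag to tell the APU's fGPU apart from a discrete part is my own assumption, not an AMD-documented recipe:

```python
import pyopencl as cl

# Enumerate every OpenCL GPU in the system.
gpus = [dev
        for platform in cl.get_platforms()
        for dev in platform.get_devices(device_type=cl.device_type.GPU)]

# Heuristic: an IGP shares memory with the host, so pick the device that
# reports host-unified memory for compute (physics) and leave any other
# GPU free for rendering.
igp = next((d for d in gpus if d.host_unified_memory), None)
if igp is not None:
    ctx = cl.Context(devices=[igp])
    queue = cl.CommandQueue(ctx)
    print("Running physics kernels on:", igp.name)
```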

Wow, that's by far the largest jump in overall performance per unit of memory bandwidth I've ever seen!
Not even close; there are plenty of HD 5570 cards out there with, get this, DDR2 memory!
I bet those show a larger performance increase (in percent) per memory bandwidth increase (starting from a ~10% faster GPU with 33% less memory bandwidth)...

I guess the memory controller tweaks work no miracles, and the Fusion APUs are, in fact, starved for bandwidth.
I don't really see any improvement in that area; the better prefetching should be specific to the CPU cores. For the GPU, I don't think it makes much of a difference; it will likely just have slightly higher latency than Redwood's memory controller had (likely, since it's more complex).
The jump to DDR3-1333 for Brazos UMPCs should give them a large performance bump too. I've seen benchmarks where the CPU gets much higher scores when the fGPU is turned off.
Yes, Brazos should be equally bandwidth-limited. Strange, though, that the CPU would get higher scores with the GPU turned off; an idle GPU shouldn't need much bandwidth (running full tilt, of course, it might have a scheme similar to Llano's, giving GPU accesses higher priority).
 
I've seen those benches, but there was no proof it was due to the L3 sharing. Even if it was, there are ways to solve it (reserving portions of the L3 cache for either the CPU cores or the GPU, for instance). In any case, I expect sharing the L3 between the CPU and GPU to be inevitable for the next generation; the advantages are just too big.

BTW, Anand has updated his desktop benches with DDR3-1866:
http://www.anandtech.com/show/4448/amd-llano-desktop-performance-preview/3
That gets things up to roughly GT 430 performance...
So for 40% more memory bandwidth, performance improved by 20-30% (though mostly closer to 20%). Definitely better than the GDDR5 HD 6450 now (which is unavailable anyway, but that's a different topic).

Now that's more like it! That's definitely enough for casual gaming, but it does raise the very interesting question (which we touched on a couple of days ago) of future scaling: how do you deal with bandwidth starvation?
 
Now that's more like it! That's definitely enough for casual gaming, but it does raise the very interesting question (which we touched on a couple of days ago) of future scaling: how do you deal with bandwidth starvation?

I think Trinity will have 3 memory channels. Roadmaps are hinting at exactly that, but they're not 100% confirmed.

An extra channel, assuming DDR3-1600, would add another 12.8 GB/s and bring the total to 38.4 GB/s. There is also the possibility of making the CPU's L3 cache available to the GPU to ease memory requirements a bit; of course, we don't know yet whether AMD is going to implement that in the next-gen APU.
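The arithmetic behind those numbers, assuming the standard 8 bytes per transfer per 64-bit channel:

```python
# Per-channel DDR3-1600 bandwidth: 1600 MT/s * 8 bytes per transfer.
per_channel = 1600 * 8 / 1000           # 12.8 GB/s
print(f"Dual channel:   {2 * per_channel:.1f} GB/s")  # 25.6 GB/s
print(f"Triple channel: {3 * per_channel:.1f} GB/s")  # 38.4 GB/s
```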
 
So for 40% more memory bandwidth, performance improved by 20-30% (though mostly closer to 20%). Definitely better than the GDDR5 HD 6450 now (which is unavailable anyway, but that's a different topic).

Crysis Warhead 1280: ~25%
Crysis Warhead 1024: ~20%
Metro 2033 1280: ~28%
Metro 2033 1024: ~29%
Dirt2 1280: ~19%
Dirt2 1024: ~22%
Mass Effect 2 1280: ~25%
Mass Effect 2 1024: ~15%
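Taking the percentages above at face value and dividing by the ~40% bandwidth increase gives a rough per-title scaling efficiency (my own summary, not from the review):

```python
# Approximate perf gains from DDR3-1333 -> DDR3-1866, as quoted above.
gains = {
    "Crysis Warhead 1280": 0.25, "Crysis Warhead 1024": 0.20,
    "Metro 2033 1280": 0.28,     "Metro 2033 1024": 0.29,
    "Dirt 2 1280": 0.19,         "Dirt 2 1024": 0.22,
    "Mass Effect 2 1280": 0.25,  "Mass Effect 2 1024": 0.15,
}
bw_gain = 1866 / 1333 - 1  # ~40% more bandwidth
for title, g in gains.items():
    print(f"{title}: {g:.0%} perf gain = {g / bw_gain:.0%} of ideal scaling")
```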

Edit:
They should test at 1280x720 and 1366x768, though, rather than 1280x1024 / 1024x768, since Llano is obviously going to be used with widescreen displays, be it a TV or a monitor, rather than old 5:4/4:3 monitors.
 
http://www.pcper.com/news/Graphics-...rinity-APU-will-use-VLIW4-Cayman-Architecture

Trinity will use a VLIW4 GPU.
As I suggested before, AMD is pushing harder to use the more compute-friendly VLIW4 architecture in future APUs, as the APP program is really important to the platform's success.

They've also shown Trinity in a working laptop.
Bulldozer cores + a VLIW4 GPU in a laptop... hmmm...


Would a half-Cayman (768 SPs) be possible? Or something like 512 SPs, which would be almost as powerful as a Juniper Pro?


EDIT: PCPer's slides from the Fusion Developer Summit mention a 50% increase in the APU's GFLOPS, so we're probably talking about a ~512-SP VLIW4 GPU, maybe with higher base clocks.
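The 50% figure does line up with 512 SPs plus a clock bump if you take the desktop A8-3850's IGP (400 SPs at 600 MHz) as the baseline; the target clock below is my own extrapolation, not from the slides:

```python
# GFLOPS = shaders * 2 FLOPs per shader per clock (multiply-add) * clock in GHz.
llano_gflops = 400 * 2 * 0.600       # 480 GFLOPS, A8-3850 IGP baseline
target = 1.5 * llano_gflops          # +50% -> 720 GFLOPS
clock_ghz = target / (512 * 2)       # clock a 512-SP part would need
print(f"{target:.0f} GFLOPS -> 512 SPs at ~{clock_ghz * 1000:.0f} MHz")  # ~703 MHz
```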
 
http://www.pcper.com/news/Graphics-...rinity-APU-will-use-VLIW4-Cayman-Architecture

Trinity will use a VLIW4 GPU.
As I suggested before, AMD is pushing harder to use the more compute-friendly VLIW4 architecture in future APUs, as the APP program is really important to the platform's success.

They've also shown Trinity in a working laptop.
Bulldozer cores + a VLIW4 GPU in a laptop... hmmm...


Would a half-Cayman (768 SPs) be possible? Or something like 512 SPs, which would be almost as powerful as a Juniper Pro?


EDIT: PCPer's slides from the Fusion Developer Summit mention a 50% increase in the APU's GFLOPS, so we're probably talking about a ~512-SP VLIW4 GPU, maybe with higher base clocks.

Though for how long? Their next-gen desktop GPUs are moving completely away from VLIW; will the next-gen Fusion after Trinity do the same?
 
Wow! I was ready to take bets there was going to be a 6570 inside Trinity! :oops:

Given that the IGP in Trinity is said to be 50% faster, most likely meaning more GFLOPS, I'd say it's going to be 512 shaders with higher clocks than now. Of course, with one dispatch processor and one set of fixed-function hardware.

That's probably how the slowest SKU in the 7xxx series is going to look, allowing hybrid CrossFire with Trinity.
 
Though for how long? Their next-gen desktop GPUs are moving completely away from VLIW; will the next-gen Fusion after Trinity do the same?

I guess the fGPU will be one generation behind dedicated GPUs in architecture changes.

PCPer's comments state that Trinity's successor will use the new Fermi-ish architecture, but that's 2013, and the next-gen graphics architecture should be available in 2012.

But yeah, one wonders how much "effort" it is worth for AMD to invest in application acceleration while changing the graphics architecture three times in three years in a row...

And how will they achieve CrossFire between the fGPU and an apparently very different discrete GPU? Maybe AMD will go with Lucid at some point?
 
Though for how long? Their next-gen desktop GPUs are moving completely away from VLIW; will the next-gen Fusion after Trinity do the same?

Probably, yes. That's the "problem" with APUs: both the CPU and the GPU are likely to always lag about a year behind the top CPU and discrete GPU architectures, respectively.

So yeah, expect updated Bulldozer cores + a new scalar+SIMD-based GPU on 22nm for the 2013 APU.

Perhaps as AMD gets used to their velocity thing and streamlines the process, this lag will shrink to just six months or so; we'll see.
 
Probably, yes. That's the "problem" with APUs: both the CPU and the GPU are likely to always lag about a year behind the top CPU and discrete GPU architectures, respectively.

So yeah, expect updated Bulldozer cores + a new scalar+SIMD-based GPU on 22nm for the 2013 APU.

Perhaps as AMD gets used to their velocity thing and streamlines the process, this lag will shrink to just six months or so; we'll see.

Six months is too optimistic. They'll be at least a year behind for the next two years.
 