Trinity vs Ivy Bridge

The only 17W Trinity announced so far is the A6-4455M, which has 33% lower clocks and 33% fewer shader units than the A10-4600M that was reviewed. That means the ULV Trinity announced so far should have only ~45% of the GPU performance of the full 35W model.

Looking at those benchmarks, it seems that the ULV Ivy Bridge will comfortably trade blows with the ULV Trinity, even surpassing it in some cases.

AMD will definitely need a 17W A8 or A10 to beat the ULV Ivy Bridge in graphics performance.
You can hardly use a simple linear proportion to extrapolate the graphics performance of the 17W Trinity. The 35W Trinity seems to be even more bandwidth limited than Llano. Decreasing raw GPU performance by (let's say) 20% should decrease average gaming performance by significantly less than 20%.
 


The Llano A4-3300M has 240sp, 12 TMUs and 4 ROPs @ 444MHz. It does ~510 3DMark 11 P points.
The Llano A8-3550MX has 400sp, 20 TMUs and 8 ROPs @ 444MHz. It does 820 3DMark 11 P points.

That's a 60% drop of sheer shader processing resulting in a 60% drop in performance.
Sounds like a quasi-linear proportion to me..
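
A quick sanity check of that proportion, written out as a small Python sketch. It only uses the approximate numbers quoted above, so treat it as a rough illustration rather than anything rigorous:

Code:
# Rough check: does the Llano A4 -> A8 spec ratio track the 3DMark 11 score ratio?
# Figures are the approximate ones quoted above.
a4_shaders, a8_shaders = 240, 400
a4_score, a8_score = 510, 820           # 3DMark 11 P points, approximate

spec_ratio = a4_shaders / a8_shaders    # 0.60 -> the A4 has 60% of the A8's shaders
score_ratio = a4_score / a8_score       # ~0.62 -> the A4 scores ~62% of the A8

print(f"spec ratio:  {spec_ratio:.2f}")   # 0.60
print(f"score ratio: {score_ratio:.2f}")  # 0.62

The two ratios land within a couple of percent of each other, which is the quasi-linear behaviour I'm pointing at.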

I don't believe that lower-end Trinity models will keep the same number of TMUs and ROPs, so I guess a similar drop in performance is to be expected.


I do think that, given the differences in how AMD and Intel measure TDP (AMD = peak electrical consumption, Intel = average consumption measured over small intervals of time), the 25W A10-4655M could get into the same space as the 17W Ivy Bridge.
But that's for the OEMs to decide, in the end.
 
Let me repeat it once again:

no-X said:
Decreasing raw GPU performance by (let's say) 20% should decrease average gaming performance by significantly less than 20%.

The 3DMark benchmarks are known for very low bandwidth dependency.
 
The Llano A4-3300M has 240sp, 12 TMUs and 4 ROPs @ 444MHz. It does ~510 3DMark 11 P points.
The Llano A8-3550MX has 400sp, 20 TMUs and 8 ROPs @ 444MHz. It does 820 3DMark 11 P points.

That's a 60% drop of sheer shader processing resulting in a 60% drop in performance.
Sounds like a quasi-linear proportion to me..
Check your math: that's a 40% drop, not 60%.
That 40% drop is for shader processing, along with 40% fewer texture units and 50% fewer ROPs.
 
Let me repeat it once again:

The 3DMark benchmarks are known for very low bandwidth dependency.

1 - The difference in GPU power between the 35W A10-4600M and the 17W A6-4455M is not 20%. It's 33% lower clock speeds and 33% fewer shader units. If we go by the previous iGPU iterations, the ROPs are probably halved and there are also 33% fewer TMUs.
We're not talking about a 20% slower iGPU. The 17W A6-4455M should have its iGPU at least 50% slower than the iGPU in the A10-4600M that was widely reviewed last week (A10-4600M performance x 0.67 <lower clock speeds> x 0.67 <fewer shader units> = ~45% of the original performance).
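
For what it's worth, here's that scaling estimate as a tiny Python sketch. It assumes performance scales linearly with GPU clock and shader count and ignores bandwidth limits, the halved ROPs/cut TMUs and any GPU turbo, so it's only a ballpark figure:

Code:
# Rough scaling estimate: 17W A6-4455M relative to the 35W A10-4600M.
# Linear scaling with clock and shader count only; bandwidth, ROP/TMU cuts
# and GPU turbo are ignored, so this is just a back-of-the-envelope number.
clock_scale = 1 - 0.33    # ~33% lower GPU clocks
shader_scale = 1 - 0.33   # ~33% fewer shader units

relative_perf = clock_scale * shader_scale
print(f"~{relative_perf:.0%} of the A10-4600M's GPU performance")  # ~45%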


2 - Don't like 3DMark? Okay, here's some more:

Anno 2070
Llano A4-3300M (HD6480G @ 444MHz) - 32 FPS
Llano A8-3500M (HD6620G @ 444MHz) - 40 FPS
20% less performance

Starcraft 2
Llano A4-3300M (HD6480G @ 444MHz) - 20 FPS
Llano A8-3500M (HD6620G @ 444MHz) - 31 FPS
35% less performance

Risen
Llano A4-3300M (HD6480G @ 444MHz) - 31 FPS
Llano A8-3500M (HD6620G @ 444MHz) - 43 FPS
28% less performance

So for a 40% decrease in shader units, a 40% decrease in TMUs and a 50% decrease in ROPs, we get up to 35% less real-world performance (note: even though the A4 is only a dual-core and the A8 is a quad-core, the A4 is clocked substantially higher).
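
Recomputing those gaps from the FPS figures quoted above (just a quick Python sketch, same numbers):

Code:
# Percentage deficit of the A4-3300M vs the A8-3500M in the games listed above.
results = {
    "Anno 2070": (32, 40),     # (A4-3300M fps, A8-3500M fps)
    "Starcraft 2": (20, 31),
    "Risen": (31, 43),
}
for game, (a4_fps, a8_fps) in results.items():
    deficit = 1 - a4_fps / a8_fps
    print(f"{game}: A4 is {deficit:.0%} slower")
# -> 20%, 35% and 28% respectively, matching the percentages above.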



Therefore, unless some magical drivers are coming for the 17W A6 Trinity, its 3D performance will be some 30 to 50% slower than that of the 35W A10 Trinity.
That said, the 17W Ivy Bridge will be at least comparable to - if not faster than - the 17W A6 in 3D gaming performance.


Check your math: that's a 40% drop, not 60%.
That 40% drop is for shader processing, along with 40% fewer texture units and 50% fewer ROPs.
I stand corrected on the semantics. By a "60% drop", I meant that the performance of the faster model gets multiplied by 0.6: the slower model has 60% of the performance of the faster one, which works out to a 40% drop in performance.
Semantics aside, the math is correct, though.
 
Lots of 16W ULV Ivy Bridge benchmarks (both CPU and GPU):
http://www.anandtech.com/show/5843/asus-zenbook-prime-ux21a-review/1

Unfortunately the ULV Ivy Bridge specs are still under NDA, so there's no official information about CPU/GPU clock speeds (or turbo speeds) yet. The chip can be configured to 16W TDP or 13W TDP (user configurable from Windows power options).

16W Ivy Bridge seems to be around 20%-25% faster in GPU/gaming benchmarks than 17W Sandy Bridge (Dell XPS 13). Comparison charts didn't include any other Ivy Bridge chips, so it's hard to say how much performance is lost compared to 45W/35W models.
 

Dude...
 
So if not for Intel's superior manufacturing process, Trinity would wipe the floor with Ivy in the GPU department.

To me it's still remarkable that AMD managed to cram so much punch into such a small power budget on GloFo's troubled 32nm HKMG process.

And, just pure speculation on my side: the 17W Trinity will show better gaming performance (on average) in shader-intensive titles than Ivy, despite slower CPU performance.
 

It should have comparable bandwidth available to it, so perhaps it will.
 
The Llano A4-3300M has 240sp, 12 TMUs and 4 ROPs @ 444MHz. It does ~510 3DMark 11 P points.
The Llano A8-3550MX has 400sp, 20 TMUs and 8 ROPs @ 444MHz. It does 820 3DMark 11 P points.


We already have Vantage and 3DMark 11 scores for the 17W Trinity model from AMD's slides. If I remember correctly, Vantage was somewhere around 2300 points and 3DMark 11 around 620 or so. Predictions based on those are not that easy, since AMD introduced a GPU Turbo.

The AnandTech 17W IVB test is interesting. It's obvious that in most games the maximum turbo frequency isn't reached; the chip runs somewhere between its base and max turbo clocks. It would be interesting to see how the GPU Turbo behaves with the CPU turbo disabled. That might give the GPU Turbo more headroom in games.
 
So if not for Intel's superior manufacturing process, Trinity would wipe the floor with Ivy in the GPU department.
If I've learned one thing from hardware people, it's that comparisons like that are never particularly meaningful :) There are too many variables to make statements like "if we could transplant a piece of processor X onto a different process together with a piece of processor Y, it would be better".

In fact this entire conversation is getting a little confused because of people trying to apply legacy "CPU vs GPU" thinking to this comparison. With increasingly large portions of the memory paths and the power budget being shared between different portions of the chip ("CPU and GPU"), it just doesn't make sense to think of it that way any more. For all we know, in a specific game processor A might dedicate 95% of its power budget to the "GPU parts" of the chip and processor B might dedicate 10%, and reviewers (and end users in general) are completely ill-equipped to measure stuff at that granularity. And of course if I'm trying to compare "just the GPUs" of these chips, iso-power, whose portion does shared stuff like memory come out of? What about shared caches?

It's confusing, and frankly irrelevant to try and separate out parts of the chip these days. The only really meaningful comparison is chip A to chip B in an application. Normalizing iso-power or iso-$ is completely legitimate of course as well. All the rest of the speculations that try to separate out parts of the chip are becoming increasingly dated... These chips don't fit into our simple concepts of how a CPU and GPU operate independently any more :)
 
Trinity isn't that integrated, though.
The GPU parts are still very distinct, and they pretty much share nothing up to the point that their dedicated bus interfaces meet up with the SRQ or memory controller.

Trinity is more integrated than hanging off the far side of a PCIe connection, granted, but unifying the memory spaces and software environment hasn't happened yet.
I'd say that, just for power and density reasons, Trinity's GPU would be more impressive if it were at process parity.
 
Trinity is more integrated than hanging off the far side of a PCIe connection, granted, but unifying the memory spaces and software environment hasn't happened yet.
Sure, but it's moving in that direction quickly, and even if only IVB is a bit more integrated (doesn't it share a cache?), that still makes the comparison problematic, to say the least.

I'd say that, just for power and density reasons, Trinity's GPU would be more impressive if it were at process parity.
Well, one can certainly hope anything can be made better with a process shrink :). But like I said, things are just too complicated to draw conclusions about how the architectures themselves compare from rough data like we have, as the CPU and GPU do not work in isolation these days. Even if *only* the power budget were shared, it would still be pretty meaningless, as the power policy then becomes the most relevant factor.

Trinity seems good, don't get me wrong, but I hesitate to draw architectural comparisons since they have clearly made an intentional trade-off towards graphics compared to conventional x86 (in area, and likely power too). Similarly, I argue that it's a bit unfair to architecturally criticize the CPU portion of Trinity vs IVB because of that trade-off. It's a whole chip and needs to be evaluated as such; this will only become more true going forward.
 
Sure, but it's moving in that direction quickly, and even if only IVB is a bit more integrated (doesn't it share a cache?), that still makes the comparison problematic, to say the least.
It's slightly less integrated than Sandy Bridge. SB had fewer hops between the texture unit and the last-level cache, and the GPU was able to reserve parts of it. Ivy Bridge added a graphics-only L3, insulating it from the last-level cache.

Trinity seems good, don't get me wrong, but I hesitate to draw architectural comparisons since they have clearly made an intentional trade-off towards graphics compared to conventional x86 (in area, and likely power too). Similarly, I argue that it's a bit unfair to architecturally criticize the CPU portion of Trinity vs IVB because of that trade-off. It's a whole chip and needs to be evaluated as such; this will only become more true going forward.

The GPU still needs the CPU to run its driver and manage the system, though. A weaker CPU can't rely on the GPU to overcome any shortcomings in controlling the GPU.
At any rate, it's not that hard to criticize the CPU side for any software that doesn't purposefully use the GPU, which is still pretty much all of it.
Maybe if AMD's FSAIL initiative catches on, the software and hardware melding will become more pronounced, but right now there's no hiding the division.
 
Trinity seems good, don't get me wrong, but I hesitate to draw architectural comparisons since they have clearly made an intentional trade-off towards graphics compared to conventional x86 (in area, and likely power too). Similarly, I argue that it's a bit unfair to architecturally criticize the CPU portion of Trinity vs IVB because of that trade-off. It's a whole chip and needs to be evaluated as such; this will only become more true going forward.

So, this means the CPU portion is a major bottleneck: first, in gaming scenarios where you put more power into the GPU but waste it because you don't have a fast enough CPU to feed it, and second, in general-purpose computation too, unless you power down a large part of the GPU portion to free up its share of the power budget.

I'd say that, just for power and density reasons, Trinity's GPU would be more impressive if it were at process parity.

I dream of an APU with the IB CPU and Trinity's GPU; that would be the ideal chip. ;)
 
It's slightly less integrated than Sandy Bridge. SB had fewer hops between the texture unit and the last-level cache, and the GPU was able to reserve parts of it. Ivy Bridge added a graphics-only L3, insulating it from the last-level cache.

Trinity is supposed to have more tightly coupled hardware queues. AFAIK it also has an improved IOMMU, enabling more CPU->GPU zero-copy work.
 
So, this means the CPU portion is a major bottleneck.
In some cases, sure, but not all (obviously, since Trinity is still ahead in many games). And that's definitely the sort of comparison that is valid to make - i.e. where the bottleneck lies for a certain workload, which of course depends on the workload.

I dream of an APU with the IB CPU and Trinity's GPU; that would be the ideal chip. ;)
... but *that's* what you can't really say, and what I'm complaining about. I know from your winky that you understand this, but it's worth noting for everyone else. These things are integrated tightly enough these days that you can't just pick up one piece from one and drop it into another. It's like saying "well, maybe you could combine NVIDIA's ALUs with AMD's texture sampler and Intel's instruction scheduler" - it's silly and doesn't really make a whole lot of sense.

That's particularly true in this case since you're talking about taking a large part of the AMD chip and combining it with a large part of the Intel chip, and maybe allocating power that way too. Sure everything would be faster if you could use more area and more power, but that's not the point :)

/tldr There's far too little data here to make any comments on the architectural efficiency of segments of the chip designs. We can only really look at them as a whole.
 
... but *that's* what you can't really say, and what I'm complaining about. I know from your winky that you understand this, but it's worth noting for everyone else. These things are integrated tightly enough these days that you can't just pick up one piece from one and drop it into another. It's like saying "well, maybe you could combine NVIDIA's ALUs with AMD's texture sampler and Intel's instruction scheduler" - it's silly and doesn't really make a whole lot of sense.
That's putting things further along the integration curve than they actually are. It's not a simple cut-and-paste job, but there's nothing disclosed about the Trinity core that fundamentally depends on running on-chip with a GPU. The two sides are hooked onto an interconnect, power management unit, and uncore that insulate each side from the particulars of the other.
AMD's heterogeneous initiative promises a form of interconnect that is modular enough to drop a core and a GPU or an accelerator down on the same chip with relative ease.

HSAIL. The F got changed to H, from Fusion to Heterogeneous
Letter change noted.
 
So if not for Intel's superior manufacturing process, Trinity would wipe the floor with Ivy in the GPU department.

To me it's still remarkable that AMD managed to cram so much punch into such a small power budget on GloFo's troubled 32nm HKMG process.

And, just pure speculation on my side: the 17W Trinity will show better gaming performance (on average) in shader-intensive titles than Ivy, despite slower CPU performance.

You could just as easily say that Intel would wipe the floor with AMD if not for AMD's superior drivers. The reality is that the two companies took very different approaches. AMD has a ton of area for the GPU, Intel not so much.

And the truth is that Intel has been ahead of everyone on process technology by 12+ months since I started paying attention to these things. And now, the gap is widening and perhaps closer to 15-18 months.

Yes Intel's process technology gives them an advantage, and that advantage is growing bigger every year.

DK
 