Trinity vs Ivy Bridge

If you bump GPU performance up, the Piledriver CPU is just going to become even more of a bottleneck.

As with Llano, it seems to me that the most interesting Trinity notebook would be something small that benefits from its high integration and relatively low heat output. I have an HP DV2 12" subnote, which was exciting in '09 because its discrete GPU gave it usable gaming capability (a thrilling R3450 with 512MB), but it is noisy and hot because of all the separate chips. Trinity could work out well for a small notebook like that.

I wonder if Trinity could be sold in a form with its GPU disabled and the CPU clocks cranked up? That would make it a bit more performant and more interesting to pair with a fast discrete GPU.
 
And I'm pretty sure that this same Trinity combo for ~750€ will far surpass any IB+nVidia combo for the same price, in gaming performance.
In v-synced gaming performance where Crossfire works, of course :p

I'm unsure of the conversion rate, but here is a Lenovo Y480 with an Ivy Bridge 3610QM and NVIDIA GT 640M for $899 USD that includes a 1TB drive, a Blu-ray player and 8GB of RAM in a 14" form factor.

At the $750 price point, Trinity is going to find itself outgunned. To be competitive, it will need to find a home in the $600-or-less tier of devices... My opinion, of course.
 
If you bump GPU performance up, the Piledriver CPU is just going to become even more of a bottleneck.

As with Llano, it seems to me that the best Trinity notebook is something small that benefits from the lower heat output. I have an HP DV2 12" subnote, which was exciting in '09 because it was faster than a netbook (Athlon Neo, RS690M, SB600, R3450), but it is noisy and hot because of all the separate chips. Trinity could work out better for a notebook like that.

This CPU bottleneck thing is getting way overblown, IMO.
Not everyone is going to play only Skyrim and StarCraft 2, and they aren't exactly running high-end GPUs anyway.

With the GPU power going into these laptops, and at the level of detail they can provide, Battlefield 3 won't be bottlenecked by the dual-module Piledriver, and neither will Dirt 3, Portal 2, Crysis 2, Diablo 3, Mass Effect 3, Dragon Age 2, SW:TOR or tons of others.

For 99% of the PC games launched in the last two years, the dual-module Piledriver at 2.3-3.2GHz will not be as big a turnoff as many tend to believe.


As for your preference, it's just that: your preference. IMO, it doesn't make the other alternatives any worse.
Some will prefer a lower-profile machine; others will enjoy a more powerful gaming setup with a good performance/price ratio.


I'm unsure of the conversion rate, but here is a Lenovo Y480 with an Ivy Bridge 3610QM and NVIDIA GT 640M for $899 USD that includes a 1TB drive, a Blu-ray player and 8GB of RAM in a 14" form factor.

At the $750 price point, Trinity is going to find itself outgunned. To be competitive, it will need to find a home in the $600-or-less tier of devices... My opinion, of course.

Unfortunately, for laptops, computer hardware and smartphones/tablets, the conversion rate is almost 1:1.
So a Trinity+HD7650M laptop for $750 that has better battery life and plays games faster than that IB+nVidia laptop for $900 isn't competitive enough?
 
This CPU bottleneck thing is getting way overblown, IMO.
I think you and I had the same conversation about Llano when it came out... ;) I'm just trying to think of situations in which Trinity's attributes would shine, besides the usual super-budget ~15" notebooks.
 
Trinity results using OpenCL-accelerated HandBrake (didn't see this mentioned).
Link
I didn't see any IQ comparisons, though.

AnandTech HandBrake piece said:
Image quality appeared to be comparable between all OpenCL outputs, although we did get higher bitrate files from the x86 transcode path.

Interesting. TR also said that QuickSync- and VCE-accelerated encodes produced smaller files in MediaEspresso:

TechReport Trinity review said:
We didn't see much of a difference in output image quality between the two, but the output files had drastically different sizes. QuickSync spat out a 69MB video, while VCE got the trailer down to 38MB. (Our source file was 189MB.) Using QuickSync in high-quality mode extended the Core i7-3760QM's encoding time to about 10 seconds, but the resulting file was even larger—around 100MB. The output of the software encoder, for reference, weighed in at 171MB.

I'd like to see an IQ comparison.
 
Unfortunately, for laptops, computer hardware and smartphones/tablets, the conversion rate is almost 1:1.
So a Trinity+HD7650M laptop for $750 that has better battery life and plays games faster than that IB+nVidia laptop for $900 isn't competitive enough?

Fair enough re: technology cost conversion rate. Purely from my own perspective, the extra $150 would be worth the serious performance upgrade if I were shopping for a gaming-capable but smaller form-factor laptop. I'm a current owner of the Lenovo Y460 and it is a fantastic size for my needs. Mine has the Arrandale i5-540M paired with the ATI 5650M, and it is still a competent performer after three years of ownership.

Trinity wedged into a gaming-capable laptop really comes into its own at the ~$600 price mark, as at that point Ivy Bridge has no way to touch it. Trinity also has great potential for all-day laptops that keep competent performance without going for the super-mobile sizes.

But on the whole, Trinity is forced to aim for low-cost models, or models that focus on long runtimes, in order to be fully competitive. Anything beyond $800 and you're in territory where IVB + dGPU will have stronger alternatives, and anything below 5 hours of battery life and IVB is there again too.

Purely my opinion.
 
This CPU bottleneck thing is getting way overblown, IMO.
Not everyone is going to play only Skyrim and StarCraft 2, and they aren't exactly running high-end GPUs anyway.

It must be pretty frustrating to have developed such a powerful IGP just to watch it being held back by weak cores, though, and frankly, at the level of fps we're talking about, the CPU carries a lot more weight than it would at the enthusiast end.

It's the wrong CPU by far for this market. It's the wrong CPU period, tbh, but it's especially galling to see it holding back what is a far superior GPU. It's not just about the losses that shouldn't be happening - every single game would be performing better with a more suitable CPU.

Steamroller needs to be a completely new philosophy.
 
In a two-thread situation where one thread can reach 2 or more and the other cannot, SB wins.
This is exactly the issue: the IA decoder averages around 2 instr/cycle (over time; sometimes you decode more, sometimes less), and I don't see how the BD decoder can do much better. On average that means 1 instr/cycle/core (sometimes more, sometimes less). Not considering the TC (decoded uop cache) used by SB, which can pump maximum decode bandwidth, of course.

I don't think AMD would have expanded the move issue rate if this wasn't the case.
Well, consider this: if you feed a single core with 4 decoders, you may get peak rates of 3 or 4 MOPs/cycle that your 2 ALUs + 2 AGLUs cannot sustain. Adding register renaming in PD can speed up a core when it's running 'alone', or when the other core is stalled and the front end feeds a single core. Running faster when you can is surely a good way to increase overall IPC.
Intel uses the decoder in the same way, but mixing two threads' dependencies over the same core allows them to 'skip' ordering dependencies that would otherwise leave ALU/memory units unused, and lets them make the best use of the CPU resources.

Build 2 decoders.
IMHO, I bet they'll be forced to go for something similar; I don't think you can make a bigger decoder, since there's an intrinsic limit in the average linearity of IA code (it was around 7 instructions) and in the single/dual/microcode instruction mix. I'm curious to see how they'll solve the issue.
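
For what it's worth, here's a crude toy model (my own sketch, nothing from AMD documentation) of that sharing effect: one 4-wide decoder alternates between the two cores of a module, and a core inherits the full decoder for a cycle when its sibling is stalled. The stall probabilities are made up, so the absolute numbers are only illustrative.

Code:
import random

def core0_decode_bw(stall_prob, cycles=200000, width=4):
    # One width-wide decoder alternates between the two cores of a module.
    # If the core whose turn it is happens to be stalled, the other core
    # gets the full decoder for that cycle.
    random.seed(1)
    delivered = 0
    for c in range(cycles):
        stalled0 = random.random() < stall_prob
        stalled1 = random.random() < stall_prob
        if c % 2 == 0:              # core 0's turn
            if not stalled0:
                delivered += width
        else:                       # core 1's turn
            if stalled1 and not stalled0:
                delivered += width  # core 0 steals the idle slot
    return delivered / cycles

for p in (0.0, 0.2, 0.5):
    print("stall prob %.1f -> core 0 averages %.2f macro-ops/cycle" % (p, core0_decode_bw(p)))

Even in this crude model a core's steady-state share is about half the decoder width, and the 4-wide bursts only pay off if the core can actually absorb them, which is where the wider move/rename throughput in PD would come in.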
 
Steamroller needs to be a completely new philosophy.

It takes half a decade to bring a CPU from inception to production. There is no way, no way at all, that Steamroller will be "a completely new philosophy".

This is exactly the issue: the IA decoder averages around 2 instr/cycle (over time; sometimes you decode more, sometimes less), and I don't see how the BD decoder can do much better.

Actually, it can. The Intel decoders predecode from aligned 16-byte buffers. This means that if there are 5 instructions in a single 16-byte block, and you can decode the first 4 in a cycle, then you have to waste a full cycle on that last one. BD uses a ring of 16 "dispatch group buffers" and can pick instructions from the first two, with minimal alignment restrictions. The AMD decoder gets much, much closer to its maximum theoretical throughput than the Intel one does, barring the instructions it really stumbles on (256-bit AVX).

Of course, this is not enough. The theoretical optimum for BD is 2 per thread, while the realized throughput of SNB is ~2.5. And, without the uop cache, SNB would be decode-limited.

BD either needs to split the front end, or add some form of caching of decoded uops (MOPs for AMD?) for each core after the front end.
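
To make the alignment argument a bit more concrete, here's a rough toy model I put together (not cycle-accurate, and the instruction-length mix is pure guesswork): one decoder may only pick instructions that start inside the current aligned 16-byte window each cycle, the other may pick from the first two 16-byte buffers. It ignores branches, fusion and microcoded instructions, so only the gap between the two numbers means anything, not the absolute values.

Code:
import random

def decode_aligned_window(lengths, width=4, window=16):
    # Up to `width` instructions per cycle, but only instructions that start
    # inside the current aligned 16-byte fetch window; hitting the window
    # boundary ends the decode cycle.
    cycles, i, addr = 0, 0, 0
    while i < len(lengths):
        base = (addr // window) * window
        picked = 0
        while i < len(lengths) and picked < width and addr < base + window:
            addr += lengths[i]
            i += 1
            picked += 1
        cycles += 1
    return len(lengths) / cycles

def decode_two_buffers(lengths, width=4, window=16):
    # Up to `width` instructions per cycle, drawn from the first two 16-byte
    # buffers, so crossing one window boundary does not end the cycle.
    cycles, i, addr = 0, 0, 0
    while i < len(lengths):
        base = (addr // window) * window
        picked = 0
        while i < len(lengths) and picked < width and addr < base + 2 * window:
            addr += lengths[i]
            i += 1
            picked += 1
        cycles += 1
    return len(lengths) / cycles

random.seed(0)
# Assumed instruction-length mix in bytes, loosely x86-ish; pure guesswork.
stream = random.choices([2, 3, 4, 5, 6, 7], weights=[20, 30, 25, 15, 7, 3], k=100000)
print("aligned 16B window : %.2f instr/cycle" % decode_aligned_window(stream))
print("two 16B buffers    : %.2f instr/cycle" % decode_two_buffers(stream))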
 
You guys arguing about CPUs need
CPU Top Trumps...
http://www.thinkgeek.com/geektoys/games/ee8b/

:D

"Now is your chance to relive history and battle it out to get all the CPUs. Who will you pick? The legendary Z80, the powerful Core i7, or the tiny 80286? This is the most fun you will ever have with CPU specs!"

The 80286 was built on 1.5 micron technology: 0.134 M transistors in 47 mm^2.
With Ivy Bridge on 0.022 micron you have 1,400 M transistors in 160 mm^2.

So, if you built such a die on 22 nm with only 0.134 M transistors, would you be able to squeeze more performance out of it compared to its ancestor?
 
So, if you built such a die on 22 nm with only 0.134 M transistors, would you be able to squeeze more performance out of it compared to its ancestor?
Wouldn't it become so small that you wouldn't be able to get many wires out of it? You could of course add a whopping megabyte of memory on the die, and there's no need to have an external memory bus :)
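
A quick back-of-the-envelope (assuming ideal area scaling and ignoring pads, SRAM and analog bits, so take it as a ballpark only):

Code:
# Ideal area scaling of the 286-sized design from 1.5 um down to 22 nm.
old_area_mm2 = 47.0              # 286-class die on the 1.5 micron process
scale = (0.022 / 1.5) ** 2       # linear shrink squared ~= area shrink
print("%.3f mm^2" % (old_area_mm2 * scale))   # ~0.010 mm^2

About a hundredth of a square millimetre of logic, i.e. pad-limited long before it's logic-limited.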
 
Wouldn't it become so small that you wouldn't be able to get many wires out of it? You could of course add a whopping megabyte of memory on the die, and there's no need to have an external memory bus :)

I would love for an Intel guy to read this and think "Challenge accepted".
 
Nehalem had their first power control unit doodad, AFAIK. Unless SpeedStep required something like it, in which case the Pentium III might have had one too...
 
The first performance numbers with a dual-core ULV Ivy Bridge are out (along with the review of the Asus Zenbook Prime):
http://www.anandtech.com/show/5843/asus-zenbook-prime-ux21a-review/6

They won't say which model this is, other than it being a dual-core Ivy Bridge with a 17W TDP. Intel is still keeping the dual-core Ivy Bridge under NDA.

[benchmark charts from the review]



This spells lots of trouble for the ULV Trinity:
[chart]


The only 17W Trinity announced so far is the A6-4455M, which has 33% lower clocks and 33% fewer shader units than the A10-4600M that was reviewed. That means the announced ULV Trinity should have only ~45% of the GPU performance of the full 35W model (0.67 × 0.67 ≈ 0.45).


Looking at those benchmarks, it seems the ULV Ivy Bridge will comfortably trade blows with the ULV Trinity, even surpassing it in some cases.

AMD will definitely need a 17W A8 or A10 to beat the ULV Ivy Bridge in graphics performance.
 
The 35W Trinity is some 2.5X faster than the 17W Ivy Bridge in the actual games overall.

The review also shows 10 benchmarks with the 45W IB beating the 35W Trinity in 6 of them. Nothing else needs to be said.
 
The straight computation scores are impressive for a low-clocked ULV part. Trinity isn't out yet, though, and even its 35W part is seeing far better battery life than the prototype 17W Ivy Bridge box. Granted, there is room for improvement in the firmware, drivers, blah-de-blah, but it's likely that Trinity is still going to take the prize for longest power-on time on any platform where it attempts to compete.
 