Trinity vs Ivy Bridge

That's putting things further along the integration curve than they are. It's not a simple cut-and-paste job, but there's nothing disclosed about the Trinity core that is fundamental to running on-chip with a GPU. The two sides are hooked onto an interconnect, power management unit, and uncore that insulate each side from the particulars of the other.
Right, but even if everything else is ignored, as long as they share a power budget the two are irreducibly coupled. The reality is that the allocation of power (and area) to the CPU or GPU portion is fundamentally going to affect any comparison between the architectures, and since to my knowledge no one has attempted to measure that, or even understand the power policy of the two chips, it's impossible to make general comments about the architectural efficiency of *portions* of the chips relative to one another. To play devil's advocate, it could be that Trinity is giving 95% of its 17W/45W/whatever budget to the GPU portion, or the same for Ivy. Without that information, it's impossible to compare the architectures.
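Just to illustrate the point (every number below is made up, nothing here is a measured Trinity or Ivy Bridge value), the same GPU benchmark score turns into very different perf/W claims depending on the power split you assume:

Code:
# Toy illustration only: how an assumed CPU/GPU power split changes a perf/W comparison.
# All numbers are hypothetical; nothing here is a measured Trinity or Ivy Bridge value.
TDP = 17.0                                  # watts shared by the whole chip

def gpu_perf_per_watt(gpu_score, gpu_share_of_budget):
    """Perf/W you would attribute to the GPU portion under an assumed budget split."""
    return gpu_score / (TDP * gpu_share_of_budget)

score = 1000.0                              # the same fixed GPU benchmark score
for share in (0.50, 0.75, 0.95):
    print(f"assumed GPU share {share:.0%}: {gpu_perf_per_watt(score, share):6.1f} points/W")

Same chip, same score, and the "efficiency" of the GPU portion swings by a factor of two purely on the assumption.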

Not sure about Fusion, but Ivy definitely shares a budget and allocates frequency (turbo) differently to the different portions of the chip depending on what is going on, and there's a power policy there that certainly affects things. On desktop-class power budgets it's a minor detail, but when you're comparing power-constrained parts like the 17W SKUs it is critically relevant.
 
Are AMD's CPU cores still sitting behind a too narrow crossbar?
The GPU was explicitly singled out as having a quite generous bus connection to the memory controllers, while AMD has in the past throttled its CPUs with its slower northbridge and constricted data paths. Later Phenoms with higher-speed memory support got no scaling from it because they wound up sucking the data through a straw once on-die.

Historically, AMD's memory controller efficiency has lagged Intel's, but the apparent gulf this time around makes it seem as if something constrains CPU-side metrics more than it does the GPU's.
 
Clocking the uncore up along with the memory saw additional performance improvements beyond what either change gave on its own on Phenom IIs. That should be a pretty simple test on Bulldozer/Piledriver.


Given that there was such a strong focus on throughput, could it be that latency just took a back seat?


Also, the way RPG always says the same one-liner over and over and nothing more: yay for pointing out the obvious and nothing more. He should become a politician.
 
Clocking the uncore up along with the memory saw additional performance improvements beyond what either change gave on its own on Phenom IIs. That should be a pretty simple test on Bulldozer/Piledriver.
I remember various users stating that, unlike 'K10', BD's NB+L3/uncore overclocking has an almost negligible effect on real-world performance.

3200MHz vs 2200MHz with 4.5GHz BD 8 thread CPU
http://www.madshrimps.be
 
The first performance numbers with a dual-core ULV Ivy Bridge are out (along with the review of the Asus Zenbook Prime):
http://www.anandtech.com/show/5843/asus-zenbook-prime-ux21a-review/6

They won't say which model this is, other than it being a dual-core Ivy Bridge with a 17W TDP. Intel is still keeping the dual-core Ivy Bridge parts under NDA.


It was an i7-3517U. Here is another 17W test with an i5-3427U: http://www.anandtech.com/show/5872/intel-dual-core-ivy-bridge-launch-and-ultrabook-review/1
 
I remember various users stating that, unlike 'K10', BD's NB+L3/uncore overclocking has an almost negligible effect on real-world performance.

3200MHz vs 2200MHz with 4.5GHz BD 8 thread CPU
http://www.madshrimps.be
I have indeed seen reports like this, but it doesn't make sense to me... Still, according to AIDA64's cache test, BD performs badly.

Or they've got another bottleneck?
 
Or they've got another bottleneck?
Or a testing artifact?

CPU workloads usually don't need that much bandwidth, so the memory controller is organized (and optimized) to minimize latency rather than to maximize bandwidth. Also, testing software will often check CPUID and branch to a path whose code and memory layout are supposed to be optimal for the given CPU, which isn't always actually the case...
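As a rough sketch of what I mean (Linux-only, and the path names are purely hypothetical), it takes very little for a benchmark to end up on a vendor-specific path that may not actually suit BD:

Code:
# Minimal sketch of vendor-based dispatch as described above (Linux-only; the
# path names are hypothetical, just to show the idea of CPU-specific code paths).
def detect_vendor():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("vendor_id"):
                return line.split(":", 1)[1].strip()
    return "unknown"

vendor = detect_vendor()
if vendor == "GenuineIntel":
    path = "intel_tuned_copy"    # memory access pattern tuned for Intel controllers
elif vendor == "AuthenticAMD":
    path = "amd_tuned_copy"      # may or may not be what's actually optimal on Bulldozer
else:
    path = "generic_copy"
print("dispatching to", path)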

On Trinity bandwidth is important for the graphics portion, so why not write a shader program and measure the bandwidth available to the IGP?
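If someone did want to roll their own, a minimal sketch with PyOpenCL would look something like this (an OpenCL copy kernel rather than a graphics shader, and it assumes the OpenCL driver exposes the IGP; buffer size and kernel are just illustrative):

Code:
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(
    ctx, properties=cl.command_queue_properties.PROFILING_ENABLE)

n = 32 * 1024 * 1024                       # 32M floats = 128 MB per buffer
src_host = np.zeros(n, dtype=np.float32)

mf = cl.mem_flags
src = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=src_host)
dst = cl.Buffer(ctx, mf.WRITE_ONLY, src_host.nbytes)

prg = cl.Program(ctx, """
__kernel void copy(__global const float *src, __global float *dst) {
    int i = get_global_id(0);
    dst[i] = src[i];
}
""").build()

prg.copy(queue, (n,), None, src, dst)      # warm-up run
queue.finish()

evt = prg.copy(queue, (n,), None, src, dst)
evt.wait()
seconds = (evt.profile.end - evt.profile.start) * 1e-9
moved = 2 * src_host.nbytes                # one read + one write per element
print("effective bandwidth: %.1f GB/s" % (moved / seconds / 1e9))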
 
On Trinity bandwidth is important for the graphics portion, so why not write a shader program and measure the bandwidth available to the IGP?
You don't need to write one; it already exists and is called 3DMark (take any version) color fill...
None of the reviews I've seen published a score for it, though. In fact, I can't remember having seen scores for that for Llano either; you'd expect it to be in the neighborhood of a 128-bit DDR3-based HD 5570 when equipped with DDR3-1600.
 
Seems to be even more bandwidth limited than Llano was, with 1866 probably being the new sweet spot.
eDRAM with tight integration couldn't come fast enough. With IGP solutions marching boldly to new performance heights faster and faster, even DDR4 won't relieve the mounting BW disparity.
 
http://www.xbitlabs.com/articles/graphics/display/amd-trinity-graphics_8.html

Seems to be even more bandwidth limited than Llano was, with 1866 probably being the new sweet spot.
Wow, according to that data, AvP and Borderlands 2 are ~60% bandwidth limited with DDR3-1600.
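One plausible way to read a figure like that off memory-scaling data (the frame rates below are made up; the real numbers are in the xbitlabs article):

Code:
# Hypothetical worked example of deriving a "bandwidth limited" fraction from memory scaling.
# The frame rates are invented for illustration; see the linked article for real data.
fps_1600, fps_1866 = 40.0, 44.0        # hypothetical frame rates at DDR3-1600 and DDR3-1866
bw_gain   = 1866 / 1600 - 1            # ~17% more memory bandwidth
perf_gain = fps_1866 / fps_1600 - 1    # ~10% more performance
print(f"fraction of extra bandwidth converted to performance: {perf_gain / bw_gain:.0%}")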

Seems like they did a good job with the graphics. I made a post somewhere in this thread comparing SB to IVB to see how much Intel is benefiting from 22nm vs 32nm, and how big HD4000 would be at 32nm. AMD seems to have an architectural advantage, but of course it's overwhelmed by Intel's process advantage.
 
Wow, according to that data, AvP and Borderlands 2 are ~60% bandwidth limited with DDR3-1600.

Seems like they did a good job with the graphics. I made a post somewhere in this thread comparing SB to IVB to see how much Intel is benefiting from 22nm vs 32nm, and how big HD4000 would be at 32nm. AMD seems to have an architectural advantage, but of course it's overwhelmed by Intel's process advantage.

That's pretty much it. AMD has a huge amount to gain with eDRAM and a better (stronger single-thread) CPU core that we hope will be seen with Steamroller. Intel is pretty much at the limit of their shaders and will be forced to expend even more die area on the IGP. Haswell will get them close for a few months before Kaveri, but as you mentioned, a 32nm Haswell would actually be larger than this Trinity.

AMD is bandwidth-limited and Intel is shader-limited, and how each of them fixes their issue will decide the "winner" in the end.
 