Ivy Bridge GPU details

dkanter · Apr 24, 2012

For those of you wondering about the changes in the IVB graphics architecture, I have a deep dive that compares IVB to SNB and discusses the details of how the GPU was improved:

http://www.realworldtech.com/page.cfm?ArticleID=RWT042212225031

Thanks to Willard for posting on the front page!

DK

Pressure · Apr 24, 2012

I find this quote from the AnandTech review of Intel's Ivy Bridge interesting.

AnandTech said:
More importantly however, a tiny Ivy means that Intel could have given us a much bigger GPU without breaking the bank. I hinted at this possibility in our Ivy Bridge architecture article. Unfortunately at the time only Apple was interested in a hypothetical Ivy Bridge GT3 and rumor has it that Otellini wasn't willing to make a part that only one OEM would buy in large quantities. We will eventually get the GPU that Apple wanted, but it'll be next year, with Haswell GT3. And the GPU that Apple really really wanted? That'll be GT4, with Broadwell in 2014.

So I am just wondering what it could have been.

Davros · Apr 24, 2012

"Unfortunately at the time only Apple was interested in a hypothetical Ivy Bridge GT3 and rumor has it that Otellini wasn't willing to make a part that only one OEM would buy in large quantities."

Surely if it's faster than what AMD offers, it will sell?
Apple is unique among OEMs in that it builds not just systems but also operating systems and applications, so it would care about capabilities. Other OEMs just care about price and whether the part is desirable to end users (i.e. bang for buck).

dkanter · Apr 25, 2012

A GT3 part would require a lot more effort. I also wonder at what point do you start to get memory bandwidth limited...

David

iwod · Apr 25, 2012

Even if AnandTech predicts Haswell GT3 will offer 3x the performance of Ivy Bridge, that is still nowhere near what a discrete card can do. And since discrete graphics has gone through two design cycles focused on power efficiency, discrete cards can now idle at very low power.

So we get back to the question: why should we (end users) want or need integrated graphics?

Davros · Apr 25, 2012

price

3dilettante · Apr 25, 2012

Discrete cards are the additional component to the system, while the IGP is there by default.
The add-in board is what needs to justify itself.
For those who want the performance, upgradability, and a broader and somewhat fresher set of secondary features, the cards are justifiable.

For the vast majority of systems, the argument for moving beyond an IGP weakens the less the user demands of the system.

Ivy Bridge's slow evolution for the IGP means that the read/write paths are separate, something AMD has only just moved past for GCN.
At some point, it seems Intel would move past this.
Trinity and either this or Intel's next IGP may be the last examples of the split memory pipeline.

With the programmability aspects being fleshed out, the thing that seems more important is how effectively stacked DRAM or interposer connections can begin to eat into the discrete board's memory bandwidth advantage, and how quickly the various competitors can get to that point.

nAo · Apr 25, 2012

3dilettante said:
Ivy Bridge's slow evolution for the IGP means that the read/write paths are separate, something AMD has only just moved past for GCN.

What do you mean by separate read/write paths?

3dilettante · Apr 25, 2012

The memory pipeline has read-only paths for the L1, L2, and L3 caches.
GCN now has a read/write capability down the same path.

mczak · Apr 26, 2012

3dilettante said:
Ivy Bridge's slow evolution for the IGP means that the read/write paths are separate, something AMD has only just moved past for GCN.
At some point, it seems Intel would move past this.
Trinity and either this or Intel's next IGP may be the last examples of the split memory pipeline.

I don't think it's that much of a problem for Intel. They have split caches in the GPU itself, but they still have a coherent read/write cache in the form of the LLC. But yes, it looks like there's room for improvement.

rpg.314 · Apr 26, 2012

3dilettante said:
With the programmability aspects being fleshed out, the thing that seems more important is how effectively stacked DRAM or interposer connections can begin to eat into the discrete board's memory bandwidth advantage, and how quickly the various competitors can get to that point.

That is not clear cut.

Discretes can use interposers too. Probably more expensive interposers since they are now limited to higher price points.

3dilettante · Apr 26, 2012

The CPU memory bus has additional constraints that weigh it down, thanks to the multidrop bus and the number of discontinuities from the CPU to socket to motherboard to slot to DIMM.
A discrete board could still hold the advantage, but it may not be the near order of magnitude between a desktop processor and an enthusiast video card.
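
As a rough illustration of the "near order of magnitude" gap, here is a back-of-envelope sketch of peak theoretical bandwidth. The two configurations are assumptions chosen for illustration and are not from this thread: dual-channel DDR3-1600 (128-bit total) for the desktop processor, and a 384-bit GDDR5 board at 5.5 GT/s for the enthusiast card. The function name is hypothetical.

```python
def peak_bandwidth_gbs(bus_width_bits, mega_transfers_per_sec):
    """Peak theoretical bandwidth in GB/s: bytes per transfer times transfer rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec / 1000


# Assumed configurations (illustrative, not taken from the thread).
igp = peak_bandwidth_gbs(128, 1600)        # dual-channel DDR3-1600, shared with the CPU
discrete = peak_bandwidth_gbs(384, 5500)   # 384-bit GDDR5 at 5.5 GT/s

print(f"IGP (shared):  {igp:.1f} GB/s")
print(f"Discrete card: {discrete:.1f} GB/s")
print(f"Ratio:         {discrete / igp:.1f}x")
```

Under those assumptions the numbers come out to about 25.6 GB/s versus 264 GB/s, roughly a 10x gap. These are peak figures only; sustained bandwidth is lower, and the IGP shares its bus with the CPU cores.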

The larger power envelope remains an advantage, for now.
However, I remember when Intel introduced the BTX spec to better handle heat from the CPU socket. The primary need evaporated when more efficient chips than Prescott came about.
The funny thing is that back in the days when BTX was mocked for trying to cater to an overheated CPU burning north of 100 watts, GPUs weren't 300 Watt monstrosities with blowers.

Ivy Bridge didn't significantly change the level of integration between the CPU and GPU portions, but it was hinted that the next round will be different.
Once memory spaces become shared with an Intel design or AMD's avowed goal with its heterogeneous compute model, the discrete board's real weak point as a slave device spanning a high latency bus will become more difficult to hide.
Once the GPU is on an interposer, why keep it on the far side of an expansion bus, or out of a socket?
Was something like the BTX thermal module so bad now that we have graphics boards taking up two or three expansion slots?

dkanter · Apr 26, 2012

Today, the fundamental advantage for discrete GPUs is a larger power budget and dedicated memory. Looking out 5 years, I think only the larger power budget will remain.

Sure, Intel and AMD might not throw down 400 mm² on an IGP...and dedicated GPUs will probably have more memory bandwidth, but those are largely cost-driven constraints. It's a matter of wanting to expand into higher cost markets, and that desire probably isn't there.

DK

rpg.314 · Apr 26, 2012

3dilettante said:
The CPU memory bus has additional constraints that weigh it down, thanks to the multidrop bus and the number of discontinuities from the CPU to socket to motherboard to slot to DIMM.
A discrete board could still hold the advantage, but it may not be the near order of magnitude between a desktop processor and an enthusiast video card.

The larger power envelope remains an advantage, for now.
However, I remember when Intel introduced the BTX spec to better handle heat from the CPU socket. The primary need evaporated when more efficient chips than Prescott came about.
The funny thing is that back in the days when BTX was mocked for trying to cater to an overheated CPU burning north of 100 watts, GPUs weren't 300 Watt monstrosities with blowers.

Ivy Bridge didn't significantly change the level of integration between the CPU and GPU portions, but it was hinted that the next round will be different.
Once memory spaces become shared with an Intel design or AMD's avowed goal with its heterogeneous compute model, the discrete board's real weak point as a slave device spanning a high latency bus will become more difficult to hide.
Once the GPU is on an interposer, why keep it on the far side of an expansion bus, or out of a socket?
Was something like the BTX thermal module so bad now that we have graphics boards taking up two or three expansion slots?

Once coherence extends to discretes, I think a lot of their weakness can be done away with.

I think discretes will have a bandwidth advantage since they don't have to share it with a CPU, and since they tend to employ larger die sizes, I am guessing they will be in a position to afford wider memory buses, even on an interposer.

And who knows, maybe adding a small CPU core or two (Bobcat-ish) on a discrete might not be such a bad idea after all.

nAo · Apr 26, 2012

3dilettante said:
GCN now has a read/write capability down the same path.

For what data types? UAVs?

3dilettante · Apr 26, 2012

Untyped read/write/atomic with MUBUF, image read/write/atomic with MIMG.

MTBUF has read/write for typed buffers, with the type being dictated by a resource constant.
The AMD presentation doesn't list out atomics for this one, and that does sound like it could be used for a UAV with its lack of ordering.

Nvidia's graphics export pipe is the most integrated with the cache hierarchy, since the ROPs use the L2.
AMD seems to be less so since GDS and graphics export have a side path and the ROPs are separate.
IVB looks at a higher level to resemble an earlier AMD GPU, possibly before the introduction of that little UAV cache that preceded the R/W cache hierarchy.

The ROP path seems specialized enough to keep a separation for all three. Nvidia's done the most to update the graphics domain, hence why it seems the ROP path is the most tightly integrated.
AMD's compute side has been overhauled, but it seems like its current design has compromised on a CU array that prioritizes each CU being able to serve different compute clients. The modestly evolved graphics domain exists at a slight remove, with the specialized export bus between the freer compute array and the ordered ROP and GDS hardware.

Perhaps Intel hasn't opted for closing the loop yet because of the cost involved in making the leap, and because it's really not hurting as badly for compute performance thanks to its CPU dominance.

rpg.314 said:
And who knows, maybe adding a small CPU core or two (Bobcat-ish) on a discrete might not be such a bad idea after all.

They may do it because the shrinking volume of the discrete market may make it too expensive to have a GPU-only chip. There may be a range of APUs, with some having a very high balance of GPU capability. Perhaps a gamer system with dual sockets, one heavy on the CPU, the other on GPU?

UniversalTruth · Apr 29, 2012

iwod said:
So we get back to the question: why should we (end users) want or need integrated graphics?

Davros said:
price

Price but with awful performance, it doesn't make sense.

I think the right answer is not price but fusion. In the future, when you won't be able to tell the classic CPU part of the chip from the classic graphics part, this will be very helpful for accelerating all kinds of compute.

AlphaWolf · Apr 30, 2012

UniversalTruth said:
Price but with awful performance, it doesn't make sense.

How fast does it need to be to send email or use facebook?

UniversalTruth · Apr 30, 2012

AlphaWolf said:
How fast does it need to be to send email or use facebook?

That's a question of human psychology; everyone has his/her own criteria for satisfaction.
If you ask me, personally, then my own 6870 is the absolute minimum for satisfaction.
The more the better, you know. More people understand this once they have to deal with the awful performance of those integrated solutions (even when browsing and scrolling down some web pages, you may feel how weak they actually are): they simply won't have the freedom to launch everything they want...

The price is not that much of a problem, I think. I mean in the sane range of prices, like $50, $100, or $150.

Alexko · Apr 30, 2012

Davros said:
price

Not to mention power, physical size, easier heterogeneous computing.
