Old 06-May-2011, 11:17   #1
Nick
Senior Member
 
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
Default 22 nm Larrabee

Hi all,

Since Intel's 22 nm FinFET process technology will be production ready at about the same time as TSMC's 28 nm process, I was wondering if this means Intel is actually two generations ahead now.

I think this could give them the opportunity to launch an improved Larrabee product. The inherent inefficiency of such a highly generic architecture at running legacy games could be compensated by the sheer process advantage. Other applications and games could potentially be leaps ahead of those running on existing GPU architectures (e.g. for ray-tracing, to name just one out of thousands).

In particular for consoles this could be revolutionary. They need a lot of flexibility to last for many years, and the software always has to be rewritten from scratch anyway so it can make direct use of Larrabee's capabilities (instead of taking detours through restrictive APIs).

It seems to me that the best way for AMD and NVIDIA to counter this is to create their own fully generic architecture based on a more efficient ISA.

Thoughts?

Nicolas
Old 06-May-2011, 13:05   #2
Squilliam
Beyond3d isn't defined yet
 
Join Date: Jan 2008
Location: New Zealand
Posts: 3,037
Default

Maybe we'll see a rebirth of 'Larrabee in consoles'? Although I had thought they had abandoned it completely...
__________________
It all makes sense now: Gay marriage legalized on the same day as marijuana makes perfect biblical sense.
Leviticus 20:13 "A man who lays with another man should be stoned". Our interpretation has been wrong all these years!
Old 06-May-2011, 13:50   #3
HellFire_
Junior Member
 
Join Date: May 2005
Posts: 26
Default

This is the latest Knights Ferry tech demo I could find: real-time ray tracing of Wolfenstein running on 4 Knights Ferry servers, and to be honest it still looks like shit...

http://www.youtube.com/watch?v=XVZDH15TRro

Based on that, I don't think Intel offers a viable solution for next-gen consoles. 22nm won't help that much I think.
Old 06-May-2011, 16:02   #4
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
Default

Larrabee in this context would probably be compared in terms of its rasterization rates, not ray-tracing. The ROI for ray-tracing at this point would be an exercise in how many stumbling blocks you can put in the way of a good process.
Using Larrabee primarily as a software rasterizer would probably get more competitive results given the workloads it would probably encounter.

The FinFET's benefits are interesting to consider. At the same process node for a low-power device, the 20-30% gain in power efficiency could negate the ~20% inefficiency in being x86 versus some other less cumbersome ISA.
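As a back-of-the-envelope check on the numbers above, here is a minimal sketch that treats both effects as simple multiplicative factors on performance per watt. That model is an assumption for illustration, not something established in this thread, and the function names are hypothetical.

```python
# Toy model (assumption): the ISA overhead and the process gain both
# scale performance per watt multiplicatively.

def net_efficiency(isa_penalty, process_gain):
    """Perf/W relative to a non-x86 design without the FinFET gain."""
    return (1.0 - isa_penalty) * (1.0 + process_gain)

def break_even_gain(isa_penalty):
    """Process gain needed to exactly cancel the ISA penalty."""
    return 1.0 / (1.0 - isa_penalty) - 1.0

X86_PENALTY = 0.20  # the ~20% figure quoted above

for gain in (0.20, 0.30):  # the 20-30% range quoted above
    print(f"gain {gain:.0%}: net perf/W = {net_efficiency(X86_PENALTY, gain):.2f}")
print(f"break-even gain: {break_even_gain(X86_PENALTY):.0%}")
```

Under this model the break-even point is a 25% gain, so the quoted 20-30% range lands on either side of parity.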

GPUs would probably be at a higher voltage realm, where the benefits are over 18% but probably less than the maximum 50% improvement over 32nm.

Density-wise, it would be an improvement. Historically, I would not characterize Intel's density in this particular segment as an advantage, even with a node advantage. Cayman, for example, is much denser than Sandy Bridge.
Larrabee's density was pretty bad, but this may have been due to a lack of optimization in physical design.
Without knowing how much Intel would try to optimize, 22nm would still leave Larrabee at a marked disadvantage. I would be mildly curious if it would beat the densest 40nm GPUs.

As an aside, Intel will have the distinction of having the first 22nm GPU in IB.

The power advantages to the process would be notable if it were facing off against a similar generic manycore with only a different ISA.
While the ISA probably contributed a measurable deficit to the power and performance gap, I have stated my suspicions before that it's really not the biggest factor.
The possible longer maturation period for the novel process may delay the deployment of a chip of Larrabee's size.
__________________
Dreaming of a .065 micron etch-a-sketch.
Old 06-May-2011, 17:13   #5
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

It is not at all obvious that a 22nm lrb will be a straightforward scale up of 45 nm lrb. They may very well choose to constrain the architecture or ditch x86 for the next rev.
__________________
The views presented here are my own and not my employer's.
Quote:
Originally Posted by Alexko View Post
So in a nutshell, model [BLANK] will have [BLANK], up to [BLANK], and even [BLANK] for a power consumption of just [BLANK]. Impressive.
Old 06-May-2011, 17:29   #6
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
Default

Larrabee 3 could be very different.
If there is to be a GPU card based on Larrabee, I'm not sure it would depart from Intel's "x86 above all else" mantra, and the returns on a new ASIC taking on established titans may not be too great.

My question is whether Intel even wants to make a discrete card anymore, and it still seems to be pitting the onboard GPUs in its current and future CPUs against what should have been the introduction of on-die Larrabee(ish?) cores. A lack of consistency and support could lead to a repeat of the original embarrassment.
Old 06-May-2011, 17:40   #7
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by 3dilettante View Post
Larrabee 3 could be very different.
If there is to be a GPU card based on Larrabee, I'm not sure it would depart from Intel's "x86 above all else" mantra, and the returns on a new ASIC taking on established titans may not be too great.
While some kind of x86 presence seems realistic, it might not be fused with the vector cores at instruction stream level.
Quote:
My question is whether Intel even wants to make a discrete card anymore, and it still seems to be pitting the onboard GPUs in its current and future CPUs against what should have been the introduction of on-die Larrabee(ish?) cores. A lack of consistency and support could lead to a repeat of the original embarrassment.
The discrete market is obviously declining, but I'd expect a discrete product in the beginning, if only to lower the risk of shoving it onto their shiny CPUs and having both fail.
Old 07-May-2011, 11:03   #8
Voxilla
Member
 
Join Date: Jun 2007
Posts: 263
Default

In my opinion Intel has completely abandoned the idea of producing Larrabee GPUs. If Larrabee were a viable architecture, Intel would have used it in Ivy Bridge, but they didn't.
Old 07-May-2011, 12:05   #9
Voxilla
Member
 
Join Date: Jun 2007
Posts: 263
Default

And personally I would have loved for it to be used in Ivy Bridge, because for us rendering specialists it is a dream.
Old 11-May-2011, 07:08   #10
Nick
Senior Member
 
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
Default

Quote:
Originally Posted by Voxilla View Post
If Larrabee were a viable architecture, Intel would have used it in Ivy Bridge, but they didn't.
It doesn't make sense to have x86 cores with different features. It looks like they plan to add LRBni type instructions to AVX though.

AVX is specified to support register widths up to 1024 bits. So they could relatively easily execute 1024-bit vector operations on the currently present 256-bit execution units, in 4 cycles (throughput). The obvious benefit to this is power efficiency. Then all that's left to add is gather/scatter support and the IGP can be eliminated, leaving a fully generic architecture that is both low latency and high throughput. Larrabee in your CPU socket, without compromises.
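A minimal sketch of the idea in the paragraph above: a 1024-bit logical vector operation issued as four passes over a 256-bit execution unit. It models lanes as a plain Python list; the function names are hypothetical, and one pass per cycle is an idealized assumption taken from the "4 cycles (throughput)" figure, not a description of any real hardware.

```python
LANE_BITS = 32       # single-precision float lanes (assumption)
UNIT_BITS = 256      # width of the existing execution units (from the post)
VECTOR_BITS = 1024   # proposed logical register width (from the post)

def passes_per_op(vector_bits=VECTOR_BITS, unit_bits=UNIT_BITS):
    """Number of passes needed to push one wide op through a narrow unit."""
    return -(-vector_bits // unit_bits)  # ceiling division

def wide_add(a, b, vector_bits=VECTOR_BITS, unit_bits=UNIT_BITS):
    """Add two wide vectors one unit-sized chunk at a time.

    Returns the result lanes and the number of passes that were issued.
    """
    lanes = vector_bits // LANE_BITS
    lanes_per_pass = unit_bits // LANE_BITS
    if len(a) != lanes or len(b) != lanes:
        raise ValueError(f"expected {lanes} lanes per operand")
    result = []
    passes = 0
    for start in range(0, lanes, lanes_per_pass):
        end = start + lanes_per_pass
        # One pass: the narrow unit handles lanes_per_pass lanes at once.
        result.extend(x + y for x, y in zip(a[start:end], b[start:end]))
        passes += 1
    return result, passes

total, passes = wide_add(list(range(32)), [1] * 32)
print(passes)  # one instruction decoded, four passes executed
```

The point of the sketch is that the instruction is fetched and decoded once while the execution unit stays busy for four passes, which is where the power-efficiency argument comes from.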
Old 07-May-2011, 12:43   #11
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

They did claim a 50+ core part would be out on 22nm.
Old 07-May-2011, 13:19   #12
moozoo
Junior Member
 
Join Date: Jul 2010
Posts: 31
Default

>They did claim a 50+ core part would be out on 22nm.
But it's not a GPU as such. It's purely a computing accelerator to compete with Nvidia Teslas. I don't see how that can be commercially viable without a mass-market GPU product line to pay for the development costs.
Old 07-May-2011, 13:42   #13
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by moozoo View Post
>They did claim a 50+ core part would be out on 22nm.
But it's not a GPU as such. It's purely a computing accelerator to compete with Nvidia Teslas. I don't see how that can be commercially viable without a mass-market GPU product line to pay for the development costs.
Entirely depends on just how competitive their renderer is.
Old 07-May-2011, 15:22   #14
Voxilla
Member
 
Join Date: Jun 2007
Posts: 263
Default

Also depends on how deep their pockets are, think Itanium.
Old 08-May-2011, 07:07   #15
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by Voxilla View Post
Also depends on how deep their pockets are, think Itanium.
Well, then they would definitely release a discrete part.
Old 08-May-2011, 10:38   #16
Pressure
Member
 
Join Date: Mar 2004
Posts: 751
Default

Didn't they basically say that they carved it up as experience for future IGP designs?

However, a dedicated card that could be used for GPGPU by professionals would be grand. Many professional applications (video editing, photography, etc.) could really use the power.

Nothing I hate more than waiting for a render to complete.
__________________
Never Argue With An Idiot. They'll Lower You To Their Level And Then Beat You With Experience!
Old 11-May-2011, 08:48   #17
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,842
Send a message via ICQ to CarstenS
Default

The way you're putting it makes it sound so easy…
__________________
English is not my native tongue. Before flaming please consider the possiblity that I did not mean to say what you might have read from my posts.
Work| Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration!
Old 13-May-2011, 13:08   #18
liolio
Ohio frog
 
Join Date: Jun 2005
Location: Ohio, USA
Posts: 4,172
Default

I have a couple of "honest" questions. Some here are real software developers, like Nick, and others seem to know their fair share about hardware/microelectronics and software. I'm just a geek, so no offence.

First, in regard to the comparison between SwiftShader and the Intel HD3000.
There's a 5x difference in the 3DMark06 score. OK.
*What is the cost of running SwiftShader "itself" on the CPU? Is it in the same ballpark as running the HD3000 drivers? Or higher, and if so, significantly?
*In regard to power consumption, what is the usual power consumption of a CPU running 3DMark06 on a discrete GPU? I think it would be fair to account for the incompressible/fixed CPU cost of running something like 3DMark.
*Overall, can we consider the total cost (in power and compute power) of SwiftShader to be in the same ballpark as drivers?
*Another thing: the HD3000 is not tiny by any means. If this floorplan is correct, it looks more like the equivalent of ~2 cores:


Overall it would be fairer to compare a quad-core to a dual-core+IGP. From a customer POV, what serves best: a quad-core, or a dual-core + (shitty anyway) IGP? In regard to power, how does an HD3000 compare to two SnB cores? I guess that's tough to find out. Anyway, the IGP is likely way better in perf per watt, by quite a healthy margin.

Some questions more specifically aimed at you, Nick.
* Is SwiftShader optimized for AVX already?
* What are your expectations for, say, 3DMark06 if it were implemented, if not straight to the metal, then using various libraries? How close do you think it would come to the IGP/HD3000?
* Say a benchmark or game were designed with a CPU as the hardware target: how close do you think the end result would come to an IGP (the HD3000 can serve as a reference)? Say you skip some calculations and use more complex, bigger data structures with more precomputed values, and make sacrifices or use clever tricks elsewhere. Devs could count on 4GB or more of RAM, lots of cache, etc.
Basically, do you think it would be possible to achieve the "same" result on a quad-core as with an IGP + dual-core?

Last edited by liolio; 13-May-2011 at 15:16.
Old 13-May-2011, 15:10   #19
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

Quote:
Originally Posted by liolio View Post
Another thing: the HD3000 is not tiny by any means. If this floorplan is correct, it looks more like the equivalent of ~2 cores:
It's more like 1.5. You have to count the L3 cache per core as well.
Old 13-May-2011, 21:38   #20
compres
Member
 
Join Date: Jun 2003
Location: Germany
Posts: 553
Send a message via AIM to compres Send a message via MSN to compres Send a message via Skype™ to compres
Default

Quote:
Originally Posted by rpg.314 View Post
It's more like 1.5. You have to count the L3 cache per core as well.
You are stating that the GPU does not share the L3 BW with the cores. Can anyone confirm?
Old 13-May-2011, 21:58   #21
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,071
Default

The GPU has access to the L3. The L3's dimensions are determined by the cores in SB. The tiny L2 on SB and its advanced power gating rely on there being an L3 tile per core.
Old 16-May-2011, 01:22   #22
Nick
Senior Member
 
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
Default

Quote:
Originally Posted by liolio View Post
I have a couple of "honest" questions. Some here are real software developers, like Nick, and others seem to know their fair share about hardware/microelectronics and software. I'm just a geek, so no offence.
I'm actually a computer engineer with a minor in embedded systems. But no offence taken.
Quote:
*What is the cost of running SwiftShader "itself" on the CPU? Is it in the same ballpark as running the HD3000 drivers? Or higher, and if so, significantly?
Good question. The vast majority of execution time goes to dynamically generated processing routines. The rest is divided between some 'fixed-function' processing, format conversions, and the actual 'driver' and API functionality. The latter two (which is what I assume you meant by SwiftShader "itself") are really thin layers. There's a very short path between the application and starting the actual calculations.

That said, some reviews report that Intel puts a lot of load on the CPU while rendering 3D graphics: CPU Usage in Graphics. Some even claim all geometry shaders execute on the CPU.

In any case to objectively compare pure software rendering against the IGP, I don't think we can neglect the many roles the CPU still plays for assisting the IGP. Unfortunately I don't have a Sandy Bridge system myself so I can't provide any accurate numbers.
Quote:
Is SwiftShader optimized for AVX already?
No.
Quote:
What are your expectations for, say, 3DMark06 if it were implemented, if not straight to the metal, then using various libraries? How close do you think it would come to the IGP/HD3000?
Hard to say. If I recall correctly it uses some blur filters which could be implemented way more efficiently with custom vector code instead of lots of texture lookups. But I'm sure that by having a full overview of the rendering process at an application level, there's a lot more that can be optimized by departing from the legacy graphics pipeline.

Just look at the sheer computing power. An i7-2600 can do 218 GFLOPS (not counting in any turbo mode). At 800x600, that's a staggering 450,000 floating-point operations per pixel per second, or a budget of 15,000 operations per pixel at 30 frames per second. Currently a lot of this power goes to waste though because of the lack of gather/scatter (forcing some memory accesses to be serial scalar operations), and because the API demands certain detours.
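The arithmetic above can be reproduced with a short sketch. The 218 GFLOPS figure, the 800x600 resolution and the 30 frames per second come from the paragraph above; the breakdown into 4 cores at 3.4 GHz with 16 single-precision operations per core per cycle (one 8-wide AVX add plus one 8-wide AVX multiply) is an assumption about how that peak figure is derived, and the function names are hypothetical.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak throughput in GFLOPS."""
    return cores * clock_ghz * flops_per_cycle

def per_pixel_budget(gflops, width, height, fps):
    """Floating-point operations available per pixel, per second and per frame."""
    pixels = width * height
    per_second = gflops * 1e9 / pixels
    per_frame = per_second / fps
    return per_second, per_frame

# Assumed breakdown of the peak figure (not stated in the post).
peak = peak_gflops(cores=4, clock_ghz=3.4, flops_per_cycle=16)

# Figures quoted in the post: 218 GFLOPS, 800x600, 30 frames per second.
per_second, per_frame = per_pixel_budget(218, 800, 600, 30)

print(f"peak: {peak:.1f} GFLOPS")
print(f"{per_second:,.0f} operations per pixel per second")
print(f"{per_frame:,.0f} operations per pixel per frame")
```

This gives roughly 217.6 GFLOPS peak, about 454,000 operations per pixel per second and about 15,100 per frame, in line with the rounded figures quoted.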
Quote:
Basically, do you think it would be possible to achieve the "same" result on a quad-core as with an IGP + dual-core?
With gather/scatter, FMA and AVX-1024 support, yes, I'm convinced that the IGP would be a waste of silicon. It might take many more years for gather/scatter support to be implemented though, so quad-cores are probably outdated by then. But given that the CPU is already ahead of the IGP in GFLOPS, FMA will double it again, the IGP is limited by bandwidth, and graphics itself is getting more generic, I think it's very doubtful that the IGP can outrun its fate.
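To make the gather/scatter point concrete, here is a minimal sketch of what the two operations do, written as plain Python loops. Each loop iteration stands for one scalar memory access, which is roughly how it has to be done without hardware support. The function names are hypothetical and this is an illustration only, not SwiftShader code.

```python
def gather(memory, indices):
    """Load one element per lane from arbitrary (non-contiguous) addresses."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Store one element per lane to arbitrary addresses, in place."""
    for i, v in zip(indices, values):
        memory[i] = v

# Example: an 8-lane, texture-style lookup with non-contiguous addresses.
texels = [10 * n for n in range(16)]
lanes = [3, 0, 15, 7, 7, 2, 9, 1]
fetched = gather(texels, lanes)

# Example: write two lanes back to unrelated locations.
scatter(texels, [0, 1], [-1, -2])
```

A hardware gather would take the whole index vector in a single instruction; without it, each lane's load is issued separately, which is the serialization described above.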
Old 14-May-2011, 04:12   #23
compres
Member
 
Join Date: Jun 2003
Location: Germany
Posts: 553
Send a message via AIM to compres Send a message via MSN to compres Send a message via Skype™ to compres
Default

So the more cores the more L3. And the GPU has access.

Are there any tests showing the same GPU in 2- vs 4-core Sandy Bridge configurations?
Old 14-May-2011, 14:29   #24
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,842
Send a message via ICQ to CarstenS
Default

Quote:
Originally Posted by compres View Post
So the more cores the more L3. And the GPU has access.

Are there any tests showing the same GPU in 2- vs 4-core Sandy Bridge configurations?
The additional SB cores can also access other cores' tiles, but they have to go the long way, increasing latency. It's not like the IGP suddenly has more memory for itself.
Old 14-May-2011, 14:42   #25
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070
Send a message via Skype™ to rpg.314
Default

afaik, that access is read only.