Old 02-Jun-2008, 03:16   #1
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default Larrabee at Siggraph

Intel will present a paper about Larrabee at Siggraph this summer:

Larrabee: A Many-Core x86 Architecture for Visual Computing

Quote:
This paper introduces the Larrabee many-core visual computing architecture (a new software rendering pipeline implementation), a many-core programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as fixed-function co-processors. This provides dramatically higher performance per watt and per unit of area than out-of-order CPUs on highly parallel workloads and greatly increases the flexibility and programmability of the architecture as compared to standard GPUs.
I'm sure this post will put a smile on Geo's face
__________________
[my blog]
Isn't it enough to see that a garden is beautiful without having to believe that there are fairies at the bottom of it too? [Douglas Adams]
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way
Old 03-Jun-2008, 14:47   #2
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 2,647
Default

Quote:
Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as fixed-function co-processors.
I hope there are some nifty disclosures on just how much is under that umbrella.
__________________
Dreaming of a .065 micron etch-a-sketch.
Old 03-Jun-2008, 15:58   #3
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Nice find, nAo.

Quote:
Originally Posted by 3dilettante View Post
I hope there are some nifty disclosures on just how much is under that umbrella.
Definitely looking forward to opening that can of worms.

DK over @ RWT has a discussion thread on the subject (credit nAo again).

Speaking of cans of worms... here's an interesting supposition by Doug Siebert (long-time RWT poster):
Quote:
Sure, there may still be some discrete Larrabee parts if there is demand for them in the HPC world or Intel really does plan on pursuing the discrete GPU market. I'm skeptical of that, but I guess once Nvidia is dead in a few years Intel won't want to let AMD have that market to itself, even though it will be much smaller than it is today!
Old 03-Jun-2008, 16:19   #4
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default

From the abstract, it seems this talk/paper is more software-oriented than hardware-oriented. I wouldn't be surprised if we don't learn any new technical details about Larrabee's hardware architecture.
On the other hand, I can't wait to learn more about its software architecture, to get a glimpse of how Intel will likely expose the hardware to software engineers. (I'm not exactly a CUDA fanboy.)

Regarding fixed-function units, we are going to see some TMUs and probably not much more than that. Adieu rasterizer..
Old 03-Jun-2008, 16:23   #5
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Quote:
Originally Posted by nAo View Post
From the abstract, it seems this talk/paper is more software-oriented than hardware-oriented. I wouldn't be surprised if we don't learn any new technical details about Larrabee's hardware architecture.
On the other hand, I can't wait to learn more about its software architecture, to get a glimpse of how Intel will likely expose the hardware to software engineers. (I'm not exactly a CUDA fanboy.)

Regarding fixed-function units, we are going to see some TMUs and probably not much more than that. Adieu rasterizer..
Even if they don't give us any more details on Larrabee's micro-architecture, I'm sure they'll at least provide relevant instruction throughput numbers. Rather useful in your line of work, I'd think.
Old 03-Jun-2008, 17:02   #6
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 2,647
Default

Quote:
Originally Posted by nAo View Post
From the abstract, it seems this talk/paper is more software-oriented than hardware-oriented. I wouldn't be surprised if we don't learn any new technical details about Larrabee's hardware architecture.
On the other hand, I can't wait to learn more about its software architecture, to get a glimpse of how Intel will likely expose the hardware to software engineers. (I'm not exactly a CUDA fanboy.)

Regarding fixed-function units, we are going to see some TMUs and probably not much more than that. Adieu rasterizer..
Even knowing the outlines of how TMU functionality is accessed by the x86 threads will be informative, unless the paper turns out to be entirely made of fluff.
Old 06-Jun-2008, 19:27   #7
Mat3
Member
 
Join Date: Nov 2005
Posts: 138
Default

There'll be some scalar processors in there too of course, but it's interesting that they will be depending on a wide vector unit to do a lot of the work.
Old 06-Jun-2008, 20:19   #8
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Quote:
Originally Posted by Mat3 View Post
There'll be some scalar processors in there too of course, but it's interesting that they will be depending on a wide vector unit to do a lot of the work.
Isn't it more likely that the "ALUs" will simply support both vector and scalar instructions?
Old 06-Jun-2008, 20:30   #9
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 2,647
Default

Quote:
Originally Posted by Mat3 View Post
There'll be some scalar processors in there too of course, but it's interesting that they will be depending on a wide vector unit to do a lot of the work.
The scalar units exist alongside a wide SIMD unit in each core.
Larrabee's descriptions don't hint at any scalar-only cores.
Old 16-Jun-2008, 07:11   #10
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default

It seems Intel licensed Pixomatic IP, another indirect confirmation that Forsyth and Abrash are working on Larrabee's software rasterizer:

http://softwarecommunity.intel.com/i...ghai_short.ppt
Old 08-Jul-2008, 16:50   #11
darkblu
Senior Member
 
Join Date: Feb 2002
Posts: 2,481
Default

Quote:
Originally Posted by nAo View Post
It seems Intel licensed Pixomatic IP, another indirect confirmation that Forsyth and Abrash are working on Larrabee's software rasterizer:

http://softwarecommunity.intel.com/i...ghai_short.ppt
i'd have been very surprised if Abrash was not working on a software larrabee rasterizer : )
Old 08-Jul-2008, 17:08   #12
^M^
Junior Member
 
Join Date: Jul 2006
Posts: 45
Default

Intel bought Pixomatic from RAD at the end of 2005: http://www.radgametools.com/pixomain.htm

But RAD seems to have kept the right to license current (at the time) versions.
Old 08-Jul-2008, 17:26   #13
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default

Ouch, they also have a DX9 implementation, I wasn't aware of that
I wonder if Abrash&Co. had the opportunity to ask for specific hardware optimizations that would speed up software rasterization.
Old 08-Jul-2008, 17:33   #14
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Quote:
Originally Posted by nAo View Post
Ouch, they also have a DX9 implementation, I wasn't aware of that
I wonder if Abrash&Co. had the opportunity to ask for specific hardware optimizations that would speed up software rasterization.
Would the new Radix 16 Divider and Super Shuffle Engine in Penryn family CPUs qualify?
Old 08-Jul-2008, 17:47   #15
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 2,647
Default

Division is helped by the Radix 16 divider and variable-latency divides, but how is that specific to graphics?
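
For anyone wondering what "Radix 16" buys: a digit-recurrence divider retires a fixed number of quotient bits per iteration, so a radix-16 unit (4 bits per step) needs roughly half the iterations of a radix-4 unit (2 bits per step). The Python below is a toy model of that idea only. The function name and the step counting are illustrative assumptions, and real hardware selects each digit with lookup tables rather than an integer divide; nothing here reflects Penryn's actual circuit.

```python
def divide_digit_recurrence(dividend, divisor, radix_bits):
    """Toy long division that produces `radix_bits` quotient bits per step.

    Returns (quotient, remainder, steps). Illustrative model only.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    radix = 1 << radix_bits
    # Number of radix-sized digits needed to cover the dividend (ceiling division).
    steps = max(1, -(-dividend.bit_length() // radix_bits))
    quotient = 0
    remainder = 0
    for step in range(steps):
        shift = (steps - 1 - step) * radix_bits
        # Bring down the next digit of the dividend.
        remainder = (remainder << radix_bits) | ((dividend >> shift) & (radix - 1))
        # Pick the quotient digit (always < radix because remainder < divisor * radix).
        digit = remainder // divisor
        remainder -= digit * divisor
        quotient = (quotient << radix_bits) | digit
    return quotient, remainder, steps


if __name__ == "__main__":
    for bits in (2, 4):  # radix-4 versus radix-16
        q, r, steps = divide_digit_recurrence(1000003, 97, bits)
        print(f"radix {1 << bits}: q={q} r={r} steps={steps}")
```

With a 20-bit dividend such as 1000003, this takes 10 steps at 2 bits per step and 5 steps at 4 bits per step, which is the whole appeal; none of it is graphics-specific.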

Super-shuffle isn't much of a graphics-only optimization either, as it brings Intel's shuffle latencies into the same range as, or better than, what AMD has had for years.
Conroe had pretty long latencies. Netburst's latencies were pretty brutal.

There was that instruction for blending, which might seem closer to a graphics optimization.
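
Assuming the blending instruction meant here is one of the SSE4.1 blends (BLENDPS and its relatives), note that these perform a per-lane select between two registers rather than an alpha blend. A minimal Python model of BLENDPS's immediate-mask behaviour follows; `blendps` is an illustrative name, not a real API.

```python
def blendps(a, b, imm8):
    """Model of SSE4.1 BLENDPS on two 4-lane float vectors.

    Lane i of the result comes from `b` when bit i of imm8 is set,
    otherwise from `a`.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("BLENDPS operates on 4 single-precision lanes")
    return [b[i] if (imm8 >> i) & 1 else a[i] for i in range(4)]


if __name__ == "__main__":
    old_pixels = [1.0, 2.0, 3.0, 4.0]
    new_pixels = [10.0, 20.0, 30.0, 40.0]
    # Keep new values only in lanes 0 and 2 (for example, lanes that passed a coverage test).
    print(blendps(old_pixels, new_pixels, 0b0101))
```

A per-lane select like this is handy for masking out pixels in a software rasterizer, which is why it can look like a graphics-motivated addition even though it is a general-purpose operation.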
Old 08-Jul-2008, 19:49   #16
Humus
Crazy coder
 
Join Date: Feb 2002
Location: Stockholm, Sweden
Posts: 3,134
Default

Well, none of it can be said to be graphics-specific, since it's all useful for other purposes as well, but there are a few candidates where graphics might have been one of the main motivators. Super-shuffle would count among those, as would the blending instructions, and most certainly the DPPS instruction, which can implement DOT2/DOT3/DOT4 stuff. The fast divider is a bit too generic to call graphics-related, although it'll help with a number of important tasks.
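
To make the DOT2/DOT3/DOT4 point concrete, here is a small Python model of how DPPS uses its 8-bit immediate: the high four bits choose which lane products enter the sum, and the low four bits choose which output lanes receive it. The names `dpps`, `DOT2`, `DOT3` and `DOT4` are illustrative assumptions; on real hardware this would be the DPPS instruction or the `_mm_dp_ps` intrinsic.

```python
def dpps(a, b, imm8):
    """Model of SSE4.1 DPPS on two 4-lane float vectors."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("DPPS operates on 4 single-precision lanes")
    # High nibble: which lane products take part in the sum.
    products = [a[i] * b[i] if (imm8 >> (4 + i)) & 1 else 0.0 for i in range(4)]
    total = (products[0] + products[1]) + (products[2] + products[3])
    # Low nibble: which output lanes receive the sum (the rest are zeroed).
    return [total if (imm8 >> i) & 1 else 0.0 for i in range(4)]


# Immediates that write the result to lane 0 only.
DOT2 = 0x31  # multiply lanes 0-1
DOT3 = 0x71  # multiply lanes 0-2
DOT4 = 0xF1  # multiply lanes 0-3

if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [5.0, 6.0, 7.0, 8.0]
    for name, imm in (("DOT2", DOT2), ("DOT3", DOT3), ("DOT4", DOT4)):
        print(name, dpps(a, b, imm))
```

The same instruction covers all three dot-product widths just by changing the immediate, which matches the shader-style DOT2/DOT3/DOT4 operations.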

The dot product instruction is particularly interesting to note, especially given that Intel never really seemed to show much interest in graphics in previous SSE instruction sets.
__________________
[ Visit my site ]
I speak for myself and only myself.
Old 09-Jul-2008, 04:57   #17
Geo
Mostly Harmless
 
Join Date: Apr 2002
Location: Uffda-land
Posts: 9,156
Default

Damnit, I'm going to have to buy a Larrabee viddy card. . . .just because. A) To show what an iconoclast I am. B) As a collector's item of when Intel began their conquest of GPUs. C) As a collector's item of Intel's Folly in thinking they could conquer GPUs. D) Because I enjoy bitching about IHVs not providing robust software support and compatibility, and this is almost certain to be "a target rich environment" with Larrabee.

Or some combination of the above.



Has anybody heard a productization name rumour yet? "Bitchin' Fast 3D" or somesuch?

Oh yes, and y'all feel free to leak that paper to me/B3D in advance. I understand Dougie sees B3D ninjas behind every potted plant.
__________________
"We'll thrash them --absolutely thrash them."--Richard Huddy on Larrabee
"Our multi-decade old 3D graphics rendering architecture that's based on a rasterization approach is no longer scalable and suitable for the demands of the future." --Pat Gelsinger, Intel
". . .its taking us longer than we would have liked to get a [Crossfire game] profiling system out there" --Terry Makedon, ATI, July 2006
"Christ, this is Beyond3D; just get rid of any f**ker talking about patterned chihuahuas! Can the dog write GLSL? No. Then it can f**k off." --Da Boss
Old 09-Jul-2008, 05:08   #18
Andrew Lauritzen
AndyTX
 
Join Date: May 2004
Location: British Columbia, Canada
Posts: 1,061
Default

Ahahaha... ahh man Geo, you're too harsh
__________________
The content of this message is my personal opinion only.
Old 09-Jul-2008, 05:11   #19
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default

Too bad we have to wait another month to read that paper..
Old 09-Jul-2008, 05:20   #20
Geo
Mostly Harmless
 
Join Date: Apr 2002
Location: Uffda-land
Posts: 9,156
Default

Quote:
Originally Posted by Andrew Lauritzen View Post
Ahahaha... ahh man Geo, you're too harsh
Man, anytime Intel wants to step up to our interview, it's been out there for them to do so.

The god's honest truth is I want them to succeed because I love high-end graphics and the more serious deep-pocket players there are the happier I am.

And I've been saying for over a year now that the vibes are Intel is genuinely alarmed this time and that means they won't take one swing and give up. My only point is they need to be psychologically prepared to get their nose bloodied in round one, because they probably will, and it will likely be on the software side no matter how sweet their hardware is on the theoreticals.
Old 09-Jul-2008, 05:55   #21
Rufus
Member
 
Join Date: Oct 2006
Posts: 214
Default

Quote:
Originally Posted by ShaidarHaran View Post
Would the new Radix 16 Divider and Super Shuffle Engine in Penryn family CPUs qualify?
Not likely if Jon Stokes's confirmation of the rumor mill is correct that Larrabee is an original Pentium core with a vector unit tacked on:
http://arstechnica.com/news.ars/post...ech-sorta.html
Old 09-Jul-2008, 06:02   #22
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 4,199
Default

From Ars
Quote:
Intel will claim that Larrabee has 20x the performance per watt of a Core 2 Duo and half the single-threaded performance.
20x perf at what? texture sampling?
Quote:
It also has a 4MB coherent L2, and three-operand vector instructions.
Only 128KB of L2 per core? I'd double that..
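
The 128KB figure only works out if the 4MB of L2 is split evenly across 32 cores. That core count is not stated anywhere in the Ars quote; it is simply what the two numbers imply together. A quick sanity check in Python, with hypothetical helper names:

```python
KIB_PER_MIB = 1024


def implied_core_count(total_l2_mib, per_core_l2_kib):
    """Cores implied by an evenly partitioned L2 (assumption: equal slices)."""
    total_kib = total_l2_mib * KIB_PER_MIB
    if total_kib % per_core_l2_kib:
        raise ValueError("total L2 is not a whole multiple of the per-core slice")
    return total_kib // per_core_l2_kib


def total_l2_mib(cores, per_core_l2_kib):
    """Total L2 in MiB for a given core count and per-core slice."""
    return cores * per_core_l2_kib / KIB_PER_MIB


if __name__ == "__main__":
    cores = implied_core_count(4, 128)
    print("implied cores:", cores)
    # Doubling the per-core slice at the same core count doubles the total.
    print("total L2 with 256KB per core:", total_l2_mib(cores, 256), "MiB")
```

So doubling the per-core L2 at the same implied core count would mean 8MB of L2 on the die.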
Old 09-Jul-2008, 13:34   #23
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Quote:
Originally Posted by Rufus View Post
Not likely if Jon Stokes's confirmation of the rumor mill is correct that Larrabee is an original Pentium core with a vector unit tacked on:
http://arstechnica.com/news.ars/post...ech-sorta.html
That's a gross over-simplification at best, completely wrong at worst.
Old 09-Jul-2008, 13:36   #24
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,814
Default

Quote:
Originally Posted by nAo View Post
From Ars

20x perf at what? texture sampling?
No doubt it's peak SP GFLOP rate.

Quote:
Originally Posted by nAo View Post
Only 128KB of L2 per core? I'd double that..
More cache is always better (until you have to increase latency to accommodate size), but 128KB per SIMD seems adequate to me...
Old 09-Jul-2008, 15:18   #25
zsouthboy
Member
 
Join Date: Aug 2003
Location: Derry, NH
Posts: 563
Default

Quote:
Originally Posted by ShaidarHaran View Post
That's a gross over-simplification at best, completely wrong at worst.
What's wrong with it exactly? It's based on the old Pentium + years of bugfixes, right?