Larrabee at Siggraph

Communication between cores and special logic goes through the ring.
It's possible the texture logic blocks are distributed along the ring, which would help with congestion.

The diagrams are too abstract to give any real indication, but if the chip were implemented as they were drawn, the ring would be a limiting factor.
Even if they are distributed, 32 filtered pixels per clock could conceivably be enough to saturate the bus.
 
So according to TechReport, this is a tile-based deferred renderer. I'm assuming that's PowerVR technology they licensed...?

(From what I read, it looks like they go through the trouble of storing the scene, but don't take advantage of it by shading just the visible areas, at least not in the same way you'd expect a TBDR to do.)
 
You're incorrect on two counts. First, I didn't start this rumor. I heard the P54C/Pentagon details from two separate sources within Intel. Then I sat on the info until someone else pushed it out there, and I merely confirmed it and added what extra color I had.

You were the first to express it publicly, thus you started the rumor ;)

Second, you're incorrect that Larrabee's individual cores bear no relation to P54C. But I noticed that Intel's fact sheet suggested a relationship, and when someone quoted it in here you didn't bother to respond and subsequently dropped the issue ;)

I've been wanting to respond, but I've been too busy to do so until just now. Had to run my roommate down to the State Attorney General's office to file a complaint about the company handling his tax return; that was a fun way to spend the whole afternoon :p Also, I've been poring over this thread, the paper in question, and the glut of articles on the subject since yesterday. I'm a bit overwhelmed ATM.

I still don't believe Larrabee's individual cores have anything to do with P54C. This is nothing more than an analogy meant to convey that it's a relatively simple core with a short pipeline, or it's misdirection meant to lower the competition's expectations, or both.

Surely you know enough about CPUs not to buy the line about "adding 64-bit, SMT, etc" to P54c. They may as well have said "yeah, we added a PS2 to our NES, now it PWNS!"
 
So according to TechReport, this is a tile-based deferred renderer. I'm assuming that's PowerVR technology they licensed...?
Intel had its own tiler long ago, without deferred shading; they called it a binning renderer rather than a tiler back then ... I'm guessing they're using that as a guide.

While trying to optimize their binner they might very well find themselves forced to take out a license with IMG anyway, though; good engineers think alike when working on the same problems ... and even without ever having read PowerVR's patents, they are bound to come up with similar solutions to modern problems they never even considered when still working on their own tiler.

(As for software vs. hardware patents ... the "difference" might matter in European courts but I doubt it matters much in US ones.)
 
AFAIK the vast majority of games already use z-pre-passes to improve perf, so why should they have taken the risk of doing something different from what games already expect for basically no speed improvement?
IMHO the direction they have taken makes sense.
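
For reference, the z-pre-pass in question is just two passes over the opaque geometry: lay down depth with colour writes off, then shade against the already-populated depth buffer. Here's a minimal sketch in desktop OpenGL terms; the GL state calls are real API, but draw_opaque_geometry and the shader enum are hypothetical stand-ins for the engine's own submission code.

```c
#include <GL/gl.h>

/* Hypothetical engine hooks -- placeholders for whatever submission code
 * the engine actually uses; only the GL state calls below are real API. */
enum pass_shader { DEPTH_ONLY_SHADER, FULL_SHADER };
extern void draw_opaque_geometry(enum pass_shader shader);

void render_frame(void)
{
    /* Pass 1: depth only -- no colour writes, trivial fragment work. */
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_TRUE);
    glDepthFunc(GL_LESS);
    draw_opaque_geometry(DEPTH_ONLY_SHADER);

    /* Pass 2: full shading -- LEQUAL test means only the front-most
     * fragment at each pixel runs the expensive shader. */
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
    glDepthMask(GL_FALSE);
    glDepthFunc(GL_LEQUAL);
    draw_opaque_geometry(FULL_SHADER);

    /* Transparent surfaces still get drawn afterwards, back to front. */
}
```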
 
Deferred shading doesn't change the behaviour of the GPU compared to non-deferred shading from the rendering engine's point of view; they are both equally not "what games already expect".

Compared to, say, tiling inside the rendering engine (which I would only expect on consoles), simply dropping the z-pre-pass is a small enough change that they could convince/bribe developers to do it for them ... and that's a pretty big advantage in workload. (Maybe with an "and we throw in a source code license for our rasterizer for free" or an "and we throw in a license for our irregular Z-buffer shadowbuffer solution for free" promise ... or simply cold hard cash.)
 
Deferred shading doesn't change the behaviour of the GPU compared to non-deferred shading from the rendering engine's point of view; they are both equally not "what games already expect".
Wait, here deferred has a different meaning. I wasn't talking about deferred shading in the modern sense, I was referring to what IIRC PowerVR chips do.

Compared to, say, tiling inside the rendering engine (which I would only expect on consoles), simply dropping the z-pre-pass is a small enough change that they could convince/bribe developers to do it for them ... and that's a pretty big advantage in workload. (Maybe with an "and we throw in a source code license for our rasterizer for free" or an "and we throw in a license for our irregular Z-buffer shadowbuffer solution for free" promise ... or simply cold hard cash.)

Oh yeah, on consoles it would make sense and I'd go for it.
 
I still don't believe Larrabee's individual cores have anything to do with P54C. This is nothing more than an analogy meant to convey that it's a relatively simple core with a short pipeline, or it's misdirection meant to lower the competition's expectations, or both.

From the SIGGRAPH paper in section 3.2: "Larrabee's scalar pipeline is derived from the dual-issue Pentium processor."
 
Does anyone know if the vector units can operate on integers? Data format conversion aside..
 
Speaking of consoles, would there be any sense in the idea of a high-end version of Larrabee as the next Xbox? Meaning without another CPU, without another GPU. Seems like a natural progression, something that I'm sure would suit the likes of Epic and Id given their PC roots and programming style.

Would the costs be prohibitive? Could Larrabee scale in terms of cores and clock rate to be competitive in four years?
 
Does anyone know if the vector units can operate on integers? Data format conversion aside..

Interesting question, but you would think the answer is yes, because it's required for DX10/DX11? Otherwise it would break the SIMDing of shader programs that use integer types. x86 has far too few non-vector integer regs to be of any use.
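
Just to illustrate the point: a SIMD-ing shader compiler has to map integer shader ops across vector lanes the same way it maps float ops. A sketch using today's SSE2 integer intrinsics purely as an analogy (this is obviously not Larrabee's 16-wide ISA, just the idea):

```c
#include <emmintrin.h>   /* SSE2 integer intrinsics */

/* Add two arrays of 32-bit ints four lanes at a time -- the same way a
 * SIMD-ing shader compiler maps an "int" operation across fragments.
 * Assumes n is a multiple of 4, purely to keep the sketch short. */
void add_i32(int *dst, const int *a, const int *b, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi32(va, vb));
    }
}
```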
 
You are right Timothy, someone just made me notice exactly the same thing over MSN :)
 
Speaking of consoles, would there be any sense in the idea of a high-end version of Larrabee as the next Xbox? Meaning without another CPU, without another GPU. Seems like a natural progression, something that I'm sure would suit the likes of Epic and Id given their PC roots and programming style.

Would the costs be prohibitive? Could Larrabee scale in terms of cores and clock rate to be competitive in four years?
Can't answer all these questions but doesn't MS want to own the IP of the processors they use?
If this is the case do you think Intel would sell their IP to MS?
 
Wait, here deferred has a different meaning. I wasn't talking about deferred shading in the modern sense, I was referring to what IIRC PowerVR chips do.
You made the point that deferred shading is a risk because it isn't "what games expect". For that to be true, the deferred shading has to act differently than the non-deferred shading tiler, and I just don't see any cases where that would be true; there's only so much you can do with shaders ... stuff like z-kill will make the deferred shader have to treat the surface as transparent, but it's still going to give the same result. Timing-wise they both act differently from a direct rasterizer, but that's a given.
 
Speaking of consoles, would there be any sense in the idea of a high-end version of Larrabee as the next Xbox?

From what I've heard, Microsoft and IBM have already started working on the next Xbox chip. Now that Microsoft has all the development tools and such for PowerPC, I think they will stick with PowerPC for at least one more generation, if only for the sake of compatibility.

Of course, once the other microprocessor vendors hear more about Larrabee, we might see lots of big vector units being tacked on to conventional cores.

Meaning without another CPU, without another GPU.

Now that is an interesting thought. As the Larrabee cores can boot operating systems and handle virtual memory and such, you wouldn't actually need a host processor. You could just have a Larrabee...
 
MfA, what nAo is saying is that if games do a Z-only pass then there's no need for the tiler to do anything special to "draw only what's visible", which was a tiler's claim to fame in the past. The risk is in trying to do something fancy which rearranges the draw order.

I don't know if in the past tilers sorted polys within each bin from front to back to minimize overdraw, but nowadays with bigger tiles and more polys per frame the only reasonable way to avoid overdraw is to blast through each tile with Z-only rendering and then begin normal rendering. I actually thought that this was how tilers did it in the past, too. Anyway, since most games already do a Z-only pass themselves, there's no need for the tiler to do it.

I think this distinction between TBR and TBDR that people are making is rather moot. IMO, the D in TBDR refers to the binning, not overdraw elimination or poly reordering.
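
FWIW, the binning step itself is dead simple in either flavour: bucket each post-transform triangle into whatever screen tiles its bounding box touches, then process tile by tile; the deferred/non-deferred distinction only shows up in what you do per tile afterwards. A rough sketch, with tile size and data structures made up purely for illustration:

```c
#define TILE_SIZE   64     /* assumed tile edge in pixels, just for the sketch */
#define MAX_PER_BIN 4096   /* arbitrary cap to keep the example simple */

typedef struct { float x[3], y[3]; } tri_t;                       /* screen-space verts */
typedef struct { int count; int tri_index[MAX_PER_BIN]; } bin_t;  /* one bin per tile   */

/* Append triangle index t to every tile bin its bounding box overlaps. */
static void bin_triangle(bin_t *bins, int tiles_x, int tiles_y,
                         const tri_t *tri, int t)
{
    float minx = tri->x[0], maxx = tri->x[0];
    float miny = tri->y[0], maxy = tri->y[0];
    for (int v = 1; v < 3; ++v) {
        if (tri->x[v] < minx) minx = tri->x[v];
        if (tri->x[v] > maxx) maxx = tri->x[v];
        if (tri->y[v] < miny) miny = tri->y[v];
        if (tri->y[v] > maxy) maxy = tri->y[v];
    }
    int tx0 = (int)minx / TILE_SIZE, tx1 = (int)maxx / TILE_SIZE;
    int ty0 = (int)miny / TILE_SIZE, ty1 = (int)maxy / TILE_SIZE;
    for (int ty = ty0; ty <= ty1; ++ty)
        for (int tx = tx0; tx <= tx1; ++tx)
            if (tx >= 0 && tx < tiles_x && ty >= 0 && ty < tiles_y) {
                bin_t *b = &bins[ty * tiles_x + tx];
                if (b->count < MAX_PER_BIN)
                    b->tri_index[b->count++] = t;
            }
}
```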
 
They didn't sort them exactly, but they probably did render (and shade) all opaque surfaces first. There is only so much the developer can do to trip up deferred shading though ... z-kill, changing the Z-test ... what more? When it detects chicanery it can always fall back to normal tiled rasterization (it has to be able to do that anyway for transparent surfaces).
 
~1 million triangles/frame may be about right for current games, but tessellation dramatically boosts that, and in a way that doesn't hurt the basic setup engine.
Should presumably be well in use by the time Larrabee comes out?
Tessellation happens before setup, so if anything setup is more likely to be a bottleneck.
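
Rough numbers for that, under assumptions that are entirely mine (60 fps, 16x amplification from tessellation, 1 triangle/clock setup at 2 GHz):

```c
#include <stdio.h>

int main(void)
{
    const double tris_per_frame = 1e6;   /* ~1 million triangles/frame, as above */
    const double fps            = 60.0;  /* assumed frame rate                   */
    const double tess_factor    = 16.0;  /* assumed tessellation amplification   */
    const double setup_rate_hz  = 2e9;   /* assumed 1 tri/clock setup at 2 GHz   */

    double base_rate = tris_per_frame * fps;      /* 60M tris/s before tessellation */
    double tess_rate = base_rate * tess_factor;   /* 960M tris/s after              */

    printf("pre-tessellation : %.0f Mtris/s\n", base_rate / 1e6);
    printf("post-tessellation: %.0f Mtris/s (%.0f%% of the assumed setup rate)\n",
           tess_rate / 1e6, 100.0 * tess_rate / setup_rate_hz);
    return 0;
}
```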

I think the sort-X terminology is really not a good one to distinguish tilers from "traditional" GPUs; it's usually used when talking about parallelization ... sort-middle happens to be the one "traditional" GPUs use internally for parallelization too.
That's true. At least some GPUs order primitives at the output of the vertex shader. This makes good use of the post transform cache.

Ed Grochowski (one of the authors of the SIGGRAPH paper) was one of the chief architects of the original Pentium (source), so the Larrabee team certainly had access to someone that was familiar with the original Pentium.
I hope he has a better memory than I do. I have trouble remembering details from a few months ago, let alone more than 10 years. :D

Yet, Atom is clearly a disappointment. It is a chip aimed at a market that doesn't yet exist: something between a mobile PDA/phone/iPhone device and a full-blown laptop. Unless this new market segment takes off, I can't see Atom doing so well.
Someone at Intel told me there's been more interest in Atom than expected so if he's correct it's not disappointing so far.
 
Even if they are distributed, 32 filtered pixels per clock could conceivably be enough to saturate the bus.

Only if they're transferred as fp32: 32 pixels * 4 components/pixel * 4 bytes/component = 512 bytes. The ability to convert 8-bit unorm to fp32 when reading from L1 means that for 8-bit textures they can stay 8-bit over the ringbus: only 128 bytes for 32 filtered pixels.

I'm really curious whether they do fixed-function filtering for fp16 and fp32. The two reasons they gave for having FF filtering hardware were (1) texture decompression (DXT formats), and (2) area-efficient 8-bit filtering. The first only applies to 8-bit formats, the second is much less compelling for fp16 and fp32.
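
To put a rough scale on the ring traffic above, here's the same arithmetic carried through to bandwidth. The clock rate is an assumption on my part, purely so the numbers have units; the real clock hasn't been disclosed.

```c
#include <stdio.h>

int main(void)
{
    const double pixels_per_clk = 32.0;  /* filtered results per clock, from the discussion */
    const double components     = 4.0;   /* RGBA */
    const double clock_ghz      = 2.0;   /* assumed clock, only to put a scale on it */

    double fp32_bytes   = pixels_per_clk * components * 4.0;  /* 512 B/clk */
    double unorm8_bytes = pixels_per_clk * components * 1.0;  /* 128 B/clk */

    printf("fp32 over the ring  : %4.0f B/clk = %6.0f GB/s at %.1f GHz\n",
           fp32_bytes, fp32_bytes * clock_ghz, clock_ghz);
    printf("unorm8 over the ring: %4.0f B/clk = %6.0f GB/s at %.1f GHz\n",
           unorm8_bytes, unorm8_bytes * clock_ghz, clock_ghz);
    return 0;
}
```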
 
I have a few questions about Larrabee:

1) Is it possible that someone/some company writes their own DirectX/OpenGL driver-program for Larrabee and sells it like a normal program? If this "driver" is maybe 20% faster than the "driver" supplied by Intel, and comes with better support, maybe that's all that's needed.

2) Larrabee allows everyone to try any fancy rendering algorithm at high speed (except something which changes the texturing, like FAST-texturing for example). So would this allow faster progress from now on, with new rendering systems being just programs? Could this mean that we would actually see something like "Delaystreams", "Image based Rendering", "Talisman" or even a high-end software TBDR from IMG ;) in the future?
 