kyetech said:
So is Larrabee a pure x86 core array, or does each core have co-processing and fixed function with it?
From what I know it has both: an array of x86 cores and fixed-function HW. The FF part should be for texture sampling, I assume; it is awfully expensive to do on a programmable CPU. Basically this is what Sony used to show that RSX can achieve around 1TF of computing power: they simply took the sampling HW and calculated how many instructions a regular CPU/GPU would need to compute the same thing in software.
Btw, can anyone make a rough guess at how many instructions it takes to take a 16x anisotropic sample from a 3D texture?
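To put a very rough lower bound on that myself, here is a small Python sketch that just counts texel fetches and interpolations. Everything in it is an assumption for illustration, not a description of any real sampler: I assume 16x anisotropy means up to 16 probes along the axis of anisotropy, that each probe on a 3D texture reads a 2x2x2 texel block from each of two mip levels, and that the texture has 4 channels. The function name `aniso_3d_sample_cost` is made up. Address calculation, LOD selection and texel format decoding are not counted at all, so the real instruction count would be higher.

```python
def aniso_3d_sample_cost(max_aniso=16, channels=4):
    """Count the raw work for one filtered sample (assumed filtering scheme)."""
    texels_per_level = 8          # assumed 2x2x2 neighbourhood in a 3D texture
    mip_levels = 2                # assumed blend between two adjacent mip levels
    texels_per_probe = texels_per_level * mip_levels

    # Reducing 8 texels to 1 takes 7 lerps per level, plus 1 lerp between levels.
    lerps_per_probe = (texels_per_level - 1) * mip_levels + 1

    return {
        "texel_fetches": max_aniso * texels_per_probe,
        "scalar_lerps": max_aniso * lerps_per_probe * channels,
        # Summing the probes into one result: (N - 1) adds per channel.
        "scalar_adds": (max_aniso - 1) * channels,
    }


if __name__ == "__main__":
    for name, count in aniso_3d_sample_cost().items():
        print(name, count)
```

Each lerp is itself a couple of arithmetic operations on a CPU without a dedicated instruction for it, so under these assumptions a single sample already lands in the thousands of scalar operations before any addressing work.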
If it's pure x86, I don't see it doing that much better than an 8 core Nehalem
Why so? It is not as if GPUs run a lot of OoO code with lots of random reads from RAM.
The main issue of course is how do you render a real time scene without having to duplicate the data sets in memory
I see no reason why it would be any different than it is today.
But 8 core Nehalem is only rumored to be capable of ~200 DP GFLOPs, whereas Larrabee is north of a TFLOP.
Actually I have no idea what the Nehalem speed will be, but I do know that Intel said an 8-core Gesher* at 4GHz can achieve that speed (page 31). Larrabee is stated to be at 1.7-2.5GHz with 16-24 cores, achieving 0.2-1TF/s. I remember some slides showing that Larrabee will be at 48 cores in 2010. Assuming a slightly higher clock speed to go with that, I wouldn't be surprised if we had around 2-3TF to play with in 2010.
*) Got the PDF before they castrated it, yay for browser caches
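For anyone who wants to check the 2-3TF guess, here is the back-of-envelope arithmetic as a Python sketch. It only uses the numbers quoted above, but the pairing is my assumption: I take the top configuration (24 cores at 2.5GHz) to be the one that reaches 1TF, back out the implied FLOPs per cycle per core, and scale that to 48 cores. The 3.0GHz clock is a made-up value standing in for "a slightly higher clock speed", and the function names are just for illustration.

```python
def implied_flops_per_cycle(gflops, cores, clock_ghz):
    """Per-core FLOPs per cycle implied by a quoted peak throughput."""
    return gflops / (cores * clock_ghz)


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Peak throughput = cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    # Assumed pairing: 24 cores at 2.5GHz is the configuration that hits 1TF.
    per_core = implied_flops_per_cycle(1000.0, cores=24, clock_ghz=2.5)
    print("implied FLOPs/cycle/core:", round(per_core, 2))

    # 48 cores at the same clock, then at a hypothetical 3.0GHz.
    for clock in (2.5, 3.0):
        print(clock, "GHz ->", round(peak_gflops(48, clock, per_core)), "GFLOPs")
```

Under those assumptions 48 cores come out at roughly 2TF at the same clock and somewhat above that with a faster clock, which is where the 2-3TF figure comes from.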
Will 1TF be enough to enable real time rendering on it, without the use of fixed function or co-CPUs?
Assuming that texture sampling is still done (mostly) in dedicated HW I'd say yes, you can do pretty good real-time rendering on Larrabee.
Anyway, here should be pretty much all the information about Larrabee known so far. If there is something missing, just tell me.
[edit]
I remember some link showing that Larrabee has two memory controllers on the chip, at opposite ends of the ring bus, and that the fixed-function HW was there as well. That sounds quite logical too, assuming that sampling is there: when a core asks for a sample, it likely has to go through the memory controller anyway.
[edit2]
Duh, it was on that same .pdf on page 16