Jawed
Legend
http://techresearch.intel.com/UserFiles/en-us/File/SCC_Sympossium_Mar162010_GML_final.pdf
I know this isn't a GPU, but I can't help feeling there's a lot of stuff in here that's relevant to Larrabee.
The chip is huge (567mm² on 45nm): 48 P54C-based x86 cores, 12MB of L2 and 1.3 billion transistors in total, with 48 million transistors going to each "tile" of 2 cores + gubbins.
Actually one thing really leaps out at me in that list: this chip is bigger than GF100 and is on a notionally similar node (45 versus 40nm), but the transistor count in GF100 is ~2.3x higher.
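For anyone who wants to check the arithmetic, here is a rough back-of-envelope sketch using only the figures quoted above. The variable names are mine, and the GF100 number is simply what the ~2.3x ratio implies, not a quoted spec.

```python
# Figures quoted in the post; nothing here comes from outside it.
die_area_mm2 = 567
total_transistors = 1.3e9
cores = 48
cores_per_tile = 2
transistors_per_tile = 48e6
gf100_ratio = 2.3  # "~2.3x higher", as stated in the post

tiles = cores // cores_per_tile                      # tiles on the chip
in_tiles = tiles * transistors_per_tile              # transistors inside tiles
outside_tiles = total_transistors - in_tiles         # everything else on the die
density_m_per_mm2 = total_transistors / die_area_mm2 / 1e6
gf100_implied = total_transistors * gf100_ratio      # implied by the ratio only

print(f"tiles: {tiles}")
print(f"transistors in tiles: {in_tiles / 1e9:.3f} billion")
print(f"transistors outside tiles: {outside_tiles / 1e6:.0f} million")
print(f"density: {density_m_per_mm2:.2f} million transistors per mm²")
print(f"GF100 implied by the ratio: {gf100_implied / 1e9:.2f} billion")
```

So 24 tiles account for about 1.15 billion of the 1.3 billion transistors, and the whole die averages roughly 2.3 million transistors per mm².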
I suppose the key thing is that there's no cache coherency in hardware. Each tile has a 16KB message passing buffer (MPB) which is shared by the 2 cores, and the router gubbins in each tile deals with getting packets from MPB to MPB. It sort of seems like "software DMA" instead of the DMA that Cell has, and it depends upon a mesh rather than a ring.
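To make the "software DMA" idea concrete, here is a toy model of how I picture it. It is only a sketch under my own assumptions: the `Tile` class, the `send`/`receive` helpers and the `length` "message ready" marker are all hypothetical, not Intel's API, and the router is reduced to a plain copy. The only figures taken from the slides are the 16KB buffer per tile and the 24 tiles.

```python
MPB_BYTES = 16 * 1024  # per-tile message passing buffer size, from the post
NUM_TILES = 24         # 48 cores at 2 cores per tile


class Tile:
    """One tile: an MPB plus a hypothetical 'message ready' marker."""

    def __init__(self, tile_id):
        self.tile_id = tile_id
        self.mpb = bytearray(MPB_BYTES)
        self.length = 0  # 0 means the buffer holds no unread message


def send(tiles, dst, payload):
    """Copy payload into the destination tile's MPB (stands in for the router).

    Returns False if the destination still holds an unread message, so the
    sender has to retry: software, not hardware, manages the hand-off.
    """
    if len(payload) > MPB_BYTES:
        raise ValueError("payload larger than one MPB; caller must chunk it")
    tile = tiles[dst]
    if tile.length:
        return False
    tile.mpb[:len(payload)] = payload
    tile.length = len(payload)
    return True


def receive(tiles, tile_id):
    """Drain the local MPB, or return None if nothing is waiting."""
    tile = tiles[tile_id]
    if tile.length == 0:
        return None
    data = bytes(tile.mpb[:tile.length])
    tile.length = 0  # mark the buffer free for the next sender
    return data


tiles = [Tile(i) for i in range(NUM_TILES)]
```

The point of the sketch is that nothing is kept coherent automatically: data only moves when some core explicitly copies it into another tile's buffer, and the receiver has to explicitly drain it.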
Jawed