Jawed
It's a centrally managed bus, so it's definitely more than repeaters.
That was my point. Cell has a contiguous area of the die devoted to the ring bus and its logic. I'm not privy to the details of the design, but I have a hard time accepting it is wholly made up of repeater blocks.
http://www.ibm.com/developerworks/power/library/pa-expert9/
Did Core 2's L2 latency improve 65nm->45nm?
The dominant factor for that is area, and latency scales roughly with the square root of the physical area of the cache.
If the SRAMs shrank, we'd expect better latency.
If the cache capacity were expanded to give roughly equivalent area, we'd have the same latency with more capacity.
http://www.extremetech.com/article2/0,2845,2208245,00.asp
Steve Fischer said: The latency for accessing the L2 cache increased by 1 core clock cycle (from 14 to 15 clocks) due to the increase in size.
Though those L2s are so big in comparison with what we're talking about in Larrabee (or what's in Nehalem). Nehalem's 256KB L2 is 2 cycles faster than Conroe's 4MB L2, not much of an improvement considering it's 1/16th the size and on a smaller process. Obviously fiddly comparing these as other parameters have been adjusted at the same time.
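To put rough numbers on the rule of thumb that latency scales with roughly the square root of cache area, here's a minimal sketch. It assumes area is proportional to capacity on the same process and that all of the latency scales with area, both of which are simplifications. The function name `scaled_latency` is made up for illustration; the 14-clock baseline is the figure from the Steve Fischer quote.

```python
import math


def scaled_latency(base_cycles, area_ratio):
    """Toy model: latency proportional to sqrt(area).

    Assumption: no fixed overhead (tag lookup, pipeline stages),
    so the whole latency scales with the wire distance.
    """
    return base_cycles * math.sqrt(area_ratio)


base_cycles = 14  # 4MB L2, per the quoted 14 -> 15 clock change

# Assumption: area ratio equals capacity ratio (same process, same density).
grow_to_6mb = scaled_latency(base_cycles, 6 / 4)
shrink_to_256kb = scaled_latency(base_cycles, 1 / 16)

print(f"4MB -> 6MB, same process: {grow_to_6mb:.1f} cycles")
print(f"4MB -> 256KB, same process: {shrink_to_256kb:.1f} cycles")
```

The toy model predicts a far bigger drop for a cache 1/16th the size than the 2 cycles mentioned above, which fits the idea that a large part of L2 latency is fixed overhead rather than distance across the array.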
Agreed with all that. I just suspect it's not a binary design decision, whether interconnects fly over non-interconnect logic.
It might require the redesign or rerouting of all the logic it flies over, possibly at the expense of poorer density in logic that already scales worse than SRAM.
Depending on how large an L2 tile is compared to its directly linked compute core, the penalty may be worse if the logic expands.
The SRAMs might not require too many additional layers for their signalling, but the more complex logic of the cores might have uses for the interconnect at the altitude of the ring bus, plus whatever margin of safety is needed to keep both layers from interfering with one another.
If you look at the Cell die shot the 2MB of SPE LS covers considerably more area than the EIB. I reckon EIB is 17% of the area of this 2MB of memory.
This could indicate that the interconnects consume a tiny proportion of the area of L2 in Larrabee, particularly as the ring bus in Larrabee almost has "no protocol" so has little control logic associated with it.
But, obviously, we can't see the ring interconnect fabric itself on Cell, so who knows, maybe it covers 8x the area of the EIB logic.
Overall it seems the scaling question isn't a big deal. Famous last words.
Jawed