Intel Skylake Platform

Realistically, the cost-per-mm^2 measurement ignores all the complexity involved in standing up the lithography process, masking the chip, laying out and designing the chip, and all the qualification steps between paper and product. Final pricing is driven by actual die size about as much as it is by the cost of the PCB substrate underneath, which is to say hardly at all. Said another way, the final price of each device is orders of magnitude higher than the component cost of the part itself.
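As a back-of-envelope illustration of why the raw silicon barely figures into the sticker price, here is a rough sketch. Every number in it (wafer cost, die area, yield) is an illustrative guess of mine, not an Intel figure:

#include <math.h>
#include <stdio.h>

int main(void) {
    const double PI             = 3.141592653589793;
    const double wafer_diam_mm  = 300.0;   /* standard 300 mm wafer */
    const double die_area_mm2   = 122.0;   /* assumed quad-core-class die size */
    const double wafer_cost_usd = 8000.0;  /* assumed processed-wafer cost */
    const double yield          = 0.80;    /* assumed fraction of good dies */

    /* Common approximation: usable wafer area minus an edge-loss term. */
    double r = wafer_diam_mm / 2.0;
    double gross_dies = PI * r * r / die_area_mm2
                      - PI * wafer_diam_mm / sqrt(2.0 * die_area_mm2);
    double good_dies  = gross_dies * yield;

    printf("gross dies per wafer: %.0f, good dies: %.0f\n", gross_dies, good_dies);
    printf("raw silicon cost per good die: ~$%.0f\n", wafer_cost_usd / good_dies);
    return 0;
}

With those made-up inputs the raw silicon works out to roughly twenty dollars per good die, a small fraction of what the finished, binned, packaged, and qualified part sells for.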

I assumed Vox's statement was more tongue-in-cheek, as a polite and well-humored "haha, the profit on this chip has to be good for you, Intel." The actual measurement is effectively meaningless as soon as you allow the conversation to stretch to "upper echelon" parts. I think he knows that, too.
 
All valid points.
But you have to wonder what the driving force is these days to move to smaller processes. Going from 32 nm Sandy Bridge to 14 nm Skylake didn't bring much in performance or power consumption for mainstream performance desktop processors. The performance increase is mostly due to architectural improvements. (The GPU I don't care about.)
 
But you have to wonder what the driving force is these days to move to smaller processes. Going from 32 nm Sandy Bridge to 14 nm Skylake didn't bring much in performance or power consumption for mainstream performance desktop processors. The performance increase is mostly due to architectural improvements. (The GPU I don't care about.)
That's the problem. Your requirements do not align with the requirements of the market.
Going from 32nm Sandy to 22nm Haswell has given me a laptop with a usable GPU and a 50% faster CPU within the same thermal envelope (with Sandy Bridge, one wanted a 35W CPU; with Haswell+, one only wants the 28W TDP CPU if one cares about GPU performance).
 
Has anyone mentioned that it's faster than Haswell/Haswell-E? It also seems to be lower power, which is nice...
And it has AVX2, although using SSE/AVX properly isn't as easy as one would think... :(
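For what it's worth, here is a minimal sketch of why hand-written AVX2 is fiddlier than it first looks: the FMA loop itself is short, but the accumulator reduction and the scalar tail are where the bugs tend to hide. The function name and structure are my own, not from any particular library; build with -mavx2 -mfma.

#include <immintrin.h>
#include <stddef.h>

/* Hypothetical AVX2/FMA dot product. */
float dot_avx2(const float *a, const float *b, size_t n) {
    __m256 acc = _mm256_setzero_ps();
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        acc = _mm256_fmadd_ps(va, vb, acc);   /* acc += va * vb in one FMA */
    }
    /* Horizontal reduction of the 8-lane accumulator. */
    __m128 lo = _mm256_castps256_ps128(acc);
    __m128 hi = _mm256_extractf128_ps(acc, 1);
    __m128 s  = _mm_add_ps(lo, hi);
    s = _mm_hadd_ps(s, s);
    s = _mm_hadd_ps(s, s);
    float sum = _mm_cvtss_f32(s);
    /* Scalar tail for the remaining n % 8 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}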
 
Skylake seems to sustain loads from the L2 more robustly, which helps feed the two FMA pipes. Those, in turn, have reduced instruction latency as well (and the FDIV unit's throughput has been doubled again). The L3 improvements are more nebulous -- most likely improved prefetching.
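Along those lines, a crude way to see the latency-versus-throughput distinction on the FMA pipes is to time one dependent chain against several independent accumulators. This is only a sketch with my own arbitrary choices (iteration count, raw __rdtsc() timing in TSC ticks rather than core cycles), not a rigorous benchmark:

#include <immintrin.h>
#include <x86intrin.h>
#include <stdio.h>

#define ITERS 100000000UL   /* arbitrary iteration count */

int main(void) {
    __m256 x = _mm256_set1_ps(1.5f);
    __m256 m = _mm256_set1_ps(0.5f), c = _mm256_set1_ps(1.0f);

    /* One dependent chain: each FMA waits on the previous result,
       so the loop runs at roughly the instruction latency per FMA. */
    unsigned long long t0 = __rdtsc();
    for (unsigned long i = 0; i < ITERS; i++)
        x = _mm256_fmadd_ps(x, m, c);
    unsigned long long t1 = __rdtsc();

    /* Four independent chains: the out-of-order core can overlap them
       across both FMA pipes, so this approaches throughput instead. */
    __m256 a = x, b = x, d = x, e = x;
    unsigned long long t2 = __rdtsc();
    for (unsigned long i = 0; i < ITERS; i++) {
        a = _mm256_fmadd_ps(a, m, c);
        b = _mm256_fmadd_ps(b, m, c);
        d = _mm256_fmadd_ps(d, m, c);
        e = _mm256_fmadd_ps(e, m, c);
    }
    unsigned long long t3 = __rdtsc();

    float sink[8];  /* keep the results live so the loops aren't optimized away */
    _mm256_storeu_ps(sink, _mm256_add_ps(_mm256_add_ps(a, b), _mm256_add_ps(d, e)));
    printf("dependent chain:    %.2f ticks/FMA\n", (double)(t1 - t0) / ITERS);
    printf("independent chains: %.2f ticks/FMA\n", (double)(t3 - t2) / (4.0 * ITERS));
    printf("(sink: %f)\n", sink[0]);
    return 0;
}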
 
The shared L3 also seems to take up much less die space, relatively speaking. Looking forward to more details at IDF.
 
Does the L3 size estimate include what appear to be storage arrays flanking each core?
The area labeled L3 looks consistent with the ring stop and interface logic, but does not resemble the storage component.
 
Fair point; it's interesting that they labeled that "CPU"! The SRAM cell blocks look much more entangled with the rest of the labeled CPU area than on prior dies that used the ring bus. (Shorter wires, better latency?) Prior cores seem to have a clean separation, with all the cells arranged in a regular grid. From homologous die shots of other Intel CPUs (page 3, top left core), I assume the top left rectangle is the FPU area and the square cells near the middle left are the L2.
 
I assume the top left rectangle is the FPU area and the square cells near the middle left are the L2.
The L2 is located in the lower left corner (referencing the top row of cores) and definitely sports an altered layout of the SRAM banks compared to previous generations.
The "square cells" array is in fact the L1 d-cache.
 
So do you think that in the bottom region (page 3, top left core) the two yellowish SRAM banks are L2 and the bluish ones are L3? I had thought all the cells there would be L3, but this would imply more mingling of the banks. (It would make sense as a way to reduce latency for the wires going back and forth between L3 and L2.)
 
So do you think that in the bottom region (page 3, top left core) the two yellowish SRAM banks are L2 and the bluish ones are L3? I had thought all the cells there would be L3, but this would imply more mingling of the banks. (It would make sense as a way to reduce latency for the wires going back and forth between L3 and L2.)

The wires would be going between the core and the interface/ring stop that is part of the green box labelled as the L3. That in turn hooks into the local L3 slice and the ring bus. I don't think the L3 is in the blue region.
 
Another possible consideration for the changed CPU arrangement is packing efficiency. The small dark blue rectangles at the bottom of the die, past the edges of the arrays, could be dead space.
Keeping the original CPU row arrangement at the same height as the GPU and System Agent would add length to the die and leave more silicon below unused.

If the GPU and System Agent were instead adjusted to match the height of the CPU/cache section and eliminate the dead space, the die would become even more rectangular. That might have implications for how flexibly the GPU could scale if it stretched even further, and for the blank area at the top of the die past the end of the memory interface.
 