22 nm Larrabee

dnavas · Nov 15, 2012

rpg.314 said:
Look at the SRAM size he is promising. 256 MB / chip

Hmm, I missed that -- where did you see that?
I see 256GB NVRAM (whatever that means) with 1.6TB/s and 16TF, and I see 1024 L2s, but not seeing onboard sram sizes -- you must be looking at a different slide?

Man from Atlantis · Nov 15, 2012

A1xLLcqAgt0qc2RyMz0y said:
The Xeon Phi 5110P only comes in at 1.01 Peak DP TFlops and burns 225 watts on the 22nm process.

http://techreport.com/news/23884/intel-joins-the-data-parallel-computing-fraternity-with-xeon-phi

The Tesla K20X has 1.31 Peak DP TFlops and burns 235 watts on the 28nm process.

http://techreport.com/news/23882/nvidia-intros-tesla-k20-series-as-titan-snags-top500-lead

The Xeon Phi 5110P is 23% slower than the Tesla K20X yet burns about the same power and is on a newer process (22nm). It looks pretty underwhelming to me.

And GK110 carries tons of non-compute hardware. Imagine how many more cores NV could have added instead, not to mention that it is a full node behind.
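The comparison in the quoted post can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the peak DP and wattage figures quoted above; the dictionary layout and function name are made up for illustration, and peak numbers say nothing about sustained performance.

```python
# Figures as quoted in the post: peak double-precision TFLOPS and board power.
cards = {
    "Xeon Phi 5110P": {"peak_dp_tflops": 1.01, "watts": 225},
    "Tesla K20X": {"peak_dp_tflops": 1.31, "watts": 235},
}


def gflops_per_watt(card):
    """Peak DP GFLOPS divided by board power in watts."""
    return card["peak_dp_tflops"] * 1000.0 / card["watts"]


phi = cards["Xeon Phi 5110P"]
k20x = cards["Tesla K20X"]

# Relative shortfall of the Phi's peak throughput versus the K20X.
shortfall = 1.0 - phi["peak_dp_tflops"] / k20x["peak_dp_tflops"]

print(f"Phi peak DP is {shortfall:.1%} below K20X")
for name, card in cards.items():
    print(f"{name}: {gflops_per_watt(card):.2f} peak DP GFLOPS/W")
```

The shortfall comes out at roughly 23%, matching the quoted post, and the efficiency figures are about 4.5 versus 5.6 peak DP GFLOPS/W.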

rpg.314 · Nov 16, 2012

dnavas said:
Hmm, I missed that -- where did you see that?
I see 256GB NVRAM (whatever that means) with 1.6TB/s and 16TF, and I see 1024 L2s, but not seeing onboard sram sizes -- you must be looking at a different slide?

That slide has been going around for a while. It says each L2 is 256K.
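The 256 MB figure appears to follow from multiplying the two slide numbers mentioned in this exchange. A quick check, assuming "256K" means 256 KiB per L2 and that all 1024 L2s are the same size (neither is confirmed by the thread):

```python
KIB = 1024
MIB = 1024 * KIB

l2_count = 1024              # "1024 L2s", as read off the slide by dnavas
l2_size_bytes = 256 * KIB    # "each L2 is 256K", taken here as 256 KiB (assumption)

# Aggregate on-chip L2 SRAM across the whole chip.
total_sram_bytes = l2_count * l2_size_bytes

print(total_sram_bytes // MIB, "MiB of L2 in total")
```

Under those assumptions the aggregate is 256 MiB, which is distinct from the 256GB memory figure on the same slide.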

gl33k · Nov 19, 2012

SRAM is the memory used for L1/L2 cache, right?
Jesus, 256 MByte of this would be devastating.

Blazkowicz · Nov 19, 2012

NVRAM can also be MRAM or whatever else is available. Plain DRAM with battery or supercapacitor backup also qualifies; that's why you lose your BIOS settings if you remove the coin-shaped battery on your motherboard. You've got NVRAM there.

But, I assumed these 256GB are not the NVRAM. Did you miss the multiple "DRAM cubes"?

The 256GB are on multiple stacks of DRAM on an interposer; that's not too bad either.

dnavas · Nov 26, 2012

Blazkowicz said:
NVRAM can also be MRAM or whatever available (plain DRAM with battery or supercapacitor backup also qualifies

Oh, non-volatile. Hah, I'll not embarrass myself and explain what I thought it stood for :>

But, I assumed these 256GB are not the NVRAM. Did you miss the multiple "DRAM cubes"? The 256GB are on multiple stacks of DRAM on an interposer; that's not too bad either.

Well, I'm running on memory here, but it seemed like that was on-chip, and I had trouble imagining 256GB on an interposer. Some serious potential bandwidth with 256GB onboard....

Blazkowicz · Nov 26, 2012

I now have trouble imagining 256GB on interposer too but it doesn't feel too impossible in 2020 for the highest end chip ever. They would kind of max out the tech.

I believe there was confusion with rpg.314 reading 256 MB and concluding it was SRAM, but I'm 100% sure 256 GB is written there.

For 256GB, that could be eight piles of eight stacked memory dies each, with 4GB (i.e. 32Gbit) per die, not too far from contemporary 4Gbit chips. Of course this doesn't really exist yet, nor does a 10nm process. I hope 2020 is a far enough away date; these techs are maybe the far end of current realistic R&D.

PS: well, it could be 128GB of "D-RAM cubes" and 128GB of NVRAM or something.
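The stacking estimate in the post above works out as follows. This is only a sketch of the post's own speculation; the variable names are illustrative and the 32 Gbit die is hypothetical, as the post itself says.

```python
GBIT_PER_GBYTE = 8

stacks = 8             # "eight times a pile"
dies_per_stack = 8     # "eight stacked memory dies"
die_gbit = 32          # hypothetical 32 Gbit die, as speculated in the post
current_die_gbit = 4   # the "contemporary 4Gbit chips" the post compares against

die_gbyte = die_gbit // GBIT_PER_GBYTE
total_gbyte = stacks * dies_per_stack * die_gbyte

# How much denser each die would need to be than the chips of the day.
density_jump = die_gbit // current_die_gbit

print(f"{total_gbyte} GB total, needing {density_jump}x today's per-die density")
```

So the 256GB total holds, provided per-die density grows 8x (three doublings) over the 4Gbit parts.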

iMacmatician · Dec 3, 2012

Does anyone know anything about this (from many months ago)? Is this reliable in any way?

Intel MIC: 14nm Knights Landing to have both PCIe & socket versions (14-16 DP GFLOPS/Watt)

I would expect something like half of those numbers.

rpg.314 · Dec 4, 2012

iMacmatician said:
Does anyone know anything about this (from many months ago)? Is this reliable in any way?

I would expect something like half of those numbers.

Then maybe on 10nm we'll see Intel bringing LRB cores on die.

Blakhart · Dec 8, 2012

Is LRB going to be produced and released?

rpg.314 · Dec 9, 2012

Yes, in the form of HPC accelerators called Xeon Phi.

Frontino · Mar 11, 2013

I have a doubt about Knights Corner specifications: I know that each core has a 512 bit vector processor, but is it a single unit capable of multiple operations at lower width (32 & 64 bit) in the same cycle, or is it composed, as some tables show, of a 16-way 32 bit and an 8-way 64 bit vector unit? Wouldn't that make the processor actually 1024 bit wide?

Gipsel · Mar 11, 2013

Frontino said:
I have a doubt about Knights Corner specifications: I know that each core has a 512 bit vector processor, but is it a single unit capable of multiple operations at lower width (32 & 64 bit) in the same cycle, or is it composed, as some tables show, of a 16-way 32 bit and an 8-way 64 bit vector unit? Wouldn't that make the processor actually 1024 bit wide?

It's the former. It either processes 16 32bit float operations or 8 double operations. It basically works the same as the SSE or AVX units, it's just wider.
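To illustrate the point, the same 512-bit register width simply divides into a different number of lanes depending on element size. A minimal sketch in plain Python (not Phi code; the names are made up for illustration):

```python
import struct

VECTOR_BITS = 512
VECTOR_BYTES = VECTOR_BITS // 8   # 64 bytes per vector register


def lanes(element_bits):
    """Number of elements of the given width that fit in one 512-bit vector."""
    return VECTOR_BITS // element_bits


# The same 64-byte register image holds either 16 floats or 8 doubles,
# not both at once, so the unit is 512 bits wide rather than 1024.
as_floats = struct.pack("<16f", *[float(i) for i in range(16)])
as_doubles = struct.pack("<8d", *[float(i) for i in range(8)])

print(lanes(32), "single-precision lanes,", lanes(64), "double-precision lanes")
print(len(as_floats), len(as_doubles), "bytes each")
```

Both packed buffers are 64 bytes long, which is the sense in which the "16-way 32 bit" and "8-way 64 bit" table entries describe one unit.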

Frontino · Mar 11, 2013

Thanks, Gipsel.

LiXiangyang · Apr 8, 2013

MIC's L1 cache is not programmable, and inter-thread communication on MIC is pretty much like on a CPU. Intel's developer forum is coming up soon; I will definitely go there to verify whether my experience with MIC is merely an exception.

DavidC · Apr 10, 2013

Xeon Phi 7120P/7120X coming -

http://cdn4.wccftech.com/wp-content/uploads/2013/04/Intel-Xeon-Phi-SKUs.jpg

Apparently the specs are 61 cores/31MB cache/1.238GHz frequency.

iMacmatician · Apr 11, 2013

It apparently also has 8 GB of 5.5 Gbps memory with a 512-bit bus, plus a 300 W TDP.
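For a rough sense of what those leaked specs would imply, peak DP throughput and memory bandwidth can be estimated from them. This is a back-of-the-envelope sketch: it assumes every one of the 61 cores contributes, and that each 64-bit lane retires one fused multiply-add (2 flops) per cycle, neither of which is stated in the thread.

```python
# Leaked figures quoted in the two posts above.
cores = 61
clock_ghz = 1.238
bus_bits = 512
gbps_per_pin = 5.5

dp_lanes = 512 // 64     # 8 doubles per 512-bit vector
flops_per_lane = 2       # assumption: one fused multiply-add per lane per cycle

# Peak DP GFLOPS = cores * GHz * lanes * flops per lane per cycle.
peak_dp_gflops = cores * clock_ghz * dp_lanes * flops_per_lane

# Peak bandwidth in GB/s = bus width in bits * per-pin Gbps / 8 bits per byte.
peak_bandwidth_gbs = bus_bits * gbps_per_pin / 8

print(f"~{peak_dp_gflops:.0f} peak DP GFLOPS, ~{peak_bandwidth_gbs:.0f} GB/s")
```

Under those assumptions the part lands at about 1.2 peak DP TFLOPS and 352 GB/s.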

LiXiangyang · Apr 13, 2013

Just returned from the IDC. According to the Intel guys:

1) The LLC arrangement of Phi is not like that found in Intel CPUs. The LLC (which is L2 for Phi) is local to each core, so each core has only 512kB of L2 cache rather than the 31MB figure Intel promoted. Any cached data that is not available in the local L2 needs to be transferred to the local L2 before it can be accessed.

For comparison, GK110 has 1.5MB of L2 cache, but it is a global cache like Intel's LLC on Ivy Bridge/Sandy Bridge CPUs, so its data is accessible to all GPU cores.

2) At least according to the Intel guys at IDC, Intel has no plan to introduce a programmable L1 cache in future generations of MIC co-processors.

3) Xeon Phi's SIMD unit is more or less the same as Haswell's AVX2, just wider.

4) Unlike HT on CPUs, hardware multi-threading on MIC is essential for achieving peak performance.

5) The Intel guys here are very open about promoting MIC's programmability compared to Nvidia's offerings, but remain tight-lipped regarding performance comparisons between the two products.

6) The card is likely to be cheaper than K20/K20X, but it is not for retail; it is only provided as part of a whole-system solution. Some company at IDC managed to pack 4 of these cards in one case with dual-socket CPUs.
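On point 1, the promoted 31MB figure looks like the sum of the per-core L2 slices. A quick check, assuming the 61-core count quoted earlier in the thread for the 7120P/7120X applies here (the post does not name the SKU):

```python
KIB_PER_MIB = 1024

cores = 61               # assumption: core count from the earlier 7120P/7120X post
l2_per_core_kib = 512    # per-core L2 size given in point 1

# Aggregate L2 across all cores; each core still sees only its own 512 KiB slice locally.
aggregate_l2_mib = cores * l2_per_core_kib / KIB_PER_MIB

gk110_l2_mib = 1.5       # shared L2 figure for GK110 quoted in point 1

print(f"aggregate Phi L2: {aggregate_l2_mib} MiB (marketed as ~31MB)")
print(f"local to one Phi core: {l2_per_core_kib / KIB_PER_MIB} MiB, "
      f"versus {gk110_l2_mib} MiB shared on GK110")
```

That gives 30.5 MiB in aggregate, so the headline number and the 512kB per-core number describe the same hardware from two angles.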

moozoo · Jun 4, 2013

Xeon Phi OpenCL performance

http://clbenchmark.com/device-info.jsp?config=15887974

lol, I hope it can do a lot better than this.

mczak · Jun 4, 2013

moozoo said:
http://clbenchmark.com/device-info.jsp?config=15887974

lol, I hope it can do a lot better than this.

But well, look, it's really good at tree search.
Overall though, the results really are terrible indeed.
