Haswell vs Kaveri

CPU-World reported +20% EUs for GT1-GT3 Broadwell, so yes, the GenX diagram from HPG2012 could belong to Broadwell/Gen8. Based on 96 EUs at 1 GHz, we might be looking at 1.2-1.3 TFLOPS for the faster GT3 parts, assuming frequency can go up to 1.2-1.3 GHz. At least Broadwell-K with its high TDP should sustain that.

Where do you get 96 EUs from? 20% on top of Haswell GT3e is only 48 EUs, and at 1 GHz that only comes to 768 GFLOPS.
 
Ah, I see. But the 96 EUs of "GenX" don't tally with the 48 predicted by PC World. Plus, 96 EUs at 1 GHz would weigh in at 1.5-2 TF depending on what you're counting (MADD or MADD+Plane).
 
I used Intel's GT4 estimate of 2 TFLOPS at 1 GHz to estimate the TFLOPS for the 48-EU GT3, which is what we get for mobile according to CPU-World (not PC World).

I guess your estimate is based on Gen7/7.5. Since Gen8 is a big redesign, you can't apply it. Otherwise, if you have more Gen8 info, share it with us.
 
There doesn't need to be a redesign, though. 96 Gen7 EUs could hit 2 TFLOPS at 1 GHz if you count MADD+Plane, so I think that's all the slides are saying. The problem is that CPU-World is telling us the Broadwell IGP will have only half that number of EUs.

You could, however, get to 1.25 TFLOPS from Gen7 EUs at 1.3 GHz with only 48 of them, counting Plane as well (I'm not actually sure what Plane is or how relevant it is to count it).
 
Did you see MADD+Plane estimates for Haswell? In the datasheets and all other documents, Intel always refers to MADD. It should be the same for Broadwell. It also makes sense to add "just" 8 additional EUs for GT3. They don't need a huge EU increase, because Gen8 itself gives them more FLOPS per EU and also twice the sampler, which should improve efficiency.

edit:
MADD+Plane doesn't make sense. See page 18. Ivy Bridge has 461 GFLOPS for MADD+Plane at 1.2 GHz. 96 EUs is 6x Ivy Bridge, which would result in 2305 GFLOPS at 1 GHz. Gen8 is surely no worse per EU than Gen7.
 
That's interesting, I wasn't aware of any details around the EU improvements for Gen8. Do you have any links to that?

You're right, sorry. I was counting Plane as 4 flops, but it's actually 8. So at 1 GHz, 96 Gen7 EUs would indeed produce 2.3 TFLOPS when counting Plane and 1.5 TFLOPS counting just MADD.

I expect the slide is just generalizing when it says "2 TF"
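
For anyone who wants to redo the arithmetic from the last few posts, here is a minimal sketch. The per-EU figures (16 flops per clock for MADD, 8 more when Plane is counted) are the ones used in this thread for Gen7, not values taken from an Intel document, and the function name `peak_gflops` is made up for illustration. Nothing here says anything about Gen8.

```python
# Per-EU figures as used in the posts above (assumptions, not Intel-published constants).
MADD_FLOPS_PER_CLOCK = 16   # MADD only: 48 EUs at 1 GHz -> 768 GFLOPS
PLANE_FLOPS_PER_CLOCK = 8   # extra flops per clock per EU if Plane is counted too


def peak_gflops(eus, clock_ghz, count_plane=False):
    """Theoretical peak in GFLOPS for `eus` execution units at `clock_ghz`."""
    per_clock = MADD_FLOPS_PER_CLOCK
    if count_plane:
        per_clock += PLANE_FLOPS_PER_CLOCK
    return eus * per_clock * clock_ghz


if __name__ == "__main__":
    cases = [
        ("48 EUs @ 1.0 GHz, MADD", 48, 1.0, False),
        ("96 EUs @ 1.0 GHz, MADD", 96, 1.0, False),
        ("96 EUs @ 1.0 GHz, MADD+Plane", 96, 1.0, True),
        ("16 EUs @ 1.2 GHz, MADD+Plane (Ivy Bridge)", 16, 1.2, True),
    ]
    for label, eus, ghz, plane in cases:
        print(f"{label}: {peak_gflops(eus, ghz, plane):.0f} GFLOPS")
```

With these assumptions the four cases come out at roughly 768, 1536, 2304 and 461 GFLOPS, which lines up with the numbers quoted above.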
 
I don't have detailed information. Intel itself said here that an absurd number of changes were made to the EU (and therefore we can't apply Gen7 GFLOPS calculations to Gen8). As mentioned, the GenX diagram looks vastly different from Gen7.5, assuming it is Gen8 and not some fantasy. The +20% EU rumour from CPU-World at least matches: 12, 24, 48 for GT1-GT3 is what the GenX diagram stated. As for GT4, if it comes, I think we will see it only in 65 W AIO SKUs, maybe as a DDR4 SKU and not before Q1 2015.
 
From the Broadwell alpha driver INF: http://www.station-drivers.com/index.php/articles/713-intel-hd-iris-graphics-version-15-36-0-3353


; BDW Simulation
%iBDWGD0% = iBDWM_w8, PCI\VEN_8086&DEV_0090
%iBDWGT0% = iBDWM_w8, PCI\VEN_8086&DEV_0BD0
%iBDWGT1% = iBDWM_w8, PCI\VEN_8086&DEV_0BD1
%iBDWGT2% = iBDWM_w8, PCI\VEN_8086&DEV_0BD2
%iBDWGT3% = iBDWM_w8, PCI\VEN_8086&DEV_0BD3
%iBDWGT4% = iBDWM_w8, PCI\VEN_8086&DEV_0BD4
; BDW HW
%iBDWMULTGT1% = iBDWM_w8, PCI\VEN_8086&DEV_1602
%iBDWMULTGT1% = iBDWM_w8, PCI\VEN_8086&DEV_1606
%iBDWULXGT1% = iBDWM_w8, PCI\VEN_8086&DEV_160E
%iBDWMULTGT2% = iBDWM_w8, PCI\VEN_8086&DEV_1612
%iBDWMULTGT2% = iBDWM_w8, PCI\VEN_8086&DEV_1616
%iBDWULXGT2% = iBDWM_w8, PCI\VEN_8086&DEV_161E
%iBWDHALOGT1% = iBDWM_w8, PCI\VEN_8086&DEV_160B
%iBWDHALOGT2% = iBDWM_w8, PCI\VEN_8086&DEV_161B
%iBWDHALOGT3% = iBDWM_w8, PCI\VEN_8086&DEV_162B

Does anyone have an idea what "BDW Simulation" means?
 


Unless it's a typo, it sounds like it is.
Looking at other results, LGA2011/X79 systems show the same 256-bit info in that field and LGA1150/Z77 systems show 128-bit even when all memory slots are populated.

Even better, it looks like they're using that in a laptop.

If true, this would be the best kept secret of 2013.
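
The width-to-channels reasoning here can be written down as a quick sketch. It assumes the reported field is the total memory bus width and that each DDR3 channel is 64 bits wide; what the diagnostic tool actually measures is unknown, and `channel_count` is a hypothetical helper name.

```python
DDR3_CHANNEL_WIDTH_BITS = 64  # one DDR3 channel is 64 bits wide


def channel_count(bus_width_bits):
    """Number of memory channels implied by a total bus width (assumed to be what the tool reports)."""
    if bus_width_bits <= 0 or bus_width_bits % DDR3_CHANNEL_WIDTH_BITS != 0:
        raise ValueError(f"{bus_width_bits} bits is not a whole number of 64-bit channels")
    return bus_width_bits // DDR3_CHANNEL_WIDTH_BITS


if __name__ == "__main__":
    for width in (128, 256):
        print(f"{width}-bit -> {channel_count(width)} channels")
```

Under that assumption, 128-bit reads as dual channel and 256-bit as quad channel, which is why the Kaveri entry stands out.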
 
So no one is commenting on the possible elephant in the room that is Kaveri having a 4-channel memory controller after all?
 
Quad channel, huh? Like the expensive LGA 2011 platform. Well, they have to do something, but in my opinion it is just the next gradual step for IGPs. Unfortunately it won't come cheap, and you would be better off with a video card.
 
It's probably fucked up: the usual syndrome of running unknown hardware through "system diagnostics" software.
The memory controller is probably new, possibly with GDDR5 support (even if that only goes into specific embedded or server designs).

Or else those 256 bits must be physically connected to something, but that won't go into an FM2+ socket. The only option is a special package (like Crystalwell, but the concept is much simpler), and it'd be weird not to have heard anything about it while it's out somewhere and running.
 
I fail to see how that would work on FM2+.

Do you know the full pinout of socket FM2? Is such information available to the public?

For starters, I have no idea why socket FM1 had to be discontinued after a single year on the market.
 
I don't, but it has about as many pins as FM1 and FM2, which are both dual-channel sockets. I doubt AMD magically found a way to double the width of the memory bus with the same pin count, or that they anticipated this when they introduced FM1.
 