Fusion die-shot - 2009 Analyst Day

Four 1MB L2s -- good, although a bit of a waste on the K10 arch. They probably used the extra cache to fill up leftover die area.
 
Seems to be a 6-SIMD design. 240 SPs or 480 SPs?

Whatever it is, it's definitely a version/variant of Evergreen, Juniper more likely. Is that why AMD did 2 mid-range GPUs (Cypress and Evergreen) this gen? To get halfway to their Fusion thing?
 
apu.png


A TMU quad and half-width "compacted" SIMD multiprocessor -- 4x or 8x 5D ALUs?
24 INT8 texels and 120/240 FP32 MADs...
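
A quick sanity check of those numbers, as a rough sketch only. It assumes the 6-SIMD count guessed from the die shot, one TMU quad (4 TMUs) per SIMD, and 5-wide ALUs; none of this is confirmed by AMD, and the function names are made up for illustration.

```python
# Back-of-envelope per-clock throughput for the guessed GPU block.
# All three constants are assumptions read off the die shot, not AMD specs.
SIMDS = 6            # guessed SIMD count
TMUS_PER_SIMD = 4    # one TMU quad per SIMD
LANES_PER_ALU = 5    # "5D" (5-wide) ALU


def texels_per_clock(simds=SIMDS, tmus_per_simd=TMUS_PER_SIMD):
    """Texels filtered per clock across all SIMDs."""
    return simds * tmus_per_simd


def mads_per_clock(alus_per_simd, simds=SIMDS, lanes=LANES_PER_ALU):
    """FP32 MADs issued per clock for a given ALU count per SIMD."""
    return simds * alus_per_simd * lanes


if __name__ == "__main__":
    print("texels/clk:", texels_per_clock())
    for alus in (4, 8):  # the two candidate SIMD widths
        print(f"{alus}x 5D ALUs per SIMD -> {mads_per_clock(alus)} MADs/clk")
```

With 4 ALUs per SIMD this gives the 120 figure, with 8 it gives 240.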
 
Wow, that's way more GPU power than I expected in the first iteration. Fusion could wipe out a chunk of the low-end discrete market before long.
 
OK, let's scramble the well-known R700 SIMD multiprocessor into a fused IGP part:

simd.png


The ALU "quads" are rotated, aligned and fit (surprisingly accurately) along the longer side of the TMU block, while the thread sequencer would simply be "fused" (no pun) into the new SIMD unit. The leftover parts are the other half of the R700's ALUs and the redundancy block.

Do I get a cookie? :D
 
This would be awesome in a laptop. I wonder about clocks, though; CPUs have been way ahead of GPUs there.

Edit: maybe they split clocks, like NVIDIA, for the CPU and GPU parts.
 
Athlon II X4: 300M transistors.

4 cores + 2MB L2 + misc (slightly more?) = 300M

Athlon II X2: 234M transistors.

2 cores + 2MB L2 + misc = 234M

So 2 cores + very little misc =~ 66M

Assuming the misc difference is negligible, 4 cores = 132M

The 2MB L2 + MC and such take the rest.

Not sure how much the MC takes, otherwise we could estimate the rest.


But you could probably have at least 350-400M transistors, since a PhII with L3 cache still takes just 758M.
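
The subtraction above, spelled out as a rough sketch. The only inputs are the 300M and 234M totals quoted in this post; it assumes both chips carry the same 2MB L2 and near-identical misc logic, and the function name is just for illustration.

```python
# Rough core-transistor estimate from the two Athlon II totals quoted above.
# Assumes L2 and misc logic are (nearly) the same on both chips, so the
# difference between them is two cores' worth of transistors.
X4_TOTAL_M = 300  # Athlon II X4, millions of transistors
X2_TOTAL_M = 234  # Athlon II X2, millions of transistors


def estimate_budget(x4_total=X4_TOTAL_M, x2_total=X2_TOTAL_M):
    two_cores = x4_total - x2_total   # extra two cores on the X4
    per_core = two_cores / 2
    four_cores = 2 * two_cores        # all four cores of the X4
    rest = x4_total - four_cores      # 2MB L2 + MC + everything else
    return {
        "two_cores": two_cores,
        "per_core": per_core,
        "four_cores": four_cores,
        "l2_mc_misc": rest,
    }


if __name__ == "__main__":
    for name, value in estimate_budget().items():
        print(f"{name}: {value}M")
```

That leaves roughly 168M for the 2MB L2, the MC and the rest, with no way to split it further from these two figures alone.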
 
This should bring back a bit of life into PC gaming!

Enough horsepower to play any console port on the APU alone :D
 
Well, I think we're getting something between a 4670 and a 4570; no idea how the limited memory bandwidth will impact things.

Oh, and I've brought this up somewhere before, but I'll bring it up again: is the GDDR5 sideport idea feasible now? Or are trace lengths etc. still an issue?
 
By the way, I can see this "APU" thing suiting rather well as the main engine for the next Xbox.
Three POWER7-based cores with 256K of fast L2 cache per core, 4-way SMT, some amount of shared L3 (eDRAM?) and a glued-on GPU core -- 8 SIMDs, for instance. The external eDRAM die would still be there with all the RBE guts, leaving the GPU in the main die simpler to implement, all this on a single BGA substrate.
 
One way to go would be to split the RAM pool, i.e. allocate a known chunk to GDDR, which could be soldered onto the mobo. I doubt DDRx-y alone can feed this beast.
 
Interesting, the GPU seems to have been allocated a much larger area in this pic than in the one in the first post.
 