AMD Bulldozer Core Patent Diagrams

http://blogs.amd.com/work/2010/11/09/server-highlights-from-financial-analyst-day/


This is the most asked question that I get. Today we gave granularity down to the quarter. We expect to launch the client version of “Bulldozer” (code named “Zambezi”) in Q2 2011. The server products (“Interlagos” and “Valencia”) will first begin production in Q2 2011, and we expect to launch them in Q3 2011.

who would have guessed client gear first!!!!


Turbo CORE – We have disclosed that we would include AMD Turbo CORE technology in the past, so this should not be a surprise to anyone. But what is news is the uplift – up to 500MHz with all cores fully utilized. Today’s implementations of boost technology can push up the clock speed of a couple of cores when the others are idle, but with our new version of Turbo CORE you’ll see full core boost, meaning an extra 500MHz across all 16 threads for most workloads.


it will be interesting to see just how much headroom there is and what clock speeds are obtainable on good air cooling.
 
Vector-heavy code tends to push power consumption under load.
It is possible that the 500 MHz with all cores active can be achieved when running integer-only threads, which seems to track with one subset of server loads Fruehe is able to speak about.
 
do you think they are able to either power down or clock down the FPU separately from the rest of the module?

I'm really looking forward to seeing how consumer Bulldozer does on 4-threaded general purpose apps (games etc.), assuming each thread is on a separate module, and how high the turbo will go.

or better yet, I know they can power down a module, so what provides better performance on a 4-thread app: having each thread on a separate module, or powering down 2 modules and clocking the other two higher?

i guess clock vs power scaling is going to be key.
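
To put rough numbers on that trade-off, here is a toy model, not anything AMD has published. It assumes a fixed power budget shared by the active modules, power growing with the cube of clock speed (voltage scaling with frequency), leakage ignored, and a made-up `sharing_efficiency` of 0.8 for the per-thread throughput when two threads share a module. All names and values are hypothetical.

```python
def relative_clock(power_per_module, exponent=3.0):
    """Clock relative to baseline, assuming power ~ clock**exponent (assumption)."""
    return power_per_module ** (1.0 / exponent)


def throughput(active_modules, threads=4, power_budget=4.0, sharing_efficiency=0.8):
    """Relative throughput of `threads` threads spread over `active_modules` modules.

    power_budget is in units of "one module at base clock"; the budget freed by
    powered-down modules is handed to the remaining ones.
    """
    clock = relative_clock(power_budget / active_modules)
    # A thread that has a module to itself runs at full per-thread speed;
    # a thread that shares a module is scaled by sharing_efficiency (assumed).
    shares_module = threads > active_modules
    per_thread = sharing_efficiency if shares_module else 1.0
    return threads * per_thread * clock


spread = throughput(active_modules=4)  # one thread per module, base clock
packed = throughput(active_modules=2)  # two threads per module, higher clock

print(f"spread over 4 modules: {spread:.3f}")
print(f"packed onto 2 modules: {packed:.3f}")
```

With these particular guesses the two options come out nearly tied, so the answer really does hinge on how steeply power rises with clock and how much two threads cost each other inside a module.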
 
do you think they are able to either power down or clock down the FPU separately from the rest of the module?

No, but they don't have to.

When the FPU is idle, it will not consume any dynamic power, only the leakage power.

This means it will consume much less power than when it's used.
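
As a rough illustration of the dynamic vs. leakage split, the usual first-order CMOS estimate is P = a * C * V^2 * f for switching power plus V * I_leak for leakage. The sketch below plugs in made-up values for capacitance, activity factor and leakage current (none of these are Bulldozer figures) just to show that an idle, clock-gated unit with activity near zero is left with only the leakage term.

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """First-order CMOS switching power: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


def leakage_power(voltage_v, leak_current_a):
    """Static power drawn whether or not the unit is switching."""
    return voltage_v * leak_current_a


# Hypothetical values for one functional unit (illustration only).
C_EFF = 1.0e-9   # effective switched capacitance, farads
V = 1.2          # supply voltage, volts
F = 3.5e9        # clock frequency, hertz
I_LEAK = 0.5     # leakage current, amps

busy = dynamic_power(0.2, C_EFF, V, F) + leakage_power(V, I_LEAK)
idle = dynamic_power(0.0, C_EFF, V, F) + leakage_power(V, I_LEAK)

print(f"busy: {busy:.3f} W, idle: {idle:.3f} W")
```

The idle figure is not zero, which is why fully power-gating a block can still matter when leakage is high.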
 
So we have 2MB L2 & 8MB L3 confirmed :)

If they can do up to 500MHz turbo with all cores active, that should bode well for when fewer are active, though that depends a lot on what the base clocks actually are.

what provides better performance on a 4-thread app: having each thread on a separate module, or powering down 2 modules and clocking the other two higher?
It will differ depending on whether the threads are dependent or not. With dependent threads, having them on the same module is likely to be better, but for independent threads getting to use the whole L2 should be better.

The memory controller improvement presumably is ATI influenced? 30% throughput improvement from more efficient design seems like a big improvement.
 
Some very concrete info about BD-modules die area, transistor count, and clockspeeds courtesy of Dresdenboy http://citavia.blog.de/:

4.5 Design Solutions for the Bulldozer 32nm SOI 2-Core Processor Module in an 8-Core CPU
T. Fischer, S. Arekapudi, E. Busta, C. Dietz, M. Golden, S. Hilker, A. Horiuchi, K. A. Hurd, D. Johnson, H. McIntyre, S. Naffziger, J. Vinh, J. White, K. Wilcox, AMD
The Bulldozer 2-core CPU module contains 213M transistors in an 11-metal layer 32nm high-k metal-gate SOI CMOS process and is designed to operate from 0.8 to 1.3V. This micro-architecture improves performance and frequency while reducing area and power over a previous AMD x86-64 CPU in the same process. The design reduces the number of gates/cycle relative to prior designs, achieving 3.5GHz+ operation in an area (including 2MB L2 cache) of 30.9mm².
4.6 40-Entry Unified Out-of-Order Scheduler and Integer Execution Unit for the AMD Bulldozer x86-64 Core
M. Golden, S. Arekapudi, J. Vinh, AMD
A 40-instruction out-of-order scheduler issues four operations per cycle and supports single-cycle operation wakeup. The integer execution unit supports single-cycle bypass between four functional units. Critical paths are implemented without exotic circuit techniques or heavy reliance on full-custom design. Architectural choices minimize power consumption.
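
Some quick arithmetic on the module figures quoted above (213M transistors, 30.9mm² including the 2MB L2). The four-module total is just a multiplication for the 8-core part and leaves out L3, northbridge and I/O, which the abstract does not give numbers for.

```python
TRANSISTORS_PER_MODULE = 213e6   # from the abstract
MODULE_AREA_MM2 = 30.9           # from the abstract, includes 2MB L2
MODULES_IN_8_CORE_CPU = 4        # 2 cores per module

# Average transistor density of one module, in millions per mm^2.
density_m_per_mm2 = TRANSISTORS_PER_MODULE / MODULE_AREA_MM2 / 1e6

# Area and transistor count of the four modules alone (no L3, northbridge, I/O).
modules_area_mm2 = MODULES_IN_8_CORE_CPU * MODULE_AREA_MM2
modules_transistors = MODULES_IN_8_CORE_CPU * TRANSISTORS_PER_MODULE

print(f"density: {density_m_per_mm2:.2f} M transistors/mm^2")
print(f"4 modules: {modules_area_mm2:.1f} mm^2, {modules_transistors / 1e6:.0f}M transistors")
```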

I wonder if those clocks take into account Turbo Core or not. Looking forward to this presentation.
 
So AMD is releasing the 8-module Bulldozer processor first? I thought they were doing Bulldozer+GPU first. Sandy Bridge looks very tempting, but must resist.
 
AMD is launching a lot of Bulldozers all at once in Q2. I think we are going to see 4 and 8 module processors first, and then in 2H 2011 we will see server chips.

I think 4 and 6 core Phenom IIs on 32nm will be their low-end chips through 2011 until they introduce 12-core modular Bulldozers in 2012
 
No 8 module BD is planned for desktop AFAIK. However, in terms of throughput at least, a 4 module BD with its 8 int cores @ 3-4 GHz will be more than enough.

BD + IGP will also not be released in the near future (Q2/Q3 2011).
 
Is the fusion processor still on ? I am waiting for that for a new HTPC.

Uh...two major press releases today and multiple fusion netbooks at CES...might want to read the gadget sites or AMD's news releases. Anyway, it's out and I hope we see benchmarks soon. Supports DX11.
 
So the C and E Series are out now and the A Series later? The E-Series looks good with 18 W TDP. What's the A Series TDP? Is the A Series the one that'll compete with Intel's current line of Sandy Bridge?
 
Well I dunno why they are called A, C and E Series. For all intents and purposes, C and E series are identical except for clocks (and hence TDP). A series is definitely a much different beast (Llano) and will indeed come later (32nm SOI at GF isn't ready yet apparently).
I don't know what the A series TDP is. Since there will be both desktop and mobile offerings (and versions with 2 or 4 cores), I have to assume quite a wide range, not a single number like the current C and E series use (I'm thinking 25-95W but that's just a rough guess). Unfortunately there is no way Llano could compete in CPU performance with Sandy Bridge (well, they can probably compete if you match them up the way AMD is doing it today, pitting 4 core Llano against 2 core SB), though on a perf/power scale it won't look nice. On the upside though, the IGP will spank HD3000 graphics (and obviously HD2000 too).
That said, even E series could compete with SB in some very limited form. I'd guess there will be a CULV Celeron version (similar to the current Arrandale Celeron U3400/U3405), and if it clocks as low as that U3400 (1.07GHz) it won't be too far off from E-350 in CPU performance; TDP and graphics performance could potentially be similar too. (Of course, if that supposed SB CULV Celeron had a higher clock or supported HT or Turbo it would be no contest, but then you're probably leaving Celeron territory...)

btw fusion products discussion isn't really related to this thread, since none of them are based on Bulldozer. Or rather, none of the announced ones are; the Bulldozer-based ones are further off and should appear in 2012 (Trinity, I guess it's a successor of Llano), but needless to say, details are scarce...
 
Yes, which AMD Fusion Series will go head to head with Sandy Bridge? The A, C and E Series seem aimed at netbooks and laptops rather than desktops. But they look mighty capable for the purpose.

Also, is Fusion = Bulldozer + GPU or is it a different core? I am a little lost on AMD's future lineup.
 
A series will be for desktop too.
In 2012 AMD will ship the Trinity processor for the same segment, with up to 4 cores (2 modules) and an integrated GPU on 32nm.
Hopefully an upgraded GPU.
 