AMD Bulldozer Core Patent Diagrams

Raqia · Nov 2, 2013

itsmydamnation said:
L3 latency isn't the issue; it's not great, but it's still way faster than main memory. It's L3 throughput (especially write) that's the issue. The L2s are large and the L3 is an eviction cache; things are fetched into L1 and L2. But the L3 throughput...

http://www.vmodtech.com/main/wp-con...2133c9d-16gxh-with-amd-fx-8350/aida64-mem.jpg
http://cdn.overclock.net/e/e4/350x700px-LL-e4eb580f_cachememtest.png

What I want to know is: is the L1 "broken"? A 6:1 ratio of read to write seems kinda pointless to me.

That does seem like a major stumbling block. One of the slides for SR mentioned that store forwarding was improved so this might be more balanced now.

itsmydamnation · Nov 2, 2013

People seem to think the WCC (write coalescing cache) between the L1 and L2 can cause problems as well.

http://www.agner.org/optimize/blog/read.php?i=225

There seem to be some massive performance cliffs if you cause the WCC to overflow.

fellix, AIDA 3 crashes in my VMware guests.

mczak · Nov 3, 2013

itsmydamnation said:
What I want to know is: is the L1 "broken"? A 6:1 ratio of read to write seems kinda pointless to me.

That's just a result of the L1 being a write-through cache. So an L1 write is always writing to L2 too (the number is higher than L2 write presumably because an L2 write would also include an L2->L1 read). And yes, this appears to be a weak point, though it's unclear how much it contributes to the generally lackluster performance of BD. I don't know if SR changes any of this, as it was probably chosen for a reason, but one "easy" "fix" would be doubling L2 throughput.
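
To make the write-through point concrete, here is a toy sketch that counts how many stores reach L2 under a write-through versus a write-back L1. It is not a model of Bulldozer's real hardware: the function name l2_writes, the 4-line LRU L1, the 64-byte line size and the allocate-on-store behaviour are all assumptions made for illustration, and it ignores any write-combining buffer sitting between L1 and L2.

```python
from collections import OrderedDict


def l2_writes(addresses, policy, lines=4, line_size=64):
    """Count writes that reach L2 for a stream of store addresses.

    policy is "write-through" or "write-back". The L1 is a toy
    fully associative LRU cache holding `lines` lines (hypothetical sizes).
    """
    cache = OrderedDict()  # line tag -> dirty flag, oldest entry first
    writes = 0
    for addr in addresses:
        tag = addr // line_size
        if tag in cache:
            cache.move_to_end(tag)  # mark as most recently used
        else:
            if len(cache) >= lines:
                _, dirty = cache.popitem(last=False)  # evict LRU line
                if policy == "write-back" and dirty:
                    writes += 1  # dirty line is written back on eviction
            cache[tag] = False
        if policy == "write-through":
            writes += 1  # every store is forwarded to L2 immediately
        else:
            cache[tag] = True  # store stays in L1 until eviction
    if policy == "write-back":
        writes += sum(1 for dirty in cache.values() if dirty)  # final flush
    return writes


# 400 stores that all land in the same 64-byte line
stores = [0, 8, 16, 24] * 100
print(l2_writes(stores, "write-through"))  # 400
print(l2_writes(stores, "write-back"))     # 1
```

In this toy, repeated stores to a hot line cost one L2 write each with write-through, so the measured L1 write rate ends up bounded by what L2 can absorb, which is the effect described above. For stores that stream through distinct lines, the two policies end up issuing the same number of L2 writes.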

Raqia · Feb 15, 2014

Some ISSCC links for Steamroller (and others):

http://pc.watch.impress.co.jp/docs/column/kaigai/20140214_635132.html
http://forwardthinking.pcmag.com/ch...eamroller-14-and-16nm-process-highlight-isscc
http://www.electronicsweekly.com/news/research/isscc-64bit-arm-v8-power8-2014-02/

3dilettante · Feb 16, 2014

The watch.impress segment has some interesting breakdowns of the chip.
Notably, the number of HVT (high-Vt) transistors drops massively from 32nm to 28nm.

The overall shift to having more nominal Vt transistors and a larger proportion being regular-length seems to match up with the general premise that Steamroller's top end is somewhat lower, so more can be put in the nominal pool than before.
However, the drop in HVT was such that I wonder if it had to do with some quirk like dropping SOI.
The leakage numbers show a generally more leakage-resistant process, except for again the fastest and leakiest transistors.

Electronics Weekly had one sentence mentioning resonant clocking, but is it any more in evidence here than it was for Piledriver, where it was a no-show?

Another blurb is the mention of the vdroop-detecting clocking scheme.
These days, such dynamic schemes echo Intel's Foxton technology more than ever.
Intel has a vdroop-aware clock scheme for its experimental graphics core.
AMD seems to be using it to keep things functional at regular voltages, while Intel's is for near-threshold operation.

edit:

One thing I forgot to comment on was the number of custom macros for Steamroller.
It has an order of magnitude more than AMD's Jaguar.
Part of that may go to the requirements for Steamroller's per-core performance range, as well as the historical tie that architectural line has with the old AMD fabs.
I would wonder if a Bulldozer-derived core would ever be found on a non-GF process with that level of specificity (and since Jaguar with much less hasn't been hopping fabs), and whether that could have been what scared Sony away from the rumored Steamroller PS4.

Raqia · Feb 19, 2014

There's a tweet about Steamroller's use of macros:

https://twitter.com/MikeDemler/statuses/433020331532378112

"#ISSCC Design complexity drove more use of automation in AMD Steamroller. Replaced multi-day SPICE pwr analysis with SNPS DOE tool, SP&R."

No idea what some of those acronyms mean. Hopefully, a full slide deck or a taped presentation like the one for Bobcat crops up.
