AMD Bulldozer review thread.

What's interesting is how much better the 2600K does over the 2500K.

Well, it's more that throughout these Bulldozer reviews the 2600K suddenly seems to do much better than the 2500K, whereas just a month ago when I was buying mine I wasn't so sure... I saw plenty of benchmarks where the 2500K was even with it, or even won, if HT wasn't disabled on the 2600K.

Probably it's just the benchmark selection, with reviewers finding many more apps where threading is heavy. I don't know. Anyway, feeling pretty good about my 2600K purchase right now, especially overclocked.

Muropaketti did seem to find a game where the dozer shines though:

[Graph: Dragon Age II framerate results from Muropaketti's FX-8150 review]

source: http://muropaketti.com/artikkelit/prosessorit/amd-fx-8150-zambezi,3
 

That looks like a 1:1 from AMD's Reviewer's Guide (and I mean that in a copy-and-paste sense; the same benchmarks are given in there). :)

Lovely! Until now, I was under the impression that DA2 was largely GPU dependent, but then, I cannot understand Finnish (and Google Translate into German doesn't help very much), so I don't know exactly what they tested here.
 
That review is a bit of an unfair comparison -- the Bulldozer setup is the only one in the pack with high-clocked memory (DDR3-1866).

Doesn't seem like memory would account for a nearly 40% increase in average FPS. In a TechSpot review, the i7 920 (2.66 GHz) fares as well as the i5 750 (2.66 GHz) despite having triple-channel memory vs. dual.

http://www.techspot.com/review/374-dragon-age-2-performance-test/page8.html

Although it is strange that the Intel cores fare relatively worse against the AMD cores in the Finnish review than in the TechSpot review, where even the i5 750 trumps the AMD Phenom II X6. Another TechSpot chart seems to indicate that DA2 is insensitive to CPU frequency changes, presumably because it is video-card bottlenecked:

http://www.techspot.com/review/374-dragon-age-2-performance-test/page7.html

so maybe there's something up with Nvidia's video card drivers, or maybe this game likes a larger aggregate CPU cache. It'd be nice to have an independent source reproduce some of these results.
 
Seconded. There was indeed an issue with DA2 running like crap on Nvidia when it came out, but Muropaketti seems to have used new 285.xx Beta drivers - at least they say so on their systems page.
 
Doesn't seem like memory would account for a nearly 40% increase in average FPS. In a TechSpot review, the i7 920 (2.66 GHz) fares as well as the i5 750 (2.66 GHz) despite having triple-channel memory vs. dual.
Higher MHz for RAM not only gives higher bandwidth but also lower latency. Though obviously it can't be the only reason for a 40% difference.
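Quick back-of-the-envelope sketch in C on the latency side (the timings are made up but typical for the era, not taken from the review): absolute CAS latency in nanoseconds is roughly CAS cycles divided by half the data rate.

```c
/* Illustrative only: hypothetical but typical DDR3 timings, not the
 * review's actual modules.  Memory clock = data rate / 2, so
 * latency_ns = cas_cycles / (data_rate_MTs / 2) * 1000. */
#include <stdio.h>

static double cas_ns(double cas_cycles, double data_rate_mt_s)
{
    return cas_cycles / (data_rate_mt_s / 2.0) * 1000.0;
}

int main(void)
{
    printf("DDR3-1333 CL9:  %.1f ns\n", cas_ns(9.0, 1333.0));   /* ~13.5 ns */
    printf("DDR3-1866 CL9:  %.1f ns\n", cas_ns(9.0, 1866.0));   /* ~9.6 ns  */
    printf("DDR3-1866 CL10: %.1f ns\n", cas_ns(10.0, 1866.0));  /* ~10.7 ns */
    return 0;
}
```

So even at a slightly looser CAS setting, the faster memory still ends up with lower absolute latency on top of the extra bandwidth.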
 
Is there any merit in sacrificing single-threaded performance for higher clock speed when power consumption is through the roof?
 
Though that makes me wonder - the test shows the mad and fma rates to be the same for BD. Unless I'm mistaken, BD can't do mad in a single clock (in a single FMAC unit), only fma. So does OpenCL allow mad to be performed as fma? Hard to believe...
(The other way round, it seems that at least on Intel, fma is indeed correctly emulated rather than just executed as mad, hence the performance isn't merely halved but an order of magnitude lower.)
Anyway, the numbers don't mean much given the heritage of that application.

edit: ok I looked it up.
mad definition: "Whether or how the product of a * b is rounded and how supernormal or subnormal intermediate products are handled is not defined"
fma: "Rounding of intermediate products shall not occur"

So indeed mad does not equal a normal mul+add - if you need exact mul+add semantics, your code apparently cannot use mad. Makes sense, I guess - after all, if you've got an execution unit which can do "true" mad with intermediate rounding, the compiler can still optimize mul+add into using it.
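For anyone curious what that intermediate-rounding clause actually buys you, here's a minimal C99 sketch of the same idea (plain fma() from math.h rather than OpenCL, with the values picked purely so the difference shows up in double precision):

```c
/* Build e.g.: cc -ffp-contract=off fma_demo.c -lm
 * (keep FP contraction off so the compiler doesn't fuse a*a + c itself) */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double a = 1.0 + ldexp(1.0, -27);    /* 1 + 2^-27, exactly representable */
    double c = -(1.0 + ldexp(1.0, -26)); /* -(1 + 2^-26) */

    /* a*a = 1 + 2^-26 + 2^-54 exactly; the 2^-54 term is lost when the
     * product is rounded to double before the add. */
    double mul_add = a * a + c;    /* product rounded first -> 0        */
    double fused   = fma(a, a, c); /* single rounding at the end -> 2^-54 */

    printf("mul+add: %g\n", mul_add);
    printf("fma:     %g\n", fused);
    return 0;
}
```

With contraction disabled, the separate mul+add prints 0 while fma prints 2^-54 (about 5.55e-17) - which is exactly the "rounding of intermediate products shall not occur" part of the fma definition.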
 
V3: Higher clock speed compared to what? The clock speeds aren't really higher at all. Even when overclocked, the ceiling is generally around 4.6 GHz if you don't go to extreme cooling - right around where you can easily OC your 2500K and 2600K too.

If they did aim for much higher clocks and had to fall back to what they are... then they fell into the same trap as Intel did with the Pentium 4. Remember when Intel was saying they would soon have a 10 GHz Pentium 4? It did not happen :)
 
Lovely! Until now, I was under the impression that DA2 was largely GPU dependent, but then, I cannot understand Finnish (and Google Translate into German doesn't help very much), so I don't know exactly what they tested here.


I'll do my best translating it:

Dragon Age II was tested at 1920x1080 resolution, without anti-aliasing, with 16x anisotropic filtering, using the High image quality preset. Minimum, average and maximum framerates were measured using Fraps. A bigger score is better in the graph.

Dragon Age II seemed to favor AMD processors, and the FX-8150 fared especially well with its 55.5 FPS framerate, leaving the others behind at less than 50 FPS.
 
V3: Higher clock speed compared to what? The clock speeds aren't really higher at all. Even when overclocked, the ceiling is generally around 4.6 GHz if you don't go to extreme cooling - right around where you can easily OC your 2500K and 2600K too.

Seems like only temperature/power consumption is limiting the overclock. With good air cooling people are passing 5 GHz stable, but no wonder you can only go to 4.6 if you're already at 80°C+... The power consumption at 5 GHz on air is astounding (but all of that could easily be GlobalFoundries process related).
 
Years ago the DOSBox developers wrote an assembly FPU for the DOSBox x86 dynamic core to speed up FPU-heavy games like Quake. I'm sure they're using x87 in there.

But there is a lot more to what DOSBox is doing than just the CPU emulation. Audio and video emulation isn't trivial.

I checked the source code and it looks like this is the case: the recompiler calls stub functions for x87 ops which are implemented with hand-written ASM. There's some expensive-looking FP state save/restore code along with the usual DOSBox lazy-flags stuff, and all operations are loaded from and stored to 80-bit memory locations without register caching. It's not really hard to see how this could bog BD down even more than running the original code would have; there'd be some really bad dependency chains from all the load/op/store, which have high latency on BD's FPU.
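Just to illustrate the kind of pattern I mean, here's a simplified hypothetical sketch in C - not the actual DOSBox source (the real thing is hand-written x87 asm with FPU state save/restore on top), just the load/op/store shape:

```c
/* Hypothetical sketch of the load/op/store pattern described above --
 * NOT the actual DOSBox code.  Assumes a compiler/target where
 * long double is the 80-bit x87 format (e.g. GCC on x86). */
#include <stdio.h>

/* Emulated x87 register stack, kept in memory as 80-bit values. */
static long double emu_st[8];

/* Stub for an emulated FMUL ST(0), ST(1): load both operands from the
 * in-memory register file, multiply, store the result back.  Every
 * emulated op pays a load and a store, so back-to-back ops form a
 * load -> op -> store dependency chain through memory. */
static void emu_fmul_st0_st1(void)
{
    emu_st[0] = emu_st[0] * emu_st[1];
}

int main(void)
{
    emu_st[0] = 1.5L;
    emu_st[1] = 2.0L;
    for (int i = 0; i < 4; i++)
        emu_fmul_st0_st1();          /* serialized through emu_st[0] */
    printf("ST(0) = %Lg\n", emu_st[0]);
    return 0;
}
```

With BD's long FPU load and store latencies, a chain like that never gets any instruction-level parallelism to hide behind.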

Quake just needs a VGA or VESA dumb framebuffer and a Sound Blaster dumb audio buffer. Not a whole lot goes into emulating these. Although there's some work involved in converting the display format, it should absolutely pale in comparison to what's needed for the CPU emulation.
 
Thanks for the insight, Exophase. I suppose we're seeing another serious weakness in BD. It would be interesting to compare the DOSBox "normal" and "dynamic" cores on BD.

I wonder if the architectural flaws have been caused by engineers leaving over the years and new people trying to pick up the slack.
 
I'll do my best translating it:

Dragon Age II was tested at 1920x1080 resolution, without anti-aliasing, with 16x anisotropic filtering, using the High image quality preset. Minimum, average and maximum framerates were measured using Fraps. A bigger score is better in the graph.

Dragon Age II seemed to favor AMD processors, and the FX-8150 fared especially well with its 55.5 FPS framerate, leaving the others behind at less than 50 FPS.
Thanks a lot! That at least rules out the possibility that someone just typed the wrong number into the graph. I am still surprised, though, that DA2 reflects CPU changes to such an extent. By contrast, its predecessor - normally being CPU bound - seemed to heavily favor Intel CPUs. Strange. :|
 
That is a sad month for the industry indeed.

Anyone got any idea why BD has such poor single-threaded performance, even though it should at least be on equal terms with Phenom II AND has higher clocks?
 
V3: Higher clock speed compared to what? The clock speeds aren't really higher at all. Even when overclocked, the ceiling is generally around 4.6 GHz if you don't go to extreme cooling - right around where you can easily OC your 2500K and 2600K too.

I mean, from some of the reviews they say AMD designed BD so it could be clocked much higher by sacrificing some performance per clock, but it missed the target anyway.

I thought that since the P4, designing chips for clock speed has been a dead end. Why did they attempt it with BD? What was their reasoning? I wasn't aware they were going the Pentium 4 route with BD until I read the reviews.
 
Well, going for more performance per clock isn't exactly easy to accomplish either. People were thinking years ago that multicore was the only answer for more performance, but Intel is clearly coming up with ways to accomplish all of the above, and that's unfortunate for AMD. I'd say SB has the same clock headroom as BD, and that's pretty sobering considering what SB can do.

Things would be prettier if AMD hadn't seemingly made so many misjudgments in their new architecture that cripple it so often. Sometimes it really does shine but that is rare.
 