Old 30-Mar-2011, 22:30   #1
AlStrong
penguins
 
Join Date: Feb 2004
Posts: 13,978
RWT Analyzes Bulldozer Benchmarks

Recently, benchmarks for AMD's eagerly awaited Bulldozer architecture have leaked online. So far, this has mostly created uncertainty about the performance of future products, rather than answering questions.

David Kanter, a longtime friend at RealWorldTech who is always eager to discuss CPU architecture and performance, takes a look at the test system and benchmarks and explains why it is difficult to estimate performance precisely. He then analyzes the benchmark results and draws several conclusions about Bulldozer's microarchitecture and performance, and what they may mean for future products.

Well, we certainly aren't going to spoil it for you, but we do encourage you to head over and check out the thorough analysis for yourself! Anyone remotely interested in Banana Dong (*ahem* B3D Codename for Bulldozer) shan't be disappointed.
Old 30-Mar-2011, 23:08   #2
Pete
Moderate Nuisance
 
Join Date: Feb 2002
Posts: 4,653

I can't help but read the article title as a reference to a certain triple rainbow.

Reading now!
Old 31-Mar-2011, 07:15   #3
dkanter
Regular
 
Join Date: Jan 2008
Posts: 354

Quote:
Originally Posted by Pete
I can't help but read the article title as a reference to a certain triple rainbow.

Reading now!

That was on purpose :P

David
__________________
www.realworldtech.com
Old 31-Mar-2011, 23:59   #4
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,141

I've tried to imagine what a bad case would be for BD.

I suppose it would be code that didn't use FMA, shuffled a lot (cutting FP throughput in half), had two threads slamming the write pipe with scattered writes that didn't coalesce in the write coalescing cache, and potentially wasn't blocked optimally for the smaller L1.
__________________
Dreaming of a .065 micron etch-a-sketch.
Old 01-Apr-2011, 06:44   #5
entity279
Member
 
Join Date: May 2008
Location: Romania
Posts: 341

But what does the FP shuffle actually do? Sorry, but I really am clueless on this one.
Old 01-Apr-2011, 14:48   #6
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,141

It moves values around within one or more SIMD registers.
The XBAR unit can permute vectors more flexibly than AVX can, but it also takes up one of the two FP issue ports.
That flexibility could save instructions: a single permute can move values around within and between vectors in one operation, instead of needing several less generic shuffles to achieve the same end.
That's in XOP, however, so it may be a very useful instruction that doesn't get used as much as it could.
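
To make the "one permute versus several shuffles" point concrete, here is a rough scalar model in Python. It is only a sketch: the function names are made up for illustration, the vectors are plain 16-element lists standing in for 128-bit registers, and the selector encoding (0-15 picks from the first vector, 16-31 from the second) is a simplification that ignores any extra per-byte operations a real permute instruction may encode. The step count is illustrative and not a claim about any particular instruction sequence.

```python
# Scalar model of byte shuffles on 16-byte vectors (hypothetical helper names).

def shuffle_one_source(src, idx):
    """Reorder bytes within a single vector; only the low 4 index bits are used."""
    return [src[i & 0x0F] for i in idx]

def blend(x, y, take_y):
    """Per byte, pick from y where take_y is true, otherwise from x."""
    return [yv if t else xv for xv, yv, t in zip(x, y, take_y)]

def permute_two_source(a, b, sel):
    """Pick each output byte from either vector: 0-15 -> a, 16-31 -> b."""
    combined = a + b
    return [combined[s & 0x1F] for s in sel]

a = list(range(16))         # bytes 0..15
b = list(range(100, 116))   # bytes 100..115
# An arbitrary pattern mixing bytes from both sources.
sel = [3, 28, 0, 17, 9, 9, 31, 4, 16, 2, 21, 15, 30, 7, 12, 25]

# Generic two-source permute: one operation.
one_step = permute_two_source(a, b, sel)

# Same result from less generic pieces: two shuffles plus a blend.
from_a = shuffle_one_source(a, sel)
from_b = shuffle_one_source(b, sel)
emulated = blend(from_a, from_b, [s >= 16 for s in sel])

assert one_step == emulated
```

In this model the generic permute does in one step what otherwise takes three, which is the instruction saving described above. The flip side is that each such operation still occupies an FP issue slot.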
Old 01-Apr-2011, 15:29   #7
entity279
Member
 
Join Date: May 2008
Location: Romania
Posts: 341

Thank you