Old 31-Jan-2012, 01:35   #1151
mczak
Senior Member
 
Join Date: Oct 2002
Posts: 2,727
Default

That's the thing: I'm not sure there are really "performance bugs". Sure, there are performance issues, but it's unclear whether these are really the result of bugs as such and not just an unbalanced architecture. (I would have considered the low L1I associativity a likely candidate for causing performance issues, but apparently AMD didn't think so...)
Maybe some changes, like being able to execute some more instructions in the AGUs, go in the right direction, but those are not in Piledriver.
mczak is offline   Reply With Quote
Old 31-Jan-2012, 06:16   #1152
3dilettante
Regular
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 5,486
Default

I think someone said the leaked die shot showed extra SRAM in the area of the branch predictor. I'm not entirely sure.

The bulk of the improvement seems to be around reducing the power draw of the cores and better clocks. Llano had some notable weaknesses here that Piledriver's more advanced turbo and circuit implementations should improve upon.

Llano's inflexible turbo option doesn't set the bar that high.
__________________
Dreaming of a .065 micron etch-a-sketch.
3dilettante is offline   Reply With Quote
Old 31-Jan-2012, 11:33   #1153
fellix
Senior Member
 
Join Date: Dec 2004
Location: Varna, Bulgaria
Posts: 3,033
Default

Quote:
Originally Posted by 3dilettante View Post
I think someone said the leaked die shot showed extra SRAM in the area of the branch predictor. I'm not entirely sure.
Yes, that area is where the branch prediction logic resides. Probably the BTB size has been increased, along with other still unknown tweaks.
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic.
Microsoft: Russia -- Big and bloated.
Linux: EU -- Diverse and broke.

Last edited by fellix; 31-Jan-2012 at 16:12.
fellix is offline   Reply With Quote
Old 31-Jan-2012, 15:17   #1154
eastmen
Senior Member
 
Join Date: Mar 2008
Posts: 6,390
Default

I'm going to assume a more mature 28nm process will make up the performance difference.
eastmen is offline   Reply With Quote
Old 31-Jan-2012, 15:47   #1155
3dilettante
Regular
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 5,486
Default

I thought Trinity was 32nm.
__________________
Dreaming of a .065 micron etch-a-sketch.
3dilettante is offline   Reply With Quote
Old 31-Jan-2012, 17:10   #1156
mczak
Senior Member
 
Join Date: Oct 2002
Posts: 2,727
Default

Quote:
Originally Posted by fellix View Post
Yes, that area is where the branch prediction logic resides. Probably the BTB size has been increased, along with other still unknown tweaks.
According to the SOG, all family 15h processors have the same number of BTB entries.
That's why I said there don't really seem to be many architectural changes; the SOG goes into quite some detail. But yes, better clocks / turbo could help quite a bit.
mczak is offline   Reply With Quote
Old 31-Jan-2012, 22:32   #1157
Lightman
Senior Member
 
Join Date: Jun 2008
Location: Torquay, UK
Posts: 1,173
Default

Apparently 5%-8% better IPC for new module. Rest of performance jump is made by better power management / Turbo / Clocks.
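
To put the IPC-versus-clock split in numbers, here is a minimal sketch. It assumes performance scales as IPC times clock frequency, and it uses a 10% overall gain purely as a placeholder target (that figure is an assumption for illustration, not a measurement). The function name is hypothetical.

```python
def required_clock_uplift(total_gain: float, ipc_gain: float) -> float:
    """Clock increase needed so that (1 + ipc_gain) * (1 + clock) == 1 + total_gain."""
    return (1.0 + total_gain) / (1.0 + ipc_gain) - 1.0


# Placeholder overall target; an assumption, not a measured figure.
ASSUMED_TOTAL_GAIN = 0.10

for ipc_gain in (0.05, 0.08):  # the 5%-8% IPC range mentioned above
    clock = required_clock_uplift(ASSUMED_TOTAL_GAIN, ipc_gain)
    print(f"IPC +{ipc_gain:.0%} -> clocks must rise about {clock:.1%}")
```

Under those assumptions only a few percent of extra clock would be needed; a larger overall target shifts the burden onto clocks and turbo accordingly.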
Lightman is offline   Reply With Quote
Old 03-Feb-2012, 10:38   #1158
almighty
Naughty Boy!
 
Join Date: Dec 2006
Posts: 2,469
Default

Quote:
Originally Posted by Lightman View Post
Apparently 5%-8% better IPC for new module. Rest of performance jump is made by better power management / Turbo / Clocks.
So it'll be on par with Phenom II and around 3 generations behind Intel's latest and greatest.

At this pace Intel will be a good 60%+ faster per clock.
almighty is offline   Reply With Quote
Old 03-Feb-2012, 13:19   #1159
eastmen
Senior Member
 
Join Date: Mar 2008
Posts: 6,390
Default

Quote:
Originally Posted by almighty View Post
So it'll be on par with Phenom II and around 3 generations behind Intel's latest and greatest.

At this pace Intel will be a good 60%+ faster per clock.

Well they are saying a 17watt Trinity will be as fast as a 30w Llano. That should put it pretty close to Sandy bridge IMO.

The jury is still out on what increases if any we will see with Ivery bridge
eastmen is offline   Reply With Quote
Old 03-Feb-2012, 13:38   #1160
almighty
Naughty Boy!
 
Join Date: Dec 2006
Posts: 2,469
Default

Quote:
Originally Posted by eastmen View Post
Well they are saying a 17watt Trinity will be as fast as a 30w Llano. That should put it pretty close to Sandy bridge IMO.

The jury is still out on what increases if any we will see with Ivery bridge
Leaked Trinity results over at overclock.net show that it's no faster than Llano.
almighty is offline   Reply With Quote
Old 03-Feb-2012, 15:03   #1161
DavidC
Member
 
Join Date: Sep 2006
Posts: 307
Default

Quote:
Originally Posted by almighty View Post
Leaked Trinity results over at overclock.net show that it's no faster than Llano.
What, the OBR result? He updated later to say it was a Chinese fake. I've read elsewhere that it was a Bulldozer chip, not Piledriver.
DavidC is offline   Reply With Quote
Old 11-Feb-2012, 07:10   #1162
denev2004
Member
 
Join Date: Apr 2010
Location: China
Posts: 143
Default

Quote:
Originally Posted by mczak View Post
I wonder how AMD expects to get 10% higher performance from Piledriver. According to that Software Optimization Guide, there's really not that many performance enhancing changes. Ok 10% more entries in load/store queue (44 instead of 40) is nice, as is the doubled l1 dtlb size (from 32 to 64). There's also a couple more supported instructions but otherwise it seems virtually unchanged - even keeping the very lame l1i associativity of Bulldozer for both Trinity and Vishera.
Of course you can't directly estimate performance by this guide but since everything seems to be so extremely similar I really don't expect much (except maybe higher clocks).
Not really. The gains could come from latency as well as CR.
I hope Piledriver's 32nm will be a later revision of the process. Note that GF's first 45nm revision was not that good, while the latest one is really great.
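
As a quick check on the queue and TLB figures quoted above, the sketch below computes the relative increase and the memory a data TLB can map. The 4 KiB page size is an assumption (larger pages change the reach), and the helper names are made up for illustration.

```python
PAGE_SIZE_KIB = 4  # assumption: standard 4 KiB pages


def relative_increase(old: int, new: int) -> float:
    """Fractional growth from old to new."""
    return (new - old) / old


def tlb_reach_kib(entries: int, page_size_kib: int = PAGE_SIZE_KIB) -> int:
    """Memory covered by a fully populated TLB, in KiB."""
    return entries * page_size_kib


# Entry counts as quoted from the Software Optimization Guide discussion.
print(f"Load/store queue: {relative_increase(40, 44):.0%} more entries")
print(f"L1 DTLB reach: {tlb_reach_kib(32)} KiB -> {tlb_reach_kib(64)} KiB")
```

This only restates the quoted sizes in other units; it says nothing about how much performance the larger structures actually buy.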
__________________
Well I'm not a native English speaker so there might be misuse through my words. I just hope it won't cause too much misunderstanding.
denev2004 is offline   Reply With Quote
Old 11-Feb-2012, 10:00   #1163
AlexV
Heteroscedasticitate
 
Join Date: Mar 2005
Posts: 2,438
Default

Quote:
Originally Posted by eastmen View Post
Well they are saying a 17watt Trinity will be as fast as a 30w Llano. That should put it pretty close to Sandy bridge IMO.

The jury is still out on what increases if any we will see with Ivery bridge
No, what they're cleverly saying is that perf/watt should be sort of equal... in a pretty specific setting. And the jury is far less out on what Ivy Bridge brings (please make an effort and at least use proper codenames and proper form in written communication; there are honest mistakes and then there's just sloppiness) than it is on what Piledriver brings to BD. Let's not build up unrealistic expectations again only to have them go kaboom.
__________________
Donald Knuth: Science is what we understand well enough to explain to a computer. Art is everything else we do.
AlexV is online now   Reply With Quote
Old 13-Feb-2012, 09:19   #1164
I.S.T.
Senior Member
 
Join Date: Feb 2004
Posts: 2,570
Default

The only way a Trinity at 17W could compare to a 30W Llano anyway would be solely in the GPU. The CPU will be slower.
I.S.T. is offline   Reply With Quote
Old 13-Feb-2012, 09:52   #1165
itsmydamnation
Member
 
Join Date: Apr 2007
Location: Australia
Posts: 835
Default

Quote:
Originally Posted by I.S.T. View Post
The only way a Trinity at 17W could compare to a 30W Llano anyway would be solely in the GPU. The CPU will be slower.
You know this based on what? Bulldozer does very well at low power. Quad-core Llano at 35W TDP is only 1.4GHz, so with a little bit of extra IPC over Bulldozer, a 100-200MHz higher base clock and a better turbo than Llano, I can see it matching.

Not saying that it will... but you speak like it's fact when the reality is you have NFI.

The 35W 8-core does have a base clock of 1.6GHz and a turbo of 2.8GHz, so it could get close.
itsmydamnation is offline   Reply With Quote
Old 13-Feb-2012, 09:58   #1166
Zaphod
Remember
 
Join Date: Aug 2003
Posts: 2,116
Default

The analyst day slides are a bit inconsistent:
Quote:
The score for the 2012 AMD A4-4355M (ULV-17w) on the "Pumori" reference design for PC Mark Vantage Overall benchmark is projected to score 3525
The score for the 2012 AMD A6-4455M (ULV-17w) on the "Pumori" reference design for PC Mark Vantage Overall benchmark is projected to score 4200
The former is certainly much less impressive compared to the A6-3400M (35w) on the "Torpedo" reference design scoring 4545 and the E2-1800 (another slight boost to Zacate?) on the "Torpedo" reference design scoring 2757.
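
To make the comparison concrete, here is a small sketch that puts the quoted scores side by side, both relative to the A6-3400M and per watt of TDP. TDP is a rated limit, not measured power draw, so the per-watt column is only a rough proxy. The dictionary layout and function name are hypothetical.

```python
# (PCMark Vantage Overall score, TDP in watts) as quoted from the slides.
parts = {
    "A4-4355M (17W, projected)": (3525, 17),
    "A6-4455M (17W, projected)": (4200, 17),
    "A6-3400M (35W)": (4545, 35),
}

baseline_score = parts["A6-3400M (35W)"][0]


def summarize(score: int, tdp_w: int) -> tuple[float, float]:
    """Return (score relative to the A6-3400M, score per TDP watt)."""
    return score / baseline_score, score / tdp_w


for name, (score, tdp_w) in parts.items():
    rel, per_watt = summarize(score, tdp_w)
    print(f"{name}: {rel:.0%} of A6-3400M, {per_watt:.0f} points per TDP watt")
```

Read this way, the 17W parts trail the 35W Llano in absolute score while leading on score per rated watt, which is why the slides can look both unimpressive and favourable depending on the metric.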
Zaphod is offline   Reply With Quote
Old 09-Mar-2012, 09:24   #1167
AnarchX
Senior Member
 
Join Date: Apr 2007
Posts: 1,515
Default

Quote:
Testing performed by AMD Performance Labs. The score for the 2012 AMD A10-4600M on the “Pumori” reference design for PC Mark Vantage Productivity benchmark shows an increase of up to 29% over the 2011 AMD A8-3500M on the “Torpedo” reference design. The AMD A10-4600M APU has a score of 6125 and the 2011 AMD A8-3500M APU scored 4764.
http://blogs.amd.com/fusion/2012/03/...9D-generation/

FX-4100 (3.6GHz+, 8MiB L3): 6113
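
The "up to 29%" claim can be checked directly from the two scores in the footnote. A minimal sketch (the function name is made up for illustration):

```python
# PCMark Vantage Productivity scores as quoted above.
a10_4600m = 6125  # 2012 "Pumori" reference design
a8_3500m = 4764   # 2011 "Torpedo" reference design
fx_4100 = 6113    # desktop FX-4100 figure quoted above


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new / old - 1.0) * 100.0


print(f"A10-4600M over A8-3500M: {percent_gain(a10_4600m, a8_3500m):.1f}%")
print(f"A10-4600M over FX-4100:  {percent_gain(a10_4600m, fx_4100):.1f}%")
```

The first figure comes out a little under 29%, consistent with AMD rounding up, and the mobile A10 lands within a fraction of a percent of the FX-4100 score.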
AnarchX is offline   Reply With Quote
Old 10-Mar-2012, 14:55   #1168
Raqia
Member
 
Join Date: Oct 2003
Posts: 426
Default

Quote:
Originally Posted by AnarchX View Post
The lack of L3 seems to hurt for some tests.
Raqia is offline   Reply With Quote
Old 15-Mar-2012, 08:47   #1169
denev2004
Member
 
Join Date: Apr 2010
Location: China
Posts: 143
Default

Agner Fog: Test results for AMD Bulldozer processor

BTW the new version of The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers has been released
__________________
Well I'm not a native English speaker so there might be misuse through my words. I just hope it won't cause too much misunderstanding.
denev2004 is offline   Reply With Quote
Old 11-Apr-2012, 01:11   #1170
fellix
Senior Member
 
Join Date: Dec 2004
Location: Varna, Bulgaria
Posts: 3,033
Default



AMD Trinity (Piledriver) Outperforms Bulldozer in Integer and Floating-Point Benchmarks
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic.
Microsoft: Russia -- Big and bloated.
Linux: EU -- Diverse and broke.
fellix is offline   Reply With Quote
Old 11-Apr-2012, 13:38   #1171
fehu
Member
 
Join Date: Nov 2006
Location: Somewhere over the ocean
Posts: 806
Default

Two A8-4500M entries with different performance?
fehu is offline   Reply With Quote
Old 11-Apr-2012, 13:54   #1172
Gubbi
Senior Member
 
Join Date: Feb 2002
Posts: 2,869
Default

And a 4.2GHz multi core K8 derivative ?

Cheers
__________________
I'm pink, therefore I'm spam
Gubbi is offline   Reply With Quote
Old 11-Apr-2012, 14:09   #1173
hoho
Senior Member
 
Join Date: Aug 2007
Location: Estonia
Posts: 1,218
Default

Quote:
Originally Posted by fehu View Post
Two A8-4500M entries with different performance?
Different work units having different calculation requirements is my guess. Though without having actual links to the work units used to make that table I can't be sure.
hoho is offline   Reply With Quote
Old 14-Apr-2012, 04:54   #1174
denev2004
Member
 
Join Date: Apr 2010
Location: China
Posts: 143
Default

Quote:
Originally Posted by Gubbi View Post
And a 4.2GHz multi core K8 derivative ?

Cheers
It seems the clock listed here is the highest turbo (TB) clock.
__________________
Well I'm not a native English speaker so there might be misuse through my words. I just hope it won't cause too much misunderstanding.
denev2004 is offline   Reply With Quote
Old 14-Apr-2012, 19:25   #1175
Lightman
Senior Member
 
Join Date: Jun 2008
Location: Torquay, UK
Posts: 1,173
Default

Original source explains why:

Quote:
Since we don't know the exact clock frequencies of the benchmark runs, it is difficult to find the correct value for calculating per GHz results. I estimated those based on turbo clocks, which might lead to skewed results. At least in the case of comparing Trinity with its Piledriver cores to the FX models, I hope that rather similar turbo mode behaviour should reduce the error margin.
OK, here comes the table comparing several values I filtered out of my collected BOINC results to have OS and client version the same. As you can see, Piledriver w/o L3 cache seems to perform a bit better than BDver1 based FX models:

Note: I used "Trinity vs. Bulldozer" to denote the difference between a L3-less Piledriver core and a Bulldozer core, which always had L3 available.
Another note (as of 04/10): In the Piledriver vs. Bulldozer columns I divided the Trinity value by the maximum of all FX values. Furthermore, the FP benchmark likely ran at base clock frequency. I'll add more on that in a follow-up article.
http://citavia.blog.de/2012/04/08/tr...ance-13460109/
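
The normalization described in the quote can be sketched as follows. The source table is not reproduced here, so every score and clock below is a made-up placeholder chosen only to show the mechanics; the function names are hypothetical as well.

```python
def per_ghz(score: float, assumed_clock_ghz: float) -> float:
    """Benchmark score divided by the clock the run is assumed to have held."""
    return score / assumed_clock_ghz


def trinity_vs_best_fx(trinity_per_ghz: float, fx_per_ghz: list[float]) -> float:
    """Trinity per-GHz value divided by the maximum of all FX per-GHz values."""
    return trinity_per_ghz / max(fx_per_ghz)


# Placeholder data, NOT taken from the source table.
fx_values = [per_ghz(1500.0, 4.0), per_ghz(1330.0, 3.5)]
trinity_score = 1000.0

# Same Trinity score, two different guesses about the clock it ran at.
ratio_if_turbo = trinity_vs_best_fx(per_ghz(trinity_score, 2.5), fx_values)
ratio_if_base = trinity_vs_best_fx(per_ghz(trinity_score, 2.0), fx_values)

print(f"Assuming turbo clock: {ratio_if_turbo:.2f}x the best FX per GHz")
print(f"Assuming base clock:  {ratio_if_base:.2f}x the best FX per GHz")
```

The two ratios differ only because of the assumed clock, which illustrates the caveat in the quote: the per-GHz comparison is only as good as the guess about which frequency each run actually sustained.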
Lightman is offline   Reply With Quote
