AMD Bulldozer Core Patent Diagrams

Discussion in 'PC Industry' started by Raqia, Apr 16, 2009.

  1. Doomtrooper

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,328
    Likes Received:
    0
    Location:
    Ontario, Canada

    I would not snicker unless you want to pay a small mortgage to Intel without competition like back in the Pentium 60 days...but be my guest.
     
  2. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,528
    Likes Received:
    107
    This may happen in terms of fixing the currently crippling bug they have with writes to the L1, and other apparent gimpiness. So it will be improved in the sense of being less weak, but it'll still be far too weak, IMHO.
     
  3. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    Or strongly lowering latency. For desktop workloads it would probably be more useful to have a smaller L2 cache (512 KB to 1 MB) with much lower latency than 2 MB of cache with those pesky timings. (L1 bug aside, which would be very useful to analyze in detail: if BD performs on par with a 2600K with a "crippled" L1, it would be interesting to know how well it could have performed without the problem.)
     
  4. fehu

    Veteran Regular

    Joined:
    Nov 15, 2006
    Messages:
    1,441
    Likes Received:
    380
    Location:
    Somewhere over the ocean
    So all the problems come from a faulty cache design, in your opinion?
    And is that something that can be easily addressed in Piledriver?
     
  5. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    If you are asking me: not all the possible issues come from the cache design (the lower execution unit count per "core" and the need for high clocks are examples). But having a better cache system always helps :grin:
    I never said that this would be easy. But 18-20 cycles of L2 latency compared to SB's 10, with a clock frequency that is higher but not high enough to compensate for the difference, could hurt performance. This is of course the result of the cache size and frequency targets, but larger caches use more die area, which lowers the performance/area ratio.
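    To put rough numbers on the latency point, here is a minimal sketch that converts cycle counts into wall-clock time. The cycle counts (10 for SB, 18-20 for BD) are the ones quoted in this post; the clock speeds (3.4 GHz and 3.6 GHz) and the helper names are assumptions picked only for illustration, not measurements.

```python
# Hypothetical illustration: the cycle counts come from the post,
# the clock speeds below are assumed values, not measured ones.

def latency_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in core cycles to nanoseconds (1 GHz = 1 cycle per ns)."""
    return cycles / clock_ghz


def break_even_clock_ghz(cycles: float, ref_cycles: float, ref_clock_ghz: float) -> float:
    """Clock at which `cycles` takes as long as `ref_cycles` does at `ref_clock_ghz`."""
    return ref_clock_ghz * cycles / ref_cycles


SB_L2_CYCLES = 10         # from the post
BD_L2_CYCLES = (18, 20)   # from the post
SB_CLOCK_GHZ = 3.4        # assumption
BD_CLOCK_GHZ = 3.6        # assumption

if __name__ == "__main__":
    print(f"SB L2: {latency_ns(SB_L2_CYCLES, SB_CLOCK_GHZ):.2f} ns")
    for cycles in BD_L2_CYCLES:
        ns = latency_ns(cycles, BD_CLOCK_GHZ)
        needed = break_even_clock_ghz(cycles, SB_L2_CYCLES, SB_CLOCK_GHZ)
        print(f"BD L2 ({cycles} cycles): {ns:.2f} ns, break-even clock {needed:.2f} GHz")
```

    Under these assumed clocks, 10 cycles at 3.4 GHz is about 2.9 ns while 18 cycles at 3.6 GHz is 5.0 ns, and the 18-cycle cache would need to run at roughly 6.1 GHz to match in absolute time. That is the sense in which the clock advantage does not compensate.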
     
    #865 leoneazzurro, Sep 26, 2011
    Last edited by a moderator: Sep 26, 2011
  6. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,528
    Likes Received:
    107
    No and no. Their problems seem to be quite uArch related. I'm also not seeing the 2600K parity happening all that much (I'm pretty sure 2500K parity is not as clear-cut either, but hey, I do hope I'm wrong!). Also, why are we pleased that a much larger, 125W TDP CPU sort of almost matches a smaller, 95W part?
     
  7. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    Because before that we had an even larger 125W TDP CPU that did not even match a smaller 95W part, maybe? So, at least until Ivy Bridge, we could have a little more competition.
    Which µarchitecture problems are you referring to? Narrow execution units, problems with the front-end, or something else?
     
  8. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,486
    Likes Received:
    397
    Location:
    Varna, Bulgaria
    The Core 2 line proved that a big and fast L2 (shared between 2 cores) is a very good fit for desktop and some WS applications, but due to the inadequate system architecture it scaled poorly in SMP configurations. In Nehalem, Intel literally pushed the L3 cache as "the new L2": a sort of backbone for the whole memory sub-system that keeps copies of the contents of every cache above it, so the coherency traffic is kept out of the "real" L2s, while still being fast and large enough to rapidly serve any misses from the upper levels. That turned the L2 cache into a more specific role: a truly private, small but very low latency (only 10 cycles) piece of memory. The L1D cache (32KB) is 4 cycles in comparison, but 8 times smaller in size.
    That way, Intel killed two birds with one stone: a new scalable and very efficient server architecture for heavily threaded loads, and at the same time a potent performer for desktop applications, all thanks to the versatile "backbone" concept. Westmere-EX and SNB naturally developed this philosophy even further. All this is on top of the already first-class HW prefetch mechanism and memory disambiguation.
    It all proves that a sheer size advantage is simply not enough anymore. AMD didn't invest in a more sophisticated and elegant workaround for this problem. They just scaled up the same old concept to a new level of ridiculousness in a gamble to save the day.
     
  9. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,528
    Likes Received:
    107
    BD, if those numbers are even remotely accurate, doesn't change the competitive landscape at all. AMD is still left fighting for the same scraps it was fighting for before, without impacting Intel in any way, shape or form. The front-end should be fine on paper, just like many other things (hard to gauge how that pans out in practice though), but the execution engine seems bonkers in practice.

    Their cache hierarchy is pretty much bonkers too, with its exclusivism and other contortionisms, even excluding the apparent bug (this is uArch, not something that's trivially fixed or validated, IMHO). Their Turbo implementation seems rather limited too, but maybe that can be tweaked. It looked pretty decent on paper, mind you, but the paper was pretty vague and the implementation is anything but.
     
  10. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    Of course Intel has the advantage. But at least the situation improved.

    Cache hierarchy cannot be changed or overhauled in Piledriver? Execution units cannot be improved (maybe dedicating less space to caches)?
    If you know that Piledriver will be only a slightly tweaked Bulldozer, OK.
     
  11. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    I'll put my money where the performance is. If you like to support a company just for the sake of "competition" even if the item in question is underwhelming, then be my guest.
     
  12. Sxotty

    Veteran

    Joined:
    Dec 11, 2002
    Messages:
    4,869
    Likes Received:
    330
    Location:
    PA USA
    That has nothing to do with snickering about it. I am just tired of AMD failing. It isn't funny, it is sad. Intel is already slowing down because they have no reason to put anything better out. They are charging $1k for a processor. It is ridiculous.
     
  13. hoho

    Veteran

    Joined:
    Aug 21, 2007
    Messages:
    1,218
    Likes Received:
    0
    Location:
    Estonia
    They also charged $1k when their highest-end CPU was miles behind AMD's midrange. Since around Core 2 you've been able to get a decently performing CPU from Intel for around $200-300. Sure, it's not the highest-end, but I wouldn't say paying a 4x higher price for a few hundred MHz extra is worth it. Going from 4 to 6 cores is another thing of course, but their prices start from around €500; not sure how much cheaper they could be in the US.
     
  14. Doomtrooper

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,328
    Likes Received:
    0
    Location:
    Ontario, Canada
    Never said that. I just said that laughing at a competitor trying to take on a giant, when as consumers we NEED AMD, is not the smartest move... wait for full benchmarks before snickering.
     
  15. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    8,456
    Likes Received:
    578
    Location:
    WI, USA
    AMD too had a bit of fun with selling desktop processors for $1000 back during the Athlon 64 era. Those FX chips on 939 come to mind. Then they got bulldozed (harhar) by Core 2 and cut most of their product prices in half. Thanks Intel. ;)
     
  16. Sxotty

    Veteran

    Joined:
    Dec 11, 2002
    Messages:
    4,869
    Likes Received:
    330
    Location:
    PA USA
    They are not cheap, and that is the problem. I have a 6 core (AMD) and don't want to go to 4 cores.
     
  17. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,151
    Likes Received:
    571
    Location:
    France
    If 4 cores can beat your 6 cores even in heavily multithreaded apps, what's the problem?
     
  18. Malo

    Malo Yak Mechanicum
    Legend Veteran Subscriber

    Joined:
    Feb 9, 2002
    Messages:
    6,972
    Likes Received:
    3,050
    Location:
    Pennsylvania
    Because 6 > 4!
     
  19. Sxotty

    Veteran

    Joined:
    Dec 11, 2002
    Messages:
    4,869
    Likes Received:
    330
    Location:
    PA USA
    The problem is also that I run multiple instances, one for each core. The app doesn't need to be multithreaded since the instances are completely independent.
     
  20. hoho

    Veteran

    Joined:
    Aug 21, 2007
    Messages:
    1,218
    Likes Received:
    0
    Location:
    Estonia
    In that case higher single-threaded performance should provide even higher overall throughput due to less cache thrashing.
     