AMD Bulldozer Core Patent Diagrams

Discussion in 'PC Industry' started by Raqia, Apr 16, 2009.

  1. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
  2. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,969
    Likes Received:
    963
    Location:
    Torquay, UK
    At least this time it will find some use in HPC tasks, and the x264 codec will most likely get optimized for it as well.

    Of course I don't expect broad adoption any time soon, not until Intel jumps on the FMA4 bandwagon.

    One area where AMD can and will utilize it is of course for their GPU drivers. OpenCL especially ... so not quite as bad as 3DNow!
     
  3. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Intel will use FMA3 for their next AVX ISA extension in Haswell, so AMD's implementation will be incompatible, at least in this first iteration of Bulldozer.
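
To make the incompatibility concrete, here is a minimal Python sketch of the two operand conventions. It only models register semantics: FMA4 names a separate destination, while FMA3 overwrites one of its three sources (the "231" form is shown). The register names and helper functions are hypothetical, and Python performs the multiply and the add with two roundings rather than one fused rounding, so this illustrates the encoding difference, not the numerics.

```python
def fma4(regs, dst, a, b, c):
    """FMA4 style: dst = a * b + c, with four register operands (non-destructive)."""
    regs[dst] = regs[a] * regs[b] + regs[c]


def fma3_231(regs, dst, a, b):
    """FMA3 '231' style: dst = a * b + dst, so the addend register is overwritten."""
    regs[dst] = regs[a] * regs[b] + regs[dst]


# Hypothetical register file with small, exactly representable values.
regs4 = {"x0": 2.0, "x1": 3.0, "x2": 4.0, "x3": 0.0}
fma4(regs4, "x3", "x0", "x1", "x2")
# All three sources survive; the result lands in x3.

regs3 = {"x0": 2.0, "x1": 3.0, "x2": 4.0}
fma3_231(regs3, "x2", "x0", "x1")
# The result replaces the addend in x2; keeping the old x2 would need an extra copy.
```

Code generated for one convention cannot run on hardware that only implements the other, which is why a compiler or library has to emit separate code paths for each.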
     
  4. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,969
    Likes Received:
    963
    Location:
    Torquay, UK
    True, Intel moved the goalposts mid-match with regard to FMA.
    Anyway, AMD has already suggested they will introduce FMA3 in future revisions, as you say.

    It will be interesting to see how this pans out and how it compares to the adoption rate of SSE4.x, which is also not great.
     
  5. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,112
    Location:
    New York
    Not looking good.
     
  6. Otto Dafe

    Regular

    Joined:
    Aug 11, 2005
    Messages:
    400
    Likes Received:
    59
    Main selling point seems to be that BD is more cost-effective than a 980x. Nope, not looking good.
     
  7. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    I would be glad if this positioning of BD against the 980X had some real effect. If this mini price war from AMD could turn any Gulftown SKU into a more manageable purchase, it would be a very nice upgrade for many LGA1366 users, including me. :p
     
  8. Tchock

    Regular

    Joined:
    Mar 4, 2008
    Messages:
    849
    Likes Received:
    2
    Location:
    PVG
    Looks like they really missed the clocks they wanted by a big margin, and despite the large turbo numbers, it doesn't really help that much (?): less than 10% gains all round.

    Right now it seems like a rather inefficient use of die area, but if they keep churning this out and later on push a 20-30% higher clocked (and 140W obviously :roll:) "8190", that might actually sell the platform well (say what you may, but AM3+ is probably much cheaper to bring over than a new socketed motherboard).

    But right now they're really in SB territory, and that's not a good place to be. SB GT2 is pretty much an amazing sweet spot chip to say the least, you really wonder what the GT1 is for...
     
  9. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,969
    Likes Received:
    963
    Location:
    Torquay, UK
    It's a marketing fault that BD doesn't look good as an 8-core.
    If you look at BD as a 4-core with 8 threads it does quite well, especially looking at the leaked prices.
    One thing where it fails is obviously die area, but that was always the case with AMD and their strategy of using one die for both servers and desktops.
     
  10. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    While looking at those performance slides, I couldn't help but snicker and laugh at the content because it just reeked of desperation.
     
  11. Accord1999

    Newcomer

    Joined:
    Jun 21, 2003
    Messages:
    133
    Likes Received:
    6
    Wasn't the whole point of CMT that you would get nearly the same throughput as two full cores, as AMD repeated again and again?

    So far the leaks indicate that BD has decent throughput but extremely weak single-threaded performance, which supports the view that BD behaves more like an 8-core CPU. Intel 4C/8T processors like the 2600K have exceptional single-threaded performance, which, combined with the ~20% boost from Hyperthreading, gives them good throughput.

    Based on the performance from the leaks, BD's key problem is that the cores are only about as fast as a K8 of the same clock speed.
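
A rough way to put numbers on that comparison is a toy linear throughput model. This is only a sketch under loud assumptions: throughput scales perfectly with core count, the ~20% Hyperthreading figure from the post applies uniformly, and any penalty from sharing resources inside a BD module is ignored. The function names are hypothetical.

```python
def relative_throughput(cores, per_core_perf, smt_gain=0.0):
    """Toy model: total throughput = cores * per-core speed * (1 + SMT gain)."""
    return cores * per_core_perf * (1.0 + smt_gain)


def break_even_per_core(target_throughput, cores):
    """Per-core speed needed for `cores` cores to reach `target_throughput`."""
    return target_throughput / cores


# 4C/8T part normalised to per-core speed 1.0, with the ~20% HT boost from the post.
intel = relative_throughput(cores=4, per_core_perf=1.0, smt_gain=0.20)

# Per-core speed an 8-core part would need to match it (assumes perfect scaling).
needed = break_even_per_core(intel, cores=8)
print(f"4C/8T throughput: {intel:.2f}, 8-core break-even per-core speed: {needed:.2f}")
```

Under these assumptions the 8-core part matches the 4C/8T part on throughput with about 60% of its per-core speed, which is consistent with decent multithreaded results alongside weak single-threaded ones.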
     
    #851 Accord1999, Sep 25, 2011
    Last edited by a moderator: Sep 25, 2011
  12. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    Ugh, x264 is nice because the maintainers are awesome and support whatever's nice for them (IIRC they also included support for POPCNT from SSE4A, which makes them the...only? people to support that ISA extension). It means pretty much jack for general adoption/impact, though. And I'm seriously missing how this will help their GPU drivers in any significant form. Their CL stack should first figure out how to spew SSE code in any worthwhile manner, before moving on to FMAs and XOP, IMHO. Intel at least tries to do it!
     
  13. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    So, another disastrous CPU. The only upside seems to be that Piledriver is about 6 months away, so this debacle shouldn't last much longer than their TLB fiasco.
     
  14. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    Pretty much depends on what we expect Piledriver to be, no? Also, this isn't as bad as Failcelona IMHO, at least they're not vastly underperforming compared to their prior offerings. Also no nonsense about "definitely in the double digits" this round, although some of the official on-forum noise was somewhat disturbing, to say the least.
     
  15. hoho

    Veteran

    Joined:
    Aug 21, 2007
    Messages:
    1,218
    Likes Received:
    0
    Location:
    Estonia
    Then again for a 4-core CPU it doesn't use the transistors too efficiently considering how big it is.
     
  16. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Well, considering that AMD has stuffed BD with 16MB of caches (and not of the densest type), it's bound to be big, and for a host of other reasons too, of course.
    On the matter of whether BD is 4 or 8 core design, I'm more inclined to accept it as a 4-core CPU... or 8-core, with shared front-end and FP/SIMD logic - meh. Whatever, just bring it on!
     
  17. AlexV

    AlexV Heteroscedasticitate
    Moderator Veteran

    Joined:
    Mar 15, 2005
    Messages:
    2,535
    Likes Received:
    144
    Maybe they'd have been better off stuffing it with less cache of the slightly faster type, as it appears their current cache hierarchy is quite ludicrous in terms of throughput, with L1 being crippled AFAICT.
     
  18. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    AMD could still claim their CPUs have moar cache than the other guys around. Oh wait, wasn't this the case ever since T-bird came out some 10 years ago? Very naive reason to stick with the exclusive hierarchy, when Intel's Nehalem clearly demonstrated how you can have more-for-less in a very clean and streamlined cache architecture.
     
  19. rpg.314

    Veteran

    Joined:
    Jul 21, 2008
    Messages:
    4,298
    Likes Received:
    0
    Location:
    /
    I am hoping that Piledriver will be aimed at consumer markets, and hence might end up increasing its area efficiency.

    I think it is clear that BD is a poor fit for client workloads.
     
  20. mczak

    Veteran

    Joined:
    Oct 24, 2002
    Messages:
    3,022
    Likes Received:
    122
    I wonder how much it can improve though. I don't see many possibilities without fundamentally changing the architecture:
    1) increase clocks
    2) improve cache subsystem

    There are of course always other possibilities (like a 256-bit FP unit) and tweaks here and there, but I'm not sure this can change the overall picture.
    If you look at BD, it's not terribly efficient for server loads either. One module is ~30mm², whereas one SNB core is only ~20mm² or so. Given the right loads that BD module might be faster, but on a perf/area scale it'll lose pretty much no matter what. If AMD could get by with much less L2 cache per module (say a faster 512kB instead of 2MB) it would look much better there, as one module would only be slightly larger than a SNB core, though they probably can't because the L3 has neither the bandwidth nor the latency to make this really work. And even if it were possible, it would still lose (very badly) in lightly threaded loads. That's a tradeoff which is built right into the BD architecture with the 2-issue INT cores, unless it was really designed for MUCH higher clocks (which I rather doubt).
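
The perf/area point can be checked with the two area figures quoted above. This is a minimal sketch, assuming the approximate ~30mm² and ~20mm² numbers from the post and treating performance as a single relative number; the function names are hypothetical.

```python
# Approximate areas quoted in the post (not measured values).
BD_MODULE_MM2 = 30.0
SNB_CORE_MM2 = 20.0


def perf_per_area(perf, area_mm2):
    """Relative performance delivered per square millimetre."""
    return perf / area_mm2


def break_even_speedup(area_a_mm2, area_b_mm2):
    """How much faster block A must be than block B to tie on perf/area."""
    return area_a_mm2 / area_b_mm2


ratio = break_even_speedup(BD_MODULE_MM2, SNB_CORE_MM2)
print(f"A BD module must be {ratio:.2f}x as fast as one SNB core to tie on perf/area")
```

With these figures a module has to deliver 1.5x the throughput of a single SNB core (Hyperthreading included) just to break even on area.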

    Still I guess some changed cache architecture is something we'll see at least with Trinity. Either ditch the L3 cache or make L2 smaller (while improving L3 cache bandwidth/latency and also share it with IGP).
     