Is everything on one die a good idea?

Discussion in 'Architecture and Products' started by punchinthejunk, Jul 21, 2014.

  1. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    You seem to be confusing words; nothing you mentioned requires them.

    I never stated otherwise?


    EDIT: Just to put this in context... Let's budget 32mm^2 for 8 CPU cores at 10nm. That is 6.4% of a 500mm^2 chip's die area. Best case scenario, you get 6.4% more performance for eschewing an APU design (assuming you are not power or thermal limited). In the real world the difference is probably smaller; in fact the APU design may well come out ahead in heterogeneous workloads. The next question is cost: does a company simply charge more to maintain profits with the die size penalty, or does it go to the trouble of producing an entirely distinct chip?
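    To make the arithmetic explicit (all of the figures here are assumed budgets, not measurements):

```python
# Back-of-the-envelope check of the die-area argument above.
# Both area figures are assumptions, not measured numbers.
cpu_area_mm2 = 32.0   # assumed budget for 8 CPU cores at 10nm
die_area_mm2 = 500.0  # assumed total die size

cpu_share = cpu_area_mm2 / die_area_mm2
print(f"CPU share of die: {cpu_share:.1%}")  # 6.4%

# Best case for a pure dGPU: the freed area all becomes extra
# GPU throughput (and you aren't power or thermal limited).
print(f"Best-case dGPU advantage: {cpu_share:.1%}")
```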
     
    #41 ninelven, Jul 23, 2014
    Last edited by a moderator: Jul 23, 2014
  2. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    19,426
    Likes Received:
    10,320
    I don't see why not. The PS4 already sports an APU with the bandwidth of a midrange GPU. And that can't even be considered a high end APU, as it has to fit into the budget of a 399 USD device and a power profile suitable for quiet operation in a living room.

    Imagine what you could do if you had the budget for a high end device. An APU around 200-400 USD which doesn't need to worry about the power constraints of operating in a quiet living room environment. It's certainly possible for it to have the same memory controller as a current generation high end GPU.

    The only reason it isn't done at the moment is that the market for it isn't that large. You'd basically be appealing to a demographic that is skeptical that an integrated solution can fill the shoes of a dedicated GPU. Not only that, you'd be trying to win buyers from the very demographic which doesn't see the point of an APU.

    It's going to take time until the desktop market is ready for such a product. I could perhaps see it succeed now in a laptop, but even there it carries serious risk, along with power constraints that already preclude high end desktop graphics solutions.

    Regards,
    SB
     
  3. pMax

    Regular

    Joined:
    May 14, 2013
    Messages:
    327
    Likes Received:
    22
    Location:
    out of the games
    ...Mantle is quite late as a project, as is asymmetric GPGPU.
    Imho, if you could easily have used one GPU for the 3D pipeline and a second for GPGPU, you'd have seen a market shift toward APUs.
     
  4. keldor314

    Newcomer

    Joined:
    Feb 23, 2010
    Messages:
    132
    Likes Received:
    13
    The big problem is that high bandwidth in general is more or less incompatible with slotted memory, so any move to a high performance APU will require RAM soldered to the motherboard. This is a nonstarter in the desktop world, but makes sense for laptops and tablets. I suppose it's reasonable to have, say, 32 GB of stacked memory used as a cache and then DDR expansion slots for the other 128 GB.

    Modern GPUs are rapidly becoming massively parallel optimized CPUs - just look at the docs for the latest version of CUDA to see what I mean (OpenCL is a very long way behind in this regard). This means there's not much room for adding large numbers of cores to your CPU - if your algorithm has that much parallelism, it's likely to be more efficient on a GPU architecture. Between this and the sequential performance wall that CPUs have hit in the last decade, there's not much room for the CPU to do anything other than shrink with each process, which leaves more and more room for the GPU.

    The real problem I see with high performance APUs is Intel/Nvidia/AMD. Intel is not likely to release a 400+mm^2 consumer part any time soon, so they'll have a very hard time competing with a high end GPU, especially when 50mm^2 of that die goes to the CPU. Nvidia doesn't have x86 IP rights, so they're not going to release a desktop APU any time soon (though the rise of ARM makes things very interesting: if WinRT ever matures to the level where you can write serious programs with it, free from the Windows App Store at that, an ARM based system becomes a real possibility). AMD is the most likely to release a high end APU, and the consoles are already close to this, but their problem is that they've fallen behind Intel in CPUs and Nvidia in GPUs, especially when you look at the software side of things.
     
  5. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Some of those things might actually work better with APUs, thanks to tighter integration.

    Still, I don't think many people are claiming that 600mm², 300W monster dGPUs meant to go into ultra-high-end systems, including dual-GPU boards, will disappear any time soon. I certainly am not. I mean, I currently game at 5760×1080, and when I replace my monitors with 4K ones, I'll be running an 11520×2160 (24.8 MPixel!) setup. Needless to say, that's going to require some massive graphics computing power, not something an APU is likely to manage.

    But most people don't do that. Most people use moderate definitions (which tend to increase over time, of course) and are content with 100~250mm² GPUs. Those won't make much sense for long.
     
  6. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    Yes you did. The problem with your theory is that you insist CPUs will stagnate, thus making room for more massive GPUs. I disagree.

    There it is, the flaw in your whole argument: many years into the future you still expect CPUs to have only 8 cores, which is not compatible with the statement I made. Progress requires more processing, not less.

    In 10 years time, at least one of the things I mentioned is bound to take off as a mainstream tech.

    Exactly, they combined a low-clocked CPU with a low-clocked GPU, not just because of noise concerns or power consumption, but durability as well: having two fast and big chips next to each other increases the percentage of faulty units, and they can't have that anymore... not with the shadow of the Red Ring of Death still looming on the horizon to this day.

    Not to mention consoles are a different economic problem, Sony and MS sell them with zero margins because they can recoup those elsewhere. You can't make that same argument with consumer class hardware.
     
  7. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,382
    You could argue that GPUs will need to become relatively more powerful, not less: we're only at the start of a massive increase in resolutions. And that's a quadratic thing. Eventually it will top out, but not before bandwidth and computation requirements for GPUs have gone up by much more than those for CPUs.
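    As a quick illustration of the quadratic scaling (just the standard resolution names, nothing about any specific GPU):

```python
# Pixel count grows with the product of width and height, so each
# doubling of linear resolution quadruples the shading work.
base_w, base_h = 1920, 1080
resolutions = [("1080p", 1920, 1080), ("1440p", 2560, 1440),
               ("4K", 3840, 2160), ("8K", 7680, 4320)]
for name, w, h in resolutions:
    pixels = w * h
    print(f"{name}: {pixels / 1e6:.1f} MPixel, "
          f"{pixels / (base_w * base_h):.1f}x 1080p")
```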
     
  8. ninelven

    Veteran

    Joined:
    Dec 27, 2002
    Messages:
    1,742
    Likes Received:
    152
    Umm... where? Please find the exact post and quote it. Otherwise you are flat out lying.

    I never said this either. Please stop putting words in my mouth.
     
  9. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    I'm sure you can put an APU + memory stacks on a socket (a wide or slightly bigger one for the better models, like a Pentium Pro). But please keep 64bit or 128bit DDR4 there, depending on whether it's a low end or higher end platform. Then PCIe lanes, video outputs, random I/O etc. through the socket pins.

    I'm not very pleased by the notion of a 200W APU, though. Look at an OEM consumer desktop PC: the cases haven't changed in a decade. Often a micro-ATX tower with one rear 80mm fan (if it's installed at all), a PSU with either an 80mm or 120mm fan, and some low cost CPU heatsink (much bigger than in the 90s, though) like the Intel one.
    That is all very decent: this design allowed building Pentium 4 Prescott machines that actually worked.

    At 220W, 225W? It seems folly. Not everyone buys $150 cases and huge heatpipe towers with sideways-blowing fans (which may fail to cool the VRMs if there's no other airflow). OEMs use the cheapest motherboards, and even among low end boards for the parts market, we had boards that could take a 95 watt CPU (if that, as FX 4xxx can be troublesome) but not a 125 watt one.

    Now you could have a socketed APU and upgrade or downgrade whatever the hell you want, but that flexibility might be useless if you want to buy a 220W APU and your motherboard only takes a 100W one. Or the cost of the bigger power circuitry gets passed on to everyone, while cooling remains barely reasonable.
    I fear it would be all too easy to end up with lots of throttling or instability due to power and cooling... on a desktop.
     
  10. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    I would like to keep Micro-ATX at least. Not everyone wants a Mac Mini or a Mac Pro, and PCs may serve very different needs and configurations. I like to open the side panel and put a hard drive in, without it looking like I'm doing a disassembly and reassembly of the system. I even like the card slots (though I only use two cards currently).

    PCs have always had that maximal flexibility and cheapness (well, except when they were 50x more expensive than a C64) because of the card slots. You can add anything, be it additional network cards, 10GbE networking, a card with four RS232 ports, more USB controllers or whatnot, and they're all real devices, PCIe/PCI, low latency and low overhead.

    To take it to the extreme... I would not like ending up with a single micro-USB 3.0 port, HDMI/DisplayPort, and that's all.
     
  11. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    I'm not so sure. It's definitely happening for phones and tablets, but for PCs I'm afraid it's not, or at least much more slowly. The sheer number of laptops with 1366×768 displays, even 15.6" ones, is (depressingly) staggering when you consider how many phones have 1920×1080 displays, or even 2560×1440 ones now (though that's possibly pointless).

    I'm sure 4K monitors will become more or less standard at some point, but I'm not sure it will happen at a pace that will require a faster increase in GPU power than is usual.
     
  12. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Well, I'm usually morosely pessimistic about these things, but I also can't help playing contrarian :)
    The past few years may well be an anomaly brought about by the recession. I don't have graphs, so maybe my memory is playing selectivity tricks on me, but hard drive densities, which I have been waiting to see revive, seem to be growing again. My 2TB-drive-bearing NAS, which I populated back in 2010, may finally be due for an upgrade. 6TB Reds (w/ nas3.0) are shipping (or will be shortly), and Seagate is sampling 8TB drives. Monoprice and ASUS both have (or will shortly, in the case of Monoprice) reasonably priced 4K monitors, although I'm waiting for an IPS panel before I take the plunge. 4K live content is coming -- the AX100 is a pretty affordable camcorder, and although I personally need 60fps, I'd have already bought one if I didn't take so many sports-related vids. 10GbE is threatening to sputter to life (and I'll need it if I plan to edit 4K files I store on the NAS) -- there was a series of articles written about that over on SmallNetBuilder.

    My real problem is CPU cycle costs. There is at least a small sliver of a reason to be optimistic that the present situation is a diversion from baseline growth. Just, y'know, none based on any of the rumors regarding Intel's legacy desktop line ;^| OTOH, if the "news" nowadays is that Broadwell is going into tablets, maybe we can hope that the workstation parts fall in price and become the new desktop platform. We certainly need the bandwidth if we hope to feed more cores....
     
  13. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    Haswell is in tablets already, with CPU + chipset next to each other on a package. Maybe just the Microsoft tablet, but it's there.

    It's staged already: a processor ending in -Y is the very low power, likely tablet part.
    -U is about 15 watts (ultrabook laptop), -M is regular laptop, -T low power desktop, -S slightly low power desktop (capped at 65 watts), no ending (or K) is the regular stuff.

    So there's no reason in particular that Intel would let its range "slide down" so you get higher end stuff for cheaper.
    I think they're going to add other steps in the range, even. In 2015 on the desktop you'll have Skylake-S (4 cores, 16 PCIe lanes, overclocking disabled) < Broadwell-K (4 cores, L4 memory, 16 PCIe lanes) < Haswell-E (6 cores, 24 PCIe lanes) < Haswell-E (6 cores, 40 PCIe lanes)

    I think those are the numbers of PCIe lanes managed directly by the CPUs, not counting the DMI link.
     
  14. Blazkowicz

    Legend

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    It's not even clear what is best for you, lol.
    If you're a memory bandwidth/latency junkie, Broadwell-K seems great.
    If you want cores, Haswell-E.
    If you're worried about PCIe bandwidth (graphics card at full 16x 3.0, M.2 PCIe SSD and 10GbE at once), then Haswell-E.

    But Skylake has a new GPU, and new very wide CPU instructions (AVX-512) which might be useful in niches like video editing, so good for your needs once your software implements them. :razz:
     
  15. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    Oh, there's a definite hierarchy.
    I need cores. Badly. Some of my timelines are yielding somewhere in the neighborhood of 2-4qps real-time, which is just painful. I also need lanes -- 10GbE, a gfx card (titling and some effects), a capture card, and a native YUV output card. I don't edit locally (PITA copying files around), so my need for SSD is not all that high (but that IS why I need/want 10GbE). I doubt I'm memory bound until I start dealing with 4K non-long-GOP formats.

    They require SSSE3, not AVX. There's always compatibility to consider, so AVX-512 support is not something I'd expect in the short-to-medium term :) They do support Quick Sync, which is helpful when generating video, but not terribly useful otherwise.

    I'd dearly love to see Edius ported to take advantage of gfx cards. Not sure how likely that is given their continued resistance. Their Quick Sync doc indicates that *memory bandwidth* is, in fact, an issue. I'm not sure how much I buy that, with full-frame 1080-60p video taking, what, 1/2 GB/s? Nevertheless, it may be useful as evidence in the larger question under discussion: http://www.grassvalley.com/docs/App...onal/edius/PRV-4140M_EDIUS_SandyBridge_AN.pdf
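    The back-of-the-envelope bandwidth works out like this; the bytes-per-pixel figures are my assumptions about the pixel format, not anything the app note states:

```python
# Uncompressed bandwidth for full-frame 1080p at 60 frames/sec.
# Bytes-per-pixel values are assumed formats, for illustration only.
w, h, fps = 1920, 1080, 60
formats = [("YUV 4:2:2, 8-bit", 2), ("RGBA 8-bit", 4)]
for name, bytes_per_px in formats:
    gb_per_s = w * h * fps * bytes_per_px / 1e9
    print(f"{name}: {gb_per_s:.2f} GB/s")
```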

    -Dave
     
  16. keldor

    Newcomer

    Joined:
    Dec 22, 2011
    Messages:
    75
    Likes Received:
    113
    Don't forget that as game engines become more realistic you have to put more work into each pixel. Given a choice between higher quality rendering vs. more pixels, I'll have to go with the rendering.
     
  17. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Sure, but that's not a new phenomenon.
     
  18. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Likes Received:
    17
    Location:
    Montreal, Quebec
    Ten years from now we'll have AVX-1024, and the CPU and GPU will unify.

    GPUs will cease to exist, but only as we know them today. Fixed-function GPUs have long been dead and buried. Non-unified GPUs have long been dead and buried. Both have been replaced by hugely inefficient fully programmable unified computing devices. Of course they're only inefficient in terms of power and area if they were implemented on the silicon processes used back when GPUs were fixed-function and non-unified. Nobody has shed a tear about their complete disappearance, because silicon technology advanced and we didn't lose anything in absolute terms. Programmability and unification instead brought us a lot more functionality, and offered higher efficiency for things the old architectures weren't designed for.

    So the death of the GPU will be a joyous moment as well. We'll get a new breed of processors that fully supersede its functionality and will extract maximum amounts of ILP, TLP and DLP from any code you throw at it. Of course you can also think of it as a continuation of the GPU, or of the CPU for that matter, but the way we know either of them today will cease to exist.

    This future is inevitable due to both the Memory Wall and Amdahl's Law. The Memory Wall stems from computing power increasing at a faster rate than memory bandwidth. This has been consistent through each decade of computing, and it has resulted in GPUs with memory interfaces running at 6+ GHz. Continuing down that path to satisfy the GPU's bandwidth hunger is impossible. The GPU needs to vastly reduce the number of threads being processed in parallel, so that it can benefit from hierarchical caches. This makes it inherently more CPU-like, and it also helps with Amdahl's Law by requiring less parallelization and processing sequential dependencies faster. Meanwhile nothing is stopping CPUs from becoming much more GPU-like by widening their vector units and adding SIMT capabilities like gather/scatter and lane predication.

    So everything on one die isn't just a good idea, it's the only way forward. Both CPUs and GPUs started out with multiple chips (e.g. the i387SX coprocessor, the Pentium II's cartridge with separate L2 cache chips, or the Voodoo 2's two TMUs and one FBI). Integration has provided many benefits, and we certainly haven't seen the end of it yet. Even at the chip level there's much opportunity for integration and unification. Heterogeneous computing is merely an intermediate step between functional separation and homogeneous unification, one that exploits functional overlap along the way.
     
    #58 Nick, Jul 26, 2014
    Last edited by a moderator: Jul 27, 2014
  19. dnavas

    Regular

    Joined:
    Apr 12, 2004
    Messages:
    375
    Likes Received:
    7
    The Memory Wall is exacerbated by the co-location of code and data, and by MIMD-style, multi-core execution. In that sense, the future may be more GPU-like than CPU-like. It isn't clear to me how serial architectures are better at dealing with higher latencies, either. One of the reasons why we only have quad-core Intel CPUs is that there isn't the bandwidth on the 115X sockets [which is why I only compared the 860 to the 4770s, instead of the 9X0s, which are on a different class of socket entirely]. Whatever you think about GPUs and memory bandwidth, CPUs are already up against the wall.

    I do think we're in agreement that the issue here is memory bandwidth. I think you'll get no argument from anyone that putting a large amount of memory very near the CPU and very near the GPU would be awesome. I would love to have 100 MIMD cores and 10k SIMD cores to play with. It's less clear to me that I need to have a large number of MIMD cores co-located with my SIMD cores. Even if I accept that there's a large crossover between those worlds, it isn't clear to me whether the world belongs to traditional CPUs with vector style extensions, or GPUs with (for example) a "real" core per SMX.

    I do hope that we get a few years of CPUs with vector instructions, and GPUs with ARM/whatever cores on them. That's where all the fun is! Once hardware gets homogenized, we have to put our coding straitjackets back on :(
     
  20. Alexko

    Veteran Subscriber

    Joined:
    Aug 31, 2009
    Messages:
    4,541
    Likes Received:
    964
    Are those two things really all that different? Isn't it mostly semantics at this point?
     
