Recent content by psurge

  1.

    MS leak illustrates new console development cycle

    ARM’s weaker consistency does not mean a particular implementation can’t provide TSO though. (As I understand it, the Apple M1 has a TSO mode, and it’s one of the reasons emulating x86 doesn’t crater performance as much as you’d expect on that CPU.)
  2.

    Switch 2 Speculation

    I think LPDDR5 (x?) is more likely than GDDR for power reasons. And I still think they’ll go with a 64bit bus and a larger LLC, but I’m obviously just guessing.
  3.

    Switch 2 Speculation

    I think 64bit LPDDR5(x) with a much larger (say 32 MB?) LLC could still be pretty interesting, at least in handheld mode.
  4.

    Intel ARC GPUs, Xe Architecture for dGPUs [2018-2022]

    Is there a large jump in wafer pricing when moving from n6 to n5?
  5.

    Switch 2 Speculation

    If it’s really a custom chip, does waiting that long make sense? I guess it gives devs time with actual HW, and maybe they are not in a rush with Switch selling so well. Anyway, if it’s an Ampere GPU, I doubt it will be 5nm - porting doesn’t sound trivial, and 5nm is expensive. Maybe TSMC 7...
  6.

    Switch 2 Speculation

    Hmm - the rumored core count seems large for the target power envelope. Maybe they brought in a larger L2 from Ada to save on DRAM access power, or ported to another process? But for availability’s and all of our wallets’ sake, I hope it’s not 5nm - TSMC 6/7 would probably give a nice boost from...
  7.

    NVidia Ada Speculation, Rumours and Discussion

    Well, if I remember correctly, there was that NV blog post on using AI to design smaller circuits at a given performance level for Hopper. Maybe they applied the method much more broadly on Ada? But I agree that it seems more likely for the rumored transistor count number to be wrong.
  8.

    NVidia Ada Speculation, Rumours and Discussion

    Man. I was already priced out of the soon to be previous gen before the crypto mining craze / inflation. I hope AMD seriously undercuts NV on price or these rumors are way off, but … I’m not holding my breath.
  9.

    NVidia Ada Speculation, Rumours and Discussion

    Guessing, but maybe “cut” in this case just means that L2 slice capacity on AD102 was designed to be smaller than in other configurations, e.g. to keep die size in check, and not that they are throwing away 25% of the capacity to increase yield.
  10.

    NVidia Ada Speculation, Rumours and Discussion

    Stupid non-HW person question, but would it make sense to allow voltage to vary on a per SM/GPC/whatever level?
  11.

    NVidia Ada Speculation, Rumours and Discussion

    Also, don’t the AMD rumors say 500W for even more transistors, cache, perf (and higher clocks, IIRC 2.5GHz vs 2.2) spread across multiple dies? Maybe the higher power numbers on the NV side come from having to up the clocks far into the non-linear part of the perf/power curve so that they can...
  12.

    Speculation: GPU Performance Comparisons of 2020 *Spawn*

    Having the upper hand in perf/W means you can provide more performance (well, throughput anyway) than your competitor for a given level of input power (or output noise, for those into quiet PCs). I'd say it's a critical metric, for CPUs or GPUs, from mobile to desktop to server. In this case...
  13.

    Nvidia Ampere Discussion [2020-05-14]

    What fraction of board power is consumed by memory? If bandwidth is increasing by 50%, and GDDR6X uses only 15% less energy per bit transferred than GDDR6, memory power consumption will increase by almost 30%.
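A quick sanity check of that arithmetic, as a rough sketch only: it assumes memory power scales as bandwidth times energy per bit and ignores idle/static power. The function name is made up for illustration.

```python
def relative_memory_power(bandwidth_scale: float, energy_per_bit_scale: float) -> float:
    """Relative memory power, assuming power = (bits per second) * (energy per bit)."""
    return bandwidth_scale * energy_per_bit_scale


# Figures from the post: +50% bandwidth, 15% less energy per bit.
ratio = relative_memory_power(1.5, 0.85)
print(f"memory power changes by {(ratio - 1) * 100:+.1f}%")
```

Under those assumptions the increase comes out a bit under 30%, consistent with the estimate in the post.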
  14.

    Speculation: GPU Performance Comparisons of 2020 *Spawn*

    Me too, especially considering A100 on PCIe has a TDP of 250W. Maybe the consumer GPUs really are on some process other than TSMC 7nm, and whatever it is provides the density increases over 16/12nm required to increase perf by 50% without ridiculous die sizes, but not much in terms of perf/power...
  15.

    Nvidia Ampere Discussion [2020-05-14]

    I still think it’s going to be 2 16-wide fp32 + 1 16-wide int32 ALUs per scheduler - with integer MACs maybe being performed by one of the fp32 units. Not sure what to expect from the consumer tensor cores or the ray tracing hw though.