Nvidia Post-Volta (Ampere?) Rumor and Speculation Thread

Discussion in 'Architecture and Products' started by Geeforcer, Nov 12, 2017.

Thread Status:
Not open for further replies.
  1. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,254
    Likes Received:
    3,463
    The guy is a complete waste of time... his track record is non-existent, and he talks about so many things with no technical basis for most of his claims. Most of his videos are pure sensationalism, filled with sweeping overgeneralizations.

    I would say the best case for the XSX, WITH console optimizations, is the 2080, maybe the 2080 Super. Any game that stresses both the CPU and GPU will cut into the device's memory bandwidth due to memory contention, which will suppress the GPU's performance, capping its effective bandwidth well below the 2080's.
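
    A rough back-of-the-envelope sketch of that contention argument. The pool figure is the public Series X spec; the CPU traffic and the contention penalty are made-up assumptions purely for illustration, and the conclusion flips entirely depending on what you assume for them:

```cpp
#include <cstdio>

int main() {
    // Public Series X figure: 10 GB of GDDR6 at 560 GB/s (GPU-optimal pool).
    const double gpu_pool_bw    = 560.0;  // GB/s
    // Assumptions for illustration only:
    const double cpu_traffic    = 50.0;   // GB/s of CPU/audio/IO traffic hitting the same chips
    const double contention_tax = 2.5;    // each GB/s the CPU pulls costs ~2.5 GB/s of
                                          // GPU-usable bandwidth (made-up penalty factor)

    const double gpu_effective = gpu_pool_bw - cpu_traffic * contention_tax;
    std::printf("Effective GPU bandwidth: ~%.0f GB/s (RTX 2080: 448 GB/s)\n", gpu_effective);
    return 0;
}
```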
     
  2. ShaidarHaran

    ShaidarHaran hardware monkey
    Veteran

    Joined:
    Mar 31, 2007
    Messages:
    4,000
    Likes Received:
    50
    This video reminded me why I stopped clicking on his videos. Wannabe analyst with an obvious vendor bias and no engineering background.
     
  3. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    453
    Likes Received:
    171
    Minimum requirements are a thing.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,719
    Likes Received:
    929
    Location:
    Guess...
    Yes, but say 8GB VRAM + 16GB RAM + SATA SSD (let's call that min spec) would presumably require significantly different hand optimisation to 12GB VRAM + 16GB RAM + Gen4 NVMe SSD, for example. The number of possible combinations of memory sizes and speeds is huge, so I don't see how you can hand-optimise for everything up front, whereas my understanding of HBCC is that it manages the different memory tiers automatically, like a cache.

    Edit: come to think of it, isn't that how it's supposed to work in the PS5 as well? Cerny said that the game engine didn't need to know what data was stored in which memory partition, it just called for it and the system itself managed data movement optimally. That sounds a lot like HBCC to me.
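
    To make that contrast concrete, a toy sketch (every threshold and name here is made up for illustration): the hand-optimised path has to branch on each detected configuration, while an HBCC-style setup just exposes one big virtual pool and lets the hardware demote cold pages on its own:

```cpp
#include <cstdio>

struct SystemConfig { unsigned vram_gb; unsigned ram_gb; bool fast_nvme; };

// Hand-tuned path: streaming budgets picked per configuration (illustrative values).
unsigned pick_texture_pool_gb(const SystemConfig& c) {
    if (c.vram_gb >= 12 && c.fast_nvme) return 10;  // keep almost everything resident
    if (c.vram_gb >= 8)                 return 6;   // stream mips more aggressively
    return 4;                                       // lean on system RAM / disk
}

// HBCC-style path: expose one large virtual pool regardless of physical VRAM;
// the memory controller pages cold data out to RAM/SSD, so no per-config branches.
constexpr unsigned kVirtualPoolGB = 32;

int main() {
    const SystemConfig pc{8, 16, false};
    std::printf("hand-tuned pool: %u GB, HBCC-style virtual pool: %u GB\n",
                pick_texture_pool_gb(pc), kVirtualPoolGB);
    return 0;
}
```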
     
  5. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    1,946
    Likes Received:
    815
    Location:
    Earth
    Sony's solution and HBCC are not the same. HBCC is, in essence, page swapping: unused pages are moved to disk and loaded back on a miss, possibly with some helper logic to try to avoid misses. It just works without engine integration, but when a miss happens it's very expensive, as the data is not in RAM and has to be loaded from disk and inserted into RAM before the GPU can continue.

    Sony's solution requires the developer to explicitly load content via those 6 different priority queues and, by extension, to manually evict data from RAM to make space for new data. Where Sony's cleverness comes in is that the controller decompresses the content, manages cache lines and loads the data directly to a given address without going through the CPU/OS layer, i.e. data goes straight from disk to RAM via DMA. The developer still has to manage manually what is loaded, when, and what gets discarded from RAM to make room. When data is discarded from RAM and replaced with newly streamed content, the cache lines pointing to the old data must be invalidated; that's what the cache scrubbers Sony has implemented in hardware are for.
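
    A rough sketch of that difference. None of this is an actual AMD or Sony API; the names and signatures are made up to illustrate the two models:

```cpp
#include <cstdint>
#include <cstdio>

// HBCC-style: the GPU touches a virtual address; on a miss the hardware/driver
// pulls the backing page into VRAM behind the engine's back. No engine
// integration needed, but the GPU stalls while the page-in happens.
void on_gpu_page_fault(uint64_t virtual_addr) {
    // (illustrative) locate the page in system RAM or on disk, copy it into VRAM,
    // patch the page tables, then let the stalled wavefronts resume.
    std::printf("fault at %#llx: paging in from backing store, GPU waits\n",
                static_cast<unsigned long long>(virtual_addr));
}

// Sony-style, as described above (not the real API): the developer explicitly
// queues reads at one of six priorities, the I/O block decompresses and DMAs the
// data straight to a target address, and the developer decides what to evict.
enum class Priority { P0, P1, P2, P3, P4, P5 };  // six priority levels

struct StreamRequest {
    const char* asset;      // what to load
    uint64_t    dest_addr;  // where in unified RAM it should land
    Priority    prio;       // how urgent it is
};

void submit(const StreamRequest& r) {
    // (illustrative) hand the request to the I/O complex; hardware decompresses,
    // DMAs into dest_addr, and cache scrubbers invalidate stale lines on eviction.
    std::printf("queue '%s' -> %#llx at priority %d\n",
                r.asset, static_cast<unsigned long long>(r.dest_addr),
                static_cast<int>(r.prio));
}

int main() {
    on_gpu_page_fault(0x7f0000000000ULL);
    submit({"rock_albedo.mip0", 0x20000000ULL, Priority::P1});
    return 0;
}
```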
     
    #805 manux, May 6, 2020
    Last edited: May 6, 2020
  6. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,913
    Likes Received:
    2,232
    Location:
    Germany
    Isn't that just unified memory?
     
  7. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,863
    Likes Received:
    2,793
    Location:
    Finland
    The two don't rule each other out. It's "just unified memory" if you don't care about speed and only want to address a single memory space. HBCC takes things further by bringing the SSD in too and allowing page-based addressing everywhere, which lets "anything" be loaded up quickly no matter where it lives.
     
  8. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,913
    Likes Received:
    2,232
    Location:
    Germany
    I was referring to Cerny's "the game engine didn't need to know what data was stored in which memory partition, it just called for it and the system itself managed data movement optimally."

    TBC, does Cerny include the SSD or Game-Servers in "memory partitions"?
     
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,863
    Likes Received:
    2,793
    Location:
    Finland
    Servers, certainly not, but if they're using an HBCC-esque memory controller, the SSD should be included. I mean, what would be the point of emphasizing that it doesn't matter which memory something sits in? The PS4 already had a unified address space for the CPU & GPU in one unified memory, didn't it?
     
  10. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    1,946
    Likes Received:
    815
    Location:
    Earth
    Cerny didn't claim the SSD looks like RAM. What Cerny claimed is that there are 6 priority queues the developer can use to fetch data from the SSD into RAM very efficiently, i.e. the developer has to manage data/memory manually.

    To me it feels like Microsoft is taking a similar approach. The Microsoft side will become clear once they release the DirectStorage API specification.
     
  11. seahorsesaw

    Newcomer

    Joined:
    Oct 21, 2017
    Messages:
    45
    Likes Received:
    26
    pharma and PSman1700 like this.
  12. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,436
    Likes Received:
    813
    Location:
    France
    Remij, Adonisds, del42sa and 3 others like this.
  13. Konan65

    Joined:
    Sep 9, 2018
    Messages:
    6
    Likes Received:
    2
    Moore's Law is Dead update video -

    GA102 apparently has a 384-bit bus width | 5376 CUDA cores | 230W | 18 Gbps memory, and boosts to 2.2 GHz+.
    864 GB/s bandwidth (40% more than the 2080 Ti) | Overall performance 50% faster than the 2080 Ti.
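
    Those two bandwidth numbers are at least self-consistent: bandwidth is just bus width times per-pin data rate. A quick check (2080 Ti figures from its public spec):

```cpp
#include <cstdio>

// Bandwidth (GB/s) = bus width (bits) / 8 * per-pin data rate (Gbps)
double bandwidth_gbs(int bus_bits, double gbps) { return bus_bits / 8.0 * gbps; }

int main() {
    const double ga102   = bandwidth_gbs(384, 18.0);  // rumored GA102 -> 864 GB/s
    const double ti_2080 = bandwidth_gbs(352, 14.0);  // RTX 2080 Ti   -> 616 GB/s
    std::printf("GA102: %.0f GB/s, 2080 Ti: %.0f GB/s (+%.0f%%)\n",
                ga102, ti_2080, (ga102 / ti_2080 - 1.0) * 100.0);
    return 0;
}
```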
     
    pjbliverpool likes this.
  14. techuse

    Newcomer

    Joined:
    Feb 19, 2013
    Messages:
    220
    Likes Received:
    121
    Now 50% I can believe. I think he's just guessing, but 50% will probably be pretty accurate.
     
    pjbliverpool likes this.
  15. Frenetic Pony

    Regular Newcomer

    Joined:
    Nov 12, 2011
    Messages:
    453
    Likes Received:
    171
    The boost clock seems pretty damned high, especially since it's supposed to be both cutting power use AND clocking over 20% higher than their previous highest boost. But the bus width and memory speed actually add up to the quoted bandwidth, so at least that's good, and the claimed performance uplift seems within reach.
    BUT... in terms of actual information, this is literally just the leak from over a month ago that you can see a few pages back, which this guy wasn't involved in at all.

    And while I still doubt the RAM specs are a good deal for the consumer, since devs will clearly, again, have finer-grained control over memory on consoles than on PC, that doesn't mean it's not what Nvidia is going to do anyway. I can sympathize slightly: matching RAM capacity to bus width against the new consoles is hard to figure out. Sure, you want at least 10GB at minimum, thanks MS, and 12GB should play it safe, but those are both fairly awkward numbers to hit with the usual bus widths. Do you go a full 16GB for your standard mid-tier 256-bit bus? But then your high-tier cards need like 20GB and 24GB or whatever, right? Seems like a path towards frustration and excessive material costs that may not see much use.
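
    For what it's worth, those capacities fall straight out of the bus width: one GDDR6 chip per 32-bit channel, with chips shipping in 8 Gb (1 GB) and 16 Gb (2 GB) densities. A quick enumeration (clamshell configurations ignored):

```cpp
#include <cstdio>

int main() {
    const int bus_widths[]    = {256, 320, 384};  // bits
    const int chip_sizes_gb[] = {1, 2};           // 8 Gb and 16 Gb GDDR6 chips

    for (int bus : bus_widths) {
        const int chips = bus / 32;               // one chip per 32-bit channel
        for (int gb : chip_sizes_gb)
            std::printf("%3d-bit bus: %2d chips x %d GB = %2d GB\n",
                        bus, chips, gb, chips * gb);
    }
    return 0;
}
```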
     
    #815 Frenetic Pony, May 12, 2020
    Last edited: May 12, 2020
  16. Kugai Calo

    Joined:
    Mar 6, 2020
    Messages:
    1
    Likes Received:
    0
    I'm curious whether they will make SIMD width match warp size, like what AMD did with RDNA.
     
  17. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,436
    Likes Received:
    813
    Location:
    France
    Sorry for the question, and it's maybe cross-topic with RDNA2, but given the latest rumors, Ampere (for gaming) is coming before RDNA 2, right?
     
  18. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    1,479
    Likes Received:
    219
    Location:
    msk.ru/spb.ru
    They did that back in Kepler, I think?
     
  19. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,719
    Likes Received:
    929
    Location:
    Guess...
    Some really interesting info in that video (again). It's all still sounding quite plausible to me, and he seems to have staked a lot of his credibility on this, as he's presenting a lot of very specific information as factual rather than speculation. If he's wrong about even half of it, his credibility is going to be shot.

    He also seems to mix factual leak info with his own speculation without clearly indicating which is which. That's particularly apparent in the Tensor-compressed video memory claims: previously he talked about this as essentially increasing both video memory size and bandwidth, but now we learn it actually comes with a performance penalty, so it's really only useful for giving the GPU some extra VRAM if it runs out, and it's a toggle rather than on by default. That sounds far more believable than the previous claim.

    The thing that most interests me is NVCache, and we should learn whether that's real or not at the HPC launch in a few days. That should give a good indicator of the reliability of the rest of this info.

    The claims on DLSS 3.0 were interesting too. Nvidia will override settings in some games forcing it on?? A controversial move if so....
     
  20. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    10,558
    Likes Received:
    600
    Location:
    New York
    Or like Nvidia did with Maxwell and Pascal?

    That would require more instruction scheduling hardware or abandoning the separate INT pipeline.
     