General Next Generation Rumors and Discussions [Post GDC 2020]

Discussion in 'Console Industry' started by BRiT, Mar 18, 2020.

  1. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    I can't find this. Can you quote Windows Central with regards to BCPack?
     
    disco_ likes this.
  2. Xbat

    Veteran

    Joined:
    Jan 31, 2013
    Messages:
    1,650
    Likes Received:
    1,315
    Location:
    A farm in the middle of nowhere
    Do we know this for a fact? I'm very sceptical that it's for backwards compatibility reasons.
    It's not like they don't run at fewer CUs for legacy base PS4 emulation.
     
    disco_, Barrabas and BRiT like this.
  3. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,157
    Likes Received:
    7,965
    Location:
    Barcelona Spain
    Yes...
     
  4. snc

    snc
    Veteran

    Joined:
    Mar 6, 2013
    Messages:
    2,115
    Likes Received:
    1,745
    According to the Crytek developer, resuming a game is more like 6s vs <1s, so not so close ;)
     
    egoless likes this.
  5. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,632
    Location:
    The North


    Windows Central is referencing this tweet.
    This guy is a compression expert, based on his history.
     
    AzBat and PSman1700 like this.
  6. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,632
    Location:
    The North
    Both consoles resume in <1s right now, so he's referring to complete game switches out of memory. 1s to switch between 6 different held games is very impressive.

    We don't know for sure. It may be for legacy PS4 mode, but most games should run in PS5 native mode. We have no details on the other 2 modes (legacy 4 Pro and legacy 4).
     
    PSman1700 likes this.
  7. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,502
    Likes Received:
    24,397
    So, a question on BCPack... Does what is "BCPack(ed)" need to be decompressed for GPU use, or is it a new native GPU format?
     
  8. Xbat

    Veteran

    Joined:
    Jan 31, 2013
    Messages:
    1,650
    Likes Received:
    1,315
    Location:
    A farm in the middle of nowhere
    Would they compromise their design so they could have boosted backwards compatibility, when they could have had a legacy PS4 mode that works?
    I feel it's more to do with cost reduction over time, or that resources would be better used elsewhere, such as on the SSD.
     
    disco_ likes this.
  9. Nesh

    Nesh Double Agent
    Legend

    Joined:
    Oct 2, 2005
    Messages:
    13,998
    Likes Received:
    3,713
    They are following multiples of the PS4's CUs, though. Their BC method is probably tied to it for whatever reason. Their other option was probably 64 CUs, which was probably avoided due to cost.
     
  10. Jay

    Jay
    Veteran

    Joined:
    Aug 3, 2013
    Messages:
    4,029
    Likes Received:
    3,428
    That's my point from before: it's not their own sources; they're reporting on what is being said on the net.
    It's an important distinction when using them as a source.

    He does seem to be an expert, but I also have no idea if what he says, or how much of it, is in any way correct in relation to the XSX.

    If he's right and it allows partial texture loads instead of the full texture, compared to possibly the PS5, that can help with effective throughput.
     
    egoless, disco_ and snc like this.
  11. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,763
    Likes Received:
    280
    Location:
    In the land of the drop bears
    But the PS5 won't just use Kraken, will it? It'll likely use BC-compressed textures and then Kraken or zlib on top of them.
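    The stacking described above can be sketched in a few lines. This is purely illustrative: zlib stands in for the proprietary Kraken/BCPack stage, and the "BC1-like" payload below is fabricated. The point is that fixed-rate GPU block compression (e.g. BC1 at 8 bytes per 4x4 texel block) still leaves redundancy a lossless pass can remove for storage and transfer.

    ```python
    import zlib

    def bc1_like_blocks(num_blocks: int) -> bytes:
        """Fake BC1-style payload: 8 bytes per 4x4 block, with the kind of
        repetition real textures often have (flat colour regions)."""
        flat_block = bytes([0x1F, 0x7C, 0x00, 0x80, 0x00, 0x00, 0x00, 0x00])
        return flat_block * num_blocks

    bc_data = bc1_like_blocks(4096)           # ~32 KiB of "BC-compressed" texture
    packed = zlib.compress(bc_data, level=9)  # lossless pass on top, for storage

    ratio = len(bc_data) / len(packed)
    print(f"{len(bc_data)} -> {len(packed)} bytes ({ratio:.1f}x extra)")
    ```

    On real texture data the second-stage ratio is far more modest than on this artificial payload, but the two stages compose the same way: the GPU still consumes plain BC blocks after the lossless layer is undone.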
     
    disco_ likes this.
  12. chris1515

    Legend

    Joined:
    Jul 24, 2005
    Messages:
    7,157
    Likes Received:
    7,965
    Location:
    Barcelona Spain




    He said later...
     
    goonergaz and zupallinere like this.
  13. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,678
    Thought experiment - assuming that the basic premise of the tweets was correct:

    The likely places Series X could fall behind are: SSD throughput/latency, GPU bandwidth, gpu front-end (clock speed)

    I can't think of anything else that would be really obvious.

    I'm just speaking hypothetically, but thinking about gpu bandwidth, the only thing I can come up with is that the ratio of memory access is not quite right. Maybe they need a little bit more than the 10GB for fast gpu accesses.

    The rest of my thoughts really come down to bandwidth mitigation, the memory model, and how things are streamed from the NVMe drive. They are using some kind of virtual memory setup where the NVMe drive is addressable through a new API called DirectStorage. Maybe DirectStorage is a solution to CPU overhead and latency, but comes with complexity. For example, on PS5 maybe there's just one way to access the drive, through the filesystem: a basic open, read, write, close asynchronous or synchronous API like any other. Maybe DirectStorage has a different programming model, so accessing data from the CPU and from the GPU are different. There could be unwanted complexity there. On top of the raw bandwidth disadvantage, maybe the programming model is just harder to use.
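    To make the two programming models concrete, here is a hedged sketch. The names (`StorageQueue`, `Request`) are made up; the real DirectStorage API is a C++ interface and its details weren't public at the time of this thread. The contrast is between one blocking call per read versus enqueueing many small reads and submitting them as one batch for the hardware/OS to reorder.

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class Request:
        path: str
        offset: int
        length: int
        destination: str  # e.g. "cpu_memory" or "gpu_texture" (hypothetical)

    @dataclass
    class StorageQueue:
        """Batched, asynchronous model: enqueue many reads, submit once."""
        pending: list = field(default_factory=list)

        def enqueue(self, req: Request):
            self.pending.append(req)

        def submit(self):
            # A real implementation would hand these to the drive/decompressor;
            # here we just return the batch to show the shape of the API.
            batch = [(r.path, r.offset, r.length) for r in self.pending]
            self.pending.clear()
            return batch

    # Traditional model for comparison: one blocking call per read.
    def read_blocking(path, offset, length):
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    q = StorageQueue()
    for tile in range(4):
        q.enqueue(Request("textures.pak", tile * 65536, 65536, "gpu_texture"))
    batch = q.submit()
    print(len(batch), "requests submitted in one batch")
    ```

    The batched model amortizes per-request CPU overhead across many tiny reads, which is exactly where a filesystem-style blocking API tends to fall down on fast SSDs.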

    As for GPU bandwidth, the Series X really seems to be built around the idea of efficiency of accesses vs raw bandwidth. For example, Sampler Feedback's intention is to access only the parts of textures that are needed, vs loading the whole texture into memory for sampling. The Sampler Feedback API allows you to figure out which parts of the texture will be sampled, and then load only those parts. The thing is, it seems to best fit a particular model of virtual textures with a tile cache. You have a bit of added complexity in terms of learning a new API, but you're also somewhat forced to adopt a particular memory-management model for textures. Maybe that's not compatible with how some existing code bases are already set up. I don't know how CryEngine works right now. So, as a thought: in the situation where the engine is streaming large textures from NVMe into RAM, you're now exceeding the 10GB because you're loading entire textures instead of the necessary parts, and you're hitting the limit of the SSD bandwidth because you're not reading selectively.
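    The savings from the tile-cache model above are easy to put numbers on. This is a back-of-envelope sketch, not the Sampler Feedback API: the tile size and byte-per-texel figure are assumptions chosen for round numbers, and `sampled_tiles` stands in for the feedback map a real renderer would write during a feedback pass.

    ```python
    # Hypothetical tile: 64x64 texels at 1 byte/texel (e.g. already BC-packed).
    TILE_BYTES = 64 * 64 * 1

    def bytes_to_stream(tiles_wide: int, tiles_high: int, sampled_tiles: set):
        """Compare loading a whole texture vs only the tiles actually sampled."""
        whole = tiles_wide * tiles_high * TILE_BYTES
        partial = len(sampled_tiles) * TILE_BYTES
        return whole, partial

    # A 4K-ish texture (64x64 tiles) where the camera only sampled 300 tiles:
    whole, partial = bytes_to_stream(64, 64, sampled_tiles=set(range(300)))
    print(f"full load: {whole} bytes, feedback-driven load: {partial} bytes")
    ```

    With only ~7% of the tiles sampled, both the RAM footprint and the SSD reads shrink by the same factor, which is the efficiency-over-raw-bandwidth trade being described.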

    As for the front-end, I think it's somewhat the same situation. Mesh Shaders are a total rewrite of the render pipeline before rasterization. It's a fully threaded, compute-driven approach. It will take time for developers to learn and optimize for mesh shaders. If you haven't done that yet, you're left with the existing pipeline, which has bottlenecks that favour high clock speeds.

    This is all speculation on my part.
     
    egoless, disco_, PSman1700 and 3 others like this.
  14. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    If there's less latency in the overall PS5 solution vs commercial off-the-shelf products, it's down to the custom elements outside of the SSD rather than the SSD itself, which as far as I've read is only non-standard in the number of priority levels it allows (6 vs 2). I guess you could argue its size and speed are a little unusual too. I am struggling with the latency argument though. If we talk in system-memory terms, then both devices are pulling data from a high-speed SSD over a PCIe 4.0 4x interface through an IO block which connects to the memory/CPU/GPU via AMD's Infinity Fabric. Sony have added a few extra elements to the IO block, like the decompressor and coherency engines, but a decompressor isn't going to reduce latency over an uncompressed data stream (if anything it's going to increase it). So you're left with the coherency engines + cache scrubbers and any differences in the software stack between the two. But since no-one actually knows what DirectStorage does or how it works in the PC space, I don't see how we're in a position to be comparing those at this stage.

    What's the current native GPU compression format? Have we been comparing apples and oranges here? I.e., if GPUs already handle data natively in a compressed format, then should we be adding that compression ratio to the raw throughput of all drives?
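    The "compressed throughput" comparison raised here is simple arithmetic: effective bandwidth is the raw drive speed multiplied by the average compression ratio of the data being read. The raw figures below are the publicly quoted SSD speeds for each console; the compression ratios are assumptions for illustration only, since neither vendor has published average ratios for real game data.

    ```python
    def effective_gbps(raw_gbps: float, compression_ratio: float) -> float:
        """Effective streaming rate after on-the-fly decompression."""
        return raw_gbps * compression_ratio

    ps5_raw, xsx_raw = 5.5, 2.4  # GB/s, quoted raw SSD throughput

    for name, raw, ratio in [
        ("PS5 (Kraken, ~1.6x assumed)", ps5_raw, 1.6),
        ("XSX (BCPack + zlib, ~2.0x assumed)", xsx_raw, 2.0),
    ]:
        print(f"{name}: {effective_gbps(raw, ratio):.2f} GB/s effective")
    ```

    Which is exactly why the question matters: if one side's ratio is quoted against already-GPU-compressed data and the other's against raw data, the headline multipliers aren't comparable.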
     
    PSman1700 and BRiT like this.
  15. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,632
    Location:
    The North
    Surprised he's still updating that thread. He just seems to keep moving on with his own compression work.
     
    PSman1700 likes this.
  16. DSoup

    DSoup Series Soup
    Legend Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    16,775
    Likes Received:
    12,690
    Location:
    London, UK
    I really hope PS5 and XSX are not saving the entirety of RAM when switching between games or going into rest mode. Most of what is in RAM is game assets that can be re-loaded. You really only want the game state, i.e. where everything is in the game world and what it is doing - just like a save file. This should be a considerably smaller amount of data.

    Otherwise, switching games is going to eat into that 825GB/1TB SSD quickly, i.e. you regularly swap between 5 games and you've just lost 50GB of 'swap' space. Writing out gigabytes of RAM state on every switch/sleep is going to put a lot of wear on those drives. That is potentially way more than typical PC drives have to contend with.

    Are you talking about the PCs you and I own today, or PCs that will be built in 12-24+ months' time with new bus architectures and controllers and I/O chains that take advantage of DirectStorage? Because an API cannot negate the bottlenecks that exist in the PCs that you and I own today. You need better hardware. New hardware needs a new API. The API has to come first; the hardware will come after.

    You need new hardware to support a fast SSD coupled with a controller that can decompress certain data and dump it at ~20GB/s to DDR4 and/or GDDR6 without impacting the rest of the system. That's the goal.
     
    #796 DSoup, Apr 6, 2020
    Last edited: Apr 6, 2020
    egoless and TheAlSpark like this.
  17. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    I was thinking about suspend. It would be possible to do it in a super smart way, but it would mean game engines have to support it. The low-hanging fruit would be to force developers to use streaming heavily. Then, when suspending to disk, only store metadata, and when returning to the game, stream the content back from its original location. This would likely save an insane amount of space on textures, sounds and other immutable data. This could also go a long way toward fixing the load-times issue: stream the lowest LOD first to get into the game quickly, and proceed to stream in the higher-quality assets as soon as possible.
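    The idea above can be sketched as a tiny suspend/resume protocol. Everything here is hypothetical (no console exposes such an API publicly): the snapshot keeps only mutable game state plus a manifest of streamable assets, rather than the assets themselves, and resume restores state immediately while re-streaming assets from their original files, lowest LOD first.

    ```python
    import json

    def suspend(game_state: dict, resident_assets: list) -> bytes:
        """Persist mutable state + an asset manifest, not the asset bytes."""
        snapshot = {
            "state": game_state,                    # positions, AI, progress
            "manifest": [{"id": a["id"], "lod": 0}  # re-stream, lowest LOD first
                         for a in resident_assets],
        }
        return json.dumps(snapshot).encode()

    def resume(blob: bytes):
        snapshot = json.loads(blob)
        # A real engine would kick off streaming of the manifest entries here
        # and raise LODs as data arrives; we just hand both parts back.
        return snapshot["state"], snapshot["manifest"]

    blob = suspend({"player_pos": [10, 0, 3]},
                   [{"id": "rock_albedo", "bytes": 8_000_000}])
    state, manifest = resume(blob)
    print(len(blob), "bytes written instead of megabytes of texture data")
    ```

    The snapshot is a few hundred bytes where a full RAM dump of the same scene would include the 8MB texture, which is the space and drive-wear saving being argued for.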
     
  18. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,104
    Likes Received:
    16,896
    Location:
    Under my bridge
    Pretty much. There's no reason for 36 CUs beyond that, and we know devs did target specific CUs with their code. It seems bizarre that the GPU is so constrained, as we're used to swapping GPUs with differing core counts on the PC and it just working, and it's hard to imagine why devs would still be targeting things so low-level that games can break on compatible hardware. But if you think about it, there's some reason, even if odd, to go with 36 CUs, whereas there's no particular reason to go with a really hot, narrow chip. So BC seems the only justification.
     
    PSman1700 likes this.
  19. Jay

    Jay
    Veteran

    Joined:
    Aug 3, 2013
    Messages:
    4,029
    Likes Received:
    3,428
    My question still comes down to: out of 13.5GB, how little would the game code, audio, and anything else that doesn't need high bandwidth take up? 3.5GB sounds pretty small to me; I'm more expecting the slow-access stuff to overflow into the fast section than the other way around.
    But I'll be more than happy for someone to show me that a game engine only needs a fraction of 3.5GB and that 10GB is a huge hindrance compared to 11GB.
    This is also where I believe BCPack comes into play: that it allows better partial texture retrieval compared to other package formats.
    Could be misremembering, though.
    But I thought that was one of the positives he was mentioning, and why throughput could essentially be better than initial perceptions.

    In the end, developers will have to code to take advantage of these things, like always; it just depends how hard it is. But some things will just get used, as it's the best way to get the performance you require.
     
    egoless likes this.
  20. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,632
    Location:
    The North
    Substantially different take on SmartShift from what everyone else has talked about. He's making it seem like power is shifting around all the time, when we have been told that it's holding at max at all times.
     
    PSman1700 likes this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.