Current Generation Hardware Speculation with a Technical Spin [post launch 2021] [XBSX, PS5]

Discussion in 'Console Technology' started by pjbliverpool, Feb 9, 2021.

  1. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,546
    Likes Received:
    2,890
    Location:
    Guess...
    Not to derail the topic of misaligned reflections but I found the above to be more interesting.

    Specifically because this would imply that the XSX version is not using BCPACK, since BCPACK should produce very similar results to Oodle Texture + Kraken. By extension, that means the XSX may not be using its hardware decompression block while the PS5 is, putting a greater load on the XSX CPU, which may have to stand in for the decompression. This could explain why the IO is being blamed for the stutters. Rather than it being down to the PS5 having a faster IO system (the XSX should be more than sufficient for anything this game needs), it's more a case of the XSX system not being fully utilised.
     
    RagnarokFF, PSman1700, Dural and 4 others like this.
  2. Shortbread

    Shortbread Island Hopper
    Legend Veteran

    Joined:
    Jul 1, 2013
    Messages:
    5,191
    Likes Received:
    4,176
    I don't know, anything is possible. Are there any other third-party titles on XBSX that exhibit similar stuttering issues even with similar installation sizes? If the answer is yes, then it's something else that's hardware related.
     
  3. Globalisateur

    Globalisateur Globby
    Veteran Regular Subscriber

    Joined:
    Nov 6, 2013
    Messages:
    4,116
    Likes Received:
    3,034
    Location:
    France
    Since when are framerate drops caused by a CPU bottleneck during data streaming bugs? We have had those odd framerate drops since the PS360 generation in almost all games that need to stream new data (PS5 and PC included), and this is the first time in years that I've heard people calling them that.

    Those framerate drops are not bugs; they are performance issues caused by a specific bottleneck (usually a combination of software and hardware limitations; at some point any software job will be limited by something). Like the performance problems caused by GPU limits, they can be alleviated if the developers improve their code to make better use of the available hardware and software. We usually say the developers optimize their code.

    And as this game is not using the PS5's custom I/O hardware for loading, I doubt it is using it elsewhere.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,546
    Likes Received:
    2,890
    Location:
    Guess...
    Just because a game isn't coded in such a way as to take advantage of a high speed IO system for loading doesn't mean it can't still decompress its optical drive files on a hardware decompression unit rather than a CPU. It's possible the PS4 version of the game already does that.
     
    PSman1700 likes this.
  5. thicc_gaf

    Regular Newcomer

    Joined:
    Oct 9, 2020
    Messages:
    324
    Likes Received:
    247
    So uh, this isn't really related to the Control discussion, but I came across a curious post on Era with someone saying the PS5 has one TMU per CU while Series X has four TMUs per CU. Can anyone here verify if that's true or not?

    Because I've always assumed that both systems had four TMUs per CU, and looking over some of the RDNA 2 GPU specs you can basically work out those having four TMUs per CU and I figure that would be standardized in the RDNA 2 spec except maybe for very small laptop/mobile APUs. But it almost sounds too wild to be true, that'd create an absolutely massive gap between PS5 and Series X (in favor of the latter) when it comes to texel data and texture fillrate. We're talking 36 vs. 208 here!

    Again though, it's just someone else's post and I can't even verify if that person is a dev or has access to devkits for these systems. But I'm curious if anyone here knows about this and can verify or debunk it. Really can't picture a TMU disparity that huge between these systems but hey, if so, it is what it is :/.
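
    For reference, the arithmetic behind those figures can be sketched as below. The CU counts (36 and 52) and peak GPU clocks (2.23 GHz and 1.825 GHz) are the publicly stated specs; four TMUs per CU is the standard RDNA layout and is assumed here for both consoles, as is one texel per TMU per clock. The function names are only illustrative.

```python
# Back-of-envelope TMU and peak texel-rate arithmetic.
# Assumptions: 36 CUs at up to 2.23 GHz (PS5), 52 CUs at 1.825 GHz (Series X),
# one texel per TMU per clock. These are peak figures, not measured throughput.

def texture_units(cus: int, tmus_per_cu: int) -> int:
    """Total TMUs for a GPU with the given CU count."""
    return cus * tmus_per_cu

def texel_rate_gtexels(cus: int, tmus_per_cu: int, clock_ghz: float) -> float:
    """Peak texel rate in GTexels/s."""
    return texture_units(cus, tmus_per_cu) * clock_ghz

claimed_ps5 = texture_units(36, 1)    # the "one TMU per CU" claim
series_x = texture_units(52, 4)       # four TMUs per CU
standard_ps5 = texture_units(36, 4)   # four TMUs per CU, as on other RDNA parts

print(f"claimed PS5 TMUs: {claimed_ps5}, Series X TMUs: {series_x}")
print(f"PS5 TMUs with 4 per CU: {standard_ps5}")
print(f"PS5 peak:      {texel_rate_gtexels(36, 4, 2.23):.1f} GTexels/s")
print(f"Series X peak: {texel_rate_gtexels(52, 4, 1.825):.1f} GTexels/s")
```

    With four TMUs per CU on both machines the comparison is 144 vs. 208 units, and the PS5's higher clock narrows the peak-rate gap further.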
     
  6. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,903
    Likes Received:
    11,013
    Location:
    London, UK
    You don't want the I/O system decompressing data willy-nilly. Too much check-in code is written on the basis that it will load compressed data from disk, allocating only as much memory as needed to load that compressed data into before decompressing it (whether by CPU or hardware). If the data was already decompressed during transfer it won't fit into the allocated memory.
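
    A minimal sketch of that failure mode, assuming a legacy-style loader that sizes its staging buffer from the compressed size on disk. Here zlib is only a stand-in for whatever codec a game actually uses, and `load_into` is a hypothetical helper, not any console API.

```python
import zlib

# Stand-in asset: 1 MiB of highly compressible data.
asset = bytes(range(256)) * 4096
compressed = zlib.compress(asset)

# Legacy-style loader: the staging buffer is sized for the compressed payload only.
staging = bytearray(len(compressed))

def load_into(buffer: bytearray, payload: bytes) -> bool:
    """Copy payload into buffer; return False if it does not fit."""
    if len(payload) > len(buffer):
        return False
    buffer[:len(payload)] = payload
    return True

# What the old code expects: compressed bytes arrive, then get inflated elsewhere.
fits_compressed = load_into(staging, compressed)

# What would arrive if the I/O path had already decompressed the data in flight.
fits_decompressed = load_into(staging, asset)

print(len(asset), len(compressed), fits_compressed, fits_decompressed)
```

    The compressed payload fits and the pre-inflated one does not, which is why the allocation logic has to change along with the storage format.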

    To get the best of the new I/O systems developers need to rethink how assets are stored from the ground up. This isn't something you can just shoe-horn in, this is literally changing the structure and storage of multiple gigabytes of data in most games.
     
    BRiT, Shortbread, Allandor and 4 others like this.
  7. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,727
    Likes Received:
    4,002
    Location:
    Wrong thread
    One of my suggestions about Control on XSX was that - possibly - the CPU side hitches weren't caused by high IO overhead as such, but more by the CPU being briefly deluged with work stemming from reads that would have previously been limited by natural bottlenecks on console and PC.

    If your CPU is five times as fast, but your new storage system is effectively serving up data in the same way but ten or twenty times faster than even an SSD was last gen, perhaps what was an easily manageable instantaneous workload in the past becomes something that you start to choke on due to some element of how you manage data and operate upon it.

    Even once you've got something in RAM, there can be quite a bit of work involved in getting it ready to be used by the game (and DX).
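
    As a rough illustration of that argument (a toy model, not a measurement): assume the CPU cost of preparing data scales linearly with the bytes arriving, and that the game consumes data as fast as storage delivers it. The total work is unchanged, but it is squeezed into a shorter window, so the instantaneous load rises by the ratio of the two speedups. The function name and the model itself are assumptions.

```python
def relative_burst_load(io_speedup: float, cpu_speedup: float) -> float:
    """Instantaneous CPU load from freshly read data, relative to last gen.

    Assumes preparation cost is linear in bytes and that nothing else
    throttles how quickly data is handed to the CPU.
    """
    return io_speedup / cpu_speedup

# CPU five times as fast, storage ten or twenty times as fast (figures from the post).
for io_speedup in (10, 20):
    load = relative_burst_load(io_speedup, cpu_speedup=5)
    print(f"IO x{io_speedup}, CPU x5 -> burst load x{load}")
```

    Under those assumptions the CPU sees two to four times the instantaneous work it did before, even though it is five times faster.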
     
    PSman1700 likes this.
  8. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,903
    Likes Received:
    11,013
    Location:
    London, UK
    If it's this alone, it should be more pronounced on PS5 where the I/O is faster and the CPU is slower than Series X.
     
    goonergaz likes this.
  9. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,727
    Likes Received:
    4,002
    Location:
    Wrong thread
    Indeed, which is why I was thinking about something like DX resource binding (just as an example) and the cost of getting assets ready for use once they're in memory. I believe certain API operations can really start to add up if you do too many per frame, and heavily impact frame rate.

    At its heart Control for MS platforms is a 2019 DX11 game, designed around mechanical HDDs. The PS5 version is definitely doing something better, probably on the CPU side, and I think it's most likely to be something to do with how the game is able to use data once it's in memory.

    I understand that it's compelling to see stutters and hitches related to area transitions and say "Cerny IO block!!", but getting stuff into memory is only a small part of what it takes to get assets rendered on screen without hiccups.
     
    egoless, PSman1700 and mr magoo like this.
  10. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,903
    Likes Received:
    11,013
    Location:
    London, UK
    I don't think anybody is saying this but I do have a fair number of people on my ignore list.
     
    function likes this.
  11. function

    function None functional
    Legend Veteran

    Joined:
    Mar 27, 2003
    Messages:
    5,727
    Likes Received:
    4,002
    Location:
    Wrong thread
    Well I'm probably getting a bit carried away and bringing in baggage from other conversations and other places, especially after seeing how NXGamer's comments have been unfairly used (not his fault, he makes great videos).

    I know you're not saying that, and I should have kept my reply more focused on what you were. I shouldn't have tossed that in, so sorry about that.
     
    Pete, PSman1700, BRiT and 1 other person like this.
  12. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,903
    Likes Received:
    11,013
    Location:
    London, UK
    No probs. Like I said I have a lot of people on my ignore list. Some threads look weird, like folks arguing with themselves. :yep2:
     
    Sonic, BRiT and function like this.
  13. goonergaz

    Veteran

    Joined:
    Jun 3, 2005
    Messages:
    4,195
    Likes Received:
    1,509
    Hmmm...I don't think I'm on your ignore list :D, but I am curious what cache scrubbers bring to the table and why they are not part of any discussion around the PS5 closing the gap in Control.

    ...I'm also interested to see whether, if I'm on your ignore list, the fact that I've quoted you causes a rift in the space-time continuum that ultimately gets me a ban!? :runaway:
     
  14. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    14,903
    Likes Received:
    11,013
    Location:
    London, UK
    I think the answer to that is that nobody able to post knows which is probably why there is little discussion. The speculation : fact ratio is already bad in the Console Technical forums! Too many folks participate on the premise of wanting to learn but they don't want to learn, they want their view to be correct.

    Assuming Sony even have devtools to measure the effectiveness of the cache scrubbers (and how you would even measure it) maybe at some future GDC some dev will include this.
     
    #14 DSoup, Feb 10, 2021
    Last edited: Feb 10, 2021
  15. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    The hardest issue with Sony in general is how tight-lipped they are about their hardware. Even finding something on PS3 is extremely challenging at this point in time. For whatever reason it's always been easier to discuss MS consoles: either the information flows easily, or there are just leaks everywhere. Perhaps this creates a bias against PS because there's no real talk happening there, but if cache scrubbers do something, we're not likely to hear about it until a developer interview mentions it.

    Right now I think the PS5's biggest advantage probably sits with its geometry engine. After some time, and roughly in line with Matt H's comments on it vs VRS, I do believe that the geometry is culled so early, and the hardware seems so biased towards culling significantly more triangles than it can rasterize, that there is significantly less workload going forward. Typically back face culling happens very late in the 3D pipeline, so you're doing lots of work on a lot of triangles and then tossing them very late. I think this can explain some of the performance boosts we're seeing with RDNA 2 (6800 series) and PS5 for some titles for sure (just peeking at other threads for triangle generation etc).
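
    To make the early-vs-late culling point concrete, here is a small CPU-side sketch. It is not how the hardware or any driver implements it; it just uses the standard signed-area winding test on screen-space triangles, and assumes counter-clockwise means front-facing with y pointing up. The `cost_per_tri` value is an arbitrary unit of "work done before the triangle is tossed".

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Triangle = Tuple[Vec2, Vec2, Vec2]

def signed_area(tri: Triangle) -> float:
    """Signed area of a screen-space triangle; positive when counter-clockwise."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return 0.5 * ((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))

def is_front_facing(tri: Triangle) -> bool:
    # Back-facing (negative area) and degenerate (zero area) triangles are rejected.
    return signed_area(tri) > 0.0

def work_units(tris: List[Triangle], cost_per_tri: int, cull_early: bool) -> int:
    """Work spent on per-triangle processing, with culling before or after it."""
    survivors = [t for t in tris if is_front_facing(t)]
    processed = survivors if cull_early else tris
    return len(processed) * cost_per_tri

scene = [
    ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)),  # counter-clockwise: front-facing
    ((0.0, 0.0), (0.0, 1.0), (1.0, 0.0)),  # clockwise: back-facing
    ((2.0, 0.0), (2.0, 1.0), (3.0, 0.0)),  # clockwise: back-facing
    ((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)),  # collinear: zero area
]

late = work_units(scene, cost_per_tri=10, cull_early=False)
early = work_units(scene, cost_per_tri=10, cull_early=True)
print(f"late culling: {late} units, early culling: {early} units")
```

    Both paths end up drawing the same single triangle; the only difference is how much work was spent on the three that get thrown away.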

    At least I think it plays a larger role than cache scrubbers, which, from what I've read, seem like a general non-issue. I think, however, that when Mesh Shaders do finally come into play (I don't think cross-gen titles will write a separate path for them), this advantage would go away in theory. But right now, I believe the compilers are making full use of converting calls into primitive shaders, and they are culling triangles very quickly. This looks like a significant pain point for XSX if (a) they aren't set up to do this (can't convert 3D shaders to primitive shaders) or (b) they don't have the fixed function pipelines of RDNA 2 for it, in favour of having more compute and feature sets (aligning a bit more with Nvidia in this case). Correct me if I'm wrong, but I haven't heard any hubbub about how well XSX can cull triangles, so it's possible that it's not big on FF triangle discard. According to Hot Chips, though, (b) is very improbable: they indicate a unified Geometry Engine that also supports mesh shading. That leaves (a): when we last looked at the documentation, back in June, they were not yet capable of leveraging the NGG.

    tldr; Geometry Engine and Primitive Shaders are my main focus for investigation for PS5. If there is statistical bias moving PS5 ahead of regression, this is where I would investigate given the information that is available.

    I recall this one moment in DMC 5 demo where XSX completely tanked and PS5 held steady but we're literally rotating in a near empty room with a statue during a cutscene.

    Just spitballing, that was a possible situation where PS5 was obliterating triangles out of view/blocked/etc and because it discarded so much so effectively, all of its triangle generation could be put towards visible triangles. And we saw XSX drop really badly there, meaning it was wasting triangle generation on triangles that would later be discarded and thus we saw a huge frame drop. This post here on B3D where Voxilla was doing some benchmarking was the inspiration here to look for these moments:
    https://forum.beyond3d.com/posts/2191463/

    For perspective XSX is 7.3GTris/s maximum rate.
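
    One way to read that number, as a rough sketch: 7.3 GTris/s is what four primitives per clock works out to at the Series X's 1.825 GHz GPU clock (the four-per-clock figure is inferred from the quoted rate, not taken from documentation). The 60 fps target and the 50% wasted fraction below are purely hypothetical inputs.

```python
def peak_tri_rate_gtris(prims_per_clock: int, clock_ghz: float) -> float:
    """Peak triangle rate in billions of triangles per second."""
    return prims_per_clock * clock_ghz

def tris_per_frame_millions(rate_gtris: float, fps: float) -> float:
    """Peak triangle budget per frame, in millions."""
    return rate_gtris * 1000.0 / fps

def visible_budget_millions(per_frame_millions: float, wasted_fraction: float) -> float:
    """Budget left for visible triangles if a fraction is generated and then discarded."""
    return per_frame_millions * (1.0 - wasted_fraction)

rate = peak_tri_rate_gtris(4, 1.825)        # assumed: 4 primitives per clock
per_frame = tris_per_frame_millions(rate, 60.0)
useful = visible_budget_millions(per_frame, 0.5)   # hypothetical: half discarded late

print(f"{rate:.1f} GTris/s peak")
print(f"{per_frame:.1f} M triangles per frame at 60 fps")
print(f"{useful:.1f} M left for visible geometry if half are wasted")
```

    These are theoretical peaks; real scenes will land well below them, which is exactly why wasted triangle work matters.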
     
    #15 iroboto, Feb 10, 2021
    Last edited: Feb 10, 2021
  16. cwjs

    Newcomer

    Joined:
    Nov 17, 2020
    Messages:
    164
    Likes Received:
    342
    Are geometry shaders something that could be generated automatically from regular primitive shaders? I haven't read much about them. Mesh shaders certainly cannot, especially not with any performance advantage (culling is manual in them, you'd need to process the meshes and go in and write the code -- maybe not a huge undertaking, but definitely not a free update.)

    Regarding DMC5 (and Control, which runs better than I'd expect if this is true): if what posters were saying in this thread about some XSX games being fast, presumably low budget DX11->DX12 updates is right, I think basically anything is on the table as far as performance dips go. DX12 requires (careful) manual memory management -- the developer has to design the pipeline, manage GPU parallelism, control when resources are accessed simultaneously, and schedule things so the whole renderer doesn't grind to a halt (while it, say, waits on one whole shader getting finished before any more code on the next one can start up due to memory barriers between resources). If that's what Control is, without a significant amount of work to re-do the renderer, it's a miracle it works as well as it does.
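
    A toy model of why over-conservative barriers hurt, under loud assumptions: the pass names, millisecond costs and dependency graph are all made up, and the "overlapped" case idealistically assumes the GPU has spare units to run independent passes side by side at no cost. It only compares "barrier after every pass" with "wait only on real dependencies".

```python
from typing import Dict, List

def serialized_time(durations: Dict[str, float]) -> float:
    """Frame time if every pass waits for the previous one to finish completely."""
    return sum(durations.values())

def overlapped_time(durations: Dict[str, float],
                    deps: Dict[str, List[str]],
                    order: List[str]) -> float:
    """Frame time if a pass only waits on the passes it actually reads from.

    `order` must list passes so that dependencies come before dependents.
    """
    finish: Dict[str, float] = {}
    for name in order:
        start = max((finish[d] for d in deps.get(name, [])), default=0.0)
        finish[name] = start + durations[name]
    return max(finish.values())

# Hypothetical passes and costs in milliseconds.
durations = {"shadows": 2.0, "gbuffer": 3.0, "ssao": 1.0, "lighting": 2.0}
deps = {"ssao": ["gbuffer"], "lighting": ["shadows", "gbuffer", "ssao"]}
order = ["shadows", "gbuffer", "ssao", "lighting"]

full_barriers = serialized_time(durations)
real_deps_only = overlapped_time(durations, deps, order)
print(f"barrier after every pass: {full_barriers} ms")
print(f"only real dependencies:   {real_deps_only} ms")
```

    In this made-up frame the shadow pass overlaps with the G-buffer pass, so two of the eight milliseconds disappear without any pass getting faster.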
     
  17. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    This post here on B3D: https://forum.beyond3d.com/posts/2180107/
    I'll post the tweet however. Unfortunately the follow up tweets were deleted, and I wish we took a screenshot of them. But it was very clear how effective the drivers were at taking raw front end shaders and converting them to primitive shaders.


    It's about converting front end shaders into primitive shaders (near equivalent to Mesh shaders). I believe geometry shaders would be part of that, but no one really uses them as far as I recall. The geometry engine would be responsible for this task.

    Agreed, all sorts of possibilities, but I'm specifically looking for situations where there's nothing much really going on and XSX is tanking and PS5 is holding. Just seems to happen more often than not. Inside the room of death for Valhalla for instance, it's just dying. When you zoom in with the sniper rifle in Hitman 3 for instance, it must keep all the scene geometry in memory; is it being drawn but not culled fast enough? The flowers obstruct all views of any geometry, but it still needs to be rendered, perhaps another culling problem. Dirt 5 may very well be a similar issue: it's just unable to discard the triangles, or it's processing too many triangles that will later be discarded anyway. Dirt 5 has a very tessellated track floor! All of that triangle generation cannot be wasted!

    Just playing through Demon's Souls for instance, was also something I really spent time to look at. The amount of tessellation and geometry everywhere.

    I've been giving this a lot of thought, and with Control photomode showing me it's not an alpha issue (because PS5 should have won if this was the case, and therefore not a bandwidth issue, because PS5 would have won in this case because ROPs are largely bandwidth limited) then I really needed to look elsewhere.
     
    #17 iroboto, Feb 10, 2021
    Last edited: Feb 10, 2021
    PSman1700 and cwjs like this.
  18. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,018
    Likes Received:
    15,763
    Location:
    The North
    Not true as far as I know. It's the first I've heard of it if I'm honest.
     
    thicc_gaf likes this.
  19. cwjs

    Newcomer

    Joined:
    Nov 17, 2020
    Messages:
    164
    Likes Received:
    342
    Thanks a lot for the info -- this is something where my knowledge of the hardware side is way below my knowledge of the software side so bear with me if this is a stupid question: Automatically converting standard fixed function shaders (vertex, etc) into something that the hardware is better designed for (more compute, etc) is one thing, but where would culling get introduced here? Does the hardware have some way to know, or is something happening on the developer side? With Mesh Shaders, for example, they're not necessarily faster than traditional fixed function geometry at all, at least for simple cases -- but because of how they're structured (Task shaders -- specialized compute shaders that dispatch Mesh Shaders, specialized compute shaders that take the place of Vertex Shaders) you can relatively simply introduce huge culling benefits that just aren't practical in a straightforward way on the old pipeline. (There are some recent Xbox developer YouTube videos about this on the DX12 side)
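
    A minimal CPU-side sketch of the task/mesh split described above, to show where developer-written culling slots in. This is plain Python rather than shader code, and every name here (`Meshlet`, `task_stage`, `mesh_stage`, the two-plane "frustum") is a hypothetical stand-in: the task stage tests each meshlet's bounding sphere and only dispatches the survivors to the mesh stage.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[Vec3, float]  # (unit normal n, offset d); a point p is inside when dot(n, p) + d >= 0

@dataclass
class Meshlet:
    center: Vec3          # bounding-sphere centre
    radius: float         # bounding-sphere radius
    triangle_count: int

def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sphere_outside(center: Vec3, radius: float, plane: Plane) -> bool:
    """True when the whole sphere lies on the outside of the plane."""
    normal, d = plane
    return dot(normal, center) + d < -radius

def task_stage(meshlets: List[Meshlet], frustum: List[Plane]) -> List[Meshlet]:
    """Coarse culling: keep meshlets whose sphere is not fully outside any plane."""
    return [m for m in meshlets
            if not any(sphere_outside(m.center, m.radius, p) for p in frustum)]

def mesh_stage(meshlets: List[Meshlet]) -> int:
    """Stand-in for per-vertex/per-triangle work: just counts triangles processed."""
    return sum(m.triangle_count for m in meshlets)

# Simplified "frustum": only two side planes, keeping -1 <= x <= 1.
frustum = [((1.0, 0.0, 0.0), 1.0), ((-1.0, 0.0, 0.0), 1.0)]

meshlets = [
    Meshlet((0.0, 0.0, 5.0), 0.5, 64),   # fully inside
    Meshlet((3.0, 0.0, 5.0), 0.5, 64),   # fully outside: culled before any vertex work
    Meshlet((1.2, 0.0, 5.0), 0.5, 64),   # straddles the edge: kept
]

visible = task_stage(meshlets, frustum)
print(f"{len(visible)} of {len(meshlets)} meshlets dispatched, "
      f"{mesh_stage(visible)} of {mesh_stage(meshlets)} triangles processed")
```

    The culling test is ordinary code the developer has to write and feed with per-meshlet bounds, which is why it isn't something a driver can add for free to an existing vertex-shader path.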
    I mean, I think the "nothing much going on" tells us that something shady is going on (bad dx12 renderer constantly hanging on fences, serious tool problem, something wrong with the hardware) rather than a performance difference. With the hitman example: I don't think that's actually very easy to cull. The kind of thorough geometric tests that can guarantee 'the flowers obstruct all views' are expensive and hard to get right -- I think the safer guess is that one has more to do with: 1- the way the hitman renderer works (maybe a relatively 'straightforward' forward renderer or something?) 2- the xbox running at a way higher resolution and 3- yeah, a ps5 hardware advantage on fill rate would make sense, but not that big.
     
  20. cwjs

    Newcomer

    Joined:
    Nov 17, 2020
    Messages:
    164
    Likes Received:
    342