Digital Foundry Article Technical Discussion [2021]

Discussion in 'Console Technology' started by BRiT, Jan 1, 2021.

  1. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    2,822
    Likes Received:
    2,002
    Location:
    Earth
    I have no disagreement here. I'm just saying that the additional sampler feedback features on Xbox (collecting misses) are useful but not mandatory. There are many ways to implement rendering, up to the extreme of UE5 or Dreams, which change the geometry representation and even use software rendering implemented as compute shaders.

    I wonder how much UE5, for example, uses SFS, or whether UE5 uses a completely custom renderer implemented in compute. UE5 is a good example, as it's probably the engine that depends most heavily on streaming, at least if we discount Sony first-party titles/engines like Ratchet & Clank and Spider-Man: Miles Morales.
     
    mr magoo likes this.
  2. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    You're ultimately relying on the GPU hardware to do texture sampling, and you cannot peek at what was sampled. So you make decisions based upon what you think is being sampled.

    The best systems will simply do better at guessing what was sampled and making decisions from there. With SFS you are given feedback on how you did, and you can choose what to do with it. Full link on how it works above. The flowchart is fairly high level but useful in describing why a developer would want it.

    You sample tiles, and SFS tells you what you actually got, which may not be what you wanted. You then make another request for tiles based on knowing what you sampled previously. The key point from the game demos shown earlier is that SFS gives you feedback on your samples, from which you create requests for new tiles to replace the ones you didn't want. You can then unload those unwanted resources, and that is where the savings come from. It is not solving visibility; it is solving the decision of which MIP tiles should be loaded and where.
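
    To make the feedback loop described above concrete, here is a minimal toy sketch in Python. It is not the real SFS or D3D12 API: `TileStreamer`, `update`, and the tile names are all hypothetical, and the feedback is assumed to arrive as a simple map from tile to the mip the sampler actually wanted.

```python
from dataclasses import dataclass, field


@dataclass
class TileStreamer:
    """Toy model of feedback-driven mip tile residency (hypothetical, not the SFS API)."""
    # (tile, mip) pairs currently held in memory
    resident: set = field(default_factory=set)

    def update(self, feedback: dict) -> tuple:
        """feedback maps tile id -> the mip level the sampler actually wanted this frame."""
        wanted = set(feedback.items())
        to_load = wanted - self.resident    # wanted but not resident: request these
        to_evict = self.resident - wanted   # resident but not wanted: free this memory
        self.resident = wanted
        return to_load, to_evict


streamer = TileStreamer(resident={("rock", 0), ("wall", 0)})
# Assumed feedback: the rock is now far away (mip 3 is enough) and the wall was never sampled.
to_load, to_evict = streamer.update({"rock": 3})
```

    The point of the sketch is only that the decision is driven by what was actually sampled rather than by a guess; a real system would also have to deal with latency, budgets, and partial residency.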
     
    #1702 iroboto, May 4, 2021 at 6:14 PM
    Last edited: May 4, 2021 at 6:21 PM
    BRiT and pjbliverpool like this.
  3. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    2,822
    Likes Received:
    2,002
    Location:
    Earth
    Here is another angle. Let's say we do ray tracing instead of raster. We get exact hits on triangles and textures. If we mostly use ray tracing, SFS is not needed, as ray tracing itself provides the hits. The issue then becomes how long it takes to fetch the misses, which is the same issue SFS has with misses. Ray tracing makes things more difficult, though: any kind of frustum/visibility-based optimization goes in the trash bin, as traced rays can hit things outside the frustum (lights, reflections, ...).

    For ray tracing we could first collect the hits, fetch textures as best we can, and then shade in another pass. What is missed is missed, same as with SFS. Missed data becomes available later, and by then it may or may not still be needed. We would still want to predict what is needed to avoid misses. A miss is unfortunate whenever it happens.
     
  4. mr magoo

    Newcomer

    Joined:
    May 31, 2012
    Messages:
    193
    Likes Received:
    322
    Location:
    Stockholm
    Sorry for the terrible formatting in my previous post, I am on an old tablet.
     
    manux likes this.
  5. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    Depends on whether you're referring to visibility hits or LOD hits. You can hit, but still sample an undesirable LOD.
     
  6. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    18,883
    Likes Received:
    21,271
    Did a quick pass at making it a bit more seemly, but possibly still missing nuances and natural breaks from the video.
     
    mr magoo likes this.
  7. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    2,822
    Likes Received:
    2,002
    Location:
    Earth
    I think that is the same between RT and SFS. One would want to keep the low(er)-level mip maps always in RAM to be able to sample something. Missed data is unlikely to arrive in the same frame the miss occurred. It is probably many frames before the missed data is in RAM and available for use.
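
    A rough Python sketch of that fallback idea, under stated assumptions: mip numbers grow as resolution drops, the coarsest mip is always kept resident, and `sample_mip` is a made-up helper rather than anything from a real graphics API.

```python
def sample_mip(resident_mips: set, wanted_mip: int, coarsest_mip: int) -> int:
    """Return the mip to sample from: the wanted one if resident,
    otherwise the next coarser mip that is resident.

    Larger numbers mean coarser mips; coarsest_mip is assumed to stay in RAM.
    """
    for mip in range(wanted_mip, coarsest_mip + 1):
        if mip in resident_mips:
            return mip
    raise LookupError("the coarsest mip is expected to stay resident")


resident = {4, 5}                       # only the coarse mips are in RAM
fallback = sample_mip(resident, wanted_mip=1, coarsest_mip=5)

resident.add(1)                         # the missed mip streams in some frames later
later = sample_mip(resident, wanted_mip=1, coarsest_mip=5)
```

    Here the first lookup falls back to mip 4, and only once mip 1 has arrived does the lookup return the level that was originally wanted.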
     
  8. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    I suspect a lot of this depends on how you want to sample textures: whether you want to use the 3D pipeline to sample or your own compute shader.
    I don't know if SFS is available in a compute shader.

    i.e.
    SampleLevel() is available in a compute shader.
    Sample() is not available in a compute shader.

    Going further, just reading through: unless I missed something, SFS requires Tiled Resources to run, which many VT systems have never wanted to adopt.
     
    #1708 iroboto, May 4, 2021 at 6:39 PM
    Last edited: May 4, 2021 at 6:44 PM
    manux likes this.
  9. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    919
    Likes Received:
    668
    XSX has a modest advantage in RT mode, as expected (ray/triangle intersection throughput depends on CU count and clock).
     
  10. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    The frame rates are capped in that demo. How do you know how large the difference is when XSX is capped at 60 and PS5 is below the cap?
    Dips and peaks are not expected to be the same between the two.
     
    mr magoo and BRiT like this.
  11. snc

    snc
    Regular Newcomer

    Joined:
    Mar 6, 2013
    Messages:
    919
    Likes Received:
    668
    Maybe it's closer to the theoretical 20-25%? ;) Who knows, but both consoles dip to 50ish in some scenes.
     
  12. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    That's fine, I was just curious how you look at the clamping problem, that's all. Honestly I don't know how close or far apart they are, but dips and peaks aren't the same, unfortunately. As per the other thread, RT is a fixed cost here regardless of resolution, so rasterization speed will matter as well, and that depends on other factors. If you want to single out the RT aspect, you'd have to separate out the rasterization aspect.

    At lower resolutions in this title, on the PC side of things, OlegH found that RT takes longer the lower the resolution, indicating a CPU bottleneck. That's basically why I don't like to use this as an RT benchmark. Since both consoles use CBR, the rendering resolution may be lower, possibly leading to a CPU bottleneck that could cap RT performance.

    That may be why the dips sometimes intersect or get very close to one another. I do wonder if that is a CPU issue. Not sure. Too little information to go on.
     
    scently, BRiT and mr magoo like this.
  13. see colon

    see colon All Ham & No Potatos
    Veteran

    Joined:
    Oct 22, 2003
    Messages:
    2,059
    Likes Received:
    1,144
    My experience on PC has always been that RT takes a larger percentage hit to performance at lower resolutions. I've always assumed this was because it's more of a fixed cost, while the raster time of each frame is lower at lower resolutions.
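
    The fixed-cost explanation above can be checked with some quick arithmetic. The numbers in this Python sketch are invented for illustration (3 ms of RT work, 12 ms of raster at 4K, raster time assumed to scale linearly with pixel count); they are not measurements from any game or GPU.

```python
def rt_share(raster_ms: float, rt_ms: float) -> float:
    """Fraction of total frame time spent on ray tracing."""
    return rt_ms / (raster_ms + rt_ms)


RT_MS = 3.0                      # assumed fixed RT cost, independent of resolution
raster_4k = 12.0                 # assumed raster cost at 4K
raster_1080p = raster_4k / 4     # 1080p has a quarter of the pixels (linear scaling assumed)

share_4k = rt_share(raster_4k, RT_MS)
share_1080p = rt_share(raster_1080p, RT_MS)
```

    With these made-up figures RT is 20% of the frame at 4K and 50% at 1080p, even though the RT work itself has not changed. It does not explain RT taking longer in absolute terms at lower resolution, which is a separate question.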
     
    iroboto likes this.
  14. Globalisateur

    Globalisateur Globby
    Veteran Regular Subscriber

    Joined:
    Nov 6, 2013
    Messages:
    4,168
    Likes Received:
    3,079
    Location:
    France
    About a 5% advantage for XSX in this game, using the exactly matching scenes shown by VGTech (when both are dropping). Most gameplay scenes shown by DF weren't exactly like for like.
     
  15. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    This is what I assumed too. Actually, I had always assumed RT cost would vary with resolution, given that rays are cast per pixel. But after finding out RT was a fixed % of the frame, it made sense to look at it the way you do. Then finding out that lower resolution had a longer RT time was confusing.
     
  16. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    You're only looking at dips, though. That's like taking a massive 1-million-frame dataset, clamping the max value of both series to 60, comparing the minimum values, and then making a claim about performance based on the remaining non-clamped data as if it represented the whole population.

    No one would ever do that. While that is certainly how players will experience the game, it is not an evaluation of how successfully the hardware is running the game.
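
    A small Python sketch of the clamping argument. The two frame-rate series are entirely made up (they are not XSX or PS5 data), and `capped_mean` is a hypothetical helper; the only thing shown is how a 60 fps cap can shrink a large underlying gap.

```python
def capped_mean(frame_rates: list, cap: float = 60.0) -> float:
    """Average frame rate after clamping every sample to the cap."""
    capped = [min(rate, cap) for rate in frame_rates]
    return sum(capped) / len(capped)


# Hypothetical uncapped frame rates for two systems over the same four scenes.
system_a = [90.0, 85.0, 70.0, 55.0]
system_b = [72.0, 68.0, 62.0, 53.0]

# Relative advantage of A over B, without and with the 60 fps cap.
uncapped_gap = sum(system_a) / sum(system_b) - 1
capped_gap = capped_mean(system_a) / capped_mean(system_b) - 1
```

    In this invented example the uncapped gap is a bit over 17%, while the capped averages differ by less than 1%, because only the single dip below 60 contributes to the capped comparison.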
     
    mr magoo and BRiT like this.
  17. cwjs

    Newcomer

    Joined:
    Nov 17, 2020
    Messages:
    185
    Likes Received:
    369
    That is confusing. You mean longer in total, not longer proportionally? Something weird is up.

    As for fixed costs, you can shoot fewer rays per pixel, but you could also choose to shoot a fixed count, I guess. Additionally, dealing with the BVH (on GPU or on CPU; theoretically you could do either, not sure what the RT APIs permit) is going to cost the same regardless of resolution.
     
  18. manux

    Veteran Regular

    Joined:
    Sep 7, 2002
    Messages:
    2,822
    Likes Received:
    2,002
    Location:
    Earth
    Could you be seeing the fixed cost (overhead?) of ray tracing? Building the BVH, for example, takes the same amount of time irrespective of display resolution. Another thing could be some fixed HW/driver overhead that is more visible at lower resolutions.

    The second thing that comes to mind is the effect of caching. Lowering the rendering resolution makes the rays more divergent, as they still have to cover the same area with fewer rays. These more divergent rays could be worse for the cache and cause relatively worse performance. Maybe in the higher-res/higher-ray-count case the HW/driver can group rays together in a more cache-friendly way. This could mean that lowering resolution does not give a linear performance increase. The potential bottleneck due to cache misses can occur both in BVH traversal and in the shading steps.
     
    #1718 manux, May 4, 2021 at 9:45 PM
    Last edited: May 4, 2021 at 10:02 PM
  19. Shortbread

    Shortbread Island Hopper
    Legend Veteran

    Joined:
    Jul 1, 2013
    Messages:
    5,273
    Likes Received:
    4,286
    If there was/is any 'additional headroom' it should be quite visible. Seeing that both are dropping down into the 50s (RT mode) tells me that any additional or "worthwhile headroom" isn't there. Capping both at 60fps was meant to guarantee 60fps, which neither system is quite capable of maintaining while RT is enabled. We can talk all day about potential max values and uncapped framerates and their relevance to the bigger picture when comparing performance metrics, but as of now, whatever additional headroom XBSX has over PS5 while running RE8 in RT mode has manifested itself as the 9% framerate advantage that Rich mentioned.
     
  20. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,182
    Likes Received:
    16,037
    Location:
    The North
    How would it be visible, though, if you clamped the headroom at 60?

    I mean, hypothetically, what if I clamped the frame rate to 55 fps? You'd see a perfectly straight line until the really bad dip, and when they both dip they're within 1 fps of each other, at 54 and 53 respectively. Is their performance gap < 3%?

    I'm not trying to say that XSX is performing better. I'm just trying to ensure there is a separation of arguments:
    a) PS5 performs more or less like XSX in the game (true; realistically there's no difference imo, 5% is not enough to really matter)
    b) PS5 performs more or less like XSX with respect to the settings they have provided and within a range of 9% (true; from an experience perspective you're unlikely to notice without a metric counter)
    c) The additional RT units that XSX has only amount to a 9% increase in ray tracing performance over PS5 and its clock-speed differential (false; the metrics do not indicate how the individual components contribute to the final output)

    You can't prove C, because at the very least you'd need to see the whole thing uncapped to really know what's going on under the hood.

    Typically XSX has been a poor performer with a lot of alpha. How do you separate its dips caused by alpha issues from PS5 not having issues with alpha? That may be a scenario where PS5 is making up time on RT by being better at rasterization, since RT is a fixed cost. I'm not saying it is; I'm just saying you can't use this to describe how well the hardware performs on RT. Dips are not equivalent (see Hitman 3).

    I mean, there really isn't enough RT computation here to really put the RT units to the test.
     
    #1720 iroboto, May 4, 2021 at 11:13 PM
    Last edited: May 4, 2021 at 11:32 PM
    PSman1700, mr magoo and BRiT like this.