Digital Foundry Article Technical Discussion [2021]

Discussion in 'Console Technology' started by BRiT, Jan 1, 2021.

Thread Status:
Not open for further replies.
  1. rabbit

    Newcomer

    Joined:
    Feb 8, 2021
    Messages:
    43
    Likes Received:
    44
    next step
    stop and think about what this will do to the gpu
    because with sfs it will process a lot less data
     
  2. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    No, I'm not. Basically I'm saying that many games look at camera movement, player position, player movement, object position, object movement etc. Based on this information, the tiles needed from the texture(s) at the correct mipmap levels are loaded and cached. Loading and caching is done per tile, not per full texture. If something is missed, the lower-detail LOD is used instead. SFS helps here by providing a neat way to collect misses and then load more tiles. However, this loading happens after the miss, so the data will inevitably arrive too late and can produce pop-in. A ton of games are already using streaming without relying on sfs. I view sfs as a thing that can help, but it doesn't replace the traditional way of figuring out what needs to be streamed.
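
    A minimal sketch of the miss-then-load behaviour described above, not any engine's actual code. The `TileCache` class, its methods and the `(mip, tile_x, tile_y)` key are hypothetical, and the sketch assumes the coarsest mip is always resident and that tile coordinates halve with each coarser mip.

```python
from dataclasses import dataclass, field


@dataclass
class TileCache:
    """Hypothetical residency cache keyed by (mip, tile_x, tile_y)."""
    resident: set = field(default_factory=set)
    misses: list = field(default_factory=list)

    def sample(self, mip, tx, ty, coarsest_mip):
        """Return the mip actually used; fall back to coarser mips on a miss."""
        m, x, y = mip, tx, ty
        while m < coarsest_mip and (m, x, y) not in self.resident:
            # Each coarser mip covers the same area with half the tiles per axis.
            m, x, y = m + 1, x // 2, y // 2
        if m != mip:
            # Record the miss so it can be streamed in later.
            self.misses.append((mip, tx, ty))
        return m

    def service_misses(self):
        """Load everything that was missed. This runs after the miss,
        so the frame that missed has already used the coarser data."""
        loaded = list(dict.fromkeys(self.misses))  # de-duplicate, keep order
        self.resident.update(loaded)
        self.misses.clear()
        return loaded
```

    The first `sample` call for a non-resident tile returns a coarser mip; only a later call, after `service_misses`, gets the requested detail, which is where pop-in would come from.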

    My gut feeling is that 10 years from now streaming will rely heavily on neural networks. What to stream feels like a problem a neural network can solve better than a human (it comes down to statistics of what is needed). Training can be done as a GAN, where one network produces a prediction of what is needed and another has the source of truth via sfs. Then you keep training and improving your network until the result is better than a heuristic/human. This allows training similar to what AlphaZero uses for go/chess. It's not easy, but it's clearly doable and I bet it will happen.
     
  3. mr magoo

    Regular

    Joined:
    May 31, 2012
    Messages:
    582
    Likes Received:
    974
    Location:
    Stockholm
    I don't think this is how it works. I may be wrong, but I got the impression that this is a scenario where SF will help you decide what to load and at what level (which mips and even which parts of the mips).


    https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html
     
    iroboto likes this.
  4. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    SFS isn't responsible for resolution after a miss. SFS provides precise data every time you want to see what you sampled from a texture. There's no guesswork in SFS: it tells you what the hardware sampled when you asked it to sample something. How developers choose to use that data, and how many times they want to see/store the results of the sample, is up to them. Effectively, the feedback system is designed to give the developer information (which MIP was sampled on a tile, and where on that tile it was sampled) to improve the accuracy of their guesses about which tiles are needed next (thus the name Sampler Feedback).

    SFS doesn't necessarily resolve pop-in. The goal of SFS is to reduce the amount of committed memory in the residency list (if you are streaming tiles) by providing better feedback on which mips you should be using and where on the tiles you are sampling, thus reducing the overhead of committed resident tiles that may not be in use.
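
    As a rough illustration of using feedback to trim committed tiles, here is a sketch of one possible policy. The function name, the `last_used` dictionary and the `max_idle_frames` threshold are all assumptions made for the example; real engines decide for themselves what to do with the feedback.

```python
def update_residency(resident, feedback, last_used, frame, max_idle_frames=2):
    """Decide which tiles to load and which to release.

    resident:  set of tile keys currently committed
    feedback:  set of tile keys the sampler reported as touched this frame
    last_used: dict mapping tile key -> last frame it was reported (mutated)
    """
    for tile in feedback:
        last_used[tile] = frame

    # Tiles that were sampled but are not resident need to be streamed in.
    to_load = feedback - resident

    # Tiles nobody has sampled for a while are candidates for release.
    to_evict = {
        tile for tile in resident
        if frame - last_used.get(tile, float("-inf")) > max_idle_frames
    }
    return to_load, to_evict
```

    The point of the sketch is the second set: knowing exactly what was sampled lets you release committed tiles that are not in use, rather than keeping them around just in case.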
     
    #1684 iroboto, May 4, 2021
    Last edited: May 4, 2021
  5. cheapchips

    Veteran

    Joined:
    Feb 23, 2013
    Messages:
    2,493
    Likes Received:
    2,665
    Location:
    UK
    In DF's Returnal piece, I appreciate them talking through the graininess of the image. I'd clocked it, but not really thought about why it had that element to it, given the temporal stability in the other parts of the image.

    (and I think the game looks great. Good trade off I think)
     
    #1685 cheapchips, May 4, 2021
    Last edited: May 4, 2021
  6. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    I think we are trying to say about the same thing about how sfs works. We seem to disagree on where it leads, though. Basically sfs samples, misses and then fetches. If possible it's better to predict which tiles are needed and fetch them ahead of time.

    What I'm claiming is that a good solution is predictive (heuristic/DNN). Prediction can be made better by feeding it data from sfs. I wouldn't replace prediction with simple just-fetch-what-was-sampled logic, especially if there is no cancel for in-flight requests. In some cases the data might no longer be needed by the time it is available. This can happen if the camera/player moves, an object rotates, an object moves out of the frustum, an object is destroyed, etc.
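
    To make the cancellation concern concrete, here is a small sketch of a request queue that drops requests which are no longer wanted before they are issued. `StreamQueue` and its methods are hypothetical names; the sketch only covers requests that are still queued, and says nothing about whether a real I/O stack can cancel a read that is already in flight.

```python
from collections import deque


class StreamQueue:
    """Hypothetical queue of tile requests waiting to be issued."""

    def __init__(self):
        self.pending = deque()

    def request(self, tile):
        """Queue a tile unless it is already waiting."""
        if tile not in self.pending:
            self.pending.append(tile)

    def drop_stale(self, still_needed):
        """Remove queued tiles that the latest prediction no longer wants.

        Returns the dropped tiles so the caller can account for them.
        """
        dropped = [t for t in self.pending if t not in still_needed]
        self.pending = deque(t for t in self.pending if t in still_needed)
        return dropped
```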

    Prediction process feels like something neural net could do very well. Of course the neural net could take sfs as input in addition to object movement/rotation, object size, camera movement etc. I really like neural net here as the ground truth is so easy to generate. GAN should be able to do a good job here,...
     
    #1686 manux, May 4, 2021
    Last edited: May 4, 2021
  7. Shortbread

    Shortbread Island Hopper
    Legend

    Joined:
    Jul 1, 2013
    Messages:
    5,632
    Likes Received:
    4,920

     
    Man from Atlantis, mr magoo and BRiT like this.
  8. Allandor

    Regular

    Joined:
    Oct 6, 2013
    Messages:
    842
    Likes Received:
    879
    Essentially this is what was in this forum for months now ;)
    One does it by brute force (via bandwidth) and one by trying to reduce what is really needed (and bandwidth .. ~2.5 GB is still massive). The latter has the advantage that the GPU, GPU memory and GPU memory bandwidth don't have to deal much with the "waste". And I really can't say how much the additional "S" (from SFS) saves in GPU resources. They are adding extra steps (filtering in one pass) not available on PC so far, and I really can't say how much of a difference this will make. This is also a step that can save a lot of GPU resources.
     
    PSman1700 and rabbit like this.
  9. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    What SFS samples and/or misses isn't necessarily the point, I think. I look at SVT systems as behaving like VT systems: the GPU operates as though all the textures are resident in memory, but in reality only so much is actually resident and the rest is paged out to slower storage. There's nothing SFS can do to change that: if the memory is resident then no swap will occur, and vice versa.

    What SFS does is provide very accurate feedback on both the locations of samples and the mip level. So you're effectively optimizing which mips you need and precisely where you need them. Consider it additional granularity on your tile-based system. Because of this granularity, you can release tiles you don't actually need until you need them. There isn't necessarily any more pop-in than with any other VT system; theoretically there should be less. You can see how it works in this Radeon video:


    Without SFS, you can see how much memory must be used to hold whole tiles not currently in view, and you can see how coarse-grained the tiles are as the demo flies through here with Unity.

    (Though you need to rewind to see the fly-through)

    If you think about tiles per mip, the closer you get to MIP 0, the more dramatically the number of tiles for the texture increases. As you go to higher mip levels, say 2-10, the number of tiles needed to represent that MIP drops dramatically. So there is always going to be a boundary point at which you need to swap to lower MIP levels (load in more tiles) or increase your MIP levels (reduce the number of tiles). Without SFS, it's very difficult to determine how to do this accurately, so the end result is often to just keep those non-visible tiles in cache, as you see with Unity.
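
    The tile counts per mip can be worked out with a few lines of arithmetic. This is a sketch under stated assumptions: a 16384x16384 texture and 256x256-texel tiles are example values picked for illustration, and real tiled resources usually pack the smallest mips together into a shared tail instead of giving each one its own tile.

```python
def tiles_per_mip(width, height, tile_w=256, tile_h=256):
    """Return the number of tiles needed for each mip, starting at mip 0."""
    counts = []
    w, h = width, height
    while True:
        tiles_x = -(-w // tile_w)  # ceiling division
        tiles_y = -(-h // tile_h)
        counts.append(tiles_x * tiles_y)
        if w == 1 and h == 1:
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return counts


counts = tiles_per_mip(16384, 16384)
print(counts[:4])  # tile counts for mips 0..3
```

    With these example numbers mip 0 alone needs more tiles than all the other mips combined, which is why picking the right mip per region matters so much for committed memory.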
     
    #1689 iroboto, May 4, 2021
    Last edited: May 4, 2021
  10. mr magoo

    Regular

    Joined:
    May 31, 2012
    Messages:
    582
    Likes Received:
    974
    Location:
    Stockholm


    I think we can all agree on XSX a single character in entire game has 0.1mm longer eyelashes = instant win
     
    RagnarokFF, Pete, JPT and 2 others like this.
  11. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    I don't agree with this. This only happens in the worst case with the worst possible implementation. It's very possible to create a better-than-brute-force algorithm to stream in textures. It can be small things like looking at the size and distance of an object and pulling in only the needed mip level(s). Or it can be better and take into account object position and rotation and only fetch the specific tiles that are going to be visible. It's not like we didn't have megatexturing etc. available a long time ago,...

    It will be interesting to see how for example unreal5 has solved streaming. I doubt they require sfs despite the whole engine being very streaming heavy,...
     
  12. mr magoo

    Regular

    Joined:
    May 31, 2012
    Messages:
    582
    Likes Received:
    974
    Location:
    Stockholm
    https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html

    Without Sampler Feedback

    For background, the general texture space shading algorithm does not require sampler feedback. The texture space shading process works like this:

    1. Consider a 3-D object or 3-D scene element which should be shaded in texture space.

    2. Allocate a target texture of suitable resolution for how close the object will tend to be relative to the camera.

    3. Determine a scheme for mapping locations on the surface of that object, in world space, to areas of that target texture. Fortunately, real scenarios often have the notion of {U, V} co-ordinates per object, and a {U, V} unwrapping map to act as this scheme.

    4. Draw the scene, targeting the target texture. For this pass, it may be desirable to simply run a compute shader instead of a conventional graphics render, using a pre-canned mapping of geometry-to-target-space with no notion of a “camera”. This pass would be the pass in which expensive lighting operations are used.

    5. Draw the scene once again, targeting the final target. The object is rasterized to the screen. Shading the object is a simple texture lookup which already contains the result of the scene’s lighting computations. This is a far less expensive rendering operation compared to the previous step.

    With Sampler Feedback
    With the general texture-space shading algorithm described in the above section one challenge is, for the first pass, knowing which areas of the target texture to shade. Naively, one could shade the entire texture, but it could be expensive and often unnecessary. Perhaps, shading the entire texture would mean shading all facets of an object, even when only about half the facets can be viewed in the scene. Sampler feedback provides a means of reducing the cost of the shading pass.

    Integration of sampler feedback with texture space shading means splitting up the first pass into two, yielding a three-pass algorithm if implemented straightforwardly. With sampler feedback, the texture-space shading operation would work like this:

    Steps 1 through 3 are the same as in the above section.

    4. Draw objects straightforwardly to the final target in screen space. For each object with which texture-space shading will be used, keep a feedback map of which areas of objects’ target texture would be updated.

    5. For objects with which texture-space-shading will be used, draw the scene targeting the objects’ target texture. This pass would be the pass in which expensive lighting operations are used. But, do not shade areas of the target texture not included in the feedback map.

    6. Draw the scene once again, targeting the final target. The object is rasterized to the screen. Shading the object is a simple texture lookup which already contains the result of the scene’s lighting computations.
    The ability to skip shading operations in step 5 above comprises a performance savings made available by sampler feedback.
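
    The feedback-map idea in the quoted steps can be sketched on the CPU as follows. This is only an illustration of the bookkeeping, not the D3D12 API: `build_feedback_map`, `shade_marked_regions`, `expensive_lighting` and the 8-texel region size are all hypothetical.

```python
def expensive_lighting(region):
    """Stand-in for the costly shading of one texture region (hypothetical)."""
    rx, ry = region
    return (rx * 31 + ry * 17) % 256


def build_feedback_map(sampled_texels, region_size):
    """Screen-space pass: record which regions of the target texture
    the final image would actually read from."""
    return {(u // region_size, v // region_size) for u, v in sampled_texels}


def shade_marked_regions(feedback_map):
    """Texture-space pass: run the expensive lighting only where the
    feedback map says the result will be used."""
    return {region: expensive_lighting(region) for region in sorted(feedback_map)}


feedback = build_feedback_map([(0, 0), (1, 1), (9, 0)], region_size=8)
shaded = shade_marked_regions(feedback)
total_regions = (64 // 8) ** 2  # assuming a 64x64 target texture
print(len(shaded), "of", total_regions, "regions shaded")
```

    Every region left out of `shaded` is lighting work that the pass without feedback would have done anyway.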
     
    Pete, pjbliverpool and Silent_Buddha like this.
  13. Dampf

    Regular

    Joined:
    Nov 21, 2020
    Messages:
    283
    Likes Received:
    474
    So it's checkerboarded, meaning it's running at 1920x2160 and 1280x1440 internally?

    The interlaced option in the PC menu should give the same experience, would it not?
     
  14. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    It's not about tile visibility though, that's sort of what I'm getting at. Here is a simple diagram I ripped off nvidia
    [​IMG]

    The sampling pattern in this case is 4 dots. Black dots are misses for visibility, green dots are hits. Imagine doing the same thing for texture sampling: you get a hit, so you load that part of the texture, great. But if you're using tiled resources you may want to know where (which of the 4 dots) scored a hit and at which MIP level the hit was scored. That way the system will tell you precisely, during render, which MIP to call on and precisely what part of that tile you actually need. Think about the upper part of that triangle where you are only scoring 2 out of 8 hits. A decision needs to be made on which MIP level and which tiles are used to represent that little bit of polygon. Without SFS you are not given this information, so what do you do? You still need to come up with a way to deal with it. So SVT systems have no issue resolving these cases, but that doesn't necessarily mean they are super efficient with tile and mip selection. They may select more tiles than necessary (you assume that a hit means all 4 dots are hits, so you load 4 tiles instead of 1). And that's where SFS helps to improve both tile and mip selection: it tells you where you sampled and at what mip level. What the engine decides to do with that information is up to the developers.
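
    The "4 tiles instead of 1" case can be illustrated with a toy comparison. The sample coordinates, the 128-texel tile size and both function names are made up for the example; this is a sketch of the reasoning, not of how the hardware records feedback.

```python
def tiles_from_hits(samples, tile_size):
    """Precise selection: only tiles containing a sample that actually hit."""
    return {(x // tile_size, y // tile_size) for x, y, hit in samples if hit}


def tiles_conservative(samples, tile_size):
    """Without per-sample information: if anything hit, assume every
    sample position hit and select all the tiles they fall in."""
    if not any(hit for _, _, hit in samples):
        return set()
    return {(x // tile_size, y // tile_size) for x, y, _ in samples}


# Four sample points straddling a tile corner; only the last one hits.
samples = [(120, 120, False), (130, 120, False), (120, 130, False), (130, 130, True)]
print(tiles_from_hits(samples, 128))     # one tile
print(tiles_conservative(samples, 128))  # four tiles
```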
     
    #1694 iroboto, May 4, 2021
    Last edited: May 4, 2021
    mr magoo and PSman1700 like this.
  15. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    One does math based on the visible geometry and the textures attached to it to figure out what is needed. Based on the size of the rendered geometry, the right mip levels can be fetched. There is also going to be a cache, so it's OK to fetch a little too much speculatively. One will also use some heuristics based on the motion and rotation of objects to decide which tiles to fetch/not fetch.
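
    One way to do that math at object level is to compare the texture's resolution with the object's projected size on screen. The sketch below assumes a simple perspective camera and a texture that spans the object's height exactly once; `estimate_mip` and all of its parameters are hypothetical, and rounding down deliberately errs towards fetching the sharper mip.

```python
import math


def estimate_mip(texture_size, object_size, distance, fov_y_deg,
                 screen_height, mip_count):
    """Estimate the finest mip worth fetching for an object.

    texture_size:  texels along the texture's height at mip 0
    object_size:   object height in world units
    distance:      distance from camera to object in world units
    """
    # Height of the view frustum at the object's distance.
    view_height = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    pixels = object_size / view_height * screen_height
    if pixels <= 0:
        return mip_count - 1
    # Each mip halves the resolution, so the texel-to-pixel ratio
    # maps to a mip level through log2.
    mip = math.floor(math.log2(texture_size / pixels))
    return min(mip_count - 1, max(0, mip))
```

    An object that fills a fraction of the screen lands on a mid-range mip, and moving it further away pushes the estimate towards coarser mips; a per-pixel answer with anisotropic filtering is a much harder problem than this object-level guess.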

    Something like meshlets could be tremendously useful for figuring this stuff out/optimizing it when sfs is not available or not wanted. Meshlets would likely make things easier because, where appropriate, we can work at meshlet level for streaming/discarding/rendering instead of caring about individual triangles.
     
    iroboto likes this.
  16. cwjs

    Regular

    Joined:
    Nov 17, 2020
    Messages:
    373
    Likes Received:
    733
    Revealing df analysis, looking forward to the whole game. Yet another where the spreadsheet released by other analysts didn't really capture the experience.
     
  17. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    Yeah, absolutely. And it's not that SVT systems today are piss poor or anything like that. They're fantastic. But if the developer is unsatisfied or wants to make further optimizations in parts of their SVT pipeline, Sampler Feedback is a tool they can use to get feedback instead of having to guess what to do with edge cases.
     
    BRiT and manux like this.
  18. mr magoo

    Regular

    Joined:
    May 31, 2012
    Messages:
    582
    Likes Received:
    974
    Location:
    Stockholm
    Transcript from the video I pasted

    “No matter how you manage texture residency, whether it's full mip chain, or partial mip chain, you have some decisions to make about what to load, and when. Like, when do you load that 4k mip0?

    Well, maybe something tried to sample from it. Well, how would you know that? Because samplers are opaque, you don't have a built-in way of knowing.

    You can try to calculate mip level selection yourself. Maybe, you can be really clever. Maybe, you can emulate the filter mode of the sampler that you're using in your shader. Maybe, you can try to be really precise about emulating it, but it's really, really hard.

    And, if you're using tiled resources, it's even harder, because it's not enough just to know what mip level you're going to end up with. You need to know where on that mip level. So, if you combine that with use of, like, anisotropic filtering or something, trying to emulate that choice of where you sample from and all the set of places that it could have sampled from, it's prohibitively hard. It's just deal-breakingly hard.

    So, enter sampler feedback. Sampler feedback is a way of opening up that black box, so you can find out what mips you tried to sample from and it goes a step further than that. It will tell you what parts of those mips. So, one thing to note is that sampler feedback is not a complete overhaul of sampling hardware, but it's an extension to it. It's a GPU hardware feature that extends existing hardware designs, and gets you something new out of what used to be that closed black box.”
     
    #1698 mr magoo, May 4, 2021
    Last edited by a moderator: May 4, 2021
    Pete and Silent_Buddha like this.
  19. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,235
    Likes Received:
    4,259
    Location:
    Guess...
    So I'm gathering from this that the 2.5x multiplier claimed by Microsoft isn't likely to be in reference to the best alternative streaming methods but rather to something more naive like the Unity example posted earlier?
     
    PSman1700 likes this.
  20. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    Silent_Buddha likes this.