Thoughts on some rendering ideas I had

Discussion in 'Rendering Technology and APIs' started by Infinisearch, Apr 21, 2014.

  1. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    I've had some thoughts and ideas regarding rendering over the years, and I was hoping the developers and armchair developers here would chime in on some of them.

    1. Around 2003 on gamedev.net I made a post about z-compositing that I'd like to rehash here and now.
    The basic idea was to render the static geometry of the scene at one frame rate and the dynamic geometry at another, then composite the two at the higher frame rate. For a first-person camera, static geometry at 60fps and dynamic at 30; for a third-person camera, 30 for the static and 60 for the dynamic. (A rough sketch of the compositing pass follows the questions below.)

    a. Assuming you can keep up the frame rates and it's a game with no overly fast-moving objects, camera, or lights, can you think of any rendering artifacts? Do you think players would complain of something akin to micro-stutter, or of something else bothering them?
    b. Related to a: given modern rendering techniques, do you feel it would be worth the trouble? In addition, at what stage would you do the compositing and why, i.e. before or after lighting and shadowing?
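
    To make the idea concrete, here is a minimal, untested sketch of what the compositing pass might look like, assuming each layer was rendered to its own color and depth target (all resource and function names are made up):

    Code:
    // Composite pass: full-screen triangle merging the static layer (rendered at
    // one rate) with the dynamic layer (rendered at another) by comparing depth.
    Texture2D<float4> g_StaticColor  : register(t0);
    Texture2D<float>  g_StaticDepth  : register(t1);
    Texture2D<float4> g_DynamicColor : register(t2);
    Texture2D<float>  g_DynamicDepth : register(t3);

    struct PsOut
    {
        float4 color : SV_Target0;
        float  depth : SV_Depth;   // keep a merged depth around for later passes
    };

    PsOut CompositePS(float4 pos : SV_Position)
    {
        int3 p = int3(pos.xy, 0);
        float zs = g_StaticDepth.Load(p);
        float zd = g_DynamicDepth.Load(p);

        PsOut o;
        // Conventional depth: smaller z is closer to the camera.
        if (zd < zs)
        {
            o.color = g_DynamicColor.Load(p);
            o.depth = zd;
        }
        else
        {
            o.color = g_StaticColor.Load(p);
            o.depth = zs;
        }
        return o;
    }

    A reversed-Z setup would simply flip the comparison.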

    2. I kind of skipped over how HDR was implemented until recently. So far I've only looked into tone mapping, so essentially all I really know is that you sometimes need to calculate the average luminance. So I was wondering whether it is possible, given either the current hardware or the current hardware/API combination, to compute the sum portion of the average as you do your light accumulation?
    It seems rather inefficient to do it after the fact.

    3. I've always had a thing for IDs: primitive/patch IDs, object/sub-object IDs, and so on. It seems some rendering techniques require multipassing (light indexed), others have an option for it (clustered), and some ("classical deferred" with a z-only pass) seem like they might benefit from it but wind up with too high a geometry load. So I was wondering whether anybody has tried, and whether it is currently possible (API/hardware), to do a pass writing Z, primitive ID, and whatever other ID is necessary for the specific rendering technique, in order to generate a sorted list of visible primitives to be used as the basis of what to draw in subsequent passes? (A rough sketch of the two passes I have in mind follows below.)
    Essentially Z would be read or read-modify-write, and the framebuffer output would be write-only with no texture reads to accomplish the feat. So speed would be pretty close to a Z-only pass, and the subsequent pass/passes would only try to render what is visible, with the z-buffer handling the rest. What do you think?
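
    Here is that sketch, untested; the ID render target format (R32G32_UINT), the per-object bit layout, MAX_PRIMS_PER_OBJECT, and all names are made-up assumptions:

    Code:
    // Pass 1: depth + ID pass. One draw per object; writes (object id, primitive id)
    // to an R32G32_UINT target. No texture reads, so cost should stay near a Z-only pass.
    cbuffer PerObject : register(b0)
    {
        uint g_ObjectId;
    };

    uint2 IdPassPS(float4 pos : SV_Position, uint prim : SV_PrimitiveID) : SV_Target0
    {
        return uint2(g_ObjectId, prim);
    }

    // Pass 2: scan the surviving IDs and mark visible primitives in a bitfield.
    #define MAX_PRIMS_PER_OBJECT 65536       // made-up per-object budget, 1 bit per primitive

    cbuffer Screen : register(b1)
    {
        uint2 g_ScreenSize;
    };

    Texture2D<uint2>    g_IdBuffer    : register(t0);
    RWByteAddressBuffer g_VisibleBits : register(u0);   // cleared to zero beforehand

    [numthreads(8, 8, 1)]
    void MarkVisibleCS(uint3 dtid : SV_DispatchThreadID)
    {
        if (any(dtid.xy >= g_ScreenSize))
            return;

        uint2 id = g_IdBuffer[dtid.xy];
        if (id.x == 0xFFFFFFFF)              // ID target cleared to this for "no geometry"
            return;

        uint bitIndex = id.x * MAX_PRIMS_PER_OBJECT + id.y;
        uint prev;
        g_VisibleBits.InterlockedOr((bitIndex >> 5) * 4, 1u << (bitIndex & 31), prev);
    }

    Turning the bitfield into a compacted/sorted list of primitives would still need a further prefix-sum pass.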

    Thanks in advance for any comments, criticisms, advice, and analyses.
     
  2. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    859
    Likes Received:
    262
    Putting a copied z-buffer back in place is a bit tricky; blitting isn't available. You can use a full-screen triangle with z write-through (no compare, just write) and pass the old z value from a Texture2D via Load() out to SV_Depth.
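
    Roughly like this untested sketch (depth test set to ALWAYS or disabled, depth writes on; names are illustrative):

    Code:
    Texture2D<float> g_SavedDepth : register(t0);

    // Single-triangle "full-screen quad" generated from SV_VertexID.
    float4 FullscreenVS(uint id : SV_VertexID) : SV_Position
    {
        float2 uv = float2((id << 1) & 2, id & 2);            // (0,0), (2,0), (0,2)
        return float4(uv * float2(2, -2) + float2(-1, 1), 0, 1);
    }

    float RestoreDepthPS(float4 pos : SV_Position) : SV_Depth
    {
        // Load() avoids any filtering; the stored z is written back untouched.
        return g_SavedDepth.Load(int3(pos.xy, 0));
    }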
     
  3. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    782
    Likes Received:
    22
    Location:
    Wroclaw, Poland
    1. Would probably lead to weird perceived jittering. Interesting idea though. This could be tested easily (and the technique could be proved/disproved) by doing world/actor updates at different frequencies but rendering everything at the higher one.
    3. This would work better as a dedicated piece of silicon than a general purpose computation IMO.
     
  4. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,344
    Likes Received:
    444
    Location:
    Finland
    I'm quite sure it would be more feasible to have some sort of re-projection scheme for the background and composite the characters on top.
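
    Very roughly, a rotation-only warp (in the spirit of time-warping) might look like the untested sketch below; translation, and therefore parallax, is ignored, and the matrix/resource names and the column-vector mul() convention are assumptions:

    Code:
    cbuffer WarpConstants : register(b0)
    {
        float4x4 g_InvNewProj;    // new frame: clip -> view
        float4x4 g_NewToOldView;  // rotation-only delta: new view space -> old view space
        float4x4 g_OldProj;       // old frame: view -> clip
        float2   g_ScreenSize;
    };

    Texture2D    g_OldBackground : register(t0);
    SamplerState g_LinearClamp   : register(s0);

    float4 ReprojectPS(float4 pos : SV_Position) : SV_Target0
    {
        // Build a view-space ray for this pixel in the new camera.
        float2 uv  = pos.xy / g_ScreenSize;
        float2 ndc = uv * float2(2, -2) + float2(-1, 1);
        float4 p   = mul(g_InvNewProj, float4(ndc, 1, 1));
        float3 dir = p.xyz / p.w;

        // Rotate the ray into the old camera's frame and re-project it there.
        float4 oldClip = mul(g_OldProj, mul(g_NewToOldView, float4(dir, 0)));
        float2 oldUv   = (oldClip.xy / oldClip.w) * float2(0.5, -0.5) + 0.5;

        return g_OldBackground.SampleLevel(g_LinearClamp, oldUv, 0);
    }

    The characters would then be rendered on top of the warped background with their own depth.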
     
  5. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    If you do compute shader based tiled lighting, you can use LDS (thread block shared) atomics (atomic add) to accumulate the total light intensity per thread block, and when the thread block is finished the first thread of the block does one global atomic add (to a shared memory location). This is very efficient.
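
    As a rough, untested sketch of that scheme (D3D11 atomics only take int/uint, so the luminance goes through a fixed-point conversion; the scale factor and all resource names are assumptions):

    Code:
    #define LUMA_FIXED_POINT_SCALE 256.0    // chosen so the 32-bit sum cannot overflow

    Texture2D<float4>   g_LitColor : register(t0);  // stand-in for the in-register lighting result
    RWByteAddressBuffer g_LumaSum  : register(u0);  // single uint, cleared to 0 each frame

    groupshared uint gs_TileLumaSum;

    [numthreads(16, 16, 1)]
    void TiledLightingCS(uint3 dtid : SV_DispatchThreadID, uint gi : SV_GroupIndex)
    {
        if (gi == 0)
            gs_TileLumaSum = 0;
        GroupMemoryBarrierWithGroupSync();

        // ...per-pixel light accumulation would happen here; we just read its result.
        float luma = dot(g_LitColor[dtid.xy].rgb, float3(0.2126, 0.7152, 0.0722));

        // LDS (groupshared) atomic add: stays on-chip, one per thread.
        InterlockedAdd(gs_TileLumaSum, (uint)(luma * LUMA_FIXED_POINT_SCALE));
        GroupMemoryBarrierWithGroupSync();

        // One global atomic add per thread block.
        if (gi == 0)
        {
            uint prev;
            g_LumaSum.InterlockedAdd(0, gs_TileLumaSum, prev);
        }
    }

    Dividing the final sum by the pixel count (and undoing the fixed-point scale) gives the average; a log-average would just need a log() before the conversion.
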
    However writing z-values out of a shader disables the GPU depth compression, and that hurts your rendering performance. I don't think there's a good way to do this on PC DirectX without any performance loss.
     
  6. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    Actually I'm not so sure about jittering, but rather jumping, though maybe we're talking about the same thing. In fact I'm more concerned about temporal occlusion artifacts. I'm not that good at picturing things anymore, and picturing the dual frustums in the various geometric configurations that would lead to artifacts is beyond me at times.

    Dedicated silicon is most likely unnecessary; here are some concerns off the top of my head.
    1. I'm not sure which would be faster: generating the list as you render, or as a post-process. Dedicated hardware might speed things up if you do it as you render.
    2. You might need multiple sorts and "bins" on the same list, which I'm not sure is possible as a fixed-function implementation, and if you go programmable you might as well use the current programmable hardware to do it, if it maps well to it.
    3. Multipassing is only done on cameras, so it would most likely be a waste of space as a dedicated piece of silicon.
     
  7. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    When I looked into shader model 5, I remember looking into atomic adds, but it said they were limited to int and uint. Is this different for compute shaders? (Looking it up now... thanks for the lead.)

    EDIT - Yeah, it seems I missed the part where it says "and shared memory variables", thanks once again.
     
    #7 Infinisearch, Apr 24, 2014
    Last edited by a moderator: Apr 24, 2014
  8. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    The day before you posted this I saw a news item over on ExtremeTech about an Oculus Rift technique called time warping. It had a link to a post by Carmack on AltDevBlogADay that explained some latency reduction techniques; it explained that time warping uses reprojection to that end (I haven't read the whole thing yet). I'm going to think about it, thank you for your input.

    One thing I was considering: if technique 3 were possible, I would try to combine it with 1, creating one list of triangles for static geometry and one for dynamic geometry.
     
  9. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    859
    Likes Received:
    262
    That's unfortunate, yes. But because z-compression stores at most 3 plane equations + a map, and those aren't there when a manual z-value is pushed, it's really the only way it could be. Waiting for a block to be filled, whenever that could be inferred, and then figuring out whether there are exactly three different derivatives in it is quite impractical. Just in case someone wonders why. :razz:

    Actually, I'm not sure anymore whether the question above is about the frequency of parts of the simulation or the frequency of parts of the graphics. I could imagine that games working with parallax can indeed function with stitched-together "backups of z-planes".
     
  10. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    You are talking about NVIDIA, right? AMD hardware is different. Are there actually any public documents about the depth compression hardware of current NVIDIA and AMD PC GPUs?

    It seems that Mantle allows low-level access to the GCN HTILE buffer (slide 31):
    http://www.slideshare.net/DevCentra...4-with-mantle-by-johan-andersson-amd-at-gdc14

    Filling the HTILE buffer using a compute shader is an efficient way to handle this problem. Too bad there's no cross-platform API that allows anything like this.
     
  11. Ethatron

    Regular Subscriber

    Joined:
    Jan 24, 2010
    Messages:
    859
    Likes Received:
    262
    I think only a few people know the exact algorithms/encodings that have been put into the chips. The only interesting bits I've found are in Efficient Depth Buffer Compression. Storing Z in plane form seems generally reasonable though.
    There is very little information about HTILE in the Northern Islands documentation. At least it tells you that Cayman has 8x8 tiles, and how big the on-chip tile buffer is.
     
  12. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    If the info in the Battlefield slides is correct, Mantle allows access to HTILE, and thus the Mantle documentation should contain the exact details of the HTILE data structure. However, it seems that the Mantle SDK is still not publicly available.
     
  13. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,560
    Likes Received:
    157
    Location:
    In the Island of Sodor, where the steam trains lie
    Sort of an improvement on something like Resident Evil?
     
  14. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,172
    Location:
    La-la land
    The 1st person example (30fps actors/60fps backgrounds) would be fairly pointless, as backgrounds in a game inevitably end up consuming far more rendering resources than actors in most cases (unless you hugely unbalance the ratio of resources spent on the two, leading to some very strange, out-of-place-looking visuals...)

    There just wouldn't be much gained by rendering only a small sub-set of the screen at a lower framerate.
     
  15. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    First I'd like to thank Ethatron for his responses, thank you for your time and input.

    I guess you could say that, and although I did play/watch someone play the original on a friend's PS1, it wasn't on my mind at the time. What happened was that I had a meager machine at the time (P3 800, Win98SE, 128MB or 192MB RAM, and either a 32MB Radeon 7200 (DX7) or, since I had recently upgraded, a 128MB Radeon 9550 to play with DX9 shaders), and the way I saw it was that if I could make my pet project run well on my machine, it would run great on something better. So there were all the usual suspects at the time, but I wasn't happy with that and wondered if there was something more I could do. At the time I could only think of three things on my own:

    1. Render front to back; nixed because it broke batching, which was the common wisdom I had learned at the time. (Although I still don't know if the batching was primarily to reduce draw calls, or to avoid hardware inefficiency due to state changes, or I suppose both.)
    2. Number 3 from above: if I couldn't render front to back, try to defer lighting and texturing some other way. I couldn't figure out how to do it at the time, don't remember my exact reasoning as to why, and can't recreate my thought pattern from back then.
    3. Number 1 from above: IIRC my reasoning was that at the time static geometry had fewer triangles than dynamic geometry, so I'd reduce my geometry load over every two frames in the first-person case. In addition, I think I had just started learning about projective texturing/shadow maps and figured that if I were to implement them (I was a fan of stencil shadows at the time) I could render my shadow map every other frame (if there were no artifacts), namely the frame with the smaller geometry load. In all honesty I'm not sure if I thought that last bit up at the time, and my thoughts on the subject may be confused.

    Anyway, I guess I've always had an interest in less orthodox solutions to problems. Wow, did this post turn out way longer than I thought. Not gonna waste all this typing though... posting the unnecessarily long post with unnecessary information anyway.
     
  16. milk

    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    2,999
    Likes Received:
    2,567
    I remember a game doing exactly that in the DOS era. It was a sequel to an RPG/adventure/action game. The first title had 3D characters on static environments à la RE, but the sequel started rendering those environments at runtime, only updating them every so often. Does anybody remember its name?
    It was about saving your planet from some alien thing, and you had a baby to feed in the beginning haha. That's all I remember...
     