Next gen lighting technologies - voxelised, traced, and everything else *spawn*

Discussion in 'Rendering Technology and APIs' started by Scott_Arm, Aug 21, 2018.

  1. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    Continue reading further down:

    This is what caused the fps dip in those vegetation stages, on top of the bounding box expansion.
     
    #461 DavidGraham, Nov 20, 2018
    Last edited: Nov 20, 2018
    OCASM likes this.
  2. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    Haha, :) 'old convictions'... as if raytracing would be something new. I'm not against RT - i criticize with the aim of improvement, and you know this. My critique still holds: Less functionality == less black boxes, but the remainder is still black. Sorry for not being an RTX expert.
     
  3. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    And my criticism of your criticism (and that of others) is that it's based on incomplete knowledge of the underlying tech.
     
    A1xLLcqAgt0qc2RyMz0y likes this.
  4. manux

    Veteran

    Joined:
    Sep 7, 2002
    Messages:
    3,034
    Likes Received:
    2,276
    Location:
    Self Imposed Exhile
    This is a good read https://devblogs.nvidia.com/vulkan-raytracing/

    I guess the acceleration structure and its traversal are black boxes. On the other hand, the user is in full control of which rays to shoot and how.

    Who in their right mind would want a vendor-specific, or even worse chip-specific, API today? If the acceleration structure is ever opened up, the only reasonable way for that to happen is for ray tracing to mature first and for a cross-vendor API to follow. To me it looks like ray tracing is not yet mature enough for that, and a lot of interesting stuff can be done with what is available.

    GDC 2019 could be where we get the first real indication of the wider developer sentiment about ray tracing and of its adoption.
     
    BRiT likes this.
  5. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    I'm sure, as you are a motivated Dev, Nvidia will kindly provide you one for free :)
     
  6. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    Knowledge about black boxed tech is mostly incomplete, hihi :)

    (sry, stopping wasting space here now, couldn't resist)
     
  7. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    This one is also interesting:
    DICE: Other quality and performance improvements in development include a hybrid rendering system that uses traditional screen-space reflections where the effect is accurate, only using ray tracing where the technique fails. This should boost performance and hopefully improve some of the pop-in issues RT reflections occasionally exhibit right now.

    As a bonus, SSR will also give back the reflections of falling leaves, which the ray tracing is lacking.
    (They were not left out because they obscure those nice reflections; falling leaves are a real performance problem for real-time ray tracing.)
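
    A rough sketch of the hybrid idea in the DICE quote, not their implementation: march the reflection ray through the depth buffer, and hand the pixel to the ray tracer only when the march leaves the screen or finds nothing. The function names, the depth-buffer layout, the step count and the `thickness` tolerance are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]

def ssr_march(depth: List[List[float]], origin: Vec3, step: Vec3,
              max_steps: int = 64, thickness: float = 0.5) -> Optional[Tuple[int, int]]:
    """March a ray through a depth buffer; return the hit pixel or None.

    None means screen-space data cannot answer the query (the ray left the
    screen or ran out of steps), so the caller should fall back to ray tracing.
    """
    height, width = len(depth), len(depth[0])
    x, y, z = origin
    dx, dy, dz = step
    for _ in range(max_steps):
        x, y, z = x + dx, y + dy, z + dz
        px, py = math.floor(x), math.floor(y)
        if not (0 <= px < width and 0 <= py < height):
            return None  # ray left the screen: SSR has no data here
        scene_z = depth[py][px]
        # Hit if the ray is at or just behind the visible surface; further
        # behind than `thickness` means it passed behind an occluder.
        if scene_z <= z <= scene_z + thickness:
            return (px, py)
    return None

def classify_pixels(depth: List[List[float]],
                    rays: Dict[Tuple[int, int], Tuple[Vec3, Vec3]]):
    """Split rays into those SSR resolved and those that need ray tracing."""
    ssr_hits, needs_rt = {}, []
    for pixel, (origin, step) in rays.items():
        hit = ssr_march(depth, origin, step)
        if hit is None:
            needs_rt.append(pixel)
        else:
            ssr_hits[pixel] = hit
    return ssr_hits, needs_rt
```

    Only the pixels in `needs_rt` would be traced, which is where the hoped-for performance gain comes from.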
     
    OCASM and DavidGraham like this.
  8. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    NVIDIA just released an article detailing its RTX implementation in a game called Justice, a Chinese MMO.

    RTX is deployed for reflections and shadows together. Reflections will be added to armor, weapons, objects, puddles, rivers, canals, and others. Shadows will be added for translucency, complex interactions and an increased number of shadow-casting lights. There are also real-time ray-traced caustics, as reflections can cast light and shadows as well.

    There are video and screenshot comparisons:
    https://www.nvidia.com/en-us/geforce/news/justice-online-geforce-rtx-ray-tracing-dlss/
     
    pharma, eloyc and OCASM like this.
  9. OCASM

    Regular

    Joined:
    Nov 12, 2016
    Messages:
    921
    Likes Received:
    874
    Caustics! :runaway:
     
  10. Jupiter

    Veteran

    Joined:
    Feb 24, 2015
    Messages:
    1,583
    Likes Received:
    1,198
    I looked into it. Ray-traced effects like reflections are expensive because shading the hit point is as complex as shading a screen pixel. The classic screen-space methods reuse already-shaded pixels from previous frames, so the shading is free. The transfer to the ray-tracing data structure via compute shader also matters. General-purpose shading compute power cannot be replaced by anything else. Better handling of divergence and resources benefits all areas, so from my point of view it is not completely true that one has to be sacrificed for the other.
     
    #471 Jupiter, Nov 21, 2018
    Last edited: Nov 21, 2018
  11. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I also assume this is the bottleneck with BFV. For example, at each hit point they need to check all affecting shadow maps to calculate shading.
    Curious how much they gain from fixing the bbox bug and improving foliage, but likely this is the reason they can only trace 20% of screen pixels, while early on we heard 'about 4-8 rays per pixel'. So, reflections are expensive because of shading requirements.
    Worth mentioning again that texture space shading could eliminate this completely, if we sacrifice 'reflections of reflections'.

    We will not see this soon, but i'm optimistic in the long run, even for future mid-range GPUs...
     
  12. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    15,134
    Likes Received:
    7,680
    Right now the Nvidia drivers don't allow you to run compute shaders and ray-tracing shaders in parallel, so I imagine that's the barrier to their planned implementation of doing reflections in screen space, but spawning ray-traced shaders in the cases where SSR would fail because the ray is reflected off screen.

    Terminology is going to get weird distinguishing between the rays cast for SSR via compute shaders vs rays cast from DXR or RTX.
     
    BRiT and OCASM like this.
  13. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I'm still thinking about replacing my compute raytracing with RTX, and after watching the Remedy video from the other thread, i did a rule-of-thumb comparison based on their performance numbers.
    The result: i get about the same RT performance on a Fury X as they mention for Turing. The comparison is not fair. My geometry is simpler (surfel hierarchy, smallest surfels 10cm), but my diffuse rays also have infinite length - not just a short range. Scene size and coarse complexity are similar.
    (I'd still need to run half of my stuff beside RTX, so using RTX here would indeed cause a slow down. Reflections remain my first application for it.)

    I'm saying this to substantiate the reasons for my critique, so you understand why i say we no longer need new fixed function stuff: just improve compute and let us implement what we need in the best way possible.
    Of course you'll still doubt all this, but you should consider you might be wrong yourself.

    ... directed to all who think the purpose requires fixed function and justifies black boxes! (not to you personally)


    That said (again... won't stop until i have my work generation shaders :) ), the video has evidence shading is indeed the bottleneck for reflections.
    They say they need 1ms for tracing, and 7ms for shading the hit. Exactly the 8ms we see in BFV.
    That's awesome! 1ms is better than i expected. Photorealism is near, guys, it's coming... :) (At least a huge step towards it)
     
    Heinrich4, jlippo, eloyc and 3 others like this.
  14. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    How efficient is the triangle intersection in compute? Could some large, low-latency eDRAM/SRAM improve tests or would they still be too broad to fit any size cache and you'll always be limited by bus transfers? From looking at your compute performance, what are the bottlenecks and could they realistically be better addressed through fixed hardware rather than compute tweaks?
     
  15. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    So about the imminent release of RTX shadows in Tomb Raider: having played through the game on max settings, most shadows are very low resolution by today's standards, many small details or geometry don't receive/cast shadows as a result, and shadows flicker a lot! Most small and point lights don't cast shadows, dynamic lights (fire effects, camp fire, explosions) don't either, and flashlights don't cast shadows on every object. And of course contact hardening/percentage-closer filtering is completely absent.

    What we need ray traced shadows to do is this:

    -Add percentage-closer filtering/contact hardening to shadows (guaranteed)
    -Increase shadow resolution and force small geometry to cast shadows (guaranteed)
    -Fix shadow flickering (not 100% guaranteed)
    -Make every point or small light cast shadows (announced but not guaranteed on every light)
    -Force every dynamic light (fire effects, explosions, flares, gun muzzle flashes) to cast shadows (not 100% guaranteed)
    -Fix the flashlight shadows to include every 3D object (not 100% guaranteed?)

    I think if all of these points are covered, we should have a pretty solid shadow system in Tomb Raider. As for the overall visual impact, it wouldn't completely transform the look of the game: it would definitely do so for certain scenes, but not across all of them, as the impact of shadows varies from scene to scene.
     
    #476 DavidGraham, Nov 22, 2018
    Last edited: Nov 22, 2018
    vipa899, eloyc and OCASM like this.
  16. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I don't know about triangles because i use discs instead, and bounding spheres instead of boxes. (Mainly to make things smaller so they fit into LDS. A ray-box test would be very fast in compute; ray-triangle is quite a bunch of instructions, and i would assume it benefits from tailored instructions, but personally i don't need them for GI.)
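
    For reference, a minimal sketch of the two tests mentioned above, a ray against a bounding sphere and a ray against a disc. This is textbook geometry on plain tuples, not the GPU code from the post, and it assumes the ray direction is normalized. All function names are made up.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def ray_hits_sphere(origin, direction, center, radius):
    """True if a ray with unit-length direction touches the sphere."""
    oc = sub(center, origin)
    t_closest = dot(oc, direction)            # sphere center projected onto the ray
    dist_sq = dot(oc, oc) - t_closest * t_closest
    if dist_sq > radius * radius:
        return False                          # the ray's line passes outside the sphere
    # Sphere is ahead of the origin, or the origin is already inside it.
    return t_closest >= 0.0 or dot(oc, oc) <= radius * radius

def ray_disc_distance(origin, direction, center, normal, radius):
    """Distance along the ray to the disc, or None if the disc is missed."""
    denom = dot(direction, normal)
    if abs(denom) < 1e-8:
        return None                           # ray runs parallel to the disc plane
    t = dot(sub(center, origin), normal) / denom
    if t < 0.0:
        return None                           # plane is behind the ray origin
    hit = (origin[0] + t * direction[0],
           origin[1] + t * direction[1],
           origin[2] + t * direction[2])
    offset = sub(hit, center)
    return t if dot(offset, offset) <= radius * radius else None
```

    Both tests need only a handful of multiply-adds and one comparison against the squared radius, and a sphere or disc takes fewer floats to store than a box or triangle, which fits the point about keeping data small enough for LDS.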

    I'm not sure about RAM, because at the moment i can not see how bandwidth limits affect me. I made the implementation in Vulkan and OpenCL. VK had no profiling tools at the time, and CodeXL for CL does not help here. I assume i see bandwidth limits when implementing 4*4 or 8*8 environment maps, but i have not ported this to GPU yet. My environment maps would not fit there anyways (200-500 MB or more? Not thought about compression...). But likely i would benefit from such RAM for various worklists. Writing them takes half of the time of my tree traversals.

    About performance, i see a speedup of 2 for VK vs. CL, resulting from prerecorded command buffers and indirect dispatches available with VK. But if i reduce the workload to updating just 10% (enough for a dynamic scene), i see only a speedup of 2, not 10. Also, on CPU the raytracing takes 90% of the time (expected), but on GPU it takes only 30-40% (unexpected). Tiny workloads, mostly for work generation, eat up performance here, and most likely the cause is zero-work dispatches and unnecessary barriers in the command buffers. This is why i want to generate the work and barriers directly on GPU. (RTX likely has fine-grained scheduling under the hood which is already beyond my needs but not exposed. I don't know what AMD already has.) Async compute will also help, but it does not allow for fine-grained small workloads - sync across queues quickly kills the benefit. So i plan to do async environment map or rendering work...
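
    A back-of-the-envelope reading of those numbers, under an assumed model that is not from the post: frame time is a fixed overhead (dispatches, barriers) plus work that scales with the updated fraction, and the second 'speedup of 2' is taken to be the gain from cutting the update to 10%. The function name and the model itself are assumptions.

```python
def fixed_overhead_ratio(work_fraction: float, observed_speedup: float) -> float:
    """Solve (o + 1) / (o + work_fraction) = observed_speedup for o.

    o is the fixed overhead, measured in units of the full scalable work.
    """
    if observed_speedup <= 1.0:
        raise ValueError("model needs a speedup greater than 1")
    return (1.0 - observed_speedup * work_fraction) / (observed_speedup - 1.0)

# Updating 10% of the work while seeing only a 2x speedup:
overhead = fixed_overhead_ratio(work_fraction=0.1, observed_speedup=2.0)
share_of_full_frame = overhead / (overhead + 1.0)
print(f"overhead = {overhead:.2f} x scalable work, "
      f"{share_of_full_frame:.0%} of a full update")
```

    Under this simple model the fixed part is about 0.8 times the scalable work, a bit under half of a full update, which is at least consistent with dispatch and barrier overhead being a large part of the cost.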

    Of course my stuff would run faster if it would be fixed function, but it is very complex, so only the bottlenecks would make sense. Unsurprisingly that's traversal and tracing - but both are totally different from classical raytracing, and other algorithms would not benefit. No hard shadows or sharp reflections.
    So no, even if successful i would not want fixed function. Maybe some kind of ASICs for the future... would make more sense i guess.

    My main problem is not performance at all - it's good even on old GCNs. The problem is making automated tools for seamless global parametrization - that's very hard and in research for a decade (== Quadrangulation, often used for finite element analysis). I'm on par with current state of the art, but for games we want very large quads (or texels) representing the top levels of a LOD hierarchy (or mip maps, if you want). So that's the remaining open problem i have to solve before i can start to work on a renderer supporting game models. I hope i can do this within next year... last graphics work was a GL ES1.1 mobile game... so some stuff to learn about rendering since then :)
     
    Heinrich4, vipa899, pharma and 2 others like this.
  17. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I have similar thoughts here as for reflections:
    With texture space shading and stochastic updates RT can become even cheaper than shadow maps. (With shadow maps you'd still need to render all of them per frame, even if you want only stochastic updates. :( Only RT can take the full benefit here.)

    But after all that praise i have to mention the drawbacks too:
    The surface we need to shade increases by a factor of... about 8? So even if we update only 10% there is no win, just higher memory requirements.
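
    To put rough numbers on that trade-off, taking the guessed factor of 8 and the 10% update rate at face value (the function names are made up for illustration):

```python
def relative_shading_cost(surface_factor: float, update_fraction: float) -> float:
    """Shaded samples per frame relative to shading each visible pixel once."""
    return surface_factor * update_fraction

def break_even_update_fraction(surface_factor: float) -> float:
    """Update fraction at which texture-space shading costs the same as screen-space."""
    return 1.0 / surface_factor

cost = relative_shading_cost(surface_factor=8.0, update_fraction=0.1)
limit = break_even_update_fraction(8.0)
print(f"relative cost {cost:.2f}, break-even at {limit:.1%} updated per frame")
```

    With these inputs the cost comes out at roughly 0.8 of the screen-space shading work, so the saving is marginal while the memory footprint grows by the full factor.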

    The only solution here is to reduce shading resolution, and with it the visible texture resolution.
    Personally i think computer graphics are too sharp anyway, so that's no big problem, but some people want 16K textures and flickering, pixel-crawling, high-frequency details that hurt your eyes and that a real camera can never produce :)
    I remember the outcry about Quantum Break. To convince such people, we need a real big leap (and a sharpening filter).

    So the solution i propose is far from certain to be good, and likely nobody will try it soon; people will keep working around the problem with other optimizations.
     
    Heinrich4, egoless, jlippo and 2 others like this.
  18. Voxilla

    Regular

    Joined:
    Jun 23, 2007
    Messages:
    832
    Likes Received:
    505
    That's why rasterization makes use of mip-mapping and bi/tri/anisotropic filtering, right?
    In texture space you still can do this to reduce cached texture memory requirements in the distance.
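
    A small sketch of how mip selection cuts the number of texels that have to be shaded and cached in the distance. It uses the usual log2-of-footprint rule; the function names and the square power-of-two texture are assumptions.

```python
import math

def mip_level(texels_per_pixel: float, max_level: int) -> float:
    """Mip level whose texels are roughly one screen pixel in size.

    texels_per_pixel is the footprint along one axis at mip 0.
    """
    if texels_per_pixel <= 1.0:
        return 0.0                      # magnified or 1:1, use the base level
    return min(float(max_level), math.log2(texels_per_pixel))

def texels_at_level(base_resolution: int, level: int) -> int:
    """Texel count of a square texture at the given integer mip level."""
    side = max(1, base_resolution >> level)
    return side * side

# A 1024x1024 surface seen at 4 texels per pixel only needs mip 2.
level = int(mip_level(4.0, max_level=10))
print(level, texels_at_level(1024, level), texels_at_level(1024, 0))
```

    Each mip level quarters the texel count, so distant surfaces cost a small fraction of their base-level memory and shading work.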
     
    jlippo likes this.
  19. eloyc

    Veteran

    Joined:
    Jan 23, 2009
    Messages:
    2,551
    Likes Received:
    1,705
    I'm glad to read your enthusiasm. Good luck with your work and please don't forget to share your findings!

    ------------------------------------------------------------------------------------------
    upload_2018-11-23_7-20-26.png
    <3:-D
     
    jlippo and OCASM like this.