Next gen lighting technologies - voxelised, traced, and everything else *spawn*

Discussion in 'Rendering Technology and APIs' started by Scott_Arm, Aug 21, 2018.

  1. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,889
    Likes Received:
    4,536
    WCCFTech - Scorn Developer Interview
    May 29, 2020


    https://wccftech.com/scorn-intervie...-system-trailer-was-running-on-an-rtx-2080ti/
     
  2. Dictator

    Regular

    Joined:
    Feb 11, 2011
    Messages:
    682
    Likes Received:
    3,969
    milk, PSman1700, DavidGraham and 2 others like this.
  3. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,104
    Likes Received:
    16,896
    Location:
    Under my bridge
I wonder if that preference is fuelled by legacy thinking, though? The moment one thinks of a problem to solve now, one thinks of using data in RAM, simply because fast storage hasn't been an option. It's similar to thinking of a problem in terms of a single thread instead of multiple when we moved over to multicore, a shift that was forced onto devs. As discussed before, all data pools are caches between storage and the CPU registers. As a pool gets faster, the need for intermediate tiers decreases, so we can ask the same question of any tier: would a dev prefer more L2 cache and less DRAM? Okay, the sizes and deltas are very different in that case, but as storage moves closer to RAM in terms of delivery, RAM can be looked at less like working storage and more like a cache for the SSD data, at which point the whole mindset for game design might shift.

    Another big move this way is the shift away from Object Oriented development to Data Oriented. When you think in terms of data and stream-process everything, streaming the game data becomes part of the intrinsic design philosophy.
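    A toy illustration of that shift (all names invented for illustration, not from any engine): the object-per-entity layout scatters state across heap objects, while the data-oriented layout keeps each field in a flat array that chunks naturally for streaming and bulk processing.

```python
# Hypothetical sketch: object-oriented vs data-oriented entity storage.
# The OO version walks a list of heap objects; the DO version makes one
# pass over contiguous per-field arrays, which is trivially chunkable
# for streaming and SIMD/GPU-style processing.

class EntityOO:
    def __init__(self, x, vx):
        self.x = x
        self.vx = vx

def step_oo(entities, dt):
    for e in entities:          # pointer-chasing, one object at a time
        e.x += e.vx * dt

def step_do(xs, vxs, dt):
    # One pass over two flat arrays of the same length.
    return [x + vx * dt for x, vx in zip(xs, vxs)]

entities = [EntityOO(float(i), 1.0) for i in range(4)]
step_oo(entities, 0.5)
xs = step_do([0.0, 1.0, 2.0, 3.0], [1.0] * 4, 0.5)
```

    Both produce the same positions; the point is that the second layout has no per-entity object to page in, only arrays to stream.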

    Store all these developer opinions away now, and we'll compare them to developer opinions at the end of the generation. ;) Maybe next-gen hardware predictions will start with 16 GB of RAM again, only faster, and 100 GB/s SSDs. ;)
     
    w0lfram and milk like this.
  4. w0lfram

    Regular

    Joined:
    Aug 7, 2017
    Messages:
    304
    Likes Received:
    59
    Or just a PCIe Gen4 NVMe socket right on the GPU board.

    Drop in 1TB of "storage memory"...
     
  5. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    NVDIMMs have potential for even better bandwidth. For the moment it's tech marketed and priced at enterprise, but consoles are a big enough market that they can knock the margins down.

    Of course, completely custom, extremely wide-bus flash directly on an interposer with the CPU/GPU could do even better, but that's a significant investment.
     
  6. Ext3h

    Regular

    Joined:
    Sep 4, 2015
    Messages:
    428
    Likes Received:
    497
    Demanding more RAM is, IMHO, going in the wrong direction. That request is built on the premise that you could pre-load all assets if you just had enough RAM, at a time when artists already consider it viable to dump 100 GB or more of assets onto the customer's hard drive.

    Streaming assets has to be the way to go, but increasing bandwidth to storage can't be the whole solution either. Slow storage is going to stick with us for at least another 2-3 years before the HDD-based systems currently in use are ultimately phased out. It will take even longer until early SSD adopters with SATA II links have replaced their former enthusiast systems. It will be half a decade before you can consider NVMe (or equivalent tech) the de facto baseline.

    So streaming concepts have to be devised with low available bandwidth in mind. What springs to mind is shifting asset decompression from the CPU to the GPU: lossless compression, and especially compression beyond the block-wise, transparent texture compression formats we have gotten so used to. Even with slow storage, you can afford to prefetch a couple of hundred MB of compressed data if that means being able to quickly deliver a multiple of that in decompressed assets.

    About 3 years ago (https://arxiv.org/pdf/1606.00519.pdf), academic research had reached the point where LZ77-like decompression speed on the GPU exceeded PCIe 3.0 x16 bandwidth. But somehow nothing made it from academic research into production. Research by other parties is still ongoing (http://www.bncss.org/index.php/bncss/article/view/143), showing promising results and closing in on 100 GB/s decompression speed for highly compressible resources.
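    The arithmetic behind that trade-off is simple. A back-of-the-envelope sketch, with purely illustrative numbers (not figures from either paper): effective asset delivery is storage bandwidth times compression ratio, capped by the decompressor's throughput.

```python
# Illustrative numbers only: if storage delivers compressed data and the
# GPU decompresses it faster than the link can feed it, effective asset
# bandwidth is storage bandwidth * compression ratio, capped by the
# decompressor's own throughput.

def effective_bandwidth(storage_gbps, ratio, decompress_gbps):
    """Delivered *uncompressed* GB/s for a given storage link speed,
    compression ratio, and GPU decompression throughput."""
    return min(storage_gbps * ratio, decompress_gbps)

# A 0.5 GB/s HDD-era link with 4:1 compression yields only 2 GB/s...
hdd = effective_bandwidth(0.5, 4.0, 100.0)
# ...while a 5 GB/s NVMe link at the same ratio yields 20 GB/s, still
# far below the ~100 GB/s decompression ceiling the research reports.
nvme = effective_bandwidth(5.0, 4.0, 100.0)
```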

    Yet the tools provided by our IHVs still all revolve around plain old lossy texture compression.

    It's probably time to rethink the residency of resources, and to treat uncompressed resources, even in GPU memory, as transient only. It may even be worth considering treating a portion of VRAM as a prefetch cache for compressed assets, with the goal of ideally being able to decompress missing assets on the fly, within the same frame.
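    One way to picture that transient-residency idea is a fixed-size compressed-asset cache in VRAM. A minimal sketch, assuming a simple LRU eviction policy (all names and sizes invented for illustration):

```python
from collections import OrderedDict

# Hypothetical sketch: a fixed-size VRAM region holding *compressed*
# assets with LRU eviction. Decompressed copies are treated as
# transient and rebuilt on demand rather than kept resident.

class CompressedAssetCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()  # asset_id -> compressed size

    def request(self, asset_id, size):
        """Returns True on a cache hit; on a miss, evicts LRU entries
        until the compressed asset fits, then inserts it."""
        if asset_id in self.entries:
            self.entries.move_to_end(asset_id)  # refresh LRU order
            return True
        while self.used + size > self.capacity and self.entries:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        self.entries[asset_id] = size
        self.used += size
        return False

cache = CompressedAssetCache(100)
cache.request("rock", 60)
cache.request("tree", 30)
hit = cache.request("rock", 60)   # hit: refreshes "rock" in LRU order
cache.request("wall", 40)         # evicts "tree" (LRU), keeps "rock"
```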

    With a complex culling chain, there should be plenty of chances to record which assets still need to be decompressed before proceeding with generating the final draws, e.g. as a side product of generating / evaluating the Hi-Z buffer.
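    A CPU-side sketch of that bookkeeping (in a real engine this would be a GPU-side append buffer written during Hi-Z evaluation; every name here is invented for illustration):

```python
# Hypothetical sketch: during culling, visible draws whose assets are
# not yet decompressed are recorded into a request set, which the
# decompression stage consumes before final draw generation.

def cull_and_record(draws, resident_decompressed):
    """draws: list of (draw_id, asset_id, visible) tuples, where
    `visible` stands in for the result of a Hi-Z visibility test.
    Returns (draw ids ready now, asset ids needing decompression)."""
    ready, needed = [], set()
    for draw_id, asset_id, visible in draws:
        if not visible:
            continue  # culled; costs us nothing
        if asset_id in resident_decompressed:
            ready.append(draw_id)
        else:
            needed.add(asset_id)  # side product of the culling chain
    return ready, needed

draws = [(0, "rock", True), (1, "tree", False), (2, "wall", True)]
ready, needed = cull_and_record(draws, resident_decompressed={"rock"})
```

    The `needed` set is exactly the worklist for a same-frame decompression pass.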
     
    #2106 Ext3h, Jun 2, 2020
    Last edited: Jun 2, 2020
  7. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,633
    Location:
    The North
    Hmm.. perhaps I'm wrong, but you're still going to allocate a fixed size within RAM for streaming, right? Meaning you can only carve up 16 GB in so many ways. The faster the streaming SSD you have, the better, but there's got to be a pool limit unless you want to be rendering off your slowest bandwidth.

    So say you have 16 GB of memory, with 2.5 GB of it reserved. You're going to set aside, say, 5 GB for textures, with 7.5 GB remaining for render work. You're going to be limited by the 5 GB pool you set aside: even if you stream faster and faster, you're still held to that 5 GB allocation. So let's assume your VT system has pools for MIP 0-13 textures, each pool approximately 400 MB in size, holding their appropriate number of tiles as the tile sizes go up. No matter how fast you stream, if you put more on screen than your pool size, something will be held back from entering the next mip level until something exits. I.e. something in the MIP 4 pool can't move into the MIP 3 pool while the MIP 3 pool is full. So despite how fast your streaming is, something still needs to exit, and that's a memory footprint issue.
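    That pool-pressure argument can be made concrete with a toy model (pool sizes and tile names are invented; this only illustrates the "something must exit first" constraint, not any real VT system):

```python
# Hypothetical sketch of per-mip tile pools with fixed capacity: a tile
# cannot be promoted into the next (finer) mip pool until that pool has
# a free slot, no matter how fast the SSD can stream the data in.

class MipPools:
    def __init__(self, slots_per_pool, num_mips):
        self.capacity = slots_per_pool
        self.pools = [set() for _ in range(num_mips)]  # pools[0] = mip 0

    def promote(self, tile, from_mip):
        """Try to move a tile from `from_mip` into the finer pool above.
        Fails (returns False) while the destination pool is full."""
        dst = from_mip - 1
        if len(self.pools[dst]) >= self.capacity:
            return False  # blocked by memory footprint, not SSD speed
        self.pools[from_mip].discard(tile)
        self.pools[dst].add(tile)
        return True

pools = MipPools(slots_per_pool=2, num_mips=4)
pools.pools[3] = {"a", "b", "c"}
pools.pools[2] = {"d", "e"}                # mip 2 pool already full
blocked = pools.promote("a", from_mip=3)   # False: nothing has exited
pools.pools[2].discard("d")                # something exits mip 2...
ok = pools.promote("a", from_mip=3)        # ...now promotion succeeds
```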

    We can't rely on the SSD being able to stream textures and have us consume them every single frame; your bottleneck would become the speed of the SSD.

    And then there are still issues with edge cases like transparent textures.
     
  8. Frenetic Pony

    Regular

    Joined:
    Nov 12, 2011
    Messages:
    807
    Likes Received:
    478
    Both bandwidth and latency. Regardless, this thread is about lighting. Lighting tends to be quite dynamic by nature, and that means the first bound you're going to hit is almost certainly compute throughput in some way; the SSD won't help much at all.
     
    iroboto likes this.
  9. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    Agreed. With storage size being a greater problem than streaming speed, I see options to achieve really good compression ratios only by utilizing repetition, e.g. UE5 with its instances of rocks, or something more fine-grained like what's seen in DAG-compressed shadow maps.
    But this means we need to keep those instances in memory. Streaming them only on demand does not work, because there is always at least one instance (or data fragment of a dictionary) visible and required.
    Though I don't see a problem with 16 GB. Seems enough.
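    A toy version of that repetition-based scheme (purely illustrative, not UE5's or any real format): store each unique asset once in a dictionary that must stay resident, and reduce each placement to a reference plus a transform.

```python
# Hypothetical sketch: scene storage as a dictionary of unique assets
# plus lightweight instance references. The dictionary entries must stay
# resident - every visible instance points back into them - but the
# per-instance cost collapses to an index plus a transform.

def build_instanced_scene(placements):
    """placements: list of (asset_name, transform) pairs.
    Returns (unique-asset dictionary, compact instance list)."""
    dictionary = {}   # asset_name -> dictionary index
    instances = []    # (dictionary index, transform) per placement
    for name, transform in placements:
        idx = dictionary.setdefault(name, len(dictionary))
        instances.append((idx, transform))
    return dictionary, instances

placements = [("rock", (0, 0)), ("rock", (5, 2)),
              ("rock", (9, 1)), ("tree", (3, 3))]
dictionary, instances = build_instanced_scene(placements)
# Four placements, but only two unique assets need to be resident.
```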
     
    iroboto likes this.


  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.