Nvidia Turing Speculation thread [2018]

Discussion in 'Architecture and Products' started by Voxilla, Apr 22, 2018.

Thread Status:
Not open for further replies.
  1. Malo

    Malo Yak Mechanicum Legend Subscriber

    According to OctaneRender, Turing is providing them 8x RT performance compared to Pascal on the same workload. That's comparing equivalent Quadro cards.
     
  2. Voxilla

    Voxilla Regular

Didn't they switch from their CUDA backend to the Nvidia OptiX backend?
The latter includes 'hacks' like the AI denoiser.
     
  3. entity279

    entity279 Veteran Subscriber

For hybrid rendering on Turing, the memory system does take a beating from having to serve both the RT and CUDA cores, so raytracing may appear slower than it really is in that scenario.


Octane offers two rendering modes, local and cloud. I'm told by colleagues that cloud rendering is significantly slower. It would be interesting to know which of these modes the 8x number applies to.
     
  4. Ike Turner

    Ike Turner Veteran

DXR driver support is apparently up to the GPU vendor, I think. So if current cards never get it, it would be either because of a lack of resources (AMD may not have enough to dedicate to adding this feature versus the potential gains) or for marketing reasons (Nvidia wanting to sell RTX boards).
So far we have yet to see any performance numbers for a non-Turing GPU doing RT with DXR-enabled drivers (besides Volta, which doesn't have RT cores but has had experimental DXR drivers publicly available since April, IIRC). We don't know yet whether the Pascal numbers quoted are using the fallback layer or DXR-enabled drivers.

    http://forums.directxtech.com/index.php?topic=5892.msg29731#msg29731
     
  5. Malo

    Malo Yak Mechanicum Legend Subscriber

No idea, it was pretty much a marketing statement, and you know how those are with details and accuracy.
     
  6. silent_guy

    silent_guy Veteran Subscriber

    ... or because the performance would be too low for it to be worth doing.

    Isn’t that the most straightforward reason?

    Nvidia has had OptiX for years. And it was never good enough for anything close to real time. If they had been able to make it run fast enough, they would have done it.
     
  7. Ike Turner

    Ike Turner Veteran

One would assume that having a real driver path would in most cases be faster than a fallback emulation path. Nvidia already did it for Volta, which doesn't have "RT cores". It will be more of a resources consideration (money, time, coding effort) IMO.
     
  8. silent_guy

    silent_guy Veteran Subscriber

    Ray tracing is sufficiently slow for driver overhead to be a small and irrelevant part of the whole equation.

To make real-time ray tracing possible, a speedup by a large integer factor was needed, not a modest percentage.
     
  9. pharma

    pharma Veteran

    Pls delete.
     
    Last edited: Aug 29, 2018
  10. Voxilla

    Voxilla Regular

For raytracing, all the heavy lifting (intersecting scenes with rays using BVHs, BVH rebuilds, and so on) is done 'by the driver', some of it assisted by RT hardware if present. If all of that is instead done in front of the driver with generic compute shaders, as with the DXR fallback, you cannot expect the same performance as GPU-specific, low-level optimized RT code (e.g. using PTX) running behind the driver and making the best possible use of compute resources, even without RT hardware, as on Pascal or current AMD.
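To make concrete what that low-level work is, here is a minimal ray/AABB "slab" test in Python. Purely illustrative: real implementations run this in CUDA/PTX or in fixed-function RT hardware, and the function name and layout here are my own.

```python
# Illustrative ray/AABB "slab" test: the inner-loop operation that BVH
# traversal performs at every node, whether in RT hardware, in optimized
# driver-side code, or in a generic compute-shader fallback.

def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    """True if a ray starting at `origin` hits the axis-aligned box.
    `inv_dir` is 1/direction per axis, precomputed to avoid divides."""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        t1 = (box_min[axis] - origin[axis]) * inv_dir[axis]
        t2 = (box_max[axis] - origin[axis]) * inv_dir[axis]
        t_near = max(t_near, min(t1, t2))  # latest entry across all slabs
        t_far = min(t_far, max(t1, t2))    # earliest exit across all slabs
    return t_near <= t_far

# A ray from the origin along +x hits a unit box centred at (5, 0, 0):
inf = float("inf")
print(ray_aabb_hit((0, 0, 0), (1.0, inf, inf),
                   (4.5, -0.5, -0.5), (5.5, 0.5, 0.5)))  # True
```

An RT core evaluates tests like this (plus ray/triangle tests) in fixed function rather than on shader ALUs, which is where a large part of the claimed speedup would come from.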
     
    Ike Turner likes this.
  11. trinibwoy

    trinibwoy Meh Legend

    Isn’t that exactly what libraries like Optix have been doing for a long time? The point is that no amount of optimization of a general compute implementation is going to approach the speed of dedicated hardware. If that was the case these raytracing “gimmicks” would’ve been possible years ago on Maxwell and Pascal cards.

The current BFV implementation is denoising on the shader cores, not the tensor cores, so all the acceleration is due to the RT hardware.
     
    DavidGraham likes this.
  12. Voxilla

    Voxilla Regular

Sure, dedicated RT hardware will always be faster than generic GPU code. The point, however, was that optimized RT code in the driver could be a lot faster than the DXR fallback. It's up to the GPU vendors to provide such drivers, so it will be interesting to see what happens.

Realtime game-oriented raytracing engines have already existed for years; see for example the 2014 video of Brigade. With proper denoising this would have been quite good. (No need for AI to get good denoising, as BFV seems to prove.)
     
    Ike Turner and OCASM like this.
  13. silent_guy

    silent_guy Veteran Subscriber

    Are you arguing that an Nvidia non-RT core driver solution will be slower than OptiX? While Ike is arguing that OptiX will be slower than an Nvidia non-RT core driver solution? And I’m arguing that they’d both be similar in performance?

    Or am I just very confused?
     
    dirtyb1t likes this.
  14. dirtyb1t

    dirtyb1t Newcomer

    I also have no clue what these two are arguing about...
[image: diagram of where OptiX, Vulkan, and DXR sit in the API stack]
OptiX / Vulkan / DirectX all sit at the same point.
DirectX and Vulkan are nothing more than hardware-agnostic platforms that extend down into the GPU.
OptiX, meanwhile, is native and hardware-specific to Nvidia/CUDA. OptiX has the same if not better performance than DirectX, in that it doesn't have to traverse any hardware abstraction layers. If the diagram were truly accurate, there would be a green section at the bottom of Vulkan and DXR denoting Nvidia's driver, and in the case of any other hardware company, a section denoting their proprietary driver providing an interface to DXR/Vulkan.

Ray tracing is possible right now on Pascal with all of the fancy features you saw demo'd.
The demos Jensen ran on stage run on Pascal right now via OptiX 5.1; they're just slower than on Turing because there is no dedicated hardware acceleration (RT cores and tensor cores).
As far as I understand it, the tensor cores are used for AI-accelerated denoising and DLSS,
the RT cores handle BVH traversal and the ray intersection tests,
and BVH generation and shading are done on the CUDA cores/SMs, fed through an improved caching hierarchy shared with the rasterizer pipeline.
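As a toy illustration of the traversal part, here is the stack-based loop a ray walks through a BVH, with made-up 1-axis "bounds" standing in for real 3D AABBs (my own simplification, not Nvidia's data layout):

```python
# Sketch of the per-ray BVH traversal loop that RT cores run in fixed
# function (and that a compute-shader fallback must run in software).
# Toy 1-axis "boxes" keep the example short; real nodes hold 3D AABBs.

def traverse(nodes, ray_t):
    """nodes: dict id -> (lo, hi, payload). payload is a child-id list
    for inner nodes or a primitive name for leaves. Returns hit prims."""
    hits, stack = [], [0]            # start at the root (node 0)
    while stack:
        lo, hi, payload = nodes[stack.pop()]
        if not (lo <= ray_t <= hi):  # ray misses this node's bounds,
            continue                 # so its whole subtree is skipped
        if isinstance(payload, list):
            stack.extend(payload)    # inner node: push children
        else:
            hits.append(payload)     # leaf: record the primitive
    return hits

# Tiny two-level tree: root spans [0,10], children span [0,4] and [6,10].
bvh = {0: (0, 10, [1, 2]), 1: (0, 4, "triA"), 2: (6, 10, "triB")}
print(traverse(bvh, 7))   # only the right subtree is visited -> ['triB']
```

The point of the structure is visible even in the toy: a miss at an inner node culls everything beneath it, so most primitives are never tested at all.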

I have no clue what is being talked about w.r.t. 'drivers'. The driver for ray tracing is what everyone is already using on current-gen hardware. Neither DirectX nor Vulkan is needed for this. Each company has its own proprietary "driver" and API/SDK. All DirectX and Vulkan do is provide a higher-level API on top of this so that developers don't have to worry about hardware-specific implementations. I'd expect Vulkan/DirectX to be slower than OptiX or any other company's native software. What Microsoft means by the 'fallback' path is probably some generic compute implementation that can run on all cards without vendor-specific optimizations.

    It's important to distinguish between hardware/drivers and APIs.
    DirectX is not a driver. It's an API :

    https://en.wikipedia.org/wiki/DirectX :
    Microsoft DirectX is a collection of application programming interfaces (APIs) for handling tasks related to multimedia, especially game programming and video, on Microsoft platforms.

Nothing of value is lost without it. The hardware already has to be capable, with a driver from the manufacturer and an API/SDK made available for DirectX to hook into.
DirectX 12 (DXR) doesn't enable ray tracing for Nvidia; that has existed for years via OptiX. All DXR provides is a high-level, easy-to-use, hardware-agnostic API for developers.

Cut Vulkan/DirectX 12 out of the picture and you'd still have the same real-time ray tracing functionality on Turing via OptiX. Because such a horrid job is done of explaining 'real-time' ray tracing at the conferences, here's a simplified walkthrough of how it all works without any marketing nonsense:
https://developer.apple.com/videos/play/wwdc2018/606/
Yes, that uber demo Jensen showed in the box room of 'real-time' ray tracing and denoising can run on an iPad.
     
    Last edited: Aug 30, 2018
    pharma and Malo like this.
  15. dirtyb1t

    dirtyb1t Newcomer

Also note from the Apple video, starting @22min, that the rays/s metric largely depends on the scene.
An iPad can do the box demo at 176 million rays/s.
This drops to 20 million rays/s in a more complicated scene, and even further with 'reflective' surfaces (secondary, divergent rays), which is why Jensen doesn't have many of them in his demos.

For reference, Pascal is generally said to average around 400-500 Mrays/s... but in what?
This is the key. Jensen has provided no standard benchmark scene(s) showing how he arrives at this incredible 8-10 gigarays/sec figure. I'll be flat-out honest: I think the real-world numbers will be much lower, and the Pascal cards will be in the gigaray range themselves in the same scene.
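One way to sanity-check such a figure is to turn it into a per-pixel ray budget. A back-of-envelope sketch, with resolution and frame-rate assumptions of my own choosing:

```python
# If "10 gigarays/s" held up in real scenes, here is the per-pixel ray
# budget at 4K and 60 fps (illustrative assumptions, not Nvidia's numbers).
rays_per_sec = 10e9
pixels = 3840 * 2160             # one 4K frame
fps = 60
rays_per_pixel = rays_per_sec / (pixels * fps)
print(round(rays_per_pixel, 1))  # ~20 rays per pixel per frame
```

Even taken at face value, that is only a couple of dozen rays per pixel, which is why hybrid rendering plus denoising is needed at all; if real-scene numbers come in lower, the budget shrinks accordingly.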

If Nvidia weren't pulling any shenanigans, they would have shown the rays/s count, like Apple did, on a well-known ray tracing benchmark scene.
Much like with FPS, they still haven't shown live demos... The only one I heard of has the 2080 Ti at around 3.2 gigarays/s, but I'm unaware of what Pascal does in comparison.
This is concerning, but it won't last much longer, as people will be able to figure out exactly what the performance is once they get their hands on the hardware.

I truly hope Jensen wasn't pulling a fast one with these gigaray measures, and that it's a number formed from a range of industry-standard ray tracing scenes.
     
    Last edited: Aug 30, 2018
  16. pharma

    pharma Veteran

    I think during the "live stream" yesterday with Tom Petersen he mentioned approx. 1 Gigaray for Pascal (1080Ti).
     
  17. iMacmatician

    iMacmatician Regular

From a Reddit user; I don't know if this claim is true or not:
TU106 in the 2070 matches the information in the AdoredTV Turing rumor from about three weeks ago. So I'm wondering if the 7 GB GDDR6 for the 2070 mentioned in that rumor might not be entirely wrong. Is it possible that the RTX 2070 uses a 7 GB + 1 GB memory configuration similar to the GTX 970?
     
  18. dirtyb1t

    dirtyb1t Newcomer

Lastly:
http://on-demand.gputechconf.com/gtc/2017/presentation/s7455-martin-stich-optix.pdf
See page 24 (the same box demo the iPad can render at 176 million rays/s; meanwhile, the iPad could only manage about 20 million rays/s in a more complicated scene).
Again stressing: the scene matters. Properly comparing cards means showing how many rays/s are achievable in the same scene and detailing which rays the metric is composed of (primary rays, secondary rays, shadow rays, etc.).

See page 35 (a range of performance figures across various ray tracing scenes for the Titan X (Pascal)). Jensen's lauded gigaray figures had better have been composed from the same measures.
     
  19. Scott_Arm

    Scott_Arm Legend

    I think the gigarays/s metrics they put out are mostly junk, but so far the software devs are saying the performance improvement from Turing is large.
     
    pharma likes this.