Not sure if this has been noticed yet, but RTX is doing a piss poor job on human skins sometimes; it looks dramatically worse than rasterization here.
Is this with the latest updates installed? This bug, for example, could affect the result negatively.
> I thought it was understood already that the RT cores in Turing are accelerating BVH traversal?
Correct.
These questions have been asked in the Impact of Turing on Consoles thread. Not many answers, but JoeJ reckons improvements in executing code on compute are all that'll be needed. We know raytracing can be done with compute, and we don't really know how nVidia's RT cores are working. There could be quite a bit of compute being used.
What kind of fixed-function hardware would benefit raytracing and could be added to existing shader-core architectures relatively cheaply in order to help (mainly compute-based) raytracing? Ray-triangle intersection? AABB bounding-box generation for triangle strips, sets and meshes? Support for hierarchical tree-like structures like BVHs?
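To make the per-ray work concrete, here is a minimal standalone sketch of the two tests mentioned above: the slab AABB test and a Möller-Trumbore triangle test. Purely illustrative CPU code in the usual textbook form, not a claim about any vendor's hardware path:

```cpp
// Illustrative only: the two per-ray tests discussed above, in plain C++.
// Textbook math, not how any RT hardware necessarily implements it.
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float comp(const Vec3& a, int i) { return i == 0 ? a.x : (i == 1 ? a.y : a.z); }
static Vec3  sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3  cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
static float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Slab test: does the ray (origin o, reciprocal direction invDir) cross the box within [tMin, tMax]?
bool rayAabb(Vec3 o, Vec3 invDir, Vec3 boxMin, Vec3 boxMax, float tMin, float tMax)
{
    for (int i = 0; i < 3; ++i) {
        float t0 = (comp(boxMin, i) - comp(o, i)) * comp(invDir, i);
        float t1 = (comp(boxMax, i) - comp(o, i)) * comp(invDir, i);
        if (t0 > t1) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMax < tMin) return false;   // slabs no longer overlap: miss
    }
    return true;
}

// Moller-Trumbore: returns true and writes the hit distance t if the ray hits the triangle.
bool rayTriangle(Vec3 o, Vec3 d, Vec3 v0, Vec3 v1, Vec3 v2, float& t)
{
    const float eps = 1e-7f;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(d, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;   // ray is parallel to the triangle plane
    float inv = 1.0f / det;
    Vec3 s = sub(o, v0);
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;   // outside barycentric range
    Vec3 q = cross(s, e1);
    float v = dot(d, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > eps;                           // hit must lie in front of the origin
}
```

The arithmetic itself is tiny; the cost comes from running these tests enormous numbers of times per frame, which is where dedicated units or better scheduling would pay off.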
> Correct.
But do we really know that everything is handled in hardware? It could mostly be an elaborate compute shader utilizing some fixed-function hardware for the heavy lifting. Same for BVH generation.
You specify the DXR command and put in a shader and denoiser (?) as a parameter. Once the ray/triangle intersections are identified, the shader runs on those hit triangles, which is all done on compute.
How the vendors choose to handle intersection is what will vary from one IHV to the next, but the intersection as we know it is done in the drivers.
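Roughly, the host side of that flow looks like the sketch below. Only the dispatch is shown; the names (rtStateObject, shaderTable, recordStride) are placeholders for objects assumed to have been created earlier, so treat it as a sketch of the call rather than a complete sample:

```cpp
#include <d3d12.h>

// Sketch of the host-side DXR dispatch described above. The RT pipeline state
// and the shader-table buffer (raygen, miss and hit-group records packed back
// to back) are assumed to exist already; recordStride is assumed to satisfy
// the 64-byte shader-table alignment rules.
void dispatchRays(ID3D12GraphicsCommandList4* cmdList,
                  ID3D12StateObject*          rtStateObject, // holds the raygen/miss/hit shaders
                  ID3D12Resource*             shaderTable,
                  UINT64                      recordStride,
                  UINT                        width,
                  UINT                        height)
{
    const D3D12_GPU_VIRTUAL_ADDRESS base = shaderTable->GetGPUVirtualAddress();

    D3D12_DISPATCH_RAYS_DESC desc = {};
    desc.RayGenerationShaderRecord.StartAddress = base;                    // record 0: raygen
    desc.RayGenerationShaderRecord.SizeInBytes  = recordStride;
    desc.MissShaderTable.StartAddress           = base + recordStride;     // record 1: miss
    desc.MissShaderTable.SizeInBytes            = recordStride;
    desc.MissShaderTable.StrideInBytes          = recordStride;
    desc.HitGroupTable.StartAddress             = base + 2 * recordStride; // record 2: hit group
    desc.HitGroupTable.SizeInBytes              = recordStride;
    desc.HitGroupTable.StrideInBytes            = recordStride;
    desc.Width  = width;
    desc.Height = height;
    desc.Depth  = 1;

    cmdList->SetPipelineState1(rtStateObject);
    // How the intersections behind this call are found (fixed-function units,
    // compute, or a mix) is up to the driver; the app only sees its hit and
    // miss shaders invoked with the results. Denoising is a separate pass,
    // not a parameter of the dispatch.
    cmdList->DispatchRays(&desc);
}
```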
> But do we really know that everything is handled in hardware? It could mostly be an elaborate compute shader utilizing some fixed-function hardware for the heavy lifting. Same for BVH generation.
I'll try making this clearer.
> I thought it was understood already that the RT cores in Turing are accelerating BVH traversal?
No, there's no BVH hardware in Volta. RT is entirely done in compute. But Volta has advanced compute options, which probably means fine-grained work scheduling directly from compute without a need to rely on CPU commands.
> Nvidia does this through their RT cores, and it accelerates the build up, take down, and modification of BVH from what we understand. And it does it fairly precisely as well.
It has never been mentioned that the RT cores help to build the BVH, so likely this is done with compute. I have only heard mention of BVH traversal and triangle intersection, nothing else.
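To pin down the terminology: "traversal" is essentially the loop below (generic textbook form, nothing vendor-specific), while "build" is the separate step that sorts triangles into such nodes in the first place:

```cpp
// Generic illustration of what "BVH traversal" means in this discussion: walk a
// tree of bounding boxes, prune subtrees the ray misses, and only run the real
// triangle tests on the leaves that are reached. Not any RT core's actual logic.
#include <algorithm>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

struct BvhNode {
    float   boundsMin[3], boundsMax[3]; // AABB enclosing everything below this node
    int32_t left = -1, right = -1;      // child node indices; -1 means this is a leaf
    int32_t firstTri = 0, triCount = 0; // leaf payload: a range of triangle indices
};

// hitBox(node)          -> does the ray touch this node's AABB?
// hitTris(first, count) -> test that triangle range, return closest hit distance (or +inf)
float traverse(const std::vector<BvhNode>& nodes,
               const std::function<bool(const BvhNode&)>& hitBox,
               const std::function<float(int32_t, int32_t)>& hitTris)
{
    float closest = std::numeric_limits<float>::infinity();
    if (nodes.empty()) return closest;
    std::vector<int32_t> stack = { 0 };          // start at the root node
    while (!stack.empty()) {
        const BvhNode& n = nodes[stack.back()];
        stack.pop_back();
        if (!hitBox(n)) continue;                // ray misses the box: skip the whole subtree
        if (n.left < 0) {                        // leaf: run the expensive triangle tests
            closest = std::min(closest, hitTris(n.firstTri, n.triCount));
        } else {                                 // inner node: descend into both children
            stack.push_back(n.left);
            stack.push_back(n.right);
        }
    }
    return closest;                              // +inf means nothing was hit
}
```

Building the tree (choosing those bounds and triangle ranges) is a separate, sort-like pass, which is the part suspected above of staying on compute regardless of what the RT cores accelerate.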
> It can be done on compute, and drivers can do this portion via compute,
Yes, but this restricts RT to the API. If RT runs on compute, we likely want to implement it ourselves most efficiently and the API does not help at all here. This is the main reason why we can not draw final conclusions from Volta vs. Turing based on BFV or something like that. (just to mention)
> we likely want to implement it ourselves most efficiently and the API does not help at all here
... which we can not do, because work generation is not exposed anywhere yet to game APIs, of course.
> It has never been mentioned that the RT cores help to build the BVH, so likely this is done with compute. I have only heard mention of BVH traversal and triangle intersection, nothing else.
You might be right; it's unclear how the BVH is built up or torn down. Perhaps CUDA could have better access to the structure than, say, DXR, which won't let you access it.
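On the "DXR won't let you access it" point, the build call really does treat the result as opaque: the app hands over triangles plus a destination buffer and never gets to read whatever BVH layout the driver writes. A rough sketch (buffer creation and sizing via GetRaytracingAccelerationStructurePrebuildInfo omitted):

```cpp
#include <d3d12.h>

// Rough sketch of a DXR bottom-level acceleration structure build. The app
// supplies triangle buffers plus pre-sized destination/scratch buffers; the
// BVH layout written into `blas` is opaque and cannot be inspected via DXR.
void buildBlas(ID3D12GraphicsCommandList4* cmdList,
               D3D12_GPU_VIRTUAL_ADDRESS   vertexBuffer, UINT vertexCount,
               D3D12_GPU_VIRTUAL_ADDRESS   indexBuffer,  UINT indexCount,
               ID3D12Resource*             blas,
               ID3D12Resource*             scratch)
{
    D3D12_RAYTRACING_GEOMETRY_DESC geom = {};
    geom.Type  = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
    geom.Flags = D3D12_RAYTRACING_GEOMETRY_FLAG_OPAQUE;
    geom.Triangles.VertexBuffer.StartAddress  = vertexBuffer;
    geom.Triangles.VertexBuffer.StrideInBytes = 3 * sizeof(float);
    geom.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
    geom.Triangles.VertexCount  = vertexCount;
    geom.Triangles.IndexBuffer  = indexBuffer;
    geom.Triangles.IndexFormat  = DXGI_FORMAT_R32_UINT;
    geom.Triangles.IndexCount   = indexCount;

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs.Type           = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    build.Inputs.Flags          = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    build.Inputs.DescsLayout    = D3D12_ELEMENTS_LAYOUT_ARRAY;
    build.Inputs.NumDescs       = 1;
    build.Inputs.pGeometryDescs = &geom;
    build.DestAccelerationStructureData    = blas->GetGPUVirtualAddress();
    build.ScratchAccelerationStructureData = scratch->GetGPUVirtualAddress();

    // Whether this build runs on compute shaders, fixed-function hardware, or
    // a mix is entirely the driver's business; the API only promises an opaque
    // result that later DispatchRays calls can consume.
    cmdList->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}
```

That opacity is what the CUDA comparison above is getting at: a vendor API could expose or accept its own structure, whereas DXR keeps it a black box.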
> It does restrict RT to the API. You're trading off pure optimization for a level of abstraction to deploy your code on multiple IHVs without the headache, improving adoption and scaling the platform to a variety of programmers and not just the few.
Yeah, and likely that's where we are heading. I just don't like it. One more argument is the effort for an RT implementation. Of course it's much easier and faster to just use DXR.
> ... which we can not do, because work generation is not exposed anywhere yet to game APIs, of course.
It's on Xbox, I think.
See another failed request of mine: https://community.amd.com/thread/236715 (zero response). I'll try again another time... I seriously want vendor APIs just for those reasons.
> DXR fallback layer is apparently running on Radeon VII, results are not stellar though, on one demo the Radeon achieved 10fps, while the 2080Ti achieved 320fps (yes 300fps)! Of course this could just be a token support from AMD with no substantial optimizations.
uhh does that say 10fps vs 300 fps?
> uhh does that say 10fps vs 300 fps?
I wouldn't read anything into it. I doubt there are any optimizations whatsoever, and the fallback path isn't even a thing anymore. Not that it wouldn't be considerably slower, just that it's not a very good comparison or indication of what's possible with AMD.
> I wouldn't read anything into it. I doubt there are any optimizations whatsoever, and the fallback path isn't even a thing anymore. Not that it wouldn't be considerably slower, just that it's not a very good comparison or indication of what's possible with AMD.
That's true. It's only reasonable to compare when AMD says, "hey, this is our RT card"; then the comparisons make sense.
> DXR fallback layer is apparently running on Radeon VII, results are not stellar though, on one demo the Radeon achieved 10fps, while the 2080Ti achieved 320fps (yes 300fps)! Of course this could just be a token support from AMD with no substantial optimizations.
The DXR fallback layer was deprecated by Microsoft 4 months ago, so those benchmarks aren't even worth the bandwidth they are consuming on the net.
> uhh does that say 10fps vs 300 fps?
Yes.
> A comparison between a 2080Ti vs Titan V vs 1080Ti in some OptiX workloads. The 2080Ti is 3 to 6 times faster than TitanV depending on the workload, and much faster than that compared to the 1080Ti.
Because Nvidia is going out of its way to make sure that RTX on non-Turing GPUs is borked (contrary to what their recommendations are... RTX supposedly being the "optimal" path for those GPU architectures in OptiX 6).
> A comparison between a 2080Ti vs Titan V vs 1080Ti in some OptiX workloads. The 2080Ti is 3 to 6 times faster than TitanV depending on the workload, and much faster than that compared to the 1080Ti.
Why are results from the same people far lower than the OptiX 5 results?