Next gen lighting technologies - voxelised, traced, and everything else *spawn*

https://www.3dcenter.org/news/raytr...uft-mit-guten-frameraten-auch-auf-der-titan-v

Looks to me like Titan RTX beats Titan V quite easily in the reflection-heavy maps (Rotterdam). It's about 40-50% faster, though these benchmarks are pretty simple.

I guess that goes back to the question of what BFV is actually doing, and how much of the percentage difference comes from shading performance. It would be nice to see both cards with RTX off to get a baseline performance difference.

Spec-wise, Titan RTX does not have any major performance advantage for general gaming except a huge amount of memory, which I'm not sure BFV could even take advantage of. TFLOP/s and bandwidth are roughly equal.

Edit: In a roundabout way, this may actually explain what BFV is doing. If a map were primarily casting screen-space rays, you would expect performance to be roughly equal. The performance divergence suggests there is a significant burden of DXR rays on screen in the Rotterdam map, even after the patch.

There is a huge penalty for activating DXR even on a map like Hamada, which has only a few reflective surfaces. It takes twice as long to render a frame going from "Off" to "Low" (200 FPS -> 100 FPS), yet there is no real difference between "Low" and "Ultra".
That can explain the "small" difference between Volta and Turing.
 
Insane just expands the number of reflected objects and the number of affected surfaces; it also increases the resolution of reflections and tries to simulate them in water in a realistic way. Nothing earth-shattering, and they are still SSR.
The same can be seen in the Unigine Superposition benchmark with the Extreme quality setting, which enables ray tracing with a large search area and probably a few samples per pixel in screen space for global illumination and reflections.
With AO, you can get away with simple Monte Carlo integration of random samples inside a relatively small hemisphere (in screen-space terms), which is still quite cache-friendly (especially if the search radius is small). But you can't do the same with GI and reflections; ray tracing is required there, and searches across large areas of the screen will certainly thrash caches and bring performance down (you can see how Volta and Turing, with their fast and large caches, are much better here).
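As a rough illustration of why the AO case is the cheap one, here is a minimal Monte Carlo sketch in Python. Everything is a stand-in: `occluded` (and the toy `ceiling` occluder) replaces the screen-space depth test a real renderer would do, and none of this claims to reflect how any actual game implements AO. The point is just that each sample is a short, local query.

```python
import math
import random

def sample_hemisphere(normal):
    """Pick a random direction in the hemisphere around `normal`
    (rejection sampling on the unit sphere, flipped to the normal's side)."""
    while True:
        d = [random.uniform(-1.0, 1.0) for _ in range(3)]
        n2 = sum(c * c for c in d)
        if 1e-6 < n2 <= 1.0:
            inv = 1.0 / math.sqrt(n2)
            d = [c * inv for c in d]
            # Flip into the hemisphere facing the normal.
            if sum(a * b for a, b in zip(d, normal)) < 0.0:
                d = [-c for c in d]
            return d

def ambient_occlusion(point, normal, occluded, radius=0.5, samples=16):
    """Monte Carlo AO estimate: fraction of short rays that are NOT blocked.
    `occluded(origin, direction, radius)` is a hypothetical stand-in for
    the screen-space depth test a real implementation would perform."""
    unblocked = 0
    for _ in range(samples):
        d = sample_hemisphere(normal)
        if not occluded(point, d, radius):
            unblocked += 1
    return unblocked / samples

# Toy occluder: a flat ceiling at y = 0.3 above the shaded point.
def ceiling(origin, direction, radius):
    if direction[1] <= 0.0:
        return False
    t = (0.3 - origin[1]) / direction[1]  # ray/plane intersection distance
    return 0.0 < t <= radius

random.seed(1)
ao = ambient_occlusion((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), ceiling)
print(round(ao, 2))
```

Because the search radius is small, every sample stays in a tight neighbourhood of the shaded pixel, which is exactly what keeps the real screen-space version cache-friendly; GI and reflection rays have no such locality.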

The perf drop in GoW with the "Insane" preset is insane indeed - https://images.nvidia.com/geforce-c...ar-4-screen-space-reflections-performance.png
And it seems the main difference is that "Insane" reflections are proper stochastic reflections with a few rays per pixel - https://images.nvidia.com/geforce-c...teractive-comparison-002-insane-vs-ultra.html
while the other presets implement perfectly mirrored ones with one ray per pixel or fewer.
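A toy sketch of that difference, assuming nothing about Gears 4's actual shader: stochastic reflections jitter the mirror direction per ray and average several rays per pixel, while the lower presets reduce to a single un-jittered mirror ray. The `sky` environment and the jitter scheme here are hypothetical stand-ins for evaluating the reflected scene.

```python
import math
import random

def reflect(d, n):
    """Mirror direction d about normal n (both unit-length)."""
    dot = sum(a * b for a, b in zip(d, n))
    return [a - 2.0 * dot * b for a, b in zip(d, n)]

def jitter(direction, roughness):
    """Perturb a direction inside a cone whose width grows with roughness.
    Crude but illustrative: add a small random offset and renormalize."""
    d = [c + random.uniform(-roughness, roughness) for c in direction]
    inv = 1.0 / math.sqrt(sum(c * c for c in d))
    return [c * inv for c in d]

def stochastic_reflection(view, normal, shade, roughness=0.3, rays=4):
    """Average several jittered reflection rays; rays=1 with roughness=0
    degenerates to the perfect-mirror case of the lower presets."""
    mirror = reflect(view, normal)
    total = 0.0
    for _ in range(rays):
        total += shade(jitter(mirror, roughness))
    return total / rays

sky = lambda d: max(d[1], 0.0)  # toy environment: brightest straight up
random.seed(0)
glossy = stochastic_reflection([0.0, -1.0, 0.0], [0.0, 1.0, 0.0], sky)
mirror = stochastic_reflection([0.0, -1.0, 0.0], [0.0, 1.0, 0.0], sky,
                               roughness=0.0, rays=1)
print(round(glossy, 3), mirror)
```

The cost scales directly with the ray count, which is consistent with a "few rays per pixel" preset being several times more expensive than a one-ray mirror.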
 
I just tried the "Insane" SSR reflections in Gears 4 on my 2080.

If we're willing to give RT the benefit of the doubt in terms of a future of optimization and iteration, it would only be fair not to judge SSR by this two-year-old implementation, which is seemingly rather brute force (it's also Unreal Engine 4 *ahem*), and whose existence may have stemmed from exposing the settings/path used for the cinematic recordings, where performance is not the impetus.

It's also worth considering that Gears of War 4's development was extremely rushed. MS bought the IP in early 2014, and The Coalition had to create the game on console and then try to support the rather finicky variety of hardware on PC for an October 2016 release, so we could guess at 18-24 months of development while starting from scratch on a *redacted expletive* version of UE4. If you pay attention while playing the campaign, it's rather apparent how inconsistent the quality of the graphics is across the levels; the implication being that they were learning the engine as they went along.

It's a rather suboptimal situation when we have evidence of the various PC issues over the past two years, a couple of which have still yet to be fully resolved on NVIDIA's side.

So it's not a great example. :oops:

/DevAl's Advocate :p
 
There is a huge penalty for activating DXR even on a map like Hamada, which has only a few reflective surfaces. It takes twice as long to render a frame going from "Off" to "Low" (200 FPS -> 100 FPS), yet there is no real difference between "Low" and "Ultra".
That can explain the "small" difference between Volta and Turing.

You can see a pretty noticeable difference in visual quality in highly diffuse scenes with DXR High or Ultra, as shown with screenshots earlier in the thread. I wonder if the number of rays cast in these scenes is still relatively low. I suppose if someone were to go to a particular spot, look at a bunch of diffuse-looking rocks, take a screenshot, then turn DXR off and take another, the pictures could be compared to see how many pixels are different. The same could be done on levels with differing amounts of diffuse and glossy scenes to see how much RT is contributing to each. I know there are image editors that can do these comparisons; I've never tried it myself. I guess you'd need raw screenshots too.

Not sure how well that would work with all of the temporal stuff that's going on, plus all of the frantic activity. You might have to lie prone to make sure your character is not moving at all; not sure how else you could make sure the pictures are pixel-aligned, so you don't pick up false differences caused by changes in camera position etc. You'd probably need to turn off film grain, motion blur, chromatic aberration and all that crap. Edit: oh, and all those particles flying around, like snow, leaves, dust and whatever.
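The pixel-counting part of that experiment is straightforward; here is a hedged sketch, with images reduced to flat lists of RGB tuples (a real comparison would load the screenshots through an image library) and a small threshold to absorb the grain/TAA noise mentioned above.

```python
def diff_ratio(img_a, img_b, threshold=4):
    """Fraction of pixels whose max per-channel difference exceeds `threshold`.
    Images are same-sized lists of (r, g, b) tuples; the threshold absorbs
    film-grain/TAA noise so only 'real' differences are counted."""
    assert len(img_a) == len(img_b), "screenshots must be pixel-aligned"
    changed = sum(
        1 for a, b in zip(img_a, img_b)
        if max(abs(ca - cb) for ca, cb in zip(a, b)) > threshold
    )
    return changed / len(img_a)

# Toy example: two 4-pixel 'screenshots' where one pixel gained a reflection.
rt_off = [(10, 10, 10), (200, 180, 150), (30, 30, 30), (90, 90, 90)]
rt_on  = [(10, 10, 10), (200, 180, 150), (30, 30, 30), (140, 150, 170)]
print(diff_ratio(rt_off, rt_on))  # 0.25
```

The hard part is everything around the counting: getting two captures that are actually pixel-aligned, which is exactly the camera/particle problem described above.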
 
Unless the track is a tiny loop, it has to be a far bigger level. But as I said, even if levels in Dreams are constrained, that doesn't mean the tech is.
Sure but it's also barren. None of the footage in that trailer shows anything comparable to top tier AAA games in terms of fidelity/detail.
 
Sure but it's also barren. None of the footage in that trailer shows anything comparable to top tier AAA games in terms of fidelity/detail.
MM are working on creating a package that will allow people to create games. That is their main goal, not creating elaborate set pieces. Those barren "levels" were created in a few minutes. Other stuff they've shown is not so barren, honestly. And they've already shown us that some environments can be huge and with intricate geometry if they want.

Now I think we just have to wait for creators to see what they can truly do.
 
MM are working on creating a package that will allow people to create games. That is their main goal, not creating elaborate set pieces. Those barren "levels" were created in a few minutes. Other stuff they've shown is not so barren, honestly. And they've already shown us that some environments can be huge and with intricate geometry if they want.

Now I think we just have to wait for creators to see what they can truly do.
They've shown small and intricate / large and barren. Not both at the same time.
 
Dumb discussion. Let's wait to evaluate the rendering tech until it's out and we can perform our own tests. Some people are jumping to conclusions about what the engine is capable of based on what has been shown so far, when what has been shown so far hasn't been specifically showcasing the limits of the engine. There are many reasons why it may not look like a AAA game, not least of which is the cost required to create the assets of a AAA game, which Dreams isn't afforded.
 
So it's not a great example. :oops:
Yeah, I can see that now, however Gears 4 is not the only example.

I got my fps slashed from 80 to 55 once I activated PCSS+ in Far Cry 4, and activating HFTS or VXAO in any game results in a similar experience. In fact you could say the same about just about any PC game nowadays: moving from High to Ultra always results in a massive hit despite the limited added visual quality.
 
There is a huge penalty for activating DXR even on a map like Hamada, which has only a few reflective surfaces. It takes twice as long to render a frame going from "Off" to "Low" (200 FPS -> 100 FPS), yet there is no real difference between "Low" and "Ultra".
That can explain the "small" difference between Volta and Turing.

If this is true, it could also hint at a bigger cost for building the BVH than I would have assumed.
Some people think DXR does not use a BVH at all (at least not for the lower-level trees), but regular grids instead.
In that case, rebinning triangles into grids for animated characters would be more expensive than refitting an existing BVH, I guess, which could explain a big constant cost in BFV.

Regular grids are not cool, but traversing the entire BVH per ray is surely no option, likely not even in hardware. And regular grids might be more hardware-friendly than the complex sorting of ray batches to tree branches I had in mind.
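To illustrate why a refit is the cheap path, here is a minimal sketch of refitting a hypothetical binary BVH (all names made up for illustration): after animation moves vertices, only the boxes are recomputed bottom-up; the tree topology, and therefore the expensive build/sort step, is untouched. A grid would instead have to rebin every moved triangle.

```python
class Node:
    """BVH node: either a leaf holding triangle indices or an inner node."""
    def __init__(self, left=None, right=None, tris=None):
        self.left, self.right, self.tris = left, right, tris
        self.bounds = None  # ((min x, y, z), (max x, y, z))

def tri_bounds(tri):
    return (tuple(min(v[i] for v in tri) for i in range(3)),
            tuple(max(v[i] for v in tri) for i in range(3)))

def union(a, b):
    return (tuple(min(a[0][i], b[0][i]) for i in range(3)),
            tuple(max(a[1][i], b[1][i]) for i in range(3)))

def refit(node, triangles):
    """Recompute bounds bottom-up after vertices moved; the topology
    (and therefore the build cost) is untouched."""
    if node.tris is not None:  # leaf
        b = tri_bounds(triangles[node.tris[0]])
        for t in node.tris[1:]:
            b = union(b, tri_bounds(triangles[t]))
    else:
        b = union(refit(node.left, triangles), refit(node.right, triangles))
    node.bounds = b
    return b

# Two triangles, one leaf each, under a single root.
tris = [[(0, 0, 0), (1, 0, 0), (0, 1, 0)],
        [(2, 0, 0), (3, 0, 0), (2, 1, 0)]]
root = Node(left=Node(tris=[0]), right=Node(tris=[1]))
refit(root, tris)

tris[1][1] = (5, 0, 0)  # 'animate' one vertex
refit(root, tris)       # boxes follow, structure stays
print(root.bounds)
```

The refit is a single linear pass over the nodes; the quality of the tree degrades as geometry deforms, which is why engines periodically rebuild, but per-frame it is far cheaper than building from scratch.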

Well... who knows... (tired of all that guessing) :)
 
Don't forget that with DXR Low the primary rays from the camera are still cast and intersected, even in a scene with rough surfaces. It's the secondary reflected rays that won't be cast, because of the material properties. So you have to build the BVH and cast all of the primary rays on Low, no matter the scene.
 
Yeah, and I'd add to your suggestion of comparing screenshots: pick a spot where few rays could hit the sky. Likely there is no difference then, even if ray traced.
Photoshop makes it easy to show the difference between two images.
 
Funny story: these were added to Turing and are listed under Uniform Datapath Instructions.
But I am pretty sure uniform instructions were added to save power and vector register space, because the integer SIMDs are already decoupled in Turing; hence the same math can be done via the more general purpose integer SIMD units concurrently with FP ops.
DX RT should be quite stressful on resource handling, so both the vector and uniform integer pipelines should be useful and can likely be executed concurrently with some overlap.
It seems from the description that most of the uniform datapath ops are one-way with respect to the vector domain, with moves or loads into the uniform register file but not out. However, the defined behavior of the instructions and how the uniform register file is exposed is not really discussed, and there could be a straightforward way to utilize it.
One idea I had for saving register accesses would be if there were an operand reuse cache slot set aside for this functionality. If it were sized like the cache is for the vector registers, a single operand cache slot could act like a temporary 16-32 register file that could potentially run uniform operations for an extended period of time without going back to the main register file. There are some elements in the SIMD path like shuffles and broadcasts that might have some elements of what would be needed for this to be done, although that might be too much to fit into a dedicated path.
 
Don't forget that with DXR Low the primary rays from the camera are still cast and intersected, even in a scene with rough surfaces. It's the secondary reflected rays that won't be cast, because of the material properties. So you have to build the BVH and cast all of the primary rays on Low, no matter the scene.
Rasterization is still used in all those hybrid renderers in games; there are no primary rays.
 
To be clear, the scene is rasterised, and only when a surface shader has an active raytrace component are 'secondary rays' cast? There's no need for primary rays because the surface evaluation happens during rasterising, and the 'primary ray' data for incidence ray is provided from camera position?
 
Yes, I wasn't thinking about it correctly. They evaluate the materials first and figure out which areas of the screen need rays the most.
 
"To be clear, the scene is rasterised, and only when a surface shader has an active raytrace component are 'secondary rays' cast? There's no need for primary rays because the surface evaluation happens during rasterising, and the 'primary ray' data for incidence ray is provided from camera position?"

Yes, usually primary rays go from camera to scene (which we don't need because we rasterize),
and secondary rays spawn from the first hit (reflection and refraction rays, also shadow rays towards light sources or AO).

So as long as we are hybrid, we will rarely use primary rays.
An exception would be updating an environment map with RT (e.g. for car reflections in a racing game) which would use primary rays as well.
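The hybrid flow described above could be sketched like this (all names hypothetical; `trace_ray` stands in for the ray dispatch): the rasterized G-buffer already *is* the primary hit, and rays are only spawned where the material calls for a secondary effect.

```python
def hybrid_frame(gbuffer, trace_ray):
    """For each G-buffer pixel the rasterizer has already produced the
    'primary hit' (position, albedo, material flags); secondary rays are
    spawned only where the material asks for them."""
    out = []
    for px in gbuffer:
        color = px["albedo"]  # stand-in for direct (rasterized) shading
        if px["reflective"]:  # material property gates the secondary ray
            color = 0.5 * color + 0.5 * trace_ray(px["position"], "reflect")
        out.append(color)
    return out

# Toy G-buffer: one rough pixel (no rays) and one reflective pixel (one ray).
gbuffer = [
    {"albedo": 0.2, "reflective": False, "position": (0.0, 0.0, 0.0)},
    {"albedo": 0.2, "reflective": True,  "position": (1.0, 0.0, 0.0)},
]
shaded = hybrid_frame(gbuffer, lambda pos, kind: 1.0)  # rays always 'see' white
print(shaded)
```

In this framing there is no camera-to-scene ray at all; the camera's contribution is implicit in the rasterized G-buffer, which matches the "no primary rays" point above.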
 