Nvidia Turing Architecture [2018]

Looking at the new SM architecture with 16 FP/INT units instead of 32, I'm wondering how long it will take for warps to be reduced from 32 to 16 threads. That could significantly reduce the cost of thread divergence.
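Rough CUDA sketch of what I mean (the kernel and names are made up purely for illustration): with today's 32-thread warps, a data-dependent branch makes the whole warp step through both sides with lanes masked off, so a 16-thread warp would halve the number of threads that can drag each other through untaken paths.

```cuda
// Minimal divergence example: lanes of one warp take different branches,
// so the warp executes both paths, masking off the inactive lanes each time.
__global__ void divergent(const int* in, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Lanes 0..15 take path A, lanes 16..31 take path B.  With a 32-thread
    // warp both paths serialize for all 32 lanes; with a hypothetical
    // 16-thread warp this particular split would not diverge at all,
    // because each half would already be its own warp.
    if ((threadIdx.x & 31) < 16)
        out[i] = in[i] * 2;   // path A
    else
        out[i] = in[i] + 7;   // path B
}
```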
 
The FP units have operated in groups of 16 for quite a while already. That's nothing new in Volta/Turing.
 
SIMD width is not really indicative of warp size. G80 was 8-wide and took 4 clocks to execute an instruction. It would also increase scheduler cost significantly and break years of optimizations tailored for 32 wide warps.
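To make that concrete: warp size is an architectural contract the runtime reports on its own, independent of how wide the execution units happen to be. A trivial query (nothing vendor-internal, just the public CUDA runtime API):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;

    // Reports 32 on every CUDA GPU so far, whether the SIMD groups were
    // 8 wide (G80), 16 wide (Fermi-style FP groups) or 32 wide -- the
    // hardware simply takes more or fewer clocks per warp instruction.
    std::printf("warpSize = %d\n", prop.warpSize);
    return 0;
}
```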
 
SIMD width is not really indicative of warp size. G80 was 8-wide and took 4 clocks to execute an instruction. It would also increase scheduler cost significantly and break years of optimizations tailored for 32 wide warps.

I understand that, but isn't thread divergence determined by the number of threads in a warp, and not by the number of threads in the sense of SIMD lanes? If it's the former (as I understood it to be), I know the scheduling rate would need to increase, but thread divergence should improve a lot with warp size 16, as there would be only 16 instead of 32 threads to diverge within a warp.
 
FP-group size has been 16 since Fermi, but going forward from Maxwell, Nvidia did not make that readily apparent in their diagrams anymore. You really had to ask about it. So, I don't think they will do away with the economization of control vs. execution width. I'm actually glad they did not increase it.
 
Reading a bit more, thread divergence seems to be dictated by warp size, which has always been 32 for NV. It would make sense to reduce the warp size to 16 to reduce the impact of divergence, IMHO.
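If anyone wants to poke at it, here is a small sketch (my own example; note that on Volta+ with independent thread scheduling __activemask() is only an approximation of "who took this branch"): count the active lanes on each side of a branch and the two counts add up to the 32-thread warp, no matter how the SIMD units are grouped underneath.

```cuda
#include <cstdio>

// From inside each side of a divergent branch, report how many of the
// warp's 32 lanes are active there.  Divergence is bookkept per warp,
// not per 16-wide execution group.
__global__ void countBranchLanes(const int* data)
{
    unsigned lane = threadIdx.x & 31;

    if (data[threadIdx.x] > 0) {
        unsigned active = __activemask();         // lanes on this path
        if (lane == unsigned(__ffs(active) - 1))  // lowest active lane reports
            printf("warp %d: %2d lanes took the '>0' path\n",
                   threadIdx.x / 32, __popc(active));
    } else {
        unsigned active = __activemask();
        if (lane == unsigned(__ffs(active) - 1))
            printf("warp %d: %2d lanes took the '<=0' path\n",
                   threadIdx.x / 32, __popc(active));
    }
}
```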
 
So does RTX not need DXR to work? I am asking because the Ray Tracing in the StarWars demo works on both Pascal and Turing without DXR.
 
So does RTX not need DXR to work? I am asking because the Ray Tracing in the StarWars demo works on both Pascal and Turing without DXR.
I'm assuming via Nvidia OptiX in their drivers? Likely the demo is hand-tuned to OptiX, which means it probably won't work on Vega once the next W10 update hits (and AMD adds DXR to its drivers).
 
Nope, it can also run as an acceleration layer for Vulkan RT and OptiX.

Unfortunately the screenshots of the upcoming 3dmark RT benchmark look quite unimpressive. Hopefully someone does a proper DXR or Vulkan benchmark.

Isn't the "look" irrelevant as long as it gives numbers to compare across hardware?
 
So does RTX not need DXR to work? I am asking because the Ray Tracing in the StarWars demo works on both Pascal and Turing without DXR.
The Star Wars Reflections demo is built using DXR. RTX accelerates DXR, but DXR can run on any DX12 GPU using the fallback layer.
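For completeness, this is roughly what the check looks like from an application's point of view (host-side C++ only; the D3D12 calls are the public API, everything else is my own sketch): if the runtime reports no hardware raytracing tier, the app has to fall back to the compute-based fallback layer.

```cpp
// Host-only sketch: ask D3D12 whether hardware DXR is available.
// Build on Windows and link against d3d12.lib.
#include <windows.h>
#include <d3d12.h>
#include <cstdio>

int main()
{
    ID3D12Device* device = nullptr;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device))))
        return 1;

    D3D12_FEATURE_DATA_D3D12_OPTIONS5 opts = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                              &opts, sizeof(opts))) &&
        opts.RaytracingTier >= D3D12_RAYTRACING_TIER_1_0) {
        std::printf("Hardware DXR available (tier %d).\n",
                    int(opts.RaytracingTier));
    } else {
        // No hardware tier: this is where the DXR fallback layer
        // (compute shaders on any DX12 GPU) would come in.
        std::printf("No hardware DXR; fallback layer needed.\n");
    }
    device->Release();
    return 0;
}
```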
 
Isn't the "look" irrelevant as long as it gives numbers to compare across hardware?

We already have a benchmark where looks are irrelevant, it’s called math.

3DMark actually used to be a benchmark in the literal sense by producing visuals that games of the time couldn’t match. It set expectations for what’s possible in the future when using the latest and greatest tech. It hasn’t been that for a long time. Now it just spits out a useless number and doesn’t look good doing it.

I miss the good old days of Nature and Airship.
 
We already have a benchmark where looks are irrelevant, it’s called math.

3DMark actually used to be a benchmark in the literal sense by producing visuals that games of the time couldn’t match. It set expectations for what’s possible in the future when using the latest and greatest tech. It hasn’t been that for a long time. Now it just spits out a useless number and doesn’t look good doing it.

I miss the good old days of Nature and Airship.

Aaaah, the sweet memories of running Nature on an ATI 9700 Pro! Bliss! :smile2:
 
Pretty cool. Still waiting on that killer app for VR to really take off, though.
It will certainly be interesting to see how it finally happens and whether it needs additional hardware changes.
We might see something nice in a couple of days, though. (Hoping for a Turing/Oculus presentation.)
We already have a benchmark where looks are irrelevant, it’s called math.

3DMark actually used to be a benchmark in the literal sense by producing visuals that games of the time couldn’t match. It set expectations for what’s possible in the future when using the latest and greatest tech. It hasn’t been that for a long time. Now it just spits out a useless number and doesn’t look good doing it.

I miss the good old days of Nature and Airship.
Indeed.

It would be a lot more interesting if it gave proper timings for what happens within the GPU during the test.
For ray tracing, that means the pure tracing speed as well as runs with different kinds of shaders attached. (To see how the GPU handles tracing plus shading; I'm quite sure the software path will take a hit in comparison.)

The good old demos were gorgeous.
 