Nvidia Turing Speculation thread [2018]

Just in case you are not being sarcastic: that's just a grossly oversimplified illustration to show how different parts of the calculation can overlap. In fact, the RT and tensor cores are integrated into the individual SMs. Even assuming the chip shot is an equally oversimplified artist's interpretation that doesn't resemble reality at all, it would take large amounts of energy to move all that data around for a single frame. The 24 bright spots along the upper and lower horizontal middle, for example, are most likely the raster backends/ROPs.
I’m pretty sure he was using the slide to show that RT has been accelerated (the time graphs), not what the real layout looks like.

I learned not to believe anything that Nvidia or anyone affiliated with them says unless I see a proof of concept. That fiasco with the 5xxx series was literally the last nail in the coffin for me when it comes to that company.
If your technical judgement is so clouded by emotion, why bother to engage in this kind of discussion?
 
Yeah it would have to be almost real-time to be used in instant replays.

Anything based on deep learning also has the potential to generate artefacts,
particularly in rare cases not seen during training.
If you look at the DL slo-mo footage of the falling ice hockey skater, there are huge tearing artefacts on the skates.
As good as it looks, it might not be good enough to cover everything.
 
That deep slomo network requires at least an order of magnitude more calculations and cannot be done in real time. (At least not on a single GPU.)

It’s a large, deep network.

Here’s the paper: https://arxiv.org/pdf/1712.00080.pdf
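For a rough sense of scale, here's a back-of-envelope sketch of that "not real time" argument. Every constant in it is a placeholder assumption of mine, not a number from the paper:

```python
# Back-of-envelope: could a deep frame-interpolation network run in real time
# at 1080p on one GPU? Every constant below is an illustrative placeholder,
# not a figure from the paper.

width, height = 1920, 1080
pixels = width * height

ops_per_pixel = 2.0e6            # assumed cost of a large U-Net-style model, FLOP per pixel
ops_per_frame = pixels * ops_per_pixel

sustained_flops = 10e12          # assumed sustained throughput of one high-end GPU, FLOP/s

seconds_per_frame = ops_per_frame / sustained_flops
fps = 1.0 / seconds_per_frame

print(f"~{ops_per_frame / 1e12:.1f} TFLOP per frame -> ~{fps:.1f} fps")
# With these placeholders: ~4.1 TFLOP per frame -> ~2.4 fps, nowhere near
# the 30-60 fps a live instant-replay pipeline would want.
```

The point isn't the exact numbers, just that per-frame cost divided by any realistic sustained throughput lands nowhere near broadcast frame rates.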
Cue Jensen "The more you buy the more you save" :)
I assume it comes down to whether the spanking new large DGX node can be centralised at the broadcasting station and kept away from the hi-res TV/film camera feed, giving greater flexibility.
Those hi-res broadcast cameras used for best fidelity in sports are (or at least used to be) shockingly expensive, even without considering high-speed capture for slo-mo playback.
 
I am very confused about which parts of the denoising are done with the help of AI and which with normal shaders doing the work.

In the Quadro Turing presentation they showed that only global illumination is denoised by an AI-based denoiser, while reflections and all the other lighting effects are denoised with compute. But the slide at the Turing event had no mention of AI denoising at all, not even for global illumination.

Are the tensor cores too slow for it in real-time gaming situations?
 
I think the tensor cores are used for what they're calling DLSS (deep-learning super-sampling), which seems to be an upscale rather than real super-sampling.
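A quick pixel-count comparison (resolutions picked purely as an example) of why an upscale and real super-sampling sit on opposite sides of native rendering cost:

```python
# Shading cost scales roughly with rendered pixels. Resolutions below are
# just an example pairing for a 4K output target.

native_4k = 3840 * 2160      # pixels shaded when rendering 4K natively
internal  = 2560 * 1440      # hypothetical lower internal resolution, then upscaled
ssaa_4x   = native_4k * 4    # true 4x super-sampling shades four samples per pixel

print(f"upscale path shades {internal / native_4k:.0%} of native 4K")  # 44%
print(f"4x SSAA shades {ssaa_4x / native_4k:.0%} of native 4K")        # 400%
```

So whatever DLSS turns out to be, cost-wise it sits on the cheap side of native rendering, not the expensive side that "super-sampling" implies.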
 
Yes, they are used for that, but Jensen also suggested they are, or at least could be, used for ray-tracing denoising in some capacity. Nvidia is a bit vague about it.

Maybe it's just something they are still developing, and we will see AI-based denoising of ray tracing at a later time.
 
You seem to imply Tensors couldn't be used for both?
 
The tensor cores are used for denoising in practically all of the RT demos shown (Pica Pica, Star Wars, Cornell box, etc.). Now, can they do both denoising and DLSS at the same time? We don't know yet.
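For what it's worth, here's a minimal, hypothetical sketch (PyTorch; the layer sizes and the idea of feeding albedo/normal buffers alongside the noisy colour are my assumptions, not Nvidia's actual denoiser) of the general shape of a learned denoiser for low-sample-count ray-traced output:

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Toy stand-in for a learned RT denoiser: noisy 1-spp colour plus
    auxiliary G-buffers in, filtered colour out. Purely illustrative."""
    def __init__(self):
        super().__init__()
        # 3 (noisy RGB) + 3 (albedo) + 3 (world-space normal) input channels.
        self.net = nn.Sequential(
            nn.Conv2d(9, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
        )

    def forward(self, noisy_rgb, albedo, normal):
        x = torch.cat([noisy_rgb, albedo, normal], dim=1)
        # Predict a residual so the network only has to learn the noise.
        return noisy_rgb + self.net(x)

# One small tile of a frame (random data stands in for render output).
noisy  = torch.rand(1, 3, 256, 256)
albedo = torch.rand(1, 3, 256, 256)
normal = torch.rand(1, 3, 256, 256)
clean  = TinyDenoiser()(noisy, albedo, normal)
print(clean.shape)  # torch.Size([1, 3, 256, 256])
```

Whether what actually ships on Turing looks anything like this, per effect or globally, is exactly the part Nvidia has left vague.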
 
Honestly, here's what I expect from the 7nm TU102 replacement:
  • Increased clockspeed to around 2350-2500 MHz boost, 2050 base
  • 2 RT cores per SM
  • 64 SMs (4096 cc)
  • 16GB HBM2 or 24GB GDDR6
  • Identical tensor core count
That should be a significant leap over Pascal in "normal" FP32 workloads while offering faster RT performance than Turing at a (hopefully) smaller die size and lower cost.
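As a sanity check on those numbers (assuming Turing's 64 FP32 cores per SM carries over to this hypothetical 7nm part, which is my assumption):

```python
# FP32 throughput implied by the speculated configuration above,
# assuming a Turing-style SM with 64 FP32 cores.

sms            = 64
cores_per_sm   = 64
boost_clock_hz = 2.4e9                              # middle of the 2350-2500 MHz guess
flops = sms * cores_per_sm * 2 * boost_clock_hz     # 2 FLOP per core per clock (FMA)
print(f"{flops / 1e12:.1f} TFLOPS FP32")            # ~19.7 TFLOPS
```

Call it roughly 19-20 TFLOPS FP32, versus the ~13-14 TFLOPS of a stock RTX 2080 Ti, so the "significant leap" part checks out if the clocks materialise.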
 
I think a big limiter will be bandwidth in the 7nm parts. NVIDIA is already using 14 Gbps GDDR6, so going to the current maximum of 18 Gbps is only a modest bump.

They may need to go to HBM in the 102-class GPU to get the bandwidth.
 
That deep slomo network requires at least an order of magnitude more calculations and cannot be done in Real time. (At least not on a single GPU.)
Wonder if it's a future feature, since it's included in the NGX SDK. Currently there are only details for DLSS and AI Painting, but the stack also includes placeholders for AI Slow-Mo and AI Res-Up.
https://developer.nvidia.com/rtx/ngx
 

A theoretical 18 Gbps/384-bit part would see memory bandwidth go up to 864 GB/s from 616 GB/s, a healthy 40% increase and well within the realm of possibility.
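Quick check of that math, using the 2080 Ti's 352-bit bus at 14 Gbps as the baseline:

```python
# GDDR6 bandwidth = per-pin rate (Gbps) * bus width (bits) / 8 bits per byte.
def bandwidth_gb_s(gbps_per_pin, bus_width_bits):
    return gbps_per_pin * bus_width_bits / 8

current = bandwidth_gb_s(14, 352)   # RTX 2080 Ti: 14 Gbps on a 352-bit bus -> 616 GB/s
future  = bandwidth_gb_s(18, 384)   # hypothetical 18 Gbps on a full 384-bit bus -> 864 GB/s
print(current, future, f"{future / current - 1:.0%}")   # 616.0 864.0 40%
```

So the 40% figure is right; whether that is enough headroom for a much bigger 7nm part is the open question.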
 
Way too much chit-chat, IMO, and very little actual detail on the questions that were asked.

And this recent attitude of "thanks to RT everything feels real and like you're actually in the game, so everything before this moment was crap" is kind of annoying.
 