Nvidia Turing Speculation thread [2018]

My guess is the GV104 will be quite a bit larger than GP104.

But let me indulge in some wishful thinking for a second... weren't Tesla P100 and GTX 1080 released in relatively quick succession? I think it's been over a year since the V100 release, so maybe the architectural differences between the data-center and gaming lines of Volta are greater than for Pascal. From what I remember, the ray-tracing demos we've seen so far were running on multi-GPU V100 workstations with obscene price tags. At the same time, it's not really clear how much hardware V100 dedicates to accelerating ray-tracing. If the answer is "not much", and gaming-Volta could bring that sort of image quality within consumer reach, it would certainly motivate me to upgrade, even if the performance uplift in other scenarios were unimpressive.
I doubt there will be any sort of dedicated hardware for RT; PowerVR already tried it ages ago without getting any real traction (that I'm aware of, at least).
The current NVIDIA demos use Tensor cores to accelerate the denoising algorithms (which are used to lighten the load by simulating fewer rays and guesstimating the rest).

Tensor cores are also where it gets "tricky" for NVIDIA: they are limited to certain types of calculations which are next to useless in current games, and I dare say for the foreseeable future too, outside of perhaps some special GameWorks shenanigans and the previously mentioned RT (which isn't used in games yet).
So how much silicon can you really waste on them?
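
To make that concrete, here's a minimal sketch of the one thing a Volta tensor core actually does, as exposed through CUDA's WMMA intrinsics: a fused multiply-accumulate on small FP16 matrix tiles (D = A*B + C). The kernel and matrix names are just illustrative, and this says nothing about how NVIDIA's own denoisers are written; it only shows why work that can't be phrased as dense matrix math gets nothing out of the units.

```cuda
#include <mma.h>
#include <cuda_fp16.h>

using namespace nvcuda;

// One warp computes a single 16x16 tile of D = A*B + C on a tensor core.
// A and B are FP16, the accumulator is FP32; the tiles are assumed to be
// 16x16 and already resident in device memory.
__global__ void tile_mma(const half *A, const half *B, const float *C, float *D)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::load_matrix_sync(a_frag, A, 16);                        // stride of 16 elements
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::load_matrix_sync(acc_frag, C, 16, wmma::mem_row_major);

    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);           // the one fixed operation

    wmma::store_matrix_sync(D, acc_frag, 16, wmma::mem_row_major);
}
```

That maps nicely onto the inference side of an AI denoiser, but regular shader arithmetic still runs on the ordinary FP32/INT units, which is exactly the "how much silicon do you spend on them" question.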
 
Nvidia has been pretty cagey about their DXR implementation. I’ve only seen tensor/AI denoising mentioned in reference to OptiX, not DXR.

The OptiX AI denoising technology combined with the new NVIDIA Tensor Cores in the Quadro GV100 and Titan V deliver 3x the performance of previous generation GPUs and enable noiseless fluid interactivity for the first time.



I imagine any developer that goes down this path will target general compute performance and not zomg tensors.

On the other hand, DXR performance is going to depend heavily on some secret sauce to speed up acceleration structure construction and ray casting. Microsoft has implemented a DirectX compute-based fallback path for hardware that doesn't have special support for DXR, and they expect it to be slowwww. So any useful implementation would presumably require custom hardware.
 
Nvidia has been pretty cagey about their DXR implementation. I've only seen tensor/AI denoising mentioned in reference to OptiX, not DXR.
It's possible I'm just remembering it wrong and/or understood it wrong, but then it's hard to imagine what else would explain limiting it to Volta.
 
I doubt there will be any sort of dedicated hardware for RT; PowerVR already tried it ages ago without getting any real traction (that I'm aware of, at least).
Why not? NVIDIA already uses their architectural features to introduce/accelerate special effects in the PC space, stuff like Ansel, Multi-Res Shading/Lens Matched Shading, Single Pass Stereo, and Perspective Surround. These effects are powered by the Simultaneous Multi-Projection (SMP) feature that is only possible due to the PolyMorph Engine 4.0 and the way NVIDIA structures its geometry units in Maxwell and Pascal. My guess is that RTX could be something similar.
 
Guess that seals it. I'm really curious as to what GV104 will bring to the table that would match or beat GP102 at a considerably lower die size and power consumption. Given what we've seen from Titan V, Volta's performance per flop in games isn't encouraging.

GDDR6 will help with bandwidth but it seems GV104 will need ~450mm^2 to deliver a useful upgrade.

GV104 might not be significantly smaller than GP102.
With GDDR6 ramping up production, GDDR5X is in an odd place (not as cheap and widely used as GDDR5 + not as fast as GDDR6 + RAM being generally starved of available production lines), so GPUs using it may be EOL'd soon.
At the same time, 16/12FF isn't a premium process anymore, so making larger chips may be justified.

By taking away the GDDR5X models, we might end up with a consumer lineup of:

- GTX 1180 replacing all the GP102 products (1080 Ti + Titan X + Titan Xp)
- GTX 1170 (cut-down GV104 with 192-bit GDDR6 and fewer SMs) replacing GTX 1080
- GTX 1070 Ti
- GTX 1070
- (everything the same below that)
 
As already mentioned, as of right now RTX is only "Denoising" using the tensor cores. I've personally confirmed this with devs at Epic who worked on the Star Wars tech demo with ILMxLabs. But as usual, Nvidia's PR is all about making it sound like they invented sliced bread. As a matter of fact, even Epic devs are "encouraged" by Nvidia to barely mention DXR and instead talk about RTX when discussing UE4's DXR implementation... that's the "price" to pay when you get free HW and help/resources from NV, I guess. There's a blog post on Nvidia.com discussing the history of ray tracing... and as you guessed it, there isn't a single mention of DXR in it... RTX, on the other hand...

Example:

Same video... different names:
Remedy:
Nvidia:
 
What type of hardware/calculations do we need to speed up DXR/RTX? I mean, what is the bottleneck right now? VRAM bandwidth? On-chip cache? Raw compute power?
 
as of right now RTX is only "Denoising" using the tensor cores
The tech is still in its infancy; it's naive to think that's the only part of the pipeline that will ever be accelerated.
As a matter of fact even Epic devs are "encouraged" to barely mention DXR and instead talk about RTX by Nvidia when discussing UE4's DXR implementation
Those very demos are accelerated using hardware provided by NVIDIA, whether we are talking about demos from Epic, DICE, or Remedy; all were performed on NVIDIA Volta hardware. Under these circumstances, I don't think it's unusual for NVIDIA to ask for its name on the box.

NVIDIA's RTX page actually has more description for DXR than RTX. So it's not really a total obfuscation of the existence of DXR.
https://devblogs.nvidia.com/introduction-nvidia-rtx-directx-ray-tracing/

The GameWorks RTX subsection only has RTX denoising at the moment, used to achieve a variety of effects like Area Shadows, Lights, Reflections and Ambient Occlusion.
https://developer.nvidia.com/gameworks-ray-tracing
 
What type of hardware/calculations do we need to speed up DXR/RTX? I mean, what is the bottleneck right now? VRAM bandwidth? On-chip cache? Raw compute power?

Primarily compute and bandwidth efficiency. Building the acceleration structure and casting rays both require lots of random memory accesses and branchy code where different threads/rays in a warp can all be doing their own thing. Each thread can even be running a different shader (depending on what object the ray hits). This doesn't play well with GPUs which thrive on coherent memory access and coherent thread execution.

Don't think there's a silver bullet. Just need more of everything - flops, bandwidth and huge caches.
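
To put some code to that, below is a toy CUDA sketch of the traversal loop a ray caster runs. The data layout and names are completely made up (this is not how DXR, OptiX or any vendor actually implements it, and the ray/triangle test is left out); the point is only that the loop length, the branches and the node addresses are all decided per ray, so the 32 lanes of a warp drift apart and their memory accesses stop coalescing.

```cuda
#include <cuda_runtime.h>

// Toy BVH layout, purely illustrative.  Inner node: left/right are child
// indices.  Leaf: left is negative and the node's triangles would be located
// via 'right' (the ray/triangle test itself is omitted here).
struct Node { float3 bmin, bmax; int left, right; };
struct Ray  { float3 o, invd; };   // origin and 1/direction, precomputed on the host

__device__ bool hitAABB(const Node &n, const Ray &r, float tMax)
{
    // Slab test; the operand values and the final compare differ per ray.
    float tx0 = (n.bmin.x - r.o.x) * r.invd.x, tx1 = (n.bmax.x - r.o.x) * r.invd.x;
    float ty0 = (n.bmin.y - r.o.y) * r.invd.y, ty1 = (n.bmax.y - r.o.y) * r.invd.y;
    float tz0 = (n.bmin.z - r.o.z) * r.invd.z, tz1 = (n.bmax.z - r.o.z) * r.invd.z;
    float tnear = fmaxf(fmaxf(fminf(tx0, tx1), fminf(ty0, ty1)), fminf(tz0, tz1));
    float tfar  = fminf(fminf(fmaxf(tx0, tx1), fmaxf(ty0, ty1)), fmaxf(tz0, tz1));
    return tfar >= fmaxf(tnear, 0.0f) && tnear < tMax;
}

// One thread per ray.  Every lane walks its own path through the tree, so the
// branches below are taken differently across the warp (divergence) and the
// nodes[] reads land on unrelated addresses (no coalescing, cache thrashing).
__global__ void traceRays(const Node *nodes, const Ray *rays, float *hitT, int numRays)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numRays) return;

    Ray   r     = rays[i];
    float tBest = 1e30f;               // "no hit yet"
    int   stack[64];
    int   sp = 0;
    stack[sp++] = 0;                   // start at the root

    while (sp > 0) {                   // data-dependent trip count per ray
        Node n = nodes[stack[--sp]];   // scattered, per-ray memory access
        if (!hitAABB(n, r, tBest))
            continue;                  // some lanes skip, others keep going
        if (n.left < 0) {
            // Leaf: a real tracer would fetch this leaf's triangles (more
            // incoherent reads), intersect them, shrink tBest, and possibly
            // pick a different hit shader per lane.
        } else {
            stack[sp++] = n.left;      // inner node: push both children
            stack[sp++] = n.right;
        }
    }
    hitT[i] = tBest;
}
```

Compare that to rasterization, where neighbouring pixels run the same shader over the same triangle and walk through textures in order; that coherence is what the SIMT pipeline and the cache hierarchy are built around.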
 
Nvidia EEC registrations indicate three new GeForce GPUs
Remember the Manli entries? It's happening again. This round, Nvidia has active registrations that point to the arrival of three models of GeForce video cards. Earlier on, the PCB of one of the cards already surfaced; a picture of a PCB with model number PG180 already appeared, remember?

If you look closely at the backside of that PCB, it has PG180 written on it, and guess what? Following on from that, registrations indicate the arrival of PG180, PG160 and PG150; you can check that here (EEC), and here (EEC).

The registrations thus indicate the arrival of more PCBs for more graphics card models, which presumably will each use a different GPU. If you follow the numbering, logic would point to a GTX 1150, GTX 1160 and a GTX 1180. But with PCB names, that isn't always the case, to be honest. It could very well be that the PCB numbers have nothing to do with the graphics card names.
https://www.guru3d.com/news-story/nvidia-eec-registrations-indicate-three-new-geforce-gpus.html
 
How is it possible that the cards are days away from an announcement and such basic things as chip designation and card name, let alone features and specifications, are still seemingly unsettled? Did Nvidia hire a bunch of former CIA agents to manage leak prevention or something?
 
What type of hardware/calculations do we need to speed up DXR/RTX? I mean, what is the bottleneck right now? VRAM bandwidth? On-chip cache? Raw compute power?
As trinibwoy mentioned, the issue is in grouping thread branches and coalescing memory access. Assuming a curved surface, the rays will diverge horrifically: essentially like looking through a magnifying glass in reverse. That requires a lot of memory/cache bandwidth to pack a vector for a SIMD unit.

Primarily compute and bandwidth efficiency. Building the acceleration structure and casting rays both require lots of random memory accesses and branchy code where different threads/rays in a warp can all be doing their own thing. Each thread can even be running a different shader (depending on what object the ray hits). This doesn't play well with GPUs which thrive on coherent memory access and coherent thread execution.

Don't think there's a silver bullet. Just need more of everything - flops, bandwidth and huge caches.
The silver bullet may be procedural textures and geometry as a method to "compress" the resources and increase effective cache. While there will be some divergence as you mentioned, each thread should be roughly running the same instruction, but with incoherent memory accesses (hall of mirrors).

A series of cascaded DSPs or FPGA might be more efficient from a hardware perspective as the limiting factor will likely be the memory controller encountering stalls.
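
As a sketch of that "compute it instead of fetching it" idea (just the concept, not something any vendor has announced, and the function names below are made up): if surface detail is a small function of the hit position, every lane executes the same instructions and no texture tiles get dragged through the cache, so incoherent rays at least stop fighting over texture bandwidth.

```cuda
#include <cuda_runtime.h>

// A tiny procedural "texture" evaluated from the hit point in registers.

__device__ float hashCell(int x, int y, int z)
{
    // Cheap integer hash to [0,1); stands in for a stored texel.
    unsigned int h = (unsigned int)x * 73856093u
                   ^ (unsigned int)y * 19349663u
                   ^ (unsigned int)z * 83492791u;
    h ^= h >> 13;  h *= 0x5bd1e995u;  h ^= h >> 15;
    return (h & 0x00FFFFFFu) / 16777216.0f;
}

__device__ float proceduralAlbedo(float3 p, float scale)
{
    // Value "noise": hash the lattice cell containing the hit point and blend
    // with the neighbouring cell along x.  A real shader would do a full
    // trilinear blend over several octaves; kept to one axis here for brevity.
    float fx = p.x * scale;
    int   ix = (int)floorf(fx);
    int   iy = (int)floorf(p.y * scale);
    int   iz = (int)floorf(p.z * scale);
    float t  = fx - (float)ix;
    return hashCell(ix, iy, iz) * (1.0f - t) + hashCell(ix + 1, iy, iz) * t;
}
```

The trade-off is that you're converting memory traffic into ALU work, and the BVH nodes themselves still have to come from memory, so it only attacks part of the problem.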

How is it possible that the cards are days away from an announcement and such basic things as chip designation and card name, let alone features and specifications, are still seemingly unsettled? Did Nvidia hire a bunch of former CIA agents to manage leak prevention or something?
With Volta, Ampere, and Turing, it's unclear which chip is covering which part of the product stack. It's also possible there are minimal changes to the feature set, so it's not worth disclosing for marketing purposes. Sure, there is the ray stuff, but I haven't seen anything to do with higher DX feature tiers, and some of those may be problematic, as "current" hardware was already supposed to have support or it was deemed "unnecessary". It would be rough to admit to hardware-accelerated async compute and scheduling.
 
How is it possible that the cards are days away from an announcement and such basic things as chip designation and card name, let alone features and specifications, are still seemingly unsettled? Did Nvidia hire a bunch of former CIA agents to manage leak prevention or something?

The announcement is the reason. NV has tight control over their partners, so not even they are leaking. Stuff can only leak when NV's partners start shipping to distributors, and the hard launch is probably still a month away. In 3 days we should know a lot more, though, when they launch/announce the new Quadros.
 
Well, I'm hoping that the name RTX implies that the GPU actually has some usable features and performance for ray tracing (if only for partial algorithms).
I doubt it's anything more than GV100 already has
 