A Few Notes on Future NV Hardware

Dave Baumann · Feb 12, 2003

I've just got back from NVIDIA's "Dusk-till-Dawn" developer event. I'll have an article with an overview of the event up sometime, but here are a couple of little hardware details picked up from the event:

W-Buffer support dropped from future NV hardware (use FP calculations if you want higher precision).
Sounds as though NVIDIA has interpreted the precision spec in DX to allow them to always calculate the PS stages in FP16. The _PP precision hint seems to only apply to texture co-ordinates.
Z and Stencil "are one buffer" so they should be cleared at the same time. Z-Cull is disabled with stencils.
Numerous references to "Second texture runs at full speed". Mmmmm...

McElvis · Feb 12, 2003

Z and Stencil "are one buffer" so they should be cleared at the same time. Z-Cull is disabled with stencils

Is this not the same as for the R300? Or something similar?

Dave Baumann · Feb 12, 2003

Is this not the same as for the R300? Or something similar?

Yes. NV3x sounds as though it will operate in exactly the same way as R300 hardware does in this respect.

jjayb · Feb 12, 2003

So the NV30 will have Chalnoth's so-called "ATI stencil problems" in Doom 3 also?

Ostsol · Feb 12, 2003

DaveBaumann said:
Sounds as though NVIDIA has interpreted the precision spec in DX to allow them to always calculate the PS stages in FP16. The _PP precision hint seems to only apply to texture co-ordinates.

LOL! That's exactly what I thought they'd do! It's kinda too bad. . . I thought I was just being a pessimistic bastard. . .

MuFu · Feb 12, 2003

Did they mention anything or did you pick up any "hints" regarding their next-gen, unified shading model?

MuFu.

Tahir2 · Feb 12, 2003

Did you get a GFFX Ultra too?

andypski · Feb 12, 2003

DaveBaumann said:
Sounds as though NVIDIA has interpreted the precision spec in DX to allow them to always calculate the PS stages in FP16. The _PP precision hint seems to only apply to texture co-ordinates.

How inventive of them...

Now where did my large texture maps go - or at least, where did all my bilinearly interpolated texels go... :?
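To put rough numbers on the texture-map worry: if a normalised coordinate is held in an s10e5 half float, values in [0.5, 1.0) are spaced 2^-11 apart, so a 2048-texel-wide map gets about one representable position per texel, with nothing left over for the bilinear fraction. The Python sketch below is only an illustration under that assumption (coordinates normalised to [0, 1] and stored as IEEE-style half floats). `to_fp16` and `subtexel_steps` are made-up helper names, and none of this is confirmed NV3x behaviour.

```python
import struct

MANTISSA_BITS_FP16 = 10  # s10e5: 1 sign bit, 5 exponent bits, 10 mantissa bits


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest half float and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def subtexel_steps(texture_size: int, mantissa_bits: int = MANTISSA_BITS_FP16) -> float:
    """Representable coordinate steps per texel for u in [0.5, 1.0).

    Assumes a normalised coordinate, so one texel is 1/texture_size wide.
    """
    spacing = 2.0 ** (-1 - mantissa_bits)  # gap between neighbouring values
    texel_width = 1.0 / texture_size
    return texel_width / spacing


if __name__ == "__main__":
    for size in (256, 1024, 2048, 4096):
        print(size, subtexel_steps(size), subtexel_steps(size, mantissa_bits=16))

    # A coordinate a quarter of the way into the last texel of a 2048 map
    # snaps back to the texel edge once it is rounded to a half float.
    u = (2047 + 0.25) / 2048
    print(u, to_fp16(u))
```

Under the same assumptions, passing `mantissa_bits=16` (the s16e7 case) leaves 64 steps per texel on a 2048 map, which is why the minimum precision for texture coordinates matters for filtering.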

DeanoC · Feb 12, 2003

Ostsol said:
DaveBaumann said:

Sounds as though NVIDIA has interpreted the precision spec in DX to allow them to always calculate the PS stages in FP16. The _PP precision hint seems to only apply to texture co-ordinates.

LOL! That's exactly what I thought they'd do! It's kinda too bad. . . I thought I was just being a pessimistic bastard. . .

The good news is that nVidia is wrong! Amar has just confirmed to me (on the DirectX mailing list) that temp registers have to support s16e7 (i.e. 24 bit floats). It appears that there was a typo in the spec.

If this hadn't been corrected, it would have set D3D back several years. This is going to cause havoc with benchmarks: when nVidia 'fix' the drivers and use 32 bit mode, how much LOWER will they score with pixel shader 2?

Now I've got to convince them to fix their drivers... but at least I can publish my ShaderX2 stuff without worry.

Pissed off at 3pm, happy at 10pm (with getting stuck in a bomb scare for 3 hours, that's not bad going).

Althornin · Feb 12, 2003

DaveBaumann said:
W-Buffer support dropped from future NV hardware (use FP calculations if you want higher precision)

Uh-oh, DS, where are you?
It's not just ATI now - it must be a global conspiracy to destroy your games.
But at least we won't have to hear one-sided anti-ATI rants about this anymore...

Dave Baumann · Feb 12, 2003

Thanks Dean

The good news is that nVidia is wrong! Amar has just confirmed to me (on the DirectX mailing list) that temp registers have to support s16e7 (i.e. 24 bit floats). It appears that there was a typo in the spec.

I assume that is "at least 24 bit floats"? i.e. I assume NV3x will actually operate at FP32 precision, rather than FP24?

This is going to cause havoc with benchmarks: when nVidia 'fix' the drivers and use 32 bit mode, how much LOWER will they score with pixel shader 2?

You think the drivers are currently using FP16?

Dave Baumann · Feb 12, 2003

Pissed off at 3pm, happy at 10pm (with getting stuck in a bomb scare for 3 hours, that's not bad going).

Woah. Must have been on your train home?

DeanoC · Feb 12, 2003

Yep, of course you're correct: for PS_2_0 the minimum precision for temp registers has to be s16e7. Same for texture registers, but constant registers can be a minimum of s10e5 and colour iterators can be low precision with a range of 0-1.

Woah. Must have been on your train home?

The bomb scare was at King's Cross, you must have caught your train just before.
I got in the house and made the post straight away and Amar replied very quickly. As you know I was not a happy bunny when nVidia told us, so it was the first thing I wanted to do.

Now the hard bit is going to be convincing them to redo the drivers! Let's hope we can get this cleared up before the GeForce FX boards ever go really live.

I wonder if I've just made an enemy of a large corporation... hope not, they do throw such good parties

LeStoffer · Feb 12, 2003

For Dave (and others) ref, here's a quick key quote from the list that DeanoC is talking about.

- For ps_2_0 compliance, the minimum level of internal precision for temporary registers (r#) is s16e7 (this was incorrectly s10e5 in spec)
- The minimum internal precision level for constants (c#) is s10e5.
- The minimum internal precision level for input texture coordinates (t#) is s16e7.
- Diffuse and specular (v#) are only required to support [0-1] range, and high-precision is not required.

It seems to me (I'm no dev) that an FP16 pipeline might get in a bit of trouble here, although most of us are still in the dark about the NV30 implementation.
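For anyone who wants the quoted minimums in checkable form, here is a small Python sketch. It reads the sNeM notation as N mantissa bits and M exponent bits plus a sign bit, so s10e5 is a 16 bit float, s16e7 is 24 bit and s23e8 is 32 bit. The `FloatFormat` class, the `PS_2_0_MINIMUMS` table and `failing_registers` are illustrative names, not part of any DirectX API, and the v# registers are left out because the quote only gives them a [0-1] range rather than a float format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    mantissa_bits: int
    exponent_bits: int

    @property
    def total_bits(self) -> int:
        return 1 + self.mantissa_bits + self.exponent_bits  # plus the sign bit

    @property
    def epsilon(self) -> float:
        """Relative spacing between 1.0 and the next representable value."""
        return 2.0 ** -self.mantissa_bits

    def meets(self, required: "FloatFormat") -> bool:
        return (self.mantissa_bits >= required.mantissa_bits
                and self.exponent_bits >= required.exponent_bits)


S10E5 = FloatFormat("s10e5 (FP16)", 10, 5)
S16E7 = FloatFormat("s16e7 (FP24)", 16, 7)
S23E8 = FloatFormat("s23e8 (FP32)", 23, 8)

# Minimum internal precision per register class, as quoted from the list.
PS_2_0_MINIMUMS = {"r#": S16E7, "c#": S10E5, "t#": S16E7}


def failing_registers(pipeline: FloatFormat) -> list:
    """Register classes whose quoted minimum the given format does not meet."""
    return [reg for reg, need in PS_2_0_MINIMUMS.items() if not pipeline.meets(need)]


if __name__ == "__main__":
    for fmt in (S10E5, S16E7, S23E8):
        print(fmt.name, fmt.total_bits, fmt.epsilon, failing_registers(fmt))
```

On this reading, a pipeline that only ever computes in s10e5 falls short for r# and t# but is fine for c#, while s16e7 and s23e8 meet every float minimum in the quote.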

KimB · Feb 13, 2003

jjayb said:
So the NV30 will have Chalnoth's so-called "ATI stencil problems" in Doom 3 also?

Possibly. Will need some tests. The disabled technology is definitely called by a different name between the two cards. How different is it in reality?

Dave Baumann · Feb 13, 2003

Chalnoth said:
Possibly. Will need some tests. The disabled technology is definitely called by a different name between the two cards. How different is it in reality?

The upshot, Chalnoth, is that regardless of how they will perform in any single title they both have exactly the same characteristics in regards to Z removal and stencil ops.

Dave Baumann · Feb 13, 2003

DeanoC said:
The bomb scare was at King's Cross, you must have caught your train just before.

Thameslink is not joined to the main King's Cross station, prolly why we avoided it. Heh - sounds like fun getting home for everyone what with bomb scares at train stations and the army around Heathrow!

Now the hard bit is going to be convincing them to redo the drivers! Let's hope we can get this cleared up before the GeForce FX boards ever go really live.

What makes you think they are already using FP16 in the drivers?

I wonder if I've just made an enemy of a large corporation... hope not, they do throw such good parties

Yes, shall have to dig out my camera!

JD · Feb 13, 2003

Sometimes you need to clear the stencil buffer only. Shadow volumes with multiple lights come to mind.
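As a toy model of why a stencil-only clear can cost more when Z and stencil "are one buffer": assume each pixel is a single 32 bit word with 24 bits of depth above 8 bits of stencil. That layout is an assumption for illustration only, as are the helper names `pack`, `clear_both` and `clear_stencil_only`. Clearing both is a plain overwrite, whereas clearing stencil alone has to read each word back so the depth bits survive. Real hardware has its own fast-clear paths, so this Python sketch shows the general idea and not how NV3x or R300 actually do it.

```python
STENCIL_BITS = 8
STENCIL_MASK = (1 << STENCIL_BITS) - 1  # low 8 bits hold stencil
DEPTH_MASK = 0xFFFFFF00                 # high 24 bits hold depth


def pack(depth: int, stencil: int) -> int:
    """Combine a 24 bit depth and an 8 bit stencil into one 32 bit word."""
    return ((depth << STENCIL_BITS) & DEPTH_MASK) | (stencil & STENCIL_MASK)


def clear_both(buffer: list, depth: int, stencil: int) -> None:
    """Write-only clear: every word is overwritten, nothing is read."""
    word = pack(depth, stencil)
    for i in range(len(buffer)):
        buffer[i] = word


def clear_stencil_only(buffer: list, stencil: int) -> None:
    """Read-modify-write clear: depth bits must be read back and preserved."""
    for i in range(len(buffer)):
        buffer[i] = (buffer[i] & DEPTH_MASK) | (stencil & STENCIL_MASK)


if __name__ == "__main__":
    pixels = [pack(5, 3), pack(7, 9)]
    clear_stencil_only(pixels, 0)   # depth values 5 and 7 are kept
    print([hex(p) for p in pixels])
    clear_both(pixels, 0xFFFFFF, 0)  # everything reset in one pass
    print([hex(p) for p in pixels])
```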

When can we expect Derek Smart here?

DeanoC · Feb 13, 2003

DaveBaumann said:
DeanoC said:

Now the hard bit is going to be convincing them to redo the drivers! Let's hope we can get this cleared up before the GeForce FX boards ever go really live.

What makes you think they are already using FP16 in the drivers?

Various chats with some of the Dev Rels (I spent a fair few hours after the party/at breakfast chatting with them) seemed to indicate that Dx9 PS_2_0 was running. I was told that I could run my code as soon as I got the board; also, the cgFX demo was run on both OpenGL and Dx9 and some of them used long shaders.

As they were warning us about using FP16, they must have written the driver to use it. The backend conversion from PS_2_0 to internal ops will be quite FP16 specific to get good performance (packing temp registers etc).
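One possible reading of "packing temp registers", offered purely as a guess: two half floats share a single 32 bit slot, so an FP16 shader needs half as many full-width registers. The Python sketch below shows only the bit-level idea. `pack_halves` and `unpack_halves` are hypothetical names, and nothing in the thread confirms how the NV3x backend really lays out its registers.

```python
import struct
from typing import Tuple


def pack_halves(lo: float, hi: float) -> int:
    """Store two half floats in one 32 bit word (lo in bits 0-15, hi in 16-31)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_halves(word: int) -> Tuple[float, float]:
    """Recover the two half floats from a packed 32 bit word."""
    return struct.unpack("<2e", struct.pack("<I", word))


if __name__ == "__main__":
    word = pack_halves(1.5, -2.0)
    print(hex(word), unpack_halves(word))

    # Values that a half float cannot hold exactly come back rounded,
    # which is the precision cost being argued about in this thread.
    print(unpack_halves(pack_halves(0.1, 1000.3)))
```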

pocketmoon_ · Feb 13, 2003

Hey Dave,

Were you the one dancing on the bar with the girls?

The whole FP16/32 issue is very confusing - the presenter struggled with a LOT of the information he was trying to get across.
