For The Last Time: SM2.0 vs SM3.0

Is there anywhere I can get some pictures/articles on HDR? I'd like to know what it's good for.
 
malficar said:
I keep getting conflicting answers on a question of mine I would like answered. This is purely from a gamer/user perspective, NOT a coder or developer. I am not interested in things happening on the driver level.

So here's the question. Is there any visible effect in a game that SM 3.0 can produce that SM 2.0 cannot, even with a little more work? So if I do go with the X800XT over the 6800GT, will I be missing 'neat effects', as some have put it?

Thanks in advance.
There's a significant difference between your thread title and your actual question.
The X800 series is not limited to shader version 2.0. It can do shader version "2.x", and is a "2_b" shader compiler target.

Version "2.x" in itself doesn't say much, because it is sufficient to exceed a single "2.0" requirement to claim "2.x". You need to look at where and how a chip exceeds version "2.0".
The X800 can execute shaders up to the maximum length DX9 allows for "2.x", that is somewhere between 511 and 1532 operations, depending on instruction mix. This is a significant step up from version "2.0". X800 is still limited to four levels of indirection ("dependent texture fetches"), which is the same as R3xx class hardware, and may be its biggest overall flaw.
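(To make "levels of indirection" concrete, here's a rough HLSL sketch with made-up texture names: every fetch whose coordinates come from a previous fetch's result adds one level to the dependency chain, and R3xx/R4xx-class hardware caps that chain at four.)

Code:
sampler2D offsetMap;   // hypothetical textures
sampler2D colorMap;
sampler2D glossRamp;

float4 main(float2 uv : TEXCOORD0) : COLOR
{
    // Fetch 1: coordinates come straight from the interpolators.
    float2 duv   = tex2D(offsetMap, uv).rg * 0.05;
    // Fetch 2: depends on fetch 1's result - one level of indirection.
    float4 base  = tex2D(colorMap, uv + duv);
    // Fetch 3: depends on fetch 2's result - another level, and so on.
    float4 gloss = tex2D(glossRamp, float2(base.a, 0.5));
    return base * gloss;
}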

(Pixel) Shader version "3.0"'s most significant benefit is dynamic branching, which is primarily useful for skipping over computations that aren't necessary for the current fragment, which obviously saves computational resources (=effective fill rate). Dynamic branches can also, theoretically, lower CPU load because fewer shader changes are needed per frame (state changes in general are expensive in DX9).
How useful this really is depends on implementation. There currently appear to be significant penalties for dynamic branching on NV40. I'm not claiming that this makes it counterproductive generally, but it's again important to note that there are differences between theoretical value of a shader model, and the actual value of implementations.
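(A minimal ps_3_0-style HLSL sketch of the kind of saving I mean; all identifiers are made up. The point is that fragments failing the test skip the lighting math entirely, where SM2.0-class hardware would effectively evaluate both paths.)

Code:
sampler2D normalMap;          // hypothetical inputs
float3    lightDir;           // normalized, tangent space

float4 main(float2 uv : TEXCOORD0, float3 viewDir : TEXCOORD1) : COLOR
{
    float3 N     = tex2D(normalMap, uv).xyz * 2.0 - 1.0;
    float  NdotL = dot(N, lightDir);

    float3 color = 0.05;           // cheap ambient term, always paid for
    if (NdotL > 0.0)               // real dynamic branch under SM3.0
    {
        // Only lit fragments pay for the diffuse + specular work.
        float3 H = normalize(lightDir + normalize(viewDir));
        color += NdotL * float3(1.0, 0.95, 0.9)
               + pow(saturate(dot(N, H)), 32.0);
    }
    return float4(color, 1.0);
}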

Otherwise I'd just like to point out that a shader that must execute 511 instructions per fragment is going to run dog slow, with fillrates in the "less than a Voodoo 1" ballpark. No matter what hardware or shader model you use. This also means that a very long pixel shader can be multipassed without much penalty, because multipassing consumes bandwidth and geometry resources, and both of these resources are heavily underused by very long pixel shaders. You'd get a multipass split basically for free, so for pixel shaders without lots of conditionals, you don't really need SM3.

Re VS3.0: pure nicety, if filters were supported. Unfortunately NV40 does not support filtering of vertex textures and this makes vertex texturing useless IMO. You can just as well use another vertex attribute stream, or indexed constant storage in lieu of a vertex texture. You'd pick one of the two depending on the amount of data in the vertex texture and its fetch locality.
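(For concreteness, a hedged vs_3_0-style sketch of both options, with made-up names: the first version fetches a displacement from a vertex texture with tex2Dlod, the second gets the same value pre-resolved on the CPU through an extra per-vertex attribute, which is what I mean by "another vertex attribute stream".)

Code:
sampler2D heightMap;          // vertex texture; point-sampled only on NV40
float4x4  worldViewProj;

// vs_3_0 path: fetch the displacement in the vertex shader.
float4 vsWithVTF(float3 pos : POSITION, float3 nrm : NORMAL,
                 float2 uv  : TEXCOORD0) : POSITION
{
    // tex2Dlod is required here; there is no implicit LOD in the VS.
    float h = tex2Dlod(heightMap, float4(uv, 0.0, 0.0)).r;
    return mul(float4(pos + nrm * h, 1.0), worldViewProj);
}

// SM2.0-class path: the CPU resolves the lookup and feeds the height in
// as one more per-vertex attribute (an extra stream bound to TEXCOORD1).
float4 vsWithoutVTF(float3 pos : POSITION, float3 nrm : NORMAL,
                    float  h   : TEXCOORD1) : POSITION
{
    return mul(float4(pos + nrm * h, 1.0), worldViewProj);
}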

Another vertex level capability of NV40 is the stream frequency divider, marketed as "geometry instancing". Most useful for rendering many objects with low polygon counts. Draw calls are expensive in DX, so this allows the chip to, in a nutshell, generate the draw calls itself internally, under restricted circumstances.
I don't think this is very useful. I may be wrong ;)
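(Roughly what the shader side of that looks like, as a hedged sketch with made-up names: stream 0 carries the mesh, stream 1 carries one world transform and tint per instance, and the stream frequency divider, SetStreamSourceFreq in D3D9, advances stream 1 once per instance instead of once per vertex.)

Code:
float4x4 viewProj;            // hypothetical constants

struct VSIn
{
    // per-vertex data, stream 0
    float3 pos    : POSITION;
    float2 uv     : TEXCOORD0;
    // per-instance data, stream 1 (rows of the world transform, plus a tint)
    float4 world0 : TEXCOORD1;
    float4 world1 : TEXCOORD2;
    float4 world2 : TEXCOORD3;
    float4 tint   : COLOR0;
};

struct VSOut
{
    float4 hpos : POSITION;
    float2 uv   : TEXCOORD0;
    float4 tint : COLOR0;
};

VSOut main(VSIn v)
{
    float4 p = float4(v.pos, 1.0);
    // Apply this instance's world transform, one dot product per row.
    float3 wpos = float3(dot(p, v.world0), dot(p, v.world1), dot(p, v.world2));

    VSOut o;
    o.hpos = mul(float4(wpos, 1.0), viewProj);
    o.uv   = v.uv;
    o.tint = v.tint;
    return o;
}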

In summary, NV40 can not render effects that aren't possible at all on X800, but it may very well be more efficient at rendering the same effects once they
a)reach a certain complexity
b)contain many conditionals
and/or
c)require many levels of texture indirection

Re 3Dc (just because it has been brought up): it just (de)compresses two-channel textures in a format suitable for normal maps, it does not enable any new effects. The compression gain is fixed at 50%. While it does save storage space and bandwidth, it is unlikely to enable higher resolution normal maps to fit on the same card. That would be a sensible claim to make for a 4:1 (or better) compression, but IMO not for "3Dc".
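(Using it in a shader is a couple of extra ALU instructions, nothing more. A hedged sketch with made-up names:)

Code:
sampler2D normalXY;           // two-channel (3Dc-style) normal map

float3 fetchNormal(float2 uv)
{
    // Only X and Y are stored; reconstruct Z from x^2 + y^2 + z^2 = 1.
    float2 xy = tex2D(normalXY, uv).rg * 2.0 - 1.0;
    float  z  = sqrt(saturate(1.0 - dot(xy, xy)));
    return float3(xy, z);
}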
 
zeckensack said:
In summary, NV40 can not render effects that aren't possible at all on X800, but it may very well be more efficient at rendering the same effects once they
a)reach a certain complexity
b)contain many conditionals
and/or
c)require many levels of texture indirection

Nope...
Forgot FP16 blending and FP32 textures (all targets except 3D, I think), which include trilinear and anisotropic filtering.
As FarCry proved, no one is going to simulate HDRI by rendering to an FP RT and reading it back in the pixel shader... Keep it simple and use FP blending...
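(Rough sketch of the workaround, made-up names: without FP blending you ping-pong between two FP render targets and do the add in the shader; with FP16 blending the whole thing collapses to an additive blend state.)

Code:
sampler2D prevAccum;          // the other FP16 target, holding the sum so far

float4 accumulateLight(float2 uv : TEXCOORD0,
                       float3 lightTerm : TEXCOORD1) : COLOR
{
    // lightTerm stands in for whatever this pass computes for its light.
    // Each light costs an extra texture read, a second FP target and a swap.
    return tex2D(prevAccum, uv) + float4(lightTerm, 0.0);
}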

To malficar: NVIDIA doesn't need to upgrade until DX10. There is nothing that it doesn't already support in the NV40. Well, except 3Dc... :rolleyes:
 
Sigma said:
Forgot FP16 blending and FP32 textures (all targets except 3D, I think), which include trilinear and anisotropic filtering.

I may be wrong, but I thought filtering was only supported on FP16 textures?

As FarCry proved, no one is going to simulate HDRI by rendering to an FP RT and reading it back in the pixel shader... Keep it simple and use FP blending...

UnrealEngine3 will implement this as one method for HDR, as does HL2.
 
DaveBaumann said:
Sigma said:
Forgot FP16 blending and FP32 textures (all targets except 3D, I think), which include trilinear and anisotropic filtering.

I may be wrong, but I thought filtering was only supported on FP16 textures?

:D What I meant was FP16 blending. And FP16/32 textures.... :)

DaveBaumann said:
UnrealEngine3 will implement this as one method for HDR, as does HL2.

Unreal? Are you sure? I remember them specifically talking about FP blending in the presentation...

As for HL2, well, they don't have any choice, now do they? We are talking about ATI/Valve: and they make a nice couple... :rolleyes:

EDIT: I think Carmack talked about waiting for FP blending too, since with dynamic lighting, having to render to an FP RT and read back for each light is very slow...
And since HL2 is still very "lightmapped", maybe this is why they can afford to have HDRI. They probably do everything in one render pass...
 
Sigma said:
zeckensack said:
In summary, NV40 can not render effects that aren't possible at all on X800, but it may very well be more efficient at rendering the same effects once they
a)reach a certain complexity
b)contain many conditionals
and/or
c)require many levels of texture indirection

Nope...
Forgot FP16 blending and FP32 textures (all targets except 3D, I think), which include trilinear and anisotropic filtering.
You're right, I forgot to mention that. It is not inherently tied to the shader model; texture caps should be orthogonal (i.e. you can use FP32 textures in fixed-function "DX7" multitexture setups, if you want to). I'm a bit fuzzy on DX-related details though, so there may be artificial limitations imposed by the API.

But as I was trying to discuss R420 vs NV40, and not specifically SM2 vs SM3, this is a good point.

Note that R300 and R420 both support FP16 and FP32 textures, they just cannot filter them at all. I agree that this is a significant limitation.
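(If you really need smooth samples from an unfilterable FP texture, you can approximate bilinear in the shader at the cost of four point samples and some ALU. A hedged sketch, made-up names:)

Code:
sampler2D fpTex;              // FP16/FP32 texture, point sampling only
float2    texSize;            // texture dimensions, e.g. (256, 256)

float4 bilinearFP(float2 uv)
{
    float2 st    = uv * texSize - 0.5;        // texel-center space
    float2 base  = floor(st);
    float2 f     = st - base;                 // fractional weights
    float2 t00   = (base + 0.5) / texSize;    // lower-left texel center
    float2 texel = 1.0 / texSize;

    float4 c00 = tex2D(fpTex, t00);
    float4 c10 = tex2D(fpTex, t00 + float2(texel.x, 0.0));
    float4 c01 = tex2D(fpTex, t00 + float2(0.0, texel.y));
    float4 c11 = tex2D(fpTex, t00 + texel);

    return lerp(lerp(c00, c10, f.x), lerp(c01, c11, f.x), f.y);
}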
Sigma said:
As FarCry proved, no one is going to simulate HDRI by rendering to an FP RT and reading it back in the pixel shader... Keep it simple and use FP blending...
See Dave. rthdribl uses the same technique, and was the first publicly available example of "HDR" rendering IIRC.

Of course this is going to be slower than if you had FP texture filtering at your disposal. I already committed to this:
myself said:
In summary, NV40 can not render effects that aren't possible at all on X800, but it may very well be more efficient at rendering the same effects once they
<...>
Add:
d)involve blending on floating point targets
 
zeckensack said:
See Dave. rthdribl uses the same technique, and was the first publicly available example of "HDR" rendering IIRC.

Yes, for a tech demo. Imagine doing the same in Doom3. It would have to be done for every light: impossible. I've tried this myself using every NV extension in GL, but still: very slow. Better on Rad, but still slow nonetheless...
 
Sigma said:
:D What I meant was FP16 blending. And FP16/32 textures.... :)

FP16 filtering and blending, fair enough - FP16/32 textures (without filtering) have been supported in R300 upwards since day 1.

Unreal? Are you sure? I remember them specifically talking about FP blending in the presentation...

According to the discussion I had with Tim Sweeney last week, yes. HDR will be implemented for non FP16 blending capable parts as a fallback method.

As for HL2, well, they don't have any choice, now do they? We are talking about ATI/Valve: and they make a nice couple... :rolleyes:

Let's deal with the facts, shall we? And this is not 'no one' as you initially stated. Valve are already supporting HDR to an integer texture for the FX series, as well as float for ATI's shader 2.0 parts - we don't know if they are going to use blending or not yet.
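(One generic way to squeeze HDR into an RGBA8 integer target is an RGBE-style shared-exponent encode, sketched below with made-up names. To be clear, this is just an illustration of the idea; whether Valve's FX-series path actually looks like this isn't established here.)

Code:
// Pack an HDR colour into an 8-bit RGBA target (RGBE-style shared exponent).
float4 encodeRGBE(float3 hdr)
{
    float m = max(max(hdr.r, hdr.g), max(hdr.b, 1e-6));
    float e = ceil(log2(m));                  // shared exponent
    return float4(hdr / exp2(e), (e + 128.0) / 255.0);
}

// Unpack it again when the buffer is read back for tone mapping.
float3 decodeRGBE(float4 rgbe)
{
    return rgbe.rgb * exp2(rgbe.a * 255.0 - 128.0);
}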
 
zeckensack said:
Re VS3.0: pure nicety, if filters were supported. Unfortunately NV40 does not support filtering of vertex textures and this makes vertex texturing useless IMO.
Texture filtering would be nice, but lacking that doesn't make texture sampling in the VS pipeline useless.
There are plenty of (very interesting and, by now, exotic) things you can do just with point filtering! ;)

You can just as well use another vertex attribute stream, or indexed constant storage in lieu of a vertex texture. You'd pick one of the two depending on the amount of data in the vertex texture and its fetch locality
Last time I checked vertex attributes and shader constants weren't filtered entities :)


Another vertex level capability of NV40 is the stream frequency divider, marketed as "geometry instancing". Most useful for rendering many objects with low polygon counts. Draw calls are expensive in DX, so this allows the chip to, in a nutshell, generate the draw calls itself internally, under restricted circumstances.
I don't think this is very useful. I may be wrong ;)
It could be very useful in some applications/games. I would like to know how geometry instancing is used in FarCry. Maybe they use it to render things like rocks, vegetation, grass and so on...

ciao,
Marco
 
DaveBaumann said:
I may be wrong, but I thought filtering was only supported on FP16 textures?

FYI: I just looked it up in the "GPU Programming Guide": With FP32 textures you can actually use Nearest Filtering but not Bilinear, Trilinear or Anisotropic Filtering.
 
LeStoffer said:
FYI: I just looked it up in the "GPU Programming Guide": With FP32 textures you can actually use Nearest Filtering but not Bilinear, Trilinear or Anisotropic Filtering.
Errr... isn't "nearest filtering" = "no filtering"?
 
nAo said:
zeckensack said:
Re VS3.0: pure nicety, if filters were supported. Unfortunately NV40 does not support filtering of vertex textures and this makes vertex texturing useless IMO.
Texture filtering would be nice, but lacking that doesn't make texture sampling in the VS pipeline useless.
There are plenty of (very interesting and, by now, exotic) things you can do just with point filtering! ;)
Yes, I won't dispute that. My point was that you can do all of them without vertex textures.
nAo said:
Last time I checked vertex attributes and shader constants weren't filtered entities :)
"Nearest" filtering is the same as no filtering at all. Vertex textures on NV40 aren't "filtered entities" either.

Whatever texture coords you'd like to use in the vertex shader to lookup a vertex texture, you can resolve them on the host and put the "lookup" result into another vertex attribute stream.

This works brilliantly for texture coords that are invariant wrt result position (like all forms of displacement). It won't work well with texture coords that depend on the transform matrices (like fish-eye lens distortion).
This was another thing I forgot about :oops:
 
Simon F said:
Errr... isn't "nearest filtering" = "no filtering"?

Well, I assume that it means nearest-neighbor filtering, but maybe I have spent too much time in Photoshop? ;)

Edit: Yes, probably too much PS. I think Simon is right that by "nearest filtering" nVidia just means nearest-point sampling (not filtering). Sorry for that.
 
zeckensack said:
Whatever texture coords you'd like to use in the vertex shader to lookup a vertex texture, you can resolve them on the host and put the "lookup" result into another vertex attribute stream.
You can do everything on the host, even render a full frame if you like. That doesn't mean we want to do that. How many games that employ displacement mapping or geometry image techniques have you played recently?
Nevertheless, GPUs that can push a lot of pre-baked primitives have been around for years...

ciao,
Marco
 
pat777 said:
DaveBaumann said:
What does this "Ta" mean?
ta (tä)
interj. Chiefly British (but also used in Aus')

Used to express thanks.

Could also be Tantalum, but Dave's not that obscure... well, apart from the "..." incident :)
 