For The Last Time: SM2.0 vs SM3.0

nAo said:
zeckensack said:
Whatever texture coords you'd like to use in the vertex shader to look up a vertex texture, you can resolve them on the host and put the "lookup" result into another vertex attribute stream.
You can do everything on the host, even render a full frame if you like. That doesn't mean we want to do that.
I'm not going there. How about you show me a vertex shader that uses vertex texturing, and I show you how to implement what I was talking about? Then we could discuss the performance implications of the technique.

Or do you want me to produce example code?
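
Since you ask, here's a minimal sketch of the host-side resolve in C++. All names are made up for illustration, and it assumes an 8-bit heightmap sampled with nearest-neighbour filtering; the results would be bound as a second vertex stream, so the shader reads a plain input attribute instead of doing a texture fetch.

Code:
#include <algorithm>
#include <cstdint>
#include <vector>

struct Heightmap {
    int width, height;
    std::vector<uint8_t> texels;  // 8-bit displacement values

    float sample(float u, float v) const {  // nearest-neighbour, clamped
        int x = std::min(std::max(int(u * width),  0), width  - 1);
        int y = std::min(std::max(int(v * height), 0), height - 1);
        return texels[y * width + x] / 255.0f;
    }
};

// Resolve the "lookup" once on the CPU: one displacement value per vertex,
// to be uploaded as an extra attribute stream alongside the mesh.
std::vector<float> resolveLookups(const Heightmap& map,
                                  const std::vector<float>& texcoords)  // u,v pairs
{
    std::vector<float> displacement;
    displacement.reserve(texcoords.size() / 2);
    for (size_t i = 0; i + 1 < texcoords.size(); i += 2)
        displacement.push_back(map.sample(texcoords[i], texcoords[i + 1]));
    return displacement;
}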
 
zeckensack said:
How about you show me a vertex shader that uses vertex texturing, and I show you how to implement what I was talking about? Then we could discuss the performance implications of the technique.
I perfectly understand what you are saying; in fact, I'm not saying it can't be done, so I really don't need you to show me anything. What I'm saying is that doing the same stuff with VS3.0 is probably much better. I doubt a current CPU can hide memory latencies better than a GPU in the general case.
NV40 has six multithreaded vertex shader pipelines.
Even with a medium-sized shader the GPU will have tens to hundreds of cycles to hide texture sampling latencies.

ciao,
Marco
 
NVIDIA doesn't need to upgrade until DX10. There is nothing that it doesn't already support in the NV40. Well, except 3Dc...
Not really... The NV40 doesn't support half the floating-point texture formats, 32-bit FP blending, and a whole lotta DX9 features, including N-Patches and adaptive tessellation. Practically, the NV40 is missing half the essential DX9 features!
 
N-Patches are not really supported in hardware for R300 and up, and there is no requirement for support of this or adaptive tessellation in DX9. IMO, anything to do with tessellation is a bit of a waste of time before DX10 anyway (and even then it's not a certainty that the tessellation engine will be a requirement or even supported).
 
So you wouldn't want to include hardware-assisted dynamic LOD that would probably double framerates while dramatically increasing the tessellation levels near the viewer?
 
zeckensack said:
Another vertex level capability of NV40 is the stream frequency divider, marketed as "geometry instancing". Most useful for rendering many objects with low polygon counts. Draw calls are expensive in DX, so this allows the chip to, in a nutshell, generate the draw calls itself internally, under restricted circumstances.
I don't think this is very useful. I may be wrong ;)
Far Cry is using it for grass and other foliage. I'm sure you can see how geometry instancing could be a very significant speedup for grass in that game. Not everything really needs to be rendered with high-resolution meshes.

There may also be performance increases and/or CPU usage decreases when rendering large crowds, such as in ATI's Crowd demo, by moving more of the differentiation between the different models onto the graphics card.
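
For concreteness, here is a rough C++ sketch of how the stream frequency divider is driven through the D3D9 API. This is the standard D3D9 instancing path (vertex declaration and buffer setup omitted); the function and parameter names are placeholders, not from Far Cry or any demo.

Code:
#include <d3d9.h>

// Draw instanceCount copies of one indexed mesh in a single call.
// Stream 0 holds the shared geometry; stream 1 holds per-instance
// attributes (e.g. a world transform), advanced once per instance.
void drawInstanced(IDirect3DDevice9* dev,
                   IDirect3DVertexBuffer9* meshVB, UINT meshStride,
                   IDirect3DVertexBuffer9* instanceVB, UINT instanceStride,
                   IDirect3DIndexBuffer9* meshIB,
                   UINT numVertices, UINT numTriangles, UINT instanceCount)
{
    // Replay the mesh stream once per instance.
    dev->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | instanceCount);
    dev->SetStreamSource(0, meshVB, 0, meshStride);

    // Step the instance stream forward after each instance.
    dev->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1u);
    dev->SetStreamSource(1, instanceVB, 0, instanceStride);

    dev->SetIndices(meshIB);
    dev->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0,
                              numVertices, 0, numTriangles);

    // Restore default frequencies so later draws behave normally.
    dev->SetStreamSourceFreq(0, 1);
    dev->SetStreamSourceFreq(1, 1);
}

For thousands of grass tufts this collapses thousands of draw calls into one, which is exactly where the CPU savings come from.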
 
DaveBaumann said:
N-Patches are not really supported in hardware for R300 and up, and there is no requirement for support of this or adaptive tessellation in DX9. IMO, anything to do with tessellation is a bit of a waste of time before DX10 anyway (and even then it's not a certainty that the tessellation engine will be a requirement or even supported).
So what do you think would be an interesting feature to add to the NV40 between now and DX10? 8)
 
nAo said:
I perfectly understand what you are saying; in fact, I'm not saying it can't be done, so I really don't need you to show me anything. What I'm saying is that doing the same stuff with VS3.0 is probably much better. I doubt a current CPU can hide memory latencies better than a GPU in the general case.
NV40 has six multithreaded vertex shader pipelines.
Even with a medium-sized shader the GPU will have tens to hundreds of cycles to hide texture sampling latencies.

ciao,
Marco
If you can remove the need to use vertex textures, there will be no latency to hide to begin with. I already admitted that this is not possible (or sane, for performance reasons) for all uses of vertex texturing. I'm really only interested in those cases where you have the same amount of draw calls and the same amount of required memory objects. (I'd count a bound vertex texture and a bound attribute stream interchangeably as an abstract bound memory object)

You can remove vertex texturing if you look up the texture with texcoords that do not depend on vertex position after transformation. And you can still scale/rotate/whatever the "texture sample" after the "lookup".

You cannot do what NVIDIA's spiky sphere demo does without vertex textures, because the texcoords for the displacement map can be scaled over the object surface by the user. You could, however, implement displacement strength scaling (length of spikes) just fine. Any displacement that's local to the mesh in question can be done, fast enough(TM).

Animated displacement mapping is no problem either if it is done by respecifying the vertex texture, or by just binding another one, or by interpolating between two vertex textures, etc. Vertex textures can be stored in more compact data formats than vertex attribute streams, so the stream approach costs you more space and bandwidth. But there won't be a functional difference: both approaches require you to break the current batch just the same.

I think this approach would be practical for most terrain rendering and character/object meshes. Deformable or static.

If you animate the displacement by moving/scaling/rotating, etc., the displacement texcoords before you do the lookup, it won't work. That's the spiky ball, and it may be something you want to do for animated water waves and the like.
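
To make the "just binding another one" case concrete, here is a hedged C++ sketch, assuming one pre-baked displacement buffer per animation frame (the names are hypothetical):

Code:
#include <d3d9.h>
#include <vector>

// Swap in the displacement stream for the current animation frame.
// This breaks the current batch exactly the way rebinding a vertex
// texture would, so neither approach wins on that front.
void bindDisplacementFrame(IDirect3DDevice9* dev,
                           const std::vector<IDirect3DVertexBuffer9*>& frames,
                           unsigned frameIndex)
{
    dev->SetStreamSource(1, frames[frameIndex % frames.size()],
                         0, sizeof(float));  // one float per vertex
}

Interpolating between two frames works the same way: bind two such streams and lerp between the two attributes in the vertex shader, with the blend factor in a shader constant.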
 
Vertex stream frequency is ignored for indexed geometry and for shaders below 3.0, so how did ATI implement their geometry instancing? Plus, without indexed geometry, you'd be wasting too much memory. So how have the hardware folks overcome this limitation?
 
I am well aware that the X800 does not 'just' use PS 2.0 but supports the expanded 'b' version, which allows some 1,500-plus instructions. Hence my original suggestion of a 'workaround' to equal PS 3.0 in some areas. Will the extended PS 2.0 matter if developers ignore it? I mean to ask: if there are long shaders that are PS 3.0 but fewer than 1,500 instructions, can the X800 process them?

Thank you for all the feedback. Both DaveB and Ruined have made points I'd like to touch on. Ruined points out that Far Cry will implement HDR lighting, and in a manner the X800 cannot support. DaveB makes the point that if developers choose to provide more open alternatives, the X800 could very well do effects that are similar. So I suppose my answer would be that the X800 can do similar effects if the developers choose to make that option open; if they do not, I will be missing some 'candy' because, as Ruined pointed out, the X800 does not do HDR in the manner the 6800 series does.

I have been hearing that an increasing number of developers are being 'encouraged' to make PS 1.1 and 3.0 games, ignoring fallbacks to 2.0.
Now, given that 2.0 is a part of 3.0, will this matter? If it does, I indeed feel sorry for FX owners and others restricted to PS 2.0 and below, particularly if developers ignore alternative ways to do effects.

As others have suggested, by the time 3.0 is really up and running it's likely most of us will be refreshing our cards anyway, and ATI will have an SM3.0 card out. However, not all people can afford to replace their cards frequently, so I hope they aren't left out in the cold.

Again, my thanks for all the feedback.
 
nAo said:
How many games that employ displacement mapping or geometry image techniques have you played recently?

I would say that's more due to laziness, but in truth there are a lot if you widen your definition. ;) That is, if you consider all those games that use height maps stored in 8-bit or 16-bit grayscale textures and create the terrain from that (which essentially was one of the things Matrox kept on advertising with their displacement mapping). While this isn't displacement around a non-planar surface, it still is displacement mapping and can fairly easily be modified, of course. Also, ideally some type of adaptive tessellation would be going on alongside the displacement mapping, like with Matrox's; otherwise you need to generate a whole lot of extra polys (which is the main reason I don't think you have seen much use of displacement mapping so far outside of terrain).

Basically, unless you are going to change your displacement mapping in some way, I don't see why you would use the vertex shaders for it, except out of laziness or because someone is paying you to implement it that way rather than making it a feature anyone can use. But of course it's up to the developer to decide what they wish to spend their time on. I just feel that implementing displacement mapping for rocks is something that could just as well have been done on the CPU at level load.
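
A hedged sketch of what that level-load approach could look like in C++: expand an 8-bit grayscale heightmap into a regular grid of displaced vertices once, then render the result as an ordinary static mesh. Types and names are made up for illustration.

Code:
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Build terrain vertices from a width x depth heightmap at load time.
std::vector<Vec3> buildTerrain(const std::vector<uint8_t>& heightmap,
                               int width, int depth,
                               float cellSize, float heightScale)
{
    std::vector<Vec3> vertices;
    vertices.reserve(size_t(width) * depth);
    for (int z = 0; z < depth; ++z)
        for (int x = 0; x < width; ++x)
            vertices.push_back({ x * cellSize,
                                 heightmap[z * width + x] / 255.0f * heightScale,
                                 z * cellSize });
    return vertices;
}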
 
DaveBaumann said:
You sure? ;)
Simply put, Dave, I claim that they didn't emulate geometry instancing unless they were able to draw more than one soldier with a single draw call.
 
hmm......well, UE3.0 will be using PS2.0, so no worries from Epic and perhaps subsequent licensees :) id and Valve....no problems there with their current philosophies with respect to engines.

Not-so-high profile game engines will probably target the larger market which is PS2.0 and lower.

if nVidia releases low end cards with PS3.0 called geforce 6 6200......;)
 
The rest of the NV4x lineup should be available around the end of the year. Releasing NV4x-based chips from the high end to the low end is the only real way that nVidia can start taking market share back from ATI in the broader market, and I'm sure they know it.
 
Chalnoth said:
DaveBaumann said:
You sure? ;)
Simply put, Dave, I claim that they didn't emulate geometry instancing unless they were able to draw more than one soldier with a single draw call.

And how well do you know ATI's hardware?

Most hardware doesn't match the vertex declaration/stream API at all well. Imagine trying to match D3D9 vertex streams to the PS2's VU1: you would miss a large set of capabilities (VU1 is roughly VS2.0 but has vertex frequency support).

That's why we all watch OpenGL extensions; things often appear there before being added to D3D.
 