Report says that X1000 series not fully SM3.0?

Demirug said:
VT and R2VB can be used to do the same things, but VT will need fewer API calls. Since a reduced number of API calls gets me more CPU time for other things, VT is simply the better way.

The difference in API calls is minimal. Another render state or two and off you go. Besides, far from all applications are CPU limited, and I expect VTF and R2VB passes to be GPU limited in most cases.
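To put rough numbers on "another render state or two", the per-draw difference might look something like this (a minimal sketch with made-up resource names; vertex declaration and index buffer setup are assumed to be done elsewhere, and the vendor-specific step of exposing the render target as an IDirect3DVertexBuffer9 is deliberately not shown):

Code:
#include <d3d9.h>

// VTF path: the vertex shader samples the physics texture directly.
void DrawWithVTF(IDirect3DDevice9* dev, IDirect3DVertexBuffer9* meshVB,
                 IDirect3DBaseTexture9* physicsTex, UINT numVerts, UINT numTris)
{
    dev->SetTexture(D3DVERTEXTEXTURESAMPLER0, physicsTex);
    dev->SetStreamSource(0, meshVB, 0, 3 * sizeof(float));
    dev->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, numVerts, 0, numTris);
}

// R2VB path: the same data arrives as a second vertex stream instead.
void DrawWithR2VB(IDirect3DDevice9* dev, IDirect3DVertexBuffer9* meshVB,
                  IDirect3DVertexBuffer9* physicsAsVB, UINT numVerts, UINT numTris)
{
    dev->SetStreamSource(0, meshVB, 0, 3 * sizeof(float));
    dev->SetStreamSource(1, physicsAsVB, 0, 4 * sizeof(float));
    dev->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, numVerts, 0, numTris);
}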
 
Humus said:
Code:
 - Render physics to texture.
 - Render visual results. Vertex shader displaces water surface.
   * Stream0: Vertex array containing position
   * Stream1: Texture from first pass reinterpreted as vertex buffer.

The two methods are not equivalent. A vertex shader with VTF can randomly access the texture generated by the physics pass. The method you propose requires that the texture from the first pass be rendered in such a way that it lines up with stream 0, modulo the frequency divider. But what if the coordinates of a vertex determine which sample to take? You seem to be operating under the assumption that one is doing water or cloth effects on a standard tessellated grid.

The postulated method was to render your physics once, and then apply that physics texture to multiple streams. However, if you simply interpret the physics texture as a stream and want to apply it to multiple vertex streams, this places severe restrictions on the layout of the geometry in the stream to be displaced.
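To make the restriction concrete: with a two-stream declaration like the sketch below (element types and semantics are assumed, not taken from anyone's actual code), the hardware pairs elements purely by vertex index, so whatever the physics pass wrote to element i of the reinterpreted buffer is what vertex i gets, unless a stream frequency divider changes the stepping.

Code:
#include <d3d9.h>

// Hypothetical declaration for the two-stream setup quoted above.
// Vertex i reads element i from each stream (modulo any frequency divider),
// so the physics texture must have been written in exactly that order.
const D3DVERTEXELEMENT9 kWaterDecl[] = {
    // stream, offset, type,            method,                usage,                 index
    { 0, 0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
    { 1, 0, D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
    D3DDECL_END()
};

IDirect3DVertexDeclaration9* CreateWaterDecl(IDirect3DDevice9* dev)
{
    IDirect3DVertexDeclaration9* decl = NULL;
    dev->CreateVertexDeclaration(kWaterDecl, &decl);
    return decl;
}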

Consider a vertex stream where each frame, some new geometry is generated, and each frame, some old geometry at the tail of the stream is removed. How would you arrange your physics texture so that the right texels displace the right vertices?

You've got a very specific use case in mind, and yes, in your use case, the two methods may be equivalent. But not all use cases are. Therefore, the two methods are not isomorphic.
 
Mintmaster said:
I don't think vertex texturing will be replaced by R2VB, and agree with DC's thought that for the most part the opposite will happen. It just makes sense to stick with the current organizational model of the rendering pipeline. R2VB just unnecessarily complicates things - well, it will be unnecessary when latency-free VTF comes around.

I don't expect R2VB to replace VTF either, but nor do I expect VTF to replace R2VB. The pipeline is certainly in for a shake-up.
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2403&p=4
 
DemoCoder said:
Consider a vertex stream where each frame, some new geometry is generated, and each frame, some old geometry at the tail of the stream is removed. How would you arrange your physics texture so that the right texels displace the right vertices?

I'm not sure what problem you're seeing here; just use the right offset in your DIP (DrawIndexedPrimitive) calls?
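Something like this is all I have in mind (a sketch with made-up names, assuming the device supports stream offsets; the point is just that the same offset that selects the live window of vertices is also applied to the reinterpreted physics buffer):

Code:
#include <d3d9.h>

// Hypothetical ring-buffer draw: 'head' is the index of the oldest live vertex.
// The physics texture is written in the same order as the vertex stream, so
// offsetting both streams by the same amount keeps texels and vertices lined up.
void DrawLiveWindow(IDirect3DDevice9* dev,
                    IDirect3DVertexBuffer9* meshVB,      // stream 0: positions
                    IDirect3DVertexBuffer9* physicsAsVB, // stream 1: physics results
                    UINT head, UINT liveVerts, UINT liveTris)
{
    const UINT posStride  = 3 * sizeof(float);
    const UINT physStride = 4 * sizeof(float);

    dev->SetStreamSource(0, meshVB,      head * posStride,  posStride);
    dev->SetStreamSource(1, physicsAsVB, head * physStride, physStride);

    // Indices are relative to the start of the window, so nothing else changes.
    dev->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, liveVerts, 0, liveTris);
}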

DemoCoder said:
You've got a very specific use case in mind, and yes, in your use case, the two methods may be equivalent. But not all use cases are. Therefore, the two methods are not isomorphic.

Nobody said the methods were equivalent or isomorphic. It's two different features with a certain overlap. Yes, you can find usage cases where R2VB will require another pass, just like I can find usage cases that R2VB handles easily but VTF doesn't handle at all, such as extremely compact terrain compression using ATI1N textures. That's not the point. The point is that R2VB covers the vast majority of usage cases you'd use VTF for, and typically is faster given today's hardware architectures.
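As a rough illustration of the ATI1N case (a sketch only: the pixel shader pass that expands the compressed heights into an R2VB target, and the R2VB binding itself, are left out, and the parameters are made up):

Code:
#include <d3d9.h>

// ATI1N is exposed in D3D9 as a FOURCC format: a single-channel block-compressed
// format that the pixel pipeline can sample and filter, but that vertex texture
// fetch cannot read on current hardware.
const D3DFORMAT ATI1N = (D3DFORMAT)MAKEFOURCC('A', 'T', 'I', '1');

bool SupportsATI1N(IDirect3D9* d3d)
{
    return SUCCEEDED(d3d->CheckDeviceFormat(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
                                            D3DFMT_X8R8G8B8, 0,
                                            D3DRTYPE_TEXTURE, ATI1N));
}

IDirect3DTexture9* CreateCompressedHeightMap(IDirect3DDevice9* dev, UINT size)
{
    // Compressed heightfield: half a byte per height sample.
    IDirect3DTexture9* tex = NULL;
    dev->CreateTexture(size, size, 1, 0, ATI1N, D3DPOOL_MANAGED, &tex, NULL);
    return tex; // A pixel shader pass would then expand this into an R2VB target.
}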
 
Jawed said:
Humus, are streamout and MEMEXPORT the same thing?

Jawed

Presumably streamout can write to memory that may be specialized (on-chip GPU cache, SRAM, etc.), which would be different from scatter writes to video memory.
 
Humus said:
Nobody said the methods were equivalent or isomorphic. It's two different features with a certain overlap. Yes, you can find usage cases where R2VB will require another pass, just like I can find usage cases that R2VB handles easily but VTF doesn't handle at all, such as extremely compact terrain compression using ATI1N textures. That's not the point. The point is that R2VB covers the vast majority of usage cases you'd use VTF for, and typically is faster given today's hardware architectures.
Which basically means that game developers just aren't ever going to bother to make use of either feature.
 
Mintmaster said:
FP16 filtering, while nice, is not very necessary at all for HDR in games. Besides, ATI can do pixel shaded filtering if need be at a speed cost, even making it transparent to the developer.
Ehh, aren't there some texture or dependent texture read limits that could be broken?

IMHO the new cards only support SM3.0 via a technicality. But seeing as ATI is supplying the chips for the next MS Xbox, and seeing as the hardware does technically support SM3.0 (actually supporting vertex texture fetch with at least one texture format was never explicitly required), it's probably not worthwhile for MS to try to block ATI.
 
bloodbob said:
IMHO the new cards only support SM3.0 via a technicality. But seeing as ATI is supplying the chips for the next MS Xbox...

Minor nitpick, but that is not exactly true. ATI is not providing chips for the Xbox 360, as in physical chips. MS bought the design from ATI. For every chip MS produces, ATI gets paid a royalty. This royalty is for the intellectual property ATI created and contributed to the chip. If the yields for the chip suck, it's MS eating the cost. If the yields are amazing, it's MS reaping the profits. Either way, ATI sits back and collects the royalties, and doesn't have to deal with any of the headaches or hassles of production issues.
 
bloodbob said:
Ehh, aren't there some texture or dependent texture read limits that could be broken?
I don't expect that would be a significant issue. I think the only practical issues would be performance, and forcing developers to write two separate paths for the different hardware.
 
Chalnoth said:
I don't expect that would be a significant issue. I think the only practical issues would be performance, and forcing developers to write two separate paths for the different hardware.
Uhuh, which kind of defeats the whole point of it being made "transparent". The devs might as well just do the texture filtering themselves if they have to go off and rewrite their shaders to do it in multiple passes.
 
Chalnoth said:
Which basically means that game developers just aren't ever going to bother to make use of either feature.

Probably depends on the performance of one feature or the other. Last time I checked, developers pretty routinely use different methods on different architectures to get the results they need. (Not that they like it.)
 
Chalnoth said:
I don't expect that would be a significant issue. I think the only practical issues would be performance, and forcing developers to write two separate paths for the different hardware.

Haven't they been doing that all along?
 
A few questions concerning R2VB:

1. How do I take advantage of texture filtering in the PS unit when the topology information is lost during the R2VB pass?

2. R2VB can be faster than VTF only in certain situations, namely when the texture can be used directly as a vertex buffer. In most other cases, when the vertex needs to access an arbitrary location in the texture, a "synthesis" pass is inevitable for R2VB, and hence it's slower than VTF. Do I understand that correctly?
 
bloodbob said:
Uhuh, which kind of defeats the whole point of it being made "transparent". The devs might as well just do the texture filtering themselves if they have to go off and rewrite their shaders to do it in multiple passes.

What!!!?

Even the very few developers that had some interesting idea to try out with VTF have been shying away from using it due to the performance issue on NV hardware. Now you're saying they'll do more fetches and do the filtering in the shader? I somehow don't think that's going to happen.

Someone earlier in the thread suggested that ATI should have simply implemented the VTF feature through driver emulation. The irony, in my view, is that developers would still be left with only one choice of usable VTF hardware: ATI hardware.
 
croc_mak said:
What!!!?

Even the very few developers that had some interesting idea to try out with VTF have been shying away from using it due to the performance issue on NV hardware. Now you're saying they'll do more fetches and do the filtering in the shader? I somehow don't think that's going to happen.

Someone earlier in the thread suggested that ATI should have simply implemented the VTF feature through driver emulation. The irony, in my view, is that developers would still be left with only one choice of usable VTF hardware: ATI hardware.
Someone earlier also suggested that ATI should implement HDR filtering transparently, which is what I was replying to. So unless you're saying we should go back to point filtering in pixel shaders, I think my comments were completely valid.
 
991060 said:
A few questions concerning R2VB:

1. How do I take advantage of texture filtering in the PS unit when the topology information is lost during the R2VB pass?
You can't access topology information in a vertex program either, so this is no loss.
991060 said:
2. R2VB can be faster than VTF only in certain situations, namely when the texture can be used directly as a vertex buffer. In most other cases, when the vertex needs to access an arbitrary location in the texture, a "synthesis" pass is inevitable for R2VB, and hence it's slower than VTF. Do I understand that correctly?
Not sure what you're referring to.
The instant you are writing a vertex buffer out of fragment processing, you already have your synthesis pass. You shouldn't need another one. My knowledge about more advanced DXG topics is rather sketchy, so I should mention that for this to work you need a mechanism to reinterpret a pre-existing vertex buffer as a texture, so that it can be read into fragment processing.
(The {NV|ARB}_pixel_buffer_object extensions allow this in OpenGL.)
Otherwise you'd just move your vertex shader code more or less verbatim over to the fragment processor. Instead of vertex attributes, you now have texture samples, but the math will be the same. If you sample the "vertex buffer texture" at the same place where you're going to write the matching "vertex texture lookup" to the render target, this instantly matches up.
I.e. just render a quad with trivial vertex processing and let the texcoord interpolators handle your "vertex fetch".
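A minimal sketch of such a pass, with made-up names ("vbAsTexture" is the old vertex data reinterpreted as a texture, "r2vbTarget" is the surface that will later be bound as a vertex buffer, and "displacePS" holds the math that used to live in the vertex shader):

Code:
#include <d3d9.h>

struct QuadVertex { float x, y, z, rhw, u, v; };

void SynthesisPass(IDirect3DDevice9* dev, IDirect3DSurface9* r2vbTarget,
                   IDirect3DTexture9* vbAsTexture, IDirect3DPixelShader9* displacePS,
                   float width, float height)
{
    dev->SetRenderTarget(0, r2vbTarget);
    dev->SetTexture(0, vbAsTexture);
    dev->SetPixelShader(displacePS);
    dev->SetFVF(D3DFVF_XYZRHW | D3DFVF_TEX1); // pretransformed quad, trivial vertex work

    // One screen-aligned quad; the texcoord interpolators walk the source
    // "vertex buffer texture" one texel per output element (note the D3D9
    // half-pixel offset so texels and pixels line up exactly).
    const QuadVertex quad[4] = {
        { -0.5f,         -0.5f,          0.0f, 1.0f, 0.0f, 0.0f },
        { width - 0.5f,  -0.5f,          0.0f, 1.0f, 1.0f, 0.0f },
        { -0.5f,         height - 0.5f,  0.0f, 1.0f, 0.0f, 1.0f },
        { width - 0.5f,  height - 0.5f,  0.0f, 1.0f, 1.0f, 1.0f },
    };
    dev->DrawPrimitiveUP(D3DPT_TRIANGLESTRIP, 2, quad, sizeof(QuadVertex));
}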

A much more pressing question would be: how often will you have to render the vertex buffer? That depends on the effect. E.g. if the original vertex program calculates its vertex texture fetch coordinates from the transform matrices, you may be forced to repeat the whole process whenever your matrices change.
 
zeckensack said:
You can't access topology information in a vertex program either, so this is no loss.

The texture filtering algorithm depends on the size and position of the fragments on the screen.

When rendering to a VB, you're effectively rendering to an Nx1 render target; I don't know how the filtering unit can do any reasonable work in that situation. Also, a vertex is a mathematical definition; it occupies no space or area in 3D space or on the screen. The fact that a fragment can cover more than one texel is the reason we need texture filtering in the first place. For a vertex, why would we need such a capability?

What I mean by the loss of topology is that you can't use the original index buffer (by which the topology of the mesh is determined) when doing R2VB.
 