Hack the Catalyst and find a SM3 chip

zeckensack said:
W/o filters? With pure point sampling, you can't smoothly scale (or otherwise transform) texture coords. You always end up with popping artifacts if you try.
Huh? How? I would think that you'd simply sample at (x,y), (x+1,y), (x,y+1), (x+1,y+1), assuming truncation when point sampling. Then you'd use the fractional parts of x and y to generate the bilinear factors and use a MAD to do the averaging. I don't see how there would be popping artifacts here. (Note: it may be useful to give the shader programmable access to some texture commands, such as the ability to load four texture samples at once, or to generate the bilinear averaging factors from the texture coordinates.)
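For what it's worth, a minimal CPU-side sketch of that four-sample approach (in C++, with a hypothetical row-major float texture and clamp-to-edge addressing; none of the names here are real shader or API syntax) might look like this:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Point sample with clamp-to-edge addressing; 'tex' is a hypothetical
// row-major array of floats, width w by height h.
float pointSample(const std::vector<float>& tex, int w, int h, int x, int y)
{
    x = std::clamp(x, 0, w - 1);
    y = std::clamp(y, 0, h - 1);
    return tex[y * w + x];
}

// Bilinear filtering built from four point samples, exactly as described above:
// truncate to get the base texel, then use the fractional parts as the weights.
float bilinearSample(const std::vector<float>& tex, int w, int h, float u, float v)
{
    float x  = u * w - 0.5f;
    float y  = v * h - 0.5f;
    int   x0 = (int)std::floor(x);
    int   y0 = (int)std::floor(y);
    float fx = x - x0;
    float fy = y - y0;

    // Samples at (x,y), (x+1,y), (x,y+1), (x+1,y+1).
    float s00 = pointSample(tex, w, h, x0,     y0);
    float s10 = pointSample(tex, w, h, x0 + 1, y0);
    float s01 = pointSample(tex, w, h, x0,     y0 + 1);
    float s11 = pointSample(tex, w, h, x0 + 1, y0 + 1);

    // Two lerps in x and one in y -- a handful of MADs in a shader.
    float top    = s00 + fx * (s10 - s00);
    float bottom = s01 + fx * (s11 - s01);
    return top + fy * (bottom - top);
}
```

In an actual vertex shader this becomes four texture loads plus a few arithmetic instructions per fetch, which is why the question of how cheap filtering needs to be keeps coming up.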
 
Chalnoth said:
zeckensack said:
W/o filters? With pure point sampling, you can't smoothly scale (or otherwise transform) texture coords. You always end up with popping artifacts if you try.
Huh? How? I would think that you'd simply sample at (x,y), (x+1,y), (x,y+1), (x+1,y+1), assuming truncation when point sampling. Then you'd use the fractional parts of x and y to generate the bilinear factors and use a MAD to do the averaging. I don't see how there would be popping artifacts here. (Note: it may be useful to give the shader programmable access to some texture commands, such as the ability to load four texture samples at once, or to generate the bilinear averaging factors from the texture coordinates.)
Of course you can do that. My point was that everybody does it, and the bare naked point sampled VTs themselves aren't useful at all.

Just about every new technique enabled by vertex textures depends on filtering. Everyone who uses them implements an explicit bilinear filter. Filters make vertex texturing useful. They must be efficient. In contrast to DeanoC ...
DeanoC said:
One day there may be no fixed function filtering in vertex OR pixel shaders but that day is some way off.
... I believe this calls for a hardware implementation. Just like NV4x can filter FP16 textures while NV3x couldn't, I believe that, going forward, we'll see more hardware filtering rather than less.
 
Chalnoth said:
Ostsol said:
EDIT: Of course, this is only the case for an 8-bit heightmap. A heightmap that is generated in floating-point precision and stored in floating-point precision will not suffer the same way.
Huh? How does higher precision save you here? It seemed to me that your entire argument revolved around the granularity of sampling, not the precision....
Take the raw data from an algorithm that generates floating point data. My own implementation of fBm, for example, generates data in the range of [0,1] and I usually scale it to [0,255]. Obviously, you're not going to get a bunch of already rounded numbers. Storing it in an 8-bit format results in rounding, which leads to the stair-step errors I pointed out. A floating point format wouldn't round anything, of course, since it's exactly the same as the original data.

Obviously linear texture filtering isn't needed for floating point texture formats, in this case. Someone else can probably think of other cases, though.
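
As a minimal sketch of the rounding being described (with a made-up, slowly varying height standing in for real fBm output), quantising smooth float heights to 8 bits collapses runs of neighbouring vertices onto the same step:

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

int main()
{
    for (int i = 0; i < 8; ++i)
    {
        float h = 0.5f + 0.001f * i;                        // smooth, slowly varying height in [0,1]

        uint8_t stored  = (uint8_t)std::lround(h * 255.0f); // scale to [0,255] and store in 8 bits
        float  readBack = stored / 255.0f;                  // what the 8-bit texture gives back

        // Several consecutive heights read back as the identical value (a flat
        // "stair step"); an FP32 texture would have preserved each one exactly.
        std::printf("h = %.6f  ->  8-bit read-back = %.6f\n", h, readBack);
    }
    return 0;
}
```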
 
Ostsol said:
Take the raw data from an algorithm that generates floating point data. My own implementation of fBm, for example, generates data in the range of [0,1] and I usually scale it to [0,255]. Obviously, you're not going to get a bunch of already rounded numbers. Storing it in an 8-bit format results in rounding, which leads to the stair-step errors I pointed out. A floating point format wouldn't round anything, of course, since it's exactly the same as the original data.

Obviously linear texture filtering isn't needed for floating point texture formats, in this case. Someone else can probably think of other cases, though.
That's assuming there are as many texels (or more) as there are vertices, and that the texels don't vary by a large amount in the texture.
 
True. I think in such cases, I'd prefer something more than just linear filtering, though. Some form of cubic interpolation would be nice.
 
Ostsol said:
True. I think in such cases, I'd prefer something more than just linear filtering, though. Some form of cubic interpolation would be nice.
You mean in cases where there were more vertices than texels? Yes, bicubic would be great. But there's no way that's going to be fully hardware-accelerated. That said, it would be nice to get some hardware-assisted macros for both this and for bilinear filtering (since no hardware is going to fully support FP32 bilinear for some time, let alone bicubic).
 
Agreed, though bicubic would be pretty slow to emulate using shaders. 16 texture samples is a lot even for pixel shaders...
 
Ostsol said:
Agreed, though bicubic would be pretty slow to emulate using shaders. 16 texture samples is a lot even for pixel shaders...
It'd be slow even without emulation. The only part of the performance hit you could realistically hide with hardware acceleration of bicubic would be the calculation of new texture coordinates, and the calculation of the bicubic filtering coefficients.
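
To make the cost concrete, here's a rough sketch of one common flavour of bicubic (Catmull-Rom) in C++; the weight polynomials are the standard ones, but the code is only meant to show where the work goes, not any particular hardware's scheme. Even if the per-axis coefficients were computed by dedicated hardware, the 16 point samples and the blend below would still fall on the shader:

```cpp
// Catmull-Rom weights for one axis, computed from the fractional coordinate
// t in [0,1).  A full bicubic fetch needs these for x and y, plus 4x4 = 16
// point samples blended with wx[i] * wy[j].
void catmullRomWeights(float t, float w[4])
{
    float t2 = t * t;
    float t3 = t2 * t;
    w[0] = 0.5f * (-t3 + 2.0f * t2 - t);
    w[1] = 0.5f * ( 3.0f * t3 - 5.0f * t2 + 2.0f);
    w[2] = 0.5f * (-3.0f * t3 + 4.0f * t2 + t);
    w[3] = 0.5f * ( t3 - t2);
}

// The blend, given the 4x4 neighbourhood already fetched into s[4][4] and the
// fractional parts fx, fy of the texture coordinate.
float bicubic(const float s[4][4], float fx, float fy)
{
    float wx[4], wy[4];
    catmullRomWeights(fx, wx);
    catmullRomWeights(fy, wy);

    float result = 0.0f;
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            result += wy[j] * wx[i] * s[j][i];   // 16 taps blended on top of the fetches
    return result;
}
```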
 
I think that, without real dynamic geometry LOD capability in hardware, texture filtering in the VS is somewhat useless. We only need filtering when the mapping between texel and pixel (or vertex) isn't static.
 
Chalnoth said:
Yes, bicubic would be great. But there's no way that's going to be fully hardware-accelerated.
GL_SGIS_texture_filter4. According to Google, there are some GL server implementations that export this extension.
 
no_way said:
Chalnoth said:
Yes, bicubic would be great. But there's no way that's going to be fully hardware-accelerated.
GL_SGIS_texture_filter4. According to Google, there are some GL server implementations that export this extension.
1. What hardware do you know that supports that extension?
2. That extension isn't necessarily bicubic filtering, but allows other interpolation schemes.
3. We're talking about vertex textures here, anyway, and are you aware of the number of calculations a bicubic filter takes? Doing a full bicubic filter with FP32 data with nothing but special hardware would be ridiculous.
 
Chalnoth said:
GL_SGIS_texture_filter4. ...
1. What hardware do you know that supports that extension?
Apparently, some Sun super-high-end accelerators, like the XVR-4000. I have no direct knowledge though, just googling.
 
Well, that's not exactly relevant to gaming hardware.

Bicubic would be kind of nice, but the only thing that would realistically be hardware-accelerated for bicubic would be the factor calculation (still talking about vertex textures here, by the way). The muls and adds that go into the interpolation would need to be performed via the shaders.

Anyway, this is related to something that I'd really like to see in 3D hardware: access to texture filtering hardware through the shaders. One example might be an instruction that takes a 2D texture coordinate, and returns the bilinear filtering weights in a 4-vector. 1D and 3D variants would be useful as well (though the 3D variant would need to return values into two registers).

This would be useful for a variety of scenarios, including:
1. Performing texture filtering at higher precision than the hardware supports, using the shaders.
2. Performing a texture filtering operation on data calculated from a texture. For example, one could do normal mapping four times on neighboring coordinates, then use the bilinear filtering weights to calculate the end result (see the sketch after this list).
3. Could be extended to support more complex filtering schemes, such as bicubic.
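
Here's a rough sketch of what such a "weights" instruction might return, and how scenario 2 would use it; everything here, from the struct to the function names, is purely hypothetical and doesn't correspond to any existing shader instruction or API:

```cpp
#include <cmath>

struct float4 { float x, y, z, w; };

// Hypothetical "weights" instruction: a 2D texture coordinate goes in, the
// four bilinear weights come out as a 4-vector.
float4 bilinearWeights(float u, float v, int texW, int texH)
{
    float x  = u * texW - 0.5f;
    float y  = v * texH - 0.5f;
    float fx = x - std::floor(x);
    float fy = y - std::floor(y);
    return { (1 - fx) * (1 - fy),   // weight of texel (x0,   y0)
             fx       * (1 - fy),   // weight of texel (x0+1, y0)
             (1 - fx) * fy,         // weight of texel (x0,   y0+1)
             fx       * fy };       // weight of texel (x0+1, y0+1)
}

// Scenario 2: run some per-texel computation ('shade' is a placeholder) at the
// four neighbouring texels, then blend the *results* with the weights, rather
// than filtering the raw texels first.
float filterComputedResult(float u, float v, int texW, int texH,
                           float (*shade)(int x, int y))
{
    int x0 = (int)std::floor(u * texW - 0.5f);
    int y0 = (int)std::floor(v * texH - 0.5f);

    float4 wgt = bilinearWeights(u, v, texW, texH);
    return wgt.x * shade(x0,     y0)
         + wgt.y * shade(x0 + 1, y0)
         + wgt.z * shade(x0,     y0 + 1)
         + wgt.w * shade(x0 + 1, y0 + 1);
}
```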
 
zeckensack said:
Just about every new technique enabled by vertex textures depends on filtering. Everyone who uses them implements an explicit bilinear filter. Filters make vertex texturing useful. They must be efficient. In contrast to DeanoC ...
I don't think you're right about that.

To me, the biggest application in gaming is cloth simulation, as it adds life to the animation. Water simulation is also a good use of VT, but can be done quite effectively with bump/offset maps. Anyway, you just have a mesh of vertices, with the texture coordinate pointing to the centre of each texel, whose values (4xFP32) indicate the position of that vertex. In the pixel shader, each texel's values are modified by considering neighbouring values, and the vertex texture is updated ping/pong style. The only reason for filtering is when you have a greater mesh density than texture size. That seems unlikely to me, as a 1024x1024 texture is quite fast to render to (even with the simulation shader), but needs 1M verts. Even if this situation arose, you could do that with an enlargement texture.
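Roughly, that ping/pong update could look like the CPU-side sketch below, where a "relax toward the neighbour average plus a small gravity pull" rule is just a stand-in for a real cloth integrator; the buffer layout and update rule are only illustrative:

```cpp
#include <vector>

struct float4 { float x, y, z, w; };
using PositionTex = std::vector<float4>;   // width * height texels, row-major, 4xFP32 each

// One ping/pong pass: read vertex positions from 'src', write updated ones to
// 'dst'.  Next frame the two buffers swap roles, and the vertex shader fetches
// 'dst' as a vertex texture -- one texel per vertex, so no filtering needed.
void simulateStep(const PositionTex& src, PositionTex& dst, int w, int h, float stiffness)
{
    auto at = [&](int x, int y) -> const float4& {
        x = x < 0 ? 0 : (x >= w ? w - 1 : x);   // clamp at the cloth border
        y = y < 0 ? 0 : (y >= h ? h - 1 : y);
        return src[y * w + x];
    };

    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
        {
            const float4& p = at(x, y);

            // "Considering neighbouring values": average of the four neighbours.
            float ax = (at(x - 1, y).x + at(x + 1, y).x + at(x, y - 1).x + at(x, y + 1).x) * 0.25f;
            float ay = (at(x - 1, y).y + at(x + 1, y).y + at(x, y - 1).y + at(x, y + 1).y) * 0.25f;
            float az = (at(x - 1, y).z + at(x + 1, y).z + at(x, y - 1).z + at(x, y + 1).z) * 0.25f;

            dst[y * w + x] = { p.x + stiffness * (ax - p.x),
                               p.y + stiffness * (ay - p.y),
                               p.z + stiffness * (az - p.z) - 0.001f,   // crude gravity pull
                               1.0f };
        }
}
```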

As for displacement mapping, even view dependent, I'd just use multiple vertex streams, interpolated by the shader. What exactly do you mean by "every new technique"?

OT: Why the hell isn't render-to-vertex-array part of DX or OGL yet? I heard superbuffers are half exposed, but can't find any examples or real documentation. Then vertex shading in general is just there to save memory space and a bit of speed (render state changes, early-out from clipping, and simultaneous execution with PS), but you gain speed through the PS anyway.
 
Chalnoth said:
3. We're talking about vertex textures here, anyway, and are you aware of the number of calculations a bicubic filter takes? Doing a full bicubic filter with FP32 data with nothing but special hardware would be ridiculous.
Didn't R200 do some fancy interpolation schemes for TruForm? Can't remember the exact calculations, but they were definitely more complex than linear. I guess having normal vectors helps.

GeForce 3/4 also had fancy tessellation via Bezier patches, right? I don't think it's that ridiculous.
 
Chalnoth said:
Mintmaster said:
My guess is it has to do with a unified shading pipeline in ATI's future architectures stemming from R500 (well, R400 originally). The biggest advantage you get is a massive increase in vertex shading power. Aside from academic examples, however, I don't think having super long vertex programs really buys you that much better graphics.
I don't see why vertex and pixel pipelines would allow a different number of instructions in this case.
Huh? You were pondering why ATI was putting more emphasis on vertex shaders. I just gave you a reason. ATI wants devs to make games vertex-shader heavy, not only because their architectures are faster now, but also because it allows unified shader pipelines to really shine in the future.
 
Chalnoth said:
Sure, but in neither case were they done completely via specialized hardware.
For R200, I'm pretty sure it was. Later (i.e. RV250, R300) ATI just did it with a shader, IIRC. Can anyone else shed some light on this?
 
To speed things up, you can add many more transistors to each stage, or reduce the stages and increase the clock. Ultimately, it depends on your transistor budget and the time the longest stage takes to complete. So it might make more sense, especially with more generic, unified shaders, to drop the complex calculations and stop trying to hide more and more interdependencies.

The end result might take a lot more clocks, but could be much faster. And you could remove not only the transistors that speed up those complex operations, but also all the buffers needed to hide the latencies. That might make room for more pipelines/ALUs, thereby increasing throughput.
 
Ok, I'm not sure where to post this, but since this is the closest and most recent thread I could find about R520, I'll post it here.

Can anyone explain to me what "hybrid vertex textures" are in R520? I was browsing the ATi website and found that they were looking for someone for D3D Driver Development. The key responsibilities listed are:

  • Design and develop Direct3D driver software for R3XX, R520, R600 XP and Longhorn drivers
  • Develop new code for R520 HWL and hybrid vertex textures.
  • Develop new code for the R600 HWL
  • Develop new code for the Longhorn Independent layer
  • Develop new code for the WGF driver
  • Fix EPR and work on performance issues

Source: http://sh.webhire.com/servlet/av/jd?ai=405&ji=1464555&sn=I

So what are the hybrid vertex textures that are in R520? Didn't someone mention before that the vertex shaders in R520 might be more like R500's? Is this it? Since they mention both R520 and R600, it seems to me that this is a hybrid between these two cores. Is it something like vertex texture fetch, which allows vertex shaders to read data from textures like pixel shaders do, but with this "hybrid version" being more advanced?

Oh and.. please try to explain it in n00b-language. ;)
 