Jawed
1024 32-bit elements per input vertex. However, with stream out you can output, I believe, up to 16 floats per vertex, into which you could manually pack/compress data.
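A minimal sketch of the kind of manual packing meant here: two values encoded as FP16 halves and packed into a single 32-bit stream-out element. The conversion below uses simple mantissa truncation and flushes denormals to zero, so it is illustrative only, not production-quality FP16 encoding.

```cpp
#include <cstdint>
#include <cstring>

// Encode a 32-bit float as an IEEE 754 half (FP16).
// NOTE: illustrative only; truncates the mantissa (no rounding) and
// flushes denormals to zero. Real code should also handle NaN properly.
static uint16_t FloatToHalf(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    uint32_t sign = (bits >> 16) & 0x8000u;
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFF) - 127 + 15;
    uint32_t man  = (bits >> 13) & 0x3FFu;
    if (exp <= 0)  return (uint16_t)sign;            // too small: flush to 0
    if (exp >= 31) return (uint16_t)(sign | 0x7C00); // overflow: infinity
    return (uint16_t)(sign | ((uint32_t)exp << 10) | man);
}

// Pack two halves into one 32-bit element, halving the output size
// versus writing two full floats.
static uint32_t PackHalf2(float a, float b) {
    return (uint32_t)FloatToHalf(a) | ((uint32_t)FloatToHalf(b) << 16);
}
```

The vertex shader (or decoding pass) would then reinterpret each 32-bit element and unpack the two halves on read.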
Do you really expect that accesses to texture data from the vertices are going to be incoherent?
Well, you can obviously render to a texture and then sample it in a vertex shader. At that point the only difference left between VTF and R2VB is how the vertex shader fetches the data: either automatically from vertex buffers based on the vertex index, or explicitly from a texture using texture coordinates. In that regard VTF is more flexible than R2VB.
Yes of course, but with VTF you have to sample every texture you use in your vertex shader every frame, and sampling in the vertex shader is slower than in the pixel shader. Also, many textures sampled in the vertex shader are in floating-point formats, and for any advanced algorithm just sampling one four-channel texture is not enough. With R2VB, you do the sampling once and write the result into a static vertex buffer; you no longer have to sample anything when you render the vertices. R2VB is faster if you only update the data periodically.
Everything is so much easier in the console development.
The pixel shader renders in quads, and the texture cache is used more optimally. Of course, if we are talking about a theoretical architecture with no texture cache and no other texturing-specific optimizations (which are possible because texture access patterns are much more controlled in pixel shaders), then vertex and pixel texturing should behave similarly performance-wise. However, this is not the case with current DX9 and DX10 chips.
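One common reason quad rendering plays well with the texture cache is tiled/Z-order (Morton) texture layouts: the four texels under a 2x2 quad land at adjacent addresses, so a single cache line can serve the whole quad. This is a toy illustration of the addressing scheme, not any specific GPU's proprietary layout:

```cpp
#include <cstdint>

// Spread the low 16 bits of v so there is a zero bit between each pair.
static uint32_t Part1By1(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Morton / Z-order address: interleave the bits of x and y.
// The 2x2 block at (2x,2y)..(2x+1,2y+1) maps to 4 consecutive addresses.
static uint32_t MortonEncode(uint32_t x, uint32_t y) {
    return Part1By1(x) | (Part1By1(y) << 1);
}
```

For example, the quad covering (2,2)..(3,3) maps to addresses 12..15, i.e. one contiguous run, whereas in a row-major layout the same four texels are split across two rows.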
Another point we should probably bring up here is that with DX10/GL2.1 (+ NVidia's extensions) you can stream out. However, from what I've seen in GL, stream out only works with floats. Even in this case R2VB still has a possible advantage: you can write compressed output (e.g. FP16, or RGBA INT8 for colors), which can be a considerable bandwidth saving overall.
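To put a number on the RGBA INT8 case: a color held as four 32-bit floats is 16 bytes, while the same color quantized to RGBA8 is 4 bytes, a 4x saving on write bandwidth. A hedged sketch of the quantization (the helper names here are made up for illustration):

```cpp
#include <cstdint>
#include <algorithm>

// Quantize a [0,1] float to an 8-bit unorm value (round to nearest).
static uint8_t ToUnorm8(float v) {
    v = std::min(1.0f, std::max(0.0f, v)); // clamp out-of-range input
    return (uint8_t)(v * 255.0f + 0.5f);
}

// Pack a float4 color into one RGBA8 dword:
// 16 bytes of float data -> 4 bytes written.
static uint32_t PackRGBA8(float r, float g, float b, float a) {
    return (uint32_t)ToUnorm8(r)
         | ((uint32_t)ToUnorm8(g) << 8)
         | ((uint32_t)ToUnorm8(b) << 16)
         | ((uint32_t)ToUnorm8(a) << 24);
}
```

The obvious trade-off is precision: unorm8 is fine for colors but not for positions or normals, where FP16 is usually the better compressed target.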
In my InfiniteTerrainII demo I compared VTF to R2VB on R600, and VTF was about 10-20% faster than R2VB, IIRC. The reason is that with R2VB all data is fetched through vertex fetch, whereas with VTF some data is fetched through vertex fetch and the rest through the texture units. Theoretically you could double the fetch rate by fetching half the data as textures, assuming you're not limited by bandwidth or by texturing in the pixel shader.
BC4?
In your data, is there no correlation between the four channels then? I have always been a bit disappointed that there is no BC6 (or ATI4) offering four channels compressed like this. It would be a perfect fit for our material system.
Oh, yes, I remember that thread now.
If R2VB doesn't work at all on NV 8/9 series cards, sounds like R2VB isn't going to be a universal solution (which is too bad).
1) How is R2VB doing on current Nvidia drivers?
None to be found.