Tilers and DX10, any inherent difficulty?

Gunhead

Regular
[I posted this elsewhere but, realising I was hijacking a topic there, let's try here...]

Does the deferring of rendering pose any difficulty for the free access to vertex and fragment data in VS/PS 3.0 (IIUC)? From another angle, can you make a unified VS/PS unit array if you need to put a sorting & binning stage midway? What about the PowerVR-specific Infinite Planes method of HSR, any specific problem with that?

Hope these questions make sense... Of course, I hope the answer is "no problem", but in that case I'd still be very grateful for a brief explanation or guesstimate of how a TBDR would tackle the DX10 "shared" shader system :)
 
Deferring means postponing. The way a Kyro works: you collect all the triangles for the scene, you sort them into buckets according to where (which screen tile) they will show up, then you render one tile at a time. This way, when you render a tile (say 32x16 pixels) you can very effectively kill the invisible polygons (the ones behind others, which are just unnecessary work, especially with heavy pixel shader programs), and you can also keep the tile buffer on-chip and save video memory bandwidth.
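If it helps, here's the idea in completely made-up pseudo-C++ (every name and number is invented, and triangles are just screen-space boxes to keep it tiny):

```cpp
// Minimal sketch of bin-then-render-per-tile (all names invented; triangles
// are modelled as screen-space boxes just to keep the example small).
#include <cstdio>
#include <vector>

constexpr int SCREEN_W = 640, SCREEN_H = 480;
constexpr int TILE_W = 32, TILE_H = 16;
constexpr int TILES_X = SCREEN_W / TILE_W, TILES_Y = SCREEN_H / TILE_H;

struct Triangle {
    int minX, minY, maxX, maxY;  // screen-space bounding box
    float depth;                 // one depth per triangle, for brevity
    unsigned color;
};

int main() {
    std::vector<Triangle> scene = {
        {0, 0, 99, 99, 0.5f, 0xFF0000u},
        {20, 20, 79, 79, 0.2f, 0x00FF00u},  // nearer: occludes part of the first
    };

    // 1) Binning: file each triangle under every tile its bounding box touches.
    std::vector<std::vector<const Triangle*>> bins(TILES_X * TILES_Y);
    for (const Triangle& t : scene)
        for (int ty = t.minY / TILE_H; ty <= t.maxY / TILE_H && ty < TILES_Y; ++ty)
            for (int tx = t.minX / TILE_W; tx <= t.maxX / TILE_W && tx < TILES_X; ++tx)
                bins[ty * TILES_X + tx].push_back(&t);

    // 2) Per-tile pass: depth is resolved in a small on-chip buffer, so hidden
    //    pixels can be killed before any expensive shading, and the finished
    //    tile would be written to video memory in one go.
    for (int tile = 0; tile < TILES_X * TILES_Y; ++tile) {
        const int x0 = (tile % TILES_X) * TILE_W, y0 = (tile / TILES_X) * TILE_H;
        float    zbuf[TILE_H][TILE_W];
        unsigned cbuf[TILE_H][TILE_W] = {};
        for (auto& row : zbuf) for (float& z : row) z = 1.0f;

        for (const Triangle* t : bins[tile])
            for (int y = 0; y < TILE_H; ++y)
                for (int x = 0; x < TILE_W; ++x) {
                    int px = x0 + x, py = y0 + y;
                    if (px >= t->minX && px <= t->maxX &&
                        py >= t->minY && py <= t->maxY && t->depth < zbuf[y][x]) {
                        zbuf[y][x] = t->depth;  // visibility first ("HSR")...
                        cbuf[y][x] = t->color;  // ...a real chip shades only survivors
                    }
                }
        // WriteTileToFramebuffer(tile, cbuf);  // one burst write per tile
    }
    std::printf("binned %zu triangles into %zu tiles\n", scene.size(), bins.size());
}
```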

Whereas a traditional "immediate mode renderer" just sends the triangles straight through (from the CPU/T&L/VS) to be rendered in the order they come, one triangle at a time, not caring where they end up. (Although current architectures do have some funky tricks up their sleeves.)
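The immediate-mode counterpart, in the same made-up style:

```cpp
// Contrast: the immediate-mode loop (invented names again). No binning, no
// deferral: triangles are shaded in submission order against a full-screen
// Z buffer, so shading done on the far triangle is simply overwritten later.
#include <cstdio>
#include <vector>

struct Triangle { float depth; };

static void RasterizeAndShade(const Triangle& t) {  // stub for the whole pipeline
    std::printf("shade triangle at depth %.2f\n", t.depth);
}

int main() {
    std::vector<Triangle> stream = {{0.9f}, {0.1f}};  // far one comes first: wasted work
    for (const Triangle& t : stream)
        RasterizeAndShade(t);
}
```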

There are good Beyond3D articles on tile-based deferred rendering - check them out! :)

Others, don't kill me for my layman explanations :p
 
Using a shared array of VS/PS 3.0 units should not pose any particular problems for tile-based renderers as far as I can see. You perform vertex shading for one frame and pixel shading for the previous one at the same time - writing results back to memory and the binning buffers in the case of vertex shader operation, and to the tile buffer in the case of pixel shader operation. The resource sharing should be similar to what is the case in immediate-mode renderers, except that there are a few possible deadlock conditions in the IMR that the TBR avoids.

The tiler HSR algorithms should be relatively unaffected - except that when the pixel shader writes to Z, the HSR must be disabled for all affected pixels, similar to what is already the case with transparency and certain stencil operations.

(Question to the ImgTec people here, which I have been wondering about for a while: what happens in the ImgTec tiler architecture if a very large number of polygons, say 100, are deemed to be visible for the same pixel?)
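A toy sketch of what I mean (nothing here is real hardware or a real API, just the scheduling idea):

```cpp
// Hypothetical sketch: a unified shader array working on frame N's vertices
// and frame N-1's pixels at the same time. Vertex results go to memory and
// the binning buffers, pixel results to the on-chip tile buffer, so the two
// streams never share an output target.
#include <cstdio>
#include <queue>

struct VertexJob { int frame; };
struct PixelJob  { int frame; };

static void RunVertexShader(const VertexJob& j) {
    std::printf("VS: frame %d geometry -> binning buffers\n", j.frame);
}
static void RunPixelShader(const PixelJob& j) {
    std::printf("PS: frame %d fragments -> tile buffer\n", j.frame);
}

int main() {
    std::queue<VertexJob> vsQueue;  // geometry for frame N
    std::queue<PixelJob>  psQueue;  // binned tiles of frame N-1
    for (int i = 0; i < 3; ++i) { vsQueue.push({2}); psQueue.push({1}); }

    // A unified unit just alternates between whatever work is ready.
    bool preferPixel = true;
    while (!vsQueue.empty() || !psQueue.empty()) {
        if (preferPixel ? !psQueue.empty() : vsQueue.empty()) {
            RunPixelShader(psQueue.front()); psQueue.pop();
        } else {
            RunVertexShader(vsQueue.front()); vsQueue.pop();
        }
        preferPixel = !preferPixel;
    }
}
```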
 
arjan de lumens said:
The tiler HSR algorithms should be relatively unaffected - except that when the pixel shader writes to Z, the HSR must be disabled for all affected pixels, similar to what is already the case with transparency and certain stencil operations.

Hmmm, what about vertex textures (the texture fetch instruction) in VS 3.0? I mean, what if the pixel shader/pipeline updates and changes a whole lot of vertex data? Wouldn't you have discarded some vertex data by this point that you should be testing the new vertex positions against?
 
LeStoffer said:
Hmmm, what about vertex textures (the texture fetch instruction) in VS 3.0? I mean, what if the pixel shader/pipeline updates and changes a whole lot of vertex data? Wouldn't you have discarded some vertex data by this point that you should be testing the new vertex positions against?
That would open a can of family annelida on any HW system!
 
LeStoffer said:
Hmmm, what about vertex textures (the texture fetch instruction) in VS 3.0? I mean, what if the pixel shader/pipeline updates and changes a whole lot of vertex data? Wouldn't you have discarded some vertex data by this point that you should be testing the new vertex positions against?
I'm afraid I don't understand what kind of problem you are trying to explain here - wouldn't you just issue and execute the VS texture fetch instruction just like any other vertex shader instruction? It is always possible, and not even very hard, to design the pipeline so that it can keep proper track of whether it is working on a vertex or a pixel at any given time ... and those design issues would be the same for IMRs and TBRs.
 
arjan de lumens said:
I'm afraid I don't understand what kind of problem you are trying to explain here - wouldn't you just issue and execute the VS texture fetch instruction just like any other vertex shader instruction?

Okay, this thought might be too far-fetched. :arrow: I was thinking about doing displacement mapping with VS 3.0 by using the texture fetch instruction, but based on a texture rendered and updated by the pixel pipeline.

My line of thinking was that this might pose a problem for a deferred renderer, because the pixel pipeline in effect changes the vertex data within the frame (maybe adding new vertices, deleting others or just moving some around). My way of looking at a deferred renderer might be a bit faulty here, but once you are done sorting your polygons (and thus vertex data) I see the data as locked for that frame. But then what if you order the pixel pipeline to change the texture (e.g. a normal map) and try to do displacement mapping by updating with the texture fetch instruction? At this point you would have discarded vertex/polygon data that might otherwise intertwine with the newly created geometry.

Well, I told you it was far-fetched. :eek:
 
Somewhere along the line the problem of occlusion culling has to be solved; unless the solution is feedback, it is going to solve any problems with tiling too (including the display list). For now, the only thing which could interfere with tiling would be if the vertex shader used the current render target as input for its displacement mapping ... and that would be downright silly.
 
LeStoffer, the occluded vertices are still there, if that's what you're wondering.

Deferred rendering saves on raster load. And an IMR would've overwritten the old verts anyway (discarding them when lower Z values are discovered).
 
LeStoffer said:
Okay, this thought might be too far-fetched. :arrow: I was thinking about doing displacement mapping with VS 3.0 by using the texture fetch instruction, but based on a texture rendered and updated by the pixel pipeline.

My line of thinking was that this might pose a problem for a deferred renderer, because the pixel pipeline in effect changes the vertex data within the frame (maybe adding new vertices, deleting others or just moving some around). My way of looking at a deferred renderer might be a bit faulty here, but once you are done sorting your polygons (and thus vertex data) I see the data as locked for that frame. But then what if you order the pixel pipeline to change the texture (e.g. a normal map) and try to do displacement mapping by updating with the texture fetch instruction? At this point you would have discarded vertex/polygon data that might otherwise intertwine with the newly created geometry.

Well, I told you it was far-fetched. :eek:
Binning of triangle data in a deferred renderer happens after tessellation and vertex shader.

And you can't use a texture as both input and render target simultaneously (at least not with defined results). If you render to a texture and use this texture as a displacement map, you first finish rendering to that texture before using it to displace vertices, no matter whether it's an IMR or a DR.
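In other words, the ordering always looks something like this (sketched with an invented API - these are not real D3D or OpenGL calls):

```cpp
// Sketch of the required ordering with an entirely made-up API: the texture
// must be finished as a render target before any vertex shader samples it,
// on an IMR and a tiler alike.
#include <cstdio>

struct Texture { const char* name; };

struct Device {
    void SetRenderTarget(const Texture& t)  { std::printf("render into %s\n", t.name); }
    void DrawMapPass()                      { std::printf("  ...rasterize the map...\n"); }
    void Finish(const Texture& t)           { std::printf("finish %s (a tiler renders all its tiles here)\n", t.name); }
    void SetVertexTexture(const Texture& t) { std::printf("bind %s as a VS 3.0 vertex texture (read-only)\n", t.name); }
    void DrawDisplacedMesh()                { std::printf("VS fetches displacement per vertex, then bin/rasterize\n"); }
};

int main() {
    Device dev;
    Texture dispMap{"displacement_map"};

    // Pass 1: produce the map.
    dev.SetRenderTarget(dispMap);
    dev.DrawMapPass();
    dev.Finish(dispMap);        // the dependency point - nothing samples it earlier

    // Pass 2: only now consume it in the vertex shader, same visible frame.
    dev.SetVertexTexture(dispMap);
    dev.DrawDisplacedMesh();
}
```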
 
Xmas said:
Binning of triangle data in a deferred renderer happens after tessellation and vertex shader.

I understand that, but the following two replies have given me peace of mind:

Tagrineth said:
LeStoffer, the occluded vertices are still there, if that's what you're wondering.

Xmas said:
And you can't use a texture as both input and render target simultaneously (at least not with defined results).

Thanks....
 
Yeah, what I was wondering was pretty much exactly what LeStoffer asked: [simplified] If the VS side needs a texture made by the PS side, how does it work if the VS side is already doing the next scene? [/simplified]

And, if I understood what Xmas explained, the answer is that you don't even attempt to do that simultaneously, but you have the PS render the texture, save it into video memory, then have the VS use it in a new go at that part of the scene -- correct? So if you want to do that (have the PS feed the VS), you do multipass anyway?

(Game-wise I was thinking of some kind of "matter mirror" where the geometry bends to reflect a part of the visible scene, or something like that.)
 
Rendering to a texture is like rendering another frame, except it will not show on screen.

The problem with a deferred renderer is that while the texture is being rendered, binning of the geometry for the next frame should take place. So if you use a texture as input to the vertex shader, processing of these vertices has to be deferred until the texture is finished, possibly wasting some vertex processing power.
 
Um, if the VS is supposed to be doing the next frame, doesn't it waste all the VS power then? (Not just some of it.)

Like:

1) VS does frame A except the DM; PS renders previous frame for output.
2) VS does nothing; PS renders frame A to texture.
3) VS does the DM from the texture for frame A; PS does nothing.
4) VS does frame B; PS renders final frame A for output.

Is the recursive goal (render to texture, then use it for DM in the same frame) bogus to begin with? Or is there something otherwise amiss with my example above? Or are there some tricks that can prevent those two big "does nothing" slots?

Thanks for the answers so far -- I guess this round will clarify it for dumb me for good...
 
Gunhead said:
Um, if the VS is supposed to be doing the next frame, doesn't it waste all the VS power then? (Not just some of it.)
No, not all of it, because you can defer the displacement mapping until all other opaque geometry is processed. Opaque objects can be rendered order-independently; transparent objects may or may not be order-independent, depending on the implementation.

So it is only while the actual displacement mapping takes place that the rendering pipeline has nothing to do. And the vertex shader will rarely be the bottleneck.
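A toy picture of that reordering (structures invented, just to show the scheduling):

```cpp
// Sketch: draws whose vertex shader depends on a not-yet-finished render
// target are parked, all independent opaque geometry is vertex-shaded and
// binned first, and the parked draws run once the target is done - so only
// they have to wait.
#include <cstdio>
#include <vector>

struct Draw { const char* name; bool needsPendingTexture; };

int main() {
    std::vector<Draw> frame = {
        {"terrain", false},
        {"displaced mirror", true},  // VS samples the render-to-texture result
        {"props",   false},
    };

    std::vector<Draw> parked;
    for (const Draw& d : frame) {
        if (d.needsPendingTexture) { parked.push_back(d); continue; }  // defer it
        std::printf("VS + bin now: %s\n", d.name);  // order-independent, go ahead
    }
    std::printf("-- render-to-texture pass finishes --\n");
    for (const Draw& d : parked)
        std::printf("VS + bin late: %s\n", d.name); // the only stall in the frame
}
```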
 
Also, there is always double buffering: the operations might be serial at a low level ... but they can be pipelined and run in parallel at a higher level.
 