Framebuffer content and pixel shaders

Bastion

Is the existing framebuffer content for a given pixel a standard input to Pixel Shaders in the latest DX9 video cards? Thanks.
 
Bastion said:
Is the existing framebuffer content for a given pixel a standard input to Pixel Shaders in the latest DX9 video cards? Thanks.

Depends what you mean by 'standard input'. If you are talking about whether or not it is available in an input register, then no. However, AFAIK, you can access the existing frame buffer pixel data if you assign the frame buffer surface in question to a texture sampling unit. You can then perform texture lookups in a pixel shader to read in pixel data from the frame buffer surface.
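In D3D9 terms the usual trick is to copy the framebuffer into a render-target texture first, since SetTexture() wants a texture object rather than a raw surface. A minimal sketch of that copy step, assuming an initialised IDirect3DDevice9* and an A8R8G8B8 backbuffer (the function name and parameters here are illustrative, not from any particular engine):

Code:
#include <d3d9.h>

// Copy the current framebuffer into a texture so a pixel shader can sample
// it. The function name is hypothetical; assumes an A8R8G8B8 backbuffer.
IDirect3DTexture9* MakeFramebufferCopy(IDirect3DDevice9* device,
                                       UINT width, UINT height)
{
    IDirect3DTexture9* copyTex = NULL;
    // Render-target texture in the default pool, so StretchRect can write it.
    device->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
                          D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &copyTex, NULL);

    IDirect3DSurface9* backBuffer  = NULL;
    IDirect3DSurface9* copySurface = NULL;
    device->GetRenderTarget(0, &backBuffer);   // current framebuffer surface
    copyTex->GetSurfaceLevel(0, &copySurface);

    // GPU-side blit, no CPU round trip.
    device->StretchRect(backBuffer, NULL, copySurface, NULL, D3DTEXF_NONE);

    backBuffer->Release();
    copySurface->Release();
    return copyTex;   // bind with device->SetTexture(0, copyTex)
}

Bind the returned texture to a sampler and have the pixel shader look it up at the pixel's own screen coordinates; note this reads a snapshot of the framebuffer, not the live surface.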
 
Enos_Feedler said:
Depends what you mean by 'standard input'. If you are talking about whether or not it is available in an input register, then no.
Yes, this is what I meant. My next question should be fairly obvious; why not?
 
Bastion said:
Yes, this is what I meant. My next question should be fairly obvious; why not?

Due to the nature of the datapath design. The hardware datapath is designed with a typical dataflow in mind with respect to how data should move in and out of memory and around the hardware pipeline. The design criteria are a balance of performance and flexibility, and in the case of graphics we mainly take the side of performance ;)

Why is the old pixel value not simply available in registers? Registers contain interpolated vertex parameters for the triangles of the current primitive stream. The hardware registers containing interpolated params are fast to fill because the interpolated values are computed locally at the pixel shader in the hardware pipeline. To supply a register with the old pixel value would mean fetching it from frame buffer memory, which is way too slow and a waste of bandwidth when it's not being used.

That being said, you *can* get the old pixel value, it's just not set up in pixel shader registers by default. What are you trying to do anyway?
 
Bastion said:
Yes, this is what I meant. My next question should be fairly obvious; why not?
Mainly data coherency. The pixel shader needs to keep track of the cases where pixels from different polygons overlap each other, so that it can apply these pixels to the framebuffer in the correct order. With traditional fixed-function blending or Z-test, the number of pixels it needs to track is quite small, since the blending/Z-test takes a fairly short, constant time. With framebuffer data pulled into the pixel shader, you increase the read->write delay by perhaps 2 orders of magnitude for typical shaders, potentially making the data dependency checking ~2 orders of magnitude more expensive as well. There are also extra complications with situations such as:
  • edges shared between polygons - in these cases there is usually no actual overlap, but generating proofs of such non-overlap can be difficult and expensive, while not generating proofs causes false overlap detection, which destroys performance.
  • multisampling, where a pixel covers more than 1 sample - how do you choose the color (or even worse: Z value) to fetch into the shader?
 
Having the current fb value as an input was discussed for DX10. It was pretty much vetoed by the hardware manufacturers.

Doing so potentially requires resolving all previous in-flight pixels to provide the correct value, and pixel shaders are a long way from the end of the pipe.

The short non-technical version: as much as developers would like it, it's prohibitively expensive from a hardware (or, if it's not done well, performance) standpoint.
 
ERP said:
The short non-technical version: as much as developers would like it, it's prohibitively expensive from a hardware (or, if it's not done well, performance) standpoint.
It might be expensive on some architectures, yet it has been done already (S3). On some other architectures it might even be easy to do.
 
Bastion said:
Is the existing framebuffer content for a given pixel a standard input to Pixel Shaders in the latest DX9 video cards? Thanks.

Hang on a second.
First, the framebuffer content has already been fully processed. Second, if you used the approach you're suggesting, how would someone apply a shader to a single "Object" in a scene?
 
K.I.L.E.R said:
Hang on a second.
First, the framebuffer content has already been fully processed. Second, if you used the approach you're suggesting, how would someone apply a shader to a single "Object" in a scene?
If you wanted to use multipass effects on a single object, you're better off just rendering that particular object to a texture and then using that texture on the next pass.
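A hedged sketch of that render-to-texture step, under the same assumptions as the earlier snippet (a pre-created D3DUSAGE_RENDERTARGET texture; function and variable names are illustrative):

Code:
#include <d3d9.h>

// Redirect rendering for one object into a texture, then restore the
// backbuffer so the texture can be sampled on the next pass. Names are
// illustrative; assumes objectTex was created with D3DUSAGE_RENDERTARGET.
void RenderObjectToTexture(IDirect3DDevice9* device,
                           IDirect3DTexture9* objectTex)
{
    IDirect3DSurface9* oldTarget = NULL;
    IDirect3DSurface9* texTarget = NULL;
    device->GetRenderTarget(0, &oldTarget);
    objectTex->GetSurfaceLevel(0, &texTarget);

    device->SetRenderTarget(0, texTarget);   // draw into the texture
    device->Clear(0, NULL, D3DCLEAR_TARGET, D3DCOLOR_ARGB(0, 0, 0, 0),
                  1.0f, 0);
    // ... draw just the one object here ...

    device->SetRenderTarget(0, oldTarget);   // back to the framebuffer
    device->SetTexture(0, objectTex);        // input for the next pass
    texTarget->Release();
    oldTarget->Release();
}

Rendering the object straight into the texture avoids any copy, which is why it's usually preferred over grabbing the framebuffer afterwards.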
 
Bastion said:
If my original query was answered in the positive my job would be easier :)
If you answered the question of what you're doing (not why, nor where or for whom), someone could work with that and perhaps suggest a different way to make your job easier ;)
 
K.I.L.E.R said:
I don't understand what the original poster is asking and it's confusing me.
He is asking for a way to get the current framebuffer content at the current pixel location into the shader. I.e. programmable blending.
 
You can't do that, at least not in shaders.
But nothing stops him from doing the rendering, reading the data back from the framebuffer, compacting it into some compact form in a series of textures, passing those back into the GPU, and then running whatever shader operations are required after the original object has been rendered.

Problem is:
Render
Copy framebuffer content from GPU -> RAM
Analyse data and store new data into memory
Send from memory into GPU
Render

Anyone think this would be realistically fast in non-AAA titles using a PCI-Express card? (A sketch of that readback step follows below.)
Why would you do this, unless you are trying to do some form of distributed computing on your video card for something like linear algebra?

I hope I'm thinking correctly.
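For reference, the GPU -> RAM step above maps to GetRenderTargetData in D3D9. A minimal sketch, assuming an A8R8G8B8 backbuffer and caller-supplied device and dimensions (names are illustrative); the call is synchronous, which is exactly where the stall comes from:

Code:
#include <d3d9.h>

// Sketch of the readback step: pull the framebuffer to system memory.
// Assumes an A8R8G8B8 backbuffer; the function name is hypothetical.
void ReadFramebufferToSystemMemory(IDirect3DDevice9* device,
                                   UINT width, UINT height)
{
    IDirect3DSurface9* backBuffer = NULL;
    IDirect3DSurface9* sysmemCopy = NULL;
    device->GetRenderTarget(0, &backBuffer);
    device->CreateOffscreenPlainSurface(width, height, D3DFMT_A8R8G8B8,
                                        D3DPOOL_SYSTEMMEM, &sysmemCopy, NULL);

    // Synchronous: the driver has to drain all in-flight rendering first,
    // which is the stall being asked about.
    device->GetRenderTargetData(backBuffer, sysmemCopy);

    D3DLOCKED_RECT lr;
    sysmemCopy->LockRect(&lr, NULL, D3DLOCK_READONLY);
    // ... analyse lr.pBits on the CPU here ...
    sysmemCopy->UnlockRect();

    sysmemCopy->Release();
    backBuffer->Release();
}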
 
I don't see why you would want to involve the CPU here. The usual way to do this is rendertarget ping-pong, or copying the framebuffer to a texture after the opaque pass and after every transparency layer.
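A minimal sketch of the ping-pong idea, assuming two pre-created render-target textures of the same size and format (the helper name and array are illustrative): each pass samples the previous pass's output while writing the other target, so the CPU never touches the pixels.

Code:
#include <d3d9.h>

// Ping-pong between two pre-created render-target textures (hypothetical
// names). Each pass samples the previous pass's output while writing the
// other target; the pixel data stays on the GPU throughout.
void PingPongPasses(IDirect3DDevice9* device,
                    IDirect3DTexture9* target[2], int passCount)
{
    for (int pass = 0; pass < passCount; ++pass)
    {
        int write = pass & 1;   // alternate targets each pass
        int read  = write ^ 1;

        IDirect3DSurface9* surface = NULL;
        target[write]->GetSurfaceLevel(0, &surface);
        device->SetRenderTarget(0, surface);
        device->SetTexture(0, target[read]);  // previous result as input

        // ... draw a fullscreen quad with the post-process shader here ...

        surface->Release();
    }
}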
 
What it sounds like you want is an F-Buffer. The current implementations will only let you restore values from a previous pass, from all the fragments that hit a framebuffer, but not within the same pass. Still, this can fix blending issues with multi-pass rendering and provide interesting functionality.

However, there were other proposed implementations in the original Stanford paper that would let you get at the current framebuffer value(s) in the same shader, but I don't know of a hardware implementation that can do that. As has been said, the consistency and coherence issues get tricky when you have a highly parallel processor. You'd also have to bypass the write cache.

Now, if you had a generalized scatter, you could spill restore values to memory as needed...
 
From a hardware standpoint, the only way I can see of implementing it is to do what CPUs for the most part do, and block on the write in the case of a collision in the write queue.

This needn't be particularly slow if you can continue processing other pixel groups and collisions are relatively infrequent. It might lead to extremely poor performance in pathological cases, though.

Having said that, CPU write queues are probably MUCH smaller than the equivalent GPU logic would be, so it's likely expensive from a silicon standpoint.

On the software side, the traditional solution is to ping-pong as mentioned above.
 
So he only wants to access the colour values in the framebuffer?
I thought he wanted the entire thing. My bad. :LOL:
 
K.I.L.E.R said:
So he only wants to access the colour values in the framebuffer?
I thought he wanted the entire thing. My bad. :LOL:
Well, I've long learned that I can't have it all :) but your first sentence is essentially correct.

zeckensack said:
If you answered the question of what you're doing (not why, nor where or for whom), someone could work with that and perhaps suggest a different way to make your job easier ;)
For a post-process effect using multipass rendering I currently need to render to a texture. If a 3D chip allowed my shader to get at the frame buffer, I wouldn't need that texture of mine.

I knew the reason why modern 3D chips don't allow this (performance), but I didn't know how or why performance would be negatively affected. From the replies in this thread and elsewhere, I think the problem is getting data from memory to the shader units, and that some kind of stop-start must be happening. Thanks to Xmas, I am now checking out S3.
 