How will GPUs evolve in coming years?

Dave B(TotalVR) said:
There is no need for a framebuffer cache on a tiler. Sure, it would add latency, but deferred renderers would be able to hide that easily.
Yes, there is (need to store framebuffer data for one tile). Anyway, this latency couldn't be hidden unless you were willing to dedicate a separate unit to doing all FSAA calculations (I suspect most implementations of such a system would tend to be lower in performance than simple MSAA, so it seems like rather a lot of logic to dedicate just to save on the tile cache, which could be increased in size relatively easily).

Well, you could exclude edges by making your rasteriser write the edges of strips, fans, etc. instead of the actual triangles.
The problem with this is that not only can you have sharp corners (with rather different lighting) in both cases, but sometimes the most efficient rendering is actually just an optimized triangle list (strips/fans top out at one triangle per vertex as the size of each strip/fan gets large; an optimized triangle list has a theoretical limit of two triangles per vertex, though that is limited by the amount of cache available).

If we are talking PVR, then since after HSR is performed we'd know the depth of each pixel {i think, SIMON! Cmere!:}, a check could be performed to see if the z-value changes significantly from surrounding pixels. It should also be possible to determine if the next pixel is using a different texture.
Except the main problem, then, would just be a sharp corner, like, say, the corner of a room. For almost every lighting scheme out there, that corner would be very visible and rather prone to aliasing.
 
Chalnoth said:
Dave B(TotalVR) said:
There is no need for a framebuffer cache on a tiler. Sure, it would add latency, but deferred renderers would be able to hide that easily.
Yes, there is (need to store framebuffer data for one tile).

The tile buffer is stored on chip, so any framebuffer cache would be unnecessary, as the area of framebuffer needed is essentially in cache already.


Anyway, this latency couldn't be hidden unless you were willing to dedicate a separate unit to doing all FSAA calculations (I suspect most implementations of such a system would tend to be lower in performance than simple MSAA, so it seems like rather a lot of logic to dedicate just to save on the tile cache, which could be increased in size relatively easily).

I was talking about the latency of generating this 'AA buffer' before rendering the tile. You will incur a fillrate penalty for filling the extra sample points associated with your current pixel's AA level, but that hit should be no greater than the fillrate hit of MSAA, provided you choose your edges wisely.


Well, you could exclude edges by making your rasteriser write the edges of strips, fans, etc. instead of the actual triangles.
The problem with this is that not only can you have sharp corners (with rather different lighting) in both cases, but sometimes the most efficient rendering is actually just an optimized triangle list (strips/fans top out at one triangle per vertex as the size of each strip/fan gets large; an optimized triangle list has a theoretical limit of two triangles per vertex, though that is limited by the amount of cache available).

Perhaps, I don't know, but aren't you either using strips and fans or not? I am not familiar with an 'optimised triangle list'. Optimised in what way?


If we are talking PVR, then since after HSR is performed we'd know the depth of each pixel {i think, SIMON! Cmere!:}, a check could be performed to see if the z-value changes significantly from surrounding pixels. It should also be possible to determine if the next pixel is using a different texture.
Except the main problem, then, would just be a sharp corner, like, say, the corner of a room. For almost every lighting scheme out there, that corner would be very visible and rather prone to aliasing.

Might be caught by the difference in z-values. Think about the viewpoint where you would notice the aliasing at the corner of the room. Now think about the differences in z-values of the pixels either side of the ones the corner lies along. The best-case scenario is where you are looking down the length of the wall on your left at the corner: the pixel one to the right will have the same z-value (assuming a 90-degree corner), while the pixel one to the left will have a very different z-value. The worst-case scenario is the 45-degree one, where you are looking at the corner of a square room from its opposite corner. The AA may not work here, but do you need it?

Take the convex case: you have a wedge sticking out from a wall. The only time this argument is relevant is when you look at the tip from such an angle that you can see the wedge either side of the tip (i.e. end on). With a thin wedge (but not single-pixel thin, say 5-10 pixels) the difference in z-value on adjacent pixels could be picked up on again. For a fat wedge hundreds of pixels wide, the angle is going to be approaching that at which you wouldn't see the aliasing, but before that it would be AA'd. The situation ends up being one of choosing your threshold of z-buffer differences for determining whether to AA or not. Hey, you could even add the bonus quality option of not checking and just AA'ing all the time.
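To make the thresholding idea concrete, here is a rough sketch of what a per-tile depth check might look like once HSR has resolved a depth per pixel. It is only an illustration: the function name `depth_edge_mask`, the 4-neighbour comparison, and the example depths are all assumptions, not how any PowerVR part actually works.

```python
def depth_edge_mask(depth, threshold):
    """Flag pixels whose depth differs from any 4-neighbour by more than threshold.

    depth is a list of rows of floats (a hypothetical resolved tile depth buffer).
    """
    h = len(depth)
    w = len(depth[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    if abs(depth[y][x] - depth[ny][nx]) > threshold:
                        mask[y][x] = True
                        break
    return mask


# A thin wedge (depth 2.0) sticking out of a wall (depth 10.0).
tile = [
    [10.0, 10.0, 2.0, 2.0, 10.0, 10.0],
    [10.0, 10.0, 2.0, 2.0, 10.0, 10.0],
]
mask = depth_edge_mask(tile, threshold=1.0)
```

Only the pixels on either side of the wedge silhouette get flagged here. Picking `threshold` is exactly the tuning problem described above.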


This AA option could be completely programmable via manipulation of this AA buffer: you could multiply it all to determine the number of samples for each pixel and add a fixed number before or after your multiply. So you could set a base number of samples per pixel for scene-wide AA with extra AA for edges, for example.
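As a sketch of the multiply-and-add idea, the per-pixel sample count could be derived from the AA buffer roughly like this. The names `scale`, `bias_before`, `bias_after`, and `max_samples` are hypothetical parameters made up for the example, and clamping to at least one sample is an added assumption.

```python
def samples_per_pixel(aa_buffer, scale=1, bias_before=0, bias_after=0, max_samples=16):
    """Turn AA-buffer values into sample counts: (value + bias_before) * scale + bias_after.

    The result is clamped to the range [1, max_samples] (an assumption for this sketch).
    """
    return [
        [max(1, min(max_samples, (v + bias_before) * scale + bias_after)) for v in row]
        for row in aa_buffer
    ]


# 1 marks an edge pixel, 0 an interior pixel.
aa_buffer = [[0, 1, 1, 0]]

# Two samples everywhere for scene-wide AA, eight on flagged edges.
counts = samples_per_pixel(aa_buffer, scale=6, bias_after=2)
```

With these settings the interior pixels get the base count and the flagged pixels get the base count plus the scaled edge value.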

As for alpha textures, that's harder and may be solved more easily by a funky texture filtering algorithm. Nevertheless, you could render the alpha component of the texture into the AA buffer, having done an edge detect on it first. Heh, you would only need to render with a bit depth of 1. Probably would save though coz of granularity on the bus.
 
Dave B(TotalVR) said:
Chalnoth said:
Dave B(TotalVR) said:
There is no need for a framebuffer cache on a tiler. Sure, it would add latency, but deferred renderers would be able to hide that easily.
Yes, there is (need to store framebuffer data for one tile).
The tile buffer is stored on chip, so any framebuffer cache would be unnecessary, as the area of framebuffer needed is essentially in cache already.
You are totally misunderstanding what I'm saying. I'm talking about the tile buffer.

Perhaps, I don't know, but aren't you either using strips and fans or not? I am not familiar with an 'optimised triangle list'. Optimised in what way?
Optimized for a maximum number of shared vertices. The best way to optimize the list is dependent upon the specific vertex cache implementation. An optimal mesh has a 2:1 ratio of triangles to vertices, and thus if the ordering is done in such a way that cache hits are maximized, the ratio of rendered triangles to transformed vertices can be noticeably better than the 1:1 ratio for strips/fans.
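A toy simulation can illustrate the ratio. The sketch below assumes a simple FIFO post-transform vertex cache and a regular grid mesh indexed row by row; both are assumptions chosen for illustration, since real cache sizes and replacement policies vary by chip. The function names are made up for the example.

```python
from collections import deque


def grid_triangle_list(n):
    """Index list for an n x n grid of quads, two triangles per quad, row by row."""
    stride = n + 1
    tris = []
    for row in range(n):
        for col in range(n):
            a = row * stride + col  # top-left
            b = a + 1               # top-right
            c = a + stride          # bottom-left
            d = c + 1               # bottom-right
            tris.append((a, b, c))
            tris.append((b, d, c))
    return tris


def transformed_vertices(tris, cache_size):
    """Count vertex transforms (cache misses) for a FIFO cache of cache_size entries."""
    cache = deque(maxlen=cache_size)  # appending to a full deque drops the oldest entry
    misses = 0
    for tri in tris:
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses


tris = grid_triangle_list(8)                 # 128 triangles over 81 vertices
big_cache = transformed_vertices(tris, 32)   # cache holds more than a full row
tiny_cache = transformed_vertices(tris, 4)   # cache too small to reuse the previous row
ratio = len(tris) / big_cache
```

With the larger cache every vertex is transformed once, so the triangle-to-transform ratio climbs above 1:1 and approaches 2:1 as the grid grows. With the tiny cache the previous row's vertices are evicted before they are reused, and the ratio falls back to roughly what a strip gives.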

Might be caught by the difference in z-values. Think about the viewpoint where you would notice the aliasing at the corner of the room.
"Might be caught" would be much worse than "never caught." And yes, if you are in a room right now, look at the corners. You should notice significant differences in lighting from one side of the corner of the room to the other. This may not be the case if your light source is in the exact middle of the room, of course, but it certainly is the case most of the time.

Another problem with using depth differences is that you'll start turning on AA for flat (tessellated) surfaces that are situated at an oblique angle to the viewer.
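A one-dimensional example shows the false positive. The depth values and the threshold below are invented for illustration, and the helper name is hypothetical.

```python
def flagged_by_depth_step(depth_row, threshold):
    """True for each adjacent pixel pair whose depth step exceeds threshold."""
    return [abs(b - a) > threshold for a, b in zip(depth_row, depth_row[1:])]


facing = [5.0] * 6                              # flat wall facing the viewer
oblique = [5.0 + 2.0 * x for x in range(6)]     # the same flat wall, seen at a steep angle

facing_flags = flagged_by_depth_step(facing, threshold=0.5)
oblique_flags = flagged_by_depth_step(oblique, threshold=0.5)
```

The oblique wall has no edge anywhere, yet every pixel pair exceeds the threshold because the depth slope alone is larger than the cutoff.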

Anyway, there's no easy way to detect such fragments, as which edges show aliasing depends upon the lighting of the surface in question. I seriously doubt any automated algorithm would be perfect.

As far as alpha textures are concerned, there's a much nicer way to do it on a deferred renderer: simply use an alpha blend. Since the TBDR automatically does the depth sorting, this isn't a problem at all.
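For reference, this is roughly what a sorted alpha blend amounts to for a single pixel. It is a minimal sketch assuming the fragments for the pixel are available with their depths and that larger depth means farther away; `blend_pixel` is a hypothetical name, and real hardware would not do this in software.

```python
def blend_pixel(background, fragments):
    """Composite translucent fragments over background, farthest first.

    fragments is a list of (depth, color, alpha); larger depth means farther away.
    """
    color = background
    for _depth, frag_color, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        # Standard "over" blend: source * alpha + destination * (1 - alpha).
        color = tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(frag_color, color))
    return color


near_red = (1.0, (1.0, 0.0, 0.0), 0.5)
far_green = (2.0, (0.0, 1.0, 0.0), 1.0)

# Submission order does not matter because the fragments are sorted per pixel.
result_a = blend_pixel((0.0, 0.0, 0.0), [near_red, far_green])
result_b = blend_pixel((0.0, 0.0, 0.0), [far_green, near_red])
```

Because the sort happens per pixel, the application does not have to submit translucent geometry in depth order, which is the point being made about the TBDR.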
 
JF_Aidan_Pryde said:
Can 256 Pipelines, 5GHz, fully programmable hardware do the job?


Probably not. The 1st GSCube had 256 pipelines (16 GS * 16 pipes), the 2nd GSCube had 1024 pipelines (64 GS * 16 pipes), and that was not enough for RenderMan-level graphics.

j/k. I know it takes much more than the number of pipelines to reach film-quality CG in realtime.
 
You are totally misunderstanding what I'm saying. I'm talking about the tile buffer.

Well, this AA'ing will not reduce the amount of caching done compared with MSAA. I suppose it would compared with SSAA, because you would be writing fewer tiles (although they will be correspondingly smaller, so it's an efficiency thing).



"Might be caught would be much worse than never caught. And yes, if you are in a room right now, look at the corners. You should notice significant differences in lighting from one side of the corner of the room to the other. This may not be the case if your light source is in the exact middle of the room, of course, but it certainly is the case most of the time."


Yeah, but the only significance of any difference in lighting is going to be between the two surfaces where they meet. If there is a very large difference at this point, then can't this be determined by, say, vertex lighting?


"Another problem with using depth differences is that you'll start turning on AA for flat (tessellated) surfaces that are situated at an oblique angle to the viewer."

Not if they are part of the same strip, which is likely if strips are being used (which is also likely).


"Anyway, there's no easy way to detect such fragments, as which edges show aliasing depends upon the lighting of the surface in question. I seriously doubt any automated algorithm would be perfect."

Back to the vertex lighting idea again. Calculate the light difference between these pixels (dual vertex light calc, one for each surface on that given pixel). That's adding work to your vertex shader, lol. AA that comes with a polygon throughput hit =)


"As far as alpha textures are concerned, there's a much nicer way to do it on a deferred renderer: simply use an alpha blend. Since the TBDR automatically does the depth sorting, this isn't a problem at all."

Good point, hadn't thought of that.
 