How will GPUs evolve in coming years?

_xxx_ said:
I say all gfx chips will soon be based on some kind of deferred rendering, regardless of lighting/shadowing (which will become better, for sure).
I seriously don't think so. Too much caching of unbounded data sets.
 
Nothing new under the sun.

The main problem will be RAM bandwidth, I think, so I expect RAM technology to become more and more important. That probably means anything that diminishes the needed bandwidth will be very valuable - that's why I expect research on compression for GPUs to become more intense, for instance.
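Just to illustrate what I mean by compression saving bandwidth, here's a toy sketch (completely made up, not any vendor's actual format) of lossless tile compression in C++: store one anchor byte per 4x4 tile plus small per-pixel deltas, and fall back to raw data when the deltas don't fit.

#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical lossless compressor for a 4x4 tile of 8-bit values.
// If every pixel is within +/-7 of the tile's first pixel, the tile is stored
// as 1 anchor byte + 16 signed 4-bit deltas (9 bytes instead of 16); otherwise
// it is stored raw. Real GPU colour/Z compression is far more elaborate - this
// only illustrates the bandwidth-saving idea.
struct CompressedTile {
    bool compressed;              // true -> anchor + packed deltas, false -> raw copy
    std::vector<uint8_t> payload; // 9 bytes if compressed, 16 if raw
};

CompressedTile compressTile(const uint8_t px[16]) {
    CompressedTile out;
    const int anchor = px[0];
    bool fits = true;
    for (int i = 0; i < 16; ++i)
        if (std::abs(px[i] - anchor) > 7) { fits = false; break; }

    if (fits) {
        out.compressed = true;
        out.payload.push_back(static_cast<uint8_t>(anchor));
        for (int i = 0; i < 16; i += 2) {
            // Pack two signed 4-bit deltas per byte (two's-complement nibbles).
            const int d0 = (px[i]     - anchor) & 0xF;
            const int d1 = (px[i + 1] - anchor) & 0xF;
            out.payload.push_back(static_cast<uint8_t>(d0 | (d1 << 4)));
        }
    } else {
        out.compressed = false;
        out.payload.assign(px, px + 16);
    }
    return out;
}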

Of course the increase in arithmetic intensity should continue to push devs to do "better pixels", but with a PPP it'll be easy to do lots of small triangles too - just not too small, if we still want graphics processors to remain as efficient as they are today. Maybe quadrics will make their entry in real-time? (I'm just throwing out an idea, I haven't looked at it.) Anyway, the PPP should be great for compressing the RAM representation of shapes, so it should be useful at least for that - not to mention it should make it easier to do all sorts of "multi-dude" thingies.

I'm not sure we'll see any "revolutionary" change; on the contrary, my feeling is that the IMR will evolve and remain more efficient than the other approaches. I might be wrong, however. (Aaah!!!! The wonders of the debates about [ ] real-time raytracing / [ ] tile-based rendering / [ ] other - please specify _________ ;) )

It'll be interesting to watch all this anyway... :)
 
Remi said:
Of course the increase in arithmetic intensity should continue to push devs to do "better pixels", but with a PPP it'll be easy to do lots of small triangles too - just not too small, if we still want graphics processors to remain as efficient as they are today.
Well, with some modifications to the core, the only reason lots of small polygons would be inefficient is the ineffectiveness of z-buffer compression under such a scheme. But even then, z-buffer compression might be changed so that, instead of just storing flat surfaces, it starts storing curved surfaces as a compression method.
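As a rough illustration of the flat-surface case (not a real vendor's format; the curved-surface variant would just replace the plane with a low-order patch):

#include <array>
#include <cmath>
#include <optional>

// Plane-based Z compression sketch for an 8x8 tile: if the whole tile lies on
// one plane z = a + b*x + c*y, the 64 depth values can be replaced by three
// coefficients; if not, the tile is stored uncompressed.
struct DepthPlane { float a, b, c; };

std::optional<DepthPlane> fitTilePlane(const std::array<float, 64>& z) {
    // Derive the plane from three corner samples of the tile (row-major, y*8+x).
    DepthPlane p;
    p.a = z[0];                  // z at (0,0)
    p.b = (z[7]  - z[0]) / 7.0f; // slope in x, from (7,0)
    p.c = (z[56] - z[0]) / 7.0f; // slope in y, from (0,7)

    // Accept the plane only if it reproduces every sample (within epsilon).
    for (int y = 0; y < 8; ++y)
        for (int x = 0; x < 8; ++x)
            if (std::fabs(p.a + p.b * x + p.c * y - z[y * 8 + x]) > 1e-6f)
                return std::nullopt; // tile isn't planar; fall back to raw storage
    return p;
}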
 
Sage said:
Remi said:
Maybe quadrics will make their entry in real-time ?

there's a little failure called the NV1 you might want to look up... ;)

That doesn't mean it was a bad idea, just not supported. Personally I think quadratics/HOS are absolutely necessary for us to move forward. If nothing else, think of them as very good geometry compression, or as always having the proper LOD for the object's distance from the screen.
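To make the "geometry compression + automatic LOD" point concrete, here's a rough sketch (the names and the LOD heuristic are made up): a sphere stored as nothing but a centre and radius - the higher-order surface - and tessellated at draw time with a triangle count picked from its distance to the viewer.

#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// The "model" is just centre + radius; the mesh density is decided per frame.
std::vector<Vec3> tessellateSphere(Vec3 centre, float radius, float distance) {
    // Fewer segments when the sphere is far away, more when it is close.
    int segments = static_cast<int>(64.0f * radius / (distance + 1.0f));
    if (segments < 4)  segments = 4;
    if (segments > 64) segments = 64;

    std::vector<Vec3> verts;
    const float pi = 3.14159265f;
    for (int i = 0; i <= segments; ++i) {      // latitude rings
        float theta = pi * i / segments;
        for (int j = 0; j <= segments; ++j) {  // longitude
            float phi = 2.0f * pi * j / segments;
            verts.push_back({ centre.x + radius * std::sin(theta) * std::cos(phi),
                              centre.y + radius * std::cos(theta),
                              centre.z + radius * std::sin(theta) * std::sin(phi) });
        }
    }
    return verts; // a real implementation would also emit the triangle indices
}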
 
Killer-Kris said:
That doesn't mean it was a bad idea, just not supported. Personally I think quadratics/HOS are absolutely necessary for us to move forward. If nothing else, think of them as very good geometry compression, or as always having the proper LOD for the object's distance from the screen.
Well, no, the problem with the NV1 was that its quadrics were horrible to attempt to program for.
 
Chalnoth said:
Killer-Kris said:
That doesn't mean it was a bad idea, just not supported. Personally I think quadratics/HOS are absolutely necessary for us to move forward. If nothing else, think of them as very good geometry compression, or as always having the proper LOD for the object's distance from the screen.
Well, no, the problem with the NV1 was that its quadrics were horrible to attempt to program for.

I actually know nothing about the NV1 and its specific implementation; I was just being generic with the idea of higher-order surfaces.
 
Sage said:
there's a little failure called the NV1 you might want to look up...
Wow.... Thanks for the info. I knew it did quadrangles, but I didn't know they attempted quadrics...

Chalnoth said:
Well, with some modifications to the core, the only reason lots of small polygons would be inefficient is the ineffectiveness of z-buffer compression under such a scheme.
I was under the impression that the rasterization order was rather important to keep texels flowing without much trouble. Really small triangles (let's say 4-5 pixels) would rather defeat that scheme, right? Does this mean that today's processors can cope better with a more hectic fetch order?
 
Remi said:
I was under the impression that the rasterization order was rather important to keep texels flowing without much trouble. Really small triangles (let's say 4-5 pixels) would rather defeat that scheme, right? Does this mean that the processors can now cope better with a more hectic fetch order?
Right, so you'd need to generalize the way pixels are dispatched to the pixel pipelines. Specifically, you'd make use of some sort of tiling mechanism where you attempt to cache as many pixels as you can, then sort them into tiles and render each tile separately. I believe ATI already does this.

Then, after that, all you need to do is make sure that all pixels on a quad need not be executed on the same triangle (though you may still require them to be using the same shader, same textures, etc.). Obviously this will take more transistors than what is currently done, but shouldn't be any less efficient on memory accesses.
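Something like this toy binning sketch (just the idea, certainly not ATI's actual hardware): cache the incoming fragments, sort them into 16x16 screen tiles, then shade tile by tile so framebuffer and texture accesses stay localized even when the triangles are tiny.

#include <cstdint>
#include <unordered_map>
#include <vector>

struct Fragment {
    int x, y;          // screen position
    uint32_t triangle; // which triangle produced it
};

constexpr int kTileSize = 16;

void renderBinned(const std::vector<Fragment>& fragments) {
    // 1. Bin: group cached fragments by the screen tile they fall in.
    std::unordered_map<uint64_t, std::vector<Fragment>> bins;
    for (const Fragment& f : fragments) {
        uint64_t key = (static_cast<uint64_t>(f.y / kTileSize) << 32) |
                       static_cast<uint32_t>(f.x / kTileSize);
        bins[key].push_back(f);
    }
    // 2. Shade tile by tile: every fragment in a bin touches the same small
    //    framebuffer region, regardless of which triangle it came from.
    for (const auto& bin : bins) {
        for (const Fragment& f : bin.second) {
            // shadeFragment(f); // hypothetical shading stage
            (void)f;
        }
    }
}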
 
Chalnoth said:
...you attempt to cache as many pixels as you can, then sort them into tiles and render each tile separately. I believe ATI already does this.
Ok, it does make sense then.

Chalnoth said:
...Then, after that, all you need to do is make sure that all pixels on a quad need not be executed on the same triangle...
With unified hardware units (helping to minimize communication costs, as Dave hinted), each unit should be able to act as a VS and therefore have its own control logic. Doesn't that mean the end of quads? (This is rather unclear to me for now - 4-way SIMD at the PS looks like an idea good enough to be kept...) If that's the case, then we might be a lot closer to (really) small triangles than I thought we were...

Edit: Never mind. I believe I'm thinking too much and too early of the R5xx's architecture... :mrgreen:
 
Well, you don't want to end quad-based rendering, because it has some nice memory access statistics. Unless, of course, you find another location-based rendering system that works better for the transistor usage (vertex data doesn't need any location-based processing to work well).
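For what it's worth, a rough sketch of the usual argument for keeping quads (not any specific GPU's math): a 2x2 quad gives you screen-space UV derivatives for free by differencing neighbours, which is exactly what mip-level selection needs, and the four fetches land close together in the texture cache.

#include <algorithm>
#include <cmath>

struct UV { float u, v; };

// quad layout: [0]=top-left, [1]=top-right, [2]=bottom-left, [3]=bottom-right
float mipLevelForQuad(const UV quad[4], float texWidth, float texHeight) {
    // Finite differences across the quad approximate du/dx, dv/dx, du/dy, dv/dy.
    float dudx = (quad[1].u - quad[0].u) * texWidth;
    float dvdx = (quad[1].v - quad[0].v) * texHeight;
    float dudy = (quad[2].u - quad[0].u) * texWidth;
    float dvdy = (quad[2].v - quad[0].v) * texHeight;

    // Standard footprint estimate: log2 of the longer screen-space gradient.
    float lenX = std::sqrt(dudx * dudx + dvdx * dvdx);
    float lenY = std::sqrt(dudy * dudy + dvdy * dvdy);
    return std::max(0.0f, std::log2(std::max(lenX, lenY)));
}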
 
Simon F said:
maniac said:
I don't think in 2017 they'll use textures anymore. Everything will be shader or other stuff they come up with.
Absolutely, except that those shaders will use a big table of constants :) :p
Sounds like a texture to me. ;)
 
JF_Aidan_Pryde said:
Ailuros said:
Probably a stupid layman idea from my behalf, but would something like a SoC stand a chance in the not so foreseeable future?

Ail, what kind of "System on chip"?

Just a weird idea that popped into my mind. I've no idea if it would even make sense; with eDRAM slowly making its appearance in larger quantities and CPUs/GPUs constantly moving closer to each other, I figured it might become an option. Something like a CPU, a GPU and RAM embedded on the same "die".
 
I think we will soon see Oracle running on video cards to keep track of all the objects.


select polygon from monster_table where monster = 'GodZilla';
 
What, if anything, would have to change for true 3D displays? Not stereoscopic, but 3D displays that allow for a 360-degree walk-around (holographic?).
 
Well, if you fiddle with your rasteriser so that it produces a buffer storing zeros inside polygons and other numbers (depending on the number of intersecting edges) at the edges of a polygon, you get a per-pixel buffer representative of the number of samples (multiplied by an AA-level constant) required to decently AA the scene.
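Something along these lines (purely illustrative, and the per-edge sample constant is assumed): the rasteriser leaves 0 for pixels strictly inside a polygon and bumps a counter for every edge crossing a pixel, and the shading stage turns that count into a sample budget.

#include <cstdint>
#include <vector>

constexpr int kSamplesPerEdge = 4; // assumed AA-level constant

struct EdgeBuffer {
    int width, height;
    std::vector<uint8_t> edgeCount; // one counter per pixel, 0 = interior

    EdgeBuffer(int w, int h) : width(w), height(h), edgeCount(w * h, 0) {}

    // Called by the rasteriser for each pixel a polygon edge passes through.
    void markEdgePixel(int x, int y) {
        uint8_t& c = edgeCount[y * width + x];
        if (c < 255) ++c;
    }

    // Consulted later to decide how many AA samples a pixel deserves.
    int samplesNeeded(int x, int y) const {
        int count = edgeCount[y * width + x];
        return count == 0 ? 1 : count * kSamplesPerEdge;
    }
};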

Perhaps some funky way of including alpha textures in with this would allow them to be AA'ed too.

Once you have this buffer I imagine programming a pixel shader to use it would be a simple task.

(this sounds to me like it would be much easier to do on a deferred renderer) ;)
 
Well, it's not really possible to do that sort of thing on an IMR. A deferred renderer may use such a system to reduce the amount of framebuffer cache that the GPU needs per tile. Such a system would add some latency to the pipeline.

That said, if you really want to make such a technique optimal, you're not going to want to AA every triangle edge. But there's probably not any truly robust system for proper edge detection (as Matrox' FAA showed us).
 
Chalnoth said:
Well, it's not really possible to do that sort of thing on an IMR. A deferred renderer may use such a system to reduce the amount of framebuffer caching that needs to be completed. Such a system would add some latency to the pipeline.

That said, if you really want to make such a technique optimal, you're not going to want to AA every triangle edge. But there's probably not any truly robust system for proper edge detection (as Matrox' FAA showed us).

There is no need for a framebuffer cache on a tiler; sure, it would add latency, but deferred renderers would be able to hide that easily.

Well, you could exclude edges by making your rasteriser write the edges of strips and fans etc. instead of the actual triangles. If we are talking PVR, then since we'd know the depth of each pixel after HSR is performed {I think - SIMON! C'mere!:}, a check could be performed to see if the z-value changes significantly from the surrounding pixels. It should also be possible to determine if the next pixel is using a different texture.
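Roughly like this (illustrative only - the threshold is arbitrary and the structures are made up): once per-pixel depth and texture ID are known after HSR, flag a pixel when its z jumps versus its right/bottom neighbour, or when that neighbour samples a different texture.

#include <cmath>
#include <cstdint>
#include <vector>

struct ResolvedPixel { float z; uint32_t textureId; };

// Marks pixels whose depth or texture changes abruptly; only those would get AA.
std::vector<bool> detectEdges(const std::vector<ResolvedPixel>& px,
                              int width, int height, float zThreshold = 0.01f) {
    std::vector<bool> isEdge(px.size(), false);
    for (int y = 0; y < height - 1; ++y) {
        for (int x = 0; x < width - 1; ++x) {
            const ResolvedPixel& c     = px[y * width + x];
            const ResolvedPixel& right = px[y * width + x + 1];
            const ResolvedPixel& below = px[(y + 1) * width + x];
            bool zJump = std::fabs(c.z - right.z) > zThreshold ||
                         std::fabs(c.z - below.z) > zThreshold;
            bool texChange = c.textureId != right.textureId ||
                             c.textureId != below.textureId;
            if (zJump || texChange)
                isEdge[y * width + x] = true;
        }
    }
    return isEdge;
}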
 