A dream I had (partial fixed function 'GPUs')

PsychoZA

Newcomer
Last night I dreamed that graphics card processors would make a return to partial fixed function operation. They would be part GPU, but have set functions for performing depth of field, fresnel water effects, normal/specular/parallax mapping, high dynamic range, motion blur and maybe a shadow map technique. So the next iteration of the nVidia line would have the same number of stream processors as the current generation, and all the extra transistors would go to the fixed function items. The performance increase would be the same as when graphics cards first took over 3D functions, à la the Voodoo 1, but would only be noticed if the game was patched to support it. The functions would be called inside fragment programs so the data generated by the fixed functions could be used in the shaders.

The thing is, that sounds quite plausible to me. GPUs have become so versatile that they surely can't be as fast as fixed function. They still have the ROPs, but with every game on the market sporting all those features I listed above, surely it could actually be better to have a hybrid approach? The operations wouldn't have to go through thread schedulers and the data wouldn't have to be passed around; it would just go straight through and come out the other end done. Would those functions require too many transistors to be practical?

I see the performance hit that the R6xx takes due to not having normal fixed function AA as the inspiration for this dream. The original move away from doing graphics on x86 processors was for this exact reason. I remember reading a quote from an old 3dfx employee who said they would take a mathematical function, like the 2D transform, and just slap it onto the die. Put the number in, get the answer out. Another reason for the dream could be the Wii, as I've heard it's really easy to develop for and get good results because the GPU is fixed function.

Sure there would be some problems, like the graphics card not being 'future compatible', but I've always believed any hope of a graphics card lasting longer than 2 years and still being able to play games decently is nothing more than wishful thinking. Patching games would be easy, as you could just take the shaders, cut out a big chunk of math in the middle and put in the function call with the custom parameters.

What other problems would there be with that approach, or is it really just too far-fetched? Just remember I'm talking hybrid here, you have to keep shaders for infinitely obvious reasons, so I'm not even considering going totally backwards. It just seems like every developer reinvents the wheel by writing their own shader code for something that's done in 100 other games, then the hardware company has to go in and optimize their driver so it gets good performance because the programmer did things slightly differently, even though he got the same result.
 
OK four points:

1. You should not be dreaming of such things; I am sure it is not "normal". ;)

2. Assuming R6xx is not performing well because it does not have enough fixed function hardware is not entirely correct.

3. Patching games to make use of your fixed function hardware is a recipe for disaster - an IHV pushing more work onto ISVs is not what the gaming industry wants to hear.

4. Microsoft would need to be in on this from day one, but the direction they have taken is the opposite of what you are suggesting with regard to DX.

IMHO... and I am not an expert by any stretch..
 
You just don't understand how GPUs work and how the effects you describe are programmed... Removal of the fixed-function pipeline is the best thing that ever happened to GPUs. Your proposal is as ridiculous as the idea of designing CPUs that have zip compression, MS Word menu bar rendering and a minesweeper game as hardware features.
 
OK four points:

1. You should not be dreaming of such things; I am sure it is not "normal". ;)

2. Assuming R6xx is not performing well because it does not have enough fixed function hardware is not entirely correct.

3. Patching games to make use of your fixed function hardware is a recipe for disaster - an IHV pushing more work onto ISVs is not what the gaming industry wants to hear.

4. Microsoft would need to be in on this from day one, but the direction they have taken is the opposite of what you are suggesting with regard to DX.

IMHO... and I am not an expert by any stretch..

1. I take SSRIs and they give me the most insanely realistic and vivid dreams. It's kinda fun for the most part.

2. It was only with regard to the AA performance drop. There are many, many other issues there...

3. The old games would perform just as well, as the 'current gen' GPU is essentially still there, so patching wouldn't be necessary. With games like Doom that let you modify the shader code, it would even be possible to mod in support.

4. The functions would be called from within the fragment programs, so surely it's not that big a deal from the software side? It's more down to HLSL and the drivers.

@Zengar: That was nonconstructive and trollish, don't you think? The GPU exists because graphics lends itself easily to hardware acceleration. Fixed function was the beginning because software wasn't fast enough; now it seems we've swung full circle and we're almost back where we started. I also went to great lengths to mention that it was a hybrid 'idea': create fixed function versions of the shaders that every game has. That can be done now, but couldn't be done in the past because they couldn't easily predict which functions would be most needed. I have a fairly good idea of how GPUs and CPUs, in particular, work, as I have done Electronic Engineering.

It can't be completely fixed function because that reduces creativity and limits the imagination, but a pure 'software' GPU is the other end of the scale, where you have almost limitless creativity but a speed disadvantage (although not an apparent one, because we don't have 700 MHz, 700M-transistor fixed function graphics cards to compare it to). I see the ultimate solution as a balance between the two. Well, that's at least how I saw it in my dream and it seems to make sense. I would just like to know some real reasons why not. Not some rant about Word menu bars...
 
Maybe I was a bit harsh, as I am not in a good mood today (it was a long, terrible week for me). Still, your idea is absolute nonsense. We got off fixed function because we have programmable hardware that provides similar performance -- and even if we have a 5% performance loss here, no one cares. Implementing particular effects in hardware introduces overhead in (a) chip design and complexity and (b) dataflow. You talk about a "water fresnel effect" function. I can't understand what that is supposed to be (and I have implemented enough water shaders), but let us imagine we have such a unit. Where will it be located? Will we have one such unit per shader processing unit (resulting in a size/complexity increase)? Or should such a unit be placed separately (resulting in stalls when multiple fragments need it)?

Also, you speak of "hardware support" for HDR and shadow maps? I am sorry? Every game out there uses a different algorithm; besides, GPUs already support everything you need -- floating point textures, framebuffers and blending. A hardware function that supports normal mapping? Well, it is a dot product - a single instruction on most GPUs. Motion blur? The only conclusion I can make from reading your post is that you have never actually done any graphics programming.

So my comparison about Word menus wasn't a rant; it is actually exactly the thing you propose: take some functionality many programs (may) use and add a hardware unit for it. This is a horror for the software developer (because it takes away his freedom), the hardware developer (extreme hardware complexity), the driver team (same thing) and the consumer (poor support, bad product quality, lots of bugs, no new features).

It only makes sense to create special-function units for tasks that are (a) simple, (b) time-consuming and (c) repetitive (used often). In other words: optimize bottlenecks. This is what GPUs are doing. You have hardware support (fixed function) exactly where it is needed: texture sampling (an operation that requires lots of math but doesn't have to be flexible), AA framebuffer operations, depth buffer, stencil buffer, rasterization.
 
functions for performing depth of field, fresnel water effects, normal/specular/parallax mapping, high dynamic range, motion blur and maybe a shadow map technique.

HDR is already fully hardware accelerated, as floating point blending and filtering are fully supported by the fixed blending hardware. No extra shader instructions are needed to support HDR.

Normal mapping is basically just a dot product (and that takes one cycle on all new hardware). Of course you have to calculate the normal vector first, but that cannot be done by fixed function hardware, as your input might be in world space or tangent space, might be packed in 2 components (you calculate the third component in the shader - for world space you also need to store sign bits somewhere and do the extra mul), or might be just a heightmap (that you sample with fetch4 and calculate the normal in the shader).
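
To make that concrete, here's a minimal CPU-side sketch in plain C++ (not real shader code; the vector type and function names are just for illustration) of the two-component reconstruction plus the clamped dot product that normal mapping boils down to:

```cpp
// Illustrative only: reconstruct a tangent-space normal stored in two
// components, then compute the clamped N.L diffuse term.
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Two-channel normal map texel with x, y already mapped to [-1, 1];
// z is rebuilt from the unit-length constraint.
Vec3 reconstructNormal(float nx, float ny) {
    float nz = std::sqrt(std::max(0.0f, 1.0f - nx * nx - ny * ny));
    return { nx, ny, nz };
}

// The per-pixel lighting term itself: a single dot product, clamped at zero.
float diffuseTerm(const Vec3& n, const Vec3& lightDir) {
    return std::max(0.0f, dot(n, lightDir));
}
```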

Shadow mapping is also fully supported by the fixed pipe. You render the scene from the light's perspective to a depth/stencil texture (no shaders needed for this at all). Then you calculate a projection matrix (CPU) and sample this texture (with projection divide) in your surface shader. Fixed function PCF is also supported by all GeForces. No extra shader instructions needed, it's all fixed function already.
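
As a rough illustration (plain C++ rather than shader code; the sampler callback and the 2x2 kernel size are just assumptions for the sketch), the shader-side part of the technique is essentially a projected lookup plus a depth compare:

```cpp
// Illustrative only: given a pixel's light-space (u, v, depth), compare it
// against the stored shadow map with a small 2x2 percentage-closer filter.
#include <functional>

// Returns 0 (fully in shadow) .. 1 (fully lit).
float shadowPCF(const std::function<float(float, float)>& depthMap,
                float u, float v, float depth,
                float texelSize, float bias) {
    float lit = 0.0f;
    for (int dy = 0; dy < 2; ++dy) {
        for (int dx = 0; dx < 2; ++dx) {
            // Fetch the depth stored by the light-space pass...
            float stored = depthMap(u + dx * texelSize, v + dy * texelSize);
            // ...and do the fixed-function-style depth compare.
            lit += (depth - bias <= stored) ? 1.0f : 0.0f;
        }
    }
    return lit / 4.0f;  // average the four compares
}
```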

Motion blur is a more complex issue. If you want 100% correct motion blur, the processing power requirements are infinite: for a perfect result you'd have to sample the scene an infinite number of times across the frame's time period and calculate the average. Countless approximations have been developed, but none is perfect for every situation. For example, our deferred renderer calculates the motion vectors of each pixel and stores them in a buffer. Then it combines the necessary last-frame g-buffers and the current g-buffers to calculate the correct blurring area. It's a screen space process and provides the quality we need. However it's not physically perfect, and would generate artifacts in certain cases. I would be happy to see our algorithm perform 10% faster in fixed function, but I doubt anyone else would agree with me (and no sane hardware manufacturer would spend tens of millions of transistors to support it fully on the fixed pipe and preallocate the extra frame buffers for it).
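
Just to show the general shape of a velocity-buffer approach (this is not our exact algorithm; the plain C++ names and the sampler callback are purely illustrative), the per-pixel work is roughly "average a few colour samples taken back along the motion vector":

```cpp
// Illustrative only: blur a pixel by averaging colour samples taken along its
// screen-space motion vector, as a stand-in for a velocity-buffer blur pass.
#include <functional>

struct Color { float r, g, b; };

Color motionBlur(const std::function<Color(float, float)>& colorAt,
                 float u, float v,        // pixel position in [0, 1]
                 float velU, float velV,  // per-pixel screen-space motion vector
                 int samples) {
    Color sum = { 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < samples; ++i) {
        // Step backwards along the motion vector and accumulate colour.
        float t = (samples > 1) ? float(i) / float(samples - 1) : 0.0f;
        Color c = colorAt(u - velU * t, v - velV * t);
        sum.r += c.r; sum.g += c.g; sum.b += c.b;
    }
    sum.r /= samples; sum.g /= samples; sum.b /= samples;
    return sum;
}
```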

I wouldn't really want to comment on water effects, as I haven't coded them for a long time. However, as you talk about fresnel, you would need the input of all lights affecting the scene to calculate the effect. The problem with any fixed function lighting calculation is that the lighting formula and material definition currently differ a lot from engine to engine. To use the fixed function calculation, you'd have to convert your material definition and lights to a format supported by the fixed function fresnel calculation. Also, a fully realistic fresnel effect needs proper raytracing (as it simulates ray bending from a lens), and is almost impossible to do on current raster/scanline based rendering hardware. The fixed function version would be an approximation at best, and I doubt it would be the most suitable one for all games needing it. Why waste millions of transistors on a feature that only a few games use?
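
For what it's worth, the fresnel term most shaders actually evaluate is something like Schlick's approximation (my example here, written as plain C++ rather than shader code) - a handful of instructions, hardly worth a dedicated unit:

```cpp
// Illustrative only: Schlick's approximation of the fresnel reflectance term.
#include <algorithm>
#include <cmath>

// f0: reflectance at normal incidence; cosTheta: dot(N, V) for the surface.
float fresnelSchlick(float f0, float cosTheta) {
    cosTheta = std::min(std::max(cosTheta, 0.0f), 1.0f);  // clamp to [0, 1]
    return f0 + (1.0f - f0) * std::pow(1.0f - cosTheta, 5.0f);
}
```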
 
Not only will we move towards total programmability, but I also expect discrete gfx hardware, sound chips etc. to completely disappear in the foreseeable future.

What is needed is flexibility in both SW and HW - and that's exactly the opposite of what fixed HW does/implies.
 
Not only will we move towards total programmability, but I also expect discrete gfx hardware, sound chips etc. to completely disappear in the foreseeable future.

The question being whether general purpose hardware will ever be able to efficiently replace dedicated hardware. For demanding applications and/or settings I'd instantly say no; for absolute budget stuff maybe, but it would still mean that, in a way, a dedicated graphics unit will be incorporated into a CPU.

What is needed is flexibility in both SW and HW - and that's exactly the opposite of what fixed HW does/implies.

How about a fine balance between programmable and fixed function HW according to each timeframe's needs?

A foreseeable future for the GPU market would be what? One year, or pushing it a tad, maybe two? Try a 100% programmable GPU with not a single shred of fixed function HW in it within that timeframe and we'll see why performance might suck in the end.
 
Last night I dreamed that graphics card processors would make a return to partial fixed function operation.
Graphics hardware has been through this cycle at least once or twice so, no doubt, it will do it again.
 
Ail, I'm talking like 20 or 30 years from now. Out of my arse of course, since I don't own a crystal ball :) But I mean stuff like 3D gfx, audio, video, etc. chips will all disappear, since it'll all be integrated or emulated somehow. Basically a "gamer" in that future will just buy a small device using very little energy which needs no expansions to do its job and has computing power in abundance.
 
@sebbbi: Ah, thanks. I didn't realise the extent to which the current crop of GPUs still had that kind of hardware optimization.

I guess our requirements aren't advancing as fast as the technology, which is going to allow a lot of 'software' solutions. Even though it's incredibly flexible, it just seems a bit inefficient (in terms of transistor use, though very efficient in other regards), but I guess it's unavoidable.
 
I don't think it's such a bad idea after all. Think about it. CPUs are adding (or will be very shortly) more fixed-function hardware in the way of vector processors, and some server CPUs already have things like Java acceleration built-in now. The line between CPU and GPU is being blurred; I don't see why fixed-function hardware is out of the question for GPUs. After all, we still have ROPs and TMUs.
 
I don't think it's such a bad idea after all. Think about it. CPUs are adding (or will be very shortly) more fixed-function hardware in the way of vector processors, and some server CPUs already have things like Java acceleration built-in now. The line between CPU and GPU is being blurred; I don't see why fixed-function hardware is out of the question for GPUs. After all, we still have ROPs and TMUs.

Can we call vector processors "fixed function" hardware? I don't think I agree that that is totally accurate.

Also I'm 100% for fully programmable hardware. Go RV670 and G92!!!
 
CPUs are adding (or will be very shortly) more fixed-function hardware in the way of vector processors, ...
You do realize that those vector processors were mostly added because CPU vendors want to allow people to move away from their dedicated hardware and go to a software only solution, right?

... and some server CPUs already have things like Java acceleration built-in now.
Do you have pointers to that? I thought those things had long gone the way of the dodo? There's really no fundamental reason why JIT would be less performant than dedicated JAVA instructions.

The line between CPU and GPU is being blurred; I don't see why fixed-function hardware is out of the question for GPUs. After all, we still have ROPs and TMUs.
Well, the trend is definitely not your friend...
 
I don't think it's such a bad idea after all. Think about it. CPUs are adding (or will be very shortly) more fixed-function hardware in the way of vector processors, and some server CPUs already have things like Java acceleration built-in now. The line between CPU and GPU is being blurred; I don't see why fixed-function hardware is out of the question for GPUs. After all, we still have ROPs and TMUs.

Yes, but these are specific optimizations. The thread starter suggested fixed hardware for high-level procedures and effects. Of course you _need_ to have fixed hardware somewhere, but this hardware should only cover atomic operations (like vector calculation/AA/ROPs etc.) and not such vague algorithms as shadow mapping or motion blur. I would also like it if GPUs had an out-of-the-box Perlin noise function.

Once more, it goes like this: analyse graphics algorithms and try to find repeatedly used functions. These are the ones that need hardware implementation. I listed most of them in my previous posts.
 
Fixed function was OK when it was just simple things like generating triangles.

(making up numbers & I'm no hardware engineer so quite possibly talking out my ass but this is my understanding of the problem)

Say you have shader A, which is 120 instructions long, and shader B, which is 130 instructions long.
Shader A & B share, say, 80 instructions, but not in the same order and with unshared instructions mixed in between.
Of the 80 shared instructions, let's say 60 of them are actually repeats of 3 basic instructions (multiply, add etc., or with input variables switched round).

So say each step takes 1 million transistors to build as a fixed function pipeline; then the 2 fixed function pipelines will cost 250 million transistors total.

In a programmable processor capable of rendering both shader A & shader B, say each step costs 1.5 million transistors.

Then you can have a General Processor consisting of the 23 distinct shared instructions (the 60 repeats only need their 3 basic units, plus the other 20) = 34.5 million transistors.
The other 70 less common instructions get put in a Special Function processor for 105 million transistors.
To generate each shader, you loop through the GP & the SF processors, adding a bunch of functions at a time.

It might wind up taking more clocks net for a given shader, but you can have 4 GPs + 1 SF for fewer transistors (243 million) than the two fixed function pipelines.
Even if the programmable pipeline takes twice the transistors per instruction of fixed function, you still get 3 GPs + 1 SF for only 28 million more transistors than the 2 fixed function pipelines.
Additionally you have the flexibility to generate any other shader whose instructions are a subset of {A + B}.
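
For anyone who wants to poke at the (admittedly made-up) figures, here is the same back-of-envelope arithmetic as a tiny C++ program:

```cpp
// Replays the back-of-envelope transistor arithmetic above; all figures are
// the post's invented numbers, not real hardware data.
#include <cstdio>

int main() {
    const double fixedPerInstr = 1.0;        // Mtransistors per fixed-function step
    const double progPerInstr  = 1.5;        // Mtransistors per programmable step
    const int shaderA = 120, shaderB = 130;  // instruction counts
    const int sharedDistinct = 23;           // distinct shared instructions -> GP
    const int uncommon = 70;                 // less common instructions -> SF unit

    const double fixedCost    = (shaderA + shaderB) * fixedPerInstr;  // 250
    const double gpCost       = sharedDistinct * progPerInstr;        // 34.5
    const double sfCost       = uncommon * progPerInstr;              // 105
    const double fourGPsOneSF = 4 * gpCost + sfCost;                  // 243

    std::printf("two fixed pipelines: %.1f M transistors\n", fixedCost);
    std::printf("4 GPs + 1 SF:        %.1f M transistors\n", fourGPsOneSF);
    return 0;
}
```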


Obviously with real numbers there would be a crossover point somewhere, where the shaders are simple enough that the extra complexity of making the instructions programmable gives a loss to the programmable processor.

I think that threshold was passed back with the R300, which emulated the DX7 fixed function pipeline as a shader within the programmable pipes - at worst about equal with NV30's mix of fixed & programmable pipes, and often demolishing it.
 
You do realize that those vector processors were mostly added because CPU vendors want to allow people to move away from their dedicated hardware and go to a software only solution, right?

Sure, but overall it's just a blending of the two. CPUs becoming more like GPUs and vice versa.

Do you have pointers to that? I thought those things had long gone the way of the dodo? There's really no fundamental reason why JIT would be less performant than dedicated JAVA instructions.

I was wrong about the Java acceleration. I was thinking about Sun's Niagara family of MPUs and their SPUs, which accelerate cryptographic algorithm processing. Must've associated Sun with Java automatically :p

On the same note, and in defense of my argument of GPUs becoming more like CPUs, they also have fixed-function hardware dedicated to algorithm acceleration (see PureVideo and UVD).

Well, the trend is definitely not your friend...

Well, we haven't reached the ultimate evolution of either CPU or GPU yet, but I believe I've presented evidence pointing towards a homogenization of the two.
 
On the same note, and in defense of my argument of GPUs becoming more like CPUs, ...
??? This thread is about the merits of adding fixed functions to the GPU, which is almost the opposite of claiming that a GPU is getting more like a CPU.

... they also have fixed-function hardware dedicated to algorithm acceleration (see PureVideo and UVD).
The standard pattern always goes like this: new kind of technology -> dedicated HW -> CPU/GPU becomes faster -> dedicated HW goes away and becomes programmable.

Remember the days when CPUs had a hard time keeping up with ordinary DVDs? Now they eat them for lunch.

UVD and PureVideo were not added to play low bit-rate MPEG2 clips but to crunch through HD DVD/Blu-ray. Give it another 2 or 3 years and the need for dedicated HW will once again be greatly diminished.
 
Also I'm 100% for fully programmable hardware.

There will always be a need for fixed function parts. For instance I don't see dedicated depth testing hardware going out of fashion anytime soon. Texture units will probably remain fixed function for generations to come. I don't see programmable rasterization on the horizon either.
 
1. I take SSRIs and they give me the most insanely realistic and vivid dreams. It's kinda fun for the most part.

What are SSRIs, and why don't you dream of large boobies like the rest of us?
I dreamt of a GPU/DSP hybrid, maybe I'm nuts too :D
 