Software rendering in games (56k)

Scali · Aug 29, 2004

That was pretty cool. There's no argument that hardware runs faster than software, you're just stuck doing hardware things the hardware way, and general computing is getting pretty damn fast these days.

Thanks.
But the limits go for everything... If you want to do realtime raytracing, you're stuck to doing things the raytracing way. Currently that still means you can't sample every pixel, you can't do better texture filtering than mipmapping, reflections are limited to 1 bounce, you can't use detailed triangle meshes, let alone skinning...
I've put my money on graphics hardware getting a more "general computing" model, rather than waiting for CPUs that are fast enough to do software rendering again.
I can actually see lots of room for improvement with GPUs, while CPUs seem to be at the end of their evolution... The main improvements come from improved manufacturing processes.

Goragoth · Aug 29, 2004

Scali said:
reflections are limited to 1 bounce

Do you mean that because of hardware speed limitations or what? Obviously raytracing can do multiple bounces on reflections if you have enough processing power to spare. Still doesn't make it suited for games though.

And I would like to add that there are lots of things that raytracing cannot do. For example it doesn't handle diffuse light at all, to get that you need a more complete model including radiosity. Even then I would argue that "faking" things is still better in many cases because what's important is the final image and how pleasing it looks, not how physically correct it is (unless you are in engineering/architecture/etc in which case you do but that's completely different anyway, I'm talking games/movies).

While I think that writing a software renderer is great fun and can be useful for various things I don't think we will return to full software rendering in a hurry. In fact we are moving still further away from it as offline renderers such as mentalray are getting support for hardware acceleration.

Scali · Aug 29, 2004

Goragoth said:
Do you mean that because of hardware speed limitations or what? Obviously raytracing can do multiple bounces on reflections if you have enough processing power to spare. Still doesn't make it suited for games though.

Yes, pretty much all the things I listed are possible in raytracing, but only at a considerable performance hit, meaning they cannot be used in realtime situations in the near future. (My hardware-reflection program actually does do 'unlimited' bounces; the advantage of render-to-texture is that you can spread the iterations over multiple frames. Quite hard to do with raytracing.)
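A rough way to picture the "spread the iterations over multiple frames" idea is below. This is only a toy model, not the actual reflection program: it assumes two mirrors facing each other (the hypothetical names "A" and "B"), and each frame every mirror gets exactly one render-to-texture pass that sees the other mirror with last frame's texture on it. The per-frame cost stays constant while the stored bounce depth grows by one each frame.

```python
def render_frame(depth):
    """One render-to-texture pass per mirror.

    depth maps a mirror name to the number of reflection bounces currently
    baked into its texture. Rendering mirror A's view shows mirror B with
    B's texture from the previous frame, so A gains one more bounce.
    """
    return {"A": depth["B"] + 1, "B": depth["A"] + 1}


def bounces_after(frames):
    """Bounce depth stored in each mirror's texture after a number of frames."""
    depth = {"A": 0, "B": 0}  # both textures start without any reflection
    for _ in range(frames):
        depth = render_frame(depth)
    return depth
```

A raytracer would have to trace all those bounces within a single frame to get the same image, which is where the cost comes from.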

Nick · Aug 29, 2004

Shameless plug...

Here's my new swShader website: http://sw-shader.sourceforge.net/main.html. It's still under construction, but feel free to give me feedback about layout and content.

To get back on topic a bit:

Kaotik said:
The point of this thread is that you people explain me where is software rendering today, what we can do, what we can't do...

The new version of swShader I'm working on will be able to do anything you want. There's nothing that can't be done. On the contrary, there are things that swShader will be able to do, and hardware won't. The -only- thing it won't be able to do is provide higher performance. That's the only limitation, and it will always be.

Jerry Cornelius · Aug 29, 2004

To try and get back on topic as well, it's a better question to ask what hardware can't do, because with software the sky is the limit. If a feature isn't supported in hardware you have to see if some kind of hybrid solution will work.

The problem with software, as stated, is performance, which may be what you were really asking in the first place. In this regard texture filtering is the biggie: averaging four sample points from each of two mip levels for simple trilinear filtering in 32-bit colour is probably off limits for even the fastest computers, unless we are talking about a pretty low-res scene or some kind of hack.
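To make the cost concrete, here is a minimal sketch of textbook trilinear filtering, assuming a mip chain stored as plain Python lists of (r, g, b) tuples and clamp-to-edge addressing. The function names are made up for illustration and this is not how any particular renderer does it. Per pixel it comes to eight texel reads and seven interpolations per colour channel.

```python
import math


def texel(mip, x, y):
    """Fetch one texel with clamp-to-edge addressing."""
    h, w = len(mip), len(mip[0])
    x = min(max(x, 0), w - 1)
    y = min(max(y, 0), h - 1)
    return mip[y][x]


def lerp(a, b, t):
    """Linear interpolation between two colours, channel by channel."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))


def bilinear(mip, u, v):
    """Blend the four texels around (u, v), with u and v in 0..1."""
    h, w = len(mip), len(mip[0])
    x = u * w - 0.5  # texel centres sit at half-integer positions
    y = v * h - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = lerp(texel(mip, x0, y0), texel(mip, x0 + 1, y0), fx)
    bottom = lerp(texel(mip, x0, y0 + 1), texel(mip, x0 + 1, y0 + 1), fx)
    return lerp(top, bottom, fy)


def trilinear(mips, u, v, lod):
    """Blend bilinear samples from the two mip levels surrounding lod."""
    lod = min(max(lod, 0.0), len(mips) - 1)
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(mips) - 1)
    return lerp(bilinear(mips[lo], u, v), bilinear(mips[hi], u, v), lod - lo)
```

Calling trilinear(mips, 0.5, 0.5, 0.5) on a two-level chain reads four texels from each level and blends the two results half and half.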

That being said, if you depart from the usual filtering standards you should be able to accomplish something half decent and avoid major aliasing, assuming you don't render any hidden pixels. Also, some kind of edge AA is a real possibility if your rendering engine is tightly integrated with the game engine.

Kaotik · Aug 29, 2004

Let me rephrase that, as I got a bit misunderstood.

What can we do with software - in games with current CPUs with acceptable speed.

Scali · Aug 29, 2004

Kaotik said:
What can we do with software - in games with current CPUs with acceptable speed.

In short, Quake1/2/3 (or Unreal if you like).
No shadows, no reflections, no per-pixel lighting (except for special-case hacks).

Anteru · Aug 29, 2004

With current CPUs we can have realtime reflections, shadows, bump mapping etc... Just look at Outcast: it ran smoothly on an Athlon 600 at 512x384, and should now run at 1024 or more on an Athlon 3000. Unfortunately, I don't have it, so I can't test how many fps you get atm, but maybe someone else on this board could dig out some nice screenshots.

Scali · Aug 29, 2004

I don't know the game, but I'm quite sure they're special-case hacks
(projected shadows, static envmaps, embm rather than proper per-pixel dot3 lighting, only applied to small parts of the scene, etc).
You simply don't have the processing power to do everything in realtime the way modern hardware does it in eg Doom3, which is completely generic and dynamic.
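For anyone unsure what "proper per-pixel dot3 lighting" involves, here is a minimal sketch, assuming an 8-bit normal map and a single directional light. The function names are hypothetical and this is not taken from any engine. The point is that the decode, normalise and dot product happen for every pixel on screen, which is what makes it expensive in software.

```python
import math


def decode_normal(rgb):
    """Map an 8-bit normal-map texel (0..255 per channel) to a unit vector."""
    n = [c / 127.5 - 1.0 for c in rgb]
    length = math.sqrt(sum(c * c for c in n))
    return [c / length for c in n]


def dot3_diffuse(normal_texel, light_dir, albedo):
    """Diffuse term for one pixel: albedo scaled by max(0, N . L)."""
    n = decode_normal(normal_texel)
    length = math.sqrt(sum(c * c for c in light_dir))
    l = [c / length for c in light_dir]
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(int(round(c * n_dot_l)) for c in albedo)
```

A texel of (128, 128, 255) decodes to a normal pointing almost straight out of the surface, so a light along (0, 0, 1) leaves the albedo essentially unchanged, while a light from behind gives black.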

davepermen · Aug 29, 2004

outcast rocks

Scott_Arm · Aug 29, 2004

hey

davepermen said:
outcast rocks

That's the voxel engine game, isn't it? I may have CDs for that around somewhere.

Nick · Aug 29, 2004

There are a few things that might sound surprising for a software renderer:

- Stenciling is fast. Unlike hardware, there is not really a pipeline with a fixed fillrate. With MMX, I can fill the stencil buffer at gigahertz speeds, so to say. The newest hardware probably still beats it, but either way, it's damn fast for software.

- Sampling multiple textures is fast. I once experimented with a shader that sampled several -hundred- textures. Performance only dropped a relatively small factor. This shows that modern CPUs are used more efficiently if you give them more work. More independent work and less jumps. There's also a lot of setup work that doesn't have to be redone for new sampling.

- Shading is fast. That's right. SSE can compute most arithmetic shading operations faster than a hardware shader (one pipeline). And all that in full 32-bit precision. Of course it's no match once you compare it to hardware with four or more pipelines. But either way, operations between colors and other vectors is very fast compared to texture operations and the extra work it requires. If only CPUs had instructions to accelerate sampling operations...

Anyway, I have to disagree with Scali once again (nothing personal): shadows, reflections and per-pixel lighting are - though with considerable effort - possible with acceptable performance. Granted, hardware is way better at all of this.

Nick · Aug 29, 2004

Scali said:
I don't know the game, but I'm quite sure they're special-case hacks
(projected shadows, static envmaps, embm rather than proper per-pixel dot3 lighting, only applied to small parts of the scene, etc).
You simply don't have the processing power to do everything in realtime the way modern hardware does it in eg Doom3, which is completely generic and dynamic.

Entirely true. They use many tricks and hacks. It looks amazing, but would only work for this type of game.

The terrain can be bilinear filtered with quite good performance, but the voxel tracing algorithm lends itself well for optimizations of this. Four texels in the terrain's texture map, map to several pixels on the screen. I'm quite sure they use this fact to avoid re-sampling the texture for every pixel. Weight factors for the bilinear filtering are most probably also easily derived from the voxel algorithm.

As far as I know, the characters of the game were rendered the 'slow' way. But as long as they don't fill the whole screen, there is only a limited amount of fillrate required to do this. And yes, for distant objects the bump mapping and advanced lighting is skipped.

Either way, Outcast rocks.

Scali · Aug 29, 2004

Nick said:
- Stenciling is fast. Unlike hardware, there is not really a pipeline with a fixed fillrate. With MMX, I can fill the stencil buffer at gigahertz speeds, so to say. The newest hardware probably still beats it, but either way, it's damn fast for software.

Well, NVIDIA has the solution in their GPU for that as well. Their pipelines can do twice the work if there are no colour operations, but only z/stencil.
Also, in practical situations, you generally only stencil based on certain conditions, eg z or alphatest. And that already slows a CPU down a lot.
So the advantage is purely theoretical, I'd say.

Nick said:
- Sampling multiple textures is fast. I once experimented with a shader that sampled several -hundred- textures. Performance only dropped a relatively small factor. This shows that modern CPUs are used more efficiently if you give them more work. More independent work and less jumps. There's also a lot of setup work that doesn't have to be redone for new sampling.

Again, this is probably purely a theoretical advantage. You rarely need more than 2 or 3 textures, especially if you are doing more advanced shading (more arithmetic, less textures). So perhaps if you throw enough textures at it, software rendering will actually be faster than hardware, but there probably aren't any practical cases for using that many textures.

Nick said:
- Shading is fast. That's right. SSE can compute most arithmetic shading operations faster than a hardware shader (one pipeline). And all that in full 32-bit precision. Of course it's no match once you compare it to hardware with four or more pipelines. But either way, operations between colors and other vectors is very fast compared to texture operations and the extra work it requires. If only CPUs had instructions to accelerate sampling operations...

This is probably again a theoretical advantage only. Most hardware can do reasonably complex per-pixel shading in realtime today (at least GF3 and up). Perhaps if you make it more complex, software will be faster, but will it look any better, and is it still realtime?

To conclude: if the question is "what can you do with a software renderer for games?", these aren't the answers, since these advantages cannot be exploited in a way that gives you a renderer that is fast enough for realtime use. The only thing you are basically saying is "Hardware cannot do these things in realtime either", which is not really interesting.

But if you are talking about offline rendering, then yes, software has definite advantages over hardware at this time. You can do raytracing in a more efficient way than on current hardware, you can implement a REYES renderer for high quality antialiased rendering, you can use complex 2d, 3d, 4d, ... procedural textures, etc.

Nick said:
Anyway, I have to disagree with Scali once again (nothing personal): shadows, reflections and per-pixel lighting are - though with considerable effort - possible with acceptable performance. Granted, hardware is way better at all of this.

As soon as you get my program running on your renderer, we can see how acceptable that performance will be (how far along are you? it is an old DX8 thing without shaders, so it should almost run on your existing DLL already).
And note that this is just one optimized demo, it doesn't make a game yet.
So even if you would get 15 fps here, that doesn't mean you can actually make an entire game like this.

Nick · Aug 30, 2004

Sure, all these advantages are 'theoretical' in a certain way. The only benefit software has is to implement such advantages before hardware does. But performance-wise it doesn't mean that much, I have to agree... Either way I think it's an interesting property of software rendering. Serial execution is a (theoretical) benefit when performing only one task. Nearly all of the CPU will be used to perform this task as fast as possible (unfortunately not fast enough).

To a certain extent, we see this evolution with GPUs as well. With a 'unified shader model', vertex and pixel shader unit usage no longer has to be balanced. So it can no longer be a bottleneck. This makes the advantage a little less 'theoretical' for the hardware implementation. I'm sure there are other things that will make GPUs look more like a bunch of specific CPUs. I am very much looking forward to running my software renderer on dual-core processors...
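The balancing argument can be put into a back-of-the-envelope model. This is an idealised sketch with made-up numbers, assuming work divides perfectly across units and that every unit has the same throughput: with a fixed split the slower stage sets the frame time, while a unified pool just divides the total work.

```python
def frame_time_fixed(vertex_work, pixel_work, vertex_units, pixel_units):
    """Fixed split: the busier stage is the bottleneck, the other sits idle."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def frame_time_unified(vertex_work, pixel_work, total_units):
    """Unified pool: any unit can take either kind of work."""
    return (vertex_work + pixel_work) / total_units
```

With 10 units of vertex work and 90 units of pixel work, a 4 + 12 split takes 7.5 time units, while 16 unified units take 6.25.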

Scali said:
As soon as you get my program running on your renderer, we can see how acceptable that performance will be (how far along are you? it is an old DX8 thing without shaders, so it should almost run on your existing DLL already).

I'm focussing on the DirectX 9 DLL now. It should be simple to create a DX8 version out of that later, so I avoid doing both at the same time. Friday was the last day of my internship, so only now I have the time to really implement it completely, but my experiments already show interesting results. Depending on what kind of project TransGaming lets me work on (could be -very- related), it should take one to three months to fully implement my current plans.

Scali said:
And note that this is just one optimized demo, it doesn't make a game yet.
So even if you would get 15 fps here, that doesn't mean you can actually make an entire game like this.

I know. swShader is intended to provide a whole range of quality versus performance. Shader support is intended for low-resolution low-framerate applications where quality and compatibility is of primary importance. My fixed-function pipeline is intended to be equivalent or better than Pixomatic, so it can be used as a fallback for games with low demands. Anyway, I never expect people to use it as the primary renderer.

Reznor007 · Aug 30, 2004

Kind of off topic, but somewhat related....

MAME (the arcade emulator) has a complete Voodoo1 emulator that does all rendering in software. I'm not sure how fast just the graphics rendering is though, because the rest of the hardware that the supported games (NFL Blitz, SF Rush) use is a killer to emulate.

Here's some shots:
Rush http://www.aarongiles.com/pix/sfru0067.jpg
War http://www.aarongiles.com/pix/moregeo.jpg (Voodoo2)
Blitz http://www.aarongiles.com/pix/blit0039.png

Voodoo2 support should be finished in the near future for newer games (Gauntlet Legends, etc).

JD · Aug 31, 2004

For game dev you can buy a $40 GF2 card, which has rudimentary per-pixel support, and save a bunch of dev time. The only reason I would go with sw is disappointment with 3rd party hw drivers. So far, nv has been good in that department for me. I'm talking from a game dev perspective, and sw is valid for new features and exotic ones like photon mapping and voxels. I see hw distancing itself from sw already, like in filtering, screen res, etc.

Scali · Aug 31, 2004

JD said:
sw is valid for new features and exotic ones like photon mapping and voxels.

H. W. Jensen et al. have implemented a reasonably decent photon mapper on standard PS2.0 hardware.
PowerVR and NVIDIA have demonstrated raycasting with PS3.0 through a 3d texture... which is effectively voxel rendering.
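To show why raycasting through a 3D texture amounts to voxel rendering, here is a minimal sketch of the idea on the CPU. It is not the PowerVR or NVIDIA demo code. It assumes the volume is a nested list indexed as grid[z][y][x], with truthy entries marking solid voxels, and it uses a naive fixed step instead of a proper grid traversal.

```python
import math


def raycast_voxels(grid, origin, direction, step=0.25, max_dist=64.0):
    """March along a ray in fixed steps and return the first solid voxel.

    Returns the (x, y, z) index of the voxel hit, or None if the ray
    leaves max_dist without hitting anything.
    """
    size_z, size_y, size_x = len(grid), len(grid[0]), len(grid[0][0])
    length = math.sqrt(sum(c * c for c in direction))
    d = [c / length for c in direction]
    t = 0.0
    while t <= max_dist:
        x, y, z = (int(math.floor(o + c * t)) for o, c in zip(origin, d))
        inside = 0 <= x < size_x and 0 <= y < size_y and 0 <= z < size_z
        if inside and grid[z][y][x]:
            return (x, y, z)
        t += step  # a pixel shader would take one 3D texture sample per step
    return None
```

Each loop iteration corresponds to one 3D texture lookup in the shader version, so the cost per pixel scales with the number of steps along the ray.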

Guden Oden · Aug 31, 2004

Kaotik said:
indeed there's 2x "zoom", but it can be disabled from ut2004.ini

Actually, if it's 2x in each direction it only does a quarter of the workload it should, so 640 res at playable rates is really more like 320 res. And with all due respect, I think those screens look like...ass.
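The arithmetic, spelled out as a quick sketch. The 640x480 figure is an assumption (only "640 res" is given above), and shaded_pixels is a made-up helper name.

```python
def shaded_pixels(width, height, zoom):
    """Pixels actually rendered when the image is later upscaled by zoom."""
    return (width // zoom) * (height // zoom)


full = shaded_pixels(640, 480, 1)    # every output pixel is rendered
zoomed = shaded_pixels(640, 480, 2)  # rendered at 320x240, then doubled
ratio = zoomed / full                # fraction of the full workload
```

Here ratio comes out at 0.25, i.e. a quarter of the pixels.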

They don't look "good all things considered" or something like that, they just look like ASS.

Software rendering is dead at the moment, I think even Tim Sweeney regrets the bull he spouted a few years ago when he said it would come back en vogue again now that CPUs were becoming so fast.

The thing is, CPUs will never be able to compete with GPUs as long as graphics rendering can be cut up in pieces and parallelized the way it can today, and there's no reason to believe it won't be able to in the future either...

Nick · Aug 31, 2004

Guden Oden said:
Actually, if it's 2x in each direction it only does a quarter of the workload it should, so 640 res at playable rates is really more like 320 res. And with all due respect, I think those screens look like...ass.

It can easily be disabled, and the performance impact is not all that big. They might actually be doing an in-between resolution and scaling that up a little. You can also set the filter quality to 3, which is full bilinear. Doesn't look like ass any more, but requires a 4 GHz processor...
