Interesting Benchmark Effects

Zeno

Newcomer
Hi everyone. This is my first post here, though I'm active on opengl.org and occasionally nvnews. The site looks great as far as board participation goes... very high-quality posts.

Anyway, the reason for this thread is that I am considering making a small OpenGL benchmark and I wanted some input from the members here. It will test mostly advanced graphics card features using only cross-platform extensions such as ARB_vertex_program, ARB_fragment_program, and ARB_vertex_buffer_object.

To do this, of course, I need to write some shaders. Here's where you can help :). Are there any effects or simulations you've always wanted to see done on the GPU? This is just meant to be a brainstorming session, so throw out anything that might look interesting. I'd like to have at least one compute-bound pixel shader, one texture-lookup-bound pixel shader, and one complicated vertex shader.
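For the curious, gating each test on the extensions it needs is straightforward; here's a minimal sketch (a current GL context is assumed, and the helper name is made up):

[code]
#include <string.h>
#include <GL/gl.h>

/* Returns 1 if 'name' appears as a full token in the extension string.
 * The boundary checks keep e.g. "GL_ARB_vertex_program" from matching
 * inside a longer extension name. */
static int has_extension(const char *name)
{
    const char *all = (const char *)glGetString(GL_EXTENSIONS);
    const char *p = all;
    size_t len = strlen(name);

    while (p && (p = strstr(p, name)) != NULL) {
        int starts = (p == all || p[-1] == ' ');
        int ends   = (p[len] == ' ' || p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;
    }
    return 0;
}

/* e.g. skip the fragment-program tests entirely when
 * has_extension("GL_ARB_fragment_program") returns 0 */
[/code]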

Thanks in advance,

-- Zeno
 
Some effects I like:

stained glass/refraction

high dynamic range lighting

hair

complicated surface lighting -> an enameled wood chair with water beads on it is one example that comes to mind

Some things I've heard of and am interested in:

vertex shader bound soft shadowing -> AFAIK, several shadow renderings blended together

more advanced displacement mapping... we might have PS/VS 3.0 hardware by the time you finish. EDIT: that doesn't necessarily matter directly for OpenGL, does it? :LOL:


These are some things that popped into my head... it's about time for sleep on my warped sleep schedule, so if anything doesn't make sense, you'll know why... *yawn*
 
First of all, I'd have to ask you what sort of engine, and world, your benchmark will be based on/reside in.

At the moment, my interest is in floating point and precision. Having familiarized myself (more or less) with FP32, I am looking past it because of my current favourite experiments.

One case where I actually noticed artefacts from (being limited to) FP32 precision was using 4x4 texture transform matrices to assign texture coordinates to an object that was far away from the origin. When you got up really close to it, the texture would jitter as a result of not having enough bits of precision.

Another thing on my radar is where FP32 stops being sufficient: the uniform matrix treatment of coordinates in a very large world.

For example, if you were forcing everything to reside in a single global coordinate system (which games like Everquest avoid, by breaking up the world into zones), and you had a per-pixel lighting model, then intricate objects hit by nearby lights in areas far away from the origin would start to show precision artefacts.
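To make that concrete, here is a small standalone C illustration (just a sketch, not from any engine) that prints the smallest representable FP32 step at increasing distances from the origin:

[code]
#include <stdio.h>
#include <math.h>

int main(void)
{
    float positions[] = { 1.0f, 1000.0f, 100000.0f };
    int i;

    for (i = 0; i < 3; i++) {
        float x = positions[i];
        /* nextafterf gives the closest float above x, so the
         * difference is the smallest possible coordinate step */
        printf("at %9.1f the smallest step is %g\n",
               x, nextafterf(x, 1e30f) - x);
    }
    return 0;
}
/* prints roughly 1.2e-07, 6.1e-05 and 0.0078: at 100000 units from
 * the origin a position can only move in ~1/128-unit increments,
 * which shows up as exactly the sort of jitter described above */
[/code]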

For pure 3D rendering usage, the limitations of FP32 can be worked around pretty effectively. If we were ever to move to a much more general model where 3D hardware were used for, e.g., solving matrices for physics simulation, then support for FP64 would be vital. But that doesn't seem to be on anyone's radar screen currently, for good reasons (latency, lack of generality in the shaders, etc.). Very long-term, though, thinking maybe 7 years out, it seems almost certain that 3D hardware will look more like CPUs in terms of generality, and CPUs will look more like 3D hardware in their opportunities for parallelism.

I'm not sure if this helps with the making of your benchmark but I'm hoping that instead of thinking about "just" writing shaders (as the basis, and testing aim, of your benchmark), maybe you'll simply consider things from an engine point of view.

Just my opinion and apologies if I'm not exactly offering helpful suggestions (or got sidetracked :) )
 
Hi Zeno!

Welcome aboard, and nice to see you here too. :)

I'd be interested in benchmarks with different kinds of shadowing: shadow mapping, stencil shadows, projected shadows, etc. It would be nice to have some kind of soft shadow implementation too. :)
 
Any of the above would be great to see in a benchmark. Just remember, though, the key is to generate everything procedurally (e.g. the elephant in 3DMark03) with fp precision, utilizing only minute amounts of precomputed textures/maps if possible! We have yet to see such a benchmark; it would be great for evaluating not only current but future hardware as well.

Try including some things like displacement mapping using uber buffers (render to vertex array), procedural texturing and shading (including per-pixel Phong illumination, anisotropic lighting, etc., with no cube maps or other hacks), and soft shadows.
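As a sketch of what per-pixel Phong without cube maps might look like in ARB_fragment_program (assuming a matching vertex program feeds the normal, light, and half-angle vectors through texcoords 0-2; the material constants are placeholders):

[code]
/* lit.y = clamped N.L (diffuse), lit.z = clamped (N.H)^shininess */
static const char phong_fp[] =
    "!!ARBfp1.0\n"
    "PARAM diffC = { 0.6, 0.3, 0.1, 1.0 };\n"
    "PARAM specC = { 1.0, 1.0, 1.0, 1.0 };\n"
    "PARAM shiny = { 32.0, 0.0, 0.0, 0.0 };\n"
    "TEMP N, L, H, d, lit, col;\n"
    "# renormalize the interpolated vectors (interpolation shortens them)\n"
    "DP3 N.w, fragment.texcoord[0], fragment.texcoord[0];\n"
    "RSQ N.w, N.w;\n"
    "MUL N.xyz, fragment.texcoord[0], N.w;\n"
    "DP3 L.w, fragment.texcoord[1], fragment.texcoord[1];\n"
    "RSQ L.w, L.w;\n"
    "MUL L.xyz, fragment.texcoord[1], L.w;\n"
    "DP3 H.w, fragment.texcoord[2], fragment.texcoord[2];\n"
    "RSQ H.w, H.w;\n"
    "MUL H.xyz, fragment.texcoord[2], H.w;\n"
    "# LIT does the clamping and the power function in one instruction\n"
    "DP3 d.x, N, L;\n"
    "DP3 d.y, N, H;\n"
    "MOV d.w, shiny.x;\n"
    "LIT lit, d;\n"
    "MUL col, diffC, lit.y;\n"
    "MAD col.xyz, specC, lit.z, col;\n"
    "MOV result.color, col;\n"
    "END\n";
[/code]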
 
I'm interested in seeing the performance impact of post-processing effects (either model-based or full-screen) under various stress conditions...

Another thing I'd love to see is terrain generation done at the VS level (there was a good article describing this method on Flipcode, IIRC).
 
Thanks for the feedback everyone.

Some good ideas already. Here are my thoughts on your thoughts :)

Looks like everyone is interested in shadows these days, so perhaps a comprehensive shadow test is in order. Maybe a low- and high-polygon case with both stencil shadows and shadow maps, including whatever techniques are available for soft shadows? One question to run past you guys: would you make a scene that is fair to both shadow maps and stencil shadows, or should the scene show off the worst properties of each? E.g. a chain-link fence or something like that for stencil shadows, and a "dueling frusta" camera for the shadow maps?

Reverend - I don't plan to base the benchmark on any sort of engine. It is intended to be a graphics card benchmark only. I have run into the same problems that you have with FP32, in both vertex and texture coordinate precision, but if I recall correctly I was able to solve them by using double-precision matrices through the fixed-function OpenGL pipe. At any rate, I agree that it is a problem that will come up more and more, but it's not something that I want to address with this program.

Luminescent - No problem with making everything procedural. I had intended it to work that way for two reasons. The first and foremost reason is that I am no artist. The second reason is that using procedural geometry and/or textures may be a good way to inflate the instruction counts in my shaders. I don't think the "uber buffer" extensions are done yet, but if they happen to get finished while I'm making the benchmark I'll certainly consider them.

CorwinB - There will probably be some sort of post-processing effect to test multiple texture accesses in a fragment program. Maybe a sinc filter or something simple at first. I am also considering some procedural terrain.

Here are some other thoughts I had:

diffraction shader (like a CD)
rainbow (actually calculated)
procedural lava
procedural fire
cellular automata (Dobashi clouds?)
fractals are always good (though Humus already did the Mandel)
thin film (soap bubbles/oil slick)
heat gradient (mirage)
Perlin noise (applied to something - terrain maybe)

Keep them coming :). I'd like to start this weekend.

-- Zeno
 
Good ideas overall, Zeno. It sounds like an interesting and potentially great benchmark. However, some navigation functionality with flypath recording (or random flypath generation) for each sequence, in addition to a default flypath for online comparisons, would be excellent. That way cheating can be sidestepped by running a series of flypaths and combining their framerates (one way to combine the runs is sketched below), while the default flypath remains for quick comparisons.
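A fair way to combine the runs (just a sketch of the idea) is total frames over total time, rather than a mean of per-run averages, so longer runs are not underweighted:

[code]
/* combined score across several recorded/random flypath runs */
double combined_fps(const long *frames, const double *seconds, int runs)
{
    long total_frames = 0;
    double total_time = 0.0;
    int i;

    for (i = 0; i < runs; i++) {
        total_frames += frames[i];
        total_time   += seconds[i];
    }
    return (double)total_frames / total_time;
}
[/code]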

Also, noise functions are a great help for the procedural textures (you probably know this; just wanted to make it known).

Allowing the user to choose precision would be extremely important for the benchmark (it has never been done). Maybe an option for full precision (FP32/FP24), partial precision (FP16), integer precision (only useful for NV30, 31, 34, etc.), and a mixed-precision mode, where you use only the precision required for the necessary visual fidelity.

God speed.
 
I had planned to keep the camera fixed and the scene entirely within view of the camera. In general, I don't have the resources to make large sweeping worlds to fly around in, and if the whole scene is always visible, it precludes certain types of cheating. This wouldn't be the case for a procedural terrain, though, and if I do such a test I'll probably include the ability to create your own flight path.

I also like the idea of allowing the user to choose precision. Unfortunately, ARB_fragment_program does not give much control over this. There is a "hint" you can specify as either ARB_precision_hint_fastest or ARB_precision_hint_nicest, but I haven't seen it have any effect on either ATI or NVIDIA cards. ATI cards are always 24-bit anyway, and in order to compute with partial precision on NVIDIA I would have to write a special shader using their extension, NV_fragment_program. I'd like to stay away from that for several reasons (time, people crying unfair, my preference for vendor-independent extensions, etc.).
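For reference, the hint goes in the program text itself, not through glHint. A minimal sketch of a loader that prepends one of the two options (the function name is made up, and the ARB entry-point lookup needed on some platforms is omitted):

[code]
#include <stdio.h>
#include <string.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* prepend the chosen precision option to the shader body and load it;
 * the body must not contain its own "!!ARBfp1.0" header */
static void load_fragment_program(GLuint id, const char *body, int fast)
{
    char src[8192];

    sprintf(src, "!!ARBfp1.0\n%s\n%s",
            fast ? "OPTION ARB_precision_hint_fastest;"
                 : "OPTION ARB_precision_hint_nicest;",
            body);

    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, id);
    glProgramStringARB(GL_FRAGMENT_PROGRAM_ARB, GL_PROGRAM_FORMAT_ASCII_ARB,
                       (GLsizei)strlen(src), src);
}
[/code]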

Hope this makes sense. My end goal is to make a fair cross-platform (although requiring latest generation capabilities) OpenGL based benchmark.
 
If you can, though, Zeno, include the flypath recording functionality. Do what you can with precision; options would be nice.
 
I would like to see something with lots of color. Show off what you can do with 128-bit color.
 
I'm currently trying to do correct reflection/refraction, which needs (at least) two render-to-texture passes plus a final pass. So some sort of render-to-texture speed test would be great (reading back from the render target afterwards, of course)...
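For context, the simplest portable render-to-texture path right now is copy-from-framebuffer (pbuffers avoid the copy but are platform-specific). A sketch, with a made-up scene helper:

[code]
#include <GL/gl.h>

void draw_scene_from_mirrored_camera(void);   /* hypothetical helper */

/* 'tex' must already have storage allocated via glTexImage2D */
void render_reflection_to_texture(GLuint tex, int w, int h)
{
    glViewport(0, 0, w, h);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    draw_scene_from_mirrored_camera();

    /* copy the just-rendered back buffer into the texture */
    glBindTexture(GL_TEXTURE_2D, tex);
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, w, h);
}
[/code]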
 
How long do you think this will take, Zeno? As in, a beta version of the benchmark? It does sound exciting and very ambitious, to say the least. Please keep us informed of progress, and maybe break down the different tests before compiling them all together, showing us the results, for feedback and QC of the benchmark.
 
Two requests.

1) Add a feature to capture and compare specific frames (or several ranges of frames) a user selects, so image quality across cards and their drivers may be easily compared to a benchmark standard to simplify spotting cheating (see the sketch after this list).

2) Add code paths that compare DirectX 9 rendering with, as far as possible, the equivalent OpenGL ARB2 calls (possibly selectively across steps of the graphics pipeline, but certainly in the fragment/pixel shaders). I'd like to see just where NV3x and R3x0 are bottlenecked, and whether one API gets a huge boost; that could indicate a driver cheat / optimisation of the wrong kind, where someone might be dropping to lower-quality rendering than required.
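A minimal sketch of what the frame-capture half of request 1 might look like (write_tga() is a stand-in for whatever image writer gets used):

[code]
#include <stdlib.h>
#include <GL/gl.h>

void write_tga(int frame, const unsigned char *rgb, int w, int h); /* stand-in */

/* dump the back buffer on user-selected frames so images can be
 * diffed across cards/drivers against a reference */
void capture_frame(int frame, int w, int h)
{
    unsigned char *pixels = malloc((size_t)w * h * 3);

    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glReadBuffer(GL_BACK);
    glReadPixels(0, 0, w, h, GL_RGB, GL_UNSIGNED_BYTE, pixels);
    write_tga(frame, pixels, w, h);
    free(pixels);
}
[/code]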

Lastly, do you think this benchmark will have superior graphics to what Massive Development are targeting with the soon-to-be-released, Krass-engine-based Aquanox 3? I read with interest that it can freely switch between DirectX 9 and OpenGL calls on the fly when rendering a scene (but no comments on why you might want to do this).

Good luck!
 
rwolf - I'll do my best :)

Maven - there will probably be at least one test that uses render to texture.

Noko - I'm hoping that the whole thing will take me less than three months. I'd like to write a shader test per weekend and use the time during the week to add things that don't require me to think much...I still have to go to work during the day ;). Once I have a few tests done, I'll make a web page for it on my web site. I will want feedback before I release the test to the general public. I'll probably pick a few knowledgeable people from OpenGL.org or here to give feedback before I make a general release.

g_day - Thanks for the suggestion. There will definitely be a screen capture option. There will probably not be any Direct3D in the benchmark... mainly because I don't know D3D, but also for time reasons. The question as to whether the benchmark will have superior graphics to a commercial engine is a subjective one. Obviously, certain effects will be beyond what any engines do, but the geometry and textures are likely to be much simpler without an art team. Make sense? It will end up being more like a succession of Humus-style demos within a framework for timing them. Would you say his Mandelbrot shader has superior graphics to DoomIII? It's apples and oranges. I do want to make it pretty, though, so that people have nice demos to show off their systems with.
 
Is there any reason why you aren't willing to use anything less than ARB_f_p? Things like ARB_v_b_p etc. make sense, but why not test GeForce3/4s and 8500s while you're at it?

Also, don't include just ONE use of anything you can think of. Mix things up. For example, the render-to-texture thing: why not use it in multiple cases? Sure, have one "pure" demo that shows how fast/slow it is, but also use it in a real case, and then maybe some obscure/pathological case.

And vendor-specific extensions and/or optimisations are not evil :devilish:
I know that you said time is a constraint, but it's not that hard to design a system with multiple backends, even if you don't fill them out... release the source code so that others might want to do it... ;)
 
AndrewM said:
Is there any reason why you aren't willing to use anything less than ARB_f_p? Things like ARB_v_b_p etc. make sense, but why not test GeForce3/4s and 8500s while you're at it?

The most important reason is that I don't think there is a need for another DX8 benchmark. I think 3DMark2k1 and pretty much every game out there have it covered. In addition, I am only one person... I can't work on this full time, and I want to get it done in a few months. I am not purposely excluding those cards, though. There may be a test or two that does not use fragment program; in that case, GeForce3/4/Radeon 8500 should run them just fine.

AndrewM said:
Also, don't include just ONE use of anything you can think of. Mix things up. For example, the render-to-texture thing: why not use it in multiple cases? Sure, have one "pure" demo that shows how fast/slow it is, but also use it in a real case, and then maybe some obscure/pathological case.

I doubt that any of the tests will end up being "pure". It's tough to make something that's visually interesting and does nothing but render to texture, for example :).

AndrewM said:
And vendor-specific extensions and/or optimisations are not evil :devilish:
I know that you said time is a constraint, but it's not that hard to design a system with multiple backends, even if you don't fill them out... release the source code so that others might want to do it... ;)

I don't have a moral objection to vendor-specific optimizations, only a time objection. I have designed the basic framework so that it's easy to replace an ARB shader with a vendor-specific one later.

I haven't decided yet what sort of license I will release the benchmark under. Lots of things to consider here.

In case anyone is wondering about progress, I got the improved 3D Perlin noise algorithm running in a 51-instruction pixel shader today (see http://mrl.nyu.edu/~perlin/noise/). I haven't decided yet how to present it... whether to show a 2D plane sweeping through the volume, do actual volume rendering, or something else entirely. Also up in the air is whether to do some sort of fancy shading on the volume... it's already pretty computationally expensive. It slows my FX5200 to a crawl (typical with non-trivial fragment programs), and there is a driver bug related to texture indirection that prevents it from running on my 9700 Pro (I have emailed ATI devrel about it).
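For anyone curious what that shader evaluates, here is a CPU-side C sketch of the improved noise function (the shader replaces the table lookups with texture fetches; the permutation initialization below is a stand-in, since Perlin's paper specifies a particular 256-entry table):

[code]
#include <math.h>

static int p[512];

/* stand-in permutation of 0..255 (Perlin's paper uses a fixed table;
 * any permutation gives valid, if different-looking, noise) */
void init_noise(void)
{
    int i, j, t;
    for (i = 0; i < 256; i++) p[i] = i;
    for (i = 255; i > 0; i--) {
        j = (i * 16807 + 11) % (i + 1);   /* deterministic "shuffle" */
        t = p[i]; p[i] = p[j]; p[j] = t;
    }
    for (i = 0; i < 256; i++) p[256 + i] = p[i];
}

static double fade(double t) { return t * t * t * (t * (t * 6 - 15) + 10); }
static double lerp(double t, double a, double b) { return a + t * (b - a); }

/* gradient: pick one of 12 directions from the low 4 bits of the hash */
static double grad(int hash, double x, double y, double z)
{
    int h = hash & 15;
    double u = h < 8 ? x : y;
    double v = h < 4 ? y : (h == 12 || h == 14 ? x : z);
    return ((h & 1) == 0 ? u : -u) + ((h & 2) == 0 ? v : -v);
}

double noise3(double x, double y, double z)
{
    int X = (int)floor(x) & 255, Y = (int)floor(y) & 255, Z = (int)floor(z) & 255;
    double u, v, w;
    int A, AA, AB, B, BA, BB;

    x -= floor(x); y -= floor(y); z -= floor(z);
    u = fade(x); v = fade(y); w = fade(z);

    A = p[X] + Y;     AA = p[A] + Z; AB = p[A + 1] + Z;
    B = p[X + 1] + Y; BA = p[B] + Z; BB = p[B + 1] + Z;

    return lerp(w, lerp(v, lerp(u, grad(p[AA],     x,     y,     z),
                                   grad(p[BA],     x - 1, y,     z)),
                           lerp(u, grad(p[AB],     x,     y - 1, z),
                                   grad(p[BB],     x - 1, y - 1, z))),
                   lerp(v, lerp(u, grad(p[AA + 1], x,     y,     z - 1),
                                   grad(p[BA + 1], x - 1, y,     z - 1)),
                           lerp(u, grad(p[AB + 1], x,     y - 1, z - 1),
                                   grad(p[BB + 1], x - 1, y - 1, z - 1))));
}
[/code]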
 