New GLSL demo

Humus said:
Rambler said:
Looks cool as usual, but Humus, are you sure the occlusion culling is working? In the first room I get 42 fps (~22 in the "flame"), but once I move out of the 1st room, I get a constant 21 fps everywhere.
Also, I didn't notice any difference when toggling the occlusion queries in the menu.
Cat 3.10, 12x10, R9800, P4 2.4GHz

Edit: erm, yes, there is a difference, but there's still the fps drop when leaving the 1st room.
Yup, it's most certainly working. First room: 100 fps enabled, 69 fps disabled. The drop in fps happens when the second room becomes visible, since then it needs to draw two rooms. If you're running at a very high resolution, 1600x1200 or so, which I suspect you are given your fps, then the performance difference is smaller, since the particle system overhead and state changes become less important. Using portals has the other nice property that it more or less automatically does a coarse front-to-back sort, since the room you're in is drawn first and then recursion proceeds to the closest rooms. That helps hierarchical-Z quite a bit, so even when occlusion culling is disabled, hierarchical-Z can quickly cull a lot too.
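The traversal described above can be sketched roughly like this. The names (`Sector`, `Portal`, `render_order`) are made up for illustration and are not from the demo's actual code; the point is just that drawing the camera's room first and recursing through visible portals yields a coarse front-to-back order for free:

```python
class Sector:
    """A room in the scene graph. Illustrative only, not demo code."""
    def __init__(self, name):
        self.name = name
        self.portals = []

class Portal:
    """A connection to another sector; `visible` stands in for the
    result of an occlusion query on the portal geometry."""
    def __init__(self, target, visible=True):
        self.target = target
        self.visible = visible

def render_order(sector, visited=None):
    """Return the order sectors get drawn in: the camera's sector first,
    then recursively the sectors behind visible portals. This is the
    coarse front-to-back sort that helps hierarchical-Z."""
    if visited is None:
        visited = set()
    if sector.name in visited:
        return []
    visited.add(sector.name)
    order = [sector.name]              # current room is drawn first...
    for portal in sector.portals:      # ...then the rooms behind its portals
        if portal.visible:             # in the demo, an occlusion query decides this
            order += render_order(portal.target, visited)
    return order
```

With two-way portals between a first and second room, starting the walk in either room draws that room first; marking a portal occluded prunes the room behind it entirely.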
First, very nice demo :) For some reason, the per-pixel lighting just looks really good to me.

Anyway, on the portal rendering, I get this feeling that your portals are all one-way only. That is, it looks like there's a portal between the first and second rooms, and yet, when I'm in the second room, it doesn't seem like there's any direction in which I can turn that will give higher performance than looking back at the first room. This can be repeated in a couple of other places, too.
 
bloodbob said:
No noise means no GLSLANG support, since it is a required component, even if it is hellishly slow. But noise isn't going to be a major factor for a little while, so it's a moot point. Noise implementations can vary; the requirements are somewhat subjective, as they don't specify any values other than the average (ignoring the range). The older GLSLANG documents effectively said that the implementation should really, really be Perlin noise unless you've got something else just as good.
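For reference, the loose requirements mentioned here (a repeatable function with average around 0 and range roughly [-1, 1]) can be satisfied by something as simple as smoothed value noise. The sketch below is purely illustrative and is not what any driver actually ships; the hash constants are a common integer-hash recipe, not anything from the spec:

```python
import math

def _hash(i):
    # Cheap integer hash mapped into (-1, 1]; purely illustrative.
    i = ((i << 13) ^ i) & 0xFFFFFFFF
    n = (i * (i * i * 15731 + 789221) + 1376312589) & 0x7FFFFFFF
    return 1.0 - n / 1073741824.0

def value_noise1(x):
    """1D smoothed value noise: repeatable for the same input, values in
    [-1, 1], average near 0 -- about all the GLSL noise text demands."""
    i = int(math.floor(x))
    f = x - math.floor(x)
    t = f * f * (3.0 - 2.0 * f)          # smoothstep interpolation
    return _hash(i) * (1.0 - t) + _hash(i + 1) * t
```

A true Perlin gradient noise would interpolate gradients rather than lattice values, but as noted, the wording leaves the exact implementation open.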

I still think static loops should have been introduced before ATI started saying they actually supported GLSLANG :/

Two questions remain in my mind about the new GLSLANG support.

1. As asked before, can anyone tell me if it actually supports the F-buffer?
2. How does the performance compare to M$ HLSL, considering that the optimisation can be done on the source language rather than the byte code?

Well, noise is there, as MazyNoc posted, but not running in hardware mode.

The F-buffer doesn't seem to be enabled.

Haven't written anything comparable in GLSL that I can bench. Will do that at some point.
 
Chalnoth said:
Anyway, on the portal rendering, I get this feeling that your portals are all one-way only. That is, it looks like there's a portal between the first and second rooms, and yet, when I'm in the second room, it doesn't seem like there's any direction in which I can turn that will give higher performance than looking back at the first room. This can be repeated in a couple of other places, too.

No, they are most certainly two-way. I'm getting 71 fps in the second room looking into the wall behind the corner next to the tunnel, and 54 fps with occlusion culling disabled. Or if I'm at the entrance to the first room, then looking to the left into the wall I'm getting 72 fps, and when I'm looking back into the first room I'm getting 64 fps.

Going through the wall you can easily verify that the room is being culled away if you place yourself so that the portal is occluded.
 
Humus said:
No, they are most certainly two-way. I'm getting 71 fps in the second room looking into the wall behind the corner next to the tunnel, and 54 fps with occlusion culling disabled. Or if I'm at the entrance to the first room, then looking to the left into the wall I'm getting 72 fps, and when I'm looking back into the first room I'm getting 64 fps.

Going through the wall you can easily verify that the room is being culled away if you place yourself so that the portal is occluded.
Yeah, you're right. It helps if I don't try to do the testing full screen... that makes the differences more obvious.

Anyway, I'm curious as to how you actually do the portal detection. I would think the easiest way would be a per-pixel command where, if certain conditions are met (a z-test pass, here), rendering halts and the GPU sends a message to the CPU. Is there such a command?
 
Well, that's sort of what occlusion culling does. It counts the fragments that pass depth and stencil testing. So if the count was zero, then the portal was completely hidden and you can safely throw away the sector behind it. While waiting for the occlusion query to complete, I do the math for the particle system, so I can get the most out of the parallelism.
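The latency-hiding pattern described here can be sketched as follows. The three callables are stubs standing in for real work (the GL calls noted in the comments are the standard occlusion-query API one would presumably use, but this is a sketch of the scheduling only, not actual code from the demo):

```python
def sector_visible(issue_portal_query, update_particles, read_query_result):
    """Issue the occlusion query for the portal geometry, overlap the
    wait with useful CPU work (the particle system), then read the
    fragment count. Returns True if the sector behind the portal should
    be drawn. All three arguments are stubs for illustration."""
    issue_portal_query()             # e.g. glBeginQuery(GL_SAMPLES_PASSED); draw portal; glEndQuery()
    update_particles()               # CPU-side particle math while the GPU finishes the query
    fragments = read_query_result()  # e.g. glGetQueryObjectuiv(..., GL_QUERY_RESULT, ...)
    return fragments > 0             # zero fragments passed -> portal occluded, skip the sector
```

The key point is simply that the particle update runs between issuing the query and reading it back, so the CPU isn't idle while the GPU finishes counting fragments.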
 
Humus said:
Well, that's sort of what occlusion culling does. It counts the fragments that pass depth and stencil testing. So if the count was zero, then the portal was completely hidden and you can safely throw away the sector behind it. While waiting for the occlusion query to complete, I do the math for the particle system, so I can get the most out of the parallelism.
Ahhhh....can't believe I didn't think of that :) Thanks.
 
Wow - finally NV has GLSL support. Hard to believe that once, in the not-too-distant past, NV's drivers were the gold standard for the whole industry. It's of course not the driver team's fault - they have plenty of shaders to rewrite, after all.

NB: I love how the guy who took that screenshot apparently forgot to clean (or at least hide) all the WinRARed warez off of his desktop - including one called "Luigimansion", which is of course GameCube-only. :LOL:
 
It's probably too soon to compare the performance of GLSL compilers with HLSL. They're still sorta beta and will need a few iterations of optimization. MS's HLSL compiler improved a lot since the original beta.
 
akira888 said:
NB: I love how the guy who took that screenshot apparently forgot to clean (or at least hide) all the WinRARed warez off of his desktop - including one called "Luigimansion" which is of course Gamecube only. :LOL:

Not to mention that the Windows install is an illegal one as well (if you look at the desktop version string) :D

But yeah, it's cool to see NV drivers finally showing initial GLSL support. Now the race for the best support can start :D
 
DemoCoder said:
It's probably too soon to compare performance of GLSL compilers with HLSL. They're still sorta beta, and will need a few iterations of optimizations. MS's HLSL compiler improved alot since the original beta.
I wouldn't say it's too soon to compare. I'd say it's too soon to expect good performance. Much more interesting right now is how functional the support is; performance can come later. Of course, performance will have to be up to par once games come out with GLSL support, so nVidia would be best served to get performance up well before those games ship, though it isn't strictly necessary.

I do, however, suspect that their work with Cg and assembly compilation from HLSL should make it relatively easy to get performance up in GLSL.
 
Yes, because Cg already "compiles to the metal", and therefore most of the work for GLSL will be in the parser frontend and in conforming 100% to the GLSL semantics, which are very close to HLSL's.

What I meant was: don't expect good performance in the first revision. The first job is correctness. Premature optimization is the r00t of all ev1l. :)
 
Well, Cg doesn't compile to the metal, but I suppose it does come fairly close in the OpenGL compile targets. I do wonder whether nVidia has any hardware instructions that aren't exposed in their OpenGL targets that might be useful when compiling from a higher-level language? Probably not yet... but future architectures may be optimized for compiling directly from a high-level language to machine code. Toward this end, I do hope that Microsoft's next version of DirectX forgoes standard assembly altogether.
 
Well, the NV_fragment_program extension is about as close to the metal as you can get. It's machine code, not an abstraction. Cg compiles to NV30 hardware the way GCC compiles to x86 assembler. In no way is NV_fragment_program a virtualization and abstraction of the underlying HW, as ARB_fragment_program and DX9 are. (Isn't it interesting how details like exactly 32 FP32 registers, or 64 FP16 registers labeled H0-H63, are exposed?)

Sorry Chalnoth, but Cg's NV30 backend compiles "to the metal". The intermediate assembler step is merely a showpiece; it could just as well write out binary object code and the results would be the same. The Cg backend is not written to accommodate some "generic" abstract NV_fragment_program architecture, as if there were multiple IHVs supporting such an abstraction. It looks uber-concrete to me.
 
Humus said:
Mendel said:

Nice. Is this official, though? I've been expecting GLSL support from nVidia to come soon, as there has been talk on opengl.org about it being in the drivers already, but not yet enabled.

No official drivers yet, but there are some beta drivers with GLSL _and_ Cg in the driver. Not sure how Cg is enabled in the driver, though.
 