More than a year on and shaders are unimpressive

Most hardware is faster at drawing large polys than small polys - fill rate efficiency is typically lost at the edges of polygons. Smaller polys = more poly edges per unit area of screen = less fill rate efficiency. This will only slow you down if you are fill-limited, of course.

And large polys are also better for cards which feature Hierarchical-Z buffers (i.e. the Radeon).
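A back-of-envelope sketch of the edge effect (all numbers illustrative, not from any real card): modelling each poly as an s×s square, the fraction of its pixels that sit on an edge grows as the poly shrinks, which is where the fill-rate efficiency goes.

```python
# Back-of-envelope: fraction of a polygon's pixels that lie on its
# edges, for square polys of side s (illustrative model only). Edge
# pixels are where rasterizers typically lose fill-rate efficiency,
# so a higher edge fraction means more wasted fill rate.
for s in (64, 16, 4):
    area = s * s
    edge = 4 * s - 4  # perimeter pixels of an s x s square
    print(f"{s}x{s} poly: {edge / area:.0%} of pixels on an edge")
```

For the 4×4 poly, three quarters of its pixels are edge pixels; for the 64×64 poly it's only about 6%, which matches the intuition that smaller polys mean more poly edges per unit of screen area.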
 
Ailuros said:
MDolenc,

When polygon rates increase, overdraw increases too. Hardware T&L does help, but it can't perform wonders either if overdraw isn't effectively addressed. Otherwise we'd see a GF2 Ultra in a dx7 T&L-optimized game waltz over a similarly-clocked GF3.

The GeForce3 doesn't have any less poly pushing power with the old fixed-function T&L than the GeForce2 Ultra, when similarly-clocked, as you say.

Chalnoth,

Reverend:

The Kyro (or, specifically, the Kyro2): given its lack of cubemap support, and with LightDirection being a cube map texture, would disabling per-pixel normalization of LightDirection enable the Kyro2 to run DOOM3? Would you do this?

John Carmack:

I doubt it, but if they impress me with a very high performance OpenGL implementation, I might consider it.

That was in reference to JC bothering to optimize for the Kyro. If you'll note, he says nothing about the hardware being incapable, just that its performance is unacceptable (dunno if you were attempting to argue with me or confirm what I stated...).
 
DaveBaumann said:
Most hardware is faster at drawing large polys than small polys - fill rate efficiency is typically lost at the edges of polygons. Smaller polys = more poly edges per unit area of screen = less fill rate efficiency. This will only slow you down if you are fill-limited, of course.

And large polys are also better for cards which feature Hierarchical-Z buffers (i.e. the Radeon).

i'd assume it's the other way around - hierarchical Z should handle higher-granularity surfaces better: smaller triangles would be rejected as quickly as larger triangles covering the same area (i.e. they would be rejected at the same stage of the z-pyramid), yet smaller triangles would have a better coverage/rejection ratio - i.e. more occluded area would get rejected when represented by smaller triangles.
 
Chalnoth said:
That was in reference to JC bothering to optimize for the Kyro. If you'll note, he says nothing about the hardware being incapable, just that its performance is unacceptable (dunno if you were attempting to argue with me or confirm what I stated...).

chalnoth, the problem is with the kyro's lack of cube mapping - you need it for correct dot3 non-directional (i.e. point/spot) lighting. if you don't have that, you can only do correct directional dot3.
 
The larger the triangle the more pixels can be rejected at a time. For instance, the coverage area for the hierarchical Z buffer on the Radeon 8500 equates to a total of 64 pixels - if you have a large occluded poly that spans this then you can reject all 64 pixels in one cycle; if you have plenty of smaller polys, say 4 pixels in size, then it's going to take multiple cycles to reject a similar number of pixels.

I think this is borne out in practice by the efficiency gains Radeons display in Villagemark, compared against the efficiency gains in actual games when the Hierarchical Z buffer is enabled.
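A toy model of that cost difference (the 64-pixel figure is from the post above; the one-poly-per-clock assumption is mine, not an ATI spec):

```python
# Toy model of hierarchical-Z rejection: the coarse Z test can kill up
# to BLOCK pixels per clock, but only within a single polygon (assumed
# behaviour). Rejecting the same 256 occluded pixels therefore costs
# far more clocks when they arrive as many tiny polys.
import math

BLOCK = 64  # pixels rejected per clock by the coarse Z test


def clocks_to_reject(poly_size, total_pixels):
    polys = total_pixels // poly_size
    # each poly needs at least one clock; larger polys span several blocks
    return polys * math.ceil(poly_size / BLOCK)


print(clocks_to_reject(256, 256))  # one large 256-pixel poly -> 4 clocks
print(clocks_to_reject(4, 256))    # 64 tiny 4-pixel polys    -> 64 clocks
```

Under these assumptions the tiny-poly case takes 16x the clocks to reject the same screen area, which is Dave's point in miniature.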
 
DaveBaumann said:
The larger the triangle the more pixels can be rejected at a time. For instance, the coverage area for the hierarchical Z buffer on the Radeon 8500 equates to a total of 64 pixels - if you have a large occluded poly that spans this then you can reject all 64 pixels in one cycle; if you have plenty of smaller polys, say 4 pixels in size, then it's going to take multiple cycles to reject a similar number of pixels.

I think this is borne out in practice by the efficiency gains Radeons display in Villagemark, compared against the efficiency gains in actual games when the Hierarchical Z buffer is enabled.

you have a point, but larger polys would still get a statistically-worse rejection ratio - i.e. you'd have fewer occluded pixels early-rejected than you could have with smaller triangles spanning the same area. whether it'd be better to have fewer pixels rejected more quickly, or more pixels rejected at the cost of more cycles, is a matter of how expensive each of those pixels is.

and re the Villagemark test - it could be down to the actual level of overdraw in the demo compared to the average title out there.
 
I'm not so sure...you'd still be able to reject 64 pixels per clock...which means that a totally-occluded 256-pixel polygon might be gone through in a mere four clocks (minimum - it'd take more in non-ideal circumstances, up to nine clocks when totally occluded, more if it has to actually do per-pixel z-checks in one or more tiles).
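The four-to-nine-clock spread can be reproduced with a tile-counting sketch, assuming the 64-pixel hier-Z blocks are 8×8 pixels and rejection costs one clock per block (both assumptions here, not documented specs): a 16×16 (256-pixel) poly covers 2×2 blocks when aligned to the grid and 3×3 when it straddles block boundaries.

```python
# How many 8x8 hier-Z blocks does an axis-aligned rectangle overlap?
# A 16x16 (256-pixel) poly touches 4 blocks when grid-aligned and up
# to 9 when misaligned - hence 4 to 9 clocks at one block per clock.
def tiles_touched(x, y, w, h, tile=8):
    """Count tile x tile blocks overlapped by a w x h rect at (x, y)."""
    tx = (x + w - 1) // tile - x // tile + 1
    ty = (y + h - 1) // tile - y // tile + 1
    return tx * ty


print(tiles_touched(0, 0, 16, 16))  # aligned:    4 blocks -> 4 clocks
print(tiles_touched(3, 5, 16, 16))  # misaligned: 9 blocks -> 9 clocks
```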
 
The GeForce3 doesn't have any less poly pushing power with the old fixed-function T&L than the GeForce2 Ultra, when similarly-clocked, as you say.

I didn't imply that; rather the contrary. I was clearly hinting at a case where the NV20 would be able to put its LMA and occlusion culling to use. Wild guess which would turn out faster?
 
By the time this wretched Unreal 2003 comes out, software T&L will be considerably faster than GeForce1 hardware T&L.
The Kyro did not have way superior FSAA. Above 800x600x16 the performance is much like other cards'. Articles at gamebasement come to mind in both cases.

Dream on about the SW T&L case, unless there's already a 5GHz CPU in production and I've missed it, heh.

You seem to have quite a kick with that article from Skywalker, and have interpreted it more than once to read the way you wanted it to in the end. I'll make it a bit simpler for you:

While the K2 employs just garden-variety Ordered Grid Supersampling, at 32bpp colour depth it can produce twice the framerate with FSAA on of what a V5 can give with two chips in SLI.

It's limited by the drivers, by the way, delivering 4xFSAA only up to 1024x768x32 and 2x vertical up to 1280x1024x32.

A Tiler will always be faster than an IMR with FSAA employed; live with it.
 
Above said:
By the time this wretched Unreal 2003 comes out, software T&L will be considerably faster than GeForce1 hardware T&L.

Highly unlikely in a real-game situation.

If the game is made to make use of as many polygons as the GF1 DDR can handle, and use the CPU as well, then software T&L will not be faster because it will have to fight for CPU time along with other game processes (such as the new physics engine in UT2K3...).
 
Ailuros, your behaviour is unacceptable.
I came to beyond3d to avoid the misinformed, the bullying and the patronising, those who only want to talk without improving their erudition.
Simple facts about Geforce1 T&L are here, not on gamebasement as I suspected:
http://www.tech-report.com/reviews/2001q2/tnl/index.x?pg=6

And as for the Kyro- I had one, it is not twice as fast. Benskywalker is right. Do not be rude, this is not a stamping ground for petty little aggression.

Edit- I am out of this thread due to rude hostility.
 
DemoCoder,

I respectfully disagree with you on the impact of the V5's FSAA. Yes, it did not push game designers, but it offered something nice to the user: a feature they could use TODAY, vs waiting many months in the hope that the new features on their card might get implemented in some form or fashion. I think 3dfx did not really push the development side as much, and that is not really a good thing. However, they did give the user community a gift.

Every now and then it's nice for users to get something they can use today to make their games look better. We all know that new features will make tomorrow's games look better, but as a society we would rather have instant gratification than bank on some new technology getting adopted. We want today's games to look better, and that won't happen with most of the new features. Besides, it's the consumers that are paying for all of this, so you have to throw them a bone once in a while :)
 
Above said:
...
Simple facts about Geforce1 T&L are here, not on gamebasement as I suspected:
http://www.tech-report.com/reviews/2001q2/tnl/index.x?pg=6

...

Good link, Above, but the benchmarks use two different CPU speeds (800MHz and 1.4GHz). I would like to see the same benchmarks (soft and hard T&L) with the same CPU speed (1.4GHz).

My guesstimate for Car Chase High Detail is 26fps, or a 60% performance increase, with a 1.4GHz Athlon/GF1.

My guess is you will need an AMD rating around 5000+ for software T&L to come out faster using the same CPU in this specific case.
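The scaling logic behind that guess can be sketched with hypothetical numbers (the software T&L figure below is invented for illustration; only the 26fps guesstimate comes from the post above), assuming software T&L scales linearly with CPU clock:

```python
# Hypothetical figures for illustration only: assume software T&L at
# 1.4GHz delivers 7fps in Car Chase High Detail, while GF1 hardware
# T&L delivers the guesstimated 26fps. With linear scaling, the CPU
# clock needed for software T&L to match hardware falls out directly.
sw_fps_at_14ghz = 7.0  # assumed software T&L result at 1.4GHz
hw_fps = 26.0          # guesstimated GF1 hardware T&L result
needed_ghz = 1.4 * hw_fps / sw_fps_at_14ghz
print(f"software T&L matches hardware at ~{needed_ghz:.1f}GHz")
```

With these assumed inputs the break-even clock lands around 5GHz, in the same ballpark as the 5000+ rating guessed above.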
 
Above said:
Ailuros, your behaviour is unacceptable.
I came to beyond3d to avoid the misinformed, the bullying and the patronising, those who only want to talk without improving their erudition.
Simple facts about Geforce1 T&L are here, not on gamebasement as I suspected:
http://www.tech-report.com/reviews/2001q2/tnl/index.x?pg=6

And as for the Kyro- I had one, it is not twice as fast. Benskywalker is right. Do not be rude, this is not a stamping ground for petty little aggression.

Edit- I am out of this thread due to rude hostility.

Do I sense a bit more sensitivity than needed or did I offend a piece of hardware here suddenly?

Feel free to flip the argument down to a K1@115MHz, where the drivers didn't even allow 4xFSAA at 1024 at launch, from a K2@175MHz.

As for the link on T&L it was supposed to support the following:

By the time this wretched Unreal 2003 comes out, software T&L will be considerably faster than GeForce1 hardware T&L.

I wouldn't consider playing that wretched game either on a GF1 or on a non-T&L card, and that for obvious reasons. To cope with a game like that, you need cards with a many-times-stronger T&L unit, at least a past-1.2GHz CPU, and overdraw-eliminating techniques as effective as possible, all combined in one. If the current performance graphs do not seem revealing, then we'll just have to wait until its final release. I doubt, though, that the picture will change much from what we've seen in Anand's shootout - other than maybe that he didn't exactly present the worst-case scenarios in the game.
 
jb said:
I respectfully disagree with you on the impact of the V5 FSAA.

What about the actual number of Voodoo5's sold? I don't really remember anything specific, but based on the income of 3dfx in its last months, it seemed that only a few hundred thousand V5's were sold altogether...

Also, I wonder if Reverend has checked out the demo yet :)
 