FSAA and high polygon loads mutually exclusive?

On 2002-02-11 17:48, Galilee wrote:
I wouldn't say Aquanox is unplayable with FSAA.
no FSAA:
http://www.stud.ntnu.no/~vidaralm/Img/aqua.jpg
2X FSAA
http://www.stud.ntnu.no/~vidaralm/Img/aqua2X.jpg
4X FSAA
http://www.stud.ntnu.no/~vidaralm/Img/aqua4X.jpg

Well, to give Doomtrooper some credit: on the FSAA benchmark the min FPS goes down to 1.4 (from 31.4 without FSAA)! That is unacceptable, but it also goes to show that there is something odd and immature with this engine IMHO.

Regards, LeStoffer
 
About the Comanche FPS...

First off, I'm not totally convinced that FRAPS is able to both provide framerate AND NOT add any slowdowns to the game.

It has been a short while since I've tried out the latest version of FRAPS, but all of my previous experience with that utility is that it was a horrible way of trying to capture the performance of a game that doesn't have any native capability of doing this on its own.

Having said that...We were talking about the 4x mode, and MikeC has provided Comanche 4xs shots. 4xs is not your normal 4x mode, so bear that in mind. In total, here are the numbers:

25,35,40,57,60

Remember, this is not the same kind of game as Quake3, which is something very important when dealing with FSAA performance...But judging by those numbers, there's nothing wrong and/or prohibitive about FSAA in that game.

And anytime you're talking about performance, you cannot just make a blanket statement about...well, about everything/anything. One or two titles do not justify a position like "FSAA is not an option in newer games w/ DX8-like effects".

There is something to be said about crap code too...There are some games whose performance really does defy logic...and in many cases, you can probably attribute it to just bad code.

The funny thing about Aquanox is this...Aside from everything else, that game just sucks badly. Seriously, terrible game man...
 
On 2002-02-11 17:58, LeStoffer wrote:
On 2002-02-11 17:48, Galilee wrote:
I wouldn't say Aquanox is unplayable with FSAA.
no FSAA:
http://www.stud.ntnu.no/~vidaralm/Img/aqua.jpg
2X FSAA
http://www.stud.ntnu.no/~vidaralm/Img/aqua2X.jpg
4X FSAA
http://www.stud.ntnu.no/~vidaralm/Img/aqua4X.jpg

Well, to give Doomtrooper some credit: on the FSAA benchmark the min FPS goes down to 1.4 (from 31.4 without FSAA)! That is unacceptable, but it also goes to show that there is something odd and immature with this engine IMHO.

Regards, LeStoffer

It's a bug in the benchmark here. It's not down to 1.4. It happens because the benchmark "hangs" at the start, giving me a 1.4.
I'll try to run it some more. I usually get minFPS=1.4 without FSAA also :smile:

Edit: By following the framerate counter I get minFPS=28 with 2X.
I'll try 4X.

Edit2: 20.1 fps with 4X.

I have no idea why it hangs in the beginning, it doesn't do that without FSAA.
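
A minimal sketch of why a single stall at the start can dominate a "min FPS" figure, and how skipping a warm-up window changes it. The sample readings and the `warmup` parameter are hypothetical, chosen only to mirror the 1.4 and 28 figures mentioned above; they are not output from the Aquanox benchmark.

```python
def min_fps(samples, warmup=0):
    """Return the lowest FPS reading after skipping the first `warmup` readings."""
    steady = samples[warmup:]
    if not steady:
        raise ValueError("no samples left after the warm-up window")
    return min(steady)


# Hypothetical readings: one stall while the benchmark loads, then steady state.
readings = [1.4, 30.2, 28.0, 33.5, 31.1]

raw_min = min_fps(readings)               # includes the start-up stall
steady_min = min_fps(readings, warmup=1)  # ignores the first reading

print(raw_min, steady_min)
```

Reading the on-screen counter by eye, as done above, is effectively the same thing as discarding the warm-up reading.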

[ This Message was edited by: Galilee on 2002-02-11 18:33 ]
 
I think it's obvious, isn't it? I mean, it's so blatantly obvious I can't believe people are even questioning it. I'll refer back to a GeForce4 Ti 4600 running with 4X FSAA shot...notice the FRAME COUNTER.
BTW, I never stated just high polys, but high polys with advanced effects like Pixel Shaders, many times.

No, it's not obvious, because it's untrue, mate. You can't believe that people are questioning your opinion that high poly counts affect FSAA performance? Why can't you believe that?

I'm not interested in the game tests or pixel shaders. What I'm saying is this thread is not about pixel shaders and FSAA. This thread (if you just go and read the title) is specifically about your comments on high poly counts not mixing with FSAA. Now you may have also mentioned other effects, but that's not what this thread is about. This thread is about these key comments:

As games continue to put more polys on the screen (UT 2) FSAA is not an option.

FSAA and high polycount games don't mix

That's what I'm talking about and that's what this thread is about, not pixel shaders or any other effects. So in what way do poly counts affect FSAA performance? Don't get me wrong, I'm not saying you can't talk about pixel shaders etc. and FSAA. What I'm saying is: first acknowledge that high poly counts and FSAA are not mutually exclusive, as the first guy asked.

Some of you are really..thick for being experts in this field...

DoomTrooper has repeatedly said that he's not talking about high polycounts but ADVANCED EFFECTS in ADVANCED ENGINES.

Read the first post, which sets the discussion in this forum, and take a look at Doom's comments. You think that those comments are not saying that high poly counts make FSAA slow?

I'm not really interested in this thread, to be honest. I just think that Doom should at least acknowledge that he is wrong in those comments, and then everyone can move on to discussing what Doom actually wants to discuss.

[ This Message was edited by: Teasy on 2002-02-11 19:01 ]
 
I'm not sure I understand all the hubbub taking place here, as I would have thought it's just common sense. Anti-aliasing is 'cheap' or 'free' only while software isn't eating up fillrate and bandwidth. When the majority of available fillrate/bandwidth is already being consumed by shading effects, multipass rendering, etc., naturally performance with the added stress of antialiasing is going to drop.

FSAA performance is a moving target.
 
On 2002-02-11 02:51, MikeC wrote:
Modern game with DirectX 8 graphics technology being played at 1024x768 with 4XS AA.

http://www.nvnews.net/#1013480924

I'm updating my own post :smile:

I inadvertently took those Comanche 4 screenshots with the GeForce4 Ti 4600 running at the same clock speeds as the GeForce3 Ti 500. This was a result of the occlusion culling test I had performed with Quake 3 a day earlier.

I'll get the pics updated this evening.
 
There are many things that can fall under the category of "advanced effects". For some of them, the performance goes down bigtime with FSAA, for others, not.


Pixel shaders, as Dave pointed out, should not be affected by multisampling FSAA because they are only calculated once per pixel, not subpixel.

Anything that is fillrate limited (as opposed to bandwidth limited) should not be affected by multisampling implementations, since again, multisampling doesn't consume fillrate, only bandwidth.


Extra geometry can interact in multiple ways.








The last "advanced effect" is multipass algorithms, render-to-texture, shadow volumes/shadow buffers. Since the shadow buffers or render-to-texture tricks are usually rendered with FSAA switched off, they compete for bandwidth and GPU clock cycles.


So you see, there are many many ways a GPU can be bottlenecked. It could be waiting for data from the host AGP bus. It could be waiting for a vertex calculation to complete. It could be waiting for a memory access to complete. It could be waiting for a pixel write to finish. It could be waiting for a triangle to finish. Etc etc.


Not all of these are affected by multisampling FSAA, and it depends on the game engine and how it manages bandwidth, overdraw, etc. Just look at some of the CPU-limited flight sims, whose framerate doesn't change that much no matter how fast the graphics card is. FSAA is free in these games.


I think Doomtrooper's comments are ill-specified, since he doesn't actually identify the precise effects he is talking about.

Multisampling's (on traditional architecture) principal enemy is bandwidth. However, in many of these so-called advanced effects, it is extra clock cycles or combiner stages being burnt, not bandwidth, unless the advanced effect is a multipass trick.
 
Hi,

I'd like to throw another hypothesis out, and maybe democoder or some of the other gurus on here can verify/debunk it. :smile:

As features get more complex and advanced, even ones that eat bandwidth, the bandwidth that is consumed by 4X multisampling AA becomes comparatively smaller as hardware gets better. Basically, as the bandwidth requirements for new features grow, and as the bandwidth available on new cards grows, the percentage of bandwidth used by a 4X multisampling AA implementation approaches 0.

Say your multisampling AA implementation is taking 640MB/sec of bandwidth when used, and you currently have 6.4GB/sec of bandwidth. It's using 10% of the available bandwidth. Now say a year from now you have a new card that can do 12.8GB/sec instead. The same AA implementation would then only be taking 5% of the available bandwidth, assuming that the complexity of the scene doesn't affect the bandwidth use of the AA implementation (can someone verify this?). So on a newer card with more bandwidth, the effect of using AA should be smaller than on an older card.
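
The arithmetic in this hypothesis can be sketched as follows. It takes the post's own assumption at face value, namely that the AA cost is a fixed 640MB/sec regardless of scene complexity; the function name and the 1GB = 1000MB convention are illustrative choices, not something stated in the thread.

```python
def aa_bandwidth_share(aa_cost_mb_s, total_gb_s):
    """Fraction of total memory bandwidth consumed by a fixed AA cost."""
    total_mb_s = total_gb_s * 1000.0  # assumes 1 GB = 1000 MB
    return aa_cost_mb_s / total_mb_s


share_now = aa_bandwidth_share(640, 6.4)     # today's hypothetical card
share_later = aa_bandwidth_share(640, 12.8)  # card with doubled bandwidth

print(f"{share_now:.0%} now, {share_later:.0%} later")
```

Whether the fixed-cost assumption holds is exactly the open question raised above; if AA traffic scales with overdraw or frame rate, the share would not halve this cleanly.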

Nite_Hawk

Edit: Changed amount of bandwidth used approaches 0 to percentage of bandwidth used approaches 0.

[ This Message was edited by: Nite_Hawk on 2002-02-11 20:33 ]
 
DemoCoder,

thanks for your posting. I wonder about #2, though, i.e. do such situations pose a serious problem with current games? I personally wouldn't think that such additional bandwidth requirements will really drag current multisampling FSAA hardware down to unplayable framerates, considering how much more of an impact multitexturing will have, but perhaps you have more, specific information?

Again, thanks for one of the most interesting posts in this thread,

.rb

 
No one actually posted any results of the simple 3DMark2001 High Poly experiment, so I will:

Radeon 8500, 1024x768x32, no fsaa
1 light : 37.9 million polys/s
8 lights : 9.7 million polys/s

Radeon 8500, 1024x768x32, 4x fsaa "quality"
1 light : 37.9 million polys/s
8 lights : 9.7 million polys/s

I was actually surprised that the poly rendering capabilities were totally unaffected in this test, enough to recheck settings and results. After all, the number of vertices is certain to overflow any on-chip caches, so I felt you should see _some_ sign of bus-contention effect as the framerate in the single light case is actually quite high (some would call it playable ;), and the card is rendering at an effective 2048x1536x32 resolution in the 4xfsaa case.
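
As a quick check of the "effective 2048x1536" remark: assuming the 4x mode stores four samples per displayed pixel, the sample count at 1024x768 matches the pixel count of a 2048x1536 frame. The helper name is illustrative only.

```python
def sample_count(width, height, samples_per_pixel):
    """Total number of stored samples for one frame."""
    return width * height * samples_per_pixel


aa_samples = sample_count(1024, 768, 4)     # 4x AA at 1024x768
big_frame = sample_count(2048, 1536, 1)     # no AA at 2048x1536

print(aa_samples, big_frame)
```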

/me shrugs

Entropy

[ This Message was edited by: Entropy on 2002-02-12 17:14 ]

[ This Message was edited by: Entropy on 2002-02-12 17:15 ]
 
I was actually surprised that the poly rendering capabilities were totally unaffected in this test, enough to recheck settings and results

Obviously that test shows the card is not fillrate limited (hmm, better to say bandwidth limited in this case) even with 4x FSAA on. Am I wrong, or are textures not used at all in that test? I don't have 3DMark 2001 on my computer.
The second thought is that the test isn't even touching the rasterizer setup limit, and that the FIFOs between the T&L engine and the rasterizer are big enough not to stall the T&L engine.

ciao,
Marco
 
Entropy,

TRY using a game like MOHAA with 4X quality. Since this game is pushing over 25,000 vertices, it would be a better test, as other factors come in, like CPU cycles used up for AI etc...

You can get the DEMO for free if you don't have the game.
 
Please note, there is a big difference between Aquanox and Aquamark.

Aquamark is a custom designed benchmark to show off the Geforce3 hardware vertex shader in high polygon environments. There are about 80-120K polygons on screen each frame.

Aquanox, the game, has a MUCH lower polycount, there are a lot of things missing there. It is just about 30K polygons per frame.
 
Democoder-

>>"Not all of these are affected by multisampling FSAA"<<

All sound theory, but I'd simply add a #4 to your list specifically with FSAA:



If a method used to implement AA competes with GPU and/or CPU to perform the scaling and averaging processes associated with AA, an impact can be measured. The degree of this is still what needs to be determined. As polygon counts increase, this may indeed become a contributing factor. It may already be today.

Obviously any implementation that sees dramatic variance in AA performance dropoff based on CPU speed specifically means an implementation being handled by some degree of CPU instruction. Likely simple controlling of the GPU on a micromanaged level for each step of the AA process. This form of AA will obviously feel impact with increased geometry load as the two are similar in this respect and are therefore in contention for similar resources (CPU and GPU contention).


Entropy-
>>"No one actually posted any results of the simple 3DMark2001 High Poly experiment, so I will"<<

Did you make sure to exit 3DMark, change the settings, then restart 3DMark? I don't see the same results as you. This may in part explain ExtremeTech's results, as they claim they couldn't get AA to work correctly either.
I see:
NoAA
1 light: 32.2
8 lights: 9.6

4x Quality AA
1 light: 21.3
8 lights: 8.0
------------
1 light (~34% change)
8 lights (~17% change)

Interestingly enough, to go more into the implementation approach of AA, my second box has a GF3. Albeit running on a slower processor (1GHz P3 vs. 2GHz P4), running polygon synthetics with and without AA tells a similar story. I'd be very interested if someone with a GF3 and quite a bit more "cojones" PC-wise could follow up as well:
NoAA
1 light: 18.9
8 lights: 5.1

4xOGMS
1 light: 14.6
8 lights: 4.8
================
1 light(~23% change)
8 lights(~6% change)

It becomes more interesting once you go back to 3DMark2000 and use the hardwired T&L, which one would expect might impose additional GPU contention depending upon AA implementation (benchmark default settings, 10x7x16):

NoAA
1 light: 16709
4 lights: 11152
8 lights: 5805

4xOGMS
1 light: 7243
4 lights: 6587
8 lights: 4794
===================
1 light (~57% change)
4 lights (~40% change)
8 lights (~17% change)

The lower CPU speed may cloud a bit of this.
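
The percentage changes quoted in this post can be reproduced with a short calculation. The numbers below are copied from the three result sets above; the dictionary labels and function name are just illustrative.

```python
def percent_drop(no_aa, with_aa):
    """Relative throughput loss when AA is enabled, in percent."""
    return (no_aa - with_aa) / no_aa * 100.0


# (no AA, 4x AA) pairs as posted above.
results = {
    "first box 4x Quality, 1 light": (32.2, 21.3),
    "first box 4x Quality, 8 lights": (9.6, 8.0),
    "GF3 4xOGMS, 1 light": (18.9, 14.6),
    "GF3 4xOGMS, 8 lights": (5.1, 4.8),
    "3DMark2000 4xOGMS, 1 light": (16709, 7243),
    "3DMark2000 4xOGMS, 4 lights": (11152, 6587),
    "3DMark2000 4xOGMS, 8 lights": (5805, 4794),
}

drops = {name: percent_drop(a, b) for name, (a, b) in results.items()}
for name, drop in drops.items():
    print(f"{name}: ~{drop:.0f}% change")
```

The pattern is the same in every set: the single-light case, which runs at the highest polygon rate, loses the most when AA is switched on.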
 
25,000 vertices per scene is only 1.5 million per second (@ 60fps), and at ~40 bytes per vertex, only 66MB/s, off the radar as far as bandwidth is concerned.
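
A back-of-the-envelope version of that estimate, as a sketch. The 25,000 vertices, 60fps and ~40 bytes come from the post; the function name and the decimal megabyte are assumptions. With exactly 40 bytes per vertex the product is 60MB/s, and a 44-byte vertex would give 66MB/s, so the order of magnitude is the same either way.

```python
def vertex_bandwidth_mb_s(vertices_per_frame, fps, bytes_per_vertex):
    """Vertex traffic in MB/s (1 MB = 1,000,000 bytes)."""
    return vertices_per_frame * fps * bytes_per_vertex / 1_000_000


verts_per_sec = 25_000 * 60                          # vertices per second
bw_40 = vertex_bandwidth_mb_s(25_000, 60, 40)        # ~40-byte vertex
bw_44 = vertex_bandwidth_mb_s(25_000, 60, 44)        # slightly fatter vertex

print(verts_per_sec, bw_40, bw_44)
```

Against a memory bus measured in gigabytes per second, tens of megabytes per second of vertex data is a small fraction.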

The reason why MOHAA would be much slower in 4XAA is not because of geometry sucking bandwidth, but because the game has high levels of overdraw/multipass/hi-res textures and simply eats bandwidth like a monster.


BTW, no graphics card I know of uses the CPU to perform AA blending. The GeForce has always done it via the built-in scaling hardware.
 
Sharkfood, you are completely right. Odd thing is, after recording the scores, as I said, I got suspicious and checked that FSAA was on. It was. I then fired up 3DMark2001 again and ran the first game demo just to see that FSAA was enabled in the app. It was.

Sneaky. :/ My Bad.

Anyways, the real scores are

Radeon 8500, 1024x768x32, no fsaa
1 light : 37.9 million polys/s
8 lights : 9.7 million polys/s

Radeon 8500, 1024x768x32, 4x fsaa "quality"
1 light : 22.8 million polys/s
8 lights : 8.2 million polys/s

Which seems more right.
The higher framerate of the single-light test makes it consume a larger part of the available bandwidth.

Phew. There is order in the Universe again. :)

Entropy
 
On 2002-02-12 23:46, DemoCoder wrote:

25,000 vertices per scene is only 1.5 million per second (@ 60fps), and at ~40 bytes per vertex, only 66MB/s, off the radar as far as bandwidth is concerned.

The reason why MOHAA would be much slower in 4XAA is not because of geometry sucking bandwidth, but because the game has high levels of overdraw/multipass/hi-res textures and simply eats bandwidth like a monster.


BTW, no graphics card I know of uses the CPU to perform AA blending. The GeForce has always done it via the built-in scaling hardware.

Democoder,

Does your technical theory change the fact that FSAA is not an option in that game? That was my main concern from my initial post. In the future, do you think game engines are going to be any easier to render, with much more advanced effects and, like you said yourself, hi-res textures? :smile:
 
The technical theory does, at a very minimum, bring some understanding to the topic @ hand...

To make across-the-board generalizations based only on opinion/speculation...and then try to turn them into fact is just flawed logic, IMHO. Understanding, or having some clear idea of what the theory tells you, can @ least point you in the right direction...And can also lead you to make more 'correct' assumptions/hypotheses/etc.
 
Type,

Technical theories are exactly that: theories. I'm not saying that the people in this thread are not intelligent, as I have a lot of respect for the knowledge base in this forum. Yet, as I stated before, you don't need to be a rocket scientist to look at a frame counter. It was one of the main reasons why I even started on the FSAA kick. I notice more and more (in games like MOHAA) that FSAA is becoming a non-feature.
I would also like to point out a very serious flaw in benchmarking games today.
Almost every game that is sold today is played on the internet. Showing screens of a single-player game running around with max anisotropic and 56X FSAA @ 1600 x 1200 and saying "look, I get 80 fps" proves little. If you want to test your settings properly, start using a 30+ player match server and see how well your frames hold up when you have 14 enemy aircraft, players etc. trying to turn you into swiss cheese.
 
Online gaming introduces additional resources to the game engine...Obviously, it's going to have a negative effect on performance.

Heck, I've always been most impressed with Id's networking code...probably as much as the actual 3D code...There are so few titles out that stress systems out like Id-powered games...and the truth of the matter is that most of them totally suck in this regard...

Which is yet another reason why Carmack and crew are simply head & shoulders above everybody else.

I will say this...I have MOH, and I'm about 40% through the game. I play @ 10x7x32, 4x anisotropic filtering and 4xFSAA. Aside from the beach landing sequence, I haven't had any issue with this configuration.

Would I say, across the board, that FSAA and a game like MOH don't mix? No. I would say that when using the most aggressive settings, there may come a scene or 2 that really makes you think otherwise...But this happens so seldom that I wouldn't even bother changing settings.

Besides, I really think that FSAA is an acquired taste...One thing that 3dfx preached that was totally on the $$...Once you use it, you won't go back. Running in high resolution simply does NOT address the problem @ hand, despite what some people claim...There is an abundance of definitive proof that supports this claim...

I've said it before, and I'll say it again...I think a lot of 8500 owners have been disappointed with SmoothVision...not so much with any issues pertaining to quality, but with the horrible performance. I can tell you that I have spoken with more than 1 nVidia engineer who has jokingly said (to the effect)..."I honestly have NO idea what ATI is doing...There is NO reason why SV performance should be THAT low...There almost has to be something else wrong to get such bad performance."

And the more I think about the initial 8500 delays...SmoothVision not working...etc...I'm actually going to place the following bet...we'll see how close I come.

2 things will contribute to the R250's performance increase over R200, and it won't just be clock/memory (though that will obviously help)...

1. Fix the bug...yeah, you know the one.
2. Fix SmoothVision.

I'm really thinking that you're going to see a fairly significant increase in R250 FSAA performance over the R200...I'm basing this on not one shred of factual proof, so this is all speculation...But based on what is known of R250, I'm leaning towards ATI having fixed some hardware issues in contributing to the higher performance.

Although...now that I think about it, Anand even hinted that ATI wouldn't have anything to really contend with the GF4...So, who knows...We'll have to downclock an R250 to R200 levels to see what was fixed.
 