ATI Crossfire capable of 14X FSAA
russo121
27-May-2005, 00:43
According to The Inquirer: http://www.theinquirer.net/?article=23522
"...With its new technique ATI will be able to do 8x, 10x, 12x and even 14x FSAA. That surely sounds insane but if you want to have even better picture quality this is one choice that you will have. A 14 times Full Scene Anti Aliasing sounds like an interesting idea now when most of people are using 4X only...."
14x fsaa temporal ... 28x fsaa effective ? wozzaaaa hehe
Chalnoth
27-May-2005, 01:06
Gah, this is just getting f'in ridiculous.
Fuad is just talking out of his ass for the umpteenth time, and 14x + 2x temporal is not even remotely close to 28x FSAA effective (in fact, as you increase the number of samples, the effect of temporal AA will decrease).
Richard
27-May-2005, 01:23
Perhaps I'm oversimplifying things but I get the 8x, 10x and the 12x but how are they pulling off 14x? Btw, if this story is true I'd much rather have 2xSSAA on the master + 6xMSAA on the slave than higher MSAA modes.
Blazkowicz
27-May-2005, 01:30
I see it as 6x MSAA on the master and on the slave, with offset patterns, so that the combined output gives 6x MSAA plus 2x SSAA -> you end up with 12x AA.
If 10x and 14x are true then I have no idea :)
Ailuros
27-May-2005, 01:43
This one won't offer much of a performance difference but will let you use more then six times FSAA. That is the current limit with ATI cards while Nvidia can do some kind of 8X mode.
Sounds like the "AA-scaling mode" on simFusion; while geometry won't scale, AA sample density will. On a quad chip simFusion it's possible to reach 24x sparsed MSAA (4 chips * 6x).
What the end user gets is roughly single board R520 performance in such a case, just with twice as high AA samples. I've no idea what the odd sample amounts could stand for or if they're for real, yet 12x sparse MSAA (12*12) will be absolutely possible even on R4xx dual board configs.
VSA-100 mode :wink:
Chalnoth
27-May-2005, 01:51
I see it as 6x MSAA on the master and on the slave, with offset patterns, so that the combined output gives 6x MSAA plus 2x SSAA -> you end up with 12x AA.
Having both cards render the entire scene and combine via supersampling = BAD. It's just a horribly inefficient way of doing things. Not to mention you wouldn't be able to do this with two different video cards.
Ah well, knowing the source, the 14x was probably a mistake and should be 13x instead.
:D :D :wink:
Ailuros
27-May-2005, 02:07
Ah well, knowing the source, the 14x was probably a mistake and should be 13x instead.
:D :D :wink:
Why not 13.5x ? <shrugs>
Blazkowicz
27-May-2005, 02:17
Having both cards render the entire scene and combine via supersampling = BAD. It's just a horribly inefficient way of doing things. Not to mention you wouldn't be able to do this with two different video cards.
I agree, of course it's bad, but it's easy and you could afford to do that on lots of games, especially when you are CPU limited (not hard with such ridiculously high power, even in HL2..)
(yes, impossible with different cards).
also, it's a shame that nothing other than my Voodoo5 deals with the alpha blending issue :P
Wow! 14x is a big number.. supersmooth edges but still flickerin tree leaves :cry: :lol:
Dave Baumann
27-May-2005, 02:30
Wow! 14x is a big number.. supersmooth edges but still flickerin tree leaves :cry: :lol:
Think a little...
Richard Huddy recently mentioned to developers that they could use alpha-to-mask functionality to antialias alpha tests on Radeon 9700 and above, but they need to use a backdoor, probably because the API doesn't directly support it or something.
bloodbob
27-May-2005, 02:46
Think a little...
I'm obviously dumb: I can't think of how 14 samples would solve the problem, and I can't see how 14 samples would indicate that ATI isn't just doing MSAA any more.
:(
I don't recall seeing this here. Pcpop claiming they have ATI's Crossfire logo. . .
http://www.pcpop.com/doc/0/90/90386.shtml
Blazkowicz
27-May-2005, 03:32
Richard Huddy recently mentioned to developers that they could use alpha-to-mask functionality to antialias alpha tests on Radeon 9700 and above, but they need to use a backdoor, probably because the API doesn't directly support it or something.
does anyone know if WGF 1.0 will support that? (couldn't that be used for anti-aliasing of fonts?)
Ailuros
27-May-2005, 06:21
Richard Huddy recently mentioned to developers that they could use alpha-to-mask functionality to antialias alpha tests on Radeon 9700 and above, but they need to use a backdoor, probably because the API doesn't directly support it or something.
If it works in UT1/D3D w/o having to force it on the API then I can imagine what it is.
In such a case you get antialiasing on alpha tests virtually for free, but I can't imagine that they'd count it as AA samples.
Ailuros
27-May-2005, 06:23
Wow! 14x is a big number.. supersmooth edges but still flickerin tree leaves :cry: :lol:
You know I have something quite mean in mind, yet not directed at ATI but a major ISV....
12 geometry samples plus two texture samples :?:
P.S. While I have access to the information, I still haven't had time to read it, since I'm out here in Taiwan working. That said, I have seen a sample grid that showed mixed mode 6x SGMS + 2x OGSS, which is what I'm basing my query on.
bloodbob
27-May-2005, 06:49
12 geometry samples plus two texture samples :?:
By that logic the existing ATI 2x, 4x, 6x should be called 3x, 5x, 7x. And non-AA should be called 2x.
Bugger. "Mine is bigger than yours" all over again...
dizietsma
27-May-2005, 07:36
Wow! 14x is a big number.. supersmooth edges but still flickerin tree leaves :cry: :lol:
Think a little...
Dave has been reading the tea leaves about the tree leaves.
Remember, Matrox Parhelia was able to output 16x FAA (Fragment AntiAliasing). I wonder why no one implements this method... It works where it's supposed to. That's on edges of objects only.
Chalnoth
27-May-2005, 07:58
Remember, Matrox Parhelia was able to output 16x FAA (Fragment AntiAliasing). I wonder why no one implements this method... It works where it's supposed to. That's on edges of objects only.
Because it wasn't able to detect all edges all the time.
I think Dave is hinting at the fact that a higher AA sampling rate will be achieved via supersampling
Chalnoth
27-May-2005, 07:59
I agree, of course it's bad, but it's easy and you could afford to do that on lots of games, especially when you are CPU limited (not hard with such ridiculously high power, even in HL2..)
Except you could do the exact same thing on a single card with much higher efficiency. So using multiple cards to make for more FSAA samples would just be stupid.
Perhaps I'm oversimplifying things but I get the 8x, 10x and the 12x but how are they pulling off 14x?
Is it really that hard to put two and two... erm, six and eight together? ;)
Wow! 14x is a big number.. supersmooth edges but still flickerin tree leaves :cry: :lol:
Think a little...
But it's not smooth below 4x! :D
Richard Huddy recently mentioned to developers that they could use alpha-to-mask functionality to antialias alpha tests on Radeon 9700 and above, but they need to use a backdoor, probably because the API doesn't directly support it or something.
Recently? Well, that's funny, because it's the exact same thing they touted almost three years ago (http://www.beyond3d.com/reviews/ati/radeon9700pro/index.php?page=page4.inc). Yes, DirectX doesn't support alpha to coverage, so you need some kind of backdoor. OpenGL does support it, but I'm still not able to make my a2c samples (that ran flawlessly on my old GF3) run on my 9600/9800.
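For anyone who hasn't met the term, alpha to coverage can be sketched roughly as follows. This is only a simplified illustration, not how any ATI or NVIDIA chip actually does it: the names `alpha_to_coverage` and `resolve` are made up here, and filling the lowest mask bits first is an assumption (real hardware typically dithers the mask).

```python
def alpha_to_coverage(alpha: float, num_samples: int) -> int:
    """Turn a fragment's alpha into a coverage bitmask.

    Roughly alpha * num_samples bits are set, so a half-transparent
    texel on a 6x MSAA buffer covers 3 of the 6 subsamples.
    Assumption: bits are filled from the lowest upward (no dithering).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    covered = int(alpha * num_samples + 0.5)  # round half up
    return (1 << covered) - 1


def resolve(mask: int, num_samples: int, fg: float, bg: float) -> float:
    """Average the subsamples: covered ones take fg, the rest keep bg."""
    covered = bin(mask).count("1")
    return (covered * fg + (num_samples - covered) * bg) / num_samples


if __name__ == "__main__":
    mask = alpha_to_coverage(0.5, 6)
    print(f"mask={mask:06b} resolved={resolve(mask, 6, 1.0, 0.0)}")
```

The point is that the alpha test stops being a hard on/off decision per pixel: the MSAA resolve ends up blending the leaf edge the same way it blends a polygon edge.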
bloodbob
27-May-2005, 08:43
I've worked out how they do it: they get 14 cards, stick them all in one machine and hook them up together!
incurable
27-May-2005, 08:49
Ah well, knowing the source, the 14x was probably a mistake and should be 13x instead.
:D :D :wink:
I'm thinking 6 extreme samples ... 8)
2senile
27-May-2005, 08:56
I think Dave is hinting at the fact that a higher AA sampling rate will be achieved via supersampling
I'd like to think you are correct.
Sadly, I'm just a simple user who wants my games to look the best they can & have no idea of the difficulties involved in getting there.
All I "know" is that SS looked better to me.
bloodbob
27-May-2005, 08:59
Okay so I think we all understand it's 2x 6X MSAA samples combined using different texture co-ordinates.
I'm just wondering how many people are gonna hate this because of the blurring of textures that are aligned to samples in screen space (AKA blurry writing).
Chalnoth
27-May-2005, 09:28
I'm just wondering how many people are gonna hate this because of the blurring of textures that are aligned to samples in screen space (AKA blurry writing).
That shouldn't happen with supersampling if the game's designed properly.
Ailuros
27-May-2005, 10:42
Perhaps I'm oversimplifying things but I get the 8x, 10x and the 12x but how are they pulling off 14x?
Is it really that hard to put two and two... erm, six and eight together? ;)
While 6*2 = 12....
8*2 = 16
In order to get to 14x you'd need 7*2, which shouldn't be impossible either. We just haven't seen an odd number of AA samples so far.
Ailuros
27-May-2005, 10:48
8 + 6 = 14
I don't get it :?: *sigh*
Demirug
27-May-2005, 10:52
2*4+2*1 = 10
2*6+2*1 = 14
Yeah, that's better, it needs a min common denominator to properly compose and display the final image.
Ailuros
27-May-2005, 11:01
(2*4)+(2*1) = 10
(2*6)+(2*1) = 14
(small addition otherwise I get weird results...)
Let me get this straight 2*6 stands for 6x MSAA from each board + 2xSSAA? If yes, geometry isn't scaling is it?
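The arithmetic being traded back and forth here can be written out as a small sketch. It assumes, as the posts above suggest but nobody has confirmed, that the advertised number is the MSAA samples summed across both boards, plus one texture (supersample) position per board when the boards use offset texture sample centres. The function and parameter names are invented for illustration.

```python
def crossfire_aa_label(msaa_per_board: int,
                       boards: int = 2,
                       offset_texture_samples: bool = False) -> int:
    """Return the marketing 'Nx' number under the assumed naming scheme.

    geometry samples = boards * msaa_per_board      (e.g. 2*6 = 12)
    texture samples  = boards * 1 when each board shades at its own
                       offset texture sample centre (e.g. 2*1 = 2)
    """
    geometry_samples = boards * msaa_per_board
    texture_samples = boards * 1 if offset_texture_samples else 0
    return geometry_samples + texture_samples


if __name__ == "__main__":
    for msaa in (4, 6):
        for with_ss in (False, True):
            label = crossfire_aa_label(msaa, offset_texture_samples=with_ss)
            suffix = " + 2xSSAA" if with_ss else ""
            print(f"2 boards x {msaa}x MSAA{suffix} -> {label}x")
```

Under that assumption the four combinations come out as 8x, 10x, 12x and 14x, which is exactly the list in the Inquirer piece.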
Demirug
27-May-2005, 11:25
(2*4)+(2*1) = 10
(2*6)+(2*1) = 14
(small addition otherwise I get weird results...)
Let me get this straight 2*6 stands for 6x MSAA from each board + 2xSSAA? If yes, geometry isn't scaling is it?
Sorry, I have already written too much.
But think about this: if you need double the pixel shader power for 2xSSAA, do you need vertex/geometry power scaling?
silence
27-May-2005, 12:07
so INQ is right? :shock:
digitalwanderer
27-May-2005, 14:10
I think Dave is hinting at the fact that a higher AA sampling rate will be achieved via supersampling
I didn't think ATi used supersampling. :?
The_Wolf_Who_Cried_Boy
27-May-2005, 14:29
Remember, Matrox Parhelia was able to output 16x FAA (Fragment AntiAliasing). I wonder why no one implements this method... It works where it's supposed to. That's on edges of objects only.
Because it wasn't able to detect all edges all the time.
Wasn't that because it only took one z sample per pixel, with more samples the edge detection rate would improve wouldn't it?
I'm curious to know as well, why was Matrox the only one to go down this path, was it too expensive/inefficient in terms of required transistors?
trinibwoy
27-May-2005, 14:42
Do NDA's expire at the same time that products are officially announced?
digitalwanderer
27-May-2005, 14:49
Do NDA's expire at the same time that products are officially announced?
Generally, but most NDA's expiration date is also covered by said NDA so those bound to it can't tell you when it expires. :?
My fave thing is when the NDA expires and they push the launch back a week or so. That intervening interval gets very tempting for people to tease and leak tidbits.
It's probably also why we get so much more detailed info the closer we get to product launch. ;)
Chalnoth
27-May-2005, 14:55
Wasn't that because it only took one z sample per pixel, with more samples the edge detection rate would improve wouldn't it?
The edge detection didn't have anything to do with z testing from what I can tell.
Do NDA's expire at the same time that products are officially announced?
Some NDA's are permanent (i.e. even years after the product is released you're not allowed to talk about it)...
I think Dave is hinting at the fact that a higher AA sampling rate will be achieved via supersampling
I didn't think ATi used supersampling. :?
Well, they haven't been. But as you probably also know, there has been a pretty vocal minority clamoring for it in their own community. And my memory is that they've been saying they could if they wanted to, they just had it fairly far down the list on their resource priorities. So maybe they saw an opportunity to knock off a couple chickadees at one stroke.
digitalwanderer
27-May-2005, 15:25
Some NDA's are permanent (i.e. even years after the product is released you're not allowed to talk about it)...
Really? :shock: Do you demand extra in your contract depending on how demanding the NDA is?
Dave Baumann
27-May-2005, 15:49
Again, you're not thinking about how two boards can potentially operate together (do a search on the site, I've discussed it before).
cloudscapes
27-May-2005, 15:54
As I mentioned on EB, I think they should invest time in fixing AA for textures using an alpha-test.
digitalwanderer
27-May-2005, 15:55
do a search on the site, I've discussed it before
Are you talking about this one?
http://www.beyond3d.com/forum/viewtopic.php?p=446029#446029
It's technically not pure multisampling if two chips are rendering the same tile, since they will process vertices twice, and fetch textures twice. If I did the same thing on a single chip (render an MSAA tile twice, the second with altered sample positions, then combine) you'd call it hybrid-SS/MS. It's an MSxSS method, combining, say, two or more 6xMSAA buffers. I don't see the large gain or any inherent difference in scalability.
Vertices are processed, up to the clipping level at least, per chip on each system regardless of MSAA or not; the only time this isn’t the case (on any solution that we’re aware of) is AFR.
As far as the application is concerned it is behaving exactly the same as an MSAA solution, it's just able to go to more samples than can natively be achieved by a single chip - this would be of benefit in a consumer implementation. All of this is catered for at the hardware level and there is a transparency between what is happening on one chip with 4x MSAA and what is happening on 4 with 24x MSAA – the subsample distribution pattern logically works in the same way.
But this doesn’t, of course, preclude a mode whereby the texture subsample center is offset per chip rendering that tile to achieve the same benefits of mixed MS+SS AA.
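To make the "two boards, offset sample positions, then combine" idea concrete, here is a minimal sketch. It is not ATI's compositing logic: the two 6-sample patterns on a 12x12 subpixel grid are hypothetical, and the edge test is a plain half-plane chosen for the example. It only shows that averaging the two boards' resolved coverage gives the same result as one 12-sample resolve.

```python
from fractions import Fraction

# Hypothetical 6-sample patterns on a 12x12 subpixel grid (assumed values).
# Board B uses positions that board A does not, i.e. an offset pattern.
BOARD_A = [(1, 5), (3, 11), (5, 1), (7, 9), (9, 3), (11, 7)]
BOARD_B = [(0, 8), (2, 2), (4, 6), (6, 10), (8, 0), (10, 4)]


def coverage(pattern, inside):
    """Fraction of a pattern's samples that fall inside the primitive."""
    hits = sum(1 for (x, y) in pattern if inside(x, y))
    return Fraction(hits, len(pattern))


def combine(cov_a, cov_b):
    """Blend the two boards' resolved results with equal weight."""
    return (cov_a + cov_b) / 2


if __name__ == "__main__":
    def left_of_edge(x, y):
        # A vertical polygon edge cutting the pixel at subpixel column 5.
        return x < 5

    cov_a = coverage(BOARD_A, left_of_edge)
    cov_b = coverage(BOARD_B, left_of_edge)
    print("board A:", cov_a)
    print("board B:", cov_b)
    print("combined:", combine(cov_a, cov_b))
    print("12-sample reference:", coverage(BOARD_A + BOARD_B, left_of_edge))
```

Each board alone can only report coverage in steps of 1/6, while the combined image resolves the same edge in steps of 1/12. The texture side (offsetting the texture sample centre per chip) is not modelled here.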
It'll be interesting to see how restricted the list of games supporting "MS+SS AA" is.
Also, I wonder if Crossfire SS will only work with R520-based cards.
Jawed
Richard
27-May-2005, 16:29
Perhaps I'm oversimplifying things but I get the 8x, 10x and the 12x but how are they pulling off 14x?
Is it really that hard to put two and two... erm, six and eight together? ;)
Nope, except I was under the impression that 6xAA was the highest you could get now. Considering crossfire will work with R4xx cards and all.
Nope, except I was under the impression that 6xAA was the highest you could get now. Considering crossfire will work with R4xx cards and all.
Dave's and demirug's comments imply that I was wrong on that point.
They are naming convention cock-up modes. (Although they are done to make it a little easier for the end user to understand, which is understandable to some degree.)
Demirug
27-May-2005, 18:29
It'll be interesting to see how restricted the list of games supporting "MS+SS AA" is.
Also, I wonder if Crossfire SS will only work with R520-based cards.
Jawed
How many games have a problem with nVidias hybrid AA modes?
I got myself a Dell Inspiron 9300 which has a Geforce 6800 Go in it. I will say that ATI's AA is definitely superior to NV's, and that the 8X AA (SS+MS) mode is really a sham overall. It's VERY slow (plain 4X MSAA has relatively little performance impact) and doesn't improve visuals much. NV's 4X AA is less capable than ATI's 4X. How NV let themselves get beaten in AA with two generations is beyond me...
I also found that by moving from 'Quality' mode to 'High Quality' image settings that some bad AF artifacts/shimmering went away in KOTOR. I wouldn't call settings that do that anything remotely like 'Quality'. It's pretty sad for a 6x00 series to have to resort to that I'd say. I don't remember ATI's cards doing it, even though we all know they have optimizations too.
HL2 has annoying AF shimmering even with HQ mode. So, I wonder if they have some forced app settings there, or if there's something about the game or that particular texture. I was down on the railroad tracks on the gravel ground. It's early in the game. Just got pistol.
Dave Baumann
27-May-2005, 18:51
How many games have a problem with nVidias hybrid AA modes?
I doubt there are many new titles that have issues with SSAA now, but certainly older titles would - IMO the reason they don't have issues on newer boards is because of the unified driver, which makes it very easy for them to attach the compatibility list to SSAA modes across all new and old boards. ATI don't do SSAA on single boards because their "UDA" started with R300, meaning the SSAA compatibility fixes they had for R200 were never brought across.
However, when we think about dual boards there are reasons why these type of issues go away. Think of why Voodoo 5 operated an SSAA mode but never had any compatibility issues.
Demirug
27-May-2005, 19:27
Dave, my own research about this problem so far leads me to believe that you don't need special application-dependent fixes for SSAA. There are some special cases (like already transformed vertices) you need to take care of, but this is not application specific.
I fully understand why you bring Voodoo 5 back into the game now, but maybe we should think about another thing too. In the Voodoo timeframe we didn't have shader-driven fullscreen effects. Today more and more games use this kind of effect. How well will a copy-to-texture downfilter work with a Voodoo 5 like SSAA solution?
Ailuros
27-May-2005, 19:35
HL2 has annoying AF shimmering even with HQ mode. So, I wonder if they have some forced app settings there, or if there's something about the game or that particular texture. I was down on the railroad tracks on the gravel ground. It's early in the game. Just got pistol.
I guess you're rather describing alpha test aliasing; can be cured with Supersampling, ideally with 4x samples or more.
Pick a lower resolution than usual and enable the 16x AA mode via RivaTuner, which is a combination of 4xRGMS + 4xOGSS. Quite slow, but alpha test aliasing gets cured more effectively in HL2.
digitalwanderer
27-May-2005, 19:47
How NV let themselves get beaten in AA with two generations is beyond me...
In fairness to nVidia though, it's a HELL of a lot better than the FX generation AA.
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
wireframe
27-May-2005, 20:21
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
trinibwoy
27-May-2005, 20:33
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
Haha.
trinibwoy
27-May-2005, 21:00
Can someone refresh my memory on exactly why R300+ is capable of 6xMSAA and NV3x+ is only capable of 4x - something about number of passes through the ROPs right?
silence
27-May-2005, 21:01
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
depends on price.... :wink:
digitalwanderer
27-May-2005, 21:06
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
From my perspective also, I'm still AGP single-card here too. :?
But for the people who are considering dual-card solutions it'll be a big plus for ATi over nVidia....if ATi's implementation works/isn't buggy/doesn't have some hidden surprises/etc.
It's too early to call it, I'm not saying what WILL happen....I'm just putting my predictions in I guess. Sorry if it sounded like more than that, I'm having an awful run of days here. :oops:
Dave Baumann
27-May-2005, 21:13
Can someone refresh my memory on exactly why R300+ is capable of 6xMSAA and NV3x+ is only capable of 4x - something about number of passes through the ROPs right?
Well, the number of passes through the ROP is likely also a factor of the buffer and sample compression you have on chip for this. The other reason is that NVIDIA don't use as high a resolution sparse sample grid - ATI has a subsample accuracy that can sample from 144 (12x12 grid) different sample locations per pixel. NVIDIA made an alteration to the sample grid for 4x FSAA and although it's rotated I don't think they have altered the number of positions it can sample from.
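As a rough sketch of what the 12x12 grid buys you: 12 * 12 gives the 144 candidate positions per pixel, and a sparse pattern picks samples so that no two share a row or a column. The patterns below are hypothetical, not ATI's actual sample positions, and reading "sparse" as "unique row and column per sample" is an assumption.

```python
GRID = 12  # subsample positions per axis, per the 12x12 figure above


def all_positions(grid=GRID):
    """Every candidate subsample location inside one pixel."""
    return [(x, y) for x in range(grid) for y in range(grid)]


def is_sparse(pattern):
    """True if no two samples share a column or a row."""
    xs = [x for x, _ in pattern]
    ys = [y for _, y in pattern]
    return len(set(xs)) == len(xs) and len(set(ys)) == len(ys)


def vertical_edge_levels(pattern):
    """Distinct coverage values a near-vertical edge can produce as it
    sweeps across the pixel: one per distinct sample column, plus zero."""
    return len({x for x, _ in pattern}) + 1


# Hypothetical 6x patterns: one sparse, one ordered 3x2 grid.
SPARSE_6X = [(1, 5), (3, 11), (5, 1), (7, 9), (9, 3), (11, 7)]
ORDERED_6X = [(2, 3), (6, 3), (10, 3), (2, 9), (6, 9), (10, 9)]

if __name__ == "__main__":
    print("candidate positions:", len(all_positions()))
    for name, pattern in (("sparse", SPARSE_6X), ("ordered", ORDERED_6X)):
        print(name, "sparse?", is_sparse(pattern),
              "vertical-edge levels:", vertical_edge_levels(pattern))
```

With the same six samples, the sparse layout gives a near-vertical edge seven possible coverage steps where the ordered layout gives four, which is why a finer placement grid matters more as the sample count goes up.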
digitalwanderer
27-May-2005, 21:20
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
On further reflection I gotta point out that the reason most "fervent ATi supporters think using dual cards is absolutely stupid and pointless" is that there is no need/huge advantage to it yet.
If you could show them an advantage to having dual cards I think they might reconsider their opinion, and 24xAA is definitely looking to be an advantage.
Just a thought. ;)
trinibwoy
27-May-2005, 21:29
Can someone refresh my memory on exactly why R300+ is capable of 6xMSAA and NV3x+ is only capable of 4x - something about number of passes through the ROPs right?
Well, the number of passes through the ROP is likely also a factor of the buffer and sample compression you have on chip for this. The other reason is that NVIDIA don't use as high a resolution sparse sample grid - ATI has a subsample accuracy that can sample from 144 (12x12 grid) different sample locations per pixel. NVIDIA made an alteration to the sample grid for 4x FSAA and although it's rotated I don't think they have altered the number of positions it can sample from.
Oh, didn't know about the difference in grid resolution. So essentially, ATi has better sample compression so a third run through the ROP's is not as much of a bandwidth hit as it would be on Nvidia's cards? Also, what determines how many AA samples can be output per clock?
Chalnoth
27-May-2005, 21:30
because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
Which may well change once ATI's multi-card solution is announced and available.
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
Did you consider that maybe the point of the exercise was to give them a reason to change their mind? :wink:
wireframe
27-May-2005, 22:15
That's why I'm all interested in their new AA, hopefully they got a clue and got it right 'cause it sounds like ATi is getting ready to hit an out-of-the-park homerun in image quality.
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
Did you consider that maybe the point of the exercise was to give them a reason to change their mind? :wink:
Not really, and this is also why I am mostly incapable of taking sides: I don't have a mind. ;) (no brain, no ice-cream headache! (for some reason I am expecting someone to respond to this particular part of my post))
I'm just having a bit of fun with the flag waving loons. However, one thing I thought of reading this and that I do not particularly like is the thought of features (as opposed to performance) being "SLI" dependent. Sure, it's easy to argue that more performance allows more samples in this case and more options/features is always better, barring some huge sacrifice to get them, but I really think that if an IHV is going after a certain feature, they should do so in the single unit (ie: keep their vision atomic to the single board configuration). Even on the performance scaling side, I would find it repulsive if one IHV decided that "SLI" was necessary for satisfactory high-end performance. I think it's great as an over-the-top solution for those willing to throw more bucks for some more performance, but I don't like the thought of the innocent being forced into a "buy two" mentality to get what the ad campaigns promise.
Wow...I added some majestic blah-blah to something I intended to only be the first paragraph. Oh well.
Chalnoth
27-May-2005, 22:24
More importantly, using multiple cards for different samples within a pixel would just be stupid to do performance-wise.
However, one thing I thought of reading this and that I do not particularly like is the thought of features (as opposed to performance) being "SLI" dependent. Sure, it's easy to argue that more performance allows more samples in this case and more options/features is always better, barring some huge sacrifice to get them, but I really think that if an IHV is going after a certain feature, they should do so in the single unit (ie: keep their vision atomic to the single board configuration). Even on the performance scaling side, I would find it repulsive if one IHV decided that "SLI" was necessary for satisfactory high-end performance. I think it's great as an over-the-top solution for those willing to throw more bucks for some more performance, but I don't like the thought of the innocent being forced into a "buy two" mentality to get what the ad campaigns promise.
Wow...I added some majestic blah-blah to something I intended to only be the first paragraph. Oh well.
And I've been saying exactly the opposite --that SLI only makes sense if it can do features/IQ beyond current single-card settings. :lol: I guess you're more of a populist than I am. :)
I'm also assuming that the two (single card vs mvp) climb together in performance & features from generation to generation, rather than say an IHV start artificially/marketing-based knocking functionality out of single-card solutions in an effort to force people into mvp. I suppose it could be a danger it could go that route, but hopefully competition moderates any such effort by either IHV.
Demirug
27-May-2005, 22:31
More importantly, using multiple cards for different samples within a pixel would just be stupid to do performance-wise.
Why? The only performance problem I am able to see at the moment is that a slower card will limit a faster one.
wireframe
27-May-2005, 22:49
And I've been saying exactly the opposite --that SLI only makes sense if it can do features/IQ beyond current single-card settings. :lol: I guess you're more of a populist than I am. :)
Well, I can see the validity of that point, but I see it as becoming very messy in practice. A company should have a vision and follow that vision and their products should be accessible. What they should never do is something like "this SONY VCR delivers twice the picture quality of competing brands*"
*Only when coupled with a SONY receiver and monitor supporting UltraVision Technology (TM).
I am not accusing anyone of such tactics in the 3D scene, but it would be a pity if it went in that direction. In some ways I don't even think how I feel about this makes sense because you should get more if you add a second card. I just don't like thinking about how it will be angled and promoted from the "SLI" side and people buying half of that, feeling cheated, and finally being suckered into buying a second one to get what they thought they would be getting in the first place. This is all for the uninitiated masses, of course. As a side point on that, I don't really understand the economics of SLI for the masses because I figure it would be very difficult to explain to an uninformed dad why he should buy you two of the same, whereas he may be willing to pay twice as much for a single card that he may somehow understand to be better. Yet, I see people online talking about their 6600GTs. Sure, bragging and all that, but this is what I don't like about this type of thing; the other side of the coin, so to speak. I completely understand an informed customer wanting two 6800 Ultra 512MB SLI even if it is very expensive. So, you can see that in some way I am contradicting myself because I don't fully buy into the performance scaling when it is in the middle - buying a second 6600GT later to improve performance as an economically sound plan - because I figure that by the time that need arises there will be needs beyond simple performance in terms of features. You'd be better off buying a 6800GT right off the bat or simply sell your 6600GT and buy the next gen "6600GT" when it's out as it would presumably be twice as fast.
My goodness...I was gonna tell you you were wrong, but now I have blah-blah-blahed some more instead. :P
Oh, didn't know about the difference in grid resolution. So essentially, ATi has better sample compression so a third run through the ROP's is not as much of a bandwidth hit as it would be on Nvidia's cards? Also, what determines how many AA samples can be output per clock?
trin, I had much the same questions recently. I think you'll find the answers in Dave's 6800U p/review, in the ROPs section. Well, most of them, as I don't remember reading about sample compression contributing to the potential number of AA passes (but then I didn't know AA required sample buffers, either, though it makes sense). I believe the ROP architecture determines the number of passes per clock, and limiting it to two might be for transistor savings (and thus maybe yields).
Well, if you can get more FSAA samples from Crossfire, thus improving IQ, I'm all for it!
More importantly, using multiple cards for different samples within a pixel would just be stupid to do performance-wise.
Why? The only performance problem I am able to see at the moment is that a slower card will limit a faster one.
The advantage of having two cards render two independent, slightly jittered images is that you can use RGSS. The disadvantage is that you either lose texture cache efficiency when you adjust the LOD or you don't get as good textures as you could.
Oh, didn't know about the difference in grid resolution. So essentially, ATi has better sample compression so a third run through the ROP's is not as much of a bandwidth hit as it would be on Nvidia's cards? Also, what determines how many AA samples can be output per clock?
Their sample compression should be about identical in compression ratio, at least for the modes supported.
Higher AA modes require the rasterizer to support a new sampling pattern, more bits for the coverage mask that have to be kept around throughout the whole pipeline, more space in the on-chip tilebuffer, more loops through the ROPs, a modified compression algorithm, modified downsampling, additional status bits here and there...
Overall, the changes might not seem like a lot, but it's not a must-have feature for NVidia and therefore probably somewhere at the end of the to-do list.
Chalnoth
28-May-2005, 02:15
Why? The only performance problem I am able to see at the moment is that a slower card will limit a faster one.
You also have issues with texture cache and a total lack of sharing geometry processing.
I'll say it again: if you can make a single card do the rendering for all of the subsamples, it'll be vastly more efficient.
trinibwoy
28-May-2005, 03:21
Pete & Xmas - appreciate the knowledge :)
They should focus on making MSAA work with HDR rendertargets instead of just increasing the number of samples on LDR surfaces, which IMO not many next-gen games will use
martrox
28-May-2005, 10:37
It would be a pity if this "out of the park" IQ requires more than one video card because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
On further reflection I gotta point out that the reason most "fervent ATi supporters think using dual cards is absolutely stupid and pointless" is that there is no need/huge advantage to it yet.
If you could show them an advantage to having dual cards I think they might reconsider their opinion, and 24xAA is definitely looking to be an advantage.
Just a thought. ;)
Absolutely, DW. Most of the problems I've had with SLI have to do with the fact that it's pointless to consider unless you go 6800GT/U. When a single 6800GT will give you pretty much everything for a lot less than 2 6600GTs, it's still a no-brainer. And all you get from SLI is a speed bump - IF it supports the game you want to play. The real question here is can ATI go nVidia much better on game compatibility. As a self-admitted graphics whore, it's enough for me to consider.......show me better IQ and I'll go there......
Chalnoth
28-May-2005, 10:42
Heh, "focus?" None of these things are remotely challenging to implement. It's about design decisions.
But mostly I agree, though, right now it's more important to get MSAA working at all with FP rendertargets than it is to get today's already good MSAA to look better.
Dave Baumann
28-May-2005, 10:53
Why? The only performance problem I am able to see at the moment is that a slower card will limit a faster one.
You also have issues with texture cache and a total lack of sharing geometry processing.
I'll say it again: if you can make a single card do the rendering for all of the subsamples, it'll be vastly more efficient.
Of course a single card will be more efficient, but on titles that are vastly CPU limited or have significant performance issues in multiple rendering due to RTT issues, what's the point of doubling the render power in those cases?
Ailuros
28-May-2005, 12:35
Of course a single card will be more efficient, but on titles that are vastly CPU limited or have significant performance issues in multiple rendering due to RTT issues, what's the point of doubling the render power in those cases?
I doubt a performance increase under current circumstances with RTT or other full screen effects is feasible; there might be a way but I'd figure it'd be quite complex to implement.
Definitely for those cases being able to scale AA sample density is a nice plus (if performance allows it of course). Highest EER bragging rights included :)
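As a rough illustration of the EER (edge equivalent resolution) point, here is a minimal sketch that counts the distinct horizontal and vertical sub-pixel offsets in a sample pattern. The two 6-sample patterns and the 12x12 grid are made up for the example and are not ATI's actual patterns; the only thing shown is that two offset sparse patterns can combine to 12x12, while reusing the same pattern on both cards gains nothing.

```python
def effective_edge_resolution(samples):
    """Return (distinct x offsets, distinct y offsets) of a sample pattern."""
    xs = {x for x, _ in samples}
    ys = {y for _, y in samples}
    return len(xs), len(ys)


# Hypothetical sparse 6-sample patterns on a 12x12 sub-pixel grid
# (illustrative only, not taken from any real hardware).
card_a = [(0, 4), (2, 10), (4, 0), (6, 6), (8, 2), (10, 8)]
card_b = [(1, 9), (3, 3), (5, 11), (7, 5), (9, 1), (11, 7)]

single = effective_edge_resolution(card_a)             # one card alone
offset = effective_edge_resolution(card_a + card_b)    # two offset patterns
same = effective_edge_resolution(card_a + card_a)      # same pattern twice

print(single, offset, same)
```

With these assumed patterns a single card gives 6x6, the offset pair gives 12x12, and duplicating one pattern stays at 6x6.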
Chalnoth
28-May-2005, 13:04
Of course a single card will be more efficient, but on titles that are vastly CPU limited or have significant performance issues in multiple rendering due to RTT issues, what's the point of doubling the render power in those cases?
Any program that has significant performance issues in rendering normally with multiple cards will also have significant performance issues with FSAA shared across the cards.
Dave Baumann
28-May-2005, 13:13
Any program that has significant performance issues in rendering normally with multiple cards will also have significant performance issues with FSAA shared across the cards.
The performance wouldn't be worse than a single card performance, you just get an increase in IQ (as opposed to nothing at all).
Chalnoth
28-May-2005, 13:40
The performance wouldn't be worse than a single card performance, you just get an increase in IQ (as opposed to nothing at all).
Yes it would. Even in an ideal case the performance would be somewhat worse: you still need to combine the results from the two cards (which will take some memory bandwidth), and you still need to duplicate the data to send to both video cards. In cases that are hard on multi-card rendering, the situation will be much worse.
And then, if you decide to improve the texture LOD (if you don't, the textures will get blurrier, as they did with the Voodoo5), you'll have some major texture cache coherency problems.
Edit:
Oh, and finally, if ATI is choosing to do this, they'll really need to re-evaluate their sample patterns. It would be rather silly to just use the sparse patterns they currently use on the separate cards.
trinibwoy
28-May-2005, 13:44
The performance wouldn't be worse than a single card performance, you just get an increase in IQ (as opposed to nothing at all).
Yes it would. Even in an ideal case the performance would be somewhat worse: you still need to combine the results from the two cards (which will take some memory bandwidth), and you still need to duplicate the data to send to both video cards.
And then, if you decide to improve the texture LOD (if you don't, the textures will get blurrier, as they did with the Voodoo5), you'll have some major texture cache coherency problems.
Chalnoth, ever consider that Dave has probably seen this thing in action and knows for a fact that performance is not lower in this mode?
Chalnoth
28-May-2005, 13:46
Chalnoth, ever consider that Dave has probably seen this thing in action and knows for a fact that performance is not lower in this mode?
Well, that statement has about the same merit as:
"Chalnoth, ever consider that Dave has probably seen the pigs, and the fact that they can fly?"
And the simple answer is no, it's just not possible. Depending on the settings used, the performance hit may be small, but it is not going to be zero.
trinibwoy
28-May-2005, 13:49
Well, that statement has about the same merit as:
"Chalnoth, ever consider that Dave has probably seen the pigs, and the fact that they can fly?"
Ok boss. :lol:
Dave Baumann
28-May-2005, 13:53
Yes it would. Even in an ideal case the performance would be somewhat worse: you still need to combine the results from the two cards (which will take some memory bandwidth), and you still need to duplicate the data to send to both video cards. In cases that are hard on multi-card rendering, the situation will be much worse.
Combination isn’t necessarily an issue - should you not be gaining performance due to CPU then this is probably not going to be much of an effect given the bandwidth being talked about. Duplication of data will be an issue if you are interface limited, but that will limit all multiboard situations.
And then, if you decide to improve the texture LOD (if you don't, the textures will get blurrier, as they did with the Voodoo5), you'll have some major texture cache coherency problems.
That affects SSAA, but not MSAA.
And the simple answer is no, it's just not possible. Depending on the settings used, the performance hit may be small, but it is not going to be zero.
The likelihood of it occurring in real life is slim since we are talking about theoretical cases.
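To put a rough number on the combination cost being argued over, here is a back-of-the-envelope sketch. It assumes one card ships one already-resolved frame to the other per displayed frame; the 1600x1200 resolution, 32-bit colour and 60 fps figures are illustrative assumptions, not anything stated about the actual hardware or its interconnect.

```python
def composite_traffic_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to ship one resolved frame per displayed frame."""
    return width * height * bytes_per_pixel * fps


# Assumed values: 1600x1200, 4 bytes per pixel, 60 frames per second.
traffic = composite_traffic_bytes_per_second(1600, 1200, 4, 60)
print(f"{traffic / 1e6:.1f} MB/s")  # prints "460.8 MB/s"
```

Under those assumptions the transfer is non-zero, as Chalnoth says, but small next to local memory bandwidth, which is roughly Dave's point.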
Chalnoth
28-May-2005, 14:48
And then, if you decide to improve the texture LOD (if you don't, the textures will get blurrier, as they did with the Voodoo5), you'll have some major texture cache coherency problems.
That affects SSAA, but not MSAA.
But if you're using multiple cards for the effect, it makes no sense to use the same texture coordinates for both cards. Supersampling is essentially implied here (or, rather, a combination between the two).
Dave Baumann
28-May-2005, 14:49
Well, didn't we just provide a reason between us?
Chalnoth
28-May-2005, 17:18
Well, didn't we just provide a reason between us?
No, because if you use the same texture LOD but use different texture coordinates, you will get similar performance to just one card doing the rendering. This will have the effect of reducing texture aliasing, but at the cost of some texture clarity. It's rather silly to waste that extra processing, if you ask me.
Regardless, even in this case, having the two cards themselves each use a mixed-mode FSAA rendering system and AFR or SFR (or supertiling) type rendering will allow for better texture LOD with a comparatively tiny performance hit, and the increased size of the z-buffer allows that to be compressed better, and there is the possibility of splitting the geometry load between the two cards, so no, it makes much more sense to not have the subsamples shared across multiple cards.
Blazkowicz
28-May-2005, 18:16
Any program that has significant performance issues in rendering normally with multiple cards will also have significant performance issues with FSAA shared across the cards.
The performance wouldn't be worse than a single card performance, you just get an increase in IQ (as opposed to nothing at all).
yes, a voodoo5 5500 doing FSAA 2x performs exactly like a voodoo4 doing no FSAA, and a voodoo5 6000 doing 4x performs like a voodoo5 5500 doing 2x.
even with render to textures/post-processing stuff unfriendly to NV SLI there still wouldn't be a performance loss, as each card does the same stuff on its own anyway?
Dave Baumann
28-May-2005, 18:48
having the two cards themselves each use a mixed-mode FSAA rendering system and AFR or SFR (or supertiling) type rendering will allow for better texture LOD with a comparatively tiny performance hit, and the increased size of the z-buffer allows that to be compressed better, and there is the possibility of splitting the geometry load between the two cards, so no, it makes much more sense to not have the subsamples shared across multiple cards.
You are not thinking along the right lines. Rules that exist on a single board don't change, think how and why those rules could possibly change when using two (hint: read back).
Colourless
28-May-2005, 19:05
yes, a voodoo5 5500 doing FSAA 2x performs exactly like a voodoo4 doing no FSAA, and a voodoo5 6000 doing 4x performs like a voodoo5 5500 doing 2x.
even with render to textures/post-processing stuff unfriendly to NV SLI there still wouldn't be a performance loss, as each card does the same stuff on its own anyway?
Yes, if it is setup properly. You send all the same data to both cards and just have differing sampling positions. Can even end up being a 'cheap' way to antialias render-to-texture targets (though there is a small chance that it might break with some titles that assume sampling positions with RTT).
Card A does the entire scene with one set of sampling positions.
Card B does the entire scene with a different set of sampling positions.
When buffer flip is done, Card B sends its completed frame to Card A and Card A combines then sends them to scanout.
The only trouble that can come is when attempting to lock the surfaces and read back from them. However the drivers should be able to handle that.
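The scheme described above can be sketched as a toy software model. Everything here is hypothetical (the scene, the sample offsets, the function names); it only shows that when each card resolves its own equally sized sample set and card A averages the two resolved frames, the result matches a single render using all samples.

```python
def inside(x, y):
    """Hypothetical scene: a half-plane bounded by a slanted edge."""
    return y < 0.37 * x + 0.2


def render(width, height, offsets):
    """One 'card' renders the whole frame and resolves its own sample set."""
    frame = []
    for py in range(height):
        row = []
        for px in range(width):
            hits = sum(inside(px + ox, py + oy) for ox, oy in offsets)
            row.append(hits / len(offsets))
        frame.append(row)
    return frame


def combine(frame_a, frame_b):
    """Card A averages its resolved frame with the one received from card B."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


# Made-up sub-pixel sample positions, different on each card.
offsets_a = [(0.25, 0.125), (0.75, 0.625)]
offsets_b = [(0.125, 0.75), (0.625, 0.375)]

frame_a = render(4, 4, offsets_a)   # card A: entire scene, pattern A
frame_b = render(4, 4, offsets_b)   # card B: entire scene, pattern B
combined = combine(frame_a, frame_b)
reference = render(4, 4, offsets_a + offsets_b)  # one card doing all samples
```

The equal-weight average is only valid here because both cards use the same number of samples; unequal sample counts would need weighting.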
Chalnoth
28-May-2005, 19:12
Yes, if it is setup properly. You send all the same data to both cards and just have differing sampling positions. Can even end up being a 'cheap' way to antialias render-to-texture targets (though there is a small chance that it might break with some titles that assume sampling positions with RTT).
Well, I suppose that is a good point. It will be more efficient at RTT, but you're still paying the cost of supersampling for something that we typically do with multisampling these days at a much lower performance hit. I still think that two cards + multisampling (with no shared FSAA) would give better performance, if an equivalent multisampling mode was available.
Demirug
28-May-2005, 19:30
Yes, if it is setup properly. You send all the same data to both cards and just have differing sampling positions. Can even end up being a 'cheap' way to antialias render-to-texture targets (though there is a small chance that it might break with some titles that assume sampling positions with RTT).
Well, I suppose that is a good point. It will be more efficient at RTT, but you're still paying the cost of supersampling for something that we typically do with multisampling these days at a much lower performance hit. I still think that two cards + multisampling (with no shared FSAA) would give better performance, if an equivalent multisampling mode was available.
Yes, MSAA should always be faster than SSAA.
If I want to antialias RTT I can do this with MSAA, too. The DX 9.0c API contains a solution for this.
I'm still unsure how well this 2 chip SSAA will work if the game uses a fullscreen postfilter effect.
trinibwoy
28-May-2005, 21:11
I still think that two cards + multisampling (with no shared FSAA) would give better performance, if an equivalent multisampling mode was available.
Big if.
No, because if you use the same texture LOD but use different texture coordinates, you will get similar performance to just one card doing the rendering. This will have the effect of reducing texture aliasing, but at the cost of some texture clarity. It's rather silly to waste that extra processing, if you ask me.
ATi will IMHO adjust texture LOD (ATi already did it for the DX8 Radeon series - Catalyst 4.1 or 4.2 change, comparison (http://www.pretaktovani.cz/text/obr2/catalisty1b.jpg))
SS is a very good solution for aliasing on alpha textures (http://www.beyond3d.com/previews/s3/s18/index.php?p=06) and it offers very nice textures (http://www.3dfx.cz/temp/screenshots/aa2_8xrgss.jpg) with adjusted texture LOD (screenshot with RGSS 8x enabled and no AF of course)
Another advantage is customizable sample positions, which can not be easily adjusted on today's single-chip solutions... So why are you so sceptical? I know this isn't the best way to implement SS, but it will be a possibility, not a must :-)
(sorry for my English)
Chalnoth
29-May-2005, 06:33
ATi will IMHO adjust texture LOD (ATi already did it for the DX8 Radeon series - Catalyst 4.1 or 4.2 change, comparison (http://www.pretaktovani.cz/text/obr2/catalisty1b.jpg))
Different scenario, though. If you're using multiple cards to do the supersampling, adjusting the texture LOD has a much larger performance impact.
bloodbob
29-May-2005, 07:03
Even with multiple chips you'd get that problem too. You'd need a unified cache.
Blazkowicz
29-May-2005, 08:06
with my voodoo5 I used to set the LOD to -1.5 (or even -2 but texture aliasing comes into play) in the drivers when using fsaa 4x. quite simple isn't it? no significant performance loss, but what a great image quality :shock: (basically like good anisotropic filtering)
for 2x I only used LOD -0.75 ; and -0.25 with nothing (but with mipmap dithering), as the default LOD is slightly too "high".
Ailuros
29-May-2005, 08:24
with my voodoo5 I used to set the LOD to -1.5 (or even -2 but texture aliasing comes into play) in the drivers when using fsaa 4x. quite simple isn't it? no significant performance loss, but what a great image quality :shock: (basically like good anisotropic filtering)
for 2x I only used LOD -0.75 ; and -0.25 with nothing (but with mipmap dithering), as the default LOD is slightly too "high".
VSA-100 had quite conservative LOD adjustments to start with.
Apart from that usually with Supersampling you can notice a LOD offset of -0.5 for 2x samples, -1.0 for 4x, -2.0 for 16x etc.
Beyond that whether you'll use AA or not, using LOD values below default "0" will impact performance and gradually introduce more and more side-effects on all accelerators.
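The offsets quoted above (-0.5 for 2x, -1.0 for 4x, -2.0 for 16x) fit the rule of thumb bias = -0.5 * log2(samples): N supersamples give roughly sqrt(N) times the sampling density per axis, and each doubling per axis corresponds to one mip level. A minimal sketch, with a hypothetical function name, assuming that rule:

```python
import math


def supersampling_lod_bias(samples):
    """Rule-of-thumb texture LOD bias for N-sample supersampling."""
    if samples < 1:
        raise ValueError("sample count must be at least 1")
    # sqrt(N) more samples per axis means log2(sqrt(N)) mip levels sharper.
    return -0.5 * math.log2(samples)


for n in (1, 2, 4, 16):
    print(n, supersampling_lod_bias(n))
```

By this rule the -0.75 (2x) and -1.5 (4x) settings mentioned earlier are more aggressive than the nominal offsets, which is consistent with the remark that going further brings texture aliasing into play.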
Different scenario, though. If you're using multiple cards to do the supersampling, adjusting the texture LOD has a much larger performance impact.
I wonder whether they actually need a LOD adjustment after all.
Even with multiple chips you'd get that problem too. You'd need a unified cache.
And what about VSA100-based systems? There isn't this problem (and performance hit is about 5%, as I remember). I'll try to find some info about CAE tropos or ES simFusion 6x00...
edit: Blazkowicz_: that's what I'm speaking about, I have the same experience :-)
Ailuros
29-May-2005, 09:09
Even with multiple chips you'd get that problem too. You'd need a unified cache.
And what about VSA100-based systems? There isn't this problem (and performance hit is about 5%, as I remember). I'll try to find some info about CAE tropos or ES simFusion 6x00...
edit: Blazkowicz_: that's what I'm speaking about, I have the same experience :-)
There are 3 detailed articles here on B3D about how the T-buffer exactly works on VSA-100. Way too many differences.
Cowboy X
29-May-2005, 16:35
because I think you'll find that the most fervent ATI supporters think using dual cards is absolutely stupid and pointless :P
Which may well change once ATI's multi-card solution is announced and available.
I was going to say the same thing .
There are 3 detailed articles here on B3D about how the T-buffer exactly works on VSA-100. Way too many differences.
The only relevant difference that's related to "how the T-buffer works" here is that each chip can output two samples, so with 2xAA you still have your "unified cache". The other differences are in the TMU, VSA-100 is not quad-based and likely has a different caching strategy.
Unknown Soldier
29-May-2005, 18:23
Chalnoth, ever consider that Dave has probably seen this thing in action and knows for a fact that performance is not lower in this mode?
Well, that statement has about the same merit as:
"Chalnoth, ever consider that Dave has probably seen the pigs, and the fact that they can fly?"
And the simple answer is no, it's just not possible. Depending on the settings used, the performance hit may be small, but it is not going to be zero.
You don't mind if I quote you :D
http://www.hothardware.com/viewarticle.cfm?articleid=689&page=3#update
Dave Altavilla states:
We tried to duplicate the screen shots that ATI supplied to us, and found that ATI seems to have misrepresented the micro-geometry detail of NVIDIA's 8X anti-aliasing method. The screen shots we took on our own with a GeForce 6800 GT, clearly show more detail than the screen shots distributed by ATI in their CrossFire presentation. Take a look for yourself...
We wandered around the same portion of the map ATI used for their screen shots at various resolutions with a GeForce 6800 GT installed into our test system and HL2 configured for high detail, and could not replicate ATI's screen shots. Even at lower resolutions, the same portions of the antenna atop the hotel shown in the CrossFire 14X AA screen shot are visible on the GeForce. And with the resolution turned up to 1600x1200, the detail only got better. We even tried to move towards and away from the hotel from different angles and couldn't get the portions or the antennas missing in ATI's screen shots to disappear.
So, while ATI's new 14X AA method enabled by CrossFire may provide more micro-detail than other AA methods that use fewer samples, which seems evident in the edge details on the hotel sign, the difference between NVIDIA's 8X AA and CrossFire's 14X AA as demonstrated by ATI don't seem to be indicative of actual in-game images when comparable detail settings are used.
Finally, we contacted ATi for comment on this issue that we discovered and they agreed they too could not reproduce the lack of detail that is represented in the NVIDIA screen shot they provided us above, although they assured there was no intention of misleading us with these competitive image quality comparisons. This isn't the first and probably won't be the last time we will be witness to over-zealous marketing efforts, whether accidental or intentional.
Hmm .... :oops:
Acert93
05-Jun-2005, 05:22
Dave Altavilla states: ATI screws up big time and gets caught red handed blah blah blah
Hmm .... :oops:
Yep, exactly why I look for 2 or 3 reviews on every product. I also get a little smirk when comparing benchmarks, same game, similar systems, slightly different choice in drivers (sometimes now), and how different benchmarkers get different results and have totally different conclusions.
As for the multi-gpu thing... meh. If I had money to burn maybe... but when you look at the cost of a beefed up PSU, new MB, multiple GPUs (that already run well over $300 each), heat issues, driver support, etc... I just do not get it.
I guess I am happier just getting a new GPU every 2 years. I would rather get new FEATURES than just the same features and a faster card on old technology. But if you have cash to burn... after you get your SLI/AMR/MVP or whatever mail me some :D
incurable
05-Jun-2005, 08:06
Just imagine the ruckus that would happen here had nVidia done a similar "mistake" ... :roll:
... and yet still, some people wonder why these fora are perceived as ATi fanboi territory by outsiders. :lol:
Hmm, I'm not really familiar with nvidia's >4xAA modes but, are they (supposed to be) comparing SuperAA to 4xMS*2xSS or 2xMS*4xSS ?
trinibwoy
05-Jun-2005, 20:59
Hmm, I'm not really familiar with nvidia's >4xAA modes but, are they (supposed to be) comparing SuperAA to 4xMS*2xSS or 2xMS*4xSS ?
They have both.
Chalnoth
05-Jun-2005, 23:30
I'm reasonably certain that you can't expose the 4xMS * 2xSS mode without a tweak program, though (at least, it was this way early-on....I haven't played around with these half-supersampling modes).
Randell
05-Jun-2005, 23:58
Correct, but the nHancer program makes setting up game profiles so easy (easier than via the nvidia control panel!), it's silly not to use them where useful :)
digitalwanderer
06-Jun-2005, 00:11
Correct, but the nHancer program makes setting up game profiles so easy (easier than via the nvidia control panel!), it's silly not to use them where useful :)
I'll vouch for nHancer in a heartbeat, it's now my proggy-o-choice for nVidia tweakage and an important tool in me geek arsenal.
Its profiles are pretty nice and a LOT better than nVidia's profile system, but I still think Radlinker has got the best per-game profile set-up around. (You just right click on the game icon and choose "Custom Options" and then have full control over all the D3D/openGL settings when you right-click on it next time.....heaven! )
But nHancer is a great program and should be included with every nVidia card sold. (Same as Radlinker should be included with every ATi card)