R350/NV35 Z Fillrate with FSAA

Chalnoth said:
FUDie said:
P.S. In fact, I think that NVIDIA was surprised by ATI. ATI had a WHQL certified DX9 driver ready for general release at the same time Microsoft released DX9 to the public... NVIDIA was months behind getting a DX9 driver out let alone certified.
That's not particularly impressive. ATI had hardware out for many months before DX9. Since the specs of DX9 clearly didn't change by a large amount from the beta, ATI had months of testing on final hardware before DX9 was released.
Not particularly impressive, eh? Except that NVIDIA didn't even release a DX9 driver for months for any of their parts, and they had just as much time to test their older GeForce 4 parts as ATI did. Meanwhile ATI released a DX9 driver for all Catalyst-supported products.

-FUDie

P.S. And, yes, it is very important to update your drivers to support newer DX versions, even if your hardware doesn't support the most interesting features... just ask any OEM.
 
NVIDIA's original plans were to update their AA implementation in the NV30. They didn't have the time to do so - the problem wasn't transistors, but time with the NV30.
But with the NV40, don't worry, they finally had the time to work on that. And they don't have many more transistors either: just 150M.

You're not trying to tell me here that your transistor count will not increase when moving from ordered grid to rotated/sparse grid, are you?

I'm not so sure what you mean with the time comment. I recall Vivoli stating long ago that NV30 would be their spring 2002 product. Now the most likely scenario is that it went through a re-design and the codename indicated something completely unrelated to what we have today.

Scratch that; it's my understanding that IHVs usually spend about 3 months, give or take, on early chalkboard design. Now try to convince me that, while they had time since NV25 to implement the filter-on-scanout trick and the hybrid SS/MS modes, and to evolve the ROPs/looping trick from 3dfx's technology books, it would have taken an ungodly amount of time to also get 4x RGMS in there. Something just isn't adding up, and not just since NV3x, but since NV25 to be even more specific.

150-155M is the normal playground I would suspect for NV40 too. But there is a transistor count increase if they were to hypothetically move to, let's say, 8x sparse grid sampling (8*8).
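For illustration, here's a minimal sketch of why the grid type matters at the same sample count (the rotated-grid offsets below are made up for the example, not any IHV's actual pattern): on a near-vertical edge only the distinct horizontal sample offsets produce different coverage levels, so an ordered 2x2 grid gives 3 gradations while a rotated/sparse grid gives 5. Storing the less regular sample positions and the matching downfilter is presumably part of where the extra transistors would go.

```python
# Hypothetical illustration: ordered vs rotated 4x sample grids within one pixel.
# On an edge perpendicular to the x axis, only the distinct x offsets matter,
# so the ordered grid can only show 3 coverage steps while the rotated grid shows 5.

ordered_4x = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
rotated_4x = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]

def coverage_steps(samples, axis=0):
    """Distinct coverage levels for an edge perpendicular to the given axis."""
    distinct_offsets = len({s[axis] for s in samples})
    return distinct_offsets + 1   # 0..N samples covered -> N+1 levels

print("ordered grid:", coverage_steps(ordered_4x))   # 3 levels
print("rotated grid:", coverage_steps(rotated_4x))   # 5 levels
```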

Hmm, what's the MO bus? And I must admit I'm having a few problems figuring out the advantage of what you PMed me...

It's not an advantage at all times. You have the sources; just ask around. I just don't know you ;)

As for finding out what the MO bus stood for and how useful it was: hasn't there been a single change in terms of caching and/or memory optimisations since NV20? I doubt you'll find the old patents; if you do, they're an interesting read (I don't have any links anymore).

Still, on the GigaPixel influence front, I'd love to see a part which would be both a TBDR and an ILDP. Doubt we'd see such a thing for a while though - NV50 for ILDP, and if we're lucky, NV60 or NV70 for both at the same time. Now that'd be a mighty impressive part!

If there is a chance that we'll see a fully featured high-end TBDR, then you know in which direction to look first, and hopefully a lot earlier than NV50 and its successors.

Nice, 150M makes a lot more sense than the (stupid) 300M rumor. But nVidia has to use the transistors significantly more efficiently than they did with the NV35. With only 15% more transistors they need to add better AA, PS & VS 3.0, faster shader performance (per clock) and more fillrate.

Is that just it? :rolleyes:

You may add all the checkbox features functioning as they should; you know, those that lie under the "present yet not currently exposed in drivers" category.
 
I ain't gonna insist on the memory controller bit, because I must admit I can't be 100% sure of it. That source has got a *lot* of info, he's very technical, but he's got that bad habit of confusing fact and personal speculation. I was trusting him on that, but considering I might be making a mistake there, I'll let this issue rest. Sorry for insisting on it.

As for the NV30 design goals: no, I'm not talking functionality, I'm talking performance. Although there are a few functionality things they didn't manage to get working, like texturing in the Vertex Shader and the Programmable Primitive Processor.

One of the very first NV30 designs was 16x1 (maybe 8x2 or 16x0 rather, but that's hard to confirm), multichip (I doubt that means putting multiple full chips together, more likely separating the VS & PS, but I'm not sure), with a 500MHz core clock. That design probably was full FP32 too; I'm pretty convinced FP16 is a consequence of latency problems they only realized later on.

One of the first NV30 designs to tape out was 6x2 / 12x0 (compared to the current 4x2 / 8x0 design), but that tape-out didn't quite work out well. No idea if they had already realized the latency problems they had with FP32 at that point.

So take the NV30. Increase its fillrate by 50% (6x2 / 12x0), remove the register usage performance hits (or at least minimize them a *lot*) - and you've practically got FP32 speed equivalent to the speed of the R300's FP24, with a few differences here and there (R3xx can do Scalar + Vec3, but the NV3x can do cos/sin in one cycle, and so on).
Oh, and add all-new AA algorithms, probably nearly as good as, or as good as (if not better than), ATI's R300 ones.

Of course, it couldn't win in AA tests yet because it'd still have 16GB/s of memory bandwidth (they did underestimate ATI on that front, I admit), but the original designs would probably win shading tests IMO.
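To put rough numbers on the fillrate and bandwidth figures above - a back-of-the-envelope sketch using the pipe counts and clocks quoted in this post plus the shipped NV30's 128-bit, 500MHz DDR memory, nothing more:

```python
# Back-of-the-envelope arithmetic for the designs described above.
# pixel fillrate = pipelines * core clock; bandwidth = bus width * effective memory clock.

def pixel_fillrate_mpix(pixel_pipes, core_mhz):
    # single-textured pixel fillrate in Mpixels/s
    return pixel_pipes * core_mhz

def bandwidth_gbs(bus_bits, effective_mhz):
    # bus width in bits, memory clock in effective (DDR) MHz
    return bus_bits / 8 * effective_mhz / 1000

print(pixel_fillrate_mpix(4, 500))   # shipped 4x2 NV30: 2000 Mpix/s
print(pixel_fillrate_mpix(6, 500))   # the 6x2 tape-out: 3000 Mpix/s, i.e. +50%
print(bandwidth_gbs(128, 1000))      # 128-bit @ 500MHz DDR: 16.0 GB/s
```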

Finally:
Maybe what you should say is that NVIDIA bit off more than they could chew...
Yes I should, thank you, I'd forgotten that expression even existed, hehe.
And no, that isn't a sign of superiority - I didn't say the idea was good, in fact, I even said they were overambitious. The initial design, should they have managed it, would have been excellent. But they didn't, precisely because they bit off more than they could chew.


Uttar
 
FUDie said:
Not particularly impressive, eh? Except that NVIDIA didn't even release a DX9 driver for months for any of their parts, and they had just as much time to test their older GeForce 4 parts as ATI did. Meanwhile ATI released a DX9 driver for all Catalyst-supported products.
If I remember correctly, the DX9 interface was available for all of the 40.xx drivers. The first release of those was August 29, 2002. The first ones that were WHQL-certified were the 40.72's released November 8, 2002.

nVidia driver archive:
http://www.nvidia.com/object/winxp-2k_archive.html

Anyway, none of the literature I could find at nVidia's site would tell me definitively which drivers had the DX9 interfaces and which did not. So, if you have the ability (and want) to test them, they're still available for download...
 
MfA said:
Uttar, I wouldn't say smarter... most of us just learned to stop accommodating people like you a long time ago while dealing with Dave :)
Er... which Dave? When it comes to something like this, I tend to confuse the two.

:) ;)
 
Ailuros said:
Ah, but your response fails in one key point.
the sooner some of you daydreamers realize that simple fact the better for you. NV20 wasn't the ideal product either, but it didn't come without its own little rudiments too.

You are talking with someone who did believe the Revenge hoax :rolleyes:
Take this advice, stop wasting your time.

http://wakeup.to/revenge :D
 
Ailuros said:
I'd love to say "prove it" but I'm tired of this nonsense to be honest. Why did they need special addressing for binning then with Sage2? Wait, I don't want to know...

Wha? Binning?

And the deferred rendering was to come with the wonderfully nebulous Mojo, which nobody is qualified to discuss except everyone's best friend Mr. Sellers. 8)

And when did I say that the false trilinear modes are on anything other than NV3x? I was taking a dig at the embarrassing screwup that is NV3x. I'd probably also prefer Quality AF to High Performance AF if I had an NV3x as well =)

In said paragraph you clearly mentioned the GF3; there was no sign of an FX in there.

My exact words:

I said:
Absolutely. It would've definitely looked questionable compared to 'real' AF of today. It was, above everything else, an LOD trick used in conjunction with MSAA, not really AF at all... but it would've been damned impressive back in Rampage's time frame, especially compared to GeForce3's horrid AF performance. It would, however, have maintained full trilinear *cough* and would've looked *comparable* at equivalent "levels" of AF.

I mentioned GF3 in terms of performance. I now realise that my statement was a bit badly worded, and I should've probably said something more like "*cough*gffx*cough*", sorry about that. I didn't mean to imply that GF3 was doing anything odd... in fact, its AF is more or less what AF should be, which is why it's so low-performing 8)

You're probably alone in claiming that the R100 had tri-AF, but let's just leave it at that.

I say again: Prove me wrong. Show me anything that states or shows that Radeon R6 is incapable of trilinear AF.

I don't see why it should be incapable of it though. R200 is a different chip, and a new design. Why would it carry over flaws like that?

jvd just pointed out to me that while R6 is capable of it, it isn't used because of performance, which makes sense to me.

And 3dfx had an abysmally pathetic showcasing in OEM markets, where mind you both ATI and NVIDIA already had strong foundations back then. Where's your point exactly?

I have one word for you: Mindshare.

Then IHVs are extremely dumb today for not adopting that miraculous method, wasting precious transistors on AF implementations instead. Same effect? In your dreams.

They CAN'T use it... unless they were to use 3dfx's exact method of multibuffering (T-Buffer / M-Buffer). Conventional methods don't exactly provide a means for jittering sample points... which the M-Buffer does.

Whose approach is more laughable is for others to decide. Right up there you just claimed that R100 was capable of tri-AF. I'm sure I put that in your mouth too...

See above re: R6.

Early Rampage designs started somewhere after 1998. After the endless feature creep and redesigns (with the final incarnation ironically carrying a codename of "R4" for whatever weird reason), it barely would have been able to beat an NV20 in single-chip form, with the highly expensive dual-chip version holding the balance after Fear would have hit shelves; the sooner some of you daydreamers realize that simple fact the better for you. NV20 wasn't the ideal product either, but it didn't come without its own little rudiments too.

The initial Rampage design was meant to follow Voodoo2.

It was called "R4" because it was the fourth total redesign of the Rampage idea.
 
Mummy said:
Ailuros said:
Ah, but your response fails in one key point.
the sooner some of you daydreamers realize that simple fact the better for you. NV20 wasn't the ideal product either, but it didn't come without its own little rudiments too.

You are talking with someone who did believe the Revenge hoax :rolleyes:
Take this advice, stop wasting your time.

http://wakeup.to/revenge :D

I have never taken a stance on Revenge.

I honestly don't give a flying fuck whether it was true or not. My involvement with 3dhq provided me with insight into a lot of other things, so even if Revenge was a complete hoax and Syed was a vicious con man who just wanted to fuck us all over, I've still benefitted from the experience, and I'm very happy about that.

And as a result, I will never say whether I believe it or not... because it's irrelevant to me.
 
Tagrineth said:
I have never taken a stance on Revenge.

Yeah right :rolleyes:

I honestly don't give a flying fuck whether it was true or not.

And we wholeheartedly don't give a damn fuck about Rampage either.

We are bored of this kind of fanboy monkeying you 3dfx zealots always do from time to time, please do us a favor, stop bothering us with that bullshit, 3DFX is D E A D GODDAMMIT, it's 2003 for fuck's sake

And as a result, I will never say whether I believe it or not... because it's irrelevant to me.

Pretty pathetic... it was very much relevant to you Tagrineth, as long as 3Dfx is mentioned everything is possible, even an R350 killer with a bunch of transistors :)
 
Wha? Binning?

Ask yourself that question, not me.

And the deferred rendering was to come with the wonderfully nebulous Mojo, which nobody is qualified to discuss except everyone's best friend Mr. Sellers.

There wasn't anything impressive even in the rumoured Mojo specs in today's terms. 64-bit internal accuracy, thank you.

I say again: Prove me wrong. Show me anything that states or shows that Radeon R6 is incapable of trilinear AF.

http://www.digit-life.com/articles/radeon/radeon_q3_tlf_anisotrop.jpg

jvd just pointed out to me that while R6 is capable of it, it isn't used because of performance, which makes sense to me.

No comment on that. I certainly do need to get my facts straight, right?

I have one word for you: Mindshare.

ATI had enough mindshare and sales percentages to push for higher degrees of multitexturing back then, one thing that didn't happen, just because developers care more about the lowest possible common denominator. Does it also help that for Serious Sam, which you mentioned up there in terms of multitexturing, the difference between single and quad texturing is in the ~11% ballpark? Even today, how many games exceed 4-5 texture layers?


They CAN'T use it... unless they were to use 3dfx's exact method of multibuffering (T-Buffer / M-Buffer). Conventional methods don't exactly provide a means for jittering sample points... which the M-Buffer does.

Oh, so that's the reason then, and not maybe the fact that current implementations have proven themselves far more efficient in the meantime?

See above re: R6.

Right back at you :p

The initial Rampage design was meant to follow Voodoo2.

It was called "R4" because it was the fourth total redesign of the Rampage idea.

Bingo! You think you're telling me something new here or what? And while the competition was pumping out advanced featuresets one after the other, they were just revamping old, tired cores until they finally would have had that damn thing ready. That still doesn't change the fact that Spectre was only NV20/25 competitive material and that's about it.

We're in the realm of 16-layer MT capable cards today and are facing, in the next generation, hundreds of instruction slots, MRTs and preferably full FP32 accuracy. Give it a rest.
 
Mummy said:
Tagrineth said:
I have never taken a stance on Revenge.

Yeah right :rolleyes:

Please find any quote of me saying that I believe it or that I don't believe it.

Please.

Humour me.

Mummy said:
We are bored of this kind of <bleep> monkeying you 3dfx zealots always do from time to time, please do us a favor, stop bothering us with that bullshit, 3DFX is D E A D GODDAMMIT, it's 2003 for fuck's sake

Mummy said:
Pretty pathetic... it was very much relevant to you Tagrineth, as long as 3Dfx is mentioned everything is possible, even an R350 killer with a bunch of transistors :)

O...K... uh... I'm not even saying Rampage could've stood up to NV25, really. It would've killed NV20 and R200, but not much past that. IF ANYTHING it would've nipped at Ti4200's heels in the most expensive configuration... and probably would've meant no GF4MX (which would've been nice)... but beyond that, nah.

Oh, and by the way, you know what else is pretty pathetic?

One person compares a modern feature with a feature that would've been in a product two years ago, and immediately several rabid Anti-3dfx Cannons™ start toasting Rampage... I started mentioning it because I thought it was neat that suddenly this new, famous, brilliant product by ATi has one design feature that parallels something that 3dfx were poised to do.


Ailuros said:
Wha? Binning?

Ask yourself that question not me.

I know what binning is, but I have no idea what you meant by binning for SAGE2 compatibility or something like that. Please elaborate?

Ailuros said:
There wasn't anything impressive even in the rumoured Mojo specs in today's terms. 64bit internal accuracy thank you.

Mojo was in the basic design stages. I'm sure they would've increased that to FP32 / 128-bit once DX9's spec started to materialise. When we first got wind of the idea of Mojo and 64-bit, that was pretty darn impressive, considering that 128-bit wasn't much more than a twinkle in John Carmack's eye then.

Ailuros said:
http://www.digit-life.com/articles/radeon/radeon_q3_tlf_anisotrop.jpg

Tagrineth said:
jvd just pointed out to me that while R6 is capable of it, it isn't used because of performance, which makes sense to me.

No comment on that. I certainly do need to get my facts straight, right?

=) I asked you to prove me wrong, and you did. I didn't say that what I said was absolute gospel or anything.... when I said it, I even said "I'm pretty sure" and "Prove me wrong"... and you just did.

Ailuros said:
ATI had enough mindshare and sales percentages to push for higher degrees of multitexturing back then, one thing that didn't happen, just because developers care more about the lowest possible common denominator. Does it also help that for Serious Sam, which you mentioned up there in terms of multitexturing, the difference between single and quad texturing is in the ~11% ballpark? Even today, how many games exceed 4-5 texture layers?

Ah, but did ATi *push* massive multi-texturing?

Added: And you don't think having TWO top-tier IHV's pushing heavy MT would've helped its adoption?

Ailuros said:
Oh, so that's the reason then, and not maybe the fact that current implementations have proven themselves far more efficient in the meantime?

Did I say otherwise? No. You once again put words in my mouth.

All I did was give you the exact reason why nobody used 3dfx's trickery... and I did that for a reason.

Although I have to say that current implementations aren't so much more efficient as they are more effective, especially IQ-wise, and they're more ubiquitous, so to speak. 3dfx's could provide their trick AF with around a 1% performance hit with supersampled AA, or a somewhat higher hit (can't be determined) with multisampled AA... but at this point only nVidia would be interested in the idea, considering ATi has more or less forsaken supersampling. The multisampling version wasn't that great compared to real AF, but it did get some kind of work done.

In any case, though, as you say, there's no use for that anymore anyway because there are more useful methods in place.

Ailuros said:
See above re: R6.

Right back at you :p

Right back at you too =)

Ailuros said:
Bingo! You think you're telling me something new here or what? And while the competition was pumping out advanced featuresets one after the other, they were just revamping old, tired cores until they finally would have had that damn thing ready. That still doesn't change the fact that Spectre was only NV20/25 competitive material and that's about it.

It's a shame, really... but had Rampage been released right after V2, it would've been pretty useless. That form of Rampage was pretty poor. V3 performed better... why do you think, once Banshee had already delayed Rampage, they went ahead with V3 instead of pushing Rampage out the doors? The next two incarnations of Rampage wouldn't have competed as well as VSA-100 either, and that says a lot.

And if you'll look up, I've discussed what Rampage should compete with, and if you look in the very long Rampage / Revenge threads here, you'll see that I've only really been claiming that level of competition. The only way it would be higher is with... questionable... results and methods.

Ailuros said:
We're in the realm of 16-layer MT capable cards today and are facing, in the next generation, hundreds of instruction slots, MRTs and preferably full FP32 accuracy. Give it a rest.

Yup! It's pretty cool where the graphics industry is now 8) I wonder what things would look like if 3dfx had survived, though... probably not too different, except for one huge change - NV30 probably would've been out on time, though it would've been pretty different from what we see today.
 
Hmm, I am quite certain that Rampage would NOT have smoked either NV20 or R200 - it would have competed, and provided similar results, winning some battles, losing some, but that's about it. And some people tend to forget something significant - in terms of VS and PS it was inferior to both of them, and that is something meaningful. As to the original NV30 design, it was overly ambitious and underpowered through its development cycle, and that's that.
 
Testiculus Giganticus said:
Hmm, I am quite certain that Rampage would NOT have smoked either NV20 or R200 - it would have competed, and provided similar results, winning some battles, losing some, but that's about it. And some people tend to forget something significant - in terms of VS and PS it was inferior to both of them, and that is something meaningful. As to the original NV30 design, it was overly ambitious and underpowered through its development cycle, and that's that.

Dual-chip Rampage would've had 256-bit DDR at 200+ MHz.

And it did have some good bandwidth saving tech, like recursive texturing.
 
But it would have lacked any kind of hardware HSR, for example. My point is not that it did not have its strengths - granted, there were some nice things that the Rampage could do - but what remains after it is the memory of a very diluted evolution, with way too many delays and changes of target, and this constant mentioning of the Rampage as the uber-chip that all should bow to and try to mimic. Fact of the matter is that years have passed over that tech, and it is very much old news nowadays, and nothing you will say or do will change that. And I stand by my statement that it wouldn't have blown anything out of the water.
 
I know what binning is, but I have no idea what you meant by binning for SAGE2 compatibility or something like that. Please elaborate?

Sage2 was most likely to have its own dedicated RAM, paired with a hierarchical tiling scheme.

Mojo was in the basic design stages. I'm sure they would've increased that to FP32 / 128-bit once DX9's spec started to materialise. When we first got wind of the idea of Mojo and 64-bit, that was pretty darn impressive, considering that 128-bit wasn't much more than a twinkle in John Carmack's eye then.

It was the first design after Spectre/Fear that wouldn't have had a separate geometry processor. It was projected for 2002 and it would have taken a fundamentally new design to reach upcoming-generation hardware. The few sparse details known about it don't even place it above R3xx in functionality, irrespective of architecture.

Although I have to say that current implementations aren't so much more efficient as they are more effective, especially IQ-wise, and they're more ubiquitous, so to speak. 3dfx's could provide their trick AF with around a 1% performance hit with supersampled AA, or a somewhat higher hit (can't be determined) with multisampled AA... but at this point only nVidia would be interested in the idea, considering ATi has more or less forsaken supersampling. The multisampling version wasn't that great compared to real AF, but it did get some kind of work done.

1% of a hit for the LOD hack, while Supersampling on its own cuts your effective fillrate to 1/4th.

It didn't even work independently of either sampling method.
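For context, the fillrate cost is straightforward arithmetic; a minimal sketch, assuming a Voodoo5 5500-class part (two 2-pipe chips at 166MHz) purely as the example:

```python
# N-sample supersampling shades and writes N samples for every output pixel,
# so effective fillrate is the base rate divided by N, regardless of what the
# LOD trick itself costs. The base figure below is illustrative only.

def effective_fillrate_mpix(base_mpix, ss_samples):
    return base_mpix / ss_samples

base = 2 * 2 * 166                          # two chips * two pipes * 166MHz ~ 664 Mpix/s
print(effective_fillrate_mpix(base, 4))     # ~166 Mpix/s left with 4x supersampling
```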

Dual-chip Rampage would've had 256-bit DDR at 200+ MHz.

256-bit DDR in 2000?

And it did have some good bandwidth saving tech, like recursive texturing.

I'd still say that techniques like occlusion culling, hierarchical Z, fast Z-clear and early-Z (or their possible combinations) are better than just that.
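As a rough illustration of what those techniques buy (a generic early-Z sketch, not any particular vendor's pipeline): fragments that fail the depth test are discarded before any shading work is spent on them, and hierarchical Z extends the same idea to whole tiles.

```python
# Minimal early-Z sketch: depth-test a fragment before running its shading,
# so occluded fragments cost almost nothing.

def rasterize(fragments, depth_buffer, shade):
    """fragments: iterable of (x, y, z); depth_buffer: dict mapping (x, y) -> z."""
    framebuffer = {}
    for x, y, z in fragments:
        if z >= depth_buffer.get((x, y), float("inf")):
            continue                       # early-Z reject: shading never runs
        depth_buffer[(x, y)] = z
        framebuffer[(x, y)] = shade(x, y)  # only surviving fragments pay shading cost
    return framebuffer

# Example: the second fragment at the same position is behind the first and is skipped.
fb = rasterize([(0, 0, 0.5), (0, 0, 0.9)], {}, lambda x, y: (255, 0, 0))
print(fb)
```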
 
Chalnoth said:
FUDie said:
Not particularly impressive, eh? Except that NVIDIA didn't even release a DX9 driver for months for any of their parts, and they had just as much time to test their older GeForce 4 parts as ATI did. Meanwhile ATI released a DX9 driver for all Catalyst-supported products.
If I remember correctly, the DX9 interface was available for all of the 40.xx drivers. The first release of those was August 29, 2002. The first ones that were WHQL-certified were the 40.72's released November 8, 2002.
Glad you brought this up. Do you realize that these drivers are WHQL'ed under the DX8 version of the DCT suite? Didn't think so.

-FUDie

P.S. If memory serves (and I believe it does) Microsoft didn't release DX9 until around Dec. 19th, 2002. So there's no way NVIDIA could have gotten DX9 certification for the 40.72 drivers.
 
Ailuros said:
Sage2 was most likely to have it´s own dedicated ram, paired with a hierarchical tiling scheme.

Yeah. What does that have to do with binning Fear?

It was the first design after Spectre/Fear that wouldn't have had a separate geometry processor. It was projected for 2002 and it would have taken a fundamentally new design to reach upcoming-generation hardware. The few sparse details known about it don't even place it above R3xx in functionality, irrespective of architecture.

Late 2002 / early 2003, actually... and that was at the very beginning of Mojo's life. Do you really think 3dfx wouldn't have added features once DX9 spec was set?

1% of a hit for the LOD hack, while Supersampling on its own cuts your effective fillrate to 1/4th.

It didn't even work independently of either sampling method.

And Multisampling. Which was effectively free while multitexturing.

Dual-chip Rampage would've had 256-bit DDR at 200+ MHz.

256-bit DDR in 2000?

Sure! The Voodoo5 5500 has 256-bit SDR. 8)

It's called SLI, sweetie. :)
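For the sake of the numbers, a back-of-the-envelope sketch using the commonly cited Voodoo5 5500 figures (two VSA-100s, each with its own 128-bit SDR bus at 166MHz) and the 256-bit, 200MHz DDR figure quoted above for the dual-chip Rampage; the two VSA-100 buses aren't unified, but their bandwidth does add up:

```python
# Rough arithmetic behind the "256-bit via SLI" remark. Figures are the
# commonly cited ones, not vendor documentation.

def bandwidth_gbs(bus_bits, effective_mhz):
    return bus_bits / 8 * effective_mhz / 1000

v5_5500 = 2 * bandwidth_gbs(128, 166)    # ~5.3 GB/s aggregate across both chips
rampage_dual = bandwidth_gbs(256, 400)   # 256-bit total @ 200MHz DDR ~ 12.8 GB/s
print(v5_5500, rampage_dual)
```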

I'd still say that techniques like occlusion culling, hierarchical Z, fast Z-clear and early-Z (or their possible combinations) are better than just that.

It has more, and we've already discussed it to death...

It does have some culling, but it's mostly software, and somewhat erratic.
 
Sure! The Voodoo5 5500 has 256-bit SDR.

It's called SLI, sweetie.

*sigh* why am I even wasting my time?

Late 2002 / early 2003, actually... and that was at the very beginning of Mojo's life. Do you really think 3dfx wouldn't have added features once DX9 spec was set?

Yep, and it would have effectively turned into a second edition of the Rampage delay horror series. Of course you can add whatever your heart desires at short notice. Even with an optimal 18-month design-to-production cycle, the chalkboard design takes up about 3 months of that. Any change after that means only one thing: delay.

And Multisampling. Which was effectively free while multitexturing.

No, it wasn't for free. Of course you'll come up with some fancy definition of the term "free", but I've seen it all so far, heck, why not that one too. On a TBDR Multisampling comes essentially fillrate and bandwidth free. Supersampling was bandwidth free on KYRO.

For factual relevance, here are some old numbers released by 3dfx themselves: MSAA on a 100MHz GP1, approximately 26 fps at 1024*768.

http://www.3dconcept.ch/cgi-bin/showarchiv.cgi?show=1390
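For what the "essentially free" claim usually means, here is a generic sketch of the argument (not a description of GigaPixel's or PowerVR's actual hardware; the byte counts and overdraw factor are illustrative only): on a tiler, the per-tile multisample colour/Z lives in on-chip buffers and is resolved to one colour per pixel before anything leaves the chip, while an immediate-mode renderer without compression keeps the whole multisampled framebuffer in external memory.

```python
# Illustrative external-memory traffic per output pixel, IMR vs TBDR, with MSAA.

BYTES_COLOR = 4
BYTES_Z = 4

def imr_bytes_per_pixel(samples, overdraw):
    # colour + Z written for every sample, scaled by average overdraw
    return overdraw * samples * (BYTES_COLOR + BYTES_Z)

def tbdr_bytes_per_pixel(samples):
    # samples resolved in on-chip tile buffers; only the final colour is written out
    return BYTES_COLOR

print(imr_bytes_per_pixel(4, overdraw=2.5))  # 80.0 bytes of writes per pixel
print(tbdr_bytes_per_pixel(4))               # 4 bytes per pixel
```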

It has more, and we've already discussed it to death...

It does have some culling, but it's mostly software, and somewhat erratic.

Oh please, give me a break with that old BS. I'm tired of those daydreaming trips. Dave said himself that it was an internal joke, and it's been said over and over again what it was and what it was intended for.

Can we finally put an end to that nonsense?
 
FUDie said:
Chalnoth said:
FUDie said:
Not particularly impressive, eh? Except that NVIDIA didn't even release a DX9 driver for months for any of their parts, and they had just as much time to test their older GeForce 4 parts as ATI did. Meanwhile ATI released a DX9 driver for all Catalyst-supported products.
P.S. If memory serves (and I believe it does) Microsoft didn't release DX9 until around Dec. 19th, 2002. So there's no way NVIDIA could have gotten DX9 certification for the 40.72 drivers.
And while OEMs may care about WHQL certification, I don't. As far as I know, the lack of OEM certification for a few months wasn't due to a lack of quality in the drivers, but due to quibbling over the DX9 spec.
 