3dfx/3dhq Revenge superchip.

Xmas said:
OpenGL guy said:
It's not as bad as that if you have compression. I don't know if Rampage had that feature.
Compression doesn't save the fill rate. Bandwidth may have been sufficient, but Rampage didn't have fill rate free multisampling, as was already pointed out here.
And even if it had, I'm sure you agree an 18 GSamples/s R350 with compression would beat it ;)
But what you're missing is that the n samples you take for MSAA don't affect the fillrate. That's why I was disputing your 4 GSamples/s for 8x MSAA.

It's a bit more complicated for Z, but often you will be able to do Z rejections at a much higher level than per sample. Of course, if Rampage lacked hierarchical Z or early Z, then this is not an option.
 
It's a bit more complicated for Z, but often you will be able to do Z rejections at a much higher level than per sample. Of course, if Rampage lacked hierarchical Z or early Z, then this is not an option.

Besides ordinary Z-buffering, what it mostly contained as a bandwidth saving feature was its texturing approach. No early Z or hierarchical Z buffering at all, AFAIK.
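For reference, here's a minimal sketch of what hierarchical Z would have bought (this is the generic technique, not a Rampage feature - the post above says it lacked it, and the 8x8 block size is an assumption): keeping one farthest-Z value per block lets a single compare reject every pixel, and every sample, in that block.

```python
BLOCK = 8  # 8x8 pixel blocks (typical granularity; an assumption)

def build_hiz(zbuffer, w, h):
    """Per-block farthest depth of the current Z-buffer contents."""
    bw, bh = w // BLOCK, h // BLOCK
    hiz = [[0.0] * bw for _ in range(bh)]
    for by in range(bh):
        for bx in range(bw):
            hiz[by][bx] = max(
                zbuffer[y][x]
                for y in range(by * BLOCK, (by + 1) * BLOCK)
                for x in range(bx * BLOCK, (bx + 1) * BLOCK))
    return hiz

def block_rejects(hiz, bx, by, tri_zmin):
    """One compare kills all 64 pixels (times N samples) in the block:
    if the triangle's nearest point is behind the block's farthest stored
    depth, nothing in the block can pass the per-sample Z test."""
    return tri_zmin > hiz[by][bx]
```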
 
OpenGL guy said:
But what you're missing is that the n samples you take for MSAA don't affect the fillrate. That's why I was disputing your 4 GSamples/s for 8x MSAA.

It would have with Rampage. There were 4 pipes that were either pixel pipes or 4x MSAA pipes.
 
A few things, first...

16x AF. Yeah, right. But that means NOTHING, as proved by good ole nVidia Corporation's "Aggressive" algorithm, which is nearly free but looks, well, bad.
The Rampage was supposed to have adaptive AF. The GeForce FX got adaptive AF. nVidia claims they got some 3DFX technology in the GeForce FX.
I'm not saying it's the same algorithm. I'm just saying it's possible, and if it's the case, then that 16x AF may very well have to be compared to ATI's 4x Performance AF.

As for AA...

I'm going to make the same point as for AF, but with a completely different explanation.
Quality, please?
8x AA doesn't mean anything. Considering it's 3DFX, it's probably rotated, but we don't even know that for sure...

But there are MANY ways you could "cheat" with AA - it would give lower quality, but much, much better speed.

What about calculating the same Z value for every sample in the same pixel, eh? That would also make it possible to use a single bit per pixel for perfect compression, saving a LOT of bandwidth in best-case scenarios - which, in Quake 3 with its low poly count, could be quite frequent.
Or you could use slightly more advanced techniques, but still much more basic than today's ones. Or simply none, and only save fillrate.

Or even more complex - what if Rampage only calculated 4 or 2 Z values for 8x AA? That would give quality always on par with 4x AA or 2x AA, often better, with a fillrate cost equal to 4x AA or 2x AA.

There are a LOT of tricks you can use if you do that, and they can save you a LOT of fillrate & bandwidth. They lower quality, but 8x AA should still give quality higher than 4x AA overall - though maybe not even on par with ATI's 6x AA. And in some cases, it might look worse than 4x AA. Why the heck do you think ATI & nVidia never did that?
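The shared-Z idea above can be sketched in a few lines (purely illustrative - nothing here is known Rampage behaviour, and the sample count, Z width, and coverage fraction are all assumptions):

```python
# Illustrative sketch (not 3dfx's actual design): a per-pixel flag marking
# "all N samples share one Z value" lets the buffer hold 1 Z + 1 bit
# instead of N Z values whenever a pixel is fully covered by one triangle.

N_SAMPLES = 8   # 8x AA (assumption)
Z_BITS = 24     # Z precision (assumption)

def pixel_z_storage_bits(fully_covered: bool) -> int:
    """Bits of Z storage needed for one pixel under the 1-bit scheme."""
    if fully_covered:
        return 1 + Z_BITS          # flag + single shared Z
    return 1 + N_SAMPLES * Z_BITS  # flag + one Z per sample

def scene_z_bits(pixels: int, covered_fraction: float) -> float:
    """Total Z bits per frame for a given fraction of fully covered pixels."""
    full = int(pixels * covered_fraction)
    edge = pixels - full
    return full * pixel_z_storage_bits(True) + edge * pixel_z_storage_bits(False)

# Low-poly scene like Quake 3: most pixels are interior, not edge pixels.
uncompressed = 1024 * 768 * N_SAMPLES * Z_BITS
compressed = scene_z_bits(1024 * 768, covered_fraction=0.95)
print(f"Z traffic reduced to {compressed / uncompressed:.1%} of uncompressed")
```

The fewer edge pixels in the scene, the closer the scheme gets to storing just one Z per pixel - which is why a low-poly game would be the best case.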

I'd guess a Rampage was never in the hands of someone who did a quality comparison. It probably just "looked good" ( which is quite normal, since the standard at that time was 2x AA and little or no AF )

Okay, so let me already imagine the flaming :)

"Yeah, right, but 3DFX would never do that. It isn't traditional multisampling, even nVidia never did that, and..."

Err, we've *never* seen 3DFX implement MultiSampling. Their SuperSampling solution was simply rendering the scene multiple times at different locations, then combining the buffers. With that approach, the "cheat" I described above wouldn't even be possible!


Okay, to summarize what I said...
1. Rampage's AF algorithm was adaptive. All adaptive algorithms are different. For all we know, Rampage's could have been ridiculously aggressive and truly not comparable to ATI's 8x AF.
2. Rampage's AA was MultiSampling, and used a lot of clever tricks. For all we know, it could be using clever cheats, too. While that would give amazing performance, it wouldn't give quality similar to traditional 8x AA. Of course, it might still be better overall than ATI's 4x AA - but in worst cases, it might be a lot worse.

Conclusion: Without quality tests, those numbers mean very little. We do know the Rampage would have been a VERY good card in 2001. But extrapolating is useless.

Unless we get quality tests one way or another, I believe discussing this is useless.

"Yes, but if Rampage would have had good quality with those settings, ..."
:rolleyes:


Uttar

P.S. : I just read Wavey's post about MSAA affecting Rampage's fillrate.
Yes, but just how much? As I said, they could do some things as for 2x AA or 4x AA to reduce the fillrate cost.
 
DaveBaumann said:
OpenGL guy said:
But what you're missing is that the n samples you take for MSAA don't affect the fillrate. That's why I was disputing your 4 GSamples/s for 8x MSAA.

It would have with Rampage. There were 4 pipes that were either pixel pipes or 4x MSAA pipes.

In this day and age with multisampling (happening at the rasterizer stage) that certainly sounds like a hefty price to pay for FSAA. 8)
 
LeStoffer said:
In this day and age with multisampling (happening at the rasterizer stage) that certainly sounds like a hefty price to pay for FSAA. 8)
Yes, but 3dfx designed Rampage with the assumption that anyone who wants high quality enables AA and AF, and that future games heavily rely on multitexturing. In this case, you rarely get more than one pixel per cycle, so 4 samples per cycle is sufficient.

Unfortunately, this doesn't do too well for single-textured cases and z/stencil only passes.
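That trade-off can be checked with back-of-the-envelope arithmetic (this is my own model of the 4-pipe claim, not documented behaviour - it assumes one TMU and one sample/Z unit per pipe):

```python
# Each of Rampage's 4 pipes per cycle is assumed to be either producing a
# pixel (limited by its single TMU) or producing one extra MSAA sample.

PIPES = 4  # pipes, each with one TMU and one sample/Z unit (assumption)

def pixels_per_cycle(textures_per_pixel: int) -> float:
    """A pixel needing T texture lookups ties up a pipe for T cycles."""
    return PIPES / textures_per_pixel

def free_samples_per_pixel(textures_per_pixel: int) -> float:
    """MSAA samples each pixel can get per cycle with no pixel-rate loss."""
    return PIPES / pixels_per_cycle(textures_per_pixel)

# Heavy multitexturing leaves enough sample throughput for MSAA:
print(free_samples_per_pixel(4))  # quad-textured -> 4 samples per pixel
print(free_samples_per_pixel(2))  # dual-textured -> 2 samples per pixel
print(free_samples_per_pixel(1))  # single-textured -> only 1, the weak spot
```

Under this model, the more textures per pixel, the more MSAA comes "for free" - and single-textured or z/stencil-only passes are exactly where it falls apart, matching the point above.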
 
Xmas said:
LeStoffer said:
In this day and age with multisampling (happening at the rasterizer stage) that certainly sounds like a hefty price to pay for FSAA. 8)
Yes, but 3dfx designed Rampage with the assumption that anyone who wants high quality enables AA and AF, and that future games heavily rely on multitexturing. In this case, you rarely get more than one pixel per cycle, so 4 samples per cycle is sufficient.

Unfortunately, this doesn't do too well for single-textured cases and z/stencil only passes.

Oh, thanks for the clarification, didn't know that :)
So the Q3 8x AA 16x AF would be an optimal case: AA, AF, and a lot of multitexturing.


Uttar
 
Uttar,

16x AF. Yeah, right. But that means NOTHING, as proved by good ole nVidia Corporation's "Aggressive" algorithm, which is nearly free but looks, well, bad.
The Rampage was supposed to have adaptive AF. The GeForce FX got adaptive AF. nVidia claims they got some 3DFX technology in the GeForce FX.
I'm not saying it's the same algorithm. I'm just saying it's possible, and if it's the case, then that 16x AF may very well have to be compared to ATI's 4x Performance AF.

Can't get rid of that suspicion either.

8x AA doesn't mean anything. Considering it's 3DFX, it's probably rotated, but we don't even know that for sure...

From what I can tell from the stuff of the past, there was no intention to support 8x sample AA, even on dual chip Spectre. 2x/4xRGMS with the possibility that T-buffer RGSS could have been enabled upon choice too.

Again what hit the rumour mill was:

quad texturing= 4xRGMS/noAF small performance drop
dual texturing= 2xRGMS/4xAF small performance drop

But that with two Rampage rasterizers and one Sage geometry processor, remember that.

I just read Wavey's post about MSAA affecting Rampage's fillrate.
Yes, but just how much? As I said, they could do some things as for 2x AA or 4x AA to reduce the fillrate cost.

It would have cost more in fillrate compared to NV20, except of course in the cases above; how much, no idea.

So the Q3 8x AA 16x AF would be an optimal case: AA, AF, and a lot of multitexturing.

Q3 is a dual-textured game if I'm not mistaken.

Very daring estimate:

1024x768x32/ 2xAA/ 4xAF = ~ -20% from initial performance
 
OK. I got one thing straightened out.

Rampage supports 8x rotated grid (3dfx were NOT going to stop at 4x. The V5-6k was going to push 8x SSAA, remember?) if two raster cores are present.

The grid was the same as VSA-100's - rotated at all settings, with 8x using a scattered grid (Colourless - like the 6k's OpenGL default sample pattern - you should know what I'm talking about).

For performance, the MSAA would 'borrow' Z-check units from otherwise-idle pipelines whose sole TMUs are being used for multitexturing, up to four samples per cycle.

The performance AF was sort of a 'cheat' - it took fewer texture samples than you'd hope, but instead used the AA subsamples (using a REALLY nifty trick in MSAA), re-shifted them, and blended the results. Recursive texturing helps a LOT for this, and you're already losing fill-rate from the AA, so the hit for AF in those situations is minimal (very minor recalcs).

It looks better in Rampage's SSAA, of course, but it still works with MSAA.
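For what it's worth, the 'reuse the AA subsamples' trick might look something like this (my guess at the shape of it, NOT 3dfx's actual algorithm - the offset-collapsing step in particular is pure assumption):

```python
# Approximate anisotropic filtering by re-shifting the existing AA
# subsample offsets along the axis of anisotropy and averaging the taps,
# instead of taking dedicated extra texture samples.

def cheap_af(sample_tex, uv, aniso_axis, aa_offsets):
    """sample_tex(u, v) -- bilinear texture lookup
    uv         -- texture coordinate at the pixel centre
    aniso_axis -- (du, dv): direction and length of anisotropy
    aa_offsets -- subpixel AA sample offsets in [-0.5, 0.5]^2
    """
    u, v = uv
    du, dv = aniso_axis
    taps = []
    for ox, oy in aa_offsets:
        # Collapse the 2D AA offset onto one axis (assumption), so the
        # taps walk along the line of anisotropy in texture space.
        t = ox + oy
        taps.append(sample_tex(u + t * du, v + t * dv))
    return sum(taps) / len(taps)
```

The sample positions already exist for the AA, which is why the extra cost would be so small - only the additional texture lookups are paid for.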

Other NEAT tidbits:

Some of you might remember 3dfx's old 'Geometry Assist' option, by which they could fake hardware TCL and get a REALLY nice performance boost in some situations?

Well, Rampage didn't need that any more, of course, what with having a SAGE on hand to do geometry. ;)

But. Who remembers 3dfx's famous HSR?

It was buggy as hell, yes, but with some careful tweaking it could provide dramatic gains with a maximum 1% visible error rate (many people got 20-30fps boost with no visible quality loss, with some careful setup).

The code was based on Gigapixel's deferred architecture...

Well, Rampage supported this again. And here's the kicker: It supported *some* (but not all) functions in hardware - sort of a hardware-assisted software HSR algorithm.

Long story short, it was perfected for Rampage - that would account for the rest of the performance gains, I'd say.
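Purely as an illustration, one way a software HSR pass with a small, tunable error rate could be built (this is NOT recovered 3dfx code; tile size and the box-based test are assumptions): keep a coarse per-tile occlusion buffer and skip bounding boxes that are provably behind already-drawn opaque surfaces. Treating partially covered edge tiles as fully covered is exactly the kind of aggressiveness that trades fps for the occasional visible error.

```python
TILE = 32  # tile size in pixels (assumption)

class CoarseOcclusion:
    def __init__(self, width, height):
        self.tw = (width + TILE - 1) // TILE
        self.th = (height + TILE - 1) // TILE
        # Per tile: depth beyond which the tile is fully covered (inf = open).
        self.occ = [float('inf')] * (self.tw * self.th)

    def _tiles(self, box):
        x0, y0, x1, y1 = box
        for ty in range(y0 // TILE, min(y1 // TILE, self.th - 1) + 1):
            for tx in range(x0 // TILE, min(x1 // TILE, self.tw - 1) + 1):
                yield ty * self.tw + tx

    def record_occluder(self, box, zmax):
        """Register an opaque surface covering these tiles, farthest z = zmax.
        (Counting edge tiles as fully covered is the lossy shortcut.)"""
        for i in self._tiles(box):
            self.occ[i] = min(self.occ[i], zmax)

    def is_hidden(self, box, zmin):
        """True if the whole box lies strictly behind recorded occluders."""
        return all(zmin > self.occ[i] for i in self._tiles(box))

culler = CoarseOcclusion(640, 480)
culler.record_occluder((0, 0, 639, 479), zmax=10.0)       # e.g. a wall at z=10
print(culler.is_hidden((100, 100, 200, 200), zmin=50.0))  # behind the wall
print(culler.is_hidden((100, 100, 200, 200), zmin=5.0))   # in front of it
```

A per-tile "aggressiveness" knob (how readily edge tiles count as covered) would map naturally onto the kind of individual settings described in the driver hack discussion here.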

By the way, the experimental boards created had two Rampage and *two* SAGE cores in parallel. Not feasible for production, but still the truth in this case. They're custom protoboards.

None of this info came from the questionable Aquoues though. All of THIS info is from reputable sources (who are at the mystery company who really can't be named right now... ^_^; ).
 
Long story short, it was perfected for Rampage - that would account for the rest of the performance gains, I'd say.

Even perfected on a new chip, the specs don't add up... and what relevance does V5 HSR have to Rampage HSR, which is done in HW?

Long story short, what was the point of the long story? Seems like a justification (with smoke and mirrors) for the ludicrous rumours and performance figures.

And Tag..you are welcome to flame me... :LOL:
 
DaveBaumann said:
HSR Drivers were a joke, nothing more, nothing less.
This was indeed a hack implemented in the last days of 3dfx. It's amazing how many fps you can churn if you're not drawing anything to the screen...
 
The HSR drivers weren't a joke. If it was some kind of joke, they would've made it available without hacking. And even with adding the settings, many systems (mine included!) had no changes with HSR enabled, under any settings, in anything. Including Quake3. Likewise, some people actually had HSR mysteriously work in other things... not necessarily as HSR, but the setting would change things.

The reason it relates to Rampage's HW HSR is that Rampage's implementation isn't pure hardware - it's partial hardware but still mostly software, using almost the same methods.

With proper work, tweaking the HSR-capable driver's settings, as I said, you could get 20-30fps increases with at most 1% visible error - in other words, nothing anyone *playing* the game would notice, and in fact an error rate only the most observant of Quake3-freaks could notice. Thing is, the biggest visual errors came from the "MUST MAX OUT EVERYTHING" mentality, setting the HSR to the most aggressive setting and maxing out resolution.

More HSR settings than the generic 'Aggressive', 'Semi-aggressive', etc. were discovered, including individual settings for numerous aspects, and we found that 3dfx actually screwed up some of the default settings. Badly. That is, of course, why it's an undocumented function on a beta driver ;).

Aside:

Rargh! With my incessant rambling about architecture I forgot to correct something in all this...

That 254fps number IS bull. I talked to Devin, he didn't even know what I was talking about. We checked a few records. The real number quoted (not by Aquo!) was 'just a teeny bit over 180fps' at the said settings.


I know it's easy to write all this off as the delusions of old 3dfx fanatics who simply can't let go of the past... and I know that 250 -> 180fps correction probably just sounds like 'covering our asses because we made an excessive claim'... there's no way to change that. I'd REALLY like to know who was originally quoted as saying 250fps.
 
DaveBaumann said:
HSR Drivers were a joke, nothing more, nothing less.

Agreed. Although remember it was never made public, and it's often a good idea to test an algorithm in software before implementing it in hardware.
So, 3DFX could very well have simply implemented it to see how well it would do on Rampage - they never advertised that feature. Or it could all be BS. But I hardly see how we'd ever know. Unless nVidia uses some similar tech for the NV40, that is...

But considering how buggy HSR was, I'd be surprised if it would have been 100% bug free on that early Rampage board, even more so considering it was done in hardware and thus nearly impossible to fix via drivers. HSR VSA-100 drivers were built in the last days of 3DFX, and Rampage thus probably used something very similar.

So, of course, if Rampage's 16x AF was kind of a cheat and HSR might have caused visibility artifacts, then, yeah, that huge 200+ FPS score is reasonable. Oh, sure, 3DFX might have fixed the HSR bugs ( if there even were any, who knows... ) later, and still get 100+ FPS. But then again, maybe not. And they obviously didn't have the time to prove it...

So IMO, that score isn't *that* impressive anymore.


Uttar

P.S. : Posted this before Tagrineth's post was there, so just a quick remark...
Considering 1% visibility error, a strange way to do AF which might work but for which we got no quality comparisons, and 180FPS - I find it quite likely.
But the 1% visibility error might be more in a modern game, with more polygons - so again, Q3 is a best-case scenario.
 
Uttar said:
Agreed. Although remember it was never made public, and it's often a good idea to test an algorithm in software before implementing it in hardware.
So, 3DFX could very well have simply implemented it to see how well it would do on Rampage - they never advertised that feature. Or it could all be BS. But I hardly see how we'd ever know. Unless nVidia uses some similar tech for the NV40, that is...

But considering how buggy HSR was, I'd be surprised if it would have been 100% bug free on that early Rampage board, even more so considering it was done in hardware and thus nearly impossible to fix via drivers. HSR VSA-100 drivers were built in the last days of 3DFX, and Rampage thus probably used something very similar.

Well, the trouble is, the algo was meant to have the hardware functions available in Rampage. They didn't re-fix the settings to allow for pure software, which is where a few problems came in - the CPU can't exactly re-check every pixel post-raster, can it? That's one of the things Rampage does in hardware, a post-check; of course it isn't 100% perfect, but it's much more accurate even with aggressive settings that are completely FUBAR on VSA-100.

Also, the 'tricky' AF, from a few snapshots I've seen running the algo on VSA-100 (buggy as hell driver, but it did pull it off in one app), did have a noticeable improvement - but that was also with SuperSampling. Very few people know what the tricky AF would look like in MSAA. I suspect it would be analogous to nVidia's AF, BUT much more consistent in quality, and trilinear would be left intact. It'd basically be a more-or-less circular readout on AF analysers, much like GeForce4, and pushed back... but AF analysers would probably look really, REALLY weird with it. I don't doubt texture quality would improve, but yeah, it probably wouldn't improve as much as the degree of aniso would lead you to believe.


TBH, regardless of HOW the quality stacked up, 180fps at those settings in April 2001 (yes, April 2001. who wants another 3dfx PPT document?) is hellishly fast, and YES Rampage does support standard GF4-style 'pure' Aniso, as well as SuperSampling for the manic, rabid IQ freaks out there. :) Of course, SuperSampling would be deliciously slow, but SimHQ would absolutely ADORE the feature ;)
 
There I cannot disagree, Tagrineth - 180FPS at those settings in H1 2001 is GREAT, even with debatable quality ( which is obviously as good as nVidia's 4x AA / 8x AF ) and a few possible visibility errors.

Your explanation for the HSR problems on the VSA-100 seems plausible.
But would that mean those drivers were made with Rampage in mind, and would potentially work on one? Or rather, were *meant* to work on one, since I think there was only one driver set where Rampage worked and it never leaked, right?

However, on the post-check thing... How could that work, really? That would actually imply it was done after all pixels have been rendered, or what?
If that was the case, the Rampage is definitely a sort of deferred renderer.

And that would mean 3DFX had their own deferred technology, because they acquired GigaPixel way too late for it to have so significant an impact on Rampage - maybe a few optimizations, though.
3DFX having their own deferred technology - WTF?!

BTW, did Rampage support a setting to disable HSR, or is it so transparent that it can't be disabled?


Uttar
 
http://www.glideunderground.com/articles/3dfx_merger/default.asp

Is Gigapixel technology included in the Rampage core? Is it going to be included?

Currently there is no Gigapixel technology in the Rampage core. We have considered adding Gigapixel core technology to the Rampage core but it does not make sense to change the Rampage core at this stage. Gigapixel will mainly help improve on the Voodoo system in future cards by adding features which the current Voodoo cards lack strength in.


Look what I found :)

Talion Graphics, 3DHQ, Bitboys to merge

Some internal sources claim that Talion Graphics, 3DHQ and Bitboys will merge to create the most stunning 3D technology in decades.

A senior engineer at 3DHQ commented: "We are very proud of this merger; our technologies combined will prevail over nVidia and ATi. The chip in the works will be able to push 1566 fps in Quake 3 with 32X AA and 16X Anisotropic on a Pentium 200 (overclocked though)".

Rumors around say that the first cards will be bundled with the famous Duke Nukem Forever game, made explicitly for this graphics card chipset.

;)
 