More Verite - The Core......

swaaye

On the subject of the Verite chips, I have a question about another aspect of the architecture. I've heard the core is based on a MIPS RISC design, and that it didn't do its thing the same way the other chips of the time (like the Voodoo1) did.

Was it an extremely flexible design, in that they could literally add new features at any time? Could it be viewed almost as a general-purpose CPU running code supplied by the drivers? And what of the much-ballyhooed "triangle setup" engine everyone raved about because the Voodoo1 didn't have one?

The 2D Windows performance just sucks. Could this be because it's running the 2D functions on the RISC core instead of a dedicated 2D unit?

I've noticed with the card that when I'm booting into Windows, the screen will blank out for a moment and then come back. The monitor goes into power-saving mode during this time, for about 3 seconds I'd say. When it blanks, the Win98 startup screen is there, and it is still there when the picture comes back. Could it be uploading firmware to the board?

I'd love to hear your opinions about the design. From what I've heard, it sounds quite different from the ASICs we have now.
 
I did some of my own research via Usenet. Bless Google Groups :)

This is from Jim Peterson, then of Rendition. Interesting...
Thanks, I am happy to respond.

We do support transparency and full source and destination alpha
blending and per-pixel fogging in D3D. We just do what we are told to
do. The game developer chooses whether to do chroma key or blending of
transparent sprites with our chip, for example. Bilinear filtered
chroma keyed textures actually work in our chip, so game developers
tend to select this mode. BTW: Hellbender uses fogging with our chip.
I think we conform to the highest standard in rendering performance
and quality: We have 8 bits of subpixel position for texture filtering
(the best any other chip has is 4), we use full 32x32 multiplies for
perspective correction, we keep 8 bits each for red, green, blue and
alpha in our pipeline, we always do full subpixel correction for
triangles (since we do it on chip this is easy; others do setup on the
host and don't seem to do this as much), we do support destination
blending (when requested), our unified memory architecture allows us
to use a rendered image as a texture source (e.g. reflection mapping)
or allocate any part of our memory to any function, we have other
"under the hood" 3D features, plus we do 2D operations, like fast
chroma key BitBlt and host-to-screen copy (wow, how many words in that
sentence? :).
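That "bilinear filtered chroma keyed textures actually work" line is less trivial than it sounds: naive bilinear filtering blends the key color into a sprite's edge pixels. Here is a rough C sketch of one way a filter can drop keyed texels and renormalize the remaining weights. This is my own illustration of the general technique, not Rendition's actual hardware algorithm:

```
/* Bilinear filtering combined with chroma keying: texels matching the
 * key color are excluded from the 2x2 kernel and the surviving weights
 * are renormalized, so the key color never bleeds into the result.
 * Illustrative sketch only, not Rendition's hardware algorithm. */
#include <stdint.h>
#include <stdio.h>

typedef struct { uint8_t r, g, b; } Texel;

/* Filter a 2x2 neighborhood with fractional weights fu, fv in [0,1).
 * Returns 0 if every contributing texel was keyed out (transparent). */
static int bilinear_chroma_key(const Texel tex[2][2], float fu, float fv,
                               Texel key, Texel *out)
{
    float w[4] = {                      /* standard bilinear weights */
        (1 - fu) * (1 - fv), fu * (1 - fv),
        (1 - fu) * fv,       fu * fv
    };
    const Texel *t[4] = { &tex[0][0], &tex[0][1], &tex[1][0], &tex[1][1] };
    float r = 0, g = 0, b = 0, wsum = 0;

    for (int i = 0; i < 4; i++) {
        /* Keyed texels contribute nothing instead of smearing the key
         * color across the sprite edge. */
        if (t[i]->r == key.r && t[i]->g == key.g && t[i]->b == key.b)
            continue;
        r += w[i] * t[i]->r;
        g += w[i] * t[i]->g;
        b += w[i] * t[i]->b;
        wsum += w[i];
    }
    if (wsum == 0)
        return 0;                       /* fully transparent sample */
    out->r = (uint8_t)(r / wsum);       /* renormalize what survived */
    out->g = (uint8_t)(g / wsum);
    out->b = (uint8_t)(b / wsum);
    return 1;
}

int main(void)
{
    Texel key = { 255, 0, 255 };                /* magenta key color */
    Texel quad[2][2] = {
        { { 200, 50, 50 }, { 255, 0, 255 } },   /* one keyed texel   */
        { { 180, 60, 60 }, { 190, 55, 55 } }
    };
    Texel out;
    if (bilinear_chroma_key(quad, 0.5f, 0.5f, key, &out))
        printf("filtered: %u %u %u\n", out.r, out.g, out.b);
    return 0;
}
```

When all four texels are keyed, the sample comes back fully transparent, which is what you want in a sprite's interior cutouts.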

Interestingly, most 3D games still have a lot of 2D operations (dashboards, etc). Having a 3D chip that does not do these operations well will limit its real game performance. While I'm on a roll, please allow me one more point. We get to add "hardware features" anytime we want. We just write some new microcode for our RISC processor (those of you who play VQuake are using some of these new features). As far as performance goes, IMHO, people would do better to look at native port versus native port of the games that they play (okay, I'm biased, since we do so well in this class of comparison). I understand Whiplash (a.k.a. Fatal Racing) will be available as a patch from Gremlin for those who already own a copy (for a small fee, I think). We are also working to help get Descent 2 available. We will also have very compelling ports of EF2000 and Tomb Raider soon. This should give everyone a good basis for comparison.
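That "we just write some new microcode" remark is the heart of the RISC-core question. Purely as an illustration of the idea, with every name invented by me (this is emphatically not Rendition's SDK, and the "chip" here is just a table in host memory), the mechanism amounts to an upload plus an opcode binding:

```
/* Hypothetical sketch of features-via-microcode: a driver update loads
 * a new routine into the chip's instruction RAM and binds a command
 * opcode to it, adding a "hardware feature" with no silicon change.
 * All names are invented for illustration; nothing here is from
 * Rendition's actual SDK. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UCODE_RAM_WORDS 1024
#define MAX_OPCODES     16

static uint32_t ucode_ram[UCODE_RAM_WORDS]; /* stand-in for on-chip RAM */
static uint32_t opcode_entry[MAX_OPCODES];  /* opcode -> ucode offset   */

/* "Upload" a routine and bind it to a command opcode. */
static int load_microcode(int opcode, const uint32_t *code, size_t words,
                          size_t offset)
{
    if (opcode < 0 || opcode >= MAX_OPCODES ||
        offset + words > UCODE_RAM_WORDS)
        return -1;
    memcpy(&ucode_ram[offset], code, words * sizeof *code);
    opcode_entry[opcode] = (uint32_t)offset;
    return 0;
}

int main(void)
{
    /* A "new feature" shipped inside a driver or game patch. */
    static const uint32_t new_routine[] = { 0xDEADBEEF, 0x0BADF00D };

    if (load_microcode(7, new_routine,
                       sizeof new_routine / sizeof *new_routine, 0) == 0)
        printf("opcode 7 now dispatches to ucode offset %u\n",
               (unsigned)opcode_entry[7]);
    return 0;
}
```

Presumably something along these lines is how the new features Peterson mentions for VQuake could appear long after the silicon shipped.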

Don't get me wrong, I think the 3D F/X chip set provides very good
performance. What I would like is for people to get a chance to see
how the Verite's true asynchronous DMA improves game performance, and, when games like IndyCar 2 use around 80% of a P166 for physics and artificial intelligence, how important it is to do triangle setup on-chip (even though it actually slows down benchmarks). We find that the games that have been ported to the Verite natively tend to be host CPU bound. So, eliminating CPU cycles by doing work on our chip, and achieving total overlap between the CPU and the Verite by eliminating the CPU-bound transport time for commands and dynamic texture loads (for video textures, and when you get to that new room) while keeping the Verite rendering with 100% overlap with the CPU, makes for the fastest version of these games.
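Peterson's "total overlap" argument is a classic producer/consumer arrangement. Here's a toy C sketch of the shape of it, with the chip's DMA engine faked as an ordinary function call and all names and sizes my own:

```
/* Toy sketch of the overlap argument: the host queues commands into a
 * ring buffer that the chip drains by bus-master DMA, so physics/AI
 * and rendering proceed in parallel instead of the CPU stalling on
 * every command write. Sizes and names are illustrative only. */
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 256               /* power of two: wrap is a mask */

typedef struct {
    uint32_t cmd[RING_SIZE];
    uint32_t head;                  /* next slot the CPU writes */
    uint32_t tail;                  /* next slot the chip reads */
} CmdRing;

/* CPU side: enqueue a command. Returns 0 if full, in which case a
 * real driver keeps running game code and retries later. */
static int ring_push(CmdRing *r, uint32_t cmd)
{
    uint32_t next = (r->head + 1) & (RING_SIZE - 1);
    if (next == r->tail)
        return 0;                   /* full: chip hasn't caught up */
    r->cmd[r->head] = cmd;
    r->head = next;                 /* DMA engine sees the new head */
    return 1;
}

/* "Chip" side: drain one command, standing in for a DMA engine that
 * fetches commands without any CPU involvement. */
static int ring_pop(CmdRing *r, uint32_t *cmd)
{
    if (r->tail == r->head)
        return 0;                   /* empty */
    *cmd = r->cmd[r->tail];
    r->tail = (r->tail + 1) & (RING_SIZE - 1);
    return 1;
}

int main(void)
{
    CmdRing ring = { {0}, 0, 0 };
    uint32_t cmd;

    /* Interleave "game frame" work with draining to mimic overlap. */
    for (uint32_t tri = 0; tri < 8; tri++) {
        ring_push(&ring, tri);      /* CPU: queue a triangle, move on */
        if (tri & 1)                /* chip consumes asynchronously   */
            while (ring_pop(&ring, &cmd))
                printf("chip renders triangle %u\n", (unsigned)cmd);
    }
    while (ring_pop(&ring, &cmd))
        printf("chip renders triangle %u\n", (unsigned)cmd);
    return 0;
}
```

The CPU only waits when the ring fills; otherwise it keeps queueing and moves straight back to game code, which is the command transport overhead Peterson says DMA eliminates.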

One final point is that there are currently a few issues with D3D when using a DMA-based chip like the Verite (which are probably part Microsoft code and part our code). We are working with Microsoft to address these issues. We have one game ported both to D3D and natively to the Verite. Its frame rate is almost 3 times faster in the native port. We are working to significantly close this gap.

Jim Peterson
Rendition
p.s. The Beta 7 VQuake is now on our web and ftp sites. Try the
"R_ANTIALIAS 1" mode :)
The opinions expressed in this message are my own personal views
and do not reflect the official views of my company.

Interesting posts from Robert Mullis, another former Rendition employee. Lots of nifty future speculation here.
Well, I guess this thread has been going on for a couple months now.
It was funny before the cards were out there and it continues to have
lots of interesting, but rarely factual, conjectures.

I work at Rendition so factor that into these comments but I will try
to be as even handed as possible.

Both Rendition and 3Dfx have very good teams building graphics chips
that are having some impact on this market. At the highest level they
are somewhat different.

Verite 1000:
Do-it-all graphics adapter aimed primarily at entertainment
(that is, a chip that makes all forms of entertainment run well
through very good SVGA, 2D windows/DirectDraw and 3D acceleration)
and, hopefully, a mainstream consumer market.

Voodoo:
Hot 3D add-on accelerator aimed at the gamer market. (I am sure that
they have an ultimate aim close to ours).
Now, ignoring everything but 3D, there are some architectural
differences in our approaches. But it should be stated right
up front that one thing both companies are dedicated to is very
fast and very accurate 3D rendering. I would argue that this one
feature sets both companies quite apart from the rest today.

Voodoo:
Very good fill rate: 45MP/s (or more in some configurations, I
think, with most everything turned on).
Some triangle setup and edge walking. I am not sure just how
much set up they do but I am sure the engine is at least an edge
walking engine.
Memory Mapped IO (with perhaps substantial buffering on card) for
data coming from the CPU.
Very capable texture manipulation engine.
Very good Z buffering performance.
And other good stuff.

Verite 1000:
Full triangle setup and edge walking.
Good fill rate: 25MP/s (with what we think is typically used in
the hot 3D games today: bilinear, gouraud, perspective correct,
subpixel/texel accurate, fogged,...)
Exceptional rendering accuracy.
Bus master DMA to transfer data to the card.
Very capable texture manipulation capability.
And other good stuff.
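Both feature lists lean on "triangle setup and edge walking," so here is a stripped-down C sketch of what those two stages actually do. It is flat-shaded and toy-sized; a real chip also interpolates color, Z, and texture coordinates with subpixel correction, and per the posts above the Verite did all of the setup on-chip while others left some of it to the host:

```
/* "Triangle setup and edge walking" in miniature: setup sorts the
 * vertices and computes per-edge slopes once; the edge walker then
 * steps the left/right edges one scanline at a time and fills the
 * span between them. Toy flat-shaded version into a char buffer. */
#include <stdio.h>

#define W 40
#define H 20
static char fb[H][W];

typedef struct { float x, y; } Vtx;

static void fill_triangle(Vtx a, Vtx b, Vtx c)
{
    Vtx t;
    /* Setup 1: sort vertices by y so that a.y <= b.y <= c.y. */
    if (b.y < a.y) { t = a; a = b; b = t; }
    if (c.y < a.y) { t = a; a = c; c = t; }
    if (c.y < b.y) { t = b; b = c; c = t; }
    if (c.y <= a.y) return;                 /* degenerate: no height */

    /* Setup 2: per-edge slopes dx/dy, each computed exactly once. */
    float dac = (c.x - a.x) / (c.y - a.y);  /* long edge a->c */
    float dab = (b.y > a.y) ? (b.x - a.x) / (b.y - a.y) : 0.0f;
    float dbc = (c.y > b.y) ? (c.x - b.x) / (c.y - b.y) : 0.0f;

    /* Edge walk: step two active edges per scanline, fill the span. */
    for (int y = (int)a.y; y < (int)c.y && y < H; y++) {
        if (y < 0) continue;
        float xl = a.x + dac * (y - a.y);                  /* long edge  */
        float xr = (y < (int)b.y) ? a.x + dab * (y - a.y)  /* upper half */
                                  : b.x + dbc * (y - b.y); /* lower half */
        if (xl > xr) { float s = xl; xl = xr; xr = s; }
        for (int x = (int)xl; x < (int)xr && x < W; x++)
            if (x >= 0) fb[y][x] = '#';
    }
}

int main(void)
{
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            fb[y][x] = '.';
    fill_triangle((Vtx){ 20, 1 }, (Vtx){ 4, 18 }, (Vtx){ 36, 12 });
    for (int y = 0; y < H; y++)
        printf("%.*s\n", W, fb[y]);
    return 0;
}
```

The slope divides in the setup step are the work that gets lifted off the host when setup is on-chip, which is exactly what Peterson says matters once a game is CPU bound (and what Mullis says simple D3D benchmarks fail to reward).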

So, how do you decide which card will run a game the fastest?
That depends entirely on the game. First, if it is only available on
one card then that card wins. (Score one for Voodoo for Valley of Ra
and one for Verite for Quake). If a game is fundamentally
limited by overdraw and/or Z buffering, it is likely going to run
faster on Voodoo than Verite. Today there are only two popular
Z buffered games: Quake (which is mostly Z writes) and Tomb Raider,
but there will be more as time goes by. On the other hand, if a
game has intense CPU requirements for things like character AI or
model physics calculation, and little of the CPU is available for
rendering, then the Verite, with more of the pipeline accelerated and
bus master DMA, will probably win. IndyCar II (which some will
probably call an "unfair comparison", since a native port doesn't
exist for the Voodoo... but I am pretty confident that if a port
did exist it would still be faster on Verite) is a good example of
this sort of game. There are many other things about particular
games that might become the bottleneck for performance reasons, so
the best advice continues to be: "Run the game you want to play
on both and pick the one you like the best depending upon the results."

There has been a lot of discussion about Direct3D numbers and that
Voodoo is much better than Verite 1000 in D3D. It is probably
safe to say that neither achieves its full performance capability
under D3D yet. But D3D is still very young, and with a little more
time it is likely that everyone's drivers will get tuned to the
point that the API adds an acceptably small overhead to games,
given the advantage in portability it should provide. Probably the
biggest challenge for D3D is not its overhead but its lack of support
for many of the features that advanced 3D accelerators provide. APIs
typically take a kind of least common denominator approach. There are
scads of things that Verite 1000 (and I suspect Voodoo) can do that
are not available at all through the D3D API. This, of course, will
be the grist for many long drawn-out conversations between companies
like ours, SW developers, and MS's 3D teams over the coming months.

But back to D3D, in "benchmark" cases typically
the CPU is doing nothing but telling the gfx chip to render. This,
unfortunately, obviates a couple of the Verite's main architectural
advantages, namely bus master DMA (since there are typically few
textures in D3D benchmarks) and acceleration of more of the pipeline.
Today the three D3D programs (D3DTest, Tunnel, and Twist) are all so
simple in geometry that they mostly measure nothing but fill rate,
and there Voodoo is the clear leader. (I might mention here that the
fill rate test of D3DTest is an apples-to-oranges comparison between
the two chips, since the Voodoo runs it full screen and can do page
flipping whereas the Verite, running in a window, has to do copy
double buffering, which greatly exaggerates the difference.)
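For a concrete sense of that page-flipping point: a flip retargets scanout with what amounts to a pointer swap, while a windowed renderer has to move every pixel into the visible surface each frame. A tiny C sketch (resolution and depth are mine, purely illustrative):

```
/* Page flipping vs. copy double buffering: the flip is a pointer
 * swap with zero pixel traffic (on hardware, a write to the display
 * base address), while the copy moves the whole frame through memory
 * on top of the bandwidth already spent rendering it. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WIDTH  640
#define HEIGHT 480

int main(void)
{
    uint16_t *bufA = malloc(WIDTH * HEIGHT * sizeof *bufA);
    uint16_t *bufB = malloc(WIDTH * HEIGHT * sizeof *bufB);
    if (!bufA || !bufB)
        return 1;

    uint16_t *front = bufA, *back = bufB;

    /* Full screen: page flip, one pointer swap per frame. */
    uint16_t *tmp = front; front = back; back = tmp;

    /* In a window: copy double buffering, every pixel crosses memory
     * once per frame. */
    memcpy(front, back, (size_t)WIDTH * HEIGHT * sizeof *front);

    printf("flip moved 0 bytes; copy moved %zu bytes per frame\n",
           (size_t)WIDTH * HEIGHT * sizeof *front);
    free(bufA);
    free(bufB);
    return 0;
}
```

At 640x480 in 16-bit color that copy is about 600 KB of extra memory traffic per frame, straight out of the fill rate the benchmark is supposed to be measuring.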

When the CPU is sitting around doing nothing (like in most benchmarks)
it could be used to process more of the pipeline and speed up the
overall throughput of the app. This is rarely the case in a game but
in D3DTest, if the Verite were to accept triangles at the edge-walk
or span interface rather than at the earlier, pre-setup point, its
triangle performance would about double. BUT, if the Verite D3D driver
did that, then game play in real games would most likely suffer, so we
have chosen to take the hit in "naive" benchmark scores in favor of
better game play for sophisticated games.

There is a minor point that today D3D makes it very hard to achieve
processing overlap when DMA is used to move data. We are working on
this with MS and believe that problem will go away soon, and the actual
performance that the Verite is capable of will be available through D3D.
(As things are now it certainly isn't a shabby performer in the games
I've seen - more often than not the difference between Voodoo and
Verite is hardly discernible).

Anyway, 3D is a fairly complicated space. Of primary importance are
rendering rates, rendering quality (judged by no benchmark I have seen
yet...), rendering features, closeness of rendering needs and data
formats of applications with features and data formats supported by
an accelerator, and so on. I think that we will see far greater strides
forward over the next couple of years than we have seen so far, and the
space of discourse will increase in size, which should make rich fodder
for ongoing religious debates like this one.

Enjoy,
Robert Mullis.
Rendition, Inc.
(While I work at Rendition, Inc., the comments expressed here are my
own opinion and should not be construed to be a statement of official
position or policy of Rendition, Inc.).

Super interesting thread with some comments by Brian Hook.
http://groups.google.com/groups?hl=...64ce0&seekm=32E3DF82.B94@wksoftware.com#link1

Overall there were many comments saying that the Verite's RISC architecture was at a severe disadvantage against dedicated ASICs like the Voodoo; specialized hardware would always win. Many liked the flexibility a RISC core offered, though.

There was a huge problem with the Verite and early D3D. The two didn't get along well, especially where the Verite's DMA transfers were concerned. In the end, it cost the chip something like 50% of its potential performance.

It's all a very interesting historical view, and quite interesting relative to what we have today. At least to me 8)
 
I remember you could upload your own microcode to the Verite. I asked for the specs, but they wouldn't give them to me. The guy at Rendition (forgot his name) did mention that they had special microcode for Quake and also for the water in Tomb Raider. The microcode wasn't accessible from their API (RRedline) either, unfortunately.

So yes, it does seem like the Verite was quite programmable/flexible. Too bad it was so damn slow.
 
Oh boy...

this thread reminds me of my favorite hypothetical console system back then:

basis : Saturn

enhancements :

66 or 100MHz SH3 with 4MB 66MHz SDRAM MainRAM
33MHz V1000 with 4MB 66MHz SDRAM VideoRAM

introduction: mid-1995


imho this system would have been far better than either the N64 or the PS1; but with 8MB of RAM, really, really expensive :D
 