Carmack on Cg, P10 and apology to Matrox about DM

ram

I need to apologize to Matrox -- their implementation of hardware displacement mapping is NOT quad based. I was thinking about a certain other company's proposed approach. Matrox's implementation actually looks quite good, so even if we don't use it because of the geometry amplification issues, I think it will serve the noble purpose of killing dead any proposal to implement a quad-based solution.

I got a 3Dlabs P10 card in last week, and yesterday I put it through its paces. Because my time is fairly over committed, first impressions often determine how much work I devote to a given card. I didn't speak to ATI for months after they gave me a beta 8500 board last year with drivers that rendered the console incorrectly. :)

I was duly impressed when the P10 just popped right up with full functional support for both the fallback ARB_ extension path (without specular highlights), and the NV10 NVidia register combiners path. I only saw two issues that were at all incorrect in any of our data, and one of them is debatable. They don't support NV_vertex_program_1_1, which I use for the NV20 path, and when I hacked my programs back to 1.0 support for testing, an issue did show up, but still, this is the best showing from a new board from any company other than Nvidia. It is too early to tell what the performance is going to be like, because they don't yet support a vertex object extension, so the CPU is hand feeding all the vertex data to the card at the moment. It was faster than I expected for those circumstances.
Given the good first impression, I was willing to go ahead and write a new back end that would let the card do the entire Doom interaction rendering in a single pass. The most expedient sounding option was to just use the Nvidia extensions that they implement, NV_vertex_program and NV_register_combiners, with seven texture units instead of the four available on GF3/GF4. Instead, I decided to try using the prototype OpenGL 2.0 extensions they provide.

The implementation went very smoothly, but I did run into the limits of their current prototype compiler before the full feature set could be implemented. I like it a lot. I am really looking forward to doing research work with this programming model after the compiler matures a bit. While the shading languages are the most critical aspects, and can be broken out as extensions to current OpenGL, there are a lot of other subtle-but-important things that are addressed in the full OpenGL 2.0 proposal.

I am now committed to supporting an OpenGL 2.0 renderer for Doom through all the spec evolutions. If anything, I have been somewhat remiss in not pushing the issues as hard as I could with all the vendors. Now really is the critical time to start nailing things down, and the decisions may stay with us for ten years.

A GL2 driver won't give any theoretical advantage over the current back ends optimized for cards with 7+ texture capability, but future research work will almost certainly be moving away from the lower level coding practices, and if some new vendor pops up (say, Rendition back from the dead) with a next-gen card, I would strongly urge them to implement GL2 instead of proprietary extensions.

I have not done a detailed comparison with Cg. There are a half dozen C-like graphics languages floating around, and honestly, I don't think there is a hell of a lot of usability difference between them at the syntax level. They are all a whole lot better than the current interfaces we are using, so I hope syntax quibbles don't get too religious. It won't be too long before all real work is done in one of these, and developers that stick with the lower level interfaces will be regarded like people that write all-assembly PC applications today. (I get some amusement from the all-assembly crowd, and it can be impressive, but it is certainly not effective) I do need to get up on a soapbox for a long discourse about why the upcoming high level languages MUST NOT have fixed, queried resource limits if they are going to reach their full potential. I will go into a lot of detail when I get a chance, but drivers must have the right and responsibility to multipass arbitrarily complex inputs to hardware with smaller limits. Get over it.
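
For readers less familiar with what these "paths" are: a renderer typically picks its back end at startup by checking the driver's extension string. Below is a minimal sketch of that kind of check in C. It is not the actual Doom code; the helper and enum names are invented, and the extension strings are the ones registered in the OpenGL extension registry.

    #include <string.h>
    #include <GL/gl.h>

    /* Crude check for an extension in the driver's extension string.
       Needs a current GL context; note that a plain strstr() can be fooled
       by prefixes (e.g. "GL_NV_vertex_program" is a substring of
       "GL_NV_vertex_program1_1"), so production code tokenizes the string. */
    static int has_extension(const char *name)
    {
        const char *exts = (const char *)glGetString(GL_EXTENSIONS);
        return exts != NULL && strstr(exts, name) != NULL;
    }

    enum render_path { PATH_ARB, PATH_NV10, PATH_NV20 };

    static enum render_path select_render_path(void)
    {
        if (has_extension("GL_NV_register_combiners") &&
            has_extension("GL_NV_vertex_program1_1"))
            return PATH_NV20;   /* GF3/GF4-class path */
        if (has_extension("GL_NV_register_combiners"))
            return PATH_NV10;   /* register combiners, 1.0 vertex programs */
        return PATH_ARB;        /* fallback ARB_ path, without specular highlights */
    }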

How on earth is 3Dlabs allowed to use NV proprietary extensions? Did they license them?
 
NVIDIA opened them up after ATi and Matrox started working on ATI's extensions.

That sounds pretty encouraging for P10 and for OpenGL2.0 though.
 
Sounds very encouraging indeed, for the P10. I am looking forward to this VGA more and more.

And nice how he corrects his mistake regarding Parhelia's DM right at the beginning of the .plan.

ta,
.rb
 
I knew he had been wrong with his previous .plan update about Parhelia re d.mapping but I was, um, forced to keep this quiet.

Now I know why he hasn't answered my emails. A few of my questions were addressed by his new .plan update... Cg, P10, OGL2.0 and the like...
 
But which other company is proposing a quad approach? ATi or nVidia? Or maybe it's the *cough* BitBoys; god help us all :D
 
Mr Carmack seems rather biased against quadrilateral systems in favour of triangles, but, AFAICS, there are pros and cons with both systems.

Triangular systems would certainly be easier to use than quads for arbitrary topology (such as you may get with character models), but for things like terrain, quads might be cheaper.

I also think quads are probably easier to join together: with triangular patches, I believe you need to go to higher-order curves to obtain the equivalent level of continuity.
 
There is no need to use d.mapping for terrain at the moment - the visual end result is not worth the effort. More work, less immediate visual impact. Carmack is right to be "biased" (as you put it, Simon) in this respect.

You telling us something, Simon? :)
 
I like this quote:

Multi chip and multi card solutions are also coming, meaning that you will be able to fit more frame rendering power in a single tower case than Pixar's entire rendering farm. Next year.

I wonder what he's talking about? Hmm next year :)
 
Thanks for the link, Rev. This way I finally was able to locate another copy of the Peercy paper I mislaid. ;)

Interesting comments in that posting, too. Floating point pipelines this year, still--sounds very promising.

ta,
-Sascha.rb
 
For the record, his post at Slashdot, quoted:
Realtime and offline rendering ARE converging (Score:5, Informative)
by John Carmack on Thursday June 27, @10:51PM (#3784210)
(User #101025 Info)
There are some colorful comments here about how studios will never-ever-ever replace tools like renderman on render farms with hardware accelerated rendering. These comments are wrong.

The current generation of cards do not have the necessary flexibility, but cards released before the end of the year will be able to do floating point calculations, which is the last gating factor. Peercy's (IMHO seminal) paper showed that given dependent texture reads and floating point pixels, you can implement renderman shaders on real time rendering hardware by decomposing it into lots of passes. It may take hundreds of rendering passes in some cases, meaning that it won't be real time, but it can be done, and will be vastly faster than doing it all in software. It doesn't get you absolutely every last picky detail, but most users will take a couple orders of magnitude improvement in price/performance and cycle time over getting to specify, say, the exact filter kernel jitter points.

There will always be some market for the finest possible rendering, using ray tracing, global illumination, etc in a software renderer. This is analogous to the remaining market for vector supercomputers. For some applications, it is still the right thing if you can afford it. The bulk of the frames will migrate to the cheaper platforms.

Note that this doesn't mean that technical directors at the film studios will have to learn a new language -- there will be translators that will go from existing languages. Instead of sending their RIB code to the renderfarm, you will send it to a program that decomposes it for hardware acceleration. They will return image files just like everyone is used to.

Multi chip and multi card solutions are also coming, meaning that you will be able to fit more frame rendering power in a single tower case than Pixar's entire rendering farm. Next year.

I had originally estimated that it would take a few years for the tools to mature to the point that they would actually be used in production work, but some companies have done some very smart things, and I expect that production frames will be rendered on PC graphics cards before the end of next year. It will be for TV first, but it will show up in film eventually.

John Carmack
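
To make the Peercy point a little more concrete: a shader that needs more instructions or texture fetches than one hardware pass provides can be split into several passes, each pass writing its intermediate result into a floating point texture that the next pass reads back. Here is a toy C sketch of the pass-splitting idea; the per-pass limit and the flat operation count are invented for illustration, and a real decomposer works on the shader's dependency graph rather than a linear list.

    #include <stdio.h>

    #define OPS_PER_PASS 8   /* hypothetical per-pass hardware limit */

    /* Split a shader of 'total_ops' operations into hardware-sized passes.
       Conceptually, each pass renders its partial result into an
       intermediate floating point texture that the next pass reads back. */
    static int schedule_passes(int total_ops)
    {
        int passes = 0;
        int done;
        for (done = 0; done < total_ops; done += OPS_PER_PASS) {
            int n = total_ops - done;
            if (n > OPS_PER_PASS)
                n = OPS_PER_PASS;
            printf("pass %d: %d ops -> intermediate texture\n", ++passes, n);
        }
        return passes;
    }

    int main(void)
    {
        /* a film-quality shader could easily need hundreds of passes */
        printf("total passes: %d\n", schedule_passes(2000));
        return 0;
    }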
 
"Multi chip and multi card solutions are also coming, meaning that you will be able to fit more frame rendering power in a single tower case than Pixar's entire rendering farm. Next year."

So we have multi chip and multi card solutions coming next year, according to JC. My question is: is he talking about consumer level cards, or high end professional cards that cost a small fortune?
 
Reverend said:

Very interesting post, thanks for the link!

As for why a quad-based approach would be so bad, I guess I don't quite understand realtime subdivision surfacing well enough to see why without more explanation. In my software rendering, or rather modelling, I've always preferred quads over tris when it comes to subdivision surfaces and they never caused me any unexpected problems; I guess it depends on the implementation ...
 
Thus quoth John Carmack:

I do need to get up on a soapbox for a long discourse about why the upcoming high level languages MUST NOT have fixed, queried resource limits if they are going to reach their full potential. I will go into a lot of detail when I get a chance, but drivers must have the right and responsibility to multipass arbitrarily complex inputs to hardware with smaller limits.

Could someone who is more into these things please explain the bearing this comment has on current alternatives?

Entropy
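
A short illustration of what the quoted passage is getting at: in current OpenGL the application queries fixed limits (texture units, instruction counts) and has to plan its own passes around them, so every engine reimplements the same splitting logic. Carmack's argument is that a GL2-style shading language should accept arbitrarily complex shaders and leave that splitting to the driver. A minimal C sketch of the status quo follows; the texture count parameter is hypothetical, and the enum is the standard ARB_multitexture one.

    #include <GL/gl.h>
    #include <GL/glext.h>   /* for GL_MAX_TEXTURE_UNITS_ARB on some platforms */

    /* Status quo: the application queries a fixed, driver-reported limit and
       plans its own passes around it.  'textures_needed' is a hypothetical
       count of the textures one lighting equation wants; needs a current
       GL context. */
    static int passes_needed(int textures_needed)
    {
        GLint max_units = 1;
        glGetIntegerv(GL_MAX_TEXTURE_UNITS_ARB, &max_units);  /* e.g. 4 on GF3/GF4 */
        return (textures_needed + max_units - 1) / max_units;
    }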
 
Reverend said:
There is no need to use d.mapping for terrain at the moment - the visual end result is not worth the effort. More work, less immediate visual impact. Carmack is right to be "biased" (as you put it, Simon) in this respect.

You telling us something, Simon? :)
Only that I try to keep up with the literature. I had a quick look at the stuff referenced by Marco (mfa) about converting tris to 3 quads (which looks nicer than the earlier Graphics Gems approach). It seems to me that it wouldn't matter if a hardware solution handled quads internally, since it seems reasonably easy to convert to them. According to Graphics Gems, OTOH, it's not so nice going the other way...
What would be really nice, though, is having subdivision surfaces, but I suppose that's a long way off. There are too many different types to choose from and I can't see anyone settling on a preferred method! :rolleyes:
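
For anyone curious about the tri-to-quads conversion mentioned above: one common construction (not necessarily the one mfa referenced) splits a triangle into three quads by joining its centroid to the three edge midpoints. A small C sketch, with invented struct names, purely for illustration:

    typedef struct { float x, y, z; } vec3;
    typedef struct { vec3 v[4]; } quad;

    static vec3 midpoint(vec3 a, vec3 b)
    {
        vec3 m = { (a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f, (a.z + b.z) * 0.5f };
        return m;
    }

    /* Split triangle (a, b, c) into three quads by connecting its centroid
       to the three edge midpoints. */
    static void triangle_to_quads(vec3 a, vec3 b, vec3 c, quad out[3])
    {
        vec3 mab = midpoint(a, b);
        vec3 mbc = midpoint(b, c);
        vec3 mca = midpoint(c, a);
        vec3 g = { (a.x + b.x + c.x) / 3.0f,
                   (a.y + b.y + c.y) / 3.0f,
                   (a.z + b.z + c.z) / 3.0f };

        quad q0 = { { a, mab, g, mca } };
        quad q1 = { { b, mbc, g, mab } };
        quad q2 = { { c, mca, g, mbc } };
        out[0] = q0;
        out[1] = q1;
        out[2] = q2;
    }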
 
I had originally estimated that it would take a few years for the tools to mature to the point that they would actually be used in production work, but some companies have done some very smart things, and I expect that production frames will be rendered on PC graphics cards before the end of next year. It will be for TV first, but it will show up in film eventually.

Very interesting, I wonder which companies he's talking about?
 