Old 10-Aug-2004, 09:11   #1
Dave Baumann
Gamerscore Wh...
 
Join Date: Jan 2002
Posts: 12,947
Default John Carmack Post Doom3 Release Technology Interview

Since its release, Doom3 has been the subject of many discussions covering gameplay, graphics, benchmarking and tweaking. In this short interview we talk to id’s John Carmack about various elements of the title that are pertinent to benchmarking, as well as some of the queries that have arisen following some of the choices made within the game's shader code.
"What's the situation with regards to depth bounds test implementation in the game? Doesn't appear to have any effect on a NV35 or NV40 using the cvar.
Nvidia claims some improvement, but it might require unreleased drivers. It's not a big deal one way or another."
Read the full interview here.
Dave Baumann is offline   Reply With Quote
Old 10-Aug-2004, 09:53   #2
jvd
Naughty Boy!
 
Join Date: Feb 2002
Location: new jersey
Posts: 12,731
Default

Very interesting, Dave. It explains a little better what Humus actually changed.
__________________
Freexbox 360 !!!
Free Psp!
jvd is offline   Reply With Quote
Old 10-Aug-2004, 10:50   #3
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

What I would find interesting would be a benchmark like:

normal path vs humus optimization vs specular disabled.
Anonymous is offline   Reply With Quote
Old 10-Aug-2004, 11:34   #4
Solr_Flare
Registered
 
Join Date: Aug 2004
Posts: 9
Default

Very interesting read, and it pretty much confirms what most of us figured: it was simply a design choice based on the hardware available when the engine was designed, made to give the best "overall" efficiency across the widest range of cards.

As the Humus code showed, changing from a lookup does benefit later-generation ATI hardware, since it can handle the math more efficiently, thus giving it a bit more headroom. That either nets you a few extra frames and/or lets you use the ATI AF implementation, which nets you slightly better overall image quality without any significant cost.

In the case of Nvidia's latest hardware, however, which seems to be more efficient at handling table lookups, switching to the Humus code seems to have the reverse effect on performance.

If I were to make a guess, I'd say specular disabled would show the best performance, with Humus and the original code flip-flopping based on the hardware design and generation.

A bigger question I would ask, then, based on this response: is it reasonable to assume that, with the proper work, the table lookup could be replaced in a way that would net significant gains on all current-generation video hardware? Or are table lookups going to remain the de facto best choice for all Nvidia and some older-generation ATI cards simply based on design, leaving math alternatives really only benefiting R3x0 ATI cards and higher?

In other words, how much further can we go before breaking image quality to the point where it isn't worth it, or finding the gains not significant enough to warrant a change?
Solr_Flare is offline   Reply With Quote
Old 10-Aug-2004, 13:02   #5
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

Could this become a problem in future that games will favour the texture access over the straight math in shaders?

The way it's meant to be programmed
Anonymous is offline   Reply With Quote
Old 10-Aug-2004, 13:51   #6
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

Quote:
Could this become a problem in future that games will favour the texture access over the straight math in shaders?
My understanding (and it is extraordinarily limited) is that shader computation/math power is increasing at a much faster rate than available bandwidth, so wherever possible, future engines and games will favour mathematical shader operations over texture lookups.

Obviously, if it's a choice between a 4,000 step multipassed shader or a texture lookup, I'd guess that engine designers would take the texture lookup option, at least for the reasonably foreseeable future.
Anonymous is offline   Reply With Quote
Old 10-Aug-2004, 14:20   #7
DarN
Member
 
Join Date: Jan 2004
Location: Norway
Posts: 406
Default

He actually has a cvar to help calculate driver overhead. That's pretty cool. Has anyone tried this yet?
__________________
I like all womens, even the fat ones. - Fernando Martinez
Get Opera
DarN is offline   Reply With Quote
Old 10-Aug-2004, 14:57   #8
Gubbi
Senior Member
 
Join Date: Feb 2002
Posts: 2,544
Default

Quote:
Originally Posted by Rolphus
Quote:
Could this become a problem in future that games will favour the texture access over the straight math in shaders?
My understanding (and it is extraordinarily limited) is that shader computation/math power is increasing at a much faster rate than available bandwidth, so wherever possible, future engines and games will favour mathematical shader operations over texture lookups.

Obviously, if it's a choice between a 4,000 step multipassed shader or a texture lookup, I'd guess that engine designers would take the texture lookup option, at least for the reasonably foreseeable future.
I think it'll take some time before that's the case.

The reason why ATI's GPUs see a big speed-up is that they do maximum anisotropic filtering on the texture lookup, i.e. instead of taking 1 (bilinear) sample they take 8 for every tex lookup, hence a big drop in performance when 8xAF is applied.

The patched shader remedies this by not doing a tex lookup at all.

AF on tex lookups is of course a good thing when you're actually using a real graphical texture. However, in this case the texture lookup is used as a discrete function ( f[x]=>y ), and one can argue that AF on 1D textures makes no sense at all (at least when doing magnification). If you want interpolated values, bilinear would do just as well as anisotropic filtering (again, only in the case of 1D textures).
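
To make the "texture as a discrete function" point concrete, here is a minimal Python sketch of a 1D table sampled with a single linearly filtered fetch. It is not Doom 3's shader code: the table size (256), the stand-in function (x squared) and the helper names make_table and fetch_linear are all assumptions chosen for illustration. The point is only that one filtered fetch already reproduces a smooth function closely, so extra anisotropic samples on such a table buy nothing.

```python
def make_table(f, size):
    """Sample f at `size` evenly spaced points on [0, 1]."""
    return [f(i / (size - 1)) for i in range(size)]


def fetch_linear(table, x):
    """One linearly filtered fetch from a 1D table, with x clamped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    pos = x * (len(table) - 1)          # continuous texel coordinate
    i = int(pos)                        # left texel
    j = min(i + 1, len(table) - 1)      # right texel, clamped at the edge
    t = pos - i                         # blend weight between the two texels
    return table[i] * (1.0 - t) + table[j] * t


# Hypothetical stand-in for the encoded function; not the game's real table.
table = make_table(lambda x: x * x, 256)

# Compare the single filtered fetch with evaluating the math directly.
error_at_half = abs(fetch_linear(table, 0.5) - 0.5 * 0.5)
```

With a 256-entry table the interpolation error for a smooth function like this is tiny, which is why a plain bilinear fetch is enough when the texture is just a function table.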

I don't know if the application of AF in ATI hardware is a consequence of their texture cache design, or if it is simply a driver "bug". Considering they use an adaptive AF approach, I would think that they could indicate the degree of anisotropicity (is that a real word?) and just set it to 1. In that case the next set of Catalyst drivers is going to give a big boost to performance, especially in high quality.

Cheers
Gubbi
Gubbi is offline   Reply With Quote
Old 10-Aug-2004, 15:19   #9
BetrayerX
Junior Member
 
Join Date: Aug 2003
Location: PR
Posts: 74
Default

I know this might be controversial, but I believe he should have been asked if he recognizes the shader replacement as a legit one....

He indirectly acknowledges it, but anyway, a straight answer, especially since many believe that the fix is actually better than the original line of code in the case of the ARB2 path, would silence some voices out there (I know it smells like flame fuel, but that's not the intended purpose).

I'll put it another way, will Beyond3D use the "Humus Hack" when they benchmark Doom3?
__________________
Are you wise enough to admit your own ignorance?
BetrayerX is offline   Reply With Quote
Old 10-Aug-2004, 16:08   #10
Tokelil
Member
 
Join Date: Mar 2002
Location: Denmark
Posts: 329
Default

Quote:
Originally Posted by Gubbi
The reason why ATI's GPUs see a big speed-up is that they do maximum anisotropic filtering on the texture lookup, i.e. instead of taking 1 (bilinear) sample they take 8 for every tex lookup, hence a big drop in performance when 8xAF is applied.
My knowledge is very limited when it comes to 3D programming, so sorry if this is a stupid question.

When AF is forced through the CP, does the driver know that some of the textures shouldn't have AF applied?
If not, shouldn't the performance on Nvidia's hardware decrease when AF is forced, just like ATI's? (I don't know whether it does or not.)
Tokelil is offline   Reply With Quote
Old 10-Aug-2004, 16:46   #11
Rolphus
Registered
 
Join Date: Aug 2004
Location: Reading, UK
Posts: 4
Default

Quote:
Originally Posted by Gubbi
Quote:
Originally Posted by Rolphus
Quote:
Could this become a problem in future that games will favour the texture access over the straight math in shaders?
My understanding (and it is extraordinarily limited) is that shader computation/math power is increasing at a much faster rate than available bandwidth, so wherever possible, future engines and games will favour mathematical shader operations over texture lookups.

Obviously, if it's a choice between a 4,000 step multipassed shader or a texture lookup, I'd guess that engine designers would take the texture lookup option, at least for the reasonably foreseeable future.
I think it'll take some time before that's the case.

The reason why ATI's GPUs see a big speed-up is that they do maximum anisotropic filtering on the texture lookup, i.e. instead of taking 1 (bilinear) sample they take 8 for every tex lookup, hence a big drop in performance when 8xAF is applied.

The patched shader remedies this by not doing a tex lookup at all.

AF on tex lookups is of course a good thing when you're actually using a real graphical texture. However, in this case the texture lookup is used as a discrete function ( f[x]=>y ), and one can argue that AF on 1D textures makes no sense at all (at least when doing magnification). If you want interpolated values, bilinear would do just as well as anisotropic filtering (again, only in the case of 1D textures).

I don't know if the application of AF in ATI hardware is a consequence of their texture cache design, or if it is simply a driver "bug". Considering they use an adaptive AF approach, I would think that they could indicate the degree of anisotropicity (is that a real word?) and just set it to 1. In that case the next set of Catalyst drivers is going to give a big boost to performance, especially in high quality.

Cheers
Gubbi
You're probably right - after all, all I know about this stuff is what I've picked up from various websites.

My point about shader power increasing a lot faster than bandwidth is a fairly direct quote from one or other developer interview - I think Tim Sweeney, but I couldn't swear to it. I know that things are pretty much in the balance at the moment, and complex math operations are still a lot slower than texture access (as far as instruction cost goes), but something that was raised in the Humus' tweak thread might be relevant here: fetching textures has a large latency compared to math ops, and can cause pipeline stalls.

You're absolutely right about the anisotropic filtering point though, I can see why that would cause a lot of problems, especially if you have to fetch 8 samples into the texture cache. Obviously, the higher the degree of anisotropy (I think that's the right word) applied, the more this will cause problems. I'm guessing that using app-controlled AF avoids this situation to an extent, which is why the performance boost isn't that big when using app AF. It's still tangible though on an R420 from what I've seen on my X800 Pro. That leads me to believe that the MAD_SAT and POW operations are less "expensive" overall than the texture lookup. This, as you've pointed out, only gets worse when AF is forced in the driver.

Of course, I could be entirely wrong, as I'm a graphics technology newb
Rolphus is offline   Reply With Quote
Old 10-Aug-2004, 18:21   #12
nggalai
Member
 
Join Date: Feb 2002
Location: /home/rb/Switzerland
Posts: 402
Default

Quote:
Originally Posted by Tokelil
1) When AF is forced through the CP, does the driver know that some of the textures shouldn't have AF applied?

2) If not, shouldn't the performance on Nvidia's hardware decrease when AF is forced, just like ATI's? (I don't know whether it does or not.)
1) That's for later drivers. Right now, forcing AF in the control panel means AF on ALL texture data. At least for DooM3.

2) Performance does decrease quite considerably on NV if you force AF through the control panel (10-30% for GFFX). Humus' / Demirug's tweaks are mostly counter-productive for NV as the driver's shader replacement gets negated.

93,
-Sascha.rb
__________________
Sascha "nggalai" Erni, -.rb
www.3dcenter.org | www.cgworld.de | www.nggalai.com

"Size is not as important as fill rate or at least thats what I have been told. 8)" -jb, August 2002
nggalai is offline   Reply With Quote
Old 10-Aug-2004, 18:43   #13
WaltC
Senior Member
 
Join Date: Jul 2002
Location: BelleVue Sanatorium, Billary, NY. Patient privileges: Internet access
Posts: 2,694
Default

Quote:
Originally Posted by BetrayerX
I know this might be controversial, but I believe he should have been asked if he recognizes the shader replacement as a legit one....

He indirectly acknowledges it, but anyway, a straight answer, especially since many believe that the fix is actually better than the original line of code in the case of the ARB2 path, would silence some voices out there (I know it smells like flame fuel, but that's not the intended purpose).

I'll put it another way, will Beyond3D use the "Humus Hack" when they benchmark Doom3?
I see nothing unclear, vague, or equivocal about his statement:

Quote:
Originally Posted by Carmack
The lookup table is constant in Doom, so there isn't any real strong argument against replacing it with code. The lookup table was faster than doing the exact sequence of math ops that the table encodes, but I can certainly believe that a single power function is faster than the table lookup.
Indeed, he also says that had he known in advance the abundance of time he'd have prior to shipping D3, he might well have done things differently than he did:

Quote:
Originally Posted by Carmack
The specular function in Doom isn't a power function, it is a series of clamped biases and squares that looks something like a power function, which is all that could be done on earlier hardware without fragment programs. Because all the artwork and levels had been done with that particular function, we thought it best to mimic it exactly when we got fragment program capable hardware. If I had known how much longer Doom was going to take to ship from that time, I might have considered differently.
I think JC's as clear as he can be that there is no doubt as to both the efficacy and the desirability of the power function approach as applied to newer hardware, and in providing his reasons as to why he didn't take that route. In short, I think this is as close to him saying "Yea, I should have done that," as you are ever likely to see...
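
For anyone trying to picture the difference between "a series of clamped biases and squares" and "a single power function", here is a rough Python sketch. The constants are pure assumptions made up for illustration (a bias of 0.75, a scale of 4, one square, an exponent of 16); they are not the values Doom 3 or the Humus tweak uses. It only shows how the two shapes can look alike without being identical, which is why swapping one for the other can change the image slightly.

```python
def clamp01(v):
    """Clamp a value to the [0, 1] range, like a saturate in a shader."""
    return min(max(v, 0.0), 1.0)


def clamped_bias_square(x):
    """Hypothetical 'clamped bias and square' falloff (illustrative constants)."""
    s = clamp01(4.0 * (x - 0.75))   # bias and scale, then clamp
    return s * s                    # square to sharpen the highlight


def power_specular(x, exponent=16.0):
    """Single power function; the exponent is an assumption, not a known value."""
    return clamp01(x) ** exponent


def max_abs_difference(steps=1000):
    """Largest gap between the two curves sampled over [0, 1]."""
    return max(
        abs(clamped_bias_square(i / steps) - power_specular(i / steps))
        for i in range(steps + 1)
    )
```

Both curves are 0 for small inputs and reach 1 at an input of 1, but they differ in between; calling max_abs_difference() gives a feel for how far apart they get with these made-up constants.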

Edit: Hard to say for sure, of course, but this sentence, "Because all the artwork and levels had been done with that particular function, we thought it best to mimic it exactly when we got fragment program capable hardware," sounds very much as if he's saying that the artwork and levels for D3 were all finished prior to them laying hands on an R300. That was a long time ago, but it certainly fits perfectly with his next sentence, as nearly two years passed between R300 shipping and D3 shipping. Anyone else with a different take on that?
WaltC is offline   Reply With Quote
Old 10-Aug-2004, 19:18   #14
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

AAAAAAAAAAAAAAh, why didn't you ask about 3Dc, and which rendering path applies to ATI cards and gets better performance? =[
Anonymous is offline   Reply With Quote
Old 10-Aug-2004, 20:23   #15
Sxotty
Senior Member
 
Join Date: Dec 2002
Location: Under a Crushing Burden
Posts: 4,290
Default

Quote:
Originally Posted by Anonymous
Humus is fertilizer LOOOO¬ SOUR ORANGE PEEL.

And Carmack is a cheater =(

It reminds me of Gabe Newell

Is all I can say to that

Walt, I do not think they had the artwork done; I think his point was to make it look the same on all cards. Now, he may be saying that if he had realized how long it would take to do the artwork and levels, he might have tried a different strategy on newer cards.
__________________
You bought horse armor didn't you?
Sxotty is offline   Reply With Quote
Old 10-Aug-2004, 22:34   #16
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

I think all this sounds strange. I mean, by choosing the option he did, Carmack knew it would lower ATI's performance, and then he said nothing about it... talk about cheating in Nvidia's favor, or whatever you would like to call it.

Many of us took for granted that the latest id Software engine would come optimized for all cards. But I guess that is not the case.
Also, claiming time limits as the reason not to fix this for ATI cards is just BS.

Doom 3 must be the most overrated game in a long time.
Gamespot has a decent review, but 8.5 is still too much; more like 7.5.
Most Doom 3 review websites are just a good laugh.

Comparing Doom 3 to Far Cry is like comparing Skoda to BMW.
(The Hell level was nice, but that's it.)

id haven't been honest with us, and this makes me wonder what more is to come. Talk about hyping up Nvidia cards and then the hard truth comes straight back in your face.

I hope game developers choose other engines for their games, since this engine is obviously not optimized for 50% of the computer gaming market.
Anonymous is offline   Reply With Quote
Old 10-Aug-2004, 22:37   #17
BetrayerX
Junior Member
 
Join Date: Aug 2003
Location: PR
Posts: 74
Default

Quote:
Originally Posted by WaltC

I see nothing unclear, vague, or equivocal about his statement:
Oh, wait until Ruined gives his interpretation of Carmack's comments, and then come back and tell me the same again.
__________________
Are you wise enough to admit your own ignorance?
BetrayerX is offline   Reply With Quote
Old 10-Aug-2004, 23:14   #18
Fox5
Senior Member
 
Join Date: Mar 2002
Posts: 3,674
Default

Quote:
Originally Posted by Anonymous
I think all this sounds strange. I mean, by choosing the option he did, Carmack knew it would lower ATI's performance, and then he said nothing about it... talk about cheating in Nvidia's favor, or whatever you would like to call it.

Many of us took for granted that the latest id Software engine would come optimized for all cards. But I guess that is not the case.
Also, claiming time limits as the reason not to fix this for ATI cards is just BS.

Doom 3 must be the most overrated game in a long time.
Gamespot has a decent review, but 8.5 is still too much; more like 7.5.
Most Doom 3 review websites are just a good laugh.

Comparing Doom 3 to Far Cry is like comparing Skoda to BMW.
(The Hell level was nice, but that's it.)

id haven't been honest with us, and this makes me wonder what more is to come. Talk about hyping up Nvidia cards and then the hard truth comes straight back in your face.

I hope game developers choose other engines for their games, since this engine is obviously not optimized for 50% of the computer gaming market.
I far prefer Doom 3 to Far Cry. Far Cry may be better in many different aspects, but I feel Doom 3 is more fun. It's more fun to shoot and run around in than Far Cry was. Far Cry is more ambitious; Doom 3 is more polished.
Fox5 is offline   Reply With Quote
Old 10-Aug-2004, 23:38   #19
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

lol, who had to recall a 1.2 patch (with ATI optimizations) five months after the game was released? It's not id Software.

It has been shown that the Humus trick brings around 1 fps... oh dear, big deal.


Quote:
Originally Posted by Anonymous
I think all this sounds strange. I mean, by choosing the option he did, Carmack knew it would lower ATI's performance, and then he said nothing about it... talk about cheating in Nvidia's favor, or whatever you would like to call it.

Many of us took for granted that the latest id Software engine would come optimized for all cards. But I guess that is not the case.
Also, claiming time limits as the reason not to fix this for ATI cards is just BS.

Doom 3 must be the most overrated game in a long time.
Gamespot has a decent review, but 8.5 is still too much; more like 7.5.
Most Doom 3 review websites are just a good laugh.

Comparing Doom 3 to Far Cry is like comparing Skoda to BMW.
(The Hell level was nice, but that's it.)

id haven't been honest with us, and this makes me wonder what more is to come. Talk about hyping up Nvidia cards and then the hard truth comes straight back in your face.

I hope game developers choose other engines for their games, since this engine is obviously not optimized for 50% of the computer gaming market.
Anonymous is offline   Reply With Quote
Old 11-Aug-2004, 00:01   #20
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

I'm not defending Crytek for releasing a 1.2 patch that did not work on ATI hardware. They messed up, but set the record straight by recalling the patch.

Regarding the FPS gain from the Humus trick: depending on what card you use and your other specs, you can gain a lot more than 1 fps. It seems to work best for the X800 series.
Anonymous is offline   Reply With Quote
Old 11-Aug-2004, 00:16   #21
karlotta
pifft
 
Join Date: Jun 2003
Location: oregon
Posts: 1,274
Default

Quote:
Originally Posted by Fox5
I far prefer Doom 3 to Far Cry. Far Cry may be better in many different aspects, but I feel Doom 3 is more fun. It's more fun to shoot and run around in than Far Cry was. Far Cry is more ambitious; Doom 3 is more polished.
__________________
but but but.... dang
karlotta is offline   Reply With Quote
Old 11-Aug-2004, 03:36   #22
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

I would not compare Far Cry with Doom 3, because they are two different games. If I had to, I would say that shaders look much better in Far Cry. For shadows, lighting and special effects, Doom 3 owns it.

Overall, both are great games. I think HL2 is going to blow both games away. HL2 is a pure DX9.0 game with more intensive use of shaders than we have seen so far.
Anonymous is offline   Reply With Quote
Old 11-Aug-2004, 03:52   #23
Reverend
Naughty Boy!
 
Join Date: Jan 2002
Posts: 3,266
Default

I was looking for sites that plugged this interview and came across a comment in the Voodooextreme forums (horrors!) that had the following :

Quote:
Skinning on the CPU is done so that shadow volumes can be calculated on the CPU. Early in development of Doom3, Carmack stated that all shadow volume calculations were to be done on the CPU. The reason for this was to be fair to the people who didn't have vertex shader support on their systems (his code for shadow volumes on the CPU was faster than vertex shaders running on the CPU). Also, he didn't want to support both options. Had he realized how late Doom 3 was going to be, he may have standardized on the vertex shader model - in which case skinning could be done on the GPU as well (assuming he could fix up the tangent vectors for the normal maps on the GPU, too).

Perhaps the full vertex shader model will be available for engine licensees.
It's been 4 years since Carmack mentioned they decided to make Doom3, and while I have been following his .plan updates like a religion, I must've missed the above part (or, well... 4 years... I forget!). Is the above true, and can anyone point me to a link (or his specific .plan file, if he did disclose the above in a .plan file)?

Don't wanna bother John about this, so help by you guys is appreciated.
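
For readers unfamiliar with what "skinning on the CPU" means in the quoted comment, here is a generic linear-blend skinning sketch in Python. It is a textbook illustration, not id's code: the 3x4 row-major matrix layout, the (bone index, weight) influence list and the names transform and skin_vertex are all assumptions. The idea is simply that each vertex is moved by its bones on the CPU, so the final positions are available there for building shadow volumes.

```python
def transform(m, p):
    """Apply a 3x4 row-major affine matrix m to a 3D point p."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)


def skin_vertex(p, influences, bones):
    """Linear blend skinning: weighted sum of p transformed by each bone."""
    out = [0.0, 0.0, 0.0]
    for bone_index, weight in influences:
        q = transform(bones[bone_index], p)
        for k in range(3):
            out[k] += weight * q[k]
    return tuple(out)


# Two hypothetical bones: one at rest, one translated by 2 units along x.
IDENTITY = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0))
SHIFT_X = ((1, 0, 0, 2), (0, 1, 0, 0), (0, 0, 1, 0))

# A vertex influenced equally by both bones ends up halfway between them.
skinned = skin_vertex((1.0, 0.0, 0.0), [(0, 0.5), (1, 0.5)], [IDENTITY, SHIFT_X])
```

Here the vertex at x = 1 is pulled to x = 1 by the first bone and to x = 3 by the second, so with equal weights it lands at x = 2.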
__________________
Reverend
Dev Anon : Best game ever? Hmm... you mean other than anything from us? (2005)
Reverend is offline   Reply With Quote
Old 11-Aug-2004, 06:25   #24
Saem
Senior Member
 
Join Date: Feb 2002
Posts: 1,532
Default

Holding the ultra high poly models and the low poly models and other associated process data would likely be TOO much of a burden on the on board RAM of vid cards.
__________________
Regards.
Saem is offline   Reply With Quote
Old 11-Aug-2004, 10:30   #25
Anonymous
Senior Member
 
Join Date: May 1978
Posts: 3,263
Default

Reverend,

Even though I am an anonymous poster, I have been following these threads for a long time now and have been following the technical aspects of Doom III since their release to the public. I do remember a time when he did say that all the shadow volumes would be done on the CPU. It was a long while ago, and I don't remember exactly where I saw it (I want to say VE actually), but I do remember discussing this on some forums. There was a slight controversy over this because a lot of people were saying that doing shadows on the CPU would add too much overhead and hurt performance on higher-end cards that would be able to handle the full vertex model.

As for another anonymous poster (I am not the same person) who is calling John Carmack a fool: you, sir, need to get some facts straight before you go accusing people of favoring one vendor's hardware over another. Firstly, this is John Carmack; he doesn't care which hardware runs faster, he just codes things as best he can. He has more integrity and dignity than that. Secondly, he even said that it uses the ARB2 path, which he has been saying for a while he liked best because it made all cards render everything the same. John Carmack has stated time and time again that he does not like specialized rendering paths. The ARB2 path is an industry standard for OpenGL; it favors no hardware but the best hardware, simple as that. Please educate yourself before passing off your assumption as fact.
Anonymous is offline   Reply With Quote
