|
|
#1 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,947
|
"Following the release of Doom3 it's seen many a discussion on topics such as gameplay, graphics, benchmarking and tweaking. In this short interview we talk to id's John Carmack about various elements of the title that are pertinent to benchmarking, as well as some of the queries that have arisen following some of the choices made within the game's shader code."
What's the situation with regards to depth bounds test implementation in the game? Doesn't appear to have any effect on a NV35 or NV40 using the cvar.
Read the full interview here. |
|
|
|
|
|
#2 |
|
Naughty Boy!
|
Very interesting, Dave. It explains a little better what Humus actually changed.
|
|
|
|
|
|
#3 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
What I would find interesting would be a benchmark like:
normal path vs Humus optimization vs specular disabled. |
|
|
|
|
|
#4 |
|
Registered
Join Date: Aug 2004
Posts: 9
|
Very interesting read, and it pretty much confirms what most of us figured: it was simply a design choice based on the hardware available when the engine was designed, made to give the best "overall" efficiency across the widest range of cards.
As the Humus code showed, changing from a lookup does benefit later-generation ATI hardware since it can handle the math more efficiently, thus giving it a bit more headroom. That either nets you a few extra frames and/or lets you use the ATI AF implementation, which gives slightly better overall image quality without any significant cost. In the case of Nvidia's latest hardware, however, which seems to be more efficient at handling table lookups, switching to the Humus code seems to have the reverse effect on performance. If I were to make a guess, I'd say specular disabled would show the best performance, with Humus and the original code flip-flopping based on the hardware design and generation. A bigger question I would ask, based on this response, is this: is it reasonable to assume that, with the proper work, the table lookup could be replaced in a way that would net significant gains on all current-generation video hardware? Or are table lookups going to remain the de facto best choice for all Nvidia and some older-generation ATI cards simply because of their design, leaving math alternatives really only benefiting R3x0 and later ATI cards? In other words, how much further can we go without breaking image quality to the point where it isn't worth it, or finding the gains not significant enough to warrant a change? |
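For anyone trying to picture the trade-off, here is a rough Python sketch of the two approaches: a precomputed table sampled with linear filtering versus evaluating the function directly. It is only an illustration. The exponent of 16, the 256-entry table, and all function names are assumptions made for the example, not values or code taken from the game's shaders.

```python
def clamp01(x):
    """Clamp a value to the [0, 1] range, like a saturate in a shader."""
    return min(max(x, 0.0), 1.0)


def specular_math(x, exponent=16.0):
    """Direct math version: a plain power function (exponent is an assumption)."""
    return clamp01(x) ** exponent


def build_table(func, size=256):
    """Precompute func at evenly spaced points in [0, 1] (hypothetical table size)."""
    return [func(i / (size - 1)) for i in range(size)]


def lookup_linear(table, x):
    """Table version: linearly interpolate between the two nearest entries."""
    pos = clamp01(x) * (len(table) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(table) - 1)
    frac = pos - lo
    return table[lo] * (1.0 - frac) + table[hi] * frac


table = build_table(specular_math)
samples = [i / 1000 for i in range(1001)]
max_error = max(abs(lookup_linear(table, x) - specular_math(x)) for x in samples)
```

In this toy setup the two versions agree closely, so the choice between them comes down to which one a given chip executes faster, which is the point being argued in the thread.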
|
|
|
|
|
#5 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
Could this become a problem in future, with games favouring texture access over straight math in shaders?
The way it's meant to be programmed |
|
|
|
|
|
#6 | |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
Quote:
Obviously, if it's a choice between a 4,000 step multipassed shader or a texture lookup, I'd guess that engine designers would take the texture lookup option, at least for the reasonably foreseeable future |
|
|
|
|
|
|
#7 |
|
Member
Join Date: Jan 2004
Location: Norway
Posts: 406
|
He actually has a cvar to help calculate driver overhead. That's pretty cool.
|
|
|
|
|
|
#8 | ||
|
Senior Member
Join Date: Feb 2002
Posts: 2,544
|
Quote:
The reason why ATI's GPUs see a big speed-up is that they do maximum anisotropic filtering on the texture lookup, i.e. instead of taking 1 (bilinear) sample they take 8 for every tex lookup, hence a big drop in performance when 8xAF is applied. The patched shader remedies this by not doing a tex lookup at all. AF on tex lookups is of course a good thing when you're actually using a real graphical texture. However, in this case the texture lookup is used as a discrete function ( f[x]=>y ), and one can argue that AF on 1D textures makes no sense at all (at least when doing magnification). If you want interpolated values, bilinear would do just as well as anisotropic filtering (again, only in the case of 1D textures). I don't know if the application of AF in ATI hardware is a consequence of their texture cache design, or if it is simply a driver "bug". Considering they use an adaptive AF approach, I would think that they could indicate the degree of anisotropicity (is that a real word?) and just set it to 1. In that case the next set of Catalyst drivers is going to give a big boost to performance, especially in high quality. Cheers Gubbi |
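To put rough numbers on the argument above, here is a back-of-the-envelope Python sketch of how texel reads scale with the AF degree. It assumes each AF tap costs one full filtered sample and that a linearly filtered 1D lookup reads 2 texels. Both are simplifications made for illustration, and real hardware with texture caches and adaptive AF will behave differently. The function name and parameters are hypothetical.

```python
def texel_fetches(lookups, af_degree=1, texels_per_sample=2):
    """Rough count of texel reads for a number of table lookups.

    af_degree: filtered samples taken per lookup (1 = no AF).
    texels_per_sample: texels read per filtered sample (assumed 2 for
    linear filtering on a 1D table).
    """
    if lookups < 0 or af_degree < 1 or texels_per_sample < 1:
        raise ValueError("counts must be positive")
    return lookups * af_degree * texels_per_sample


no_af = texel_fetches(1_000_000, af_degree=1)
forced_8x = texel_fetches(1_000_000, af_degree=8)
ratio = forced_8x / no_af
```

Under these assumptions, forcing 8xAF on the lookup texture means eight times the reads for the same result, which is why setting the degree to 1 for that texture would remove the penalty.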
||
|
|
|
|
|
#9 |
|
Junior Member
Join Date: Aug 2003
Location: PR
Posts: 74
|
I know this might be controversial, but I believe he should have been asked if he recognizes the shader replacement as a legit one....
He indirectly acknowledges it, but a straight answer would silence some voices out there, especially since many believe that the fix is actually better than the original line of code in the case of the ARB2 path (I know it smells like flame fuel, but that's not the intended purpose). I'll put it another way: will Beyond3D use the "Humus Hack"?
__________________
Are you wise enough to admit your own ignorance? |
|
|
|
|
|
#10 | |
|
Member
|
Quote:
When AF is forced through the CP, does the driver know that some of the textures shouldn't have AF applied? If not, shouldn't the performance on Nvidia's hardware decrease when AF is forced, just like ATI's? (I don't know whether it does or doesn't.) |
|
|
|
|
|
|
#11 | |||
|
Registered
|
Quote:
My point about shader power increasing a lot faster than bandwidth is a fairly direct quote from one or other developer interview - I think Tim Sweeney, but I couldn't swear to it. I know that things are pretty much in the balance at the moment, and complex math operations are still a lot slower than texture access (as far as instruction cost goes), but something that was raised in the Humus' tweak thread might be relevant here: fetching textures has a large latency compared to math ops, and can cause pipeline stalls. You're absolutely right about the anisotropic filtering point though, I can see why that would cause a lot of problems, especially if you have to fetch 8 samples into the texture cache. Obviously, the higher the degree of anisotropy (I think that's the right word) applied, the more this will cause problems. I'm guessing that using app-controlled AF avoids this situation to an extent, which is why the performance boost isn't that big when using app AF. It's still tangible though on an R420 from what I've seen on my X800 Pro. That leads me to believe that the MAD_SAT and POW operations are less "expensive" overall than the texture lookup. This, as you've pointed out, only gets worse when AF is forced in the driver. Of course, I could be entirely wrong, as I'm a graphics technology newb |
|||
|
|
|
|
|
#12 | |
|
Member
Join Date: Feb 2002
Location: /home/rb/Switzerland
Posts: 402
|
Quote:
2) Performance does decrease quite considerably on NV if you force AF through the control panel (10-30% for GFFX). Humus' / Demirug's tweaks are mostly counter-productive for NV as the driver's shader replacement gets negated. 93, -Sascha.rb
__________________
Sascha "nggalai" Erni, -.rb www.3dcenter.org | www.cgworld.de | www.nggalai.com "Size is not as important as fill rate or at least thats what I have been told. 8)" -jb, August 2002 |
|
|
|
|
|
|
#13 | |||
|
Senior Member
Join Date: Jul 2002
Location: BelleVue Sanatorium, Billary, NY. Patient privileges: Internet access
Posts: 2,694
|
Quote:
Quote:
Quote:
Edit: Hard to say for sure, of course, but this sentence, "Because all the artwork and levels had been done with that particular function, we thought it best to mimic it exactly when we got fragment program capable hardware," sounds very much as if he's saying that the artwork and levels for D3 were all finished prior to them laying hands on an R300--which is a long time ago, but certainly fits perfectly with his next sentence, as it was nearly two years after R300 shipped to D3. Anyone else with a different take on that? |
|||
|
|
|
|
|
#14 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
AAAAAAAAAAAAAAh why didn't you ask about 3Dc, and which rendering path applies to ATI cards and gets better performance? =[
|
|
|
|
|
|
#15 | |
|
Senior Member
Join Date: Dec 2002
Location: Under a Crushing Burden
Posts: 4,290
|
Quote:
Is all I can say to that. Walt, I do not think they had the artwork done; I think his point was to make it look the same on all cards. No, he may be saying that if he had realized how long it would take to do the artwork and levels, he might have tried a different strategy on newer cards.
__________________
You bought horse armor didn't you? |
|
|
|
|
|
|
#16 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
I think all this sounds strange. I mean, by choosing the option he did, Carmack knew it would lower ATI's performance, and then he said nothing about it... talk about cheating in Nvidia's favor, or whatever you would like to call it.
Many of us took for granted that the latest id Software engine would come optimized for all cards, but I guess that is not the case. Also, claiming time limits as the reason not to fix this for ATI cards is just BS. Doom 3 must be the most overrated game in a long time. Gamespot has a decent review, but 8.5 is still too much; more like 7.5. Most Doom 3 review websites are just a good laugh. Comparing Doom 3 to Far Cry is like comparing a Skoda to a BMW. (The Hell level was nice, but that's it.) id haven't been honest with us, and this makes me wonder what more is to come. Talk about hyping up Nvidia cards, and then the hard truth comes straight back in your face. I hope game developers choose other engines for their games, since this engine is obviously not optimized for 50% of the computer gaming market. |
|
|
|
|
|
#17 | |
|
Junior Member
Join Date: Aug 2003
Location: PR
Posts: 74
|
Quote:
__________________
Are you wise enough to admit your own ignorance? |
|
|
|
|
|
|
#18 | |
|
Senior Member
Join Date: Mar 2002
Posts: 3,674
|
Quote:
|
|
|
|
|
|
|
#19 | |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
lol, who had to recall a 1.2 patch (with ATI optimizations) five months after the game was released??? It's not id Software.
It has been shown that the Humus trick brings around 1 fps... oh dear, big deal. Quote:
|
|
|
|
|
|
|
#20 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
I'm not defending Crytek for releasing the non-working 1.2 patch regarding ATI hardware. They messed up, but set the record straight by recalling the patch.
Regarding the FPS gain from the Humus trick: depending on what card you use and your other specs, you can gain a lot more than 1 fps. It seems to work best for the X800 series. |
|
|
|
|
|
#21 | |
|
pifft
Join Date: Jun 2003
Location: oregon
Posts: 1,274
|
Quote:
__________________
but but but.... dang |
|
|
|
|
|
|
#22 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
I would not compare Far Cry with Doom 3, because those are two different games. If I have to, I would say that shaders look much better in Far Cry. In shadows, lighting and special effects, Doom 3 owns it.
Overall, those are two great games. I think HL2 is going to blow both games away. HL2 is a pure DX9.0 game with intensive use of shaders, the likes of which we haven't seen so far. |
|
|
|
|
|
#23 | |
|
Naughty Boy!
Join Date: Jan 2002
Posts: 3,266
|
I was looking for sites that plugged this interview and came across a comment in the Voodooextreme forums (horrors!) that had the following :
Quote:
Don't wanna bother John about this, so help from you guys is appreciated.
__________________
Reverend Dev Anon : Best game ever? Hmm... you mean other than anything from us? (2005) |
|
|
|
|
|
|
#24 |
|
Senior Member
|
Holding the ultra high poly models and the low poly models and other associated process data would likely be TOO much of a burden on the on board RAM of vid cards.
__________________
Regards. |
|
|
|
|
|
#25 |
|
Senior Member
Join Date: May 1978
Posts: 3,263
|
Reverend,
Even though I am an anonymous poster, I have been following these threads for a long time now and have been following the technical aspects of Doom III since their release to the public. I do remember a time when he did say that all the shadow volumes would be done on the CPU. It was a long while ago, and I don't remember exactly where I saw it (I want to say VE actually), but I do remember discussing this on some forums. There was a slight controversy over this because a lot of people were saying that doing shadows on the CPU would add too much overhead and hurt performance on higher-end cards that would be able to handle the full vertex model. As for another anonymous poster (I am not the same person) who is calling John Carmack a fool: you, sir, need to get some facts straight before you go accusing people of favoring one piece of hardware over another. Firstly, this is John Carmack; he doesn't care which hardware runs faster, he just codes things as best he can. He has more integrity and dignity than that. Secondly, he even said that it uses the ARB2 path, which he has been saying for a while that he liked best because it made all cards render everything the same. John Carmack has stated time and time again that he does not like specialized rendering paths. The ARB2 path is an industry standard for OpenGL; it favors no hardware but the best hardware, simple as that. Please educate yourself before passing off your assumption as fact. |
|
|
|