Carmack to use shadow map in the next game?

Discussion in 'Architecture and Products' started by 991060, Aug 19, 2004.

  1. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Right, and you are?

    Since I argued for every single thing I said about 3dmark03, it would seem that I do. And why are you not commenting on those things?

    I think you are somewhat confused here. The GPU is more than just a fillrate thingie. Doom3 doesn't use vertexshaders extensively, while, like pretty much any game when you turn the resolution up high enough, it will use all fillrate available.
    3dmark03 uses the fillrate (more than Doom3, since it has no optimizations for stencilshadows) as well as the vertexshader power, which relieves the CPU.
    While both may be fillrate-limited in high resolutions, Doom3 isn't using the vertexshaders to their full potential. So Doom3 doesn't stress the GPU as much as 3dmark03.
    And since I was discussing benchmark figures on my system in 640x480 without AA/AF, fillrate doesn't have all that much to do with it, it's mostly about whether or not the vertexshaders are up to the task of doing all the skinning and shadowvolume generation faster than the CPU, which they do.
    Have you ever even implemented either method? Can you tell me how they work, and why the CPU-based method like Doom3 would stress the GPU in 640x480 then? If not, I will just assume that you are the clueless one here.
    As for interactivity, it's a bit silly of you to mention it again, since we are talking about timedemos, which as we know aren't interactive either.

    I don't care about reviewers. YOU are missing the point. Doom3 is not playable on MY system.

    I already have a GPU-based shadowing system in my 3d engine, much like the one used in 3dmark03. It easily outperforms Doom3 on my system, like 3dmark03 does.
    Do you want to compare it to your 3d engine?

    I don't care about beta drivers. They could have any number of experimental features, hacks, cheats, bugs, whatever. I am talking about WHQL drivers only. And in the case of ATi, those haven't changed performance in 3dmark03 very much over the years. No more than any other software anyway.
    And since the FM guidelines pretty much rule out application-specific optimizations, and ATi still manages to get its drivers approved at every single release... what optimizations could they possibly have in the drivers that affect 3dmark03, but do not affect any other game?
    Since you seem to imply that ATi has such optimizations, I'd like you to explain in detail what they are.

    Pay more attention: I get 10 fps or less with 3 or more characters on screen at the same time. When there are no enemies, I can easily get 60 fps.

    As I said, changing the in-game detail or resolution doesn't have any effect whatsoever.

    I have 768 MB of memory, which I believe is twice the recommended amount for Doom3. So I doubt that this is the problem.

    Obviously since my card easily gets 60 fps when there are no characters around, and about 15-20 fps when there are 1-2 characters, and changing resolution or detail has no effect, I doubt that an even faster videocard would have any effect on performance whatsoever.

    I have already solved the problem: CPU skinning and shadowvolume extrusion is not a good idea on a CPU of < 2.5 GHz. And on any R3x0-based card or better, GPU-based code IS a good idea, as 3dmark03's bruteforce approach demonstrates.
    So the solution would be to get Carmack to write proper code.
    What part of the problem don't you understand?

    So let me get this straight... It is naive to think that there are 3dmark03-specific optimizations in the drivers, but it is not naive to think that NVIDIA gets all its performance gains in popular games from an optimized compiler?

    About Carmack's sub-standard code, to be exact.
     
  2. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    I added GPU-based shadows in less than a day in my engine. Create vertexbuffer with proper vertex format, write shader, done.

    As I have said many times before, if you skin an entire mesh just to do hit detection, you are doing something wrong.
    You can use a hierarchy of bounding boxes for the bones, and only skin the vertices in the boxes with a hit. This will most probably reduce the amount of skinning work by at least 10 times for the average frame.
    On top of that you no longer have to pump a shitload of geometry over the AGP bus every frame, and the GPU no longer has to wait for this data to arrive, as I already mentioned earlier, so it will both reduce work on the CPU and improve concurrency between GPU and CPU.
    So the benefits should be painfully obvious.
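    The bounding-box scheme described above can be sketched in a few lines. This is an illustrative standalone version, not Doom3 code: the names (`rayHitsBox`, `bonesToSkin`) are made up, and the per-bone boxes are assumed to have already been transformed into world space.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cmath>
    #include <vector>

    // Minimal sketch of the partial-skinning idea: test the ray against
    // per-bone bounding boxes first, and only skin the vertices
    // influenced by bones whose box was hit. Illustrative names only.

    struct Vec3 { float c[3]; };      // x, y, z
    struct AABB { Vec3 lo, hi; };

    // Standard slab test for a ray against an axis-aligned box.
    bool rayHitsBox(const Vec3& o, const Vec3& d, const AABB& b) {
        float tmin = 0.0f, tmax = 1e30f;
        for (int i = 0; i < 3; ++i) {
            if (std::fabs(d.c[i]) < 1e-8f) {
                // Ray parallel to this slab: origin must lie inside it.
                if (o.c[i] < b.lo.c[i] || o.c[i] > b.hi.c[i]) return false;
            } else {
                float t0 = (b.lo.c[i] - o.c[i]) / d.c[i];
                float t1 = (b.hi.c[i] - o.c[i]) / d.c[i];
                if (t0 > t1) std::swap(t0, t1);
                tmin = std::max(tmin, t0);
                tmax = std::min(tmax, t1);
                if (tmin > tmax) return false;
            }
        }
        return true;
    }

    // Return the indices of bones whose (world-space) box the ray
    // touches; only vertices weighted to these bones need skinning.
    std::vector<int> bonesToSkin(const Vec3& o, const Vec3& d,
                                 const std::vector<AABB>& boneBoxes) {
        std::vector<int> hit;
        for (int i = 0; i < (int)boneBoxes.size(); ++i)
            if (rayHitsBox(o, d, boneBoxes[i])) hit.push_back(i);
        return hit;
    }
    ```

    Only the vertices weighted to the returned bones then need full skinning for per-triangle hit tests, which is where the roughly tenfold reduction claimed above would come from.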

    He had 5 years, it takes about one day. What information would possibly be an excuse for not putting in an extra day's work on a 5 year project?

    JC himself said in the QuakeCon keynotes that he wasn't entirely happy with the decision, and that he did think the game was rather CPU-heavy.
    So apparently his decision wasn't entirely technical.
    Of course, the original decision was technical. At the start of the Doom3 project, the GF2/3 didn't have the vertexshading power, so there was no choice. Ironically enough, the CPU power wasn't there yet either.
    But as soon as Carmack received the first prototype of the R300, he should have known that the vertexshading power was now available. That was still at least two years before the completion of the Doom3 project... High-end CPUs were barely capable of providing the power, while for the R300 it was a breeze... I wonder what made him decide not to use that power of the R300. And 3dmark03 should have been an eye-opener, if he wasn't convinced yet. That was still long before the end of the Doom3 project.
    I don't get it. I hope Reverend included my question in the interview for John Carmack, so we will get the answer at last.
     
  3. tcchiu

    Newcomer

    Joined:
    Jun 3, 2004
    Messages:
    22
    Likes Received:
    0
    Location:
    Taiwan
    I wish the next generation of APIs will provide a query counter named "GPU stress" so people can stop arguing whether a benchmark or game _does_ stress the GPU. 8)

    Code:
    d3dDevice->CreateQuery( D3DQUERYTYPE_STRESS, &d3dQuery );
    
     
  4. Cat

    Cat
    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    74
    Likes Received:
    0
    Regarding software skinning:
    Perhaps id didn't want to limit skeleton sizes, as there is a fixed number of parameters available to vertex programs. An arbitrary number of bones is a good thing for the artists.

    ARB_vertex_program guarantees only 96 env and local parameters.
    This allows about 30 bones.

    The R300+ and NV30+ have 256, I believe.
    Assuming you have the non-standard (but somewhat common) 256 parameters available, using 3x4 matrices per bone gives you space for about 85 bones before other parameters for light position are used, while the quaternion + translation method gives you 128, with the cost of quat->matrix conversion. If you wanted to use per-program parameters, that doubles your register space, but mixing them is kind of ugly, and perhaps costly in state change. I think Game Programming Gems has an article on cheap quat skinning.

    You can split up your model if you're overflowing the register limit, but this is also somewhat ugly.
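    The quaternion + translation packing mentioned above can be sketched as follows. This is a generic unit-quaternion expansion, not any particular engine's code, and the struct names are made up: each bone ships as two float4 constants (quaternion and translation) instead of three matrix rows, which is how 256 constants stretch to 128 bones instead of ~85.

    ```cpp
    #include <cassert>
    #include <cmath>

    // Illustrative sketch: expand a unit quaternion plus translation
    // into the 3x4 bone matrix a skinning shader would otherwise read
    // directly from three constant registers.

    struct Quat   { float x, y, z, w; };   // unit quaternion
    struct Mat3x4 { float m[3][4]; };      // rotation rows, translation column

    Mat3x4 quatToMatrix(const Quat& q, float tx, float ty, float tz) {
        Mat3x4 r;
        float xx = q.x*q.x, yy = q.y*q.y, zz = q.z*q.z;
        float xy = q.x*q.y, xz = q.x*q.z, yz = q.y*q.z;
        float wx = q.w*q.x, wy = q.w*q.y, wz = q.w*q.z;
        // Standard unit-quaternion to rotation-matrix expansion.
        r.m[0][0] = 1 - 2*(yy + zz); r.m[0][1] = 2*(xy - wz); r.m[0][2] = 2*(xz + wy); r.m[0][3] = tx;
        r.m[1][0] = 2*(xy + wz); r.m[1][1] = 1 - 2*(xx + zz); r.m[1][2] = 2*(yz - wx); r.m[1][3] = ty;
        r.m[2][0] = 2*(xz - wy); r.m[2][1] = 2*(yz + wx); r.m[2][2] = 1 - 2*(xx + yy); r.m[2][3] = tz;
        return r;
    }
    ```

    The conversion cost quoted above is exactly this expansion, executed per vertex (or per bone) in the vertex program instead of on the CPU.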

    EDIT: The Battle of Proxycon skinned characters look like they have far fewer bones than most Doom 3 creatures.

    Here's a sample of Doom 3 creatures, listing the number of joints they have:

    archvile: 78 bones
    cacodemon: 53
    cherub: 62
    cyberdemon: 78
    hellknight: 110
    imp: 71
    pinkdemon: 72
    regular zombies: ~70
    fat zombie: 80
     
  5. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Likes Received:
    40
    Location:
    PT, EU
    Quite the contrary, HardOCP used actual playing runs and not the timedemo so their performance data is representative of gameplay.

    That's my point! While you may think there are plenty of people in your situation, there may not be enough people to support the decision to offload more and more to the GPU.

    It wasn't meant as a "proper" comparison. It was meant to show that JC is not personally out to get you and that we don't always get what we feel we deserve.

    But the CPU is still needed for other stuff. I encode movies all the time. If the industry moves to a point where all you need is a GPU and a crappy CPU, they had better also make sure the GPU can do everything the CPU does at the same speed. Until that happens, why should games be any different?

    I said games. ;)

    Hardware yes... software? Of course, it's an advantage when the game looks the same. You could make a port of FarCry to the Nintendo SNES but would you really want to? Isn't there a point when it starts to look like a different game? Having played FC on a PIII with a GF2 and now with my P4 + R9800 Pro, it's like night and day. And one thing I really didn't like was how FC would remove filler stuff from levels on less powerful machines (like the shark, some vegetation) without giving the option of turning them back on.

    But you seem to be the only one who has a problem with it (perhaps because most people have a GFX card which is balanced with their CPUs?). If your situation would really be that common why do you think D3 articles such as HardOCP's Hardware Guide, Anandtech's and Xbit Labs' CPU scaling articles show a balance between CPU and GPU? And if you uninstalled it why do you even care then?

    Isn't that a biased comment?

    So... in the end you'd get people with low end gfx cards and high end CPUs.

    Exactly. So games should be made in such a way to run super fast on fast hardware completely skipping the GFFX line because it's slow? That's almost as bad as TWIMTBP games automatically disabling features on ATi hardware because they don't have support for nVidia extensions.
     
  6. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Since there are apparently enough people to support a linux version, I don't see why this group wouldn't be supported, since it is much larger.

    Excuse me, but I do not consider an 1800+ CPU slow at all. While it may not be the fastest CPU around, it still has plenty of horsepower to do many tasks at acceptable speeds. It would be different if my CPU was slow (which I consider < 1 GHz or so), but it's not. It's actually above the minimum requirements of Doom3 too, and obviously fast enough to play any other game on the planet. So the question is not: "why should games be any different?", but "why should Doom3 be any different?".

    Exactly what part of the 3dmark03 renderer is not comparable to games then?

    I disagree. I didn't buy a Radeon to get the same graphics as the GF2 I upgraded from. But in the case of Doom3 apparently I should have upgraded my CPU instead, since the GF2 would give the same quality, and speed is not important when you're CPU-limited.
    Ironically enough, all other games have the exact opposite requirements. Some of the latest games don't even run on a GF2 at all. Yet the CPU is never a problem.

    I'm not the only one. I'm the only one who dares to speak up at this forum. But I know a few people who also have the problem. And there are most probably lots of people whom I don't know who have the problem as well.

    I wonder how they tested it. If you take the latest Athlon motherboard, with dual channel DDR400 or whatever, AGP8x etc, and stick an 1800+ in it, the PC is most probably at least 20-30% faster than my system, with SDR memory, AGP4x, ancient VIA chipset etc. So how representative would their tests be anyway? I have a 'real' 1800+ system, with all hardware from the same era. Just like a friend of mine, who has an 1800+ with DDR266, nForce chipset and a Radeon 9700, and also has the same problem.
    So what are they really testing? Just the CPU, or an actual system from the era of that CPU?

    No, it is a fact that NVIDIA has good performance with OpenGL, ATi has acceptable performance, and all others have lousy performance.

    No, because the other people with the fast GPUs still exist. Hello! I am right here!

    The FX series is only slow in ps2.0. They are very good at running DX8 shaders, which is what e.g. Valve makes the cards do. The same could be done in Doom3: just run the GF3/4 path on them, and they will perform excellently. There is really no other way around the problem; the FX just sucks compared to any other DX9 card, so they just HAVE to get a lighter workload, unless you want unplayable framerates.
     
  7. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Likes Received:
    40
    Location:
    PT, EU
    For the record, when I go shopping for a new computer and the lowest CPU I can find in stores is the 2.8C then yes, I think 1.8 is slow.

    And like I said, HardOCP shows the min req system with playable (~30 fps) framerates. If you don't get that with your slightly beefier CPU + mainstream card, something is wrong with your setup. Yes, it might just be DOOM that finally showed you something is wrong, just like only in D3, OC cards may start to show artefacts, etc. A friend of mine has a 2.4C + GF4mx 440 (like I told you, most people I know have these kinds of setups) and he can hardly play above 640. By your theory that the game is completely CPU limited, he should be able to get at least 1024, considering his CPU is almost a full 1 GHz above the minimum.

    Futuremark has to make different decisions from what game devs usually do. Those decisions come from the fact that the games have to run well on the largest user base. 3Dmark can only get 5fps for all we care since it's only there to test the video card.

    If they don't speak up to id Software, it will be hard for their "problem" to be addressed.

    So, let's see. As you said you need a dx9 card to really reap the benefits of GPU skinning, etc. GFFX cards suck at dx9 and need to have a lighter load. GF6800's and X800's are a rarity compared to any other cards today. So that means, if they were to offload parts of the renderer to the GPU only the users of R9500-9800 would benefit from it, and only those who have slow CPUs compared to their video card range.

    Throw in the fact that for GPU shadow extrusion you need manifold models creating a whole lot of extra geometry that just isn't needed when done on the CPU and suddenly it starts to make sense why JC went for a CPU implementation only.
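    To put a rough number on that extra geometry, assuming the common GPU-extrusion trick of inserting a degenerate quad along every edge of a closed 2-manifold triangle mesh (an assumption about the technique, not a description of id's pipeline):

    ```cpp
    #include <cassert>

    // Back-of-the-envelope cost of a GPU-extrudable shadow mesh.
    // In a closed 2-manifold triangle mesh each edge is shared by
    // exactly two faces, so E = 3F / 2. Adding a degenerate quad
    // (two triangles) per edge adds 2E = 3F triangles, i.e. the
    // shadow-casting copy is roughly four times the original.
    long shadowMeshFaces(long originalFaces) {
        long edges = 3 * originalFaces / 2;
        return originalFaces + 2 * edges;
    }
    ```

    So under this scheme a 1,000-triangle character balloons to about 4,000 triangles in its shadow-casting copy, which is the cost being weighed against re-uploading CPU-generated volumes every frame.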
     
  8. Cat

    Cat
    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    74
    Likes Received:
    0
  9. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    There is a huge difference between what's on sale and what is the minimum requirement for today's software. A 1.8 may be relatively slow, but it's not slow in the absolute sense. Besides, you can also buy Celerons or Semprons, and the cheapest models there are probably a lot closer to the performance of my PC than a P4 at 2.8 GHz.

    Nonsense, as I said, a friend of mine gets the exact same performance levels on his 1800+, even though everything else in his system is completely different from mine. Why are you not listening to me, and trying to find an excuse? My PC is fine, the other 1800+ is also fine. Doom3 is just coded like crap.

    Duh, a GF4MX obviously has different fillrate limits than a Radeon 9600Pro. First of all, the Radeon can render the entire lighting in 1 pass, while the GF4MX needs 3 or 4 passes per light.
    Secondly, the GF4MX has considerably less fillrate, because it has less pixel pipelines and slower memory.
    Thirdly, the GF4MX has no specific z/stencil optimizations and no doublesided stenciling, so it is a lot less efficient in rendering the stencilshadows as well.
    So obviously a GF4MX is going to have a lot more trouble running in high resolutions since the card is both slower and needs to do a lot more work.
    If you couldn't figure that out by yourself, why are you even bothering to discuss? Obviously you don't understand anything about how different renderpaths work.

    It's not to test the videocard. It's to estimate the performance of future games on the latest hardware. Futuremark's goal is to write game tests that use the rendering methods future games will use on that hardware. They communicate with IHVs and game developers to determine which rendering methods those will be. Obviously they were spot-on with stencil shadows in 3dmark03.

    If unqualified people such as yourself keep praising JC and telling them to just shut up and upgrade the CPU, I don't think many of them will even realize there is a problem.

    First of all, I said they were only slow at ps2.0. Obviously skinning and shadowing are done not in the pixelshader but in the vertexshader, so that already avoids the biggest performance problems. Secondly, I already said that most probably all FX cards except the 5200 will be fast enough.
    Thirdly, the group of R9500/9800 users is considerably large. Larger than the group of FX users anyway. And if we add the X800/6600/6800 users, the group will be even larger.
    So yes, many people will benefit.

    Show me one Doom3 model that is not 2-manifold.
    And shadowvolumes require a lot of extra geometry, regardless of whether it is generated on the CPU or the GPU.
    I would much prefer it if it didn't have to be uploaded every frame, and having the GPU wait for it.
    You seem clueless, how many shadowvolume engines have you written, 0?
     
  10. AndrewM

    Newcomer

    Joined:
    May 28, 2003
    Messages:
    219
    Likes Received:
    2
    Location:
    Brisbane, QLD, Australia
    Fully automatic shadow volume extrusion does have some problems though. It's not the panacea that you're making it out to be, Scali. Doom3 actually does semi-automatic extrusion. I assume you know the difference, since you keep going on about how crap JC is.
     
  11. Cat

    Cat
    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    74
    Likes Received:
    0
  12. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Be specific, or be silent.

    Yes, the actual projection to infinity is done in a shader. Which I find rather silly. If you do everything else on the CPU anyway, you might as well do the last step too, especially since the geometry is already being processed by the CPU anyway. I don't think there's any performance advantage whatsoever in doing this in the vertexshader.
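    For reference, the projection to infinity being discussed is a one-liner in homogeneous coordinates. This is a generic illustration of the technique, not Doom3's actual shader:

    ```cpp
    #include <cassert>

    // Illustrative sketch of extruding a shadow-volume vertex to
    // infinity away from a point light. In homogeneous coordinates
    // the extruded vertex is the direction from the light through
    // the vertex with w = 0, i.e. a point at infinity.

    struct Vec4 { float x, y, z, w; };

    // For a point light L (w = 1) and mesh vertex V (w = 1), this
    // yields (V.xyz - L.xyz, 0).
    Vec4 extrudeToInfinity(const Vec4& v, const Vec4& light) {
        return Vec4{ v.x * light.w - light.x * v.w,
                     v.y * light.w - light.y * v.w,
                     v.z * light.w - light.z * v.w,
                     0.0f };
    }
    ```

    A vertex program can compute this in a couple of instructions, which is why opinions in the thread differ on whether keeping just this step on the GPU is worthwhile.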
     
  13. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Both my implementation and 3dmark03's implementation work fine. And even JC himself mentioned that he perhaps should have done it on the GPU.
    Regardless of any advantages of CPU-based methods and disadvantages of GPU-based methods anyone can post here, there is ample empirical evidence that the GPU-based method is the best one in practice, for Doom3-like scenarios on R300 or better.

    So I would appreciate it if everyone just kept silent instead of talking theoretical nonsense.
    What is it with JC anyway? He's not god, he's human. Humans make mistakes.
    I am getting pretty tired of constantly being attacked for having legitimate criticism on Doom3. It is a fact that Doom3 doesn't run fast enough for heavy combat on my PC, and it is a fact that both my engine and the 3dmark03 engine do. Even if Doom3 runs fine on your PC, that still makes the Doom3 engine inferior to mine and 3dmark03's in terms of performance on my PC. Which is still a fact, whether you like it or not.
    And since JC designed that engine, it is his fault that the design is performing suboptimally on my PC. Now stop finding excuses for JC.
    Unless you are JC himself, and can tell me exactly why you chose to overuse CPU and underuse GPU, kindly stay out of this.
    Especially when you have no clue about the subject whatsoever, as certain people in this thread have demonstrated.
     
  14. Cat

    Cat
    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    74
    Likes Received:
    0
    What the fuck? You're just spewing nonsense now. Shadow extrusion to infinity done in a shader is a no-brainer, as doing simple per-vertex transforms on the GPU is known to be fast on all the cards that Doom 3 supports.

    You're still ignoring the fact that doing GPU skinning limits your bone counts. If you hadn't just dismissed this without a single comment, maybe you'd be taken seriously. You also need to skin on the CPU for proper per-triangle hit detection. Others have suggested transforming bone bounding boxes, and only fully skinning triangles inside those boxes that are actually hit by things.

    Your experience with shadow-volumes does not make you the final authority on them. First you invited discussion, then you decided that Carmack's code was 'poor,' and now you've proclaimed no one else is qualified to comment on this but Carmack, and your highness.

    Get over yourself.
     
  15. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Erm, what is your point here exactly?

    I didn't respond because you already said yourself that you can split up objects if the number of bones gets too high. Nothing more to be said about that. If you don't want to take me seriously, that says more about you than about me, really.

    I have mentioned that myself in this thread, and in others, I believe. Which means that per-triangle hit detection is no reason to skin the entire scene on the CPU. As I said before, perhaps 1/10th of all vertices would have to be skinned or so. Which no longer makes it a performance-issue.

    My point is clear: I have factual data, and I have heard all the theoretical nonsense from everyone already. Only Carmack can answer why he did what he did.
    I don't care about others who just tell me that my PC is broken or that I need to upgrade or whatever.
    They can't answer my questions. They just annoy me, so I wish they would shut up already. Everything has been said many times by now and I'm sick of it.
    And you should REALLY shut up, because your whole "your highness" and "get over yourself" stuff was completely uncalled for. You just totally misinterpret what I'm trying to say, and then start going for personal attacks.
     
  16. Cat

    Cat
    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    74
    Likes Received:
    0
    Splitting things up is fairly ugly, like I said. The CPU approach is elegant, simple, and works well enough for most people.

    The same can be said for the hit detection, the LUT for specular approximation, and not using assembly.

    When and where do you stop adding to the special-case workload?

    Ease of code maintenance is a good thing.

    Basically I see your complaint as 'Why doesn't id optimize for my special case?' But you've decided that Carmack's coding is poor, because of a hardly-representative 3dmark example, and your own limited experience.

    I'll repeat: you invited the discussion, and this is a discussion board, and have now declared that only Carmack can comment on the topic. You've declared everyone else's thoughts irrelevant, but your homebrewed shadow-volume code makes you an authority.
     
  17. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    We have now fully understood that this is your opinion and you have put forth your arguments.
    Do you have anything further you'd like to add?
     
  18. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    My case isn't very special at all. I have heard that too often, but nobody has produced any figures. I find it very hard to believe that most people own a 2.5+ GHz PC today. If you don't have figures to prove one way or the other, shut up already.

    Again, if you can't explain why 3dmark03 would not be representative, shut up already.
    It's just your opinion, which is apparently based on lack of knowledge on the subject.

    I never said I was an authority, and I never said my code was homebrewed. You just misinterpret and assume all over the place, then apparently get pissed off, and write a post that pisses me off.
    But it is pretty clear to me that a lot of the people who have responded to this thread have considerably less knowledge on the subject than I do.
    Then again, it's my profession.

    I have given my experiences, my opinions, my facts. Others have done that too. I never said other's thoughts were irrelevant. I said that I am tired of hearing the same stuff. Especially since only Carmack knows why he did what he did.

    So stop harassing me and stop trying to discuss something that you cannot discuss.
     
  19. Scali

    Regular

    Joined:
    Nov 19, 2003
    Messages:
    2,127
    Likes Received:
    0
    Yes, people who are not qualified, should stay out of this.
     
  20. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    So pointing out that you neglect that adding the code you would like to see in DOOM3 would require additional reprogramming of the game, thus adding a significant amount of work, qualifies as "harassment"?

    Why post here in the first place if you are not prepared to consider the input and position of others? Did you simply want an audience for your whining?
     