Optimising the Renderer using SIMD by id Software

In the shadow volumes PDF, why does the number of clock cycles increase when moving from the 130nm to the 90nm P4s :?: I was just skimming, so forgive me if the answer is "right in front of me". :oops: Doesn't that mean it's slower?
 
Prescott had a longer pipeline than Northwood. I suspect that's what they're referring to. It's not necessarily slower if the clock speed is higher, and there may be efficiency advantages elsewhere; they say as much in the article.
 
Ah righto.. I forgot about the pipeline. Thanks :) Slightly OT, but can these clock cycle numbers then be used to accurately determine how much faster (GHz) the 90nm chip would have to be in order to beat the 130nm chip? (basically, complete the required clock cycles in a fixed time frame). The reason I ask is because I'm getting something around... a 3.2GHz 90nm needed to do the same stuff as a 2.6GHz-ish 130nm. I'm sure I'm missing a big factor here. Also, where's the AMD edition of these docs? :p
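For what it's worth, the back-of-the-envelope calculation can be sketched like this. It assumes the time for one routine is simply cycles divided by clock frequency, which ignores everything outside that routine. The function name and the cycle counts are made up for illustration and are not taken from the papers.

```python
def break_even_clock_ghz(base_clock_ghz, base_cycles, new_cycles):
    """Clock speed the new chip needs to finish one routine in the same time.

    Assumes time = cycles / frequency, so equal time means
    f_new = f_base * new_cycles / base_cycles.
    """
    return base_clock_ghz * new_cycles / base_cycles


# Hypothetical cycle counts for a single routine (NOT figures from the papers):
# 1000 cycles on the 130nm chip, 1230 cycles on the 90nm chip.
needed = break_even_clock_ghz(2.6, base_cycles=1000, new_cycles=1230)
print(f"90nm chip needs roughly {needed:.1f} GHz to match a 2.6 GHz 130nm chip")
```

This only holds for the one routine being measured, so it says little about a whole game.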

On another note, it would be interesting to compare GPU skinning against CPU skinning in the same paper. Edit: ack.. more links in the references sections for each paper, and working links too! It's going to be a busy day! :LOL:
 
I like the last few paragraphs, the "Intel® Pentium® 4 Processor on 90nm Technology" vs. the "Intel® Pentium® 4 Processor on 130nm Technology". Heh heh. I think that helps me comprehend what processor it is and who owns it.

That is sure some incredible programming. First to figure out all of the equations and ways to simplify them, and then to write optimized assembly for it. Yikes!
 
Very interesting stuff, even for a layperson such as myself! Just for fun, I looked at one of the docs and I could follow most of the stuff and get an understanding of what was going on. I assume tho that they pretty much require full source code to take advantage of, and that a mod couldn't do it.

Is there a full tally of how big a performance difference you get in the end? Some of the steps in the skinning doc are claimed to run over 7x faster, but if they do relatively little work compared to other steps (particularly compared to those that are completely untouched and not optimized), the total gain won't be near that much of course...
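That intuition is basically Amdahl's law. A minimal sketch, assuming the sped-up steps account for some fraction of the total frame time; the 20% share below is a made-up number, not something from the docs:

```python
def overall_speedup(fraction_optimized, local_speedup):
    """Amdahl's law: total speedup when only part of the work gets faster.

    fraction_optimized: share of the original run time spent in the optimized steps (0..1)
    local_speedup: how much faster those steps run (e.g. 7.0 for "7x")
    """
    remaining = 1.0 - fraction_optimized              # untouched work takes the same time
    accelerated = fraction_optimized / local_speedup  # optimized work shrinks
    return 1.0 / (remaining + accelerated)


# Hypothetical: the 7x-faster steps were 20% of the frame time to begin with.
print(f"overall speedup: about {overall_speedup(0.20, 7.0):.2f}x")
```

With those assumed numbers the whole frame gets only around 1.2x faster, even though the optimized steps themselves run 7x faster.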

Great thread, thanks!
 
Guden Oden said:
Is there a full tally of how big a performance difference you get in the end? Some of the steps in the skinning doc are claimed to run over 7x faster, but if they do relatively little work compared to other steps (particularly compared to those that are completely untouched and not optimized), the total gain won't be near that much of course...

It might be worth considering constructing a test map for use in both Doom 3 and Quake 4 and comparing performance directly. The paper mentioned that the optimizations were not present in Doom 3, but that they were implemented in time for Quake 4. :)
 
Alstrong said:
Slightly OT, but can these clock cycle numbers then be used to accurately determine how much faster (GHz) the 90nm chip would have to be in order to beat the 130nm chip? (basically, complete the required clock cycles in a fixed time frame). The reason I ask is because I'm getting something around... a 3.2GHz 90nm needed to do the same stuff as a 2.6GHz-ish 130nm. I'm sure I'm missing a big factor here.

Not really, no. These are just some of the subsystems of the engine, and a lot of other parts (physics, AI, game logic) aren't present. In fact, DOOM 3 was one of the first games to show the Prescott ahead of similarly clocked Northwoods (mostly because of the larger L2 cache). OTOH I was really surprised the Northwood proved so much faster in these tests.

On another note, it would be interesting to compare GPU skinning against CPU skinning in the same paper. Edit: ack.. more links in the references sections for each paper, and working links too! It's going to be a busy day! :LOL:

You can toggle between the old (CPU) and new (GPU) skinning in Q4 with the "r_usenewskinning" CVar.

EDIT: heh, this thread got me my fifth green square. Thanks guys. ;)
 