Quakecon 2006 - JC keynote

I actually thought about doing it, but I figured that by the time I finished, someone else would have done it, so why waste my time...
 
In last year's speech he talked about console development and hardware evolution. In this year's speech he talks about software development and its evolution.
 
23:39 "...and some graphics vendors make the really really bad decision in my eyes to sometimes behind peoples backs say oh we've got an extra processor free here let's offload some of our vertex processing. Bad graphics vendors that's not a good thing to do from a developer's standpoint because it adds so much more uncertainty and different levels of bugs for us and variability on that..."

So is that both ATI and NVidia?

Jawed
 
23:39 "...and some graphics vendors make the really really bad decision in my eyes to sometimes behind peoples backs say oh we've got an extra processor free here let's offload some of our vertex processing. Bad graphics vendors that's not a good thing to do from a developer's standpoint because it adds so much more uncertainty and different levels of bugs for us and variability on that..."

So is that both ATI and NVidia?

Jawed

I'd say yes.

I'm sure an interview with someone from ATI mentioned that some titles (Jedi Academy was one) offloaded vertex work to the CPU (if it was a performance win).

Wavey Dave also alluded to NVidia doing the same thing (his tests seem to point him in that direction).
 
Hmm, 3dMk05 + Prime consumes less power on NVidia (compared against PS fillrate testing + Prime), whereas ATI is the other way round.

3DMk05 is generally CPU bound with these high-end cards, isn't it?

Unfortunately there aren't any 3DMk05 scores to help fill in the puzzle.

Jawed
 
As far as I know, 3DMark05 is vertex-setup bound in game tests 1 and 2. You can run the first two tests at 800x600 and at 1600x1200 with 4xAA/16xAF and get virtually the same performance. Disable a few vertex units (it usually takes 2-3 on a 7900GT SLI setup; one won't be enough) and you'll start noticing diminishing results in those tests. Game Test 3 is a little different and seems to scale better with resolution/pixel units. It's going to vary, though, depending on the amount of rendering power available, of course.
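
For what it's worth, here's a rough sketch of the kind of check I mean: compare results at a low and a high resolution, and if they're nearly identical the test is almost certainly bound by vertex/setup work (or the CPU) rather than pixel work. The run_benchmark() hook is hypothetical; wire it up to however you actually collect an average FPS.

```python
# Sketch: infer whether a test is pixel-bound or vertex/setup/CPU-bound by
# comparing results at two resolutions. run_benchmark() is a placeholder
# for however you collect an average FPS (benchmark logs, FRAPS, etc.).

def run_benchmark(width, height, aa=4, af=16):
    """Hypothetical hook: run the test at the given settings, return avg FPS."""
    raise NotImplementedError("wire this up to your own benchmark runner")

def classify_bottleneck(low=(800, 600), high=(1600, 1200), tolerance=0.05):
    fps_low = run_benchmark(*low)
    fps_high = run_benchmark(*high)
    drop = (fps_low - fps_high) / fps_low
    if drop < tolerance:
        return "vertex/setup or CPU bound (resolution barely matters)"
    return f"pixel/fill bound (performance drops {drop:.0%} at the higher res)"

if __name__ == "__main__":
    print(classify_bottleneck())
```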
 
Interesting.

How about 3DMark06?
 
I wonder if comparing power draw between the first multicore NV driver to offer performance improvements in Quake 4 and the driver before it would shed any light on this. Maybe also between pre- and post-multicore-patch versions of Q4 itself (IIRC, Q4 was explicitly patched for multicore shortly after NV's driver).

Or maybe Tridam would be kind enough to provide the 3DM05 scores both with and without P95 running? Maybe even throw in his educated speculation.

Or maybe I'm not reading 3DM05's vertex setup predilection correctly. The way I see it, if NV offloads that to a multicore CPU without taking into account the CPU's load, then P95 hogging one or both cores may compromise any vertex work NV's drivers send to the CPU, therefore leaving the rest of the GPU (the fragment shaders [and I guess ROPs, though I'm not sure how power hungry they are]) hanging, thus leading to the reduced power draw.
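
Purely to illustrate the check I suspect is missing (and this has nothing to do with how NV's driver actually decides anything), a driver that wanted to be polite about it would look at per-core load before claiming a core. A toy sketch, using psutil, with submit_to_cpu / submit_to_gpu as made-up stand-ins:

```python
# Illustration only: a load-aware decision for offloading vertex work,
# i.e. the check that (per my theory) isn't happening. psutil samples
# per-core load; submit_to_cpu/submit_to_gpu are hypothetical stand-ins.
import psutil

CPU_BUSY_THRESHOLD = 50.0  # percent; above this, leave the cores alone

def dispatch_vertex_batch(batch, submit_to_cpu, submit_to_gpu):
    # Per-core load over a short sample window.
    per_core = psutil.cpu_percent(interval=0.1, percpu=True)
    idle_cores = [i for i, load in enumerate(per_core) if load < CPU_BUSY_THRESHOLD]
    if idle_cores:
        # A core really is free (not hogged by Prime95 etc.) -- offload.
        submit_to_cpu(batch, core=idle_cores[0])
    else:
        # CPU is saturated: keep the work on the GPU's vertex units.
        submit_to_gpu(batch)
```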
 
That theory sounds reasonable to me.

Which seems to imply that only NVidia is pushing some vertex shading onto the CPU. At least in 3DMk05.

Jawed
 
I'm not entirely convinced that's the case. Assuming they are doing so, I haven't seen any large performance disparities between multi-core and single-core CPUs in 3DMark05. When troubleshooting over at the nzone forums I have seen a lot of SLI/multi-GPU scores, and the performance is usually consistent between single- and dual-core CPUs at the same clock speeds. Of course I could be mistaken, and unfortunately I don't have a dual-core CPU to test this on. This happening in games might be a different story; I know NVidia's dual-core drivers have increased performance over single-core CPUs in a lot of games in the past. But as to why, your guess is as good as mine.
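
One way to poke at this without a second machine would be to pin the benchmark to a single core and compare against an unrestricted run; if the driver really leans on a spare core, the pinned run should suffer. A rough sketch with psutil (the process name is just an example, and pinning someone else's process may need admin rights):

```python
# Sketch: restrict a running benchmark to one core so any driver work
# offloaded to a "spare" core has nowhere to go, then rerun unrestricted
# and compare scores. The process name is only an example.
import psutil

def pin_process_to_core(name_fragment, core=0):
    for proc in psutil.process_iter(["name"]):
        pname = proc.info["name"] or ""
        if name_fragment.lower() in pname.lower():
            proc.cpu_affinity([core])  # confine it to a single core
            print(f"Pinned {pname} (pid {proc.pid}) to core {core}")

if __name__ == "__main__":
    pin_process_to_core("3DMark")  # then compare against an unpinned run
```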
 
That would be a strike against my vertex theory, Chris. :) Trouble is, I have no idea what I'm talking about, and that's probably proving pretty funny to people reading this with more than the "just enough knowledge to make them dangerous" that I'm packing. Nevertheless!

How many CPU threads do GPU drivers use? If NV offloads some vertex work to the CPU, is that limited to one thread, one core? What's the sound of one neuron firing? I'm thinking of why a faster CPU improves 3DM scores (read: reduces the CPU limitation) but an additional core does nothing, as you say.
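
I don't know the answer, but counting is easy enough. Something like this (psutil again, and "quake4.exe" is just a placeholder name) would show how many threads a game process ends up with once the driver has spun up whatever it spins up:

```python
# Sketch: count the threads in a running game process, before and after
# the renderer starts, to see how many the driver adds on top.
# "quake4.exe" is just a placeholder name.
import psutil

def thread_count(process_name="quake4.exe"):
    for proc in psutil.process_iter(["name"]):
        if (proc.info["name"] or "").lower() == process_name.lower():
            return proc.num_threads()
    return None

if __name__ == "__main__":
    print("threads:", thread_count())
```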

Or maybe ATI's new MC simply draws a lot of power and isn't used much in HW.fr's synthetic PS load test, and that explains R5x0's power draw. Or maybe it's the separated texture units, with which the MC is still involved, IIRC.

Cripes, I'm tired, and it hurts to see an ailing Agassi losing to this Becker newb (wait a sec...). I'll look through the '05 reviews, articles, and whitepapers later.
 
Might be just me, but after watching the presentation from start to finish, I found this year's keynote to be *slightly* less rapidly technically linguistic than his previous outings. You should give it a try :).
It's currently a boring time for the hardware industry, trapped in that we're-all-just-waiting-for-Vista no man's land.

But just you wait till all these shackles (Vista, hardware, id's WIP game) are off!
 
Neeyik, would you mind posting single-card results, or the 7950GX2 with multi-GPU rendering disabled? Dual-core CPUs are optimised for multi-GPU systems. Thanks.
 
BTW, here are some of the Q&As I had with John regarding some of the things he'd said during his keynote:

John Carmack said:
Reverend said:
1) SMP support in Doom3
You said that SMP support for the game/engine isn't rock solid and would lead to undesirable gaming experience issues. Can you tell me what exactly was the problem? And what did you mean by (examples appreciated) the problems in gaming experiences?
We had problems with driver changes breaking things, and we never expended the effort to make all the various warnings and debug tools work in SMP mode. I wouldn't doubt that there are various options that are fully supported by the menu system that could cause a code path that isn't SMP safe to be executed, leading to a random, hard to trace crash.

John Carmack said:
Reverend said:
3) GPU vs CPU
Tim "Mr CPU" Sweeney has on numerous occasions complained to me about the extra work that needs to be done when it comes to taking advantage of GPUs and that Microsoft OSs (and therefore its API) inherently have (and simply can't avoid due to its evolutionary status) useless overheads. We've known for some time now that Tim wishes for CPU/software-based rendering to make a big return but reading your keynote suggests that you appear to have the opposite view, that you *seem* to wish _something_ can be done where much of the rendering process will fall squarely on the shoulders of the GPU.
A fully programmable CPU system with equal performance would obviously be better, but it isn't completely clear that is going to be possible. The vertex / fragment paradigm is probably the most successful multiprocessing architecture ever. Architectures with large numbers of general purpose processors have historically delivered very disappointing performance on most problems, and a LOT of research has been done. Of course, CPU systems clock a lot higher, so that may be enough to overcome the lack of dedicated control logic.

The jury is still out on this, I could see it going either way in the mid-term. In the long term, a general system will be able to do things "fast enough" that specialized hardware, even if more efficient per gate or per dollar, won't be justifiable in most systems.

I have an architecture that I would like to implement on a high performance CPU based system that could be dramatically better than the existing vertex / fragment based systems, but it is all speculative enough that I wouldn't encourage a vendor to pursue that route if they couldn't also be competitive while emulating conventional rendering solutions.

John Carmack
There are other bits from John but they're not that interesting to me. Let me know if you guys have any questions (re his keynote) and I'll pass them on. Just make sure they're interesting enough :) :devilish:
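
Going back to John's SMP answer above: this is my own toy illustration, not anything from id's code, but the "code path that isn't SMP safe" failure mode he describes usually boils down to an unsynchronized check-then-act on shared state, which is harmless single-threaded and a random, hard-to-reproduce bug once a second thread can hit it. Roughly:

```python
# Toy illustration (not id's code): a lazily-initialized debug/warning
# buffer that is fine single-threaded but racy once a renderer thread and
# a game thread can both reach it.
import threading

_warning_buffer = None
_buffer_lock = threading.Lock()

def log_warning_unsafe(msg):
    global _warning_buffer
    if _warning_buffer is None:     # check...
        _warning_buffer = []        # ...then act: two threads can both get
                                    # here and one thread's messages vanish
    _warning_buffer.append(msg)

def log_warning_safe(msg):
    global _warning_buffer
    with _buffer_lock:              # the fix: serialize the check-then-act
        if _warning_buffer is None:
            _warning_buffer = []
        _warning_buffer.append(msg)
```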
 