Ryan Shrout: Focusing back on the hardware side of things, in previous years’ Quakecons we've had debates about what GPU was better for certain game engines, certain titles and what features AMD and NVIDIA do better. You've said previously that CPUs now, you don't worry about what features they have as they do what you want them to do. Are we at that point with GPUs? Is the hardware race over (or almost over)?
John Carmack: I don't worry about the GPU hardware at all. I worry about the drivers a lot because there is a huge difference between what the hardware can do and what we can actually get out of it if we have to control it at a fine grain level. That's really been driven home by this past project by working at a very low level of the hardware on consoles and comparing that to these PCs that are true orders of magnitude more powerful than the PS3 or something, but struggle in many cases to keep up the same minimum latency. They have tons of bandwidth, they can render at many more multi-samples, multiple megapixels per screen, but to be able to go through the cycle and get feedback... “fence here, update this here, and draw them there...” it struggles to get that done in 16ms, and that is frustrating.
Ryan Shrout: That's an API issue, API software overhead. Have you seen any improvements in that with DX 11 and multi-threaded drivers? Are those improving that or is it still not keeping up?
John Carmack: So we don't work directly with DX 11 but from the people that I talk with that are working with that, they (say) it might [have] some improvements, but it is still quite a thick layer of stuff between you and the hardware. NVIDIA has done some direct hardware address implementations where you can bypass most of the OpenGL overhead, and other ways to bypass some of the hidden state of OpenGL. Those things are good and useful, but what I most want to see is direct surfacing of the memory. It’s all memory there at some point, and the worst thing that kills Rage on the PC is texture updates. Where on the consoles we just say “we are going to update this one pixel here,” we just store it there as a pointer. On the PC it has to go through the massive texture update routine, and it takes tens of thousands of times [longer] if you just want to update one little piece. You start to advertise that overhead when you start to update larger blocks of textures, and AMD actually went and implemented a multi-texture update specifically for id Tech 5 so you can bash up and eliminate some of the overhead by saying “I need to update these 50 small things here,” but still it’s very inefficient. So I’m hoping that as we look forward, especially with Intel integrated graphics [where] it is the main memory, there is no reason we shouldn't be looking at that. With AMD and NVIDIA there's still issues of different memory banking arrangements and complicated things that they hide in their drivers, but we are moving towards integrated memory on a lot of things. I hope we wind up being able to say “give me a pointer, give me a pitch, give me a swizzle format,” and let me do things managing it with fences myself and we'll be able to do a better job.