GPU Context Switching in Vista. . .

Does context include z/stencil? It seems to me that it would have to.

Are Z/stencil values for a fragment effectively crystalised at the point of rasterisation, making Z/stencil for the fragment part of thread state?

Does D3D10 support multiple concurrent Z/stencil buffers, each independent of the others? Or is each application merely accessing a port within a screen-resolution sized Z/stencil buffer?

How is Z/stencil organised if you run UT2K4 and HL-2 in two 640x480 windows side by side? And then drag Notepad over the top of them? Or start mucking about with Alt-Tab?

Jawed
 
geo said:
Does the USC concept come in here at all? Say I have two business apps with 3D-y interfaces crunching away and making purty pictures. Does USC mean it's easier for the GPU to assign, say 2/3rds of the units to the active window while keeping 1/3 going on the other window?
I don't think it makes much of a difference, but I don't expect this kind of parallel context execution to happen anytime soon (if ever) anyway. It would be similar to Hyperthreading, but HT is only a significant improvement when units would idle otherwise.
 
geo said:
Does the USC concept come in here at all? Say I have two business apps with 3D-y interfaces crunching away and making purty pictures. Does USC mean it's easier for the GPU to assign, say 2/3rds of the units to the active window while keeping 1/3 going on the other window?
This is why I asked about render state, because I'm under the impression that D3D10 GPUs support multiple render states, which can execute concurrently. i.e. it's the GPU's internal thread scheduler's job to run concurrent render states, not the OS's (i.e. not WDDM).

But maybe I've got entirely the wrong end of the stick.

Jawed
 
Jawed said:
Does context include z/stencil? It seems to me that it would have to.
AFAIU, it does. Everything.
This is why I asked about render state, because I'm under the impression that D3D10 GPUs support multiple render states, which can execute concurrently. i.e. it's the GPU's internal thread scheduler's job to run concurrent render states, not the OS's (i.e. not WDDM).
Again, I'm not certain but I have the feeling that the GPU is told, by the OS, what to run, when to run it, and when to swap.
 
Jawed said:
Does context include z/stencil? It seems to me that it would have to.

Are Z/stencil values for a fragment effectively crystalised at the point of rasterisation, making Z/stencil for the fragment part of thread state?
For Z, yes. Fragments don't have stencil values, though. Those only exist in the stencil buffer.

Does D3D10 support multiple concurrent Z/stencil buffers, each independent of the others? Or is each application merely accessing a port within a screen-resolution sized Z/stencil buffer?
You can create lots of Z/stencil surfaces and color render targets, independent of each other. But only one Z/stencil surface and a limited number of color render targets may be bound at any given time (given they all match in resolution, AA mode and some other limitations).
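To make the binding rule concrete, here is a toy model in C++. It is not the D3D10 API (there the call is `ID3D10Device::OMSetRenderTargets`); `Surface`, `OutputMerger` and `bind` are hypothetical names, and the only compatibility checks modelled are resolution and sample count. The limit of 8 simultaneous color render targets is D3D10's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical surface description; real surfaces carry far more state.
struct Surface {
    int width;
    int height;
    int samples;  // AA mode, reduced to a sample count for this sketch
};

constexpr std::size_t kMaxColorTargets = 8;  // D3D10's simultaneous RT limit

// Models the output stage: any number of surfaces may exist, but only one
// depth/stencil surface and up to kMaxColorTargets color targets are bound.
struct OutputMerger {
    std::vector<const Surface*> color;
    const Surface* depth_stencil = nullptr;

    // Returns false (leaving the old binding intact) if the set is invalid.
    bool bind(const std::vector<const Surface*>& targets, const Surface* ds) {
        if (targets.size() > kMaxColorTargets) return false;
        // Everything is compared against the depth/stencil surface if there
        // is one, otherwise against the first color target.
        const Surface* ref = ds ? ds : (targets.empty() ? nullptr : targets[0]);
        for (const Surface* t : targets) {
            assert(t != nullptr);
            if (t->width != ref->width || t->height != ref->height ||
                t->samples != ref->samples) {
                return false;  // mismatched resolution or AA mode
            }
        }
        color = targets;
        depth_stencil = ds;
        return true;
    }
};
```

Two 640x480 color targets plus a 640x480 Z/stencil surface bind fine in this model; swapping one color target for a 1280x1024 surface is rejected, and so is a list of nine targets.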

How is Z/stencil organised if you run UT2K4 and HL-2 in two 640x480 windows side by side? And then drag Notepad over the top of them? Or start mucking about with Alt-Tab?
I think each application should be rendering to its own color and Z buffers.
 
I think there are at least three levels of GPU threading going on here:
  • application
  • render state - it would be normal for a game to use multiple render states, I presume, but most "2D" apps would only need a single render state
  • vector (a set of 64 fragments, say, in a hardware thread)
and I'm, ahem, mistakenly blurring the line between application and render state for the purposes of context switching as originally raised by Geo :oops:

Meanwhile the D3D10 GPU can, I suppose, freely schedule vectors from one or more render states, according to its own internal load-balancing mechanisms. (I'm basing this on what I understand of the way Xenos works.)

I suppose the UI compositing engine in Vista operates with a single render state, with its own screen-res colour/alpha/z/stencil (i.e. what you see when Vista desktop is visible). I presume all applications individually render to what they think is the screen, and these buffers are textured onto quads (in 3D space) by the compositing engine. The compositing engine, itself, doesn't need to be multi-threaded on the GPU (does it?).

Jawed
 
Jawed, I'm not sure what you're referring to by "render state". Usually, render states are the "settings" of a GPU, like blend mode, depth compare function, texture filter and wrap modes, etc.
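As a rough illustration of "render state" in that sense, here is a minimal C++ sketch. All the names (`RenderState`, the enums, `same_state`, `count_state_changes`) are made up for this example and are not D3D identifiers; a real GPU tracks many more settings than these four.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical enums; the names are illustrative, not D3D's.
enum class BlendMode : std::uint8_t { Opaque, Alpha, Additive };
enum class DepthFunc : std::uint8_t { Never, Less, LessEqual, Always };
enum class TexFilter : std::uint8_t { Point, Linear, Anisotropic };
enum class TexWrap   : std::uint8_t { Repeat, Clamp, Mirror };

// "Render state" in the usual sense: a bundle of pipeline settings.
struct RenderState {
    BlendMode blend  = BlendMode::Opaque;
    DepthFunc depth  = DepthFunc::Less;
    TexFilter filter = TexFilter::Linear;
    TexWrap   wrap   = TexWrap::Repeat;
};

// True when two draw calls could run with identical settings.
inline bool same_state(const RenderState& a, const RenderState& b) {
    return a.blend == b.blend && a.depth == b.depth &&
           a.filter == b.filter && a.wrap == b.wrap;
}

// Number of times the settings change across a sequence of draw calls.
inline std::size_t count_state_changes(const std::vector<RenderState>& draws) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < draws.size(); ++i) {
        if (!same_state(draws[i - 1], draws[i])) ++changes;
    }
    return changes;
}
```

A sequence of draws that goes opaque, opaque, alpha, alpha, opaque has two state changes under this definition.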
 
I think Jawed is referring to Xenos's ability to execute multiple draw calls (primitive batches in OpenGL) in a parallel/pipelined manner. Of course each of those draw calls has its own render state that must be stored in the GPU, and thus Xenos is said to support multiple render states. He is extrapolating that the same feature may be a requirement of D3D10 or a feature in future ATI GPUs.
 
He is extrapolating that the same feature may be a requirement of D3D10 or a feature in future ATI GPUs.
Uh, it's called "pipelining" and has been a feature of GPUs for the last 10 years or so. I don't know how D3D10 could possibly mandate this, since this is a performance optimization. Apart from looking at the (small batch, non-D3D) perf numbers, you can't tell if the chip is working on multiple things in different states at once or not.

Context switching isn't that at all. In the CPU world, it doesn't matter if the CPU is executing 1 instruction at a time or 100. When you switch contexts, you stop all execution (draining the pipeline if necessary), write out the (small) current CPU state to memory, then read back some other CPU state from memory and resume execution from there, refilling the pipeline.
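A minimal sketch of that save/restore sequence, as a toy model in C++. `CpuState`, `Core` and `switch_context` are invented for illustration; a real kernel does this in assembly against the actual register file, and the four-register state here is only a stand-in.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical, tiny "CPU state": a few registers plus a program counter.
struct CpuState {
    std::array<std::uint64_t, 4> regs{};
    std::uint64_t pc = 0;
};

// Toy core: `live` is the state held in hardware, `in_flight` counts
// instructions still somewhere in the pipeline.
struct Core {
    CpuState live;
    int in_flight = 0;
};

// Switch from the running context to `next`:
//   1. drain the pipeline,
//   2. write the current state out to memory (`saved`),
//   3. read the other state back in; execution resumes at next.pc.
// `saved` and `next` must be different objects.
inline void switch_context(Core& core, CpuState& saved, const CpuState& next) {
    assert(&saved != &next);
    core.in_flight = 0;  // drain: nothing may be left half-executed
    saved = core.live;   // write out the (small) current state
    core.live = next;    // load the other context
}
```

Whether one instruction or a hundred were in flight before the call, the sequence has the same shape: the pipeline depth only affects how much work the drain step throws away or waits for.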
 
I know perfectly well what context switching is in a CPU ... and in a GPU. I was just trying to explain what I thought Jawed was talking about. Even my simulator supports some primitive pipelining (of just two batches). But the way it's explained in the Xenos documents seems to suggest that there is something new, either because they are comparing against something that didn't have such a feature or because it has been improved, for example with deeper pipelining or more render states/draw calls active in the GPU pipeline.

And in any case there are different kinds of context switching and parallelism. If the Vista driver model is trying to copy all the features of multitasking CPUs, one of those features is threads, which are a different concept from processes. Then you may also have rendering threads from the same GPU process running concurrently, executing in a pipeline of draw calls (for example, that would be useful if you had a few render targets or shadow maps to fill). So there would be GPU processes, GPU threads and draw calls.
 
Paul Thurrott said:
At the WinHEC 2006 tradeshow last week, Microsoft revealed that the version of the Windows Display Driver Model (WDDM) graphics driver model it will ship with Windows Vista is only the first in a series of dramatic improvements to the way that Windows displays images onscreen. Future revisions to WDDM will appear in subsequent Windows releases, Microsoft engineer Steve Pronovost revealed, and those updates will improve the way that Windows schedules graphics tasks in the graphic card's GPU (graphics processing unit). Given Microsoft's issues moving Windows to WDDM 1.0 in Vista, my guess is that it will be at least three years post-Vista that we have to worry about WDDM 2.0, which will require not-yet-released graphics hardware from ATI and NVIDIA.

http://www.windowsitpro.com/windows...rticleID/50469/windowspaulthurrott_50469.html
 
:LOL: If WDDM2.0 is a D3D10 enabler, then 3 years after Vista might be a bit problematical. ;) Clearly there's still some confusion on the matter.
 
I am not sure who started this rumor, but in the whole WDK I can’t find a single word saying that a D3D10 driver needs WDDM 2. The reference and design guide contains only information about how to write a D3D10 driver for WDDM 1.
 
Thank ghu, Demi is here to save us. :smile:

Maybe we're getting tangled up in WGF2.0 vs WDDM2.0? Are they the same thing?
 
geo said:
Thank ghu, Demi is here to save us. :smile:

Maybe we're getting tangled up in WGF2.0 vs WDDM2.0? Are they the same thing?

No. WGF 2 is the old name for D3D10. You can still find this in the D3D10 header.
#define D3D_MAJOR_VERSION ( 10 )
#define D3D_MINOR_VERSION ( 0 )
#define D3D_SPEC_DATE_DAY ( 8 )
#define D3D_SPEC_DATE_MONTH ( 8 )
#define D3D_SPEC_DATE_YEAR ( 2005 )
#define D3D_SPEC_VERSION ( 1.03000002 )
#define WGF_MAJOR_VERSION ( 2 )
#define WGF_MINOR_VERSION ( 0 )

WDDM 2 is the improved version of the GPU memory manager and scheduler. It will make GPUs more CPU-like in their memory and threading behavior. With WDDM 1 the kernel already controls memory and job scheduling, but WDDM 2 will make this much more granular. This will improve behavior when more than one job has to run on the GPU, such as graphics, physics and sound processing.
But this is very deep in the kernel, and I am sure that WDDM 2 will not require many changes in the user-mode part of the driver. BTW, the D3D10 user-mode driver interface looks very interesting.
 
geo said:
:LOL: If WDDM2.0 is a D3D10 enabler, then 3 years after Vista might be a bit problematical. ;) Clearly there's still some confusion on the matter.
Demirug is correct that WDDM2.0 isn't needed for D3D10.
 
Thanks, Demirug. I'd rep you again, but I'm fresh out at the moment. :smile:

So then, when are YOU expecting WDDM 2.0? And how penal do we think the context switching in WDDM 1.0 will be in the meantime?
 
Blazkowicz_ said:
what happened to wddm-basic and wddm-advanced? those are version 1.0 and 2.0?
That's what I was assuming. Microsoft seems to change names a lot, but last I heard basic and advanced were still used.
 
geo said:
And how penal do we think the context switching in WDDM 1.0 will be in the meantime?
You should be able to test this with the Vista Beta and current cards. Mac OS X might give another data point.
 