Memory Virtualization & DirectX 10

Kaotik

I did a fair bit of searching, but couldn't find what I was looking for, so here goes

Are the rumors true that Microsoft dropped the full memory virtualization requirement from DX10 because nVidia wasn't able to fulfill it, or is that false?
 
I don't know of any conspiracy theories regarding specific IHVs, but I am aware that it's one of the less developed sections of WDDM/DXGI/D3D10 - at least IMHO.

ISTR it was the WinHEC '06 slides outlining the WDDM models going forward that described WDDM 1.0 (in Vista RTM and D3D10.0) as more of a placeholder. Yes, it did the "GPU as a shared resource" part, but other than more rigorous control it didn't strike me as offering much beyond XPDM as far as features go. The 2.0, 2.1 and 3.0 outlines looked a lot more interesting and struck me as being the "real" WDDM virtualization and scheduling features.

I don't think I ever saw anything to suggest that anything other than 1.0 was shipping with Vista RTM and to my knowledge that's what we got... which is contradictory to your statement. Do you have any sources?

Cheers,
Jack
 
Well, the new WDK beta, which already contains the first bits for Vista SP1 (and Direct3D 10.1), doesn't show any sign of a change in the virtual memory management.
IMHO this whole rumor is based on a misinterpretation of something someone may have heard. In the past there were plans to move to WDDM 2.0/2.1 with SP1. WDDM 2.0/2.1 requires a more granular kind of memory virtualization than WDDM 1.0, and this finer-grained form requires hardware support. Maybe nVidia was not able to integrate this in G90 and therefore asked Microsoft to drop the requirement for Direct3D 10.1 hardware.
 
Sorry to be so late to the subject, but could anybody explain, in simple terms, what is so great about graphics card memory virtualization in Vista/DX10?

What can you "do" with it?

Couple of old links:

(I) http://www.theinquirer.net/en/inquirer/news/2007/07/11/dx10-is-do-able-on-windows-xp

The original reason was that DX10 required graphics memory to be virtualisable, a laudable goal.

(II)
http://www.firingsquad.com/hardware/directx_10_graphics_preview/page8.asp

bold operating-system-wide initiatives, including video memory virtualization

A lot of major ideas were proposed, including a multi-year effort by John Carmack to lobby for video memory virtualization.

The first link is The Inquirer & is mainly concerned with "DX10 is do-able on XP".

So, MS threw NV a life preserver and made GPU memory virtualisation completely optional. ATI, which had implemented a dandy memory virtualisation scheme got screwed, or at least got what everyone who partners with MS got. Oh wait, I said that.

Is there any truth to the above? Would I be better off with an ATi 38xx card (if I was even on Vista) or the faster 8800GT?
Or doesn't it matter at all for this generation of GPUs?

Anyway, keep replies simple so I stand some chance of understanding. :)
 
Is there any truth to the above?
The rumor that virtual memory isn't supported in hardware seems wrong to me. Have a look here in the CUDA forums, from some time ago, at the bottom of the page:
Meanwhile, G80 does have virtual addressing. CUDA context memory spaces can be thought of as virtual address spaces.
The comment comes from an Nvidia employee, so the info should be reliable.
 
Virtual memory and Virtualization of memory are two different things, at least in these contexts (if I understand them correctly).

Virtual memory means the graphics card can do memory address translation. This gives you things like your arguments always showing up at the same virtual address and memory protection, so a CUDA thread can segfault if it chases stray pointers (otherwise you could play an encrypted HD movie, and have a CUDA thread that copies off the framebuffer contents).

Virtualization of memory is/was part of Microsoft's idea to have the OS manage the GPU in the same way the OS manages the CPU and main memory. The OS would be in charge of context-switching in/out different 3d apps, paging in/out texture memory, etc. I have no idea what the state of virtualization in Vista is.

Virtual memory is a must, and exists currently. Virtualization could theoretically be a good idea, since the OS has more information about processes and their activity, so it could schedule GPU and CPU time together or something. The problem is that ATI and Nvidia have been writing damn good schedulers for years, custom tailored to their hardware and its particular quirks/performance. MS has never managed GPUs before, and likely would have/has one GPU scheduler that "works" on everything from Intel integrated to an ATI/NV low-end part to an R600/G80.
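
To make the "address translation plus protection" part a bit more concrete, here's a toy, purely conceptual sketch of what the GPU's MMU is doing when a thread chases a pointer. Everything below (page size, table layout, names) is invented for illustration and has nothing to do with any vendor's real page tables.

```cpp
// Toy sketch only: roughly what a GPU MMU does when a thread dereferences a pointer.
#include <cstddef>
#include <cstdint>
#include <optional>

struct PageTableEntry {
    uint64_t physical_page;  // where the data really lives (VRAM or system RAM)
    bool     valid;          // is this virtual page mapped at all?
    bool     writable;       // simple protection bit
};

constexpr uint64_t kPageSize = 64 * 1024;  // made-up page size

// Returns the physical address, or nothing if the access should fault.
std::optional<uint64_t> translate(const PageTableEntry* table, size_t num_pages,
                                  uint64_t virtual_addr, bool is_write) {
    uint64_t page   = virtual_addr / kPageSize;
    uint64_t offset = virtual_addr % kPageSize;
    if (page >= num_pages || !table[page].valid)
        return std::nullopt;  // stray pointer: fault instead of hitting someone else's memory
    if (is_write && !table[page].writable)
        return std::nullopt;  // write to a read-only page also faults
    return table[page].physical_page * kPageSize + offset;
}
```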
 
Answers much appreciated;

Afraid I don't understand why somebody like John Carmack, who must understand the strengths of ATi/nV schedulers, would "lobby for video memory virtualization" (I presume at OS level) unless it offered a clear advantage for what he wants to do... and what would he do with it / what advantage does it offer over the individual ATi/nV routes? Is it, perhaps, that it would allow standard code (as in, not ATi- or nV-specific) that fits all, thus saving the programmer from having to include different paths/tweaks to get the best performance from different hardware?

As I think is evident, I'm still in the dark about the whole subject.

More answers/clarification required I'm afraid. :)
 
If you could get low enough level access (I don't see it happening with DX) it would make partial loading of textures possible. You could do (mega-)texture streaming without tiling the texture. You could efficiently implement sparse textures (i.e. textures where the highest resolution LOD levels aren't always populated).
 
I have no idea what virtual memory means in the context of DX10, as I have no idea about DX whatsoever. IMHO, virtualisation of GPU memory would allow a hardware implementation of "general megatexture", and this is what I find the most interesting thing. Basically, you can have textures of unlimited detail. The hardware would determine whether the required texture page is in video memory; if not, it would inform the OS and request the page from the application. Still, it would be rather difficult to implement efficiently, as the GPU is currently an order of magnitude faster than CPUs, so a page request would be extremely slow.

Current test implementations of general virtual textures first determine which pages will be shown and then build a single large (screen-size) texture "cache" from those parts. This image is then sufficient to texture the whole scene. The idea of partial texture loading also goes in that direction, but IMHO it doesn't avoid the problem of texture switching... This way it doesn't really matter whether one has real partial textures or only emulates them with the cache, as you still have to pre-load texture pages.
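
To illustrate the "determine the visible pages, then build a cache from them" approach, here's a very rough sketch of the CPU-side bookkeeping. All names and structures are made up; this isn't any particular engine's implementation.

```cpp
// Rough sketch of per-frame virtual texture cache maintenance.
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct PageId { uint32_t mip, x, y; };             // one tile of the huge virtual texture
inline bool operator<(const PageId& a, const PageId& b) {
    return std::tie(a.mip, a.x, a.y) < std::tie(b.mip, b.x, b.y);
}
struct CacheSlot { uint32_t x, y; };               // tile position in the physical cache texture

std::map<PageId, CacheSlot> resident;              // indirection table the renderer samples through

// Called once per frame with the page IDs produced by a low-res feedback pass.
void updateCache(const std::vector<PageId>& visiblePages) {
    for (const PageId& p : visiblePages) {
        if (resident.count(p)) continue;           // already in the cache
        CacheSlot slot = {};                       // really: grab a free slot or evict the LRU page
        // Stream the tile from disk, decompress it, and copy it into the
        // physical cache texture at 'slot' (a sub-rectangle update).
        resident[p] = slot;
    }
    // 'resident' is then uploaded as a small indirection texture, so one
    // screen-sized cache texture is enough to texture the whole scene.
}
```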
 
With hardware supporting WDDM 2.1+ the GPU could generate an interrupt and perform a context switch when such a page fault happens, allowing you to overlap the upload with useful work. If you determined you couldn't update the page in time (i.e. load/decompress the necessary part of the texture), you could always simply upscale from a lower-resolution mipmap.
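
A quick sketch of that fallback, just to show the idea; the residency lookup is passed in as a stand-in for whatever the real system would use, and all names are invented.

```cpp
// If the requested page isn't resident yet, use the next coarser mip that is.
#include <cstdint>
#include <functional>

uint32_t usableMip(uint32_t requestedMip, uint32_t x, uint32_t y, uint32_t coarsestMip,
                   const std::function<bool(uint32_t, uint32_t, uint32_t)>& isResident) {
    for (uint32_t m = requestedMip; m < coarsestMip; ++m) {
        // isResident(mip, pageX, pageY); page coordinates halve with every coarser level
        if (isResident(m, x >> (m - requestedMip), y >> (m - requestedMip)))
            return m;
    }
    return coarsestMip;  // keep the coarsest tail mips permanently resident so this always succeeds
}
```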
 
Well, it is true, but won't it produce an extreme stall in the GPU, rendering the whole idea useless? In the time it takes the CPU to react and upload the missing texture part, the GPU could have done lots of work... Preconstructing the virtual texture cache is another story: the resources would be used more efficiently, in my opinion.
 
 
Virtual memory and Virtualization of memory are two different things, at least in these contexts (if I understand them correctly).

Virtual memory means the graphics card can do memory address translation. This gives you things like your arguments always showing up at the same virtual address and memory protection, so a CUDA thread can segfault if it chases stray pointers (otherwise you could play an encrypted HD movie, and have a CUDA thread that copies off the framebuffer contents).

I didn't know there could be memory protection on the GPU :oops:.
But regarding your example, the framebuffer is encrypted anyway; that's the point of the HDCP garbage.
 
Shooting in the dark here, but would memory virtualization be of greater benefit at the lower end, where a card is using a mix of onboard & system RAM?

Thinking about what Tim Sweeney said in the Firing Squad article.

Tim Sweeney October '06 said:
I see DirectX 10's support for virtualized video memory and multitasking as the most exciting and forward-looking features. Though they're under-the-covers improvements, they'll help a great deal to bring graphics into the mainstream and increase the visual detail available in future games.

I wish I had been able to find something more recent, but the web seems to be very quiet on the subject.

EDIT: A lot of the things mentioned I just assumed the driver/scheduler already handled. Excuse my ignorance; it might be that I'm not able to ask my questions in the right way. I was originally going to post in "Beginners Questions", but this thread came up as I was about to type.
How does context switching improve things? Is it simply a method of not having parts of memory idle/stalled, or can you actively use it in the code you write to allow more efficient use of memory?
 
I didn't know there could be memory protection on the GPU :oops:.
But regarding your example, the framebuffer is encrypted anyway; that's the point of the HDCP garbage.
HDCP might have an encrypted FB, but you can still imagine why memory protection is a Good Thing, preventing a game from taking out your Windows desktop or something. This never used to be a problem since bounds checking on textures is trivial, and that was the only "pointer" sort of thing you had in DX / OpenGL. Now that you can have arbitrary pointers in CUDA (and CTM?), you have to have another layer of protection.

And I never knew it existed until I tried writing a program in CUDA and couldn't figure out why for some input sizes it ran fine but for others I got garbage data. It turns out that when CUDA segfaults, the GPU program stops running, but the CPU side will happily copy back whatever random data happened to be on the GPU before the segfault. (I later found out there's a nice "have you segfaulted?" CUDA call to at least tell me something blew up.)
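
For what it's worth, that check boils down to something like the following with the CUDA runtime API (using current function names; the exact call in the 2007-era toolkit may have been different). This is just a minimal sketch, not code from any real project.

```cpp
// Kernel launches fail asynchronously, so ask the runtime afterwards instead
// of trusting whatever cudaMemcpy would hand back.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;  // an out-of-bounds access here would kill the launch
}

int main() {
    const int n = 1 << 20;
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));

    scale<<<(n + 255) / 256, 256>>>(d, n);

    // Wait for the kernel, then check whether anything blew up *before*
    // copying results back and using them.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) err = cudaGetLastError();
    if (err != cudaSuccess)
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    cudaFree(d);
    return 0;
}
```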
 
You don't need memory address translation for memory protection though ... the former is a lot more complex.
 
And I never knew it existed until I tried writing a program in CUDA and couldn't figure out why for some input sizes it ran fine but for others I got garbage data. It turns out that when CUDA segfaults, the GPU program stops running, but the CPU side will happily copy back whatever random data happened to be on the GPU before the segfault. (I later found out there's a nice "have you segfaulted?" CUDA call to at least tell me something blew up.)

I'm not sure about the level of protection the GPU offers. Many times when my CUDA program went wrong, it would garble something on the screen, sometimes persistently. So I guess it's not very thorough protection. Of course, in most cases protection for frame buffers is good enough.
 
Afraid I don't understand why somebody like John Carmack, who must understand the strengths of ATi/nV schedulers, would "lobby for video memory virtualization" (I presume at OS level) unless it offered a clear advantage for what he wants to do.

On this point, he did speak about it at last year's QuakeCon keynote. Paraphrasing, but he said that while he pushed for virtualization of texture resources in hardware, in the end a software approach might actually be better from a developer's PoV, because you do not have to mess with different IHV implementations of it (i.e. in OGL extensions) and his code can do exactly what he wants with no superfluous (or missing) features, as he did with id Tech 5.
 