Turbocache/Hypermemory and their usefulness

Diamond.G

Regular
I am knee-deep in a MacRumors thread trying to explain why more VRAM is a good thing, and I have come across an interesting argument. I feel that I am now confused as to the thread title. (Wow, that made no sense...)


Basically, a user there is telling me that Turbocache/Hypermemory (whatever that technology is called) would be useful for high-end GPUs as well, since the bandwidth it provides is additive. I was pretty sure it isn't, and now I need some help explaining my side.


Originally Posted by diamond.g
Hmm, the bandwidth isn't additive.

Evangelions' response:
Actually, it is. That's how it might work with AGP texturing, but that's not how it works with Turbocache and the like. Hell, 3Dlabs used something similar in their high-end Wildcat 3D cards. They could use the local VRAM and system RAM as one big chunk of RAM. It was smart enough to put the most often used things into the VRAM, but the RAM was still logically handled as one big pool.
 
It is additive if you manage to match the ratio of memory accesses to the bandwidth ratio. That is, if main memory bandwidth is 1/10 of GPU memory bandwidth, you need to read/write 10 times as much to GPU memory as to system memory to get the additive effect.

I really doubt that it would be useful for high-end cards though, because the PCIe bus only provides a maximum of 4 GB/s read and 4 GB/s write, compared to over 100 GB/s of memory bandwidth on top-end GPUs. Plus, the system memory accesses would have to compete with CPU memory accesses and would have much higher latency, which could reduce shading performance.
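
To put rough numbers on the "additive" argument, here is a minimal sketch. It assumes both memory paths transfer in parallel with no contention or latency effects, and that whichever path finishes last sets the total time. The function names are made up for illustration, and the 100 GB/s and 4 GB/s figures are just the ones quoted above.

```python
def effective_bandwidth(local_bw, system_bw, system_fraction):
    """Combined throughput when `system_fraction` of all traffic goes to system memory.

    Simplifying assumption: both paths run fully in parallel, so the total
    time is set by whichever path takes longer to move its share.
    """
    if not 0.0 <= system_fraction <= 1.0:
        raise ValueError("system_fraction must be between 0 and 1")
    local_time = (1.0 - system_fraction) / local_bw
    system_time = system_fraction / system_bw
    return 1.0 / max(local_time, system_time)


def ideal_system_fraction(local_bw, system_bw):
    """Traffic split at which both paths finish together (the additive case)."""
    return system_bw / (local_bw + system_bw)


LOCAL_BW = 100.0  # GB/s, "top end gpu mem bw" figure from the post above
PCIE_BW = 4.0     # GB/s, one-direction PCIe figure from the post above

best = ideal_system_fraction(LOCAL_BW, PCIE_BW)
print(f"ideal share of traffic over PCIe: {best:.1%}")
print(f"bandwidth at ideal split: {effective_bandwidth(LOCAL_BW, PCIE_BW, best):.1f} GB/s")
print(f"bandwidth at 50/50 split: {effective_bandwidth(LOCAL_BW, PCIE_BW, 0.5):.1f} GB/s")
```

Under these assumptions the sum of the two bandwidths is only reached at one exact split. A 50/50 split works out to 8 GB/s, far worse than using local memory alone.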
 

Okay, so the memory controller would be smart enough to do all this. So how much of a performance hit or gain would we be talking about? I would think the hit is huge, but I ask because I am not sure.
 
Hypermemory/Turbocache wouldn't significantly affect the performance of a high-end card, as it makes no sense to address system RAM when you have local memory available.
 
Memory virtualisation will be used and would be useful; I can imagine it for terrain (mega-)textures in a vast free-roaming game.
 
3Dlabs' implementation of virtual memory is what Evangelions is referring to, and it is useful for high-end cards. Fetching only the necessary parts of a texture and storing them in fast local RAM is useful.

The alternative of storing most textures in local RAM and a few in system memory isn't good, because when you have to fetch from system memory the latency will be horrendous.

Most graphics chips are likely optimized to hide the latency of fetching from local memory, so stalls are more likely when fetching from system memory.
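
The idea of keeping only the needed parts of a texture in fast local RAM can be sketched as a small tile cache. This is a toy model, not how 3Dlabs or any real driver does it: the `TileCache` class, the least-recently-used eviction policy, and the tile names are all assumptions made for illustration.

```python
from collections import OrderedDict


class TileCache:
    """Toy model of a fixed pool of local VRAM holding texture tiles (hypothetical)."""

    def __init__(self, capacity_tiles):
        if capacity_tiles < 1:
            raise ValueError("capacity_tiles must be at least 1")
        self.capacity = capacity_tiles
        self.resident = OrderedDict()  # tile id -> True, least recently used first
        self.hits = 0
        self.misses = 0

    def fetch(self, tile_id):
        """Return where the tile was served from: 'local' or 'system'."""
        if tile_id in self.resident:
            # Already in VRAM: mark as most recently used.
            self.resident.move_to_end(tile_id)
            self.hits += 1
            return "local"
        # Not resident: this is the slow, high-latency trip to system memory.
        self.misses += 1
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict the least recently used tile
        self.resident[tile_id] = True
        return "system"


cache = TileCache(capacity_tiles=2)
sources = [cache.fetch(tile) for tile in ["a", "b", "a", "c", "b"]]
print(sources, cache.hits, cache.misses)
```

Every "system" result stands for one of the slow fetches described above, so the scheme only pays off when the working set of tiles mostly fits in local memory.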
 
On a semi-related matter, how much of the theoretical 4.2 GB/s of texture upload bandwidth could one hope for in practice?
I'm talking about from RAM to VRAM, but I guess with PCI Express the direction shouldn't affect throughput?
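
For reference, the roughly 4 GB/s per-direction figure can be reproduced from the link parameters. This sketch assumes a first-generation PCIe x16 slot (2.5 GT/s per lane, 8b/10b encoding), which the thread does not state outright. It gives only the raw link ceiling and does not answer how much of it is reachable in practice.

```python
# Assumed first-generation PCIe x16 parameters (not stated in the thread).
TRANSFERS_PER_SECOND = 2_500_000_000  # 2.5 GT/s per lane, per direction
PAYLOAD_BITS = 8                      # 8b/10b encoding: 8 data bits...
LINE_BITS = 10                        # ...carried in every 10 bits on the wire
LANES = 16

# Usable data bits per second on one lane, then converted to bytes.
data_bits_per_lane = TRANSFERS_PER_SECOND * PAYLOAD_BITS // LINE_BITS
bytes_per_lane = data_bits_per_lane // 8
bytes_per_direction = bytes_per_lane * LANES

print(f"per lane: {bytes_per_lane / 1e6:.0f} MB/s")
print(f"x16, one direction: {bytes_per_direction / 1e9:.1f} GB/s")
```

Each lane has separate transmit and receive pairs, so this ceiling applies to each direction independently.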
 
The primary purpose of Hypermemory/Turbocache isn't performance (which is more of a side effect, actually) but cutting cost. Onboard graphics memory is far from cheap, and the less of it you use, the cheaper the graphics card will be. However, using only 64 MB of VRAM where, let's say, 128 MB is required would result in a great performance penalty on a regular card, because you have just half of the memory required, with some system memory reserved just for textures over the AGP or PCIe bus. Hypermemory/Turbocache replaces the missing board memory in every respect, not just for textures as happens on regular cards (system memory is used as framebuffer and texture storage, along with everything else normally done in VRAM). So the end result is decent performance at much lower cost. That's why you see this tech on low-end budget cards only, while mid-range and high-end cards do not have it.
 
For Turbocache/Hypermemory to be useful, you need the developers to take it into account, like JC with the megatextures. If they develop for fixed memory sizes, you're not going to see a benefit.

That's one reason why DX10 is serious about virtual memory, but it will still need the devs to make optimal use of it.
 