Either the increased texture resolution will have diminishing returns because of low screen resolution, or demands on texturing will increase greatly - and texturing power usually scales with other parts of the chip.
Higher texture resolution = additional 2x2 higher quality mip levels. The GPU accesses the higher mips only for surfaces close to the camera (assuming roughly uniform texel density on all meshes). The cost of rendering further-away geometry thus remains the same.
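To make that concrete, here is a rough sketch of the mip selection math (everything here is illustrative: the function name, the 1 m^2 material and the ~1000 pixels per world unit at 1 m are assumptions, not engine values). It shows that a 4K version of a texture simply samples one mip deeper than the 2K version at every distance (same effective texel density), and its added top mip is only touched at roughly half the distance where the 2K top mip was.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// Which mip does the GPU sample? Roughly: the one where the screen-space
// texel footprint is about one pixel. For a camera-facing surface that
// footprint scales linearly with distance. Mip 0 is the most detailed.
float sampledMip(float textureSize,             // texels along one axis
                 float texelWorldSize,          // world-space size of one texel
                 float distance,                // surface distance from camera
                 float pixelsPerWorldUnitAt1m)  // projection/resolution term
{
    // How many texels land inside one screen pixel at this distance.
    float texelsPerPixel = distance / (texelWorldSize * pixelsPerWorldUnitAt1m);
    float mip = std::log2(std::max(texelsPerPixel, 1.0f));
    float mipCount = std::log2(textureSize) + 1.0f;
    return std::min(mip, mipCount - 1.0f);
}

int main()
{
    // Same 1 m^2 material authored at 2K and 4K: at every distance the 4K
    // texture samples one mip deeper (same effective resolution); only very
    // close to the camera does its extra mip 0 actually get accessed.
    for (float dist : { 0.2f, 0.5f, 1.0f, 4.0f, 16.0f })
        std::printf("dist %5.1f m: 2K texture mip %.2f, 4K texture mip %.2f\n", dist,
                    sampledMip(2048.0f, 1.0f / 2048.0f, dist, 1000.0f),
                    sampledMip(4096.0f, 1.0f / 4096.0f, dist, 1000.0f));
}
```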
The background is in general more expensive to render than the foreground (more objects, more triangles per pixel, more discontinuities -> more texture cache misses). A surface close to the camera blocks a big chunk of the background -> you see a sudden frame rate improvement. If this surface has a 2x2 higher quality texture, it only makes the frame rate more even (reduces the max frame rate a bit, but has no effect on the min frame rate). Close-up geometry has bigger continuous surfaces -> fewer texture cache misses. Thus the performance degradation from super high resolution textures is minimal.
I of course agree with you that high res textures matter a lot less at lower output resolutions. 2x2 lower resolution (4K -> 1080p) means that the GPU accesses a 1 mip level lower version of each texture. 1 mip level = 2x2 higher resolution = 4x higher memory cost. But streaming of course makes the extra memory cost much more manageable. And it's worth noting that at 1080p the streaming system will load each mip at half the distance it would at 4K. Thus high resolution texture packs at 1080p are much friendlier to the streaming system than at 4K.
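A quick back-of-the-envelope table of the same point (the 1 m stream-in distance for mip 0 at 4K is an arbitrary reference, not a measured value):

```cpp
#include <cstdio>

// Dropping from 4K to 1080p output biases sampling by one mip level (1/4 of
// the texels touched), and every mip's stream-in distance halves, because the
// same texel-per-pixel ratio is reached closer to the camera.
int main()
{
    const int   mipCount = 4;      // just the most detailed few mips
    const float base4K   = 1.0f;   // assumed stream-in distance of mip 0 at 4K
    for (int mip = 0; mip < mipCount; ++mip)
    {
        float dist4K    = base4K * float(1 << mip);     // each coarser mip: 2x the distance
        float dist1080p = dist4K * 0.5f;                // 2x2 lower output resolution
        float relTexels = 1.0f / float(1 << (2 * mip)); // 4x fewer texels per mip step
        std::printf("mip %d: stream-in at %4.1f m (4K) / %4.1f m (1080p), "
                    "texels relative to mip 0: %.4f\n",
                    mip, dist4K, dist1080p, relTexels);
    }
}
```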
I'm not sure why. You can still increase texture quality without increasing rendering resolution. Meaning a low end card should still be able to use more memory without greatly increasing the rendering load.
Regards,
SB
Modern game engines have sophisticated texture streaming systems. You only stream in the highest mips for objects very close to the camera. Every added (2x2 more detailed) mip level is streamed in at half the distance of the previous mip level; this is how mip mapping works. The GPU doesn't even access the highest mip levels for objects that are not very close to the camera -> no need to have them in memory. Of course you add some guard band bias to the streaming to guarantee that the data is ready in VRAM before the sampler would access it.
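A minimal sketch of such a streaming heuristic, assuming an illustrative 1 m stream-in distance for mip 0 and a guard band expressed as a mip bias (function and parameter names are made up, not from any particular engine):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// Returns the most detailed mip that should be resident for an object at the
// given distance. Mip 0 streams in at mip0Distance, and each coarser mip at
// double the previous distance. The guard band bias streams data in slightly
// earlier than the sampler strictly needs it.
int requiredResidentMip(float objectDistance, float mip0Distance,
                        int mipCount, float guardBandBias /* in mip levels */)
{
    // Ideal mip: 0 at mip0Distance, +1 for every doubling of distance.
    float ideal = std::log2(std::max(objectDistance / mip0Distance, 1.0f));
    // Bias towards more detail so the data is in VRAM before it is sampled.
    int resident = (int)std::floor(ideal - guardBandBias);
    return std::clamp(resident, 0, mipCount - 1);
}

int main()
{
    for (float d : { 0.5f, 1.0f, 2.0f, 5.0f, 20.0f, 100.0f })
        std::printf("object at %6.1f m -> most detailed resident mip %d\n",
                    d, requiredResidentMip(d, 1.0f, 12, 0.5f));
}
```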
Thus higher quality textures do not cost as much extra memory as people commonly think. 90%+ of the rendered objects are further away from the camera -> only lower mips are accessed.
Theoretically, you only need to have one texel loaded per pixel on the screen. So roughly 2M texels for 1080p and 8M texels for 4K. Assuming point sampling, this is easily proven by the pigeonhole principle: https://en.wikipedia.org/wiki/Pigeonhole_principle. Bilinear filtering needs to blend with neighbors. Fortunately most surfaces have continuous UVs and are significantly larger than a single pixel, so most bilinear accesses are shared with neighbors. Thus you only need to pay extra memory cost for discontinuities (object edges, UV seams, mips). As you increase the resolution, the area of each surface grows quadratically (x*x), while the count of edge pixels grows linearly (x). Thus edge pixels become an increasingly small percentage.

Mip mapping with trilinear filtering adds a significant cost, but this is a constant multiplier, independent of the number of textures or their sizes. When rounded towards the less detailed mip, there's a 25% extra cost, and when rounded towards the more detailed mip, there's a 5x cost multiplier. Of course these additional samples are also shared with neighbors, mitigating some of the cost.

Virtual texturing gets pretty close to this theoretical optimum. The biggest difference is that you can't load single individual texels (too many seeks); you need to group nearby texels into tiles (commonly 128x128 pixels). This further increases the impact of discontinuities. My research on virtual texturing tells me that screen pixel count x4 is a good upper estimate of the number of texels required to texture a single frame. You'd want to have at least a 4x larger cache to ensure no streaming when the player rotates the camera around.
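Plugging in numbers (0.5 bytes per texel, i.e. BC1-style compression, is an assumption on top of the estimates above):

```cpp
#include <cstdio>

// Texel budget estimate: screen pixel count x4 as an upper bound for texels
// touched per frame, rounded into 128x128 tiles, plus a 4x larger cache for
// camera rotation headroom.
int main()
{
    struct { const char* name; int w, h; } res[] = {
        { "1080p", 1920, 1080 }, { "4K", 3840, 2160 }
    };
    const double texelsPerPixel = 4.0;   // upper estimate from VT measurements
    const int    tileTexels     = 128 * 128;
    const double bytesPerTexel  = 0.5;   // e.g. BC1/DXT1 compressed
    const double cacheFactor    = 4.0;   // headroom for rotating the camera

    for (auto& r : res)
    {
        double pixels  = double(r.w) * r.h;
        double texels  = pixels * texelsPerPixel;
        double tiles   = texels / tileTexels;
        double cacheMB = texels * cacheFactor * bytesPerTexel / (1024.0 * 1024.0);
        std::printf("%-6s: %.1f Mpix -> ~%.0f Mtexels/frame, ~%.0f tiles, ~%.0f MB cache\n",
                    r.name, pixels / 1e6, texels / 1e6, tiles, cacheMB);
    }
}
```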
Sophisticated modern texture streaming systems get pretty close to virtual texturing. You don't get per-pixel occlusion (VT doesn't load hidden textures at all), but nearby objects require most of the memory, and usually there's not that much overlap near the camera. Most engines also use some kind of large scale occlusion culling / level partitioning technology to avoid loading all the further-away textures into memory. It is not as good as virtual texturing, but it is getting closer as the technology evolves. Too bad Microsoft artificially limited tiled resources to Windows 8+, meaning that no PC games use that feature yet. It would be trivial to load textures into system RAM (16 GB+ is common) and use tiled resources to change the active subset at fine granularity based on visibility. There would be no additional visible popping, as a CPU->GPU copy is an order of magnitude faster than loading textures on demand from an HDD.
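A conceptual sketch of that idea; every type and function here is a hypothetical placeholder rather than the actual tiled resources API, just to show the per-frame residency update driven by visibility:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Keep all texture tiles in system RAM and, every frame, map into the GPU
// tile pool only the tiles the visibility pass reported. The CPU->GPU copy is
// cheap compared to loading tiles on demand from HDD.
struct TileId { uint32_t texture, mip, x, y; };

inline bool operator==(const TileId& a, const TileId& b)
{
    return a.texture == b.texture && a.mip == b.mip && a.x == b.x && a.y == b.y;
}

struct TileIdHash
{
    std::size_t operator()(const TileId& t) const
    {
        return (std::size_t(t.texture) << 40) ^ (std::size_t(t.mip) << 32)
             ^ (std::size_t(t.x) << 16) ^ std::size_t(t.y);
    }
};

using TileSet = std::unordered_set<TileId, TileIdHash>;

// Hypothetical engine hooks (stubs here; a real implementation would read a
// GPU feedback buffer and call the graphics API's tile mapping/update calls).
std::vector<TileId> gatherVisibleTiles()        { return {}; }
void mapTileAndCopyFromSystemRam(const TileId&) {}
void unmapTile(const TileId&)                   {}

void updateResidency(TileSet& resident)
{
    TileSet wanted;
    for (const TileId& t : gatherVisibleTiles())
        wanted.insert(t);

    // Map newly visible tiles; their data is already sitting in system RAM.
    for (const TileId& t : wanted)
        if (!resident.count(t))
            mapTileAndCopyFromSystemRam(t);

    // Unmap tiles no longer referenced (a real engine would keep an LRU cache
    // instead of evicting immediately).
    for (const TileId& t : resident)
        if (!wanted.count(t))
            unmapTile(t);

    resident = std::move(wanted);
}

int main()
{
    TileSet resident;
    updateResidency(resident);  // would run once per frame
}
```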
Pascal P100 can already page fault from CPU memory. Knights Landing MCDRAM can also be configured as memory or as cache. 4 GB of fast VRAM (for example HBM2) as a cache + page faulting from 32+ GB of CPU memory (DDR4) should be more than enough for 4K and even 8K. Unfortunately the PC hardware and OS install base is too fragmented to make this a reality soon.
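Rough arithmetic behind that claim, reusing the screen-pixels-x4 estimate and 4x rotation headroom from above (0.5 bytes/texel compression is again an assumption); the rest of the 4 GB is left for render targets, geometry and other resources:

```cpp
#include <cstdio>

// The per-frame texture working set is small compared to a 4 GB VRAM cache,
// even at 8K, which is why paging the rest from CPU memory is plausible.
int main()
{
    const double texelsPerPixel = 4.0, cacheFactor = 4.0, bytesPerTexel = 0.5;
    struct { const char* name; double pixels; } res[] = {
        { "4K", 3840.0 * 2160.0 }, { "8K", 7680.0 * 4320.0 }
    };
    for (auto& r : res)
    {
        double gb = r.pixels * texelsPerPixel * cacheFactor * bytesPerTexel
                  / (1024.0 * 1024.0 * 1024.0);
        std::printf("%s: ~%.2f GB texture cache needed vs 4 GB HBM2\n", r.name, gb);
    }
}
```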