Will GPUs with 4GB VRAM age poorly?

... a lot of intelligent and insightful things.
I know you are right with everything you have written. However, you are seeing things only through your own eyes. If the world consisted only of Sebbbis (and equally proficient developers), 4GB would probably be enough for a long time. The truth is, however, that for every Sebbbi there are 25+ Joe Average programmers. Don't get me wrong, most of them are hard-working and dedicated people, but they will never-ever reach your level. Having 16 GB will help them get performance (and frame consistency) similar to what you can achieve with 4 GB.

When you look at the IT industry today, there are a lot of people who would have struggled to make a living from software development 25 years ago. Having lots of CPU power and memory to waste means they can afford to be inefficient. There are people out there who use Node and dozens of JavaScript packages in order to write software that effectively consists of 100 lines of code. It's a terrible waste, but everybody has come to accept it outside of certain areas, because it enables people to do stuff they could not have done with the IT technology of 25 years ago. GPUs will go the same way sooner or later.

I really admire your skill, but not everybody can be like you.
 
This wasteful VRAM behavior on PC didn't really start until the advent of the current generation of home consoles. Having 8 GB of shared RAM meant that developers became liberal and probably wasteful in their approach to the VRAM issue. You can only hold back the flood for so long before you drown in it. Vega/Pascal paging or not, I see no indication of the wind changing direction any time soon. That change starts with developers, not the other way around.
 
Xeon Phi MCDRAM can be configured as a cache for the DDR4 main memory. Similarly, in future desktop GPUs, 8 GB of HBM2 could be configured as a cache for 64 GB of DDR4 main memory. It could work either at cache-line granularity or at page granularity.

It would probably be preferable to go with the larger granularity. A 64GB backing store implies roughly 4.5-5 bytes of tag information per line across the 128M lines of storage in the HBM2 pool, translating into losing 0.6-0.75 GB of capacity to raw tag storage alone, before any other organization or bookkeeping on top of that. A fair amount of bandwidth would be lost to naively using those tags as well, since every cache miss would probably need yet another access to check the tag in HBM2 first. Doubling latency also might not be pretty, even for a GPU.
For DRAM utilization and transfers between pools, it might be worth at least keeping to the DRAM page size (1-2 KB), and perhaps going larger to keep in line with the natural granularity of the VM pages.
It is coarser, but it reduces the capacity lost to metadata, might allow the status structures to stay on-chip, and matching the VM's page formats might allow for more flexibility.
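To put rough numbers on that trade-off, here is a back-of-the-envelope sketch (Python, purely illustrative; the ~5 bytes of tag/state per entry and the granularities are assumptions, not a description of any shipping design) comparing metadata overhead at line versus page granularity for an 8 GB HBM2 pool backed by 64 GB of DDR4:

```python
# Back-of-the-envelope metadata overhead for an 8 GB HBM2 pool caching 64 GB of DDR4.
# Assumed cost per entry: ~5 bytes (address tag + state bits) -- an illustrative value only.

GiB = 1024 ** 3
hbm2_capacity = 8 * GiB      # on-package cache pool

def overhead(granularity_bytes, bytes_per_entry=5):
    entries = hbm2_capacity // granularity_bytes
    metadata = entries * bytes_per_entry
    return entries, metadata

for name, gran in [("64 B cache line", 64),
                   ("2 KB DRAM page", 2048),
                   ("4 KB VM page",   4096)]:
    entries, metadata = overhead(gran)
    print(f"{name:16s}: {entries / 2**20:7.1f}M entries, "
          f"{metadata / 2**20:7.1f} MiB of metadata "
          f"({100 * metadata / hbm2_capacity:.2f}% of the pool)")
```

At 64 B lines this lands on the ~128M entries and ~0.6 GB of raw tag storage mentioned above, while page granularity shrinks the metadata to a few tens of MiB, small enough that keeping the status structures on-chip starts to look plausible.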

AMD's HPC concept does allow for using the on-package memory as a cache, but due to similar capacity and efficiency concerns it seems like they would prefer that the software explicitly manage things if possible.
 
The high latency of GDDR5 memory prevents that. It can work perfectly well in a GPU environment, as GPUs hide latency well with their parallel nature. In a general system environment, however, it is not ideal. CPUs need as little latency as possible, hence why DDR3 or DDR4 is preferable.

I thought this was a long-dismissed theory, given the reality that GDDR5 has similar timings to DDR3. The high latency observed on GPUs is a result of their operating clock speed (in absolute terms) and of a throughput-oriented memory pipeline (in relative terms).
Yeah, GDDR5 has about the same latency as DDR3; I think they both come in at between 10-15 nanoseconds depending on the implementation, and I assume DDR4 is about the same as well.
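As a quick illustration of why the absolute latencies come out similar, here is a small sketch converting CAS latency from command-clock cycles to nanoseconds. The DDR3/DDR4 figures are common JEDEC bins; the GDDR5 cycle count is an assumed ballpark, not a datasheet value:

```python
# Absolute CAS latency: t = CL (cycles) / command clock frequency.

def cas_latency_ns(cl_cycles, command_clock_mhz):
    return cl_cycles / command_clock_mhz * 1000.0

examples = [
    # (label, CAS latency in cycles, command clock in MHz)
    ("DDR3-1600 CL11", 11, 800),                    # common JEDEC bin
    ("DDR4-2400 CL17", 17, 1200),                   # common JEDEC bin
    ("GDDR5 7 Gbps, CL~24 (assumed)", 24, 1750),    # 7 Gbps part's command clock; CL is a guess
]

for label, cl, clk in examples:
    print(f"{label:32s}: {cas_latency_ns(cl, clk):5.2f} ns")
```

All three land in the 13-14 ns range, i.e. within the 10-15 ns window mentioned above; the huge cycle counts you see quoted for GPUs mostly reflect the throughput-oriented memory pipeline, not the DRAM itself.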
 
Why didn't they?
Because obviously the gains at 4GB are small compared to the heavy amount of paging they can create by using a modded 2GB VRAM card for that test demo - so put that test demo in the context of Deus Ex today rather than much further down the line in the future.
And this can be seen as more than an assumption, as they are launching Vega as a 4GB card and an 8GB card, so from a product narrative and consumer perspective the 'ideal' test would have been 4GB, like the card.
It was a very controlled test to create a worst-case scenario by using 2GB, and at a time when they knew for sure they would have a 4GB GPU config as part of the Vega launch.
Cheers
 
Because obviously the gains at 4GB are small compared to the heavy amount of paging they can create by using a modded 2GB VRAM card for that test demo - so put that test demo in the context of Deus Ex today rather than much further down the line in the future.
And this can be seen as more than an assumption, as they are launching Vega as a 4GB card and an 8GB card, so from a product narrative and consumer perspective the 'ideal' test would have been 4GB, like the card.
It was a very controlled test to create a worst-case scenario by using 2GB, and at a time when they knew for sure they would have a 4GB GPU config as part of the Vega launch.
Their Vega slides showed roughly a 40%/60% ratio of used/unused memory (in two select AAA games). If that's the ratio their memory paging system manages to achieve, then a 2 GB card would handle a 2 GB / 0.4 = 5 GB load. That's pretty much where high-end games are nowadays at maximum settings. If they had instead shown an 8 GB card with this tech, they would have needed a game that consumes 8 GB / 0.4 = 20 GB of video memory. Games like that don't exist yet.
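That arithmetic in one small script (the 40% figure is simply the ratio read off those Vega slides, so treat it as illustrative rather than a guaranteed property of the paging system):

```python
# If only `resident_fraction` of what a game allocates actually needs to sit in VRAM,
# a card with vram_gb of memory can in principle cope with vram_gb / resident_fraction
# of allocations before paging starts to hurt.

resident_fraction = 0.4  # ~40% "used" figure read off AMD's Vega slides

for vram_gb in (2, 4, 8):
    handled = vram_gb / resident_fraction
    print(f"{vram_gb} GB card -> roughly {handled:.0f} GB of allocations")

# 2 GB -> ~5 GB, 4 GB -> ~10 GB, 8 GB -> ~20 GB:
# only the 2 GB configuration can be stressed by today's games.
```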
 
Their Vega slides showed roughly a 40%/60% ratio of used/unused memory (in two select AAA games). If that's the ratio their memory paging system manages to achieve, then a 2 GB card would handle a 2 GB / 0.4 = 5 GB load. That's pretty much where high-end games are nowadays at maximum settings. If they had instead shown an 8 GB card with this tech, they would have needed a game that consumes 8 GB / 0.4 = 20 GB of video memory. Games like that don't exist yet.
But they only ever tested one game for real-world performance, and that was Deus Ex: Mankind Divided.
Sorry Sebbbi, but showing what is not used by two games (which is not necessarily accurate in the way they measured it) and then deliberately using a 2GB VRAM card for a different game, when they have a 4GB Vega card for launch, rings alarm bells.
I'm amazed you do not see how they went out of their way to create a 2GB modded Vega when they had a production 4GB launch product in engineering.
There is probably limited performance gain with their real-world 4GB product in a real-world game at the settings they used for that Deus Ex test demo, but I guess it will be proven either way once the 4GB Vega launches and everyone can test it with Deus Ex: Mankind Divided.
As I mentioned, it may become more relevant further down the line, but that is not anytime soon.
Especially as the very first Vega will possibly be 8GB, followed up by the 4GB (although I would expect it sooner rather than later).
So the 2GB is more of a tech demo for now (which is why they chose it, IMO), and who knows what the future will be like by the time they have the technology across all product lines and games push hard enough to make it a viable performance gain for both 4GB and 8GB GPUs; tbh I think they will be competing with some kind of Optane variant by then (possibly from Micron), or there will be a better unified memory/paging solution for consumer products/software.
Cheers
 
There is probably limited performance gain with their real-world 4GB product in a real-world game at the settings they used for that Deus Ex test demo, but I guess it will be proven either way once the 4GB Vega launches and everyone can test it with Deus Ex: Mankind Divided.
Limited gain only because a paging technology requires a large enough dataset to actually need paging or streaming. The technology seems much more interesting for low- to mid-range products, which typically have limited memory right now.

I'd imagine some of the remastered Bethesda games with an open world, mods, and player-built structures could make use of it in the near future. Plenty of other games as well, especially with the resource-management troubles of low-level APIs.
 
Sorry Sebbbi, but showing what is not used by two games (which is not necessarily accurate in the way they measured it) and then deliberately using a 2GB VRAM card for a different game, when they have a 4GB Vega card for launch, rings alarm bells.
I'm amazed you do not see how they went out of their way to create a 2GB modded Vega when they had a production 4GB launch product in engineering.
There is probably limited performance gain with their real-world 4GB product in a real-world game at the settings they used for that Deus Ex test demo
Of course the gains would be limited (close to zero) with a 4 GB card in most current games. Only a handful of current games show a noticeable performance drop when moving from an 8 GB to a 4 GB card. The two games they showed didn't allocate 10 GB of memory (4 GB / 0.4 = 10 GB). Are there any games out that allocate 10 GB? Having excess unused memory doesn't speed up anything. Not a good showcase for new tech. Thus they chose a 2 GB card.

There's no need for alarm bells. This was simply a first glimpse of their new paging technology. Most likely the drivers aren't final yet. We don't even know if it works 100% automatically in all games, or whether they need a driver profile per game to configure the paging system (like Crossfire/SLI). They only showed two games at this point and that's perfectly fine. We don't even know whether they have chosen the final GPU clock rates, memory configurations or models (high/mid/low) yet. We will certainly get more info as the launch gets near. Reviewers will thoroughly test with a broad range of games. Then we'll see how well it works.

I am simply personally interested in this technology, as I have been programming multiple custom fine-grained software paging systems (virtual texturing). The potential is huge. A 2.5x memory reduction is pretty conservative compared to the results I have seen with custom software paging solutions. I strongly believe that this system is doable and is the right direction. Nvidia recently introduced a similar paging system to CUDA with Pascal (https://devblogs.nvidia.com/parallelforall/beyond-gpu-memory-limits-unified-memory-pascal/). It is clear that both IHVs see that this is the future.
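For the curious, here is a deliberately tiny sketch of the core idea behind such a paging scheme: a fixed budget of resident pages, demand loading on first touch, and LRU eviction. It is purely illustrative, written for this thread, and not AMD's HBCC, Nvidia's unified memory, or any shipping virtual texturing system:

```python
from collections import OrderedDict

class PagePool:
    """Toy demand-paged pool: only pages that are actually touched occupy the budget."""

    def __init__(self, resident_budget, page_size=64 * 1024):
        self.page_size = page_size
        self.budget = resident_budget          # max number of resident pages
        self.resident = OrderedDict()          # page id -> data, ordered by recency
        self.faults = 0

    def touch(self, address):
        page = address // self.page_size
        if page in self.resident:
            self.resident.move_to_end(page)    # mark as most recently used
        else:
            self.faults += 1                   # "page fault": fetch from system RAM over PCIe
            if len(self.resident) >= self.budget:
                self.resident.popitem(last=False)   # evict the least recently used page
            self.resident[page] = object()     # stand-in for the fetched page data

# Example: a 5 GB "allocation" of which a frame only ever touches a 2 GB working set.
pool = PagePool(resident_budget=(2 * 1024**3) // (64 * 1024))
for addr in range(0, 2 * 1024**3, 64 * 1024):   # touch only the hot 2 GB region
    pool.touch(addr)
print(f"resident pages: {len(pool.resident)}, faults: {pool.faults}")
```

The point is simply that the resident footprint tracks the touched working set, not the allocation size, which is exactly why the 40%/60% used/unused split on those slides translates into real capacity headroom.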
 
AMD Radeon RX 570 vs. NVIDIA GeForce GTX 1060 3GB TechSpot
Each card has its strengths and weaknesses. The GTX 1060 3GB's greatest weakness is its limited memory buffer, but to be fair this is rarely an issue at 1080p. As a brief tangent, the 3GB model was released almost a year ago, and back then we heard how it was dead on arrival, even for those gaming at 1080p. Well, with the exception of one game, we have 28 examples that show the 3GB 1060 providing quite playable performance at 1080p, particularly in relation to the RX 570. If you bought a 3GB model last year, you're probably wondering what all the fuss was about.

http://www.techspot.com/review/1411-radeon-rx-570-vs-geforce-gtx-1060-3gb/
 
As you said, hardware PRT also has too big a page size. 16 KB would be much better. Software indirection doesn't have this problem.
Really sorry for the necro bump here; you made this post ages ago. With the move towards higher-resolution assets, is the page size on PRT still too large, or do you feel it's a better fit now (64 KB vs. the 16 KB)? It seems to align with 4K. Unless, of course, it doesn't matter at all.
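For reference, a rough sketch of what those page sizes work out to in texels for a few common formats. The tile shapes are derived from bytes-per-texel assuming the usual square/2:1 power-of-two tile layout (as in D3D-style tiled resources); check the actual spec for your target rather than trusting this:

```python
# Texels covered by one hardware page at 64 KB vs. a 16 KB page, for common formats.
import math

formats = {               # bytes per texel
    "RGBA8":      4.0,
    "BC1 (DXT1)": 0.5,
    "BC7":        1.0,
}

def tile_dims(page_bytes, bytes_per_texel):
    texels = int(page_bytes / bytes_per_texel)
    height = 2 ** (int(math.log2(texels)) // 2)   # square-ish power-of-two tile shape
    return texels // height, height

for name, bpt in formats.items():
    w64, h64 = tile_dims(64 * 1024, bpt)
    w16, h16 = tile_dims(16 * 1024, bpt)
    print(f"{name:12s}: 64 KB page = {w64}x{h64} texels, 16 KB page = {w16}x{h16} texels")
```

So a 64 KB page is 128x128 texels of RGBA8 or 512x256 of BC1, versus 64x64 / 256x128 for a 16 KB page; the question above is essentially whether 4K-era asset sizes make the coarser tile acceptable.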
 
Then and Now: 6 Generations of GeForce Graphics Compared
The list includes four major Nvidia architectures released between March 2010 and June 2016: Fermi (GTX 480 and GTX 580), Kepler (GTX 680 and GTX 780), Maxwell (GTX 980 and 980 Ti) and Pascal (GTX 1080).
October 12, 2017
https://www.techspot.com/article/1191-nvidia-geforce-six-generations-tested/
 
The increasing use of more complex implementations of PBR should increase memory usage for the material layers. Instead of, say, one 4096x4096 DXT texture, in some scenarios you have up to 8 or more layers for a material surface. As time goes by, and the regular PS4 and Xbox One consoles are phased out and Scorpio, PS5/X4 and more powerful video cards come to the forefront, those implementations of PBR and their corresponding material layers can get more numerous and larger.
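To put a rough number on that, here is a minimal sketch; the layer counts and format mix are illustrative assumptions, not taken from any particular engine:

```python
# Rough VRAM cost of a layered PBR material set at 4096x4096, with full mip chains.
# Block-compressed formats store 4x4 texel blocks: BC1 = 8 bytes/block, BC5/BC7 = 16 bytes/block.

def texture_mb(width, height, bytes_per_block, with_mips=True):
    blocks = (width // 4) * (height // 4)
    base = blocks * bytes_per_block
    return base * (4 / 3 if with_mips else 1) / 2**20   # full mip chain adds roughly 33%

single_dxt1 = texture_mb(4096, 4096, 8)
# Hypothetical 8-layer material: albedo (BC7), normal (BC5), packed roughness/metal/AO (BC7),
# plus five more mask/detail layers at BC7 -- purely an example mix.
layered = texture_mb(4096, 4096, 16) * 8

print(f"single 4096x4096 DXT1 texture : {single_dxt1:6.1f} MB")
print(f"8-layer 4096x4096 BC7/BC5 set : {layered:6.1f} MB")
```

Even with block compression, going from one DXT1 texture (~11 MB with mips) to an eight-layer 16-bytes-per-block set (~170 MB with mips) is more than an order of magnitude per material, which is where the extra VRAM pressure comes from.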
 
Maybe it's overkill, but in the future maybe we'll see skin with almost as many layers as this rendered in real time. Perhaps once enough PS5/X4s are out in the wild, a couple of years into their lifecycles.
 
Call of Duty: WW2: PC graphics analysis benchmark review
11/03/2017

Up to Full HD (1920x1080), in fact, any ~4 GB graphics card of decent caliber should do the job well. 4 GB or more should likely be sufficient for 2560x1440. If you want to play at Ultra quality with Ultra HD as your preferred monitor resolution, 4GB or better is always advised, but in this game it is not needed. The title is basically a 'fill as much as you can' type of game, i.e. it buffers as much as it needs and as much as it would like to.

http://www.guru3d.com/articles-pages/call-of-duty-ww2-pc-graphics-analysis-benchmark-review,1.html
 