D3D12 gives control of all GPU memory to the developer (apart from the chunk the OS won't let go) - one good reason being that developers are sick of IHV software engineers moving the goalposts with every driver release.

...and our software engineers can keep less frequently used data in the 512MB segment.

That does not in any way prevent strange memory layouts. It also does not give the developer any control over how resources get laid out within GPU memory. What it does is give the developer the option to control the lifetime of resources, so that the driver/D3D does not have to figure that out on its own.
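To make that distinction concrete, here is a minimal sketch (assuming a valid ID3D12Device and an already-created default-heap resource; the function names are made up) of the control D3D12 actually hands over - explicit residency and lifetime, but not physical placement:

```cpp
#include <d3d12.h>

// Illustrative only: 'device' and 'resource' are assumed to be a valid
// ID3D12Device and a resource that lives in a default (video memory) heap.
// The application can say WHEN a resource must be resident in GPU memory,
// but it never gets to say WHERE inside that memory the allocation lands.
void EvictWhileUnused(ID3D12Device* device, ID3D12Resource* resource)
{
    ID3D12Pageable* pageable = resource;
    device->Evict(1, &pageable);        // allow the OS/driver to page it out
}

void MakeResidentBeforeUse(ID3D12Device* device, ID3D12Resource* resource)
{
    ID3D12Pageable* pageable = resource;
    device->MakeResident(1, &pageable); // must complete before the GPU touches it again
}
```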
Did Jen-Hsun really try to pull the "it's a feature" card?

Does he even surf the internet? He should have known from moment zero that people have been vaccinated against that move for a while.

Well, people are vaccinated against many things. That does not mean this isn't a feature, though. It has been noted in this very thread that without this feature the card would have had to be sold either as advertised (4GB, 64 ROPs, 2MB L2) or cut down to 3GB, 48 ROPs and 1.5MB L2. Given that this is quite a tech forum, and that NV obviously lied and the GTX 970 is not as advertised, it would be nice to figure out how much this lie actually hurts GTX 970 owners. I think we can all agree that the 3GB option would be worse, and since we can't have a GTX 980 with 13 SMs, this requires some creativity.
First of all, in case I didn't make it clear earlier in this thread, I agree that the reactions all around the web are exaggerated.
But to call this a "consumer-oriented" feature is quite a stretch.
Back in the Kepler days, nVidia sold both the GTX 680 and the 670 with all the ROPs, L2 cache, memory channels and memory amount intact.
The GK104 with presumably damaged and disabled ROP, memory controller and L2 partitions was sold as a third card, the GTX 660 Ti.
Maxwell came, and they now have a GTX x70 with disabled partitions in the backend and L2 cache, probably so they can harvest more chips for the x70's price point, ultimately making more money from the same wafer.
In the end, the GTX 970 is actually midway between what would be a GTX 960 Ti and a "full" 970. Similarly, the 970's release price is halfway between the release prices they had for the 660 Ti and the 670.
I guess the 670 proved to be a lot more popular, so they probably figured they would make more money this time by selling a watered-down and cheaper GTX x70 than two different cards around it.
Is it a feature? Yes. But make no mistake: it's not a feature for the end consumer. It's a feature that lets nVidia make more money out of the same GM204 wafers.

That I mostly agree with.
If this works "as advertised" it could open some interesting options for future GM200 salvage parts (for example 1 fully disabled ROP cluster vs. 2 partially disabled).
Technically, Nvidia have provided a feature: after disabling the SMXs in the 970, they have still been able to keep a 224-bit MC and 3.5GB running at full speed. As you can't disable it by laser cut, they decided to do it in software. That said, the last 512MB and the last 32-bit memory controller can still be accessed, but in practice, if you want to use both partitions, the performance will be dramatically bad. So even if you can access it, you don't want to.

It's arguably faster than PCI-E and much lower latency. Using the same heuristics as for what you spill into host memory and read/write via PCI-E seems sensible to me (just don't spill to PCI-E before you run out of this memory pool). The one place where this is a real problem and *might* not work reliably is compute (i.e. what many people are using to test it, via CUDA), where it's much harder to predict access patterns.
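As an aside, here is a rough sketch of that heuristic, assuming a valid IDXGIAdapter3 (the helper name and the cold-data framing are made up): keep allocations in the local segment while the OS-reported budget has headroom, and only spill to system memory, read over PCI-E, once it runs out. Note that the reported budget covers the whole pool, with no hint that the last 512MB of a GTX 970 is slower:

```cpp
#include <dxgi1_4.h>
#include <d3d12.h>

// Hypothetical placement helper. The budget query is real (DXGI 1.4); the
// decision logic is just the "don't spill before the pool runs out" rule of
// thumb from the post above.
D3D12_HEAP_TYPE ChooseHeapForColdData(IDXGIAdapter3* adapter, UINT64 requestBytes)
{
    DXGI_QUERY_VIDEO_MEMORY_INFO local = {};
    if (FAILED(adapter->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &local)))
        return D3D12_HEAP_TYPE_DEFAULT;

    // On a GTX 970 this budget spans the full 4GB; nothing here says that the
    // last 512MB sits behind a slower path, so the heuristic cannot avoid it.
    if (local.CurrentUsage + requestBytes < local.Budget)
        return D3D12_HEAP_TYPE_DEFAULT;   // stay in video memory

    return D3D12_HEAP_TYPE_UPLOAD;        // spill: system memory, GPU reads it over PCI-E
}
```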
And this is precisely why advanced techniques under D3D12 (mixed compute/traditional graphics pipelines, developer-managed sub-buffers of persistent memory allocations) are going to fall foul of it.
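For context, a minimal sketch of the kind of developer-managed sub-buffering meant here, assuming a valid ID3D12Device (the class name and the trivial linear allocator are made up): one persistent allocation, carved up by the application, with the driver never seeing which sub-ranges are hot or cold:

```cpp
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdint>

// Hypothetical linear sub-allocator over one persistent default-heap buffer.
// The driver only ever sees the single large allocation; how it is carved up,
// and which pieces are touched every frame, is invisible to it.
class SubBufferArena
{
public:
    bool Init(ID3D12Device* device, uint64_t sizeInBytes)
    {
        D3D12_HEAP_PROPERTIES props = {};
        props.Type = D3D12_HEAP_TYPE_DEFAULT;        // video memory

        D3D12_RESOURCE_DESC desc = {};
        desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
        desc.Width = sizeInBytes;
        desc.Height = 1;
        desc.DepthOrArraySize = 1;
        desc.MipLevels = 1;
        desc.Format = DXGI_FORMAT_UNKNOWN;
        desc.SampleDesc.Count = 1;
        desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

        size_ = sizeInBytes;
        offset_ = 0;
        return SUCCEEDED(device->CreateCommittedResource(
            &props, D3D12_HEAP_FLAG_NONE, &desc,
            D3D12_RESOURCE_STATE_COMMON, nullptr, IID_PPV_ARGS(&buffer_)));
    }

    // Hands out the GPU virtual address of a fresh sub-buffer, or 0 when full.
    D3D12_GPU_VIRTUAL_ADDRESS Allocate(uint64_t bytes, uint64_t alignment = 256)
    {
        const uint64_t aligned = (offset_ + alignment - 1) & ~(alignment - 1);
        if (aligned + bytes > size_)
            return 0;
        offset_ = aligned + bytes;
        return buffer_->GetGPUVirtualAddress() + aligned;
    }

private:
    Microsoft::WRL::ComPtr<ID3D12Resource> buffer_;
    uint64_t size_ = 0;
    uint64_t offset_ = 0;
};
```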
Yup. Personally, I would never mind having only 3.5GB instead of 4GB. Or 224bit instead of 256bit. Seriously, who cares?
Jittery performance, caused by games not being aware that some of the memory is slower (as evidenced in the previous post), is the real problem.
Nvidia is being a bit dishonest, but only in the sense that with today's games and APIs it would be best to just expose 3.5GB/224bit and nothing more. That would still be better than 3GB/192bit, while avoiding potential trouble. Having those last 512MB is more about marketing and less about allowing maximum performance, as there is no way to be certain that future games won't have problems with this.
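On that "games not being aware" point, it may help to spell out how little an application can actually see; a minimal sketch, assuming a valid IDXGIAdapter (the function name is made up). The adapter description reports a single dedicated-VRAM figure, with nothing to indicate that part of it is slower:

```cpp
#include <dxgi.h>
#include <cstdio>

// Illustrative: the only memory figure a game traditionally queries. On a
// GTX 970 this reports roughly 4GB of dedicated video memory; no field in
// DXGI_ADAPTER_DESC describes the 3.5GB + 0.5GB split, so an application
// cannot steer around the slow segment on its own.
void PrintReportedVram(IDXGIAdapter* adapter)
{
    DXGI_ADAPTER_DESC desc = {};
    if (SUCCEEDED(adapter->GetDesc(&desc)))
    {
        printf("Dedicated video memory: %llu MB\n",
               static_cast<unsigned long long>(desc.DedicatedVideoMemory) / (1024 * 1024));
    }
}
```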
The problem is they don't sell you the GPU with 56 ROPs, a 224-bit bus and 3.5GB; they sell you the GPU as having a 256-bit bus, 64 ROPs and 4GB of memory. Personally, if I had a 970 I would surely not care about the difference.
But the problem is, if we are OK with that, next time they will sell you an 8GB GPU with only 6GB available, or a 4096-shader GPU with only 3072 SPs available, with a funny slide telling you that they designed the GPU for 8K gaming...

That's what reviews are for. One determines the performance from reviews and buys if it's what one needs. Whether the card has 3.5 or 4GB changes almost nothing, even in terms of futureproofing.
If need be, I'd just lower some performance settings. But with the weird memory layout, there's no certainty that lowering quality settings will help.
GTX970 is working as intended.

Is that not the case?

In the future, NVIDIA can claim their card has up to 4GB of memory serving up to 64 ROPs via an up-to-256-bit bus, and then everybody should be happy.

Especially AMD.