Ext3h
Regular
They do make it sound like they're bypassing the CPU and System Memory completely (which Direct Storage alone still has to traverse) which suggests they're taking advantage of P2P DMA direct from SSD to GPU which AMD platforms are certainly capable of at a hardware level.
Correct, at 23:28 it's even confirmed that this is DMA straight into the Resizable BAR. So these are 100% standardized PCIe 3.0 protocol features.
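If you want to check whether a large BAR is actually exposed on your own box, here is a rough sketch, assuming Linux, where sysfs publishes one `start end flags` line in hex per PCI region. The function names and the 256 MiB threshold (the classic GPU aperture size) are my own picks for illustration, not anything from the video.

```python
def bar_sizes(resource_text):
    """Return the size in bytes of every populated region listed in the
    contents of a sysfs PCI `resource` file (unused regions are skipped,
    so list positions do not map one-to-one onto BAR indices)."""
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore anything that is not "start end flags"
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # region not populated
        sizes.append(end - start + 1)
    return sizes


def looks_resized(sizes, legacy_limit=256 * 1024**2):
    """Heuristic only: any region above the assumed 256 MiB legacy aperture
    hints that a resized BAR is in effect."""
    return any(size > legacy_limit for size in sizes)
```

Feed it the text of `/sys/bus/pci/devices/<address>/resource` for your GPU. A region far larger than 256 MiB (typically the full VRAM size) is a good hint, not proof.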
No, they all have, no exceptions for NVMe. But what's critical is that the NVMe driver pushing via PCIe effectively forces the CPU to back off until the transfer has completed.
The SSD certification they're doing may be to ensure the SSDs have the requisite DMA capabilities.
So it's actually a strict performance requirement, in order to ensure that granting the NVMe access isn't going to stall command access from the CPU.
Stalling access from the CPU is a nightmare when it comes to GPUs. Just trust me on that one, you absolutely do not want that to happen, ever.
Actually less so. Because the magic isn't happening on the GPU side. The new magic is happening on the chipset and the NVMe driver side, primarily. NVMe driver side, as an un-cached read into memory mapped DMA needs to be intercepted and redirected appropriately.
If true that does support the possibility that RTX-IO (if it actually exists) is doing something similar.
NVidia of course can do this stuff. Multi-GPU with CUDA is all P2P DMA (if you don't have NVLink on your platform). Has been for a decade. Except the worst thing you could do was an A->B->C->A ring shift of buffers, as you would end up with perf drops like crazy from collisions on the PCIe switch.
AFAIK PCIe switches are usually a cut-through design, not store-and-forward. Good for cost, good for latency, bad for collision handling. Just thinking about it, a major requirement on the chipset is probably switching to store-and-forward operation mode dynamically, in order to not lose bus utilization from excessive back-offs.
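To put rough numbers on the latency side of that trade-off, here is a textbook back-of-the-envelope model. It is a sketch only: it ignores switch processing time, flow control and contention, so it says nothing about the collision and back-off behaviour itself. The function names, packet sizes and link rate are made up for illustration.

```python
def store_and_forward_ns(packet_bytes, links, bytes_per_ns):
    """Every hop waits for the whole packet before sending it on,
    so the full serialization delay is paid once per link."""
    return links * packet_bytes / bytes_per_ns


def cut_through_ns(packet_bytes, header_bytes, links, bytes_per_ns):
    """Each intermediate hop starts forwarding once the header has arrived,
    so only the header delay is paid per extra link and the full packet
    is serialized once."""
    return (links - 1) * header_bytes / bytes_per_ns + packet_bytes / bytes_per_ns
```

With an assumed 272-byte packet (256 bytes payload plus 16 bytes header), two links and a round 1 byte/ns, that gives 544 ns for store-and-forward versus 288 ns for cut-through. That gap is what you would be giving up whenever a switch buffers the whole packet to ride out a collision.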
You might, by some weird chance, see NVidia GPUs simply working on SAS-enabled mainboards. Probably all it requires is for NVidia to just not actively try and mess it up.