I believe the goal is a shared and coherent address space, where the compute elements are able to handle pages not being in physical memory.

My previous understanding was that AMD's 3rd-gen HSA systems would pound on the same memory addresses without penalties, thereby totally eliminating the need to copy data between memory spaces.
Hitting the same addresses at any given point is likely to have penalties, much as it would for programs running on different cores. The penalties would likely be worse and side effects probably stranger, since this is crossing between two memory subsystems with very different latencies and consistency models.
AMD has promised filters that block unnecessary coherence requests, but that won't do much for cases where data is being thrashed back and forth.
The bigger question is what happens when the CPU reads GPU memory. That was the massive performance hit that Llano had. The CPU domain's ordering rules and consistency requirements meant that accessing the uncached domain flushed everything and serialized accesses.

Edit: The cache explanation from Dumbo made some sense to me. The caches should be accounted for when talking about coherency. So any address range the CPU works on cannot be read back faster than the CPU's caches can write those results out, and that can't be faster than 20 GB/s. Have I understood correctly?
Stuff over Onion is different, and since it is on the same order as what the CPUs could be expected to pull over their interfaces, it's just bad relative to what the GPU can do on its own.
The distinction is made in the properties assigned to the memory pages. One thing I've thought about is aliasing virtual pages with different caching settings, which seems to be possible. My reading indicates, however, that operating systems are expected to squash this because of the undefined behavior that can result.
I was wondering why the consoles weren't utilizing the modern paging capabilities of the CPUs they were using, although I sort of thought they'd have a bigger game reserve or at least a bigger pageable limit.

Eurogamer Portugal released a translated version of Richard Leadbetter's DF article [the English version is not up yet] that talks about PS4 memory. Apparently, the SDK documentation says that game devs have guaranteed access to only 4.5 GB of RAM. An additional 1 GB is described as "flexible".
Without knowing what's in the OS reserve, it's difficult to say why it is so big. The obsession with avoiding paging activity is rather extreme, in my opinion. A big transition, such as between a game and a full-screen media app, should provide enough information to institute a hierarchy of latency needs that permits both sides to roll in and out automatically. Apps or their SDKs should be able to evolve in concert with the consoles, and only so much of their working sets are going to be allergic to demand paging.