Not all transistors or mm² of die area are created equal, yield-wise.

So at best, what size would the eSRAM be on the APU?
PRT for that type of memory would be interesting. No need to tile manually.

Partially Resident Rendertarget ...
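For a sense of what page-level residency buys you, here's a minimal sketch using OpenGL's ARB_sparse_texture extension (the GL exposure of AMD's PRT hardware). Applying it to a render target is the speculative part; the snippet assumes a current GL 4.x context with the extension present, and skips the FBO attachment.

```cpp
// Hypothetical sketch: a sparse ("partially resident") texture intended for
// use as a render target, so only the pages actually drawn to need physical
// backing. Assumes a current OpenGL 4.x context with ARB_sparse_texture.
#include <GL/glew.h>

GLuint makeSparseRenderTarget(GLsizei w, GLsizei h) {
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);

    // Mark the texture sparse *before* allocating storage: the virtual
    // address range is reserved, but no physical pages are committed yet.
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SPARSE_ARB, GL_TRUE);
    glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, w, h);

    // Query the hardware page (tile) size for this format; the driver,
    // not the application, owns the tiling layout.
    GLint pageW = 0, pageH = 0;
    glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8,
                          GL_VIRTUAL_PAGE_SIZE_X_ARB, 1, &pageW);
    glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8,
                          GL_VIRTUAL_PAGE_SIZE_Y_ARB, 1, &pageH);

    // Commit physical memory for the top-left page only, as an example;
    // other pages stay virtual until something actually needs them.
    glTexPageCommitmentARB(GL_TEXTURE_2D, 0, 0, 0, 0,
                           pageW, pageH, 1, GL_TRUE);
    return tex;
}
```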
If that is reasonably doable, AMD should look no further to improve the performance of its own APUs.

It's just an unrealistic but beautiful setup I have in mind. Say we have a virtual memory range for the embedded RAM, around 1GB, mapped to main memory. Whenever a page fault occurs, the move engines automagically swap the faulting page from main memory into the embedded memory via a default or a custom handler, and swap some least-recently-used (LRU) page back to main memory. If the page references a PRT, then the PRT handler would take care of making the content available.

This is effectively a programmable two-level cache hierarchy. If the access pattern on the embedded RAM is local and predictable, it should be possible to hide the quite long latency of handling such an "exception".
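As a rough software model of that idea, here's a minimal C++ sketch. The names and sizes (TwoLevelMemory, the commented-out moveEngineCopy, kEmbeddedPages) are hypothetical stand-ins, and a real implementation would run the handler on a hardware fault rather than on an explicit lookup:

```cpp
// Minimal sketch of a software-managed two-level memory: a small "embedded"
// pool backed by a large "main" pool, with the LRU page swapped out on each
// fault. All names and sizes are hypothetical stand-ins.
#include <cstdint>
#include <list>
#include <unordered_map>

constexpr std::size_t kEmbeddedPages = 512;  // e.g. 32MB at 64KB pages

class TwoLevelMemory {
public:
    // Returns the embedded-pool slot holding `page`, faulting it in if needed.
    std::size_t access(std::uint64_t page) {
        auto it = resident_.find(page);
        if (it != resident_.end()) {           // hit: refresh LRU position
            lru_.splice(lru_.begin(), lru_, it->second.lruPos);
            return it->second.slot;
        }
        return handleFault(page);              // miss: run the "handler"
    }

private:
    struct Entry {
        std::size_t slot;
        std::list<std::uint64_t>::iterator lruPos;
    };

    std::size_t handleFault(std::uint64_t page) {
        std::size_t slot;
        if (resident_.size() < kEmbeddedPages) {
            slot = resident_.size();             // a free slot still exists
        } else {
            std::uint64_t victim = lru_.back();  // least-recently-used page
            lru_.pop_back();
            slot = resident_[victim].slot;
            resident_.erase(victim);
            // moveEngineCopy(embedded[slot] -> main[victim]);  // write back
        }
        // moveEngineCopy(main[page] -> embedded[slot]);        // fetch
        lru_.push_front(page);
        resident_[page] = Entry{slot, lru_.begin()};
        return slot;
    }

    std::list<std::uint64_t> lru_;                      // MRU at the front
    std::unordered_map<std::uint64_t, Entry> resident_;
};
```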
If the handler programs can be complex, it could be possible to implement a transaction scheme: the handlers can decide to record page faults and preempt or abort the shader that wants to work on the page; later, when some recorded pages become old enough, the handler could swap them in and revive the related shaders. So the MMU handler acts like a custom scheduler, driven by memory access patterns.
Quite crazy, but beautiful.
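Sticking with the same hypothetical model, the transaction variant might park the faulting shader instead of servicing the miss inline, then revive every waiter when the scheduler later commits the page. FaultScheduler and the std::function standing in for a resumable shader context are assumptions, not anything the hardware is known to expose:

```cpp
// Hypothetical extension: instead of servicing a fault inline, record it,
// suspend the faulting shader (modeled as a callback), and revive all
// waiters once the scheduler decides to swap the page in.
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

class FaultScheduler {
public:
    using Shader = std::function<void()>;  // stand-in for a resumable context

    // Called by the fault handler: park `shader` until `page` is resident.
    void recordFault(std::uint64_t page, Shader shader) {
        waiters_[page].push_back(std::move(shader));
    }

    // Called periodically: swap in pages whose fault records are "old
    // enough" (here, trivially, all of them) and revive their waiters.
    void step() {
        for (auto& [page, shaders] : waiters_) {
            // moveEngineCopy(main[page] -> embedded[...]);  // make resident
            for (auto& shader : shaders)
                shader();                                    // revive
        }
        waiters_.clear();
    }

private:
    std::unordered_map<std::uint64_t, std::vector<Shader>> waiters_;
};
```

Batching the revivals per page is what turns the MMU handler into a scheduler: work gets ordered by which pages are worth swapping in, not by submission order.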
because who needs a friggin' GPU.

I have a question: now that the eSRAM is at something like 196GB/s, what kind of real-world benefits will we see? Higher framerates? Higher resolution? More effects? How will the higher bandwidth help with the tiled rendering stuff MS showed off at Build?

I guess I should say more stable framerates when under stress, like when there's a lot happening on screen at once? I'm still confused about how it went from 102GB/s to 196GB/s, but overall it's a good thing, right?
I missed an episode, it seems. Has this info been confirmed?

It was 196GB/s only in a very specific theoretical situation; it was 133GB/s in real-world scenarios. It's still a nice boost, though.
I missed an episode, it seems. Has this info been confirmed?
Why would it be crazy/unrealistic?
It has not been confirmed, but the information that came with it stated that it was 133GB/s in real-world scenarios, and that's only under specific conditions as well.