Well, I do not read it the same way: it is faster than the GPU connection to the main memory pool, which according to the rumors is ~60 GB/s shared between the CPU and GPU. So 102.4 GB/s sounds right for the GPU connection. But the internal bandwidth should be on the order of 819.2 GB/s to sustain the same kind of ROP operations that Xenos did without resorting to compression (which, according to the Xbox 360 designers, is undesirable because it is unpredictable).
He may clarify, but I don't think he is talking about what you are talking about, i.e. an external die or whatnot, just the fact that the scratchpad would be dedicated only to raster operations / tied to the ROPs.
By the way, 12 ROPs is nothing amazing; Pitcairn has 32 (though underfed).
The whole point is which is more efficient: keep modern, powerful, efficient ROPs and link them to a scratchpad memory pool through a reasonably fast interface (enough to keep the ROPs fed all the time)? Or go with sucky ROPs as in Xenos and design a really wide interface between those ROPs and the scratchpad memory?
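For a rough feel of what "keeping the ROPs fed" means in numbers, here is a back-of-the-envelope sketch in Python. The formula (ROPs x samples per pixel x bytes per sample x clock, doubled for read-modify-write) is my own simplification, and `rop_bandwidth_gbs` is a made-up helper name. The Xenos inputs (8 ROPs, 500 MHz, 4x MSAA, 32-bit color + 32-bit Z) are the commonly cited figures; the 800 MHz clock for the 12-ROP case is purely an assumption on my part, not part of the numbers quoted in this thread.

```python
def rop_bandwidth_gbs(rops, clock_mhz, samples_per_pixel=1,
                      bytes_per_sample=8, read_modify_write=True):
    """Peak bandwidth (GB/s) needed if every ROP touches one pixel per clock.

    bytes_per_sample=8 assumes 32-bit color plus 32-bit Z/stencil.
    read_modify_write doubles the traffic (blending and Z test read,
    then write back). No compression is modelled.
    """
    bytes_per_clock = rops * samples_per_pixel * bytes_per_sample
    if read_modify_write:
        bytes_per_clock *= 2
    # bytes/clock * MHz = MB/s; divide by 1000 to get GB/s
    return bytes_per_clock * clock_mhz / 1000


# Xenos-style worst case: 8 ROPs, 500 MHz, 4x MSAA, uncompressed
xenos = rop_bandwidth_gbs(rops=8, clock_mhz=500, samples_per_pixel=4)

# Hypothetical 12-ROP part at an ASSUMED 800 MHz, same worst case
twelve_msaa = rop_bandwidth_gbs(rops=12, clock_mhz=800, samples_per_pixel=4)

# Same hypothetical part, no MSAA
twelve_plain = rop_bandwidth_gbs(rops=12, clock_mhz=800)

print(f"Xenos-style, 4x MSAA: {xenos:.1f} GB/s")
print(f"12 ROPs, 4x MSAA:     {twelve_msaa:.1f} GB/s")
print(f"12 ROPs, no MSAA:     {twelve_plain:.1f} GB/s")
```

Under those assumptions the uncompressed 4x MSAA case lands far above a ~100 GB/s link, while the no-MSAA case is much closer to it, which is more or less the trade-off in question: a moderate interface plus compression/smarter ROPs, or a brute-force wide one.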
I'm not sure that the scratchpad will be tied to the ROPs; you may want the shader cores to be able to read your render target without having to move your data back to main RAM.
Though I have no idea how much internal bandwidth there is in the latest AMD GPUs, I am speaking of the bandwidth at the level of what looks like a "ring" on this graph: the thing that connects the L2, shader cores and ROPs to the memory controllers / everything (in grey, just above the memory controllers).
I could see the scratchpad tied to that ring, with the ROPs using it as their favorite/primary target but having the option to render to main RAM if they want. That could be the most flexible solution (say your ROPs have to render a low-res render target that compresses well, you don't want to move the data into the scratchpad => render straight to main RAM; it seems MSFT went with a healthy amount of bandwidth vs. something like Trinity; it could be a Z-only render target for occlusion purposes, for example).
Though I don't know how much bandwidth that ring/crossbar provides (I would think significantly more than the external bandwidth, for obvious reasons), I guess it is scalable (not the same needs for Cape Verde, Pitcairn, Tahiti). Something like Xenos would be backward-looking, IMHO.