Sure it's actually STACKED memory, though? From what I can decipher from peering at that terrible image, it looks more like a traditional multi-chip module, methinks.
Would having ROPs etc. embedded in RAM again be of any benefit to tessellation?

No. ROPs are independent of tessellation performance.
I wouldn't expect anything less! The limitations of Xenos eDRAM should have been learnt from.

But what if the memory pool on "Durango-GPU" is really just that: a completely flexible memory pool? It could be used as a cache, or a workspace for binned tiles, or whatever.
Tiling has still proven an unwanted limitation in Xenos, and I doubt devs would want to be forced down that route. But if the eDRAM is whatever the devs want to do with it, rather than a forced framebuffer space, and the system has enough BW for the rendering requirements, then it'd just be a bonus. However, one has to assume that the reason to add the cost of large local storage to the GPU is to address a BW limitation in the rest of the system. It must be there to serve BW-heavy tasks. Whether those would have to be FB tasks or not, I don't know. I don't know how render pipelines, now and in future, can be broken up across a large main RAM and a small working eDRAM. We have a whole thread on this discussion somewhere!

Would this not make the size relatively irrelevant? Yes, for a game at 1080p with 4xMSAA and a lot of extra/large buffers there are going to be a handful of "tiles", but does such a configuration not avoid the issues with the Xenos eDRAM?
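To put rough numbers on the "handful of tiles" claim, here is a back-of-the-envelope sketch. The RGBA8 colour format, D24S8 depth format, and the 10 MiB / 32 MiB pool sizes are all assumptions for illustration (10 MiB matches Xenos; nothing here is a confirmed Durango spec):

```python
import math

def tiles_needed(width, height, msaa, bytes_color=4, bytes_depth=4,
                 pool_bytes=10 * 1024 * 1024):
    """Rough count of screen tiles needed to fit a multisampled
    colour+depth render target into a small embedded memory pool."""
    fb_bytes = width * height * msaa * (bytes_color + bytes_depth)
    return fb_bytes, math.ceil(fb_bytes / pool_bytes)

# 1080p, 4xMSAA, RGBA8 colour + D24S8 depth (assumed formats)
fb, tiles_10mb = tiles_needed(1920, 1080, 4)
_, tiles_32mb = tiles_needed(1920, 1080, 4, pool_bytes=32 * 1024 * 1024)
print(fb / 2**20, tiles_10mb, tiles_32mb)  # ~63 MiB -> 7 tiles in 10 MiB, 2 in 32 MiB
```

So even a much larger pool still needs tiling for a heavily multisampled 1080p target; the size question is how *few* tiles you can get away with, not whether tiling disappears.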
Yes, I was agreeing with you on that. I said:

Shifty, were the major issues with tiling not (a) the significant cost of re-working geometry across tiles, due to geometry passing over tile edges, and (b) that, being tied to the ROPs (and not a general memory), it allowed for only a very limited number of applications?
A pool of fast, *general* embedded memory doesn't mean it has to work like Xenos did.
What I wonder is what BW advantages would be worth the real estate cost? The idea of 'free' transparency appeals to me. PS3 was a real backwards step in that regard vs. PS2.

But if the eDRAM is whatever the devs want to do with it, rather than a forced framebuffer space, and the system has enough BW for the rendering requirements, then it'd just be a bonus.
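A quick sketch of why transparency is the poster child for this: alpha blending reads *and* writes the destination for every transparent pixel, so peak blend rate is bandwidth-hungry. The ROP count and clock below are purely illustrative placeholders, not any console's actual spec:

```python
# Illustrative numbers only: 16 ROPs at 800 MHz, RGBA8 destination.
rops = 16
clock_hz = 800e6
bytes_per_blend = 4 + 4          # read dst (4 B) + write dst (4 B)

peak_pixels_per_s = rops * clock_hz
blend_bw_gb_s = peak_pixels_per_s * bytes_per_blend / 1e9
print(blend_bw_gb_s)  # 102.4 GB/s just to keep the ROPs fed while blending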
So, the ESDRAM would mean DDR3 is a given?
Yes, I was agreeing with you on that. I said:
What I wonder is what BW advantages would be worth the real estate cost? The idea of 'free' transparency appeals to me. PS3 was a real backwards step in that regard vs. PS2.
Sorry, sleep deprivation. I would venture a guess that there are some neat compute scenarios in which having a "large", very fast local memory could be an advantage. I remember a very interesting sparse-sample GI solution from 2005 which was mostly slowed by memory.
I hadn't thought of that. Considering the rate GPUs can churn through data, a fast bidirectional store seems plausibly advantageous. That said, none of the IHVs seem to be going that route. I suppose the typical dataset is too large to benefit from a small local store.
[puts on snarky hat] Typical datasets too large to benefit from local store? Where is all the vigor for local store from the Cell thread! If smart data organization and data streaming worked great for the 256 KB LS in the SPEs, then this should be a dream scenario! [/snarky hat]
I think DDR3 is unlikely because over the expected lifetime of the device, it would be more expensive than DDR4.
Just because you can doesn't mean people want to - development complexity is one of the reasons for giving up Cell.

[puts on snarky hat] Typical datasets too large to benefit from local store? Where is all the vigor for local store from the Cell thread! If smart data organization and data streaming worked great for the 256 KB LS in the SPEs, then this should be a dream scenario! [/snarky hat]
For graphics, yes. But for GPU manufacturers selling their chips to supercomputer builders, if a local, fast working space were beneficial for GPGPU work, wouldn't they be adding it? It might be a bit too early yet, or it might be that GPGPU using massively parallel processing on massive datasets just needs access to large buckets of data, and a few MBs won't be any help.

I am sure there are a lot of scenarios where the memory would not be big enough, but I think the issue would be cost/benefit. I don't have an answer there, but I wouldn't point to the GPU IHVs as a reason against, as their issue is more market-driven than technology-limited. A Xenos-style framebuffer would die on the many-resolution desktop, and even the proprietary memory we are discussing wouldn't be a "standard spec", so you would be trading less-used proprietary tech space for general-use hardware, which would get you killed in current-gen/last-gen benchmarks.
According to the Marketing 101 book, the answer is cores...*

*What exactly are we supposed to call GPU ALUs/math units these days?
There is a massive, qualitative difference between a pool of <256 KB (remember, code has to fit too) and one of tens of megabytes. One of them can fit an entire real-world dataset that you want to work with (namely, the framebuffer), and the other cannot.
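The difference is easy to make concrete with simple arithmetic (assuming a plain 1080p, 4-bytes-per-pixel colour buffer and a hypothetical 32 MiB embedded pool):

```python
fb_bytes = 1920 * 1080 * 4        # 1080p, 4 bytes/pixel (RGBA8)
ls_bytes = 256 * 1024             # SPE-style 256 KB local store
pool_bytes = 32 * 1024 * 1024     # a 'tens of megabytes' embedded pool

print(fb_bytes / 2**20)           # ~7.9 MiB for the colour buffer alone
print(fb_bytes <= ls_bytes)       # False: nowhere near fitting in 256 KB
print(fb_bytes <= pool_bytes)     # True: the whole framebuffer fits
```

A 256 KB local store forces you to stream everything in small chunks, whereas a tens-of-megabytes pool holds the full working set for framebuffer-sized jobs.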
That was my point; I was being snarky because of the view some have proposed concerning the SPE LS in the other thread, especially since the SPEs were often used for graphics.
Reading your post, I'm pretty ashamed I didn't see that. The best I can do is blame lack of sleep.