Yeah, should be enough for graphics, now we only need ~1GB for the CPU too =).

If the game cards are fast enough then 256MB should be enough, at least for the NGP's resolution and textures. Streaming at, say, 30MB/s could be enough to make the 512MB useful only in rare cases.
Well, I suppose we should all be thankful to Simon F for developing a good 2-bit texture compression format at least! Also TBDR means MSAA won't take more memory. Probably the most interesting factor to consider is the performance penalty if/when the binning process runs out of space, since it takes a *variable* amount of memory.
What happens if the binning process (I suppose that's the "storing and sorting triangles") runs out of memory? Does the GPU then render the incomplete scene to free memory before accepting additional commands? That would require a Z-buffer as well, I suppose...
How do you get a bucket to overflow when you always empty it in time?
ummmm... shake it?
You can't, but that's an analogy for immediate renderers, I guess. With a TBDR you can't render a pixel until you know there's no further triangle that "hits" it, which means you can't empty your bucket until you've poured in all the water (finished the scene). That's assuming the TBDR has to operate in a single pass; if not, it needs to produce an incomplete picture (plus some information about Z values) and can then empty the bucket before accepting more water.
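To put the bucket analogy into something more concrete, here is a toy sketch of a tiler with a fixed-size triangle store. Everything in it (the names, the sizes, the flush policy) is my own invention for illustration, not how SGX actually works; it only shows the two options being discussed: one flush at end of scene versus an early flush that forces colour and depth out to memory.

```cpp
// Toy illustration only: every name and size here is invented, and this is not
// a description of how SGX/IMG hardware actually manages its display lists.
// Normally the "bucket" is emptied once per frame; if the fixed-size store
// fills up, the renderer has to flush mid-frame and keep Z (and colour) in
// memory so the next partial pass can still resolve visibility.
#include <cstddef>
#include <vector>

struct Triangle { float v[9]; };                     // 3 vertices, xyz each
struct Tile     { std::vector<std::size_t> tris; };  // indices into the store

class ToyTiler {
public:
    ToyTiler(std::size_t triangleBudget, int tileCount)
        : budget(triangleBudget), tiles(tileCount) {}

    // Bin one triangle. Returns true if a mid-frame flush was needed first.
    bool submit(const Triangle& t) {
        bool flushedEarly = false;
        if (store.size() >= budget) {                // bucket is full...
            flush(/*endOfScene=*/false);             // ...empty it early
            flushedEarly = true;
        }
        const std::size_t index = store.size();
        store.push_back(t);
        for (Tile& tile : tiles)
            if (mayTouch(t, tile))                   // conservative overlap test
                tile.tris.push_back(index);
        return flushedEarly;
    }

    void endScene() { flush(/*endOfScene=*/true); }  // the normal, single flush

private:
    void flush(bool endOfScene) {
        for (Tile& tile : tiles) {
            // Shade the binned triangles against the on-chip tile buffer.
            // If this is NOT the end of the scene, colour AND depth have to be
            // written back to external memory, which is exactly the penalty
            // (extra bandwidth, external Z-buffer) being discussed here.
            (void)endOfScene;
            tile.tris.clear();
        }
        store.clear();
    }

    bool mayTouch(const Triangle&, const Tile&) const { return true; }  // stub

    std::size_t budget;
    std::vector<Triangle> store;
    std::vector<Tile> tiles;
};
```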
The 4 cores operate in quite a complicated fashion when it comes to macro- and micro-tiling:
http://worldwide.espacenet.com/publ...T=D&date=20090604&CC=WO&NR=2009068895A1&KC=A1
Or display list related patents like that one: http://worldwide.espacenet.com/publ...T=D&date=20090924&CC=WO&NR=2009115778A1&KC=A1
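For what it's worth, here is a rough sketch of what "macro tiles spread over 4 cores" could mean in the simplest possible scheme. The tile sizes and the round-robin assignment are assumptions made up purely for the example; the patents above describe far more involved arrangements.

```cpp
// Purely illustrative: static round-robin assignment of macro tiles to cores.
// Real hardware almost certainly balances the load dynamically; this only
// makes the relationship between micro tiles, macro tiles and cores concrete.
#include <cstdio>

int main() {
    const int cores         = 4;                    // SGX543MP4-style configuration
    const int screenW       = 960, screenH = 544;   // NGP display resolution
    const int microTilePx   = 32;                   // assumed on-chip (micro) tile size
    const int microPerMacro = 4;                    // assumed: 4x4 micro tiles per macro

    const int macroPx  = microTilePx * microPerMacro;        // 128 px
    const int macrosX  = (screenW + macroPx - 1) / macroPx;  // 8
    const int macrosY  = (screenH + macroPx - 1) / macroPx;  // 5

    for (int y = 0; y < macrosY; ++y)
        for (int x = 0; x < macrosX; ++x) {
            const int id   = y * macrosX + x;
            const int core = id % cores;            // naive round-robin distribution
            std::printf("macro tile (%d,%d) -> core %d\n", x, y, core);
        }
    return 0;
}
```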
And how is multicore related to the problem of overflowing buffers?
How is it unrelated in this particular thread in the first place? You've got 4 GPU cores in the NGP, so what, how and why would it overflow? Do you have same-sized or dynamically sized macro tiles for all of the 4 cores, and one or multiple display lists? I can only imagine they use multi-level display lists (or buffers, or whatever one wants to call them), compress the hell out of them and store only the absolutely necessary. Don't you think the engineers have taken those considerations into account? Heck, IMG has more than a few patents covering display list control, compression etc. I can't imagine the display list(s) are so small that they can be overflowed that easily, and I doubt even more that, if you did manage to overflow one, you wouldn't get an IMR or whatever hybrid into the exact same theoretical trouble.

The problem is that you can't start (fragment-)processing a single tile unless you know there is nothing, say a translucent triangle above the ones already in your display list, that affects the outcome. You are limited in the amount of information you can store before you begin rendering, so either you decide to drop something and hope no one notices, or you render what you have and then accept new data (the extreme example being immediate renderers, or some "hybrid" renderer that only defers as long as there is space).

4 cores only means 4 times the buffer size, which doesn't change a thing for me: fixed size is fixed size, and a dynamic workload can exceed it. I'm pretty sure IMG has chosen a way to deal with this, however unlikely a problem it might be. Certainly devs on a closed platform will be wary of any limits (which are likely high enough that you don't hit them in realtime graphics), but a generic OpenGL driver surely has to take such things into account. And I just asked which way they deal with the problem (as I'm curious about the implications).
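One plausible way a driver could "deal with the problem", sketched below purely as my own guess (the page size, heap budget and names are all invented, and this is not a description of IMG's implementation): let the display list grow in pages from a heap, and only fall back to dropping data or a mid-frame partial render when the heap itself is exhausted.

```cpp
// Sketch of a grow-on-demand display list: fixed-size pages are chained until
// a heap budget is hit, at which point the caller is left with exactly the
// options discussed above (drop data, or render-what-you-have mid-frame).
// All sizes and names are invented for illustration.
#include <algorithm>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kPageBytes = 64 * 1024;  // assumed page granularity
constexpr std::size_t kHeapPages = 256;        // assumed heap budget (16 MiB)

struct Page { std::uint8_t data[kPageBytes]; std::size_t used = 0; };

class DisplayList {
public:
    // Append binned parameter data. Returns false if the heap is exhausted,
    // i.e. the caller must flush (partial render) and reset() before retrying.
    bool append(const std::uint8_t* src, std::size_t bytes) {
        if (bytes > kPageBytes) return false;  // oversized writes not handled here
        if (pages.empty() || pages.back()->used + bytes > kPageBytes) {
            if (pages.size() >= kHeapPages)
                return false;                  // out of heap space
            pages.push_back(std::make_unique<Page>());
        }
        Page& p = *pages.back();
        std::copy(src, src + bytes, p.data + p.used);
        p.used += bytes;
        return true;
    }

    void reset() { pages.clear(); }            // after a (partial) render

private:
    std::vector<std::unique_ptr<Page>> pages;
};
```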
I'd be interested too in an explanation that's as simple as possible (in order to understand it myself), but I suspect they keep it under wraps as some sort of secret sauce.

It's not necessarily secret sauce as such, but you're right that we haven't talked about it much in public yet. I'll see about changing that, so there's a bit more information out there about how MP works at the work distribution and memory cost level.
Doesn't the GPU-dedicated RAM need to increase in bandwidth as you increase the number of cores? Not theoretically of course, but practically.
You couldn't get a "high-end" version of your GPUs to "infinitely" scale linearly with increasing the number of cores without increasing memory bandwidth, right?
Right, but the bandwidth requirement for extra cores is low.
Actually I think we can be specific in saying that there is no significant change in memory cost associated with multi-core; I'm not sure why anyone would think there was.
ToTTenTranz said: I'm not trying to start a flamewar between comrades or anything.. but where are we standing exactly? Is it "so low" that you consider it "non-significant"? What ratios are we talking about? For each 100% increase in cores, do you need a 10% increase in memory bandwidth? More? Less? Not allowed to specify?

We said the same thing, just in different words. No MP config I know of has considered adjusting its memory config to increase bandwidth to help performance, versus single core. We obviously don't want to give away figures (sadly, I'd love to walk you through a profiled frame or frames and discuss the consumers).