|
|
#526 |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
That's not what I understand by overflow; nice try nonetheless.
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
|
|
|
|
|
#527 | |
|
Senior Member
Join Date: Dec 2004
Posts: 1,746
|
Quote:
With a TBDR you can't render a pixel until you know there's no further triangle that "hits" it, which means you can't empty your bucket until you've poured all the water in (finished the scene). That's assuming the TBDR is required to operate in a single pass; if not, it needs to create an incomplete picture (plus some information about Z values) and then it can empty the bucket before accepting more water. |
|
|
|
|
|
|
#528 | |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
Quote:
http://worldwide.espacenet.com/publi...068895A1&KC=A1 Or display list related patents like that one: http://worldwide.espacenet.com/publi...115778A1&KC=A1
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
|
|
|
|
|
|
#529 | |
|
Senior Member
Join Date: Dec 2004
Posts: 1,746
|
Quote:
The problem is that you can't start (fragment) processing a single tile unless you know there is nothing, say a translucent triangle above the ones you have in your display list, that affects the outcome. You are limited in the amount of information you can store before you begin rendering, so either you decide to drop something and hope no one notices, or you render what you have and then accept new data (the extreme example being immediate renderers, or some "hybrid" renderer that only defers as long as there is space). |
|
|
|
|
|
|
#530 | ||
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
Quote:
Quote:
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
||
|
|
|
|
|
#531 |
|
Senior Member
Join Date: Mar 2010
Location: Cleveland, OH
Posts: 1,567
|
IMG's GPUs do have support for multi-pass rendering; they don't have a problem loading a framebuffer from memory (and storing one to memory) if necessary. I'm almost certain that they'll stop binning and render what they have if they run out of space (including the space for the depth/stencil buffer they now need).
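The "stop binning and render what you have" behaviour can be sketched roughly as below. This is a toy model, not IMG's actual implementation: the 32-pixel tile size, the `capacity` limit (counted in tile-list entries) and the names `tiles_touched` and `bin_with_flush` are all assumptions made purely for illustration.

```python
from collections import defaultdict

TILE = 32  # hypothetical tile size in pixels


def tiles_touched(bbox):
    """Yield (tx, ty) for every tile overlapped by an inclusive pixel bounding box."""
    x0, y0, x1, y1 = bbox
    for ty in range(y0 // TILE, y1 // TILE + 1):
        for tx in range(x0 // TILE, x1 // TILE + 1):
            yield (tx, ty)


def bin_with_flush(bboxes, capacity):
    """Bin triangle bounding boxes into per-tile lists with a bounded buffer.

    `capacity` is a hypothetical maximum number of tile-list entries.
    When the next triangle would not fit, the current bins are handed off
    as a partial render and binning restarts with an empty buffer.
    Returns a list of passes, each a dict mapping tile -> triangle ids.
    """
    passes = []
    bins = defaultdict(list)
    used = 0
    for tri_id, bbox in enumerate(bboxes):
        touched = list(tiles_touched(bbox))
        if used and used + len(touched) > capacity:
            # Partial render: colour and depth/stencil would be stored to
            # memory here and reloaded for the next pass.
            passes.append(dict(bins))
            bins = defaultdict(list)
            used = 0
        for tile in touched:
            bins[tile].append(tri_id)
        used += len(touched)
    if used:
        passes.append(dict(bins))
    return passes


bboxes = [(0, 0, 10, 10), (0, 0, 40, 10), (50, 50, 60, 60)]
small = bin_with_flush(bboxes, capacity=3)    # buffer fills up: two passes
large = bin_with_flush(bboxes, capacity=100)  # everything fits: one pass
```

With the small buffer the scene is split into two passes, which is where the extra framebuffer load/store traffic would come from; with enough space it stays a single deferred pass.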
|
|
|
|
|
|
#532 | ||
|
Senior Member
Join Date: Dec 2004
Posts: 1,746
|
Quote:
Quote:
@Exophase: Thanks. I guess this could be a reason recent PowerVR GPUs can't guarantee order-independent transparency: the output might depend on where the rendering is halted, and what gets stored is possibly truncated (less accurate than one-pass rendering). |
||
|
|
|
|
|
#533 |
|
Member
Join Date: Mar 2007
Location: Nebraska
Posts: 451
|
Well then.. |
|
|
|
|
|
#534 | ||
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
Quote:
Quote:
In any case, if for whatever reason a DR were forced to operate as an IMR (always in a highly relative sense), one of the advantages it would lose, IMO, would be effective fill-rate, amongst others. But since GPUs in the embedded space generally don't have excessive fill-rates, I don't see it as a problem. If the NGP GPU is clocked at 200MHz, as rumors have it, then it has raw fill-rates of 400 MTexels/s and 3.2 GPixels/s z/stencil per core. Besides, as Arun already noted, senior Simon's 2bpp & 4bpp PVRTC are a blessing, among other things.
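For reference, the fill-rate figures quoted above are just clock times units per clock. The per-clock values in this sketch are not stated in the post; they are back-derived from the quoted totals at the rumoured 200MHz and should be read as assumptions, as should the helper name `fill_rate`.

```python
def fill_rate(clock_hz, units_per_clock):
    """Raw fill rate = clock frequency x units produced per clock."""
    return clock_hz * units_per_clock


CLOCK_HZ = 200_000_000    # rumoured clock from the post
TEXELS_PER_CLOCK = 2      # assumption: implied by 400 MTexels/s at 200MHz
ZSTENCIL_PER_CLOCK = 16   # assumption: implied by 3.2 GPixels/s at 200MHz

texel_rate = fill_rate(CLOCK_HZ, TEXELS_PER_CLOCK)    # texels per second, per core
z_rate = fill_rate(CLOCK_HZ, ZSTENCIL_PER_CLOCK)      # z/stencil pixels per second, per core
```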
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
||
|
|
|
|
|
#535 |
|
Tiled
Join Date: Oct 2003
Location: Kings Langley, UK
Posts: 2,675
|
It's not necessarily secret sauce as such, but you're right that we haven't talked about it much in public yet. I'll see about changing that, so there's a bit more information about how MP works at the work distribution and memory costs level.
__________________
A major redesign of the core ALU pineapple boomerang fortress. |
|
|
|
|
|
#536 |
|
Member
Join Date: Mar 2002
Location: UK
Posts: 570
|
Actually I think we can be specific in saying that there is no significant change in memory cost associated with multi-core; I'm not sure why anyone would think there was.
Pretty certain there's been a public talk given on multi-core by Tony King Smith that explained its operation pretty well. John. |
|
|
|
|
|
#537 | ||
|
Senior Member
Join Date: Jul 2008
Posts: 2,157
|
Quote:
Quote:
I'm not trying to start a flamewar between comrades or anything.. but where are we standing exactly? Is it "so low" that you consider it "non-significant"? What ratios are we talking about? For each 100% increase in cores, you'll need a 10% increase in memory bandwidth? More? Less? Not allowed to specify? |
||
|
|
|
|
|
#538 |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
Series5XT scales up to 16 cores and not more. Ditto though for Series6/Rogue.
In any case if they claim officially themselves that past 16 it doesn't make any sense anymore, then obviously it wouldn't be worth bothering for something over 16 in pure theory for such a hypothetical case. As for where you're standing, uhmm trust the more experienced one out of the two
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
|
|
|
|
|
#539 |
|
Senior Member
Join Date: Jul 2008
Posts: 2,157
|
There's a core amount cap for series 6?
|
|
|
|
|
|
#540 | ||
|
Tiled
Join Date: Oct 2003
Location: Kings Langley, UK
Posts: 2,675
|
Quote:
Quote:
__________________
A major redesign of the core ALU pineapple boomerang fortress. |
||
|
|
|
|
|
#541 | |
|
Tiled
Join Date: Oct 2003
Location: Kings Langley, UK
Posts: 2,675
|
Quote:
__________________
A major redesign of the core ALU pineapple boomerang fortress. |
|
|
|
|
|
|
#542 | ||
|
Member
Join Date: Mar 2002
Location: UK
Posts: 570
|
Quote:
Quote:
|
||
|
|
|
|
|
#543 | |
|
Senior Member
Join Date: Feb 2002
Posts: 1,865
|
Quote:
The real difficulty lies in predicting what types of code your customer wants to run and what resources will be spent on memory paths, and in designing for maximum efficiency in terms of gates/power/cost. If you over-engineer, then your design will be bloated with baggage that goes largely unused, and you leave an open window for your competitors to do more with less. On the other hand, obviously you want to provide the capabilities that the customer may want, as well as provide juicy new IP to license. So IMG offers both cores with different levels of complexity, and also the possibility to widen many of these to fit perceived need. What I would like to see is bandwidth usage data for different but typical tasks for, say, IMG, Mali, and Tegra respectively. |
|
|
|
|
|
|
#544 | |
|
Grumpy Mod
Join Date: Dec 2004
Location: In a pretty pink padded cell
Posts: 26,045
|
Quote:
Both Rys and JohnH are telling us that memory usage is linear, based on workload. You'll clearly need X times as much memory to drive X number of cores as they all consume data, but there's no additional penalty.
__________________
Shifty Geezer ... Tolerance for internet moronism is exhausted. Anyone talking about people's attitudes in the Console fora, rather than games and technology, will feel my wrath. Read the FAQ to remind yourself how to behave and avoid unsightly incidents. |
|
|
|
|
|
|
#545 |
|
Tiled
Join Date: Oct 2003
Location: Kings Langley, UK
Posts: 2,675
|
For the same number of pixels, bandwidth goes up almost negligibly when you add cores to work on them. So it's not linear at all (for us anyway) when doing MP.
__________________
A major redesign of the core ALU pineapple boomerang fortress. |
|
|
|
|
|
#546 | |
|
Senior Member
Join Date: Mar 2010
Location: Cleveland, OH
Posts: 1,567
|
Quote:
As far as bandwidth goes, there'd be no increase in outgoing to render targets since this is subdivided between cores with no overlap. There may be some instances where the same data needs to be loaded into separate cores where it would have been retained in the cache of a single core, but in that case it'll either stay in the cache of all the cores that loaded it or it wouldn't have stayed in the cache of the single core. In fact, the multiple cores will texture cache better because they'll have smaller working sets but the same amount of cache each (presumably). And if they have a shared L2 cache that's even better; I'd fully expect bandwidth requirements to go down after this, not up. Maybe someone can tell me if Series5XT MP has anything like this (like Mali-400MP does) (then again, someone please tell me if there's some glaring flaw in my reasoning) |
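The render-target part of that argument can be checked with a small model. It is a sketch under stated assumptions: tiles are dealt out round-robin (the real work distribution is not described in this thread), and the tile size, resolution and function name `render_target_writes` are hypothetical.

```python
def render_target_writes(width, height, bytes_per_pixel, cores, tile=32):
    """Return per-core bytes written when tiles are dealt out round-robin.

    Each tile belongs to exactly one core, so no pixel is written twice.
    """
    per_core = [0] * cores
    index = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            w = min(tile, width - tx)    # clip partial tiles at the right edge
            h = min(tile, height - ty)   # clip partial tiles at the bottom edge
            per_core[index % cores] += w * h * bytes_per_pixel
            index += 1
    return per_core


one_core = render_target_writes(640, 480, 4, cores=1)
four_cores = render_target_writes(640, 480, 4, cores=4)
# Total outgoing traffic is identical; only the split between cores changes.
same_total = sum(one_core) == sum(four_cores) == 640 * 480 * 4
```

Texture and geometry fetches are deliberately left out, since those depend on cache behaviour that this model does not try to capture.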
|
|
|
|
|
|
#547 | |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,767
|
Quote:
Otherwise, for workload X it should in theory be irrelevant whether you have N pipelines in a hypothetical single core or the same N pipelines spread over Y cores; the bandwidth requirements are fairly similar, yes?
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs. |
|
|
|
|
|
|
#548 | |
|
Grumpy Mod
Join Date: Dec 2004
Location: In a pretty pink padded cell
Posts: 26,045
|
I missed a 'bandwidth' there.
Quote:
__________________
Shifty Geezer ... Tolerance for internet moronism is exhausted. Anyone talking about people's attitudes in the Console fora, rather than games and technology, will feel my wrath. Read the FAQ to remind yourself how to behave and avoid unsightly incidents. |
|
|
|
|
|
|
#549 |
|
Senior Member
Join Date: Jun 2008
Posts: 1,747
|
Isn't the main benefit of TBDRs that you don't need to go off-chip as IMRs do, meaning you need less bandwidth for a given workload?
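A deliberately crude way to put numbers on that intuition is below. Both functions are toy models with assumed names and byte sizes: the IMR side assumes the worst case of every fragment touching off-chip depth and colour, ignoring caches, early-Z and compression, and the TBDR side ignores the parameter-buffer traffic that binning adds.

```python
def imr_bytes(pixels, overdraw, colour_bpp=4, depth_bpp=4):
    """Toy model: every fragment reads depth, then writes depth and colour off-chip."""
    fragments = pixels * overdraw
    return fragments * (depth_bpp + depth_bpp + colour_bpp)


def tbdr_bytes(pixels, colour_bpp=4):
    """Toy model: depth stays in on-chip tile memory; each pixel's colour is written once."""
    return pixels * colour_bpp


pixels = 640 * 480
imr_traffic = imr_bytes(pixels, overdraw=3)
tbdr_traffic = tbdr_bytes(pixels)
```

Under these assumptions the gap grows with overdraw, since the TBDR figure does not depend on it at all.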
|
|
|
|
|
|
#550 | |
|
Senior Member
Join Date: Jul 2008
Posts: 2,157
|
Quote:
By increasing the number of cores, one assumes the purpose is to also increase the number of rendered pixels, the amount of geometry, post-processing effects, texture resolution, etc. And by doing so, the SGX5 architecture would naturally also need more memory bandwidth for the whole GPU (not more memory bandwidth per core), or the graphics subsystem would eventually face a bottleneck. Maybe because I wasn't using the right terms, the answers I was getting didn't address the question I asked, hence all the confusion. |
|
|
|
|
| Tags |
| 543mp4, dreamcast, imgtec, sgx, tbdr, vindicated |
|
|