Console GPU afterthought: Hybrid fixed/unified shaders

ROG27 · Dec 28, 2005

I've been thinking... cars have been doing it lately, so why not GPUs? Fixed-type ALUs give performance benefits at the expense of flexibility and efficiency, while unified pipes tend to take a performance hit. Why not have a primarily fixed-focus GPU with powerful fixed shaders (8 + 24, for instance) and then incorporate another 8 to 16 ALUs doing unified shading, to add flexibility and efficiency where necessary?

People's take? RSX possibility? Hollywood possibility?
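
A rough way to picture the proposal is an idealized throughput model. This is only a sketch: it assumes "8 + 24" means 8 vertex and 24 pixel ALUs, that every ALU retires one op per cycle, and that scheduling is free. The function name `frame_cycles` and the workload numbers are made up for illustration and do not describe any real GPU.

```python
def frame_cycles(vertex_ops, pixel_ops, n_vertex, n_pixel, n_unified=0):
    """Idealized cycles to drain a workload at one op per ALU per cycle.

    Dedicated ALUs run only their own op type; unified ALUs run either.
    Scheduling overhead is ignored (an assumption of this sketch).
    """
    if vertex_ops > 0 and n_vertex + n_unified == 0:
        raise ValueError("no ALU can run vertex ops")
    if pixel_ops > 0 and n_pixel + n_unified == 0:
        raise ValueError("no ALU can run pixel ops")

    bounds = [0.0]
    # Each op type is limited by the ALUs able to execute it.
    if vertex_ops > 0:
        bounds.append(vertex_ops / (n_vertex + n_unified))
    if pixel_ops > 0:
        bounds.append(pixel_ops / (n_pixel + n_unified))
    # The whole workload is limited by the total ALU count.
    total = n_vertex + n_pixel + n_unified
    if total > 0:
        bounds.append((vertex_ops + pixel_ops) / total)
    return max(bounds)


workload = (2000, 1000)  # hypothetical vertex-heavy frame: (vertex, pixel) ops
print(frame_cycles(*workload, 8, 24))     # dedicated 8 + 24      -> 250.0
print(frame_cycles(*workload, 0, 0, 32))  # fully unified, 32     -> 93.75
print(frame_cycles(*workload, 8, 24, 8))  # hybrid, 8 + 24 + 8    -> 125.0
```

In this toy model the extra unified ALUs help most when the vertex/pixel mix is far from what the dedicated units were sized for. It says nothing about what those ALUs cost in transistors.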

nelg · Dec 28, 2005

ROG27 said:
while unified pipes tend to take a performance hit.

Where do you get this from? IMHO the only hit unified shaders take is the die space that the associated scheduling logic adds.

ROG27 · Dec 28, 2005

nelg said:
Where do you get this from? IMHO the only hit unified shaders take is the die space that the associated scheduling logic adds.

Not a hit so much as a cost/performance trade-off. With all that extra die space being used for scheduling etc., is it viable to create unified shaders with an output of more than 2 shader ops per cycle?

ROG27 · Dec 28, 2005

I'm unsure of how many programmable shader ops you can get per cycle in fixed vs. unified shader GPUs, but 96 for unified and 132 for fixed come to mind for the latest and greatest.

Arun · Dec 28, 2005

Scheduling has a relatively high fixed cost, and an average per-transistor cost, with the fixed cost depending a lot on the implementation (cf. PowerVR SGX, which manages to get a unified architecture with a real scheduler in a minimal number of transistors).

If anything, when you get to much bigger transistor budgets (1T+), I'd suspect unified architectures would begin to scale better than non-unified ones (which, for the LOVE OF GOD, shouldn't be called "fixed" - such a keyword is reserved for DX7-style fixed-function architectures, damnit!)

Uttar

BRiT · Dec 28, 2005

ROG27 said:
Not a hit so much as a cost/performance trade-off. With all that extra die space being used for scheduling etc., is it viable to create unified shaders with an output of more than 2 shader ops per cycle?

So how much extra die space does scheduling etc take up?

ROG27 · Dec 28, 2005

BRiT said:
So how much extra die space does scheduling etc take up?

Obviously enough transistors to eat into performance vs. a standard non-unified architecture with fewer ALUs.

ROG27 · Dec 28, 2005

You can't do/have everything and still dissipate enough heat to make the thing work on the current manufacturing process (90 nm), IMO. Perhaps this is why higher-performing unified solutions are not hitting shelves at lightspeed as of yet.

ROG27 · Dec 28, 2005

Uttar said:
Scheduling has a relatively high fixed cost, and an average per-transistor cost, with the fixed cost depending a lot on the implementation (cf. PowerVR SGX, which manages to get a unified architecture with a real scheduler in a minimal number of transistors).

If anything, when you get to much bigger transistor budgets (1T+), I'd suspect unified architectures would begin to scale better than non-unified ones (which, for the LOVE OF GOD, shouldn't be called "fixed" - such a keyword is reserved for DX7-style fixed-function architectures, damnit!)

Uttar

Exactly... so why wouldn't a hybrid GPU be a nice solution until we get there (1T+)?

Arun · Dec 28, 2005

ROG27 said:
Exactly... so why wouldn't a hybrid GPU be a nice solution until we get there (1T+)?

Reread what I said. Part of the scheduling cost is static, meaning it doesn't scale with the number of pipelines you have for a given efficiency. As such, a hybrid solution would have higher transistor costs for a given performance than EITHER unified or non-unified solutions, at least IMO.

Uttar
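
To make the static-cost point concrete, here is a minimal sketch of scheduler overhead amortized over the number of unified ALUs. The cost split (one fixed term plus one term per unified ALU) follows the post above, but the function names and the numbers `FIXED` and `PER_ALU` are invented for illustration; nobody in the thread has real figures.

```python
def scheduler_cost(n_unified, fixed_cost, per_alu_cost):
    """Scheduling transistors: a static part plus a part that scales."""
    if n_unified == 0:
        return 0.0  # no unified ALUs, so no scheduler is needed
    return fixed_cost + per_alu_cost * n_unified


def overhead_per_alu(n_unified, fixed_cost, per_alu_cost):
    """Scheduler transistors charged to each unified ALU."""
    return scheduler_cost(n_unified, fixed_cost, per_alu_cost) / n_unified


# Hypothetical costs in arbitrary units, not real transistor counts.
FIXED, PER_ALU = 40.0, 1.0
for n in (8, 16, 48):
    print(n, overhead_per_alu(n, FIXED, PER_ALU))
```

With these made-up numbers, a hybrid carrying only 8 unified ALUs pays 6.0 units of scheduler per ALU, against about 1.8 for a 48-ALU fully unified part. Whether that penalty matters depends entirely on how large the static term really is.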

ROG27 · Dec 28, 2005

Uttar said:
Reread what I said. Part of the scheduling cost is static, meaning it doesn't scale with the number of pipelines you have for a given efficiency. As such, a hybrid solution would have higher transistor costs for a given performance than EITHER unified or non-unified solutions, at least IMO.

Uttar

Oh ok... I understand what you are saying, but I guess how costly the static part is would be the key to understanding whether a hybrid would be worth it or not. If the variable transistor cost (per unified ALU added) is significantly higher than the fixed allocated cost, then perhaps a limited number of unified ALUs would be the better solution currently. But we don't know (at least I don't) what these transistor costs for scheduling are.

_xxx_ · Dec 28, 2005

Just a stupid thought, but why couldn't nV or ATI license the SGX design or some parts of it if that's really a better design (which I certainly don't know, but the idea is theoretically possible)?
