Nvidia's next step in superscalar design of SMs?

Didn't tests done by B3D members show that this superscalar efficiency was pretty much nonexistent, due to the hardware running out of available registers?
 
I'm not sure that was the conclusion exactly.
It did seem that various tests pointed out that there may not be enough operand bandwidth to sustain full-width issue of 3-operand instructions. However, instruction mixes that would otherwise leave register ports unused could potentially dual-issue.

There was some weird behavior around register allocation, though, so it may be that there are facets of the architecture that have been obscured by other issues.
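
To make the operand-bandwidth point concrete, here is a toy model in Python. It is only a sketch: the port budget `PORTS_PER_CYCLE` and the function `can_dual_issue` are made up for illustration and are not measured figures or a description of the actual hardware. It just shows why two 3-operand instructions might not pair while a mix with fewer register reads could.

```python
# Toy model only. PORTS_PER_CYCLE is a hypothetical register-read budget,
# not a measured or documented figure for any real GPU.
PORTS_PER_CYCLE = 4


def can_dual_issue(operands_a, operands_b, ports=PORTS_PER_CYCLE):
    """Return True if the register reads of both instructions fit in the budget."""
    return operands_a + operands_b <= ports


# Two 3-operand instructions (e.g. multiply-adds) would need 6 reads.
print(can_dual_issue(3, 3))
# A 3-operand instruction paired with a 1-operand one needs only 4 reads.
print(can_dual_issue(3, 1))
```

Under this assumed budget the first pair is rejected and the second is accepted, which is the kind of mix-dependent dual-issue described above.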
 
Could the 48 SP design also be a way to boost yields?

With GF100, if an SM is bad it can be fused off and the chip sold as a GTX470 or GTX465, etc. Even the GTX480 has a fused-off SM, so the slowest one can be discarded, which boosts the achievable clocks of the rest.

With 48 SPs per SM in GF106, is it conceivable that a group of 16 SPs could be fused off if it's bad, giving you a "limping" SM with only 32 SPs? That would be a lot better than no SM at all.

Such an ability would only make sense if defects were commonly located in the SPs (which are roughly 1/3 of the chip, but probably the most complex regions).
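
As a rough back-of-the-envelope check on that idea, here is a Python sketch comparing the expected number of usable SPs per SM under the two salvage schemes. Everything in it is an assumption for illustration: the defect probability `p` per 16-SP group is hypothetical, groups are treated as failing independently, defects outside the SPs are ignored, and the function names are my own.

```python
def expected_sps_whole_sm(p, groups=3, sps_per_group=16):
    """Expected usable SPs if any bad group forces the whole SM to be fused off."""
    all_good = (1 - p) ** groups
    return groups * sps_per_group * all_good


def expected_sps_group_fusing(p, groups=3, sps_per_group=16):
    """Expected usable SPs if one bad group can be fused off on its own.

    An SM with two or more bad groups is still treated as lost.
    """
    all_good = (1 - p) ** groups
    one_bad = groups * p * (1 - p) ** (groups - 1)
    return sps_per_group * (groups * all_good + (groups - 1) * one_bad)


# p is a made-up per-group defect probability, not real yield data.
p = 0.05
print(expected_sps_whole_sm(p), expected_sps_group_fusing(p))
```

The gap between the two numbers grows with `p`, so the finer-grained fusing only pays off if SP defects are actually common, which is the condition stated above.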

The question is whether this is practical. Can VLSI chips be micro-fused at such a fine level? I assume the fuse would switch the 2 SM schedulers to never use a certain bank of SPs. This is a much finer level of control than turning off an entire SM.

Side question: when a full SM is fused off, does it still burn power? Maybe it's just the block scheduler that gets modified (to never schedule the bad SM), not the SM itself.
 