ATi presentation on future GPU design

MfA said:
That kinda sounds like a haiku, makes less sense than most though.

Saving contexts is exactly what a pixel shader does when it puts a quad aside while a texture fetch occurs.

The question though is whether it actually puts the quad aside or simply delays it using some form of virtual pipe stages. And the control flow is shared by all the pixels within a pipeline currently as well. Threads, as commonly understood throughout most of computer architecture, imply independence, and that independence appears to be lacking in the current designs of a pixel pipeline.
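
To make the distinction concrete, here's a rough sketch of the two possibilities; the structures, names, and the fixed depth are mine for illustration, not any shipping design:

```cpp
#include <cstddef>
#include <deque>
#include <vector>

struct Quad {
    bool fetch_done = false;   // set when the texture result arrives
    // ...per-pixel registers, resume point, etc.
};

// (a) Truly "putting it aside": park the quad's context and resume it
// only once its texture fetch has actually completed.
std::vector<Quad*> parked;

Quad* wake_ready_quad() {
    for (auto it = parked.begin(); it != parked.end(); ++it) {
        if ((*it)->fetch_done) {
            Quad* q = *it;
            parked.erase(it);
            return q;          // this quad can make progress now
        }
    }
    return nullptr;            // nothing ready; run other quads or idle
}

// (b) "Virtual pipe stages": a fixed-length delay line. A quad reappears
// after kStages slots whether or not its fetch finished; if it didn't,
// the pipeline simply stalls.
constexpr std::size_t kStages = 8;               // assumed depth
std::deque<Quad*> delay_line(kStages, nullptr);  // starts as empty slots

Quad* advance(Quad* incoming) {
    delay_line.push_back(incoming);
    Quad* out = delay_line.front();
    delay_line.pop_front();
    return out;   // may be a bubble, or a quad still waiting on its fetch
}
```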
 
nelg said:
If I understand this thread correctly (which is a big if), is Mr. Spink trying to say that one method of multithreading is more akin to time slicing?

I would say that time slicing between independent control flows vs. data slicing using the same control flow is one of the major distinctions between a multi-threaded architecture and a streaming architecture.
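
Roughly, in code terms (a deliberately simplified sketch of the two models, not anyone's actual hardware):

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kBatch = 4;   // e.g. one 2x2 pixel quad

// Streaming / data-sliced model: one control flow, many data lanes.
struct PixelBatch {
    std::size_t pc = 0;                 // single program counter shared by all pixels
    std::array<float, kBatch> regs{};   // per-pixel data differs...
    // ...but every pixel executes the same instruction in lockstep;
    // no pixel can branch independently of its batch mates.
};

// Conventional multithreading: each context is fully independent.
struct Thread {
    std::size_t pc = 0;    // private program counter
    float reg = 0.0f;      // private architectural state
    // A scheduler may time-slice Threads, but each one can follow a
    // completely different control path.
};
```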

Aaron spink
speaking for myself inc.
 
aaronspink said:
The question though is whether it actually puts it aside or simply delays it using some form of virtual pipe stages.

Well, as before, it's really irrelevant, since if 3D IHVs and researchers call it multithreading I'm going to take their definitions over those of opinionated outsiders ...

In the case of the CineFX patent, though, you can see it is a little of both ... it will in effect be pipelined, but for flexibility this pipelining is implemented with a central FIFO rather than distributed storage, so you can compensate for varying context sizes.

In principle the gatekeeper should be able to directly reinsert pixel-quads into the FIFO if the requirements for their execution aren't met yet ... but it doesn't seem aimed at testing whether the texture fetch has completed, so in the end it does just pipeline the execution of shaders by interleaving chunks of them for as many pixel-quads as fit in the FIFO.
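
As a sketch of how I read the patent (the names, chunk size, and unconditional reinsertion are my rendering of it, not the patent's own terms):

```cpp
#include <cstdint>
#include <queue>

struct Quad {
    uint32_t next_instr = 0;   // where this quad resumes in the shader
    // ...per-pixel registers, pending texture results, etc.
};

constexpr uint32_t kChunkLen  = 8;    // instructions run per pass (assumed)
constexpr uint32_t kShaderLen = 64;   // total shader length (assumed)

void run_shader(std::queue<Quad>& fifo) {
    while (!fifo.empty()) {
        Quad q = fifo.front();
        fifo.pop();
        // Execute the next chunk of the shader for this quad. Any texture
        // fetch issued here has until the quad's next turn to complete.
        q.next_instr += kChunkLen;      // placeholder for real execution
        if (q.next_instr < kShaderLen)
            fifo.push(q);               // reinsert: the context lives in the FIFO
        // else: quad is finished, results go to the back end
    }
}
```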

I don't know if it's a terribly good idea, though: because of the multiple memory interfaces, and possibly memory access reordering, you can be sure memory accesses won't always complete in order. It's impossible to say for sure whether there are realistic alternatives (or rather, you can give a proof positive, but not a negative). For instance, just reinserting into a FIFO could actually slow things down if the FIFO is too small to even cover best-case latency.
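
To put a number on that last point (the figures are purely illustrative assumptions, not from the patent):

```cpp
// Assumed figures for illustration only.
constexpr unsigned kBestCaseLatencyCycles = 200;  // best-case texture latency
constexpr unsigned kQuadsIssuedPerCycle   = 1;    // issue rate
// Each in-flight quad occupies one FIFO slot, so covering even the
// best-case latency needs at least latency x issue-rate entries:
constexpr unsigned kMinFifoDepth =
    kBestCaseLatencyCycles * kQuadsIssuedPerCycle;  // = 200 quads
static_assert(kMinFifoDepth >= kBestCaseLatencyCycles,
              "a smaller FIFO recirculates quads before data can return");
```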

aaronspink said:
And the control flow is shared by all the pixels within a pipeline currently as well. Threads, as commonly understood throughout most of computer architecture, imply independence, and that independence appears to be lacking in the current designs of a pixel pipeline.

Which is irrelevant to me if NVIDIA/ATI/3DLABS/etc. disagree with mainstream computer architecture.

Marco
 