Jawed
Legend
I'm curious about the kind of hardware that'll be used to implement streamout in D3D10 GPUs.
I suppose we have two precursors: memexport (along with the backbuffer->frontbuffer copy) in Xenos, and render to vertex buffer (R2VB) in ATI's DX9 GPUs.
In Xenos a portion of the Backend Central handles memexport. Taking a wild stab in the dark, I'd guess that the same hardware also handles the render target data produced by the resolve process, i.e. the resolved backbuffer data which needs to be written to the frontbuffer. Both processes seem to require taking a stream of data and putting it in system RAM.
In R2VB, ROPs write directly to memory. I suppose, depending on the quantity of data points written, this may or may not occupy all of the ROPs simultaneously.
---
So, what kind of hardware configuration is best suited to performing streamout, and what degree of parallelism would suit streamout?
Is it a bad idea to use a GPU's ROPs to handle streamout? If so, what sort of complexity would a dedicated streamout unit entail? Presuming that streamout and pixel output can both occur simultaneously, what kind of demands is streamout going to place on the overall architecture?
What kind of streamout bandwidth will early D3D10 GPUs aim at? Will streamout be a features-first, performance-later runt in the first D3D10 GPUs?
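To make the parallelism question a bit more concrete: if streamout has to land in the buffer in input order, and each primitive can emit a different amount of data, then several units writing in parallel would each need to know their start offset before they write. Below is a minimal sketch of that bookkeeping as an exclusive prefix sum. The function name, the vertex counts and the 16-byte stride are all made up for illustration; this is not a claim about how any real GPU does it.

```python
def streamout_offsets(vertex_counts, stride_bytes):
    """Return (offsets, total_bytes) for an in-order append.

    offsets[i] is the byte offset where primitive i's output starts,
    i.e. an exclusive prefix sum over the per-primitive output sizes.
    """
    offsets = []
    running = 0
    for count in vertex_counts:
        offsets.append(running)          # this primitive starts where the last one ended
        running += count * stride_bytes  # advance by the bytes this primitive emits
    return offsets, running


# Hypothetical example: four primitives emitting 3, 0, 6 and 3 vertices,
# with an assumed 16 bytes per vertex.
offsets, total = streamout_offsets([3, 0, 6, 3], stride_bytes=16)
print(offsets, total)  # [0, 48, 48, 144] 192
```

The loop is serial here, which is really the point of the question: either the hardware serialises this step, or it needs some kind of parallel scan across units before any of them can write.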
Jawed